Model including groups with both binary and continuous/ordinal data
Attachment | Size |
---|---|
twinModels.R | 9.21 KB |
I have heard that it is now possible to run a model estimating genetic and environmental effects with twin data that includes separate variables with binary and continuous (or ordinal?) coding.
I have been running a thresholds model with age and sex as covariates (which I will attach) and I would like to include the same behavioural measures from different data sets in the one model.
Two different scales have been used to collect this information (binary and continuousish), and I want to equate the A, C and E paths across datasets with the differing scales. I have been told this will necessitate the inclusion of a group for each dataset. But I am also unsure how to do this.
Is this possible and can you help? - and forgive me if I am missing some important technical detail.
Best wishes
Jane
Can be done but there's likely a problem.
Since the scales are different, there is likely to be differences in mean and variance between the two. Possibly, what you want to know is whether the proportions of A C and E are the same for the two scales?
The approach you take depends on whether the two scales were obtained from the same or different pairs of twins (i.e. the same sample for both scales or different samples). In the former case, you would not increase the number of groups, but make a multivariate model. In the latter - different samples - case you would add groups. Equating the proportions of variance could be done at least two ways. The simplest would be to use algebras to compute, e.g., A/(A+C+E) for the two measures, and to add a non-linear constraint that equates the two (all three could be cbind()'ed on each side of the ==, but since they are proportions, two would be sufficient). In the case of measures on the same sample, you would set up a multivariate analysis (say with a Cholesky model) and equate the two diagonal elements of the (now 2x2) A/(A+C+E) matrix and the C/(A+C+E) matrix. Extracting the elements from the diagonal could be done with A[1,1] type notation, so you might try A[1,1]/(A[1,1]+C[1,1}+E[1,1] == A[2,2]/(A[2,2]+C[2,2]+E[2,2]. I think. If it doesn't work then, well then we'll learn something making it do so :).
HTH
Mike
Log in or register to post comments
In reply to Can be done but there's likely a problem. by neale
I am using data from two
Unfortunately I haven't been able to find a model running both continuous and ordinal /binary data including separate groups. So what I have done is put together a mishmash of script that I though might work - but I don't really understand how the matrices will combine and the use of certain code. I will attach my script.
There is a matrix used to specify the values for each a, c and e path (cholSVnv) I'm not sure why - and I don't understand the abbreviation. I'm also unsure where and how to add additional groups. I thought perhaps as two sets of mz and dz models (one for each sample). And I'm not really up to interpreting the meaning of your comments yet Mike, sorry.
Can you please give me a tip on (1) how to add a calculation group, (2) whether I need a lower cholesky for 2 vars to represent the a, c and e paths (if that's what they are), and (3) what I need to specify in the objective function to include both variables in the model and (4) the order of how the matrices combine to provide the output.
perhaps I'm asking too much?
Jane
Log in or register to post comments
In reply to I am using data from two by ebejer
Not too much but I'm a bit pressed for time just now
If you have two univariate scripts, one for the ordinal, and one for the continuous data, I would be tempted to try combining them with something like the following
jointACE <- mxModel(aceOrd, aceCont, mxAlgebraObjective(aceOrd.Objective+aceCont.Objective))
at least this would get you started. In fact I think you are on the right track with your script as is, but you need an mxAlgebraObjective to combine the objective functions from all four groups. Something like
mxAlgebra( expression=MZ1.objective+DZ1.objective+MZ2.objective+DZ2.objective, name="minus2sumloglikelihood" ),
mxAlgebraObjective("minus2sumloglikelihood")
Adding the mxConstraint would be needed to equate the heritabilities, per previous post.
Log in or register to post comments
In reply to Not too much but I'm a bit pressed for time just now by neale
Hello again, I have found a
I have found a model that seems to combine the two separate datasets, however it returns a code red and there are two SD's that are quite large - I have played around with the starting values and the threshold and what I have in the script appears to work best. I am wanting to get some feedback on the 'reasonableness' of the model I have specified and what I might need to change to change the code red status and reduce the standard errors. Any suggestions will be greatly appreciated (I'll attach the output and script below).
Thanks,
Jane
Log in or register to post comments
In reply to Hello again, I have found a by ebejer
We have a version of the
The only way I can think of to 'equate' the standardised A,C and E estimates is through the mxConstraint function.
This does work and the results look sensible. But because we've used mxConstraint we lose the SE.
We can still get CI but it would be nice to have the se as well.
Can anyone think of an alturnate apporach?
thanks
Sarah (on behalf of Jane)
Log in or register to post comments
Thanks Mike, I'll start
j
Log in or register to post comments
Aha
Log in or register to post comments
Problem with bivariate model
I am new to openMx, and I cannot figure out what is wrong in my model. I am trying to estimate a bivariate model in which one variable is continuous and the other is binary (for which I preliminarily used mxFactor). When I try to estimate the model, I obtain an error which says that my continuous variable is not categorical and that I should use mxFactor()...
I am pretty sure something is wrong in my model specification.
Here my model. Thanks for any help!
Massimiliano
# Continuous variable
nvC <- 1 # number of continuous variables
ntvC <- nv*2 # number of total continuous variables
selVarsC <- c('varC')
# Binary variable
nvB <- 1 # number of binary variables
ntvB <- nv*2 # number of total binary variables
selVarsB <- c('varB')
# Set Starting Values
svMe <- 7 # start value for means
svTh <- 0
svPa <- .2 # start value for path coefficient
svPe <- .5 # start value for path coefficient for e
# ----------------------------------------------------------------------------------------------------------------------
# PREPARE MODEL
# Create Algebra for expected Mean Matrices
meanG <- mxMatrix( type="Full", nrow=1, ncol=ntvC, free=TRUE, values=svMe, labels=labVars("mean",selVarsC), name="meanG" )
# Create Algebra for expected Thresholds Matrices
threshold <- mxMatrix( type="Full", nrow=1, ncol=ntvB, free=TRUE, values=svTh, labels=labVars("thre", selVarsB), name="threshold" )
# Create Matrices for Variance Components
covA <- mxMatrix( type="Symm", nrow=nv, ncol=nv, free=TRUE, values=valDiag(svPa,nv), labels=labLower("VA",nv), name="VA" )
covC <- mxMatrix( type="Symm", nrow=nv, ncol=nv, free=TRUE, values=valDiag(svPa,nv), labels=labLower("VC",nv), name="VC" )
covE <- mxMatrix( type="Symm", nrow=nv, ncol=nv, free=TRUE, values=valDiag(svPe,nv), labels=labLower("VE",nv), name="VE" )
# Create Algebra for expected Threshold Matrices
expThresh <- mxAlgebra( expression= threshold , name="expThresh" )
# Create Algebra for expected Variance/Covariance Matrices in MZ & DZ twins
covP <- mxAlgebra( expression= VA+VC+VE, name="V" )
covMZ <- mxAlgebra( expression= VA+VC, name="cMZ" )
covDZ <- mxAlgebra( expression= 0.5%x%VA+ VC, name="cDZ" )
expCovMZ <- mxAlgebra( expression= rbind( cbind(V, cMZ), cbind(t(cMZ), V)), name="expCovMZ" )
expCovDZ <- mxAlgebra( expression= rbind( cbind(V, cDZ), cbind(t(cDZ), V)), name="expCovDZ" )
# Create Data Objects for Multiple Groups
dataMZ <- mxData( observed=mzData, type="raw" )
dataDZ <- mxData( observed=dzData, type="raw" )
# Create Expectation Objects for Multiple Groups
expMZ <- mxExpectationNormal( covariance="expCovMZ", means="meanG", dimnames=selVars , thresholds="threshold")
expDZ <- mxExpectationNormal( covariance="expCovDZ", means="meanG", dimnames=selVars , thresholds="threshold")
funML <- mxFitFunctionML()
# Create Model Objects for Multiple Groups
pars <- list( meanG, threshold, covA, covC, covE, covP )
modelMZ <- mxModel( pars, covMZ, expCovMZ, dataMZ, expMZ, funML, name="MZ" )
modelDZ <- mxModel( pars, covDZ, expCovDZ, dataDZ, expDZ, funML, name="DZ" )
multi <- mxFitFunctionMultigroup( c("MZ","DZ") )
# Create Algebra for Standardization
matI <- mxMatrix( type="Iden", nrow=nv, ncol=nv, name="I" )
invSD <- mxAlgebra( expression=solve(sqrt(I*V)), name="iSD" )
# Calculate genetic and environmental correlations
corA <- mxAlgebra( expression=solve(sqrt(I*VA))%&%VA, name ="rA" ) #cov2cor()
corC <- mxAlgebra( expression=solve(sqrt(I*VC))%&%VC, name ="rC" )
corE <- mxAlgebra( expression=solve(sqrt(I*VE))%&%VE, name ="rE" )
# Calculate Phenotypic Correlation
corP <- mxAlgebra (expression=solve(sqrt(I*V)) %*% V %*% solve(sqrt(I*V)), name="rP")
# Create Algebra for Variance Components
rowUV <- rep('UV',nv)
colUV <- rep(c('VA','VC','VE'),each=nv)
estUV <- mxAlgebra( expression=cbind(VA,VC,VE), name="UV", dimnames=list(rowUV,colUV) )
rowSV <- rep('SV',nv)
colSV <- rep(c('SA','SC','SE'),each=nv)
estSV <- mxAlgebra( expression=cbind(VA/V,VC/V,VE/V), name="SV", dimnames=list(rowSV,colSV) )
# Create Confidence Interval Objects
ciUV <- mxCI( c("UV") )
ciSV <- mxCI( c("SV") )
# Build Model with Confidence Intervals
calc <- list( matI, invSD, corA, corC, corE,corP, estUV, estSV, ciUV, ciSV )
modelACE <- mxModel( "twoACEvc", pars, modelMZ, modelDZ, multi, calc )
# ----------------------------------------------------------------------------------------------------------------------
# RUN MODEL
# Run ACE Model
fitACE <- mxRun( modelACE, intervals=T )
sumACE <- summary( fitACE )
Log in or register to post comments
In reply to Problem with bivariate model by massimiliano.orri
selVars?
dimnames=selVars
as argument tomxExpectationNormal()
, but objectselVars
is nowhere defined in your script. But the fact that your script runs at all means thatselVars
existed in your R workspace. I betselVars
, however it was defined, doesn't contain the column names for yourmxFactor
'd binary variable.Log in or register to post comments
In reply to selVars? by AdminRobK
Follow-up question
A follow up question, as I am still struggling specifying this model...I think my problem is that I don't understand how to specify the expectationObject when I have 1 continuous and 1 binary variable. The dimension of the matrices and the dimnames I specified is always wrong.
If I specify only the 2 binary variables (for twin 1 and twin2) in the dimnames option, it seems that I am missing 2 variables, because the expected covariance matrix has 4 rows and columns. I specified as follows:
varsC <- c('y1')
nvC <- 1 # number of continuous variables
ntvC <- nvC*2 # number of total continuous variables
selVarsC <- paste(varsC,c(rep("_T1",nvC),rep("_T2",nvC)),sep="")
varsB <- c('u1')
nvB <- 1 # number of binary variables
ntvB <- nvB*2 # number of total binary variables
selVarsB <- paste(varsB,c(rep("_T1",nvB),rep("_T2",nvB)),sep="")
# Create Matrices for expected Mean Matrices
meanG <- mxMatrix( type="Full", nrow=1, ncol=ntvC, free=TRUE, values=svMe, labels=labVars("mean",selVarsC), name="meanG" )
# Create Matrices for expected Thresholds Matrices
threshold <- mxMatrix( type="Full", nrow=1, ncol=ntvB, free=TRUE, values=svTh, labels=labVars("thre",selVarsB), name="threshold" )
[...]
# Create Expectation Objects for Multiple Groups
expMZ <- mxExpectationNormal( covariance="expCovMZ", means="meanG", dimnames=c(selVarsB), thresholds="threshold")
expDZ <- mxExpectationNormal( covariance="expCovDZ", means="meanG", dimnames=c(selVarsB), thresholds="threshold")
and the error I obtains is:
Error: The expected covariance matrix associated with the expectation function in model 'MZ' is not of the same length as the 'dimnames' argument provided by the expectation function. The 'dimnames' argument is of length 2 and the expected covariance matrix has 4 rows and columns.
If I specify all my variables in the dimnames option (as follows):
expMZ <- mxExpectationNormal( covariance="expCovMZ", means="meanG", dimnames=c(selVarsB,selVarsC), thresholds="threshold")
expDZ <- mxExpectationNormal( covariance="expCovDZ", means="meanG", dimnames=c(selVarsB,selVarsC), thresholds="threshold")
I obtain the following error, of course:
Error: The expected means vector associated with the expectation function in model 'MZ' is not of the same length as the 'dimnames' argument provided by the expectation function. The 'dimnames' argument is of length 4 and the expected means vector has 2 columns.
Should I perhaps have different expectation Objects for continuous and binary variables?
Or how can I specify that 2 variables are referring to the mean matrix while the 2 other variables to the threshold matrix?
I looked for an exemple of the openMx documentation, but I could not resolve the issue so far.
Thanks again for any help!
Log in or register to post comments
In reply to Follow-up question by massimiliano.orri
Maybe I solved the issue
It seems that now it's running...
# Create Expectation Objects for Multiple Groups
expMZ <- mxExpectationNormal( covariance="expCovMZ", means="meanG", dimnames=c(selVarsB,selVarsC),
thresholds="threshold", threshnames=selVarsB)
expDZ <- mxExpectationNormal( covariance="expCovDZ", means="meanG", dimnames=c(selVarsB,selVarsC),
thresholds="threshold", threshnames=selVarsB)
funML <- mxFitFunctionML()
Log in or register to post comments
In reply to Maybe I solved the issue by massimiliano.orri
still not right
That's right.
Do I understand correctly that the
dimnames
you're passing tomxExpectationNormal()
are, in order, "u1_T1", "u1_T2", "y1_T1", "y2_T2"? If so, something's wrong. From the way your script is written, it should be both traits for twin #1 followed by both traits for twin #2, e.g. "u1_T1", "y1_T1", "u1_T2", "y1_T2".I also don't see anything in your script that fixes either the mean or the threshold for the binary trait, nor anything that identifies the scale (variance) of the binary trait. Just to be clear: when you analyze a threshold variable in OpenMx, the variable is treated as though it reflects an unobserved, normally distributed continuum that has been "chopped" into two or more ordered categories. What is really being modeled is that latent Gaussian trait. In the binary case, in order to identify your model, you need to assume some value for either the latent trait's mean or the value of the threshold, and you need to assume a value for some parameter that scales the latent trait. I can't really make any specific suggestions about that until I'm sure that you have the dimnames for your expectations correct. I am actually surprised that your syntax runs with this,
meanG <- mxMatrix( type="Full", nrow=1, ncol=ntvC, free=TRUE, values=svMe, labels=labVars("mean",selVarsC), name="meanG" )
, which only has 2 columns when it ought to have 4 (i.e., the binary variables need model-expected means, too).
Log in or register to post comments