Hello,
I am trying to fit a 5-variate model. But I am not sure about the best way to do it.
I have got these results from the univariate analyses
VAR1
ADE 0.34 0.10 0.57
AE 0.42 * 0.58
VAR2
ADE 0.26 0.16 0.58
AE 0.41 * 0.59
VAR3
ACE 0.29 0.13 0.58
AE 0.45 * 0.55
CE * 0.33 0.67
VAR4
ADE 0.23 0.17 0.59
AE 0.39 * 0.61
VAR5
ACE 0.35 0.05 0.60
AE 0.41 * 0.59
CE * 0.29 0.71
Therefore, I have:
AE model for Variable 1
AE model for Variable 2
ACE model for Variable 3
ADE model for Variable 4
AE model for Variable 5
I have fitted several models but I am not sure what model should I select.
5-ACE model: BIC= -310035.618; AIC=-10692.932
5-ADE model BIC= -310065.74; AIC=--10723.10
5-AE model BIC= -310144.380; AIC= -10703.607
*The comparison between ACE and AE Is non-significant therefore I can select the AE model but the comparison between ADE and AE is significant therefore I cannot select the nested model
I have also tried to fit a 5variables-ACE-ADE model (BIC=-310104.987; AIC=-10696.909) with (C for variables 3 and 5 and D for variables 1,2 and 4).
However, I am not sure if I adapted it correctly since I am getting much lower values for C (near 0) and D in comparison with the univariate results or 5-ACE or ADE models.
*The comparison between ADCE and AE is non-significant.
Here is my adapted script.
# Select Variables for Analysis vars <- c('VAR1','VAR2','VAR3','VAR4', "VAR5" ) # list of variables names nv <- 5 # number of variables ntv <- nv*2 # number of total variables selVars <- paste(vars,c(rep(1,nv),rep(2,nv)),sep="") # Select Covariates for Analysis covVars <- c('age', "Sex1" , "Sex2") nc <- 3 # number of covariates ordData <- twinData # Select Data for Analysis mzData <- subset(ordData, Zyg==1| Zyg==3, c(selVars, covVars)) dzData <- subset(ordData, Zyg==2 | Zyg==4| Zyg==5, c(selVars, covVars)) # Set Starting Values svMe <- c(1.2,1.1,1.1, 1.2,1.2) # start value for means svPa <- .4 # start value for path coefficient svPaD <- vech(diag(svPa,nv,nv)) # start values for diagonal of covariance matrix svPe <- .8 # start value for path coefficient for e svPeD <- vech(diag(svPe,nv,nv)) # start values for diagonal of covariance matrix lbPa <- .0001 # start value for lower bounds lbPaD <- diag(lbPa,nv,nv) # lower bounds for diagonal of covariance matrix lbPaD[lower.tri(lbPaD)] <- -10 # lower bounds for below diagonal elements lbPaD[upper.tri(lbPaD)] <- NA # lower bounds for above diagonal elements # Create Labels labMe <- paste("mean",vars,sep="_") labBe <- labFull("beta",nc,1) # ------------------------------------------------------------------------------ # PREPARE MODEL # ADCE Model defAge <- mxMatrix( type="Full", nrow=1, ncol=1, free=FALSE, labels=c("data.age"), name="Age" ) pathB1 <- mxMatrix( type="Full", nrow=5, ncol=1, free=TRUE, values=.01, label=c("b11","b12", "b13","b14","b15"), name="b1" ) defSex <- mxMatrix( type="Full", nrow=1, ncol=10, free=FALSE, labels=c("data.Sex1", "data.Sex1", "data.Sex1", "data.Sex1","data.Sex1", "data.Sex2","data.Sex2","data.Sex2", "data.Sex2", "data.Sex2"), name="Sex" ) pathB2 <- mxMatrix( type="Full", nrow=1, ncol=10, free=TRUE, values=.01, label=c("b21", "b22","b23","b24","b25", "b21", "b22", "b23","b24","b25"), name="b2" ) # Create Algebra for expected Mean Matrices meanG <- mxMatrix( type="Full", nrow=1, ncol=ntv, free=TRUE, values=svMe, labels=labMe, name="meanG" ) expMean <- mxAlgebra( expression= meanG + cbind(t(b1%*%Age),t(b1%*%Age))+b2*Sex, name="expMean" ) # Create Matrices for Path Coefficients pathA <- mxMatrix( type="Lower", nrow=nv, ncol=nv, free=TRUE, values=svPaD, label=labLower("a",nv), lbound=lbPaD, name="a" ) pathC <- mxMatrix( type="Full", nrow=nv, ncol=nv, free=c(F,F,T,F,T), values=c(0,0,0.3,0,0.3), label=c(NA,NA,"c11",NA,"c22"), name="c" ) pathD <- mxMatrix( type="Full", nrow=nv, ncol=nv, free=c(T,T,F,T,F), values=c(0.3,0.3,0,0.3,0), label=c("d11","d22",NA,"d33",NA), name="d" ) pathE <- mxMatrix( type="Lower", nrow=nv, ncol=nv, free=TRUE, values=svPeD, label=labLower("e",nv), lbound=lbPaD, name="e" ) # Create Algebra for Variance Comptwonts covA <- mxAlgebra( expression=a %*% t(a), name="A" ) covC <- mxAlgebra( expression=c %*% t(c), name="C" ) covD <- mxAlgebra( expression=d %*% t(d), name="D" ) covE <- mxAlgebra( expression=e %*% t(e), name="E" ) # Create Algebra for expected Variance/Covariance Matrices in MZ & DZ twins covP <- mxAlgebra( expression=A+C+D+E, name="V" ) covMZ <- mxAlgebra( expression= A+C+D, name="cMZ" ) covDZ <- mxAlgebra( expression= 0.5%x%A+C+0.25%x%D, name="cDZ" ) expCovMZ <- mxAlgebra( expression= rbind( cbind(V, cMZ), cbind(t(cMZ), V)), name="expCovMZ" ) expCovDZ <- mxAlgebra( expression= rbind( cbind(V, cDZ), cbind(t(cDZ), V)), name="expCovDZ" ) # Create Algebra for Standardization matI <- mxMatrix( type="Iden", nrow=nv, ncol=nv, name="I") invSD <- mxAlgebra( expression=solve(sqrt(I*V)), name="iSD") # Calculate genetic and environmental correlations corA <- mxAlgebra( expression=solve(sqrt(I*A))%&%A, name ="rA" ) #cov2cor() corC <- mxAlgebra( expression=solve(sqrt(I*C))%&%C, name ="rC" ) corD <- mxAlgebra( expression=solve(sqrt(I*D))%&%D, name ="rD" ) corE <- mxAlgebra( expression=solve(sqrt(I*E))%&%E, name ="rE" ) # Create Data Objects for Multiple Groups dataMZ <- mxData( observed=mzData, type="raw" ) dataDZ <- mxData( observed=dzData, type="raw" ) # Create Expectation Objects for Multiple Groups expMZ <- mxExpectationNormal( covariance="expCovMZ", means="expMean", dimnames=selVars ) expDZ <- mxExpectationNormal( covariance="expCovDZ", means="expMean", dimnames=selVars ) funML <- mxFitFunctionML() # Create Model Objects for Multiple Groups pars <- list(pathB1, pathB2, meanG, matI, invSD, pathA, pathC,pathD, pathE, covA, covC,covD, covE, covP, corA, corC,corD, corE) defs <- list(defAge, defSex) modelMZ <- mxModel( name="MZ", pars, defs, expMean, covMZ, expCovMZ, dataMZ, expMZ, funML ) modelDZ <- mxModel( name="DZ", pars, defs, expMean, covDZ, expCovDZ, dataDZ, expDZ, funML ) multi <- mxFitFunctionMultigroup( c("MZ","DZ") ) # Create Algebra for Variance Components rowVC <- rep('VC',nv) colVC <- rep(c('A','C','D','E','SA','SC','SD','SE'),each=nv) estVC <- mxAlgebra( expression=cbind(A,C,D,E,A/V,C/V,D/V,E/V), name="VC", dimnames=list(rowVC,colVC)) # Create Confidence Interval Objects ciADCE <- mxCI(c( "MZ.rA", "MZ.rC", "MZ.rE","MZ.A","MZ.C","MZ.E","VC")) # Build Model with Confidence Intervals modelADCE <- mxModel( "twoACDEca", pars, modelMZ, modelDZ, multi, estVC, ciADCE ) # ------------------------------------------------------------------------------ # RUN MODEL # Run ACE Model fitADCE <- mxTryHard( modelADCE, intervals=F ) sumADCE <- summary( fitADCE ) # Compare with Saturated Model mxCompare( fit, fitADCE ) # Print Goodness-of-fit Statistics & Parameter Estimates fitGofs(fitADCE) fitEsts(fitADCE) # ------------------------------------------------------------------------------ # RUN SUBMODELS modelACE <- mxModel( fitADCE, name="twoACEmodca" ) modelACE <- omxSetParameters(modelACE, labels=c("d11","d22","d33"),free=FALSE, values = 0) fitACE <- mxTryHard( modelACE, intervals=F ) mxCompare( fitADCE, fitACE ) fitGofs(fitACE) fitEsts(fitACE) # ------------------------------------------------------------------------------ # RUN SUBMODELS modelADE <- mxModel( fitADCE, name="twoADEmodca" ) modelADE <- omxSetParameters(modelADE, labels=c("c11","c22"),free=FALSE, values = 0) fitADE <- mxTryHard( modelADE, intervals=F ) mxCompare( fitADCE, fitADE ) fitGofs(fitADE) fitEsts(fitADE) # Run AE model modelAE <- mxModel( fitADCE, name="twoAEca" ) modelAE <- omxSetParameters(modelAE, labels=c("d11","d22","d33","c11","c22"),free=FALSE, values = 0) fitAE <- mxRun( modelAE, intervals=F ) mxCompare( fitADCE, fitAE ) fitGofs(fitAE) fitEsts(fitAE)
Hi Juan,
As I understand the question, it seems like you'd like advice about selecting ACE vs ADE vs AE models when you have 5 phenotypes. My recommendation (and others are welcome to join in with theirs) is to basically ignore the single-phenotype results. Rather, run the model selection based on all five phenotypes. For the 5-phenotype model to run, I'd start with the least theoretical BG model: the Cholesky. Run Cholesky models with ACE, ADE, and AE components. See which seems to be the best description of the data.
You might consider using the umx package for much of this. It's quite easy to specify these models with far less error-prone code, and you still get OpenMx running it underneath.
Cheers,
Mike
Hi! I stumbled by chance over this post and I am curious about the approach of Juan as I could not find any reference to it in the literature though the underlying problem ought to happen from time to time: What do you do in a multivariate model if one of your variables has a strong C-component while another one exhibits no C but significant non-additivity? Can you theoretically fit a multivariate model with a C-component for the former and a D-component for the latter variable as Juan tried? And either way - what would you make of the fact that parameters in the multivariate model change drastically compared to the univariate models? Does this indicate an issue with either the specification of the multivariate or the univariate model(s) or can such a shift be interpreted in a theoretically sound way?
Thanks for your help!
Tobias
Sure. You just have to be careful that the appropriate parameters are fixed to zero, for model identification.
Maybe in some cases it would. But look at it this way. If you go from fitting five monophenotype models to fitting one pentaphenotype model as Juan did, you suddenly go from decomposing the variance of one trait at a time to decomposing both the the five traits' variances and their covariances. That adds a lot of new information! Besides, you're also now estimating more free parameters from the same-size sample, so your parameter estimates will be subject to more sampling error.
Fascinating. I'm looking forward to experiment a little bit with this approach. Thanks!