Hello,
I am trying to do a univariate analysis with a binary variable. I am not sure how to treat siblings
I have created different groups MZ, DZ, SIB (which I think is the same as introduce SIB in the DZ group).
I have also checked if DZ and SIB can be equated through the saturated model and there are not significant differences when I equated means and variances between DZ and Sib
I would like to know if I can use the siblings in my analysis (correlations are a little bit different) or should I do something different to introduce siblings in my analysis?
I have read this thread "https://openmx.ssri.psu.edu/thread/4086" but I am not sure if I need to check the "twin effects" and what should I change in the script
Thank you so much
Here the script:
# Select Variables for Analysis vars <- 'APN' # list of variables names nv <- 1 # number of variables ntv <- nv*2 # number of total variables selVars <- paste(vars,c(rep(1,nv),rep(2,nv)),sep="") # Select Covariates for Analysis covVars <- c('age1',"age2", "Sex1" , "Sex2") nc <- 4 # Select Data for Analysis ordData <- twinData mzData <- subset(ordData, Zyg==1| Zyg==3, c(selVars, covVars)) dzData <- subset(ordData, Zyg==2 | Zyg==4| Zyg==5 , c(selVars, covVars)) sibData <- subset(ordData,Zyg==6| Zyg==7 | Zyg==8 , c(selVars, covVars)) mzDataF <- mzData dzDataF <- dzData mzDataF$APN1 <- mxFactor(mzDataF$APN1, levels =c(0,1)) mzDataF$APN2 <- mxFactor(mzDataF$APN2, levels =c(0,1)) dzDataF$APN1 <- mxFactor(dzDataF$APN1, levels =c(0,1)) dzDataF$APN2 <- mxFactor(dzDataF$APN2, levels =c(0,1)) sibData$APN1 <- mxFactor(sibData$APN1, levels =c(0,1)) sibData$APN2 <- mxFactor(sibData$APN2, levels =c(0,1)) # Set Starting Values svTh <- .8 # start value for thresholds svPa <- .4 # start value for path coefficient svPe <- .8 # start value for path coefficient for e lbPa <- .0001 # start value for lower bounds # ------------------------------------------------------------------------------ # PREPARE MODEL # ACE Model # Create Matrices for Covariates and linear Regression Coefficients defAge <- mxMatrix( type="Full", nrow=1, ncol=2, free=FALSE, labels=c("data.age1", "data.age2"), name="Age" ) pathB1 <- mxMatrix( type="Full", nrow=1, ncol=2, free=TRUE, values=.01, label=c("b11", "b12"), name="b1" ) defSex <- mxMatrix( type="Full", nrow=1, ncol=2, free=FALSE, labels=c("data.Sex1","data.Sex2"), name="Sex" ) pathB2 <- mxMatrix( type="Full", nrow=1, ncol=2, free=TRUE, values=.01, label=c("b21", "b22"), name="b2" ) # Create Algebra for expected Mean & Threshold Matrices meanG <- mxMatrix( type="Zero", nrow=1, ncol=ntv, name="meanG" ) expMean <- mxAlgebra( expression= meanG + b1*Age + b2*Sex, name="expMean" ) threG <- mxMatrix( type="Full", nrow=1, ncol=ntv, free=TRUE, values=svTh, labels="tob", name="threG" ) # Create Matrices for Path Coefficients pathA <- mxMatrix( type="Lower", nrow=nv, ncol=nv, free=TRUE, values=svPa, label="a11", lbound=lbPa, name="a" ) pathC <- mxMatrix( type="Lower", nrow=nv, ncol=nv, free=TRUE, values=svPa, label="c11", lbound=lbPa, name="c" ) pathE <- mxMatrix( type="Lower", nrow=nv, ncol=nv, free=TRUE, values=svPe, label="e11", lbound=lbPa, name="e" ) # Create Algebra for Variance Components covA <- mxAlgebra( expression=a %*% t(a), name="A" ) covC <- mxAlgebra( expression=c %*% t(c), name="C" ) covE <- mxAlgebra( expression=e %*% t(e), name="E" ) # Create Algebra for expected Variance/Covariance Matrices in MZ & DZ twins covP <- mxAlgebra( expression= A+C+E, name="V" ) covMZ <- mxAlgebra( expression= A+C, name="cMZ" ) covDZ <- mxAlgebra( expression= 0.5%x%A+ C, name="cDZ" ) covSIB <- mxAlgebra( expression= 0.5%x%A+ C, name="cSIB" ) expCovMZ <- mxAlgebra( expression= rbind( cbind(V, cMZ), cbind(t(cMZ), V)), name="expCovMZ" ) expCovDZ <- mxAlgebra( expression= rbind( cbind(V, cDZ), cbind(t(cDZ), V)), name="expCovDZ" ) expCovSIB <- mxAlgebra( expression= rbind( cbind(V, cSIB), cbind(t(cSIB), V)), name="expCovSIB" ) # Constrain Variance of Binary Variables var1 <- mxConstraint( expression=diag2vec(V)==1, name="Var1" ) # Create Data Objects for Multiple Groups dataMZ <- mxData( observed=mzDataF, type="raw" ) dataDZ <- mxData( observed=dzDataF, type="raw" ) dataSIB <- mxData( observed=sibData, type="raw" ) # Create Expectation Objects for Multiple Groups expMZ <- mxExpectationNormal( covariance="expCovMZ", means="expMean", dimnames=selVars, thresholds="threG" ) expDZ <- mxExpectationNormal( covariance="expCovDZ", means="expMean", dimnames=selVars, thresholds="threG" ) expSIB <- mxExpectationNormal( covariance="expCovSIB", means="expMean", dimnames=selVars, thresholds="threG" ) funML <- mxFitFunctionML() # Create Model Objects for Multiple Groups pars <- list(pathB1, pathB2, meanG, threG, pathA, pathC, pathE, covA, covC, covE, covP) defs <- list(defAge, defSex) modelMZ <- mxModel( name="MZ", pars, defs, expMean, covMZ, expCovMZ, dataMZ, expMZ, funML ) modelDZ <- mxModel( name="DZ", pars, defs, expMean, covDZ, expCovDZ, dataDZ, expDZ, funML ) modelSIB <- mxModel( name="SIB", pars, defs, expMean, covSIB, expCovSIB, dataSIB, expSIB, funML ) multi <- mxFitFunctionMultigroup( c("MZ","DZ","SIB") ) # Create Algebra for Variance Components rowVC <- rep('VC',nv) colVC <- rep(c('A','C','E','SA','SC','SE'),each=nv) estVC <- mxAlgebra( expression=cbind(A,C,E,A/V,C/V,E/V), name="VC", dimnames=list(rowVC,colVC)) # Create Confidence Interval Objects ciACE <- mxCI( "VC" ) # Build Model with Confidence Intervals modelACE <- mxModel( "oneACEba", pars, var1, modelMZ, modelDZ,modelSIB, multi, estVC, ciACE ) # ------------------------------------------------------------------------------ # RUN MODEL # Run ACE Model fitACE <- mxRun( modelACE, intervals=T ) sumACE <- summary( fitACE )
At minimum, change the appropriate lines to this instead:
If you were to fix parameter "tw11" to zero, siblings and DZ twins would have equivalent model-expectations.
Thank you so much Rob.
I have used that in a univariate model and it works perfect!
I am also working with a tetravariate model (3 continuous variables and 1 binary).
I have made the changes and I have compared the ADE model (with TW effect) and ADE model (without TW effect) the comparison is non-significant p=0.96 and TW is less than 1% in all the variables
So, can I assume that there are not differences between sib and DZ and use Sib in my analysis?
I am also a little bit confuse about “Cholesky decomposition” and “correlated factor model” my results are from the Cholesky decomposition, right? How can I check the correlated factor model? Do I need a different script? I have searched it in Boulder materials but I am not completely sure.
Finally, can I know with this script the genetic/environmental influences shared by the four variables at the same time?
Thank you so much in advance!!!!
It would be more accurate to say that you find no evidence of twin-specific environmental effects, and therefore, you chose to treat full sibs like DZ twins in subsequent analyses.
Why do you ask? In most cases, the Cholesky and correlated-factor "models" are equivalent parameterizations of the same model. The only exception I can think of offhand is in the case of "non-scalar" or "qualitative" sex-limitation.
Yes, in the sense that it will estimate for you a 4x4 A, D, and E covariance matrix .
I thought they were different models (I am confused about it, sorry). Where can I find more information about that? more suggestions apart from Boulder’s materials and Neale’s book?
Here my results from the 4-trait script.
But I would like to know the genetic influences (also environmental influences) shared between the four variables. I mean I can know it for v1-v2, v1-v3,…… but can I know the shared genetic influences between v1, v2, v3 and v4 together?
By the way since I have a negative value for example in MZ.E[3,1] I can not know the % of the phenotypic correlation explained by genetic factors right?
Thank you so much again!!
Hmm. This is the kind of thing one typically learns from one's dissertation adviser and/or at behavior-genetics workshops (like the one in Boulder). Offhand, I can think of two articles that are somewhat relevant: Loehlin (1996) and Neale, Røysamb, & Jacobson (2006).
Oh, I see. Then, it sounds like you want to fit a common-pathway and/or independent-pathway model.
You can still estimate its contribution to the phenotypic correlation, but interpreting it as a proportion doesn't really make sense because of the negative value.
Thank you so much Rob!
I am going to work on the common-pathway and independent-pathway models.
Hello,
I am new to OpenMX, R, and ACTE modeling and am seeking feedback on my attempt had performing something similar to the OP.
In my instance, I am trying to do a univariate analysis with a continuous variable. My sample contains MZ, DZ, and siblings, and my siblings contain more than 2; in some cases they have 4 - 6 siblings. I followed the information below and incorporated siblings into my syntax. However, I received an error and I am unable to detect the source.
Any help would be greatly appreciated. Thank you!
> # Run ACE Model
> fitACE <- mxRun( modelACE, intervals=T )
Error in value[rows[[i]], cols[[i]]] <- startValue :
incorrect number of subscripts on matrix
> sumACE <- summary( fitACE )
Error in summary(fitACE) : object 'fitACE' not found
Please run
traceback()
after you see this error. What doestraceback()
give you?> traceback()
20: populateDefVarMatrix(matrix, model, defvar.row)
19: computeMatrix(matrix, model, labelsData, defvar.row, env, cache)
18: evaluateMatrix(lookup, model, labelsData, env, show, compute,
defvar.row, cache)
17: evaluateSymbol(formula, contextString, model, labelsData, env,
compute, show, outsideAlgebra, defvar.row, cache)
16: evaluateExpression(formula[[i]], contextString, model, labelsData,
env, compute, show, outsideAlgebra, defvar.row, cache)
15: evaluateExpression(formula[[i]], contextString, model, labelsData,
env, compute, show, outsideAlgebra, defvar.row, cache)
14: evaluateExpression(formula[[i]], contextString, model, labelsData,
env, compute, show, outsideAlgebra, defvar.row, cache)
13: evaluateExpression(algebra@formula, contextString, model, labelsData,
env, compute = TRUE, show, outsideAlgebra = FALSE, defvar.row,
cache)
12: computeAlgebra(algebra, model, labelsData, show = FALSE, defvar.row,
env, cache)
11: evaluateAlgebra(lookup, model, labelsData, env, show, compute,
defvar.row, cache)
10: evaluateSymbol(as.symbol(objname), objname, flatModel, labelsData,
globalenv(), compute = TRUE, show = FALSE, outsideAlgebra = FALSE,
defvar.row = 1, cache = cache)
9: eval(substitute(evaluateSymbol(x, objname, flatModel, labelsData,
globalenv(), compute = TRUE, show = FALSE, outsideAlgebra = FALSE,
defvar.row = 1, cache = cache), list(x = quote(as.symbol(objname)))))
8: eval(substitute(evaluateSymbol(x, objname, flatModel, labelsData,
globalenv(), compute = TRUE, show = FALSE, outsideAlgebra = FALSE,
defvar.row = 1, cache = cache), list(x = quote(as.symbol(objname)))))
7: evaluateMxObject(meansName, flatModel, labelsData, new.env(parent = emptyenv()))
6: updateExpectationDimnames(.Object, flatModel, labelsData)
5: genericExpConvertEntities(expectations[[i]], flatModel, namespace,
labelsData)
4: genericExpConvertEntities(expectations[[i]], flatModel, namespace,
labelsData)
3: expectationFunctionConvertEntities(flatModel, namespace, labelsData)
2: runHelper(model, frontendStart, intervals, silent, suppressWarnings,
unsafe, checkpoint, useSocket, onlyFrontend, useOptimizer,
beginMessage)
1: mxRun(modelACE, intervals = T)
OK. What do you get from
mxVersion()
?Also, does your dataset really contain columns named "age1" thru "age5" and "sex1" thru "sex5"?
Here you go.
OpenMx version: 2.14.11 [GIT v2.14.11-dirty]
R version: R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32
Default optimizer: CSOLNP
NPSOL-enabled?: Yes
OpenMP-enabled?: No
And yes, I have my covariances sex and age labeled for each child. Since the maximum number of children is 6, i created a label for each one, including sex1...sex6 and age1...age6. However, as per my coauthor, we will be moderating by sex, so I will only be controlling for age now.
Would it just be easier to standardize values by age, and avoid this?
Scratch that. We wont be moderating by sex, and will still control for sex and age still.
Your OpenMx version is 2 releases out-of-date. I think the first error message you're seeing is due to a bug, which has hopefully been fixed since version 2.14. First, try updating OpenMx, and see if the problem persists.
OK, I understand now. Just remember that if you use sex as a moderator, you need to also adjust for its main effect.
No, I think your current approach is good.
Hello,
I was able to get everything up and running. Turns out it was how I was loading my data; it should be uploaded as .DAT and as a matrix, not as excel.
Right off the bat, I tested twin environment effects by dropping the T, and while it was not significantly different, the AIC slightly improved, so I can treat DZ and siblings the same.
Question regarding that. Does that mean in the beginning, when we are creating distinct subsets of data for MZ, DZ, and siblings, do we code siblings as DZ?
conData <- twindata
mzData <- subset(conData, zyg==1, c(selVars, covVars))
dzData <- subset(conData, zyg==2 | zyg==3, c(selVars, covVars))
sibData <- subset(conData, zyg==4 | zyg==5, c(selVars, covVars))
becomes
conData <- twindata
mzData <- subset(conData, zyg==1, c(selVars, covVars))
dzData <- subset(conData, zyg==2 | zyg==3 | zyg==4 | zyg==5, c(selVars, covVars))
If this is the case, and we have instances of more than 2 siblings, would this matter?
I don't see why you'd need to change your datasets. Once you drop the T source of variance, then your model should in effect model full sibs the same way it models DZ twins, except that the full sibs in some families will be more like DZ triplets, quadruplets, quintuplets, or sextuplets. Granted, the full sibs in a family won't all have the same age the way both twins in a pair do, but that's why we adjust for the main effects of age.
Wonderful! I figured this was the case, but I wanted to make sure. I will just move forward with ACE rather than ACTE. Thank you!!