Hi, and very belated Happy New Year!
I'm an undergrad working on my dissertation, and I've managed to build my first models thanks to these forums, the Boulder Workshops, and Hermine's wonderful stash of scripts. I have several questions, however, to check that I'm doing things the right way and that what I want to do is possible.
My main question is how the variance components of different measures known to affect reading ability (e.g., WM, VSTM) differ across the distribution when you go from readers scoring 1.5 SD below the mean to the crème de la crème.
Topical papers:
Heritability Across the Distribution: An Application of Quantile Regression
(1) Extending from a previous answer, is this the correct way to parameterise non-twin siblings? My data include families of up to nine members. I hope to use the siblings in families of up to four members and to include a twin-specific effect, because twins have delayed language development and there seems to be a cognitive cost to being a twin. This is how I have parameterised the twin effect, by including the paths and matrices for the variance components:
# Matrices declared to store a, c, tw, and e path coefficients
pathA  <- mxMatrix( type="Lower", nrow=nv, ncol=nv, free=TRUE, values=valDiag(svPa, nv), labels=labLower("a", nv), name="a" )
pathC  <- mxMatrix( type="Lower", nrow=nv, ncol=nv, free=TRUE, values=valDiag(svPct, nv), labels=labLower("c", nv), name="c" )
pathTw <- mxMatrix( type="Lower", nrow=nv, ncol=nv, free=TRUE, values=valDiag(svPct, nv), labels=labLower("tw", nv), name="tw" )
pathE  <- mxMatrix( type="Lower", nrow=nv, ncol=nv, free=TRUE, values=valDiag(svPe, nv), labels=labLower("e", nv), name="e" )

# Matrices generated to hold A, C, Tw, and E computed variance components
covA  <- mxAlgebra( expression=a %*% t(a), name="A" )
covC  <- mxAlgebra( expression=c %*% t(c), name="C" )
covTw <- mxAlgebra( expression=tw %*% t(tw), name="Tw" )
covE  <- mxAlgebra( expression=e %*% t(e), name="E" )

# Total variance and standardised variance components
covP <- mxAlgebra( expression=A+C+Tw+E, name="V" )
StA  <- mxAlgebra( expression=A/V, name="h2" )
StC  <- mxAlgebra( expression=C/V, name="c2" )
StE  <- mxAlgebra( expression=E/V, name="e2" )
StTw <- mxAlgebra( expression=Tw/V, name="tw2" )
as well as the twin covariance matrices
# Row/column order: twin 1, twin 2, sibling 1, sibling 2
covMZ <- mxAlgebra( expression= rbind(
    cbind(A+C+Tw+E,  A+C+Tw,    0.5%x%A+C, 0.5%x%A+C),
    cbind(A+C+Tw,    A+C+Tw+E,  0.5%x%A+C, 0.5%x%A+C),
    cbind(0.5%x%A+C, 0.5%x%A+C, A+C+E,     0.5%x%A+C),
    cbind(0.5%x%A+C, 0.5%x%A+C, 0.5%x%A+C, A+C+E)),
    name="ExpCovMZ" )
covDZ <- mxAlgebra( expression= rbind(
    cbind(A+C+Tw+E,     0.5%x%A+C+Tw, 0.5%x%A+C, 0.5%x%A+C),
    cbind(0.5%x%A+C+Tw, A+C+Tw+E,     0.5%x%A+C, 0.5%x%A+C),
    cbind(0.5%x%A+C,    0.5%x%A+C,    A+C+E,     0.5%x%A+C),
    cbind(0.5%x%A+C,    0.5%x%A+C,    0.5%x%A+C, A+C+E)),
    name="ExpCovDZ" )
(2) I have managed to run univariate moderation models using both OpenMx and umx (though umx only takes twins 1 and 2 and leaves the siblings be), but is there a way to realistically run a multivariate moderation model, where the moderator (reading ability) is a continuous variable and is different for each family member? What would be necessary additions to the code to make this happen, and is there some statistical witchcraft beyond my comprehension that makes this a dodgy approach?
Thanks.
For the multivariate extension, letting nv be two or more, so that pathA is a larger matrix of lower-triangular Cholesky components and covA is then the full covariance of additive genetics for multiple variables, basically gets it done. The process is similar for all the other variance components: C, E, Tw.

For the moderation, you need to start using definition variables, one per moderator. Some complications can arise, but these are relatively straightforward extensions of "sex limitation" and "GxE" scripts.
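A minimal sketch of the definition-variable idea for the A component only, under some assumptions that are not from this thread: nv = 2, and each twin's moderator sits in raw-data columns named read_t1 and read_t2 (hypothetical names). The means model, the other variance components, and the extra siblings are left out; they would follow the same pattern.

```r
library(OpenMx)

nv <- 2  # number of phenotypes (assumed for illustration)

# Unmoderated Cholesky paths, plus a second set of paths that
# scales linearly with the moderator
pathA  <- mxMatrix(type="Lower", nrow=nv, ncol=nv, free=TRUE,
                   values=0.5 * diag(nv), name="a")
pathAm <- mxMatrix(type="Lower", nrow=nv, ncol=nv, free=TRUE,
                   values=0, name="aM")

# Definition variables: fixed 1x1 matrices whose "data." labels pull each
# person's moderator value from the raw data (column names are hypothetical)
modT1 <- mxMatrix(type="Full", nrow=1, ncol=1, free=FALSE,
                  labels="data.read_t1", name="ModT1")
modT2 <- mxMatrix(type="Full", nrow=1, ncol=1, free=FALSE,
                  labels="data.read_t2", name="ModT2")

# Person-specific paths: a + moderator * aM
aT1 <- mxAlgebra(expression=a + ModT1 %x% aM, name="aT1")
aT2 <- mxAlgebra(expression=a + ModT2 %x% aM, name="aT2")

# Moderated additive genetic blocks: within person (A11, A22)
# and across persons (A12)
covA11 <- mxAlgebra(expression=aT1 %*% t(aT1), name="A11")
covA12 <- mxAlgebra(expression=aT1 %*% t(aT2), name="A12")
covA22 <- mxAlgebra(expression=aT2 %*% t(aT2), name="A22")
```

In the expected covariance, A11 and A22 would replace A on the diagonal blocks and A12 would replace A in the cross-twin block (0.5 %x% A12 for DZ pairs and siblings). Each additional family member needs its own definition-variable matrix and its own set of blocks.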
I seem to have forgotten to leave my thanks; your answer was both concise and insightful.
Hi
One thing I think should be modified is the expectation that the phenotypic variance of twins differs from that of non-twins. While this might occur for various reasons, I don't see that it's necessary. Presumably, whatever twins are sharing is a source of variation in the general population, and therefore should contribute to non-twins' variances but not to non-twins' covariances. By analogy, separated-at-birth MZ twin pairs would typically be modeled as A+C+E for the variance, but only A for the covariance. The idea is that the environmental events shared by raised-together MZ pairs still generate variation in each twin; they just aren't the same environmental events any more. Thus:
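The code that followed this point is not preserved here, so the block below is only a sketch of how the MZ expectation might look under this suggestion. It assumes the A, C, Tw and E algebras and the twin 1, twin 2, sibling 1, sibling 2 ordering used earlier in the thread. The only change is that the siblings' own variances now include Tw, while every covariance involving a sibling stays at 0.5A + C.

```r
library(OpenMx)

# Row/column order: twin 1, twin 2, sibling 1, sibling 2.
# Tw now appears in every diagonal block (all variances),
# but only in the twin1-twin2 covariance block.
covMZ <- mxAlgebra( expression= rbind(
    cbind(A+C+Tw+E,  A+C+Tw,    0.5%x%A+C, 0.5%x%A+C),
    cbind(A+C+Tw,    A+C+Tw+E,  0.5%x%A+C, 0.5%x%A+C),
    cbind(0.5%x%A+C, 0.5%x%A+C, A+C+Tw+E,  0.5%x%A+C),
    cbind(0.5%x%A+C, 0.5%x%A+C, 0.5%x%A+C, A+C+Tw+E)),
    name="ExpCovMZ" )
```

The DZ expectation would change in the same way: the two sibling diagonal blocks become A+C+Tw+E and everything else is left as it was.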
A second point concerns the use of paths instead of variance components. Paths force a lower bound of zero on each component's contribution to the variance. This can cause statistical problems, articulated by Brad Verhulst and colleagues (https://pubmed.ncbi.nlm.nih.gov/30569348), who also provide a solution. Briefly, it is to estimate the variances of the latent A, C, E and Tw variables directly, and to fix the path coefficients (regressions) on them to 1. This becomes especially important in the multivariate case.
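One way this might be written in the matrix style used above is to declare A, C, Tw and E as free symmetric matrices with no lower bounds, instead of building them from Cholesky paths. This is a sketch rather than the paper's own script; nv = 2 and the starting values are assumptions chosen for illustration.

```r
library(OpenMx)

nv <- 2  # number of phenotypes (assumed for illustration)

# Variance components estimated directly as symmetric matrices.
# No lbound is set, so estimates are allowed to go negative.
covA  <- mxMatrix(type="Symm", nrow=nv, ncol=nv, free=TRUE,
                  values=0.3 * diag(nv), name="A")
covC  <- mxMatrix(type="Symm", nrow=nv, ncol=nv, free=TRUE,
                  values=0.2 * diag(nv), name="C")
covTw <- mxMatrix(type="Symm", nrow=nv, ncol=nv, free=TRUE,
                  values=0.1 * diag(nv), name="Tw")
covE  <- mxMatrix(type="Symm", nrow=nv, ncol=nv, free=TRUE,
                  values=0.4 * diag(nv), name="E")

# Total phenotypic variance, as in the path version
covP  <- mxAlgebra(expression=A+C+Tw+E, name="V")
```

Because the matrices keep the names A, C, Tw and E, the expected covariance algebras that refer to them should not need to change; the path matrices a, c, tw and e would simply be dropped from the model.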