Hi All,

I'm trying to understand the relationship between multivariate ACE models (e.g., Cholesky model fit to two phenotypes) and multilevel behavior genetic models. Is there a formal comparison somewhere?

My sense is that multivariate ACE models conflate within and between pair variance. A and C could capture within and between effects while E captures the within pair effects. E will map right onto the within pair effect in the multilevel model (MLM). Is all this correct?

I think A and C are estimated at the within pair level in the MLM, though I saw one paper in which it looked like they were estimated at both levels. Is this correct? And, how does C estimated from a multivariate ACE model relate to the twin pair level deviance from the population mean in the MLM? I don't think they are equal but am not sure.

Any help is greatly appreciated!

-George

In response to "My sense is that multivariate ACE models conflate within and between pair variance": first, note that a multilevel implementation of an ACE model returns exactly the same estimates and log-likelihood as the multivariate method (preferably with a variance component specification rather than a Cholesky, which has problems*). Second, note that in MZ twins, within-pair variance is entirely E, whereas in DZ pairs within-pair variance also includes ~50% of A. Definition variables can be used to implement a single-group multilevel model for twin data. The key difference is at the data record level: one twin per line in MLM, but both twins on the same data line in a multivariate specification. The non-independence of observations in MLM is handled by internally joining the data to calculate the joint likelihood of the data from both twins.

*Verhulst, B., Prom-Wormley, E., Keller, M., Medland, S., Neale, M.C. (2019). Type I Error Rates and Parameter Bias in Multivariate Behavioral Genetic Models. Behav Genet. 49(1):99-111.
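The record-level difference described above can be sketched in a few lines of plain Python. This is only an illustration, not OpenMx code: the column names (`pair_id`, `zyg`, `y_t1`, `y_t2`, `r_a`) and the data values are made up, and `r_a` stands in for the kind of definition variable mentioned in the reply.

```python
# Hypothetical wide records: one line per pair, both twins on the same line
# (the layout used by a multivariate specification).
wide = [
    {"pair_id": 1, "zyg": "MZ", "y_t1": 10.2, "y_t2": 9.8},
    {"pair_id": 2, "zyg": "DZ", "y_t1": 11.5, "y_t2": 8.9},
]

# Assumed coding of the definition variable: genetic relatedness by zygosity.
R_A = {"MZ": 1.0, "DZ": 0.5}


def wide_to_long(wide_rows):
    """Reshape to one twin per line (the layout used by an MLM)."""
    long_rows = []
    for row in wide_rows:
        for twin in (1, 2):
            long_rows.append({
                "pair_id": row["pair_id"],   # links the two twins of a pair
                "twin": twin,
                "r_a": R_A[row["zyg"]],      # pair-level definition variable
                "y": row[f"y_t{twin}"],
            })
    return long_rows


long_rows = wide_to_long(wide)
```

In the long layout the pair identifier is what lets the software rejoin the two records of a pair when it computes their joint likelihood.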

Thanks so much for your response! I think I can see how the same ACE estimates are recovered, and the point about E makes complete sense. I'm trying to conceptualize the two approaches as the same model, but am confused on one point: couldn't some factor increase the twin pair mean while not affecting the similarity of the twins? This would be a factor common to twins that does not make them similar or different. In the MLM, this would presumably appear in the between-pairs variance component. How is this 'captured' in a multivariate ACE model? I know we can add a twin pair mean to it. I guess I'm asking if either genetic MLMs or multivariate ACE models are genetically informative about the sort of effect I'm describing.

BTW, neat paper on estimating the variance components so that Type I errors aren't less frequent than expected.

If there is a factor that is common to both twins in a pair and has variance in the population, and you don't/can't measure it, it will contribute to between-pair variance only. That is, it will necessarily be part of C. However, if you can and do measure it for each twin pair, you can put it into the model for the phenotypic means as a covariate. In that case, its influence will no longer be part of the model for the variance-covariance, but part of the model for the means. And if that factor truly tends to increase the mean as it itself increases, then you would expect to estimate its regression slope as positive (but of course its sign depends upon what other covariates, if any, are in the model).

Thanks so much for your response. Is it possible to have, and are there any examples of, cases where a predictor that varies only between pairs predicts the twin pair mean despite non-significant C for that phenotype?

And is it possible in a multivariate ACE model with phenotypes X and Y, to regress Y, which varies only between pairs, on the A and C components for X (which varies within and between)?

Thanks again! - George

By "only between" you mean a pairwise variable - one score per pair, right? C is conceptualized this way, as are family-level things like parents' income. It is perforce C. So in general, no, a multivariate ACE model would not be identified, in part due to the absence of statistics. In the usual diphenotype model there would be separate scores for the two twins, and thus a 4x4 covariance matrix with 9 statistics that have different expectations under the model: (rMZ, rDZ, Variance) x 2 + within person rP + the cross-twin cross-trait correlation (trait 1 with trait 2) in each of MZ and DZ pairs = 9. These statistics often get used to estimate the nine parameters of the bivariate model: A1, A2, C1, C2, E1, E2, rA, rC, and rE.

When one of the phenotypes is restricted to a single pair-level variable, one has 3 x 3 covariance matrices. This situation eliminates cross-twin cross-variable correlations (they're expected to be the same as within person rP). C is the sole source of variance for the pair measure, so its variance directly estimates C, with A and E for it both being zero. rP directly estimates the C covariance, which is also rC. rA and rE become irrelevant, and may be considered to be zero (or anything else really - they don't feature in the expectations of the model).
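As a rough sketch of the 3 x 3 case, the expected covariance matrix for (X in twin 1, X in twin 2, pairwise Y) can be written out directly. The function name, the parameter names, and the numeric values are all hypothetical and chosen only for illustration; `c12` denotes the C covariance between X and Y.

```python
def pairwise_model_cov(a1, c1, e1, c2, c12, r_a):
    """Expected covariance of (X_twin1, X_twin2, Y_pair).

    a1, c1, e1: variance components of X
    c2:         variance of the pairwise variable Y (all of it is C)
    c12:        C covariance between X and Y
    r_a:        1.0 for MZ pairs, 0.5 for DZ pairs
    """
    var_x = a1 + c1 + e1
    cov_x = r_a * a1 + c1          # cross-twin covariance of X
    return [
        [var_x, cov_x, c12],
        [cov_x, var_x, c12],       # each twin's X relates to Y only through C
        [c12,   c12,   c2],
    ]


# Illustrative values only, not estimates from any data set.
mz = pairwise_model_cov(0.5, 0.2, 0.3, c2=1.0, c12=0.3, r_a=1.0)
dz = pairwise_model_cov(0.5, 0.2, 0.3, c2=1.0, c12=0.3, r_a=0.5)
```

The X-with-Y entries are identical for both twins and for both zygosity groups, so only the cross-twin covariance of X differs between MZ and DZ; A and E for Y, and rA and rE, never appear in the expectations.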

It is fine to regress out a pairwise variable prior to analysis - iff the measures are continuous. Regressing variables out of ordinal data up front is not generally appropriate, so I can see an advantage to keeping it in the sort of half-breed univariate/bivariate model :).

Right. So for the model with a pairwise variable, C1 <--> C2 = rC = rP?

I think I'm moving forward with the half breed model. :-)

Is the random effect for between pairs variance, u0j, in the MLM equal to A+C in the ACE model? Seems so.

Thank you all for helping me achieve cognitive closure on these models!

The random effect for between pairs variance differs between MZ and DZ pairs. For MZ it is A+C, but for DZ it is .5A + C. We use a definition variable to implement the model in MLM land.
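A minimal sketch of that bookkeeping, assuming a variance component parameterization: the between-pair variance is r_a * A + C and the remaining within-pair variance is (1 - r_a) * A + E, where `r_a` plays the role of the definition variable (1.0 for MZ, 0.5 for DZ). The function name and the numeric values are hypothetical.

```python
def pair_variance_components(a2, c2, e2, r_a):
    """Split the total variance into between-pair and within-pair parts."""
    between = r_a * a2 + c2            # variance of the pair-level random effect
    within = (1.0 - r_a) * a2 + e2     # variance of twins around their pair effect
    return between, within


a2, c2, e2 = 0.5, 0.2, 0.3  # illustrative values only
mz_between, mz_within = pair_variance_components(a2, c2, e2, r_a=1.0)
dz_between, dz_within = pair_variance_components(a2, c2, e2, r_a=0.5)
```

With these values the MZ within-pair part is E alone, while the DZ within-pair part picks up half of A, and in both groups the two parts add up to A + C + E.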

Got it. Yeah that's right. Thanks, Mike. You're a saint. :-)