I've been using twin models for a few years now and this is my first time running into this problem.

I am interested in testing whether emotion-specific processing is heritable. To do this, I've tried using a difference score (and am estimating whether the difference in processing between emotion X and emotion Y is heritable) and I've also tried taking a residual from regressing one emotion on another.

I've noticed that my ICCs are pretty close to zero and sometimes MZ ICC is negative while DZ ICC is positive, and sometimes it's the opposite. In most cases, the E model is the best fit in line with the ICCs not showing any strong correlations between twins. However some emotions do have significant A and/or C components. I have 400 participants with about 2/3 being MZ twins. Same-sex pairs only and no non-twin siblings.

As the title says, I'm getting negative ACE estimates where the A and C component cancel each other out. For example, one ACE model result is 0.41, -0.41, 0.99 for the ACE components, respectively. Another model is 0.70, -0.48, 0.78.

When the AE or CE models are the best fit, usually I end up with two positive components. But one significant AE model results in A = -0.18 and E = 1.18. The full ACE model for this was originally -0.35, 0.15, 1.20 and the ICCs were -0.16 for MZ twins and -0.02 for DZ.

I get similar results regardless of whether I use the difference score or the residual score, and I've used the same script multiple times without this problem (and I got the script from a behaviour genetics course I took). This is however my first time trying to assess heritability using a difference score or residual. I've also tried the ADE model and get the same pattern still, but the ACE models fit better.

What I want to know is whether negative ACE components are valid and if so, how to interpret them? If it's more likely to be a model misspecification I can share my script, but I didn't get any warnings. If there are any other posts on this I would love to read it as I couldn't find any using the search.

Thank you very much for your help!

Hi

Your data seem to have some of the weirdest twin correlations. In my experience, negative twin correlations usually arise from outliers in the data. The variance component estimates you are getting seem consistent with the correlations. One thing you mention may be the root cause, taking difference scores. In the attached diagram (yes, making a structural model for a difference score is just as simple as putting in +1 and -1 paths and no residual variance for the resulting difference) a c and e variance components for the two traits covary rA rC and rE. Let's see what the variance associated with A is for the difference score: a1^2 + a2^2 - 2*a1*a2*rA. In the event that rA is zero, it would be just a1^2 + a2^2, which would not give you odd answers. However, if rA=1, then you would have a1^2 + a2^2 - 2a1a2 = (a1-a2)(a1-a2), which is zero when a1=a2. Now, although the model doesn't predict negative genetic variance, if in reality rG is near 1.0 (and similarly for rC and rE) then sampling variation might sometimes return negative rMZ and negative rDZ.

Possibly, you have high rA rC and rE (highly correlated observations from which you are taking the difference) which leaves little variation to partition after the difference has been taken.

The bigger negative MZ than DZ correlations (-.16 MZ, -.02 DZ) might be sampling variation again - with 400 individuals, perhaps imbalance in N between MZ and DZ samples leaves you with under 100 pairs in one or other group. 100 isn't a large number for the purposes of estimating correlations, which have an SE of about sqrt(1/(N-3)) or approx .1 for N=100. The -.16 for MZ isn't strikingly far off zero by this metric.

To explore the question further, you could try setting up the full bivariate model, plug in values for a1...e2, rA rC and rE and just calculate the expected variance of the difference score, and the size of the variance components that the a1...e2 rA rC and rE parameters each generate in the outcome difference.

Thanks for explaining that in terms of the bivariate model. I'll give that a try. I think you're right that this is most likely coming from rA being close to 1 since the processing of the different emotions is very similar. I'll fit the bivariate models and hopefully clear this up.

Do you mind walking through how you get the a1^2 + a2^2 - 2*a1*a2*rA equation from the diagram? I know a1^2 and a2^2 represent the heritability of each trait, and that a1*a2*rA calculates the correlation between the traits explained by shared heritability. Main questions are why the a1^2 and a2^2 are added, and why the -2 is there for the a1a2*rA.

Also, is there a textbook or publication that outlines this that I could cite in a paper?

Thanks again for your help!

a1^2 and a2^2 are the additive-genetic variance components of each trait. They will only be (narrow-sense) heritability coefficients if the traits have been scaled to have unit variance. rA is the genetic correlation of the two traits, whereas a1*a2*rA is the genetic covariance of the two traits (since a1 and a2 are the "genetic standard deviations" of each trait). The a1^2 + a2^2 - 2*a1*a2*rA expression follows from path-tracing rules (see attachment), or from an identity for the variance of a sum.

Thanks Rob, that's a big help!

You're welcome. Glad to be of help.