Bivariate ACE model with moderator

Posted on Mon, 11/09/2020 - 03:42

EmilieRH Joined: 09/24/2020

Forums

Hi all,

I am currently working on a project on how school achievement moderates intelligence among young adults. My sample consists of 4,084 German twins aged 17 and 23-24, respectively, who have information on both GPA and intelligence. However, the twins' GPAs come from various types of secondary schools so to check whether I need to use school type as a moderator in my GxE analyses, I first want to see what happens if I run separate bivariate variance decomposition analyses for the three school types constraining their parameters equal.

In other words, I would like to set up a three-group program running the bivariate variance decomposition analyses for all three school types at once. I have tried to modify Dr. Maes’ twoACEc.R script, which uses the Cholesky parametrization to estimate a bivariate ACE model. When I just modify it to include my two variables, GPA and IQ, it produces reasonable estimates (the estimates are similar to the ones I obtain using the umxACEv function). However, when I try to take school type into account by building six mxModels (3 school types X MZ & DZ), adding them all to the multigroup function, and running the final model, some of my path coefficients are above |1| and some of my genetic and environmental correlations are negative. Can you help me figure out what the problem is in the attached script "GPA-moderation of intelligence (example).R"?

I use the standardized residuals of my two variables of interest as data input.

All the best,
Emilie

Replied on Mon, 11/09/2020 - 13:47

AdminNeale Joined: Mar 01, 2013

Standardized?

It's not really possible to help you without the data file, which I expect you cannot share due to privacy issues. However, it would be possible for you to share a similar, randomly generated data file by using the mxGenerateData() function. If you do that and the fake data(!) also show the problem, we could look at that.

My initial thoughts are:
1. You are not looking at standardized solutions because with a Cholesky parameterization the correlation cannot end up outside the range of -1 to +1.
2. It is probably a Bad Idea to use a moderator which may have been caused by the variable you are looking at. In the present case, it seems highly likely that school achievement is partly caused by intelligence. The Purcell model can accrue substantial biases if this assumption is violated.
3. Sometimes negative genetic and environmental covariances are to be expected, even when the within-person observed correlation is positive. In your case, this may be occurring because N is too small for at least one of the subgroups you have defined. However, note that a within person correlation of .1 might come about because of .2 contribution from environmental sources and -.1 from genetic sources. There is no need for the different sources to have the same sign of correlation.

Hope this helps!

Replied on Tue, 11/10/2020 - 14:43

EmilieRH Joined: Sep 24, 2020

Same problems with simulated data

Dear Dr. Neale,

I have now tried to generate a new, simulated data set ("TL_simulation.csv"). However, it seems that it is my script that is wrong because I still experience the same kind of problems with the simulated data file (see the attached script "GPA-moderation of intelligence (example).R"). With regard to your first point, I am pretty sure that I am looking at the standardized output, but, of course, something might have gone wrong when I tried to standardize my estimates such that they are not truly standardized. I am not completely sure that I understand your second point?

Best regards,
Emilie

File attachments

TL_simulation.csv

Replied on Thu, 11/12/2020 - 11:49

AdminNeale Joined: Mar 01, 2013

Use SE's for now?

I see the issue with Confidence Intervals being printed as NA. As the summary notes, with verbose=T one can see that there are active bound constraints at the solutions of the searches for the CIs. In my opinion, printing NA in these cases is not helpful, but I am not the OpenMx dictator and others disagree. If you calculate estimate +/- 1.96*se then you will probably get reasonable agreement with confidence intervals, although some may hit or exceed boundary values (e.g., 1 for a correlation). It would be interesting to see how far the not-printed CIs are from these statistics. If we could get a look at them at all...

I recommend that you remove all the lbounds! With your example script, it returned all the likelihood-based CIs very nicely:
confidence intervals: lbound estimate ubound note twoACE.US_s1[1,7] 2.7622823e-12 2.3787388e-02 0.380695606 twoACE.US_s1[1,9] 2.3730239e-01 5.2548829e-01 0.627789101 twoACE.US_s1[1,11] 3.2578623e-01 4.5072432e-01 0.557189131 twoACE.US_s1[2,7] -1.2648491e+00 -2.3501974e-01 1.284443876 twoACE.US_s1[2,9] -6.3544096e-01 5.8246561e-01 1.367774302 twoACE.US_s1[2,11] 8.9633394e-02 6.5255412e-01 1.288291329 twoACE.US_s1[2,8] 7.9002471e-14 1.3759429e-01 0.523668335 twoACE.US_s1[2,10] 2.2800910e-02 3.4215950e-01 0.526293004 twoACE.US_s1[2,12] 3.9194936e-01 5.2024622e-01 0.658898738 twoACE.US_s2[1,7] 5.3758214e-12 2.4702608e-03 0.119588156 twoACE.US_s2[1,9] 3.6653599e-01 4.7754030e-01 0.533532667 twoACE.US_s2[1,11] 4.6456539e-01 5.1998944e-01 0.579442713 twoACE.US_s2[2,7] -1.0970761e+00 -4.7811645e-02 0.451436129 twoACE.US_s2[2,9] 4.5927513e-01 9.5784097e-01 1.940133555 twoACE.US_s2[2,11] -3.4826444e-01 8.9970672e-02 0.488314704 twoACE.US_s2[2,8] 6.0666861e-14 1.0244595e-02 0.183611120 twoACE.US_s2[2,10] 3.5710450e-01 5.0418675e-01 0.560270183 twoACE.US_s2[2,12] 4.2489259e-01 4.8556865e-01 0.540932612 twoACE.US_s3[1,7] 2.4819866e-13 2.4819866e-13 0.139318811 !!! twoACE.US_s3[1,9] 2.5369645e-01 3.8204249e-01 0.447059059 twoACE.US_s3[1,11] 5.5114529e-01 6.1795751e-01 0.687954103 twoACE.US_s3[2,7] -2.3862687e-01 1.6512006e-12 0.630901884 twoACE.US_s3[2,9] 7.3582592e-01 1.2716111e+00 1.723964955 twoACE.US_s3[2,11] -7.3253015e-01 -2.7161107e-01 0.020392368 twoACE.US_s3[2,8] 6.3346161e-13 6.3346161e-13 0.116730799 !!! twoACE.US_s3[2,10] 4.2533366e-01 5.2451781e-01 0.574430672 twoACE.US_s3[2,12] 4.2009295e-01 4.7548219e-01 0.530138798

I'll bring this case up at the developers' meeting and ask them how happy everyone would be if their CIs reported as follows. At least we ought to advise eliminating some of the bounds (aka box constraints). IMO.

confidence intervals: lbound estimate ubound note twoACE.US_s1[1,7] NA 2.3783917e-02 NA !!! twoACE.US_s1[1,9] NA 5.2548942e-01 NA !!! twoACE.US_s1[1,11] NA 4.5072666e-01 0.5571892 !!! twoACE.US_s1[2,7] NA -2.3503545e-01 NA !!! twoACE.US_s1[2,9] NA 5.8247679e-01 NA !!! twoACE.US_s1[2,11] NA 6.5255865e-01 NA !!! twoACE.US_s1[2,8] NA 1.3759220e-01 NA !!! twoACE.US_s1[2,10] NA 3.4216103e-01 NA !!! twoACE.US_s1[2,12] NA 5.2024677e-01 NA !!! twoACE.US_s2[1,7] NA 2.4695322e-03 NA !!! twoACE.US_s2[1,9] NA 4.7754056e-01 NA !!! twoACE.US_s2[1,11] NA 5.1998991e-01 NA !!! twoACE.US_s2[2,7] NA -4.7800651e-02 NA !!! twoACE.US_s2[2,9] NA 9.5783528e-01 NA !!! twoACE.US_s2[2,11] NA 8.9965372e-02 NA !!! twoACE.US_s2[2,8] NA 1.0242924e-02 NA !!! twoACE.US_s2[2,10] NA 5.0418810e-01 NA !!! twoACE.US_s2[2,12] NA 4.8556898e-01 NA !!! twoACE.US_s3[1,7] NA 2.0478024e-08 NA !!! twoACE.US_s3[1,9] NA 3.8204245e-01 NA !!! twoACE.US_s3[1,11] 0.55114536 6.1795753e-01 NA !!! twoACE.US_s3[2,7] NA 1.0128763e-07 NA !!! twoACE.US_s3[2,9] NA 1.2716104e+00 NA !!! twoACE.US_s3[2,11] NA -2.7161051e-01 NA !!! twoACE.US_s3[2,8] NA 1.5286060e-08 NA !!! twoACE.US_s3[2,10] NA 5.2451759e-01 NA !!! twoACE.US_s3[2,12] 0.42009278 4.7548240e-01 NA !!!

Replied on Fri, 11/13/2020 - 04:27

EmilieRH Joined: Sep 24, 2020

Path coefficients exceed |1|

Hi again,

Removing the lbounds solved the issue with the confidence intervals.

However, my main problem is that some of the estimated path coefficients exceed |1|. Moreover, the genetic correlations for school type 1 and school type 2 are negative, which does not make sense as we are looking at the genetic influences on GPA and intelligence. Therefore, I think I have messed up something when I tried to modify Dr. Maes’ twoACEc.R script to include three bivariate ACE models, one for each type of secondary school, in one mxFitFunctionMultigroup. Can you help me figure out what the problem is?

Replied on Mon, 11/16/2020 - 11:52

AdminNeale Joined: Mar 01, 2013

Try with each of the groups separately

If you try the three samples separately, you may get a good sense of what to expect when they are analyzed jointly.

Note that an estimate of a path coefficient is just that - an estimate. Sometimes the model may be configured so that correlations greater than 1 may be seen for some of the model parameters. If you were to split the indicators of a one-factor model to make a correlated two-factor model, and allow the factors to correlate, you could see estimates of the factor correlation exceed 1. You might see this about half the time if the one-factor model was the true world model, because the method would estimate the factor correlation (it's 1.0 in the population) but sampling error might make the estimate a bit bigger or smaller.

Now, when you look at what the model predicts for the correlations/covariances (check out mxGetExpected) - even though it may have a factor correlation of 1.8 or something silly - the predicted correlations among the measures should look fairly sane, with no predicted correlations greater than 1.0 in absolute value.

When we partition a phenotypic correlation, some of the constituent correlations may be of opposite sign. This isn't that unusual, although it isn't random because we usually apply multivariate BG methods to partition known substantial phenotypic correlations. However, if you imaging applying the method to pairs of variables whose within-person correlation is zero, you would likely get some positive covariance from A, C or E, with some negative covariance generated by one or more of the remaining sources. The model fitting will essentially 'try' to get all the variances and covariances right. How well it does is measured by various goodness-of-fit statistics such as those reported by summary().

HTH!

Replied on Tue, 11/17/2020 - 12:07

EmilieRH Joined: Sep 24, 2020

Thank you!

Thank you so much for your assistance! I do not know why I had not thought of running the bivariate ACE models for the three samples separately, but it turned out that the parameter estimates actually were similar to the ones I obtain when I use my multi-group model to estimate separate parameters for the three school types at once.

Replied on Wed, 11/18/2020 - 09:41

Leo Joined: Jan 09, 2020

maybe try verifying your

maybe try verifying your results with umx. I'm not well versed in research on GPAs and cognitive ability. However, I would further investigate a negative genetic correlation to rule out errors. Control for age and sex and see if that does make a difference. Also, check out the MZ/DZ balance within each separate model (for each school type) and how many twin pairs you have available.

News

Recent Posts

Bivariate ACE model with moderator

Standardized?

Same problems with simulated data

Use SE's for now?

Path coefficients exceed |1|

Try with each of the groups separately

Thank you!

maybe try verifying your

News

Recent Posts