Viable chi square p values for ACE model comparison

Posted on
No user picture. rebecca.barron Joined: 07/01/2016
Hi there,

I'm very new to OpenMx and working with twin data (OpenMx version: 2.5.2 [GIT v2.5.2] R version: R version 3.2.4 Revised (2016-03-16 r70336) Platform: x86_64-w64-mingw32 Default optimiser: SLSQP).

I created a script based on online resources from this site, where I want to run saturated, ACE and submodels, controlling for age and gender.

As far as I can tell, the script is running, with no warnings/ errors. However, I am getting p values of 0 for the Chi square difference test for the ACE model compared to saturated. I am wondering if this is alright/ presentable?

When comparing submodels for the best fit, do we compare to saturated or to the ACE model? when comparing nested submodels to ACE, I am getting values of 1 and 0.

I am working with small numbers (45 MZ and 24 DZ pairs), and raw data (not normalised), would this be an issue?

I would just like someone to check over the script, so I can confidently present my results!

I've attached a script and an output for my BMI data.

Appreciate any help/ tips you can give! :)

Rebecca

Replied on Wed, 07/06/2016 - 10:12
Picture of user. AdminRobK Joined: 01/24/2014

I did not look at your script very closely, but something weird seems to be going on with the way you're using definition variables. It looks like all your twin pairs have the same age and sex, so it doesn't make much difference to your results. But, if that weren't the case, it WOULD matter, and further, I think the estimates you're getting of the regression coefficients are one half what they should be. The syntax

mxMatrix(type="Full",nrow=2,ncol=2,free=F,label=c("data.Age1", "data.Age2"), name="Age")

creates a 2x2 MxMatrix with the following labels matrix,

$labels
[,1] [,2]
[1,] "data.Age1" "data.Age1"
[2,] "data.Age2" "data.Age2"

You're then premultiplying this matrix by a 1x2 matrix of coefficients to get the contribution of age to the phenotypic mean. In other words, per your syntax, the effect of age on the phenotypic mean for twin #1 is bAge times twin #1's age, plus bAge times twin #2's age. And, likewise for twin #2. In general, I don't think you want to do it this way!

As far as I can tell, the script is running, with no warnings/ errors. However, I am getting p values of 0 for the Chi square difference test for the ACE model compared to saturated. I am wondering if this is alright/ presentable?

Your syntax rounds the p-values to three decimal places, so really small p-values will get rounded to zero. You could just call pchisq() directly. For instance, the comparison of the E and the ACE model would be
pchisq(46.642,df=1,lower.tail=F), yielding a p-value on the order of 10^-12. On the other hand, you're getting a p-value of 1 when comparing the ACE and AE models. Because that comparison tests the null hypothesis that the C component is zero, and because your point estimate of that component is approximately zero, a p-value of 1 seems reasonable to me.

Replied on Wed, 07/06/2016 - 11:42
No user picture. rebecca.barron Joined: 07/01/2016

In reply to by AdminRobK

Hi there,

Thanks for the prompt reply! Just to clarify the twins in a pair are the same age/ gender (eg we have male and female MZ and same sex DZ twins).

Can you recommend a better away of including age/ gender covariates? I assume if we're getting regression coefficient half of what they should be it will affect our results.

Thanks again,

Rebecca

Replied on Wed, 07/06/2016 - 12:55
Picture of user. AdminRobK Joined: 01/24/2014

In reply to by rebecca.barron

For age (say), try redefining the relevant objects like this:

mxMatrix(type="Full", nrow=1, ncol=1, free=TRUE, values= 0.1, label=c("betaAge"), name="bAge") #<--1x1
mxMatrix(type="Full",nrow=1,ncol=2,free=F,label=c("data.Age1", "data.Age2"), name="Age") #<--1x2
mxAlgebra(bAge%x%Age, name= "AgeR") #<--Note Kronecker product operator, %x%

Do likewise for gender. When you've made these changes and re-run the models, you should get different estimates for the regression coefficients, but nothing else should change. Because age and sex are the same for both twins in a given pair in your dataset, your script happened to be correctly adjusting for age and sex in spite of the error I observed.
Replied on Thu, 07/07/2016 - 05:16
No user picture. rebecca.barron Joined: 07/01/2016

In reply to by AdminRobK

Hi,

Thank you again.

Just a query about confidence intervals. If you have a look at the output, based on results the AE is the best fit. However, the upper and lower confidence intervals for A are both negative. I presume this is incorrect and how can I fix the problem?

Rebecca

Replied on Thu, 07/07/2016 - 11:53
Picture of user. AdminRobK Joined: 01/24/2014

In reply to by rebecca.barron

I want to present the standardised variance and I was wondering how to get the standardised confidence intervals for these, in relation to my script?

That's pretty easy, especially since you're only working with one phenotype. Add the following objects to your MxModel named "twinACE":

mxAlgebra(A+C+E, name="V"), #<--total phenotypic variance
mxAlgebra(A/V, name="a2"), #<--narrow-sense heritability (standardized A component)
mxAlgebra(C/V, name="c2"), #<--shared-environmentality (standardized C component)
mxAlgebra(E/V, name="e2"), #<--nonshared-environmentality (standardized E component)
mxCI(c("a2","c2","e2","a","c","e"))

Then, when you use mxRun(), use argument intervals=TRUE. The CIs will be visible in summary() output. In addition to those for the standardized components, they will also include CIs for the raw A, C, and E paths ("a", "c", and "e"). I notice your script calculates "Wald" CIs for the paths, using their standard errors. The CIs that OpenMx calculates are profile-likelihood intervals (a/k/a likelihood-based confidence intervals). Calculating them can be computationally demanding (and can sometimes fail), and they are asymptotically equivalent to Wald CIs, but they have several appealing theoretical properties. For one thing, they can be asymmetric, which is valuable if the point estimate is near a boundary of the parameter space. For another, their confidence limits are invariant under change-of-parameter (square the confidence limits for the raw paths to get the corresponding limits for the raw variance components). Further, they do not rely on the assumption that the point estimate is normally distributed over repeated sampling; they instead assume that the likelihood-ratio test statistic follows a chi-square distribution. I recommend reporting the profile-likelihood CIs.