Constraining Components Across Gender Results in Significant Loss of Model Fit?

Posted on
No user picture. RFrank Joined: 06/08/2012
I'm running a simple ACE model of a continuous variable in openMX. I'm running both a constrained and sex-limited model to see which provides a better fit, and then running various constrained or sex-limited submodels where a or c are dropped.

Here is what I have noticed -- in a sex-limited submodel where c is dropped, the a and e variance components and confidence intervals are extremely similar across males and females -- AND, extremely similar to the components for a constrained model where c is dropped.

However, the sex-limited model is a SIGNIFICANTLY better fit. There is no comparison. It's waaaaay better, and yet, in reporting the variance components, it looks like it's about equal to the constrained model that's more parsimonious.

I'm having trouble wrapping my mind around this conceptually. Anyone have any insights? Would it be helpful to post my syntax?

Replied on Mon, 04/27/2015 - 11:23
Picture of user. AdminRobK Joined: 01/24/2014

The models' summary() output would probably be more informative, though the syntax might be helpful, too. If you post them, attach them to your post as text files, or at least use the HTML tags if you include them in the body of your post.
Replied on Mon, 05/11/2015 - 18:53
No user picture. RFrank Joined: 06/08/2012

In reply to by AdminRobK

I do apologize for the delay - this is still a question I'm interested in. I've attached a textfile with the actual syntax for my models. In the body of my post are the summaries for my models:

twinACE2 = constrained across gender, full ACE
TwinAe2 = constrained across gender, drop C

twinACE3 = sex limited, full ACE
TwinAe3 = sex limited, drop C in both males and females

I also included a mxCompare for the TwinAe2Sum and the TwinAe3Sum, where TwinAe2 is much worse despite very similar estimates for the variance components.


> twinACE2Summ
data:
$MZm.data
happy1 happy2
Min. :-0.99600 Min. :-1.16477
1st Qu.:-0.69904 1st Qu.:-0.70367
Median :-0.61150 Median :-0.61391
Mean :-0.05889 Mean :-0.02597
3rd Qu.: 0.06369 3rd Qu.: 0.20706
Max. : 3.32268 Max. : 3.34504
NA's :59 NA's :57

$DZm.data
happy1 happy2
Min. :-0.84391 Min. :-1.01634
1st Qu.:-0.68266 1st Qu.:-0.69592
Median :-0.60506 Median :-0.61179
Mean : 0.07268 Mean : 0.09249
3rd Qu.: 0.33279 3rd Qu.: 0.37600
Max. : 3.27685 Max. : 3.26455
NA's :20 NA's :19

$MZf.data
happy1 happy2
Min. :-0.6727 Min. :-0.74258
1st Qu.:-0.3339 1st Qu.:-0.33622
Median :-0.2724 Median :-0.28683
Mean :-0.0255 Mean :-0.00853
3rd Qu.:-0.2039 3rd Qu.:-0.19122
Max. : 3.6351 Max. : 3.60820
NA's :46 NA's :45

$DZf.data
happy1 happy2
Min. :-0.77595 Min. :-0.823770
1st Qu.:-0.34847 1st Qu.:-0.342329
Median :-0.28070 Median :-0.267554
Mean : 0.05976 Mean : 0.004863
3rd Qu.:-0.19968 3rd Qu.:-0.174856
Max. : 3.62864 Max. : 3.522407
NA's :17 NA's :23

free parameters:
name matrix row col Estimate Std.Error lbound ubound
1 am11 ACE.am 1 1 6.719720e-01 0.06941471
2 cm11 ACE.cm 1 1 2.032210e-01 0.19949728
3 em11 ACE.em 1 1 6.748012e-01 0.01795741
4 meanM ACE.Mm 1 1 -9.251628e-05 0.03528906
5 meanF ACE.Mf 1 1 1.757629e-03 0.03281190

confidence intervals:
lbound estimate ubound
ACE.h2m[1,1] 2.910983e-01 0.47621338 0.5700096
ACE.c2m[1,1] 1.635266e-21 0.04355485 0.2052719
ACE.e2m[1,1] 4.299761e-01 0.48023178 0.5357458
ACE.h2f[1,1] 2.910983e-01 0.47621338 0.5700096
ACE.c2f[1,1] 1.635266e-21 0.04355485 0.2052719
ACE.e2f[1,1] 4.299761e-01 0.48023178 0.5357458

observed statistics: 2298
estimated parameters: 5
degrees of freedom: 2293
-2 log likelihood: 6144.707
saturated -2 log likelihood: NA
number of observations: 1292
chi-square: NA
p: NA
Information Criteria:
df Penalty Parameters Penalty Sample-Size Adjusted
AIC: 1558.707 6154.707 NA
BIC: -10282.223 6180.527 6164.644
CFI: NA
TLI: NA
RMSEA: NA
timestamp: 2015-05-11 17:29:11
frontend time: 0.7330432 secs
backend time: 10.84262 secs
independent submodels time: 0 secs
wall clock time: 11.57566 secs
cpu time: 11.57566 secs
openmx version number: 1.4-3060

> twinACE3Summ
data:
$MZm.data
happy1 happy2
Min. :-0.99600 Min. :-1.16477
1st Qu.:-0.69904 1st Qu.:-0.70367
Median :-0.61150 Median :-0.61391
Mean :-0.05889 Mean :-0.02597
3rd Qu.: 0.06369 3rd Qu.: 0.20706
Max. : 3.32268 Max. : 3.34504
NA's :59 NA's :57

$DZm.data
happy1 happy2
Min. :-0.84391 Min. :-1.01634
1st Qu.:-0.68266 1st Qu.:-0.69592
Median :-0.60506 Median :-0.61179
Mean : 0.07268 Mean : 0.09249
3rd Qu.: 0.33279 3rd Qu.: 0.37600
Max. : 3.27685 Max. : 3.26455
NA's :20 NA's :19

$MZf.data
happy1 happy2
Min. :-0.6727 Min. :-0.74258
1st Qu.:-0.3339 1st Qu.:-0.33622
Median :-0.2724 Median :-0.28683
Mean :-0.0255 Mean :-0.00853
3rd Qu.:-0.2039 3rd Qu.:-0.19122
Max. : 3.6351 Max. : 3.60820
NA's :46 NA's :45

$DZf.data
happy1 happy2
Min. :-0.77595 Min. :-0.823770
1st Qu.:-0.34847 1st Qu.:-0.342329
Median :-0.28070 Median :-0.267554
Mean : 0.05976 Mean : 0.004863
3rd Qu.:-0.19968 3rd Qu.:-0.174856
Max. : 3.62864 Max. : 3.522407
NA's :17 NA's :23

The final iterate satisfies the optimality conditions to the accuracy requested, but the sequence of iterates has not yet converged. NPSOL was terminated because no further improvement could be made in the merit function (Mx status GREEN).

free parameters:
name matrix row col Estimate Std.Error lbound ubound
1 am11 ACE.am 1 1 8.366214e-01 0.04156916
2 cm11 ACE.cm 1 1 5.351316e-07 0.38932071
3 em11 ACE.em 1 1 7.824401e-01 0.02938613
4 af11 ACE.af 1 1 4.062740e-01 0.09921859
5 cf11 ACE.cf 1 1 3.784636e-01 0.09152652
6 ef11 ACE.ef 1 1 5.708559e-01 0.02097595
7 meanM ACE.Mm 1 1 4.382081e-04 0.04150923
8 meanF ACE.Mf 1 1 8.345160e-04 0.02694799

confidence intervals:
lbound estimate ubound
ACE.h2m[1,1] 3.230728e-01 5.334272e-01 0.6024054
ACE.c2m[1,1] 4.188448e-22 2.182420e-13 0.1816084
ACE.e2m[1,1] 3.975946e-01 4.665728e-01 0.5448360
ACE.h2f[1,1] 1.697180e-02 2.602750e-01 0.5137549
ACE.c2f[1,1] 2.252041e-03 2.258618e-01 0.4275499
ACE.e2f[1,1] 4.414490e-01 5.138632e-01 0.5965823

observed statistics: 2298
estimated parameters: 8
degrees of freedom: 2290
-2 log likelihood: 5995.339
saturated -2 log likelihood: NA
number of observations: 1292
chi-square: NA
p: NA
Information Criteria:
df Penalty Parameters Penalty Sample-Size Adjusted
AIC: 1415.339 6011.339 NA
BIC: -10410.099 6052.651 6027.239
CFI: NA
TLI: NA
RMSEA: NA
timestamp: 2015-05-11 17:33:11
frontend time: 0.6750391 secs
backend time: 17.71001 secs
independent submodels time: 0 secs
wall clock time: 18.38505 secs
cpu time: 18.38505 secs
openmx version number: 1.4-3060

> TwinAe2Sum
data:
$MZm.data
happy1 happy2
Min. :-0.99600 Min. :-1.16477
1st Qu.:-0.69904 1st Qu.:-0.70367
Median :-0.61150 Median :-0.61391
Mean :-0.05889 Mean :-0.02597
3rd Qu.: 0.06369 3rd Qu.: 0.20706
Max. : 3.32268 Max. : 3.34504
NA's :59 NA's :57

$DZm.data
happy1 happy2
Min. :-0.84391 Min. :-1.01634
1st Qu.:-0.68266 1st Qu.:-0.69592
Median :-0.60506 Median :-0.61179
Mean : 0.07268 Mean : 0.09249
3rd Qu.: 0.33279 3rd Qu.: 0.37600
Max. : 3.27685 Max. : 3.26455
NA's :20 NA's :19

$MZf.data
happy1 happy2
Min. :-0.6727 Min. :-0.74258
1st Qu.:-0.3339 1st Qu.:-0.33622
Median :-0.2724 Median :-0.28683
Mean :-0.0255 Mean :-0.00853
3rd Qu.:-0.2039 3rd Qu.:-0.19122
Max. : 3.6351 Max. : 3.60820
NA's :46 NA's :45

$DZf.data
happy1 happy2
Min. :-0.77595 Min. :-0.823770
1st Qu.:-0.34847 1st Qu.:-0.342329
Median :-0.28070 Median :-0.267554
Mean : 0.05976 Mean : 0.004863
3rd Qu.:-0.19968 3rd Qu.:-0.174856
Max. : 3.62864 Max. : 3.522407
NA's :17 NA's :23

free parameters:
name matrix row col Estimate Std.Error lbound ubound
1 am11 ACE.am 1 1 0.7042320230 0.02411151
2 em11 ACE.em 1 1 0.6722360079 0.01707699
3 meanM ACE.Mm 1 1 0.0004698117 0.03518650
4 meanF ACE.Mf 1 1 0.0019849251 0.03272445

confidence intervals:
lbound estimate ubound
ACE.h2m[1,1] 0.4706395 0.5232325 0.5715719
ACE.c2m[1,1] 0.0000000 0.0000000 0.0000000
ACE.e2m[1,1] 0.4284281 0.4767675 0.5293605
ACE.h2f[1,1] 0.4706395 0.5232325 0.5715719
ACE.c2f[1,1] 0.0000000 0.0000000 0.0000000
ACE.e2f[1,1] 0.4284281 0.4767675 0.5293605

observed statistics: 2298
estimated parameters: 4
degrees of freedom: 2294
-2 log likelihood: 6144.962
saturated -2 log likelihood: NA
number of observations: 1292
chi-square: NA
p: NA
Information Criteria:
df Penalty Parameters Penalty Sample-Size Adjusted
AIC: 1556.962 6152.962 NA
BIC: -10289.131 6173.618 6160.912
CFI: NA
TLI: NA
RMSEA: NA
timestamp: 2015-05-11 17:36:12
frontend time: 0.804045 secs
backend time: 4.429254 secs
independent submodels time: 0 secs
wall clock time: 5.233299 secs
cpu time: 5.233299 secs
openmx version number: 1.4-3060

> TwinAe3Sum
data:
$MZm.data
happy1 happy2
Min. :-0.99600 Min. :-1.16477
1st Qu.:-0.69904 1st Qu.:-0.70367
Median :-0.61150 Median :-0.61391
Mean :-0.05889 Mean :-0.02597
3rd Qu.: 0.06369 3rd Qu.: 0.20706
Max. : 3.32268 Max. : 3.34504
NA's :59 NA's :57

$DZm.data
happy1 happy2
Min. :-0.84391 Min. :-1.01634
1st Qu.:-0.68266 1st Qu.:-0.69592
Median :-0.60506 Median :-0.61179
Mean : 0.07268 Mean : 0.09249
3rd Qu.: 0.33279 3rd Qu.: 0.37600
Max. : 3.27685 Max. : 3.26455
NA's :20 NA's :19

$MZf.data
happy1 happy2
Min. :-0.6727 Min. :-0.74258
1st Qu.:-0.3339 1st Qu.:-0.33622
Median :-0.2724 Median :-0.28683
Mean :-0.0255 Mean :-0.00853
3rd Qu.:-0.2039 3rd Qu.:-0.19122
Max. : 3.6351 Max. : 3.60820
NA's :46 NA's :45

$DZf.data
happy1 happy2
Min. :-0.77595 Min. :-0.823770
1st Qu.:-0.34847 1st Qu.:-0.342329
Median :-0.28070 Median :-0.267554
Mean : 0.05976 Mean : 0.004863
3rd Qu.:-0.19968 3rd Qu.:-0.174856
Max. : 3.62864 Max. : 3.522407
NA's :17 NA's :23

The final iterate satisfies the optimality conditions to the accuracy requested, but the sequence of iterates has not yet converged. NPSOL was terminated because no further improvement could be made in the merit function (Mx status GREEN).

free parameters:
name matrix row col Estimate Std.Error lbound ubound
1 am11 ACE.am 1 1 0.8366215095 0.04156926
2 em11 ACE.em 1 1 0.7824398828 0.02938617
3 af11 ACE.af 1 1 0.5668069445 0.02706902
4 ef11 ACE.ef 1 1 0.5590466843 0.01919378
5 meanM ACE.Mm 1 1 0.0004380475 0.04150877
6 meanF ACE.Mf 1 1 0.0019770976 0.02664580

confidence intervals:
lbound estimate ubound
ACE.h2m[1,1] 0.4551683 0.5334274 0.6024054
ACE.c2m[1,1] 0.0000000 0.0000000 0.0000000
ACE.e2m[1,1] 0.3975946 0.4665726 0.5448317
ACE.h2f[1,1] 0.4332632 0.5068925 0.5727278
ACE.c2f[1,1] 0.0000000 0.0000000 0.0000000
ACE.e2f[1,1] 0.4272722 0.4931075 0.5667368

observed statistics: 2298
estimated parameters: 6
degrees of freedom: 2292
-2 log likelihood: 5999.27
saturated -2 log likelihood: NA
number of observations: 1292
chi-square: NA
p: NA
Information Criteria:
df Penalty Parameters Penalty Sample-Size Adjusted
AIC: 1415.27 6011.270 NA
BIC: -10420.50 6042.254 6023.195
CFI: NA
TLI: NA
RMSEA: NA
timestamp: 2015-05-11 17:37:36
frontend time: 0.87605 secs
backend time: 6.660381 secs
independent submodels time: 0 secs
wall clock time: 7.536431 secs
cpu time: 7.536431 secs
openmx version number: 1.4-3060

> mxCompare(TwinAe3Fit, TwinAe2Fit)
base comparison ep minus2LL df AIC diffLL diffdf p
1 Twin3AE 6 5999.270 2292 1415.270 NA NA NA
2 Twin3AE Twin2AE 4 6144.962 2294 1556.962 145.6922 2 2.308593e-32

Replied on Tue, 05/12/2015 - 10:38
Picture of user. AdminRobK Joined: 01/24/2014

In reply to by RFrank

I think the explanation is that the raw variance components are simply larger for males than for females, to the degree that equating them to be equal induces substantial misfit. If you look at the point estimates from the sex-limited AE model, then (I'm assuming a single phenotype) the additive-genetic variance for males would be about 0.84^2 = 0.71, and the nonshared-environmental variance for males would be about 0.78^2 = 0.61. But for females, additive-genetic variance would be about 0.57^2 = 0.32 and the nonshared-environmental variance would be about 0.56^2 = 0.31.

On the other hand, the standardized variance components (the variance proportions) do appear to be similar for both sexes. For instance, in the sex-limited AE model, males would have a narrow-sense heritability of about 0.71/(0.71+0.61) = 0.54, and females would have a narrow-sense heritability of about 0.32/(0.32+0.31) = 0.51, which (allowing for rounding error) are close to the point estimates of your MxAlgebras, displayed in the output for the confidence intervals.

So, the standardized variance components are similar by sex, but the raw variance components differ by sex; in other words, heritability is similar by sex, but males have greater overall phenotypic variance.

BTW, I notice that your summary() output says you're using OpenMx version 1.4-3060. Does mxVersion() also say you have version 1.4-3060, and if so, are you using R version 3.1.0 or later? If yes, you should DEFINITELY upgrade your version of OpenMx to version 1.4-3475, or better yet, version 2.0 or 2.0.1. There's a reason we have a yellow-highlighted warning about this on the front page of the OpenMx website.

Replied on Tue, 05/12/2015 - 13:58
No user picture. RFrank Joined: 06/08/2012

In reply to by AdminRobK

Ah, I see! Thank you for this very cogent explanation!

Yes, I recently noticed this warning, and much appreciate the heads up. I noticed this yesterday and I think I'm using R 3.0.3 -- that should mean that my output is okay, right? But I need to go ahead and upgrade both OpenMX and R to the latest versions. (Does Open Mx 2.0 run with versoins of R lower than 3.1?)

Replied on Tue, 05/12/2015 - 14:23
Picture of user. AdminRobK Joined: 01/24/2014

In reply to by RFrank

If you're using OpenMx 1.4-3060 with R 3.0.3, you should be fine. If you want to upgrade to OpenMx 2.0.x, you'll be able to, since it's compatible with (I think) R 3.0.2 and later.
Replied on Tue, 05/12/2015 - 14:28
No user picture. RFrank Joined: 06/08/2012

In reply to by RFrank

A follow-up question as I contemplate on this: is it clearly correct to say that it would be better to present a univariate ACE model where gender isn't addressed at all rather than a constrained one that is substantially misfit? I think the answer would be clearly yes. And then would it be fair to say that it's not as unequivocally better to present a sex-limited ACE than a univarate ACE that doesn't address gender at all? Unlike a constrained model with a very poor fit, it's not that the narrow-sense heritability estimates in this generic univariate ACE should be questioned or that they necessarily disproportionately reflect one gender over another, but just that they don't tell the whole story of what's going on in the background compared to the sex-limited.

(Just thinking about how to make analytic plans in the future.)

Replied on Wed, 05/13/2015 - 09:44
Picture of user. neale Joined: 07/31/2009

In reply to by RFrank

It is possible to fit a model that specifies the same proportions of variance, but a different total variance for the two sexes. Notation got scrambled at some point (possibly my fault) but this has been called a scalar sex difference model. Let's call it a variance difference model. To implement it, I would constrain am=af cm=cf and em=ef, and add a diagonal matrix with starting values equal to one. We only want to scale twin 1 in the OS DZ group if Twin 1 is male, so we need two matrices to "wrap" (i.e., pre and post-multiply) the previous expected covariance matrix with using the quadratic %&% operator. Something like


sdDiff <- mxMatrix("Diag", 2, 2, labels=rep("sdScale",2), free=T, values=1, name=sdScale)
sdOS <- mxMatrix("Diag", 2, 2, labels=c("sdScale",NA), free=c(T,F), values=1, name=sdOS)
mxAlgebra( sdDiff %&% whateverYouHadForMZMales, name="whateverMZMaleCovWasCalledBefore")
mxAlgebra( sdDiff %&% whateverYouHadForDZMales, name="whateverDZMaleCovWasCalledBefore")
mxAlgebra( sdOS %&% whateverYouHadForOSDZandAssumingTwin1isMale, name="whateverDZOSCovWasCalledBefore")

In a multivariate analysis, we'd probably want to do this a bit differently to simplify specification, e.g., using a diag(2) matrix to Kronecker product the variance differences matrices.

Replied on Thu, 05/14/2015 - 18:36
No user picture. RFrank Joined: 06/08/2012

In reply to by AdminRobK

Thank you for these extremely helpful insights! I didn't see your comment until after I saw Dr. Neale's. I updated my code to reflect his suggestions. My sample indeed does not have opposite sex twin pairs, so I modified it for the bivariate case.

It looks to me like the output is as it is supposed to! The raw variance is different across gender but the standardized variance components are the same. I've attached my output here just to confirm that I've put these suggestions into place correctly. (Note that I used a different variable which is why the results are way different than my original post.) Thanks again!