# NAN's as standard error

9 posts / 0 new
Offline
Joined: 03/29/2012 - 09:40
NAN's as standard error

Hi.
I am running a bivariate sex-limitation model with an age variable as a moderator on both means and variances/covariances. All models (full and nested) seem to work pretty well, estimates are reasonable and there is no error while converging (not even code GREEN). However, when I look at the standard erros, I can see that some of them turn out to be NAN's. What could be a reason of it and how can I fix it?

Thank you beforehand.
Julia

Offline
Joined: 07/31/2009 - 15:14
Hard to say

Without seeing the output in detail, it's pretty difficult to say why this has happened. Normally it might reflect under-identification of the model, but the lack of other diagnostics from optimization suggests that it is not. A few things to check

Are any of the parameters at their bounds?

Are the gradients near zero? Check in the fitted model object (which I'll call fittedModel here) with fittedModel@output$gradient I suspect that the calculated Hessian (output$calculatedHessian) looks a bit odd as well. You could look at its eigenvalues with

mxEval(eigen(fittedModel@output$calculatedHessian),fittedModel) If any of the eigenvalues are zero or negative, there is obviously a problem, perhaps stemming from under identification or boundary conditions One thing you might try is to refit the model from its solution, to see if the gradients get smaller, and the NaN's clear up. fittedModel2 <- mxRun(fittedModel) Offline Joined: 03/29/2012 - 09:40 Hi Mike, Thank you for you Hi Mike, Thank you for you reply. None of the parameters to be estimated have boundary values. But there indeed seems to be a problem with eigenvalues. Two of them were negative (although all diagonal elements of Hessian were positive and gradients were small): > Age2EqualHeightFit@output$gradient
am11          am21          am22          cm11          cm21          cm22          em11          em21          em22       amMod11       amMod21       amMod22       cmMod11
-1.122870e-03 -1.237641e-03  1.707861e-03 -6.748615e-04  1.345188e-04 -6.742563e-04 -6.320216e-04 -8.992094e-04  1.245466e-03  1.434835e-03 -8.626526e-04 -6.477156e-05 -8.695995e-04
cmMod21       cmMod22       emMod11       emMod21       emMod22   heightmeanM   weightmeanM    betaModMP1    betaModMP2    betaMod2P1   betaMod2MP2          af11          af21
6.148197e-05 -3.626750e-05 -8.803194e-04 -4.461401e-04  1.387117e-04 -1.161187e-03  2.858647e-05 -1.060539e-03 -1.806863e-04  4.771032e-04 -5.274610e-04 -1.980755e-03  1.789363e-03
af22          cf11          cf21          cf22          ef11          ef21          ef22       afMod11       afMod21       afMod22       cfMod11       cfMod21       cfMod22
-1.705792e-03 -1.292334e-04 -3.007552e-04 -7.127426e-04 -3.551370e-04  5.873707e-04 -2.404748e-04  7.272490e-04 -4.315048e-04 -7.615971e-04  1.430941e-03 -4.446400e-04 -3.206717e-04
efMod11       efMod21       efMod22   heightmeanF   weightmeanF    betaModFP1    betaModFP2   betaMod2FP2
-1.649077e-03  0.000000e+00 -1.469744e-03  1.684205e-04  5.093047e-04  1.184870e-03  2.028324e-04  1.245997e-03 
> diag(Age2EqualHeightFit@output$calculatedHessian) am11 am21 am22 cm11 cm21 cm22 em11 em21 em22 amMod11 amMod21 amMod22 cmMod11 cmMod21 cmMod22 emMod11 emMod21 emMod22 heightmeanM 2385.43855 446.25449 618.70357 121.57020 42.85306 74.38829 3094.20981 260.97436 757.93768 1999.29675 388.88075 521.36922 225.61561 73.83541 83.97514 4191.12740 328.71957 692.43717 1543.42476 weightmeanM betaModMP1 betaModMP2 betaMod2P1 betaMod2MP2 af11 af21 af22 cf11 cf21 cf22 ef11 ef21 ef22 afMod11 afMod21 afMod22 cfMod11 cfMod21 494.87906 1396.98944 453.57463 7029.29868 1158.60257 3125.31498 521.32441 734.14229 224.85520 68.99024 86.22692 4937.08868 346.52302 969.64112 2954.36567 502.06249 716.57757 383.38246 110.76349 cfMod22 efMod11 efMod21 efMod22 heightmeanF weightmeanF betaModFP1 betaModFP2 betaMod2FP2 75.22758 6920.85078 490.98650 961.82776 1908.58993 576.90319 1842.77287 584.21827 1484.84240 > mxEval(eigen(Age2EqualHeightFit@output$calculatedHessian), Age2EqualHeightFit)
$values [1] 8619.195716 8027.120433 5011.943965 4828.560306 3652.827066 2965.539319 2439.374194 2040.002692 1909.280972 1887.465043 1568.978182 1429.407210 1414.147226 1392.439492 1321.954465 1209.877815 1063.231914 1028.387112 [19] 855.471722 689.470690 667.402007 505.723903 481.788298 469.447431 455.717385 371.851451 329.285816 303.815932 265.032984 248.148131 234.617721 205.030319 181.580511 162.688532 149.369719 127.531434 [37] 121.046768 80.215224 62.553567 41.763661 39.530890 24.836921 20.440747 13.613198 7.167862 -4.881082 -8.160895 When I refitted the model starting from the original model solution, the problem with NaN's was still there (I attach summary outputs for both models): > Age2EqualHeightFit2@output$gradient
am11          am21          am22          cm11          cm21          cm22          em11          em21          em22       amMod11       amMod21       amMod22       cmMod11       cmMod21       cmMod22       emMod11
-4.955332e-03 -2.787743e-03 -2.197187e-04 -3.377597e-03 -4.898177e-04 -1.486063e-03 -5.069344e-03 -2.108801e-03 -1.136539e-03 -9.095739e-04 -2.089097e-03 -1.895921e-03 -2.587429e-03 -8.310529e-04 -1.392584e-03 -5.755695e-03
emMod21       emMod22   heightmeanM   weightmeanM    betaModMP1    betaModMP2    betaMod2P1   betaMod2MP2          af11          af21          af22          cf11          cf21          cf22          ef11          ef21
-2.009866e-03 -1.943104e-03 -4.568592e-03 -1.747756e-03 -4.006359e-03 -2.064840e-03 -4.506982e-03 -2.869849e-03 -6.961889e-03 -7.177681e-05 -4.108148e-03 -2.738653e-03 -8.392767e-04 -1.583976e-03 -6.517562e-03 -9.059191e-04
ef22       afMod11       afMod21       afMod22       cfMod11       cfMod21       cfMod22       efMod11       efMod21       efMod22   heightmeanF   weightmeanF    betaModFP1    betaModFP2   betaMod2FP2
-2.852476e-03 -2.634315e-03 -1.791248e-03 -3.041133e-03 -3.791626e-03 -1.184226e-03 -1.398853e-03 -8.701540e-03 -1.668872e-03 -3.977844e-03 -3.511259e-03 -1.314040e-03 -2.527218e-03 -1.916431e-03 -1.625743e-03
> diag(Age2EqualHeightFit2@output$calculatedHessian) am11 am21 am22 cm11 cm21 cm22 em11 em21 em22 amMod11 amMod21 amMod22 cmMod11 cmMod21 cmMod22 emMod11 emMod21 emMod22 heightmeanM 2385.94300 447.17453 618.52388 128.16410 46.74357 78.90563 3095.84240 274.32086 758.24178 2036.36399 394.32259 535.98857 238.05728 84.01778 92.57487 4209.88195 361.54460 690.35582 1543.42482 weightmeanM betaModMP1 betaModMP2 betaMod2P1 betaMod2MP2 af11 af21 af22 cf11 cf21 cf22 ef11 ef21 ef22 afMod11 afMod21 afMod22 cfMod11 cfMod21 494.87903 1405.61255 463.60206 7045.22101 1166.33280 3126.23691 525.13732 734.70751 223.42529 62.45366 86.58439 4941.50213 357.01748 970.89067 2949.16649 514.24203 691.10318 383.08442 111.21603 cfMod22 efMod11 efMod21 efMod22 heightmeanF weightmeanF betaModFP1 betaModFP2 betaMod2FP2 71.89960 6915.00156 496.59099 981.79701 1908.58788 576.91151 1837.66380 586.17848 1513.17637 > mxEval(eigen(Age2EqualHeightFit2@output$calculatedHessian), Age2EqualHeightFit2)
\$values
[1] 8621.781552 8046.352379 5025.469679 4826.962360 3654.576721 2965.701121 2444.974007 2038.558554 1933.447459 1881.578111 1574.044965 1444.397052 1434.723289 1405.800788 1317.506114 1211.857018 1063.401905 1036.633376
[19]  859.073819  707.254807  663.989849  505.777275  486.935662  473.635529  469.038981  380.132321  331.879040  308.068325  271.714133  269.291484  251.186070  197.245198  182.993207  163.838670  160.495495  146.937096
[37]  123.205381   85.347551   61.500005   47.613701   40.039015   31.089793   24.097518   11.178646    3.314551   -1.618250  -22.407141

However, when I refitted model one more time (Age2EqualHeightFit3 <- mxRun(Age2EgualHeightFit2)) NaN's disappeared without any improvement of -2LL and still one eigenvalue was negative. Is it reasonable to refit the model so many times? Would it still be a problem of having one negative eigenvalue when having all SE's as non-NaN's? Maybe I was originally wrong by setting observed means and variances as starting values?

Offline
Joined: 07/31/2009 - 15:14
Sounds like identification issue

When you say it is a bivariate sex-limitation model, you mean it is of the ACE variety? If so, there is some concern about your statement that none of the parameters have bounds. Even in the univariate case this could be a problem if you have DZ opposite sex pairs. For example, the DZOS covariance would be .5 am * af + cm * cf. Now, suppose that - as often is the case - the DZOS observed covariance is a bit lower than it would be expected to be, based on the covariances of the DZM and DZF pairs. It would be possible for the optimizer to find a negative value of am and a positive value of af, which would leave the expected covariances of DZM and DZF pairs unchanged, yet reduce the DZOS covariance by am * af. This problem would be exacerbated if the genetic correlation between males and females (the .5 here) is also allowed to be less than .5. Note that a model with both rG and rC free would not be identified because opposite sex MZ pairs almost never occur in nature (some pairs discordant for Turner's syndrome being an exception). Anyway, if you give a lower bound of zero to the am and af paths if you have set up a covariance-style sex-limitation model, or to the diagonals only of the Cholesky matrices if you are using this approach (output suggests you are, but please see http://www.vipbg.vcu.edu/vipbg/Articles/twinres-2006-sexlimgxe.pdf for other problems with it) the problem may disappear. It is possible that the fit of the model will deteriorate a bit, because it can no longer 'cheat' by setting, e.g., am positive and af negative.

Offline
Joined: 07/31/2009 - 14:25
script?

maybe post the script so the problem (and the model :-) ) can be identified?

Offline
Joined: 03/29/2012 - 09:40
script

Would greatly appreciate if you could spot the problem.

Offline
Joined: 07/31/2009 - 15:14
Bound C as well?

It is necessary to bound c's as well because they also contribute to covariance across DZOS pairs, so

# Matrices declared to store a, c, and e Path Coefficients
pathAm    <- mxMatrix( "Lower", nrow=nv, ncol=nv, free=TRUE, values=pathAValM, lbound=c(0, NA, 0), labels=amLabs, name="am" )
pathCm    <- mxMatrix( "Lower", nrow=nv, ncol=nv, free=TRUE, values=pathCValM, labels=cmLabs, name="cm" )
pathEm    <- mxMatrix( "Lower", nrow=nv, ncol=nv, free=TRUE, values=pathEValM, labels=emLabs, name="em" )

pathAf    <- mxMatrix( "Lower", nrow=nv, ncol=nv, free=TRUE, values=pathAValF, lbound=c(0, NA, 0), labels=afLabs, name="af" )
pathCf    <- mxMatrix( "Lower", nrow=nv, ncol=nv, free=TRUE, values=pathCValF, labels=cfLabs, name="cf" )
pathEf    <- mxMatrix( "Lower", nrow=nv, ncol=nv, free=TRUE, values=pathEValF, labels=efLabs, name="ef" )

should become

# Matrices declared to store a, c, and e Path Coefficients
pathAm    <- mxMatrix( "Lower", nrow=nv, ncol=nv, free=TRUE, values=pathAValM, lbound=c(0, NA, 0), labels=amLabs, name="am" )
pathCm    <- mxMatrix( "Lower", nrow=nv, ncol=nv, free=TRUE, values=pathCValM, lbound=c(0, NA, 0), labels=cmLabs, name="cm" )
pathEm    <- mxMatrix( "Lower", nrow=nv, ncol=nv, free=TRUE, values=pathEValM, labels=emLabs, name="em" )

pathAf    <- mxMatrix( "Lower", nrow=nv, ncol=nv, free=TRUE, values=pathAValF, lbound=c(0, NA, 0), labels=afLabs, name="af" )
pathCf    <- mxMatrix( "Lower", nrow=nv, ncol=nv, free=TRUE, values=pathCValF, lbound=c(0, NA, 0), labels=cfLabs, name="cf" )
pathEf    <- mxMatrix( "Lower", nrow=nv, ncol=nv, free=TRUE, values=pathEValF, labels=efLabs, name="ef" )

However, please see the paper on sex-limitation models because there are issues with Cholesky decomposition in this context. For example, changing the order of the variables may change the fit of the model!

In addition, the moderation of all paths in the Cholesky is perhaps not the right way to go. I did present a paper at the twin meeting some years ago (2007 I think) describing this issue, but mea culpa I have not published it. It seems to me that moderating the variance of the latent variables rather than the paths makes more sense, and generalizes more naturally to the multivariate case.

Possibly, you are suffering from empirical under identification. Suppose that there is no evidence for shared environment in the data. Attempting to moderate a variance component which is essentially zero would quite possibly generate under identification symptoms. You did say that none of the parameters are at their bounds, but the problem may exist with the covariance components. If you are moderating say an off-diagonal element of the A matrix but there is no evidence for additive genetic factors contributing to covariance between the traits, the moderator parameter may be only weakly identified. As a way forward you might consider a slightly less full model, in which some of the moderator effects, and perhaps some of the variance components are fixed at zero.

Lastly, you might even be able to make an educated guess as to which parameters are problematic. If you look at the eigenvectors corresponding to the smallest (and negative) eigenvalues, the largest values in the eigenvector would indicate the parameters most associated with this component of the (co)variation in the parameters.

Sorry to go on at such length, but neither your model nor your problem is very simple. I hope these comments help!

Offline
Joined: 03/29/2012 - 09:40
Thank you very much for such

Thank you very much for such a detailed answer, Mike! I guess I was really not aware in the beginning of complexity of this model.
I ran the analysis separately for males and females in order to find non-significant parameters and pass these constraints to the full model. You were right: there were a couple of paths that could be eliminated together with their moderation; in addition moderation of a few extra paths did not seem to be significant. All this, of course, will simplify the model from now on. However, analysis of only same-sex pairs indicated that the genetic correlation between the traits is different for males and females, pointing I guess on that I have to try a non-scalar sex-limitation model. Another headache :)

Offline
Joined: 07/31/2009 - 15:14

As discussed in the BG paper, I think the best way to proceed is to constrain the (genetic, shared environmental) covariances between traits to be equal for males and females. If the correlations are unequal, then the Cholesky model is in pretty bad shape because effectively the factors are not the same thing in the two sexes.

I sympathize, data often have a habit of behaving badly...