Hi.
I am running a bivariate sex-limitation model with an age variable as a moderator on both means and variances/covariances. All models (full and nested) seem to work pretty well, estimates are reasonable and there is no error while converging (not even code GREEN). However, when I look at the standard erros, I can see that some of them turn out to be NAN's. What could be a reason of it and how can I fix it?
Thank you beforehand.
Julia
Without seeing the output in detail, it's pretty difficult to say why this has happened. Normally it might reflect under-identification of the model, but the lack of other diagnostics from optimization suggests that it is not. A few things to check
Are any of the parameters at their bounds?
Are the gradients near zero? Check in the fitted model object (which I'll call fittedModel here) with fittedModel@output$gradient
I suspect that the calculated Hessian (output$calculatedHessian) looks a bit odd as well. You could look at its eigenvalues with
If any of the eigenvalues are zero or negative, there is obviously a problem, perhaps stemming from under identification or boundary conditions
One thing you might try is to refit the model from its solution, to see if the gradients get smaller, and the NaN's clear up.
Hi Mike,
Thank you for you reply. None of the parameters to be estimated have boundary values. But there indeed seems to be a problem with eigenvalues. Two of them were negative (although all diagonal elements of Hessian were positive and gradients were small):
When I refitted the model starting from the original model solution, the problem with NaN's was still there (I attach summary outputs for both models):
However, when I refitted model one more time (Age2EqualHeightFit3 <- mxRun(Age2EgualHeightFit2)) NaN's disappeared without any improvement of -2LL and still one eigenvalue was negative. Is it reasonable to refit the model so many times? Would it still be a problem of having one negative eigenvalue when having all SE's as non-NaN's? Maybe I was originally wrong by setting observed means and variances as starting values?
When you say it is a bivariate sex-limitation model, you mean it is of the ACE variety? If so, there is some concern about your statement that none of the parameters have bounds. Even in the univariate case this could be a problem if you have DZ opposite sex pairs. For example, the DZOS covariance would be .5 am * af + cm * cf. Now, suppose that - as often is the case - the DZOS observed covariance is a bit lower than it would be expected to be, based on the covariances of the DZM and DZF pairs. It would be possible for the optimizer to find a negative value of am and a positive value of af, which would leave the expected covariances of DZM and DZF pairs unchanged, yet reduce the DZOS covariance by am * af. This problem would be exacerbated if the genetic correlation between males and females (the .5 here) is also allowed to be less than .5. Note that a model with both rG and rC free would not be identified because opposite sex MZ pairs almost never occur in nature (some pairs discordant for Turner's syndrome being an exception). Anyway, if you give a lower bound of zero to the am and af paths if you have set up a covariance-style sex-limitation model, or to the diagonals only of the Cholesky matrices if you are using this approach (output suggests you are, but please see http://www.vipbg.vcu.edu/vipbg/Articles/twinres-2006-sexlimgxe.pdf for other problems with it) the problem may disappear. It is possible that the fit of the model will deteriorate a bit, because it can no longer 'cheat' by setting, e.g., am positive and af negative.
maybe post the script so the problem (and the model :-) ) can be identified?
Would greatly appreciate if you could spot the problem.
It is necessary to bound c's as well because they also contribute to covariance across DZOS pairs, so
should become
However, please see the paper on sex-limitation models because there are issues with Cholesky decomposition in this context. For example, changing the order of the variables may change the fit of the model!
In addition, the moderation of all paths in the Cholesky is perhaps not the right way to go. I did present a paper at the twin meeting some years ago (2007 I think) describing this issue, but mea culpa I have not published it. It seems to me that moderating the variance of the latent variables rather than the paths makes more sense, and generalizes more naturally to the multivariate case.
Possibly, you are suffering from empirical under identification. Suppose that there is no evidence for shared environment in the data. Attempting to moderate a variance component which is essentially zero would quite possibly generate under identification symptoms. You did say that none of the parameters are at their bounds, but the problem may exist with the covariance components. If you are moderating say an off-diagonal element of the A matrix but there is no evidence for additive genetic factors contributing to covariance between the traits, the moderator parameter may be only weakly identified. As a way forward you might consider a slightly less full model, in which some of the moderator effects, and perhaps some of the variance components are fixed at zero.
Lastly, you might even be able to make an educated guess as to which parameters are problematic. If you look at the eigenvectors corresponding to the smallest (and negative) eigenvalues, the largest values in the eigenvector would indicate the parameters most associated with this component of the (co)variation in the parameters.
Sorry to go on at such length, but neither your model nor your problem is very simple. I hope these comments help!
Thank you very much for such a detailed answer, Mike! I guess I was really not aware in the beginning of complexity of this model.
I ran the analysis separately for males and females in order to find non-significant parameters and pass these constraints to the full model. You were right: there were a couple of paths that could be eliminated together with their moderation; in addition moderation of a few extra paths did not seem to be significant. All this, of course, will simplify the model from now on. However, analysis of only same-sex pairs indicated that the genetic correlation between the traits is different for males and females, pointing I guess on that I have to try a non-scalar sex-limitation model. Another headache :)
As discussed in the BG paper, I think the best way to proceed is to constrain the (genetic, shared environmental) covariances between traits to be equal for males and females. If the correlations are unequal, then the Cholesky model is in pretty bad shape because effectively the factors are not the same thing in the two sexes.
I sympathize, data often have a habit of behaving badly...