Hi all,
I am running a bivariate moderation model where the phenotype of interest is dichotomous and the moderator is continuous (centered and scaled, but highly skewed). The estimates that I am getting are way off and don't match the analysis without moderation; this is particularly true for the correlation estimates. That is, when I compare estimates at the zero level of the moderator with the estimates when there is no moderation in the model, they are very different. For example, rE=-0.27 at M=0 versus rE=0.06 if moderation is not modelled; rP=-0.21 at M=0 versus rP=0.17 if no moderation (0.17 is also observed from the data).
I have three moderators that we are interested in and such discrepancies are observed for two of them.
What could be a reason for that? I'm a bit lost and don't know how to proceed further, whether I can trust the results of the moderation model or not.
A bit of background: I am running an ADE model based on our previous results, but the CIs for D are very wide, and D could be dropped from the model without significant deterioration of fit. We decided to keep it in the model for now because of the reviews we got on our previous results (since the CIs are large).
For one of the moderators, these estimates are off if D is present, but become consistent with the previous analysis if D is dropped. For the other two moderators, dropping D did not change rE and rP values.
Variance estimates for the main phenotype are consistent and in agreement with the previous analysis if D is dropped out of the model.
I did try both Purcell's bivariate moderation model (based on the Cholesky decomposition) and a correlated-factors solution with moderation of the paths, and they both consistently give weird estimates for two of the three moderators.
Also, there seems to be no significant moderation of any of the paths.
If anyone has any insight into what is going on here and why the estimates at M=0 don't match the estimates when there is no moderation, I would be very grateful!!!
Thank you in advance!
Julia
I wouldn't be able to even guess without at least seeing the script you're working with.
summary() output might help, too.

Here are the scripts that I use and the output from the full moderation models and from the models without any moderation. With lowsupport_s as the moderator, rE and rP are not consistent between the Purcell moderation model and the CF moderation model, and they differ from the main-effects model (and from the observed phenotypic correlation). When D is removed, they at least become consistent across the Purcell and CF models, but they are still not equal to the no-moderation results (nor to the observed rP). Only when all the moderation is removed does the estimated rP agree with the observed rP.
Thank you for looking at it!
In the Purcell model, have you tried removing the lower bounds on the unmoderated terms of the path coefficients? I notice there's an active bound at the solution:
It seems to me that the bounds shouldn't matter, since you mean-centered the moderator, and you identify the liability scale by placing a constraint on the variance at the moderator's zero point. Still, I'm not 100% sure about that. Notice that the point estimate above is actually negative, meaning that the bound is slightly violated. If the bound weren't there, would the optimizer try to push that parameter into the negative region?
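If you want to check, one way to lift the bounds is to blank out the lbound slot of the relevant MxMatrix and re-fit. The model and matrix names here (bivModModel and 'a') are just guesses at your script's naming:

bivModModel$a$lbound[,] <- NA   # drop all lower bounds on the unmoderated path matrix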
Note that you don't need to put var_constraint into both the MZ and DZ MxModels. It suffices to put it into only one of them.

Have you tried using a different optimizer? Consider replacing mxRun() with mxTryHardOrdinal().
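For instance (bivModModel is a stand-in for your model object):

mxOption(NULL, "Default optimizer", "SLSQP")   # or "CSOLNP"
bivModFit <- mxTryHardOrdinal(bivModModel)     # instead of bivModFit <- mxRun(bivModModel)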
In the correlated-factors model, why is 'Rd' fixed to -0.7?
I tried removing the lower bounds on the path estimates, and the estimate went just below zero, but only barely:
I also tried other optimizers as you suggested: SLSQP produced nearly identical results, whereas CSOLNP just hung for 10 minutes and I had to terminate the session.
As for rD=-0.7 in the CF model, this was based on the best model in terms of AIC when trying different rD values in the range from -1 to 0 in steps of 0.1 (our previous analysis indicated a negative rD).
Also, thank you for the note about putting the variance constraint into the model just once. I actually think that in some other scripts I had it inside the global model. I don't know why I changed it here.
Did the fit value appreciably improve?
It's encouraging that SLSQP's results agree with NPSOL's. Did you try CSOLNP with the MxConstraint in both the MZ and DZ models? There is a known issue with CSOLNP and redundant equality constraints. In the next OpenMx release, CSOLNP will at least not freeze uninterruptibly when there are redundant equalities.
But why isn't it a free parameter?
Have you tried mxTryHardOrdinal()?
Yes, I did. The results are still the same, with no improvement in the fit.
The fit value was left unchanged.
Yes, I put the constraint just into the MZ model and tried all the optimizers. Here are the fit indices:
Cholesky moderation (Purcell)
NPSOL: -2LL=12907.98, AIC=-2100.015
SLSQP: -2LL=12907.98, AIC=-2100.016
CSOLNP: -2LL=12907.98, AIC=-2100.016
CF moderation
NPSOL: -2LL=12910.81, AIC=-2105.186
SLSQP: -2LL=12910.84, AIC=-2105.156 (Mx status Red)
CSOLNP: -2LL=12910.84, AIC=-2105.156
I thought that rA and rD (just like rA and rC) could not be estimated simultaneously - or can they?
Well, it looks as though the optimizers really are finding the solution. I agree that the results seem odd, but I guess there's something wrong with our intuition!
I'm not used to thinking of moderation in terms of the correlated-factors parameterization, so I could be mistaken here, but I don't see any reason why they couldn't be estimated simultaneously. After all, you were able to estimate a cross-path for D, 'dC', in the Cholesky-parameterized model, right? It should be possible to get a correlated-factors solution equivalent to the Cholesky solution. But, it's possible that the correlated-factors parameterization is harder to optimize.
Yes, what puzzles me is that the estimated rP is so different from the observed rP! And the estimates at M=0 are totally different from the estimates with no moderation. What could be the explanation here? Can we trust the results of the moderation model?
The reason we tried moderation in the CF parameterization is this paper by Rathouz et al.:
Rathouz PJ, Van Hulle CA, Rodgers JL, Waldman ID, Lahey BB. Specification, testing, and interpretation of gene-by-measured-environment interaction models in the presence of gene-environment correlation. Behav Genet. 2008;38(3):301–315. doi:10.1007/s10519-008-9193-4
There they say that the CF model has more power to detect moderation because it has fewer parameters to estimate. Since our power is quite limited due to a low number of twin pairs and a low-prevalence dichotomous outcome, we thought we'd give CF a try. It doesn't seem to provide any evidence of moderation either, but its fit is better, although the correlation estimates are as weird as in Purcell's model (for two of the three moderators that we tested).
It's been a few years since I read that Rathouz et al. paper, so there's a good chance I'm mistaken in what I posted about the correlated-factors parameterization.
Hello
I just ran the model posted by Julia (bivChol_Moderation.txt), but something goes wrong when there is no moderation in the model:
Error: The job for model 'MainEffects' exited abnormally with the error message: fit is not finite (Ordinal covariance is not positive definite in data 'DZ.data' row 13703 (loc1))
In addition: Warning message:
In model 'MainEffects' Optimizer returned a non-zero status code 10. Starting values are not feasible. Consider mxTryHard()
However, I don't know the exact reason. Could you give me any suggestions?
Thanks!
Evidently, you need better start values for the free parameters. Did you modify the block of syntax that sets the start values? Values that worked for Julia might not work well with your dataset.
Search this website for "start values". You'll find plenty of advice and discussion about the topic, e.g., this thread. Another thing you could try is to replace mxRun() with mxTryHardOrdinal() in your script, and see if that helps. I can't really offer more-specific advice without more details from you.

I still can't run my model.
First, I ran the following syntax, which produced some warning and error messages.
Many of the Std.Errors are NA in ACEmodModel.
As for MainEffectsModel (without the moderator),
Second, I ran mxGetExpected(). However, I just don't know how to reset the start values.
I also replaced mxRun() with MainEffectsFit <- mxTryHardOrdinal(MainEffectsModel). Though the NA errors went away, the model without the moderator still didn't work.
I read the suggestions you posted, but I failed to use the Nelder-Mead implementation. I might be a little stupid, so I need your help.
Please!
First off, only put var_constraint in one of the MZ or DZ MxModels, not both. That won't matter if you're using NPSOL, but it will be a problem if you use CSOLNP (or Nelder-Mead).

It looks like the optimizer is reaching a solution where the Hessian (as calculated) isn't positive-definite (status code 5). Your phenotype is a threshold trait, and due to the limited accuracy of the algorithm for the multivariate-normal probability integral, sometimes code 5 can occur even when the optimizer has found a minimum. Therefore, you'll want to find the solution with the smallest fitfunction value, even if it has status code 5. Try requesting more attempts from mxTryHardOrdinal() via argument extraTries, e.g. extraTries=30.
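For example (using the object names from your script):

ACEmodFit <- mxTryHardOrdinal(ACEmodModel, extraTries=30)   # more random retries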
I suggest running the main-effects model before the moderation model. Use free=FALSE when creating modPathA, modPathC, and modPathE. Then, the first MxModel you run will be the main-effects model. Then, create the moderation model from the fitted main-effects model, and use omxSetParameters() to free the moderation parameters.
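For instance, something like this (a sketch; the label vectors aModLabs, cModLabs, and eModLabs come from your script, and you may want to free only a subset of them):

MainEffectsFit <- mxTryHardOrdinal(MainEffectsModel)    # moderation paths fixed at 0
ACEmodModel <- mxModel(MainEffectsFit, name="ACEmod")   # build the supermodel from the fitted submodel
ACEmodModel <- omxSetParameters(ACEmodModel, labels=c(aModLabs, cModLabs, eModLabs), free=TRUE)
ACEmodFit <- mxTryHardOrdinal(ACEmodModel)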
What were you trying to do? Use omxSetParameters() to change free parameter values.

If you want to try it, let me suggest some syntax:
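If "it" here means the Nelder-Mead implementation mentioned above, a plan along the following lines would use it (the particular steps shown are just one plausible choice):

plan <- mxComputeSequence(steps=list(
  mxComputeNelderMead(),      # point estimation via Nelder-Mead instead of the default optimizer
  mxComputeNumericDeriv(),    # numerical Hessian at the solution
  mxComputeStandardError(),   # standard errors from the Hessian
  mxComputeHessianQuality(),  # checks whether the Hessian is positive-definite
  mxComputeReportDeriv()      # store the derivatives in the model output
))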
Then, put plan into the mxModel() statement for MainEffectsModel, assuming you are creating and running MainEffectsModel first. To clear the custom compute plan from an MxModel and go back to the default compute plan, do model@compute <- NULL.
I changed the order of the models and fit the main-effects model first. I also deleted var_constraint from the MZ MxModel (and of course tried this with the DZ MxModel instead). However, there is still an error message:
It seems the model didn't run. I want to try requesting more attempts from mxTryHardOrdinal() via the argument extraTries, e.g. extraTries=30.
Then, I put plan into the mxModel() statement for MainEffectsModel, and got errors.
As for requesting more attempts from mxTryHardOrdinal() via extraTries: with extraTries=30, and even 90, there are still errors.
In the main-effects model, the start values of the moderator path coefficients are 0, aren't they? Do I need to change the start values of the a, c, and e path coefficients?
First of all, mxTryHardOrdinal() uses random-number generation. You can get different results each time you run it, which is especially likely if your model is hard to optimize for whatever reason. If you want reproducible results, precede each call to mxTryHardOrdinal() with something to set the random-number generator's seed, e.g. set.seed(1).
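For example:

set.seed(1)   # any fixed seed makes the random retries reproducible
ACEmodFit <- mxTryHardOrdinal(ACEmodModel, extraTries=30)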
Huh? But it did! mxTryHardOrdinal() reported "Solution found!".
Yes, and since they're fixed, they will remain at 0 during optimization.
You don't have to change them, but maybe there is a better choice of start values than what you're currently using. Sorry I don't have any specific advice about that.
Hi Julia!
A few comments:
I think that the expectation - that the estimate of A when the moderator value is zero should be the same as when no moderation is specified in the model - is incorrect. I've not read the thread in detail, but basically the estimate of A with no moderation is an average over all levels of Ac + Am * M (A constant plus A moderated times M). Depending on the distribution of M (is it symmetric around zero?) and the size of Am, you should not generally expect Ac = A.
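A quick toy illustration of that point in R (all numbers made up): even with a mean-centered moderator, the A variance averaged over M is not the A variance at M = 0.

set.seed(1)
ac <- 0.7; am <- 0.3          # hypothetical unmoderated and moderated A paths
M <- rchisq(1e5, df=2) - 2    # mean-centered but right-skewed moderator (variance 4)
mean((ac + am*M)^2)           # average A variance contribution across M: about 0.85
ac^2                          # A variance contribution at M = 0: 0.49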
For starting values in moderated models, I usually start moderator effect at zero. Also, if the moderator is something like age in years, it can help a lot to rescale to age in centuries to approximate a 0-1 range for the moderator value.
Finally, as you may already know, with binary data some models are not identified. See Medland, S.E., Neale, M.C., Eaves, L.J., & Neale, B.M. (2009). A note on the parameterization of Purcell's G x E model for ordinal and binary data. Behavior Genetics, 39(2), 220-229.
I've tried to re-phrase the opening message to understand what's going on.
> I am running a bivariate GxE model with a dichotomous DV, and a continuous but skewed moderator not shared by twins
Is that right? Below, you mention two moderators - that's a valiant script :-)
> Estimates at the zero level of the moderator differ from the estimates when there is no moderation in the model.
Non-significant parameters happily take arbitrary values (random, and droppable without significant loss of fit).
> I am running ADE model even though D can be dropped out of the model but power is low.
I'm going to ask below: what's the N here?
> For one of the moderators...
Hang on... so two moderators?
> For the other two moderators
I'd say "for the other moderator"
What's your n? With a DV, and two moderators, all loading on the DV via three components, and having three moderated loadings on the DV, and moderating three influences on the DV, this will be a hard model to get stable estimates on.
AdminNeale, tbates: Any advice for xiyuesenlinyu (not the OP)? They're the reason I emailed the list about this thread.
Thanks!
Now the model runs well. Another problem: the p-value = 1 when comparing the main-effects model and the model with the moderator.
If I'm reading the table correctly, then I think the model with 18 free parameters didn't converge. A model with 18 free parameters should have a -2LL no greater than that of a nested model with 15 parameters.
I don't personally find mxCompare() very useful. Would you mind posting the summary() output you get from both fitted models, as well as your output from mxVersion()?

There were no error messages with extraTries=10 in ACEmodModel (the full model with the moderator). However, the following warning message appeared when I set extraTries=30, and the warning messages disappeared with extraTries=80. As for the output of mxCompare(ACEmodFit, MainEffectsFit), there was no change.
mxVersion()
Mr Robert
Could you give some suggestions about the p-value = 1 output?
Please!
Please post your current script, preferably as an attachment.
ok, thank you!
Try replacing this (line 249) with this. That way, you'll be starting the moderation model from the solution for the main-effects model.
When a submodel with fewer parameters fits better than its supermodel, the fit of the supermodel has to be wrong, because it should fit at least as well as the submodel. In such situations, poor starting values for the supermodel are often the culprit. I suggest that you take the fitted submodel and free up the parameters (or stop having two parameters equated, and estimate them separately) as necessary to turn the submodel into the supermodel. Then start fitting the supermodel from the submodel's solution. At the least, the fit of the supermodel should not get any worse than that of the submodel (which is where the supermodel optimization is beginning).
Thanks for your suggestions, but I don't understand them well!
“I suggest that you take the fitted sub model and free up the parameters (or stop having two parameters equated, estimate them separately) necessary to turn the sub model into the supermodel.”
Did you mean to reset the three parameters like this?
modPathA = mxMatrix( "Lower", nrow=nv, ncol=nv, free=c(F,T,T), values=pathModVal, labels=aModLabs, name="aMod" )
modPathC = mxMatrix( "Lower", nrow=nv, ncol=nv, free=c(F,T,T), values=pathModVal, labels=cModLabs, name="cMod" )
modPathE = mxMatrix( "Lower", nrow=nv, ncol=nv, free=c(F,T,T), values=pathModVal, labels=eModLabs, name="eMod" )
As for "or stop having two parameters equated, estimate them separately": though I can estimate them separately (in other submodels), I need to estimate all of them in the full model.
Another question: the -2LL values are negative in some models; is that OK?
Hello, the model can be run now. However, I got puzzling output: the difference in -2LL between the two models is significant, and mxCompare() gives an odd p-value of 0.
Could you give me some suggestions?
Thank you!
A chi-square test statistic of 23061.93 on 6df has a p-value computationally equal to zero:
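In R:

pchisq(23061.93, df=6, lower.tail=FALSE)
# [1] 0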
I'm sorry, I'm new to OpenMx, and even to R. I have no idea what the output means. Could you tell me what a p-value = 0 means? An inappropriate model, or poor model fit? Is the model reliable? What can I do to improve it?
You're comparing the moderation model to the main-effects model. A nonempty proper subset of the free parameters in the moderation model is fixed in the main-effects model; thus, the main-effects model is said to be a "nested submodel" of the moderation model. The p-value of 0 is telling you that the main-effects model provides much, much worse fit to the data than the moderation model. Or, to put it another way, you can reject the null hypothesis that the six moderation parameters are all equal to zero.
Thanks for your help. I understand that a p-value < 0.05 between two models indicates significant moderation effects. My concern is that p equals 0 exactly, which seems not quite right.
The output of pchisq() is underflowing to zero. For the sake of perspective, the 99.99th percentile of a chi-square distribution on 6df is about 27.9:
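That is:

qchisq(0.9999, df=6)
# about 27.86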
This issue is the same (p = 1 being reported as p = 0): https://github.com/OpenMx/OpenMx/issues/131
Also this issue, where RMSEA and TLI are misreported: https://github.com/OpenMx/OpenMx/issues/221
I don't agree with your use of the word "incorrectly". The p = 0 being discussed in this thread is being reported computationally correctly. Mathematically, the p-value is nonzero, but it is too small to be represented as a double-precision floating-point value. For that matter, regarding issue #131, p was also being reported computationally correctly, but there we decided to sacrifice computational correctness for user experience in an edge case.