Hi,
I want to compare a 2-class growth mixture model to a latent growth curve model in an individually varying time points framework (i.e., using definition variables). Here are the code, the error message, and my OpenMx version:
> mxCompare(Weight2, Weight1)
                               base            comparison ep
1 Growth Mixture Model, fixed knots                       31
2 Growth Mixture Model, fixed knots Estimate a fixed knot 15
  minus2LL   df      AIC   diffLL diffdf            p
1 4775.720 1464 1847.720       NA     NA           NA
2 4859.904 1480 1899.904 84.18401     16 2.906173e-11
> mxCompare(Weight2, Weight1, boot = T)
> mxCompare(Weight2, Weight1, boot = T, replications = 10)
Error in collectStatistics1(otherStats, ref, other, bootPair) :
Fewer than 3 replications are available.
> mxVersion()
OpenMx version: 2.9.9.248 [GIT v2.9.9-248-gd602399]
R version: R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0
MacOS: 10.13.6
Default optimiser: CSOLNP
NPSOL-enabled?: Yes
OpenMP-enabled?: Yes
May I know how to address this issue? Thank you in advance!
You ran mxCompare twice:
Why? And why only 10 replications the second time?
Thanks for your reply. The default number of replications is 400, and I let it run overnight but got nothing. So for the second call, I just tried a smaller number of replications to see if I could get any output.
What do you mean by you "got nothing"? By using the previousRun argument, you can run replications incrementally. You can run 10, then run another 15 without losing the first 10.
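For instance, something like this sketch (reusing the model names from the original post; my reading is that replications gives the cumulative total when previousRun is supplied):

```r
## Run a small number of bootstrap replications first...
first10 <- mxCompare(Weight2, Weight1, boot = TRUE, replications = 10)

## ...then continue toward a larger total, passing the earlier result via
## previousRun so the first 10 replications are reused, not recomputed.
first25 <- mxCompare(Weight2, Weight1, boot = TRUE, replications = 25,
                     previousRun = first10)
```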
I'm curious if this was ever resolved. I also "get nothing" except an error when running the following, where m1 and m2 are two mixture models:
Looks like your models are not converging. Take a look at attr(tmp@results, 'bootData'). There will be a list of data.frames, one for each model. Take a look at the statusCode column.

Thank you for responding. Since the call to mxCompare() stops with an error, the object tmp is never created, so I cannot examine its attributes. The individual models do appear to have converged.
I suspect you'll need to change mxOption 'Status OK' from its on-load default. This is what that option looks like on load:
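If I recall correctly (this is my recollection of the default, not the original poster's snippet), inspecting it looks like:

```r
## Query the global (on-load) value of the 'Status OK' option;
## by default it should accept only the "OK" and "OK/green" codes.
mxOption(NULL, "Status OK")
```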
For example, if your models involve thresholds, you might want to set the option as follows:
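Something along these lines (a hedged sketch; the exact set of status codes to whitelist is a judgment call):

```r
## Also treat the "red" optimizer statuses (codes 5 and 6), which are
## common with thresholds, as acceptable during bootstrapping.
mxOption(NULL, "Status OK",
         c("OK", "OK/green", "not convex/red", "nonzero gradient/red"))
```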
Note that this is only a fix if your models are failing to converge in most of the bootstrap replications (which is possible, even if they do converge when fitted to your real dataset).
> Since the call to mxCompare() stops with an error, the object tmp is never created, so I cannot examine its attributes.
Gah, that's super annoying. I removed that code. Can you install v2.19.5-26-g8a688cd29 ?
Thank you sincerely. I am experiencing serious difficulties installing the package from GitHub; will come back to this problem when I have a bit more time to set everything up correctly and debug.
I think I know why this fails. mxAutoStart() requires an mxData() inside each component of a mixture model. mxCompare(boot = TRUE), by contrast, fails when there is an mxData() inside each component of the mixture model. Thus, one can currently have either mxAutoStart() or mxCompare(boot = TRUE), but not both. Please see the reproducible example in the attached file. Thank you for your support.
I bet the problem is here,
When you start both mixtures at the same point in the parameter space, the optimizer has trouble getting anywhere because it isn't clear that there is more than 1 mixture component.
Dear Joshua, I respectfully believe that this interpretation can be excluded, for multiple reasons:
1) The model does arrive at the correct solution for the 2-class mixture, despite having the same starting values for both classes (correct as validated against Mplus and Mclust).
2) When I manually provide the same starting values for both mixtures, but also put an mxData() object inside both classes, mxCompare(boot = TRUE) also fails. Moreover, when I reverse the order of the models, mxCompare() succeeds, but the number of degrees of freedom is incorrect: the sample sizes of all the datasets are added together.
3) When I manually provide the same starting values for both mixtures and do not put an mxData() object inside the classes, mxCompare(boot = TRUE) works. This demonstrates that putting mxData() in the submodels is what breaks mxCompare(boot = TRUE).
See attached script for a demonstration.
Additionally, manually removing the mxData() from the submodels after running mxRun() leads to mxCompare(boot = TRUE) passing without errors.
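That post-hoc stripping could be scripted roughly as follows (a hedged sketch: it assumes a fitted model's data slot may be emptied and that mxModel() replaces same-named submodels; m1 is a hypothetical mixture model):

```r
fit1 <- mxRun(m1)

## Empty the data slot of every class submodel after fitting,
## leaving only the top-level mxData for the bootstrap to resample.
dropData <- function(m) { m@data <- NULL; m }
fit1 <- mxModel(fit1, lapply(fit1@submodels, dropData))
```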
Of course you are correct. If there is more than one mxData in the model then each mxData is bootstrap resampled independently. So instead of a mixture model, you are fitting a nonsensical multiple group model.
I don't think there is a bug here. That's how it works by design.
But Joshua, isn't mxCompare() doing a bootstrap likelihood-ratio test? The bootstrap LRT doesn't resample the actual data. It generates data parametrically under the null model and fits both the null and alternative models to each generated dataset. The p-value is then the proportion of replications in which the LRT statistic exceeds the real-data LRT statistic.

Remember, mixture models were the whole motivation behind implementing the bootstrap LRT in OpenMx.
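Conceptually, the procedure amounts to something like this sketch (simulate_null(), m2ll_null(), and m2ll_alt() are hypothetical stand-ins, not OpenMx functions):

```r
## Parametric bootstrap LRT: the bootstrap distribution of the LRT
## statistic is built from data generated under the null model.
blrt_p_value <- function(observed_lrt, B = 400) {
  boot_lrt <- replicate(B, {
    d <- simulate_null()           # data generated under the fitted null
    m2ll_null(d) - m2ll_alt(d)     # -2logL difference = LRT statistic
  })
  mean(boot_lrt >= observed_lrt)   # share of replications exceeding observed
}
```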
> It generates data parametrically under the null model
Yeah, but that's not an important distinction. I agree that my diagnosis was not quite correct.
If you run mxGenerateData, you will see that it only finds the top-level mxData and ignores the mxData in the mixture components. If the mixture components contain an mxData, then they will use it instead of traversing up the model tree to the top-level mxData. So the result is similar to what I suggested before: the mixture components never see the bootstrapped datasets and just fit against the same datasets over and over again.

Thank you for the detailed explanation; it makes perfect sense that the individual classes should not contain mxData(). I guess it does not make much sense that this should break mxCompare(boot = TRUE), though, as the presence of mxData() in the classes should be irrelevant to the BLRT.
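That traversal behavior can be observed directly (a sketch; nullFit and the row count are placeholders):

```r
## Generate one replicate dataset from a fitted model; for a mixture,
## only the top-level mxData is consulted, not the per-class mxData.
simdat <- mxGenerateData(nullFit, nrows = 100)
str(simdat)
```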
Secondly, I wonder how to set starting values for a mixture model using mxAutoStart() without adding mxData() to the individual classes. Do you think mxAutoStart() should be able to handle cases where mxData() is present only in the top-level mxModel()?
In other words, mxCompare() with boot=TRUE doesn't actually use the MxData objects. It only needs the real-data -2logL and an expectation object.

Edit: To be clear, both the null and alternative models need to have valid expectation objects.
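For reference, a mixture specified this way might look like the following sketch (class1, class2, and mydata are hypothetical; each class submodel carries an expectation but no mxData):

```r
## Mixture weights: first component fixed as the softmax reference.
weights <- mxMatrix("Full", nrow = 2, ncol = 1,
                    free = c(FALSE, TRUE), values = 0, name = "weights")

mix <- mxModel("mix", class1, class2, weights,
               mxData(mydata, type = "raw"),    # data only at the top level
               mxExpectationMixture(components = c("class1", "class2"),
                                    weights = "weights", scale = "softmax"),
               mxFitFunctionML())
```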
If you do debug(OpenMx:::mxGenerateData) and then call mxCompare(..., boot=TRUE), then you will see that mxGenerateData is used (for the null model). Maybe I misunderstood something when we discussed the implementation. Rob, are you sure it can be made to work without mxData?

I'm looking over the code for mxGenerateData(), and yeah, I'm pretty sure that it can work without the models containing any MxData objects. It doesn't need raw data, just the expected sufficient statistics under the null model. Of course, what exactly mxGenerateData() does depends upon the definition of the method genericGenerateData() for the relevant expectation class... which I notice only has definitions for MxExpectationNormal, MxExpectationLISREL, MxExpectationRAM, and MxExpectationStateSpace. There is no definition for MxExpectationMixture.

If no mxData() is needed for mixture models, maybe it would be useful to disable the check for mxData() for mixture models altogether?