Model fails to terminate

mdewey
Joined: 01/21/2011 - 13:24

I am running a series of models on the same dataset. I have eight binary manifest variables, one latent variable and twelve groups. I am fixing the means and variances of the manifests and estimating the thresholds. The models differ as follows.
1 - common parameter (lambda) for all manifest variables but separate thresholds (tau - eight values)
2 - common lambda but separate tau for all items in each group (96 taus)
3 - separate lambda for each group (12) and separate tau (96)
4 - separate lambda for each manifest (8) with 8 tau
5 - separate lambda for each manifest (8) with 96 tau
6 - separate lambda for each manifest for each group (96) with 96 tau

I developed and tested the models using only two groups and, apart from some code green messages from the optimiser, all ran well and I got sensible-looking results. However, when I run the models on the full dataset, only model 1 terminates. I turned on checkpoint output and observed that all goes well for a while, but then the run stops writing to the checkpoint file without terminating. Looking at the last couple of checkpoints, the value of the objective seems to have settled down, although the parameter values are not that close. A further irritation is that on my Windows box it also freezes RGui for some reason. On my Linux box it just sits there. The checkpoint file stops being written in the same way under both operating systems.

I can make the scripts available. The dataset is not mine and I would prefer to email it rather than post it openly. But perhaps someone has an idea of what I should try next.

Ryne
Joined: 07/31/2009 - 15:12

Those are some big models for an ML problem. I'm surprised that model 4 had the problem. Things I would try:
-Run R from the terminal rather than the GUI. The GUI is a little more sensitive to memory issues.
-Run 64-bit if you can, especially in Windows to avoid OS-enforced memory limits.
-On the Windows box, increase the memory limit using the memory.limit function.
-Get very, very good starting values, perhaps from running each group independently. I might try this, then use those starting values in model 6, because for that model I would know that the starting values are exactly right.
-If it is a memory issue, maybe set the iteration limit low (below where the model was hanging based on the checkpoint file) and run it repeatedly with new values? Memory issues may keep you from running 500 iterations, but maybe you can run 50 iterations 10 times.
-Specify with definition variables rather than groups? That tends to be more efficient, though I've only really explored this for continuous data problems. Hopefully another dev team member with more back-end experience can verify this.
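
The "50 iterations 10 times" idea above could be sketched roughly as follows. This is only a sketch under assumptions: run_in_chunks is a hypothetical helper name, "Major iterations" is the optimizer option assumed to cap each run, and the status check assumes a recent OpenMx in which code 4 means the iteration limit was reached. It relies on mxRun returning a model that carries the latest estimates, so each pass starts where the previous one stopped.

```r
library(OpenMx)

# Hypothetical helper: run a model in short bursts instead of one long run.
# Assumes `model` is an MxModel that is ready to pass to mxRun().
run_in_chunks <- function(model, chunks = 10, iterations = 50) {
  # Cap the number of major iterations the optimizer may use per run.
  model <- mxOption(model, "Major iterations", iterations)
  for (i in seq_len(chunks)) {
    # mxRun() returns a fitted MxModel; feeding it back in restarts
    # the optimizer from the estimates reached so far.
    model <- mxRun(model, suppressWarnings = TRUE)
    # Assumption: status code 4 signals "iteration limit reached".
    # Any other code means the optimizer stopped on its own, so quit looping.
    code <- model$output$status$code
    if (isTRUE(code != 4)) break
  }
  model
}
```

Usage would be something like fit <- run_in_chunks(model), with the chunk size chosen to stay below the point where the checkpoint file stopped growing.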

Regardless, these are big models that will likely take a long time to run and be subject to memory issues. Each binary variable requires 3 dimensions of integration, so processing time will increase by a factor of 8 (really, 2^3) for each new variable. You're doing 12 simultaneous and interdependent 2^24 integrations, which is a lot. Switching from groups to definition variables may help some by at least making the problem 1 integration rather than 12 constrained ones.
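
To make the arithmetic in the paragraph above concrete, here is a small base R calculation. It takes the figures from the post at face value (3 dimensions per binary variable, 8 variables, 12 groups); the variable names are made up for illustration.

```r
# Figures quoted in the post, taken at face value.
dims_per_item <- 3    # dimensions of integration per binary variable
n_items       <- 8    # binary manifest variables
n_groups      <- 12   # groups fitted simultaneously

# Each extra item multiplies the work by 2^3 = 8.
factor_per_item <- 2^dims_per_item

# Relative cost of one group's integration: 8^8 = 2^24.
cost_per_group <- 2^(dims_per_item * n_items)

# Twelve such integrations per evaluation of the objective.
total_cost <- n_groups * cost_per_group

print(c(factor_per_item, cost_per_group, total_cost))
```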

mspiegel
Joined: 07/31/2009 - 15:24

I would like to take a look at this script. Is it appropriate to use the fakeData() function to generate some synthetic data? That assumes that the synthetic data behaves the same way in OpenMx as the original data.
One debugging tool would be to set the checkpoint units to "minutes" instead of "iterations". However, I suspect that if the optimizer is stuck at some iteration, then we may not reach the checkpoint mechanism at the end of the loop. (OpenMx is currently single-threaded, so there is no independent checkpoint thread.) You can change the checkpoint units with:

model <- mxOption(model, "Checkpoint Units", "minutes")

before the model is run.
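
A slightly fuller sketch of the same idea, wrapping the option above together with the related checkpoint settings. The helper name checkpoint_by_minutes and its defaults are made up for illustration; "Checkpoint Count" and "Checkpoint Directory" are assumed to behave as described in the mxOption documentation, and mxRun is called with checkpoint = TRUE so that the file is actually written.

```r
library(OpenMx)

# Hypothetical helper: write a checkpoint every `minutes` minutes of
# wall-clock time instead of every N optimizer iterations.
# Assumes `model` is an MxModel that is ready to pass to mxRun().
checkpoint_by_minutes <- function(model, minutes = 1, directory = ".") {
  model <- mxOption(model, "Checkpoint Units", "minutes")
  model <- mxOption(model, "Checkpoint Count", minutes)
  model <- mxOption(model, "Checkpoint Directory", directory)
  # Checkpointing is off by default in mxRun(), so switch it on here.
  mxRun(model, checkpoint = TRUE)
}
```

As noted above, if the optimizer is stuck inside a single iteration, a time-based checkpoint may still never be reached, so this is a diagnostic rather than a fix.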

mdewey
Joined: 01/21/2011 - 13:24

Thank you Ryne and Michael for the comments. Certainly much food for thought there. My collaborators are temporarily willing to look at things group by group but I would like to be able to do the complete dataset in one pass if possible.

Michael, I will email you the dataset and scripts when I get home this evening or tomorrow morning (I am in timezone UTC+0). (I have found your email address from your home page.) I am currently using the default for checkpointing which seems to be every ten minutes.

mspiegel
Joined: 07/31/2009 - 15:24

Turning off standard errors and accurate Hessian calculations has resolved the issue for these specific scripts.

model <- mxOption(model, "Standard Errors", "No")
model <- mxOption(model, "Calculate Hessian", "No")

Maybe we need to revisit this issue and insert some sort of time-out for these calculations?

mdewey
Joined: 01/21/2011 - 13:24

Thanks for all the help on this one. A combination of turning off the calculation of standard errors and getting better starting values has given a solution to the models (I still have the most complex one to finish). Inevitably I have more questions and comments.

1 - I assume the Hessian is calculated once at the end, so I will never be able to get it? This is not a deal-breaker, just curiosity.
2 - I only realised after a while the implications of mxRun returning an object of class MxModel. This is a nifty feature, but since it is quite unlike what the usual R fitting functions do, I think a paragraph in the manual would be good, perhaps in an appendix.
3 - I have not worked out how to combine the separate models using the feature mentioned in 2, but omxSetParameters was my friend here. Again, this would be a useful thing to flag in the manual.

Thanks again.
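
For anyone finding this thread later, the omxSetParameters approach mentioned in point 3 might look something like the sketch below. It is not the script used in this thread: copy_estimates is a hypothetical helper, and it assumes the free parameters carry the same labels in the already-fitted model and in the larger model.

```r
library(OpenMx)

# Hypothetical helper: copy estimates from a fitted model into another
# model as starting values, matching free parameters by label.
copy_estimates <- function(from, to) {
  est <- omxGetParameters(from)    # named vector: label -> estimate
  # Only labels that exist as free parameters in both models are copied.
  shared <- intersect(names(est), names(omxGetParameters(to)))
  if (length(shared) == 0) {
    return(to)                     # nothing in common, leave `to` untouched
  }
  omxSetParameters(to, labels = shared, values = est[shared])
}
```

Calling it once per single-group fit, for example big <- copy_estimates(group1Fit, big), would build up starting values for the full model one group at a time (group1Fit and big are placeholder names).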