Hi,
I am fitting a mixture model for a genetic association test that includes sibships of variable sizes, with observed and missing genotypes; the model specification is based on the Acemix2.R script.
I would like to know whether it is possible to obtain the correct number of observed statistics without having to specify it manually in the summary() function.
Also, the mixture models for 4 sibs are very slow (the sibship=4 mixture is a 27-component mixture). Could you give me some suggestions on how to make the scripts run faster?
Thanks,
camelia
With regard to improving performance on the Acemix2.R script: OpenMx (currently) does not perform common subexpression elimination, which means that if the same subexpression appears in multiple algebras, it is calculated multiple times. You should pull out common subexpressions into separate algebras. Remember that "%*%" binds tighter than "+", so pull out the "%*%" products first. Also, if the script is taking a long time to run, make sure to turn on checkpointing. Here are some common subexpressions in that script:
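As a rough sketch of the idea (not the actual expressions from Acemix2.R; every matrix and algebra name below is made up for illustration), each repeated matrix product can be given its own named mxAlgebra, and the larger algebras can then refer to it by name:

```r
library(OpenMx)

# Hypothetical 1x1 path-coefficient matrices; the names and start values
# are illustrative and are not taken from Acemix2.R.
aMat <- mxMatrix(type = "Full", nrow = 1, ncol = 1, free = TRUE,
                 values = 0.6, name = "a")
cMat <- mxMatrix(type = "Full", nrow = 1, ncol = 1, free = TRUE,
                 values = 0.6, name = "c")
eMat <- mxMatrix(type = "Full", nrow = 1, ncol = 1, free = TRUE,
                 values = 0.6, name = "e")

# Each matrix product is written once, in its own named algebra ...
algA <- mxAlgebra(a %*% t(a), name = "A")
algC <- mxAlgebra(c %*% t(c), name = "C")
algE <- mxAlgebra(e %*% t(e), name = "E")

# ... and so is the sum that several covariance blocks share.
algV <- mxAlgebra(A + C + E, name = "V")

# The expected covariances refer to A, C and V by name instead of
# repeating the products inside every block.
expCovMZ <- mxAlgebra(rbind(cbind(V,     A + C),
                            cbind(A + C, V)), name = "expCovMZ")
expCovDZ <- mxAlgebra(rbind(cbind(V,             0.5 %x% A + C),
                            cbind(0.5 %x% A + C, V)), name = "expCovDZ")

model <- mxModel("cseSketch", aMat, cMat, eMat,
                 algA, algC, algE, algV, expCovMZ, expCovDZ)
```

The same pattern should carry over to the mixture script: wherever a product such as a %*% t(a) shows up in more than one algebra, name it once and reuse the name.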
Camelia
You make a good point - OpenMx should automatically detect that the same data are being used in different mxModel() mxData() commands, and score the observed statistics only ONCE. Something for Michael S to look at, methinks.
As far as making them run faster, you could try turning off standard errors and Hessian calculation with mxOption():
model <- mxOption(model, "Standard Errors", "No")
model <- mxOption(model, "Calculate Hessian", "No")
I would have thought the Hessian sufficient, as Standard Errors need the Hessian...
Of course, if you want the standard errors, this won't help.
Oh, that's not a bug, it's a feature. I didn't notice that part of the script. The correct solution is to add the mxData() statement to the outer "twinACE" model, and delete the mxData() statements in the MZ and DZ submodels. Data trickles down from parent to child when the child model has not specified a data set of its own.
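A minimal sketch of that arrangement, assuming a made-up data frame called sibData and empty placeholder submodels (the real MZ and DZ submodels would still hold their matrices, algebras and objectives):

```r
library(OpenMx)

# Hypothetical stand-in for the sibship data; the real script's columns differ.
sibData <- data.frame(sib1 = rnorm(20), sib2 = rnorm(20))

# The submodels carry no mxData() statement of their own ...
mzModel <- mxModel("MZ")
dzModel <- mxModel("DZ")

# ... and the data set is attached once, on the outer "twinACE" model.
twinACE <- mxModel("twinACE", mzModel, dzModel,
                   mxData(observed = sibData, type = "raw"))
```

With this layout there is a single mxData() object in the whole model tree, so the data are no longer duplicated in every submodel.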
Wow, cool, I didn't know that! No doubt it will speed things up a bit not to have scads of copies of the same data.
Your suggestions were very helpful, thank you!
camelia