Is cross-validation available in OpenMx?

Veronica_echo Joined: 02/23/2018

Hi everyone,

I would like to use k-fold cross-validation to select the number of latent classes in finite mixture models, and I am wondering whether any functions in OpenMx support it. For the cross-validation part, the only things I need are the -2 log-likelihood, information criteria such as AIC and BIC, and status codes (i.e., the indicator of convergence). Is there any way to speed up this process? Thanks in advance.

Replied on Mon, 01/28/2019 - 23:42
Veronica_echo Joined: 02/23/2018

In reply to by AdminRobK

Dr. Kirk, thanks for your reply. I am now writing cross-validation code to select the number of latent classes. That is, I will run a CV for two classes, three classes, etc., which is time-consuming. Since OpenMx allows parallel computation, I am wondering whether I can split my program and assign each part to a different core. For example, when running CV for 2, 3, and 4 classes, may I assign the 2-class CV to one core, the 3-class CV to another core, and the 4-class CV to a third? If so, would you mind letting me know how to set it up? If not, could you kindly give me advice on improving computational efficiency? Thanks in advance.

Replied on Tue, 01/29/2019 - 10:25
AdminRobK Joined: 01/24/2014

In reply to by Veronica_echo

Have you been increasing mxOption "Number of Threads" from its on-load default of 2? That would increase the number of threads the OpenMx backend can use for the parts of its code that are written with multithreading, during a single call to `mxRun()`. Also, if you're using the SLSQP optimizer and `mxFitFunctionML()`, you might be able to speed things up a bit if you pass argument `rowwiseParallel=FALSE` to `mxFitFunctionML()`.
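In code, those two suggestions look something like the following (just a sketch; the fit function is shown in isolation here rather than attached to a full model):

```r
library(OpenMx)

# Raise the global thread count from the on-load default of 2:
mxOption(NULL, "Number of Threads", 8)

# With SLSQP, turning off row-wise parallelism lets the numeric gradient
# be parallelized across free parameters instead of across data rows:
fitfun <- mxFitFunctionML(rowwiseParallel = FALSE)
```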

You may be able to divvy up your cross-validation computing among three different physical cores using package 'snowfall' or something similar. I've never done that myself, but we have a script in our test suite that does it, and it might be useful as an example.
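A rough, untested sketch of what that could look like with 'snowfall' (here `fitCV()` is a hypothetical wrapper you would write to run the full k-fold CV for a given number of classes):

```r
library(snowfall)

# Hypothetical wrapper: fits the k-class mixture model on each CV fold and
# returns the quantities of interest (-2lnL, AIC/BIC, status codes).
fitCV <- function(k) {
  library(OpenMx)  # each worker process needs the package loaded
  # ... build the k-class model, mxRun() it on each fold, collect results ...
}

sfInit(parallel = TRUE, cpus = 3)   # one worker per class count
results <- sfLapply(2:4, fitCV)     # 2-, 3-, and 4-class CV in parallel
sfStop()
```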

Replied on Tue, 01/29/2019 - 16:01
Veronica_echo Joined: 02/23/2018

In reply to by AdminRobK

Thanks for your kind reply! For the mxOption "Number of Threads", do you mean `Sys.setenv(OMP_NUM_THREADS = cores)`?

In my setting, I let cores be 2, 4, 6, 8, ..., up to 18, and ran a pilot to see which was fastest. Unfortunately, the timings confused me further. I've listed the times for each scenario:
              user    system    elapsed
core2   159779.012    68.722  90475.270
core4   608046.029    61.886 202908.281
core6   still running
core8   248999.112   194.954  71871.313
core10  291588.438   614.642  82519.852
core12  410614.392  1825.893 186519.219
core14  290376.741  1378.912 144157.770
core16  303809.290  1419.180 148126.370
core18  still running

What I want is to reduce the elapsed time, but it does not seem to drop as I increase the number of cores. Do you have suggestions on choosing the number of threads? Thank you.

I am going to try the snowfall package.

Replied on Tue, 01/29/2019 - 17:02
AdminRobK Joined: 01/24/2014

In reply to by Veronica_echo

For the mxOption "Number of Threads", do you mean `Sys.setenv(OMP_NUM_THREADS = cores)`?

Perhaps that could work if you do it before you load OpenMx into R's workspace, but what I had in mind was something like `mxOption(NULL,"Number of Threads",8)`.

In my setting, I let cores be 2, 4, 6, 8, ..., up to 18, and ran a pilot to see which was fastest. Unfortunately, the timings confused me further. I've listed the times for each scenario:

Without more information, I can't really interpret those results. Based on our past testing, we found steady improvements in running time as we increased the number of threads from 1 to 11, with diminishing returns beyond 11. Those tests were conducted when the multithreaded work was evaluating the loglikelihood of each row of the raw dataset (i.e., analyzing raw data with `mxFitFunctionML()`). In contrast, if you're using SLSQP and argument `rowwiseParallel=FALSE` to `mxFitFunctionML()`, you should see little, if any, improvement once the number of threads exceeds the number of free parameters in your model.

Something else I just thought of: if you're confident that your MxModel won't raise errors during the call to `mxRun()`, you could try passing argument `unsafe=TRUE` to `mxRun()`, which will cause OpenMx to skip a lot of the input checking that it does in the frontend.

Replied on Tue, 01/29/2019 - 21:22
Veronica_echo Joined: 02/23/2018

In reply to by AdminRobK

I attached a screenshot to show how I wrote that line and loaded the package. I believe I did it before loading OpenMx.

I am wondering whether the "best" number of threads would be affected by the performance of the nodes. I ask because some of the cluster nodes I can access seem to compute much less efficiently than others (50+ hrs vs. 10 hrs for the same code).

I am not so confident that my model won't raise an error. The primary process is like this: I generate data with 2 latent classes, then run a 5-fold cross-validation to select the "best" among mixture models with 2, 3, and 4 latent classes. I am pretty sure the 2-class model will work well, but the 3- and 4-class models may report errors, warnings, etc. The computational burden of my code definitely lies in this cross-validation. However, from this part I only need a few basic pieces of information: point estimates, status codes, information criteria, and the -2 log-likelihood. Can I turn off some "unused" features to shorten the required time?

Replied on Wed, 01/30/2019 - 11:01
AdminRobK Joined: 01/24/2014

In reply to by Veronica_echo

I am wondering whether the "best" number of threads would be affected by the performance of the nodes. I ask because some of the cluster nodes I can access seem to compute much less efficiently than others (50+ hrs vs. 10 hrs for the same code).

Maybe? Are you sure that all the nodes run identical hardware? Are you competing with other users for CPU time and memory more on some nodes versus others? Are you running the exact same job on different nodes and/or with differing numbers of threads (e.g., are you using the same random-number-generator seed)?

I am not so confident that my model won't raise an error. The primary process is like this: I generate data with 2 latent classes, then run a 5-fold cross-validation to select the "best" among mixture models with 2, 3, and 4 latent classes. I am pretty sure the 2-class model will work well, but the 3- and 4-class models may report errors, warnings, etc.

If you can successfully run a 2-, 3-, and 4-class model with one randomly generated dataset (and the default `unsafe=FALSE`), then you should be able to run all three with any random dataset and `unsafe=TRUE`. Using `unsafe=TRUE` only skips sanity checks that happen in the frontend at runtime; warnings, and errors that might occur during optimization, are unaffected by it.

The computational burden of my code definitely lies in this cross-validation. However, from this part I only need a few basic pieces of information: point estimates, status codes, information criteria, and the -2 log-likelihood. Can I turn off some "unused" features to shorten the required time?

Are you using a custom compute plan? You could skip some of the compute steps that are in the default compute plan. For instance, this:

plan <- omxDefaultComputePlan()
# Keep only gradient-descent optimization (GD) and the final report of the
# expectation (RE), dropping the numeric-derivative steps:
plan$steps <- c(plan$steps$GD, plan$steps$RE)

would skip anything related to numeric derivatives. The disadvantage is that the status code you'd get would be optimizer-specific, and based on numeric derivatives calculated "quickly" by the optimizer (as opposed to calculated "carefully" by OpenMx after the optimizer is finished).
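As for the handful of quantities you say you need, you can pull them straight out of the fitted model after `mxRun()` returns. Something like this should work (a sketch, assuming `model` is your MxModel; `AIC()` and `BIC()` work because OpenMx supplies a `logLik()` method):

```r
fit <- mxRun(model, unsafe = TRUE)

minus2LL <- fit$output$Minus2LogLikelihood  # -2 log-likelihood
status   <- fit$output$status$code          # optimizer status code
aic      <- AIC(fit)                        # information criteria via logLik()
bic      <- BIC(fit)
ests     <- omxGetParameters(fit)           # point estimates
```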

I suppose another possible speed-up would be to use an analytic gradient of the fitfunction.

Replied on Mon, 10/03/2022 - 08:14
maugavilla Joined: 10/19/2021

Hi Veronica

I am working on a similar problem with mixture models. Did you find a good solution? And would you be willing to share example code?

Thanks