Many many submodels (how S4 builds objects and a request to run lists of models)

I'm trying to run the same model on 500 (or some other large n) datasets with the independence flag set to TRUE. I'm doing this by creating one model per dataset, building a list of mxModels, then putting this list into a single mxModel for optimization. However, I'm spending most of my time on the fourth line of this code:
singScore <- transformFactorScores(spRes, 1, "mu", "sigma", "epsilon")
singScore@independent <- TRUE
singSM <- replicate(un, singScore)
singParScore <- mxModel("Singletons", singSM)
singRes <- mxRun(singParScore)
Lines 1-3 together take less than 0.40 seconds, and the optimization of all 500 models (line 5) takes 40 to 50 seconds. The fourth line, however, takes 3 minutes. If I instead split the fourth and fifth lines into 5 separate models of 100 submodels each, the total time for everything drops to 70 seconds, and to roughly 60 seconds with 10 models of 50 submodels each.
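The split into smaller container models described above can be sketched with a small base R helper. This is only an illustration: chunkList is a hypothetical name, the integer list stands in for the real list of submodels, and the OpenMx calls are left as comments because they depend on objects from the post.

```r
# Split a list into consecutive chunks of at most `size` elements (base R only).
# chunkList is a hypothetical helper, not part of OpenMx.
chunkList <- function(x, size) {
  stopifnot(size >= 1)
  if (length(x) == 0) return(list())
  groups <- ceiling(seq_along(x) / size)
  idx <- split(seq_along(x), groups)
  lapply(unname(idx), function(i) x[i])
}

# Stand-in for the 500 submodels; in the post this would be singSM.
demo <- as.list(seq_len(500))
chunks <- chunkList(demo, 100)
print(length(chunks))   # number of container models
print(lengths(chunks))  # submodels per container

# Hypothetical use with the objects from the post (not run here):
# containers <- lapply(seq_along(chunks), function(i)
#   mxModel(paste0("Singletons", i), chunks[[i]]))
# results <- lapply(containers, mxRun)
```

Changing the second argument (100, 50, ...) makes it easy to time different chunk sizes against each other.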
I believe this has to do with how S4 builds objects and interacts with apply statements. When I tell mxModel to add these 500 submodels, my understanding of S4 is that they are added one at a time, essentially creating a holder with 1 submodel, then a holder with 2, and so on.
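The pattern I have in mind can be sketched with a toy S4 class. This is not how mxModel is known to work internally; Holder, growOneAtATime and buildAtOnce are made-up names that only illustrate the difference between growing a slot one element at a time and filling it in one assignment. Timings will vary by machine and may be small for toy objects like these.

```r
library(methods)

# Toy S4 container with a single list slot (hypothetical, not an OpenMx class).
setClass("Holder", representation(submodels = "list"))

# Grow the slot one element at a time: each assignment rebuilds the list.
growOneAtATime <- function(items) {
  holder <- new("Holder")
  for (item in items) {
    holder@submodels <- c(holder@submodels, list(item))
  }
  holder
}

# Fill the slot once with the complete list.
buildAtOnce <- function(items) {
  new("Holder", submodels = items)
}

# Stand-in payloads; real submodels would be much larger objects.
items <- replicate(500, list(payload = rnorm(10)), simplify = FALSE)

print(system.time(a <- growOneAtATime(items)))
print(system.time(b <- buildAtOnce(items)))
print(identical(a@submodels, b@submodels))
```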
My questions are:
- Is there a better way to benefit from parallelism for this approach?
- Is there a smart way to determine exactly how to balance this S4 slowdown against parallelism?
- Here's the feature request: is it a worthwhile endeavor to allow mxRun to optimize and return a list of models? Is an mxList worth the function crawl?
To clarify and improve the
In reply to To clarify and improve the by Ryne
Try doParallel package?
In reply to To clarify and improve the by Ryne
omxLapply
So:
singScore <- transformFactorScores(spRes, 1, "mu", "sigma", "epsilon")
singSM <- replicate(un, singScore)
singRes <- omxLapply(singSM, mxRun)
That way you can avoid building and flattening the large container model.
omxLapply will take advantage of snowfall if you have it installed on your machine so you can benefit from multiprocess parallelism. If snowfall is not installed, it will run sequentially. And, of course, you can still use thread-level parallelism if you're running a FIML model.
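A minimal sketch of the fallback behaviour, assuming nothing beyond what is described above: the toy squaring function stands in for mxRun so the snippet runs without any fitted models, and it drops back to plain lapply when OpenMx itself is not installed. The snowfall setup is shown only as comments, and the cpus value is an arbitrary example.

```r
# Use omxLapply when OpenMx is installed, otherwise plain lapply.
applyFun <- if (requireNamespace("OpenMx", quietly = TRUE)) {
  OpenMx::omxLapply
} else {
  lapply
}

# Toy stand-in for mxRun so the example needs no models.
res <- applyFun(seq_len(4), function(i) i^2)
print(unlist(res))

# With snowfall installed, a cluster would be started first (not run here;
# the number of cpus is an assumption for illustration):
# library(snowfall)
# sfInit(parallel = TRUE, cpus = 4)
# sfLibrary(OpenMx)
# singRes <- omxLapply(singSM, mxRun)
# sfStop()
```

Without the snowfall setup the call simply runs sequentially, so the same script works on machines with and without a cluster.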
In reply to omxLapply by tbrick
I completely forgot about