
Many many submodels (how S4 builds objects and a request to run lists of models)

Ryne (Joined: 07/31/2009 - 15:12)

So I'm following up on some of my factor score work and happened across an old question I had regarding holder (parent) models for embarrassingly many submodels.

I'm trying to run the same model on 500 (or some other large n) datasets, with the independence flag set to TRUE. I'm doing this by creating one model per dataset, building a list of mxModels, then putting this list into a single mxModel for optimization. However, most of the time is spent on the fourth line of this code:

singScore <- transformFactorScores(spRes, 1, "mu", "sigma", "epsilon")  # build the template model
singScore@independent <- TRUE                                           # flag it as independent
singSM <- replicate(un, singScore)                                      # make un (here 500) copies in a list
singParScore <- mxModel("Singletons", singSM)                           # put all copies into one holder model
singRes <- mxRun(singParScore)                                          # optimize all submodels

Lines 1-3 combine for less than 0.40 seconds, and the optimization of all 500 models (line 5) takes 40 or 50 seconds. The fourth line, however, takes 3 minutes. If I instead split the fourth and fifth lines into 5 separate holder models of 100 submodels each, the total time drops to about 70 seconds for everything, and to roughly 60 seconds for 10 holders of 50 submodels each.
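
For reference, the 5 x 100 split I'm describing looks roughly like this (a sketch only; the chunk size and holder names are placeholders, and singSM is the list from the snippet above):

chunkSize <- 100
chunks    <- split(singSM, ceiling(seq_along(singSM) / chunkSize))   # 5 lists of 100 submodels
holders   <- lapply(seq_along(chunks),
                    function(i) mxModel(paste0("Singletons", i), chunks[[i]]))
resList   <- lapply(holders, mxRun)                                  # optimize each holder separately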

I believe this has to do with how S4 builds objects and how it interacts with apply statements. When I tell mxModel to add these 500 submodels, my understanding of S4 is that they are added one at a time, essentially creating a holder with 1 submodel, then a holder with 2, and so on.
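
A toy illustration of that kind of copy-on-grow cost in plain R (not OpenMx-specific, just the general pattern I suspect is at work):

growOneAtATime <- function(n) {
  out <- list()
  for (i in seq_len(n)) out <- c(out, list(i))   # re-copies 'out' on every addition
  out
}
system.time(growOneAtATime(5000))    # cost grows roughly quadratically with n
system.time(as.list(seq_len(5000)))  # building the whole list at once is nearly instant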

My questions are:
-is there a better way to benefit from parallelism for this approach?
-is there a smart way to determine exactly how to balance this S4 slowdown against parallelism?
-here's the feature request: is it a worthwhile endeavor to allow mxRun to optimize and return a list of models? Is an mxList worth the function crawl?

Ryne (Joined: 07/31/2009 - 15:12)

To clarify and improve the last question, how could I lapply mxRun and still benefit from parallelism?

neale (Joined: 07/31/2009 - 15:14)
Try the doParallel package?

I've used doParallel with some success in the past.
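
Something along these lines might work (a rough sketch only; the worker count is arbitrary and singSM is the list of submodels from the first post):

library(doParallel)
cl <- makeCluster(4)                  # pick a worker count for your machine
registerDoParallel(cl)
singRes <- foreach(m = singSM, .packages = "OpenMx") %dopar% mxRun(m)
stopCluster(cl)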

tbrick (Joined: 07/31/2009 - 15:10)
omxLapply

If you don't need to have them all in one master model, why not use omxLapply to apply mxRun to the list of submodels?

So:

singScore <- transformFactorScores(spRes, 1, "mu", "sigma", "epsilon")  # same template model as before
singSM <- replicate(un, singScore)                                      # un copies of it
singRes <- omxLapply(singSM, mxRun)                                     # run each copy as its own model

That way you can avoid building and flattening the large container model.
omxLapply will take advantage of snowfall if you have it installed on your machine, so you can benefit from multiprocess parallelism. If snowfall is not installed, it will run sequentially. And, of course, you can still use thread-level parallelism if you're running a FIML model.
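
A sketch of what the snowfall setup might look like (I'm assuming omxLapply picks up a running snowfall session; the cpu count is a placeholder, so check the snowfall and OpenMx docs for the exact requirements):

library(snowfall)
sfInit(parallel = TRUE, cpus = 4)   # start a snowfall cluster
sfLibrary(OpenMx)                   # load OpenMx on the workers
singRes <- omxLapply(singSM, mxRun)
sfStop()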

Ryne (Joined: 07/31/2009 - 15:12)

I completely forgot about omxLapply. Thanks!