Parallelization of the objective function for definition variables

MaximilianStefan's picture
Joined: 12/04/2018 - 20:46
Attachment: MWE.R (3.9 KB)

We are currently trying to benchmark our SEM software against OpenMx for a SEM with unique definition variables per person.
With ML estimation, each person has a unique model-implied covariance matrix that has to be inverted separately, so I assumed that parallelizing the objective function would improve performance drastically. However, changing the number of threads does not change the performance, so I assume I did something wrong. I followed this wiki page's instructions:
https://openmx.ssri.psu.edu/wiki/speed-parallel-running-and-efficiency

omxDetectCores() # returns 8
getOption('mxOptions')$"Number of Threads" # returns 2
mxOption(model= yourModel, key="Number of Threads", value= (omxDetectCores() - 1)) #does not change time to fit the model, regardless of the value I pass

A minimal working example is attached.

Also, the same wiki page says

> Streamlining estimation
> If you have a large complex model, it may take hours to run by default. You can often speed up your model by turning off computation of elements that are only needed on a final evaluation, like the Hessian and standard errors.

How can I do that?

We are also thankful for any further suggestions on how to improve the performance.

Best,
Maximilian

Leo's picture
Leo
Joined: 01/09/2020 - 14:36
Just some troubleshooting:
- Does parallelization work in general? Try umx_check_parallel in the umx package.
- If not, try another operating system, i.e. Linux or macOS.

jpritikin's picture
Joined: 05/24/2012 - 00:35
enable diagnostics

Did you try running your model with

mxOption(key="Parallel diagnostics", value="Yes")

?

AdminRobK's picture
Joined: 01/24/2014 - 12:15
Windows; semantics

What is your mxVersion() output? Specifically, are you running your script under MS Windows? Neither CRAN nor we build multithreaded Windows binaries of OpenMx.

Also, this line:

mxOption(model= growthCurveModel, key="Number of Threads", 
         value= (omxDetectCores() - 1)) 
#does not change time to fit the model, 
#regardless of the value I pass

If you provide an MxModel object for argument model, then mxOption() returns an MxModel object with the appropriate option set or cleared, as the case may be (see the man page for mxOption()). Your script needs to store the output of that line's call to mxOption() in an object. Or even more simply, just set the option globally by providing NULL for argument model.
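
As a minimal sketch of the two approaches just described (the empty growthCurveModel below is only a placeholder standing in for the model from the attached script, and nThreads is a name introduced here for illustration):

```r
library(OpenMx)

# Placeholder model for illustration only; substitute your own MxModel object.
growthCurveModel <- mxModel("growthCurve")

# Leave one core free; fall back to 1 if the core count cannot be detected.
nThreads <- max(1L, omxDetectCores() - 1L, na.rm = TRUE)

# Approach 1: mxOption() returns a modified copy of the model,
# so the result has to be assigned back to take effect.
growthCurveModel <- mxOption(model = growthCurveModel,
                             key = "Number of Threads",
                             value = nThreads)

# Approach 2: pass NULL for the model to set the option globally.
mxOption(model = NULL, key = "Number of Threads", value = nThreads)
```

Either call only records the setting; whether more threads are actually used still depends on having a multithreaded build of OpenMx.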

Leo's picture
Leo
Joined: 01/09/2020 - 14:36
I did not realize that as I mostly use Linux. Is there a way to get multi-threaded performance on Windows? I only found this thread:
https://openmx.ssri.psu.edu/wiki/speed-parallel-running-and-efficiency

jpritikin's picture
Joined: 05/24/2012 - 00:35
windows

> Is there a way to get multi-threaded performance on Windows?

Not yet. We're waiting for gcc support.

MaximilianStefan's picture
Joined: 12/04/2018 - 20:46
Windows; Semantics

Thanks for the very fast and helpful suggestions. I am using Windows, so this should explain why it is not working - I will try it on Linux next week and ask again if I still can't get it to work. (Also the comment about semantics by AdminRobK is helpful; I somehow just assumed mxOption changes the option in place for the model passed)

Regarding what is written about the Hessian and standard error computation: Does this just refer to setting the mxOptions "Calculate Hessian" and "Standard Errors", or is there more to it?

We really would like to get the best performance OpenMx can deliver, but we are by no means experienced users, so if somebody has further ideas about which options to try to speed up OpenMx (for this kind of model or in general), we would be thankful.
I thought setting "RAM Max Depth" to 1 could help in this case (because our longest directed path is of length one), but it drastically decreased performance. Maybe this is because all parameters in the A matrix are either fixed or definition variables?
In terms of optimizer choice, we are just going to try the three options, but in terms of further optimizer settings we don't know how to make an educated guess about what could improve performance.

Best,

Maximilian

jpritikin's picture
Joined: 05/24/2012 - 00:35
RAM Max Depth

> I thought setting "RAM Max Depth" to 1 could help in this case (because our longest directed path is of length one), but it drastically decreased performance.

The default setting should give you optimal performance. The main reason you might reduce "RAM Max Depth" is if you knew a priori that you had cycles of regressions.

> Maybe this is because all parameters in the A matrix are either fixed or definition variables?

The only thing that matters is whether A matrix entries are zero or not. It doesn't matter why they are non-zero.

AdminRobK's picture
Joined: 01/24/2014 - 12:15
some remarks

> (Also the comment about semantics by AdminRobK is helpful; I somehow just assumed mxOption changes the option in place for the model passed)

Unfortunately, R is just not designed for that kind of "in-place" modification of a user-created object. The user can, however, modify the R workspace's options list without use of the assignment operator. That's what happens if you provide NULL for argument model in a call to mxOption(), and that's how I usually set mxOptions.

> Regarding what is written about the Hessian and standard error computation: Does this just refer to setting the mxOptions "Calculate Hessian" and "Standard Errors" or is there more to it?

If I understand the question, then no, setting those two options is all there is to it.
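
For illustration, a minimal sketch of setting those two options globally (by passing NULL for the model, as described above); this assumes the usual "Yes"/"No" values for these option keys:

```r
library(OpenMx)

# Skip the Hessian and standard errors during intermediate or benchmark runs.
mxOption(model = NULL, key = "Calculate Hessian", value = "No")
mxOption(model = NULL, key = "Standard Errors", value = "No")

# ... fit the model with mxRun() here ...
```

For a final run where standard errors are needed, set both options back to "Yes" before calling mxRun().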

> In terms of optimizer choice we are just going to try the three options, but in terms of further optimizer settings we don't know if we could take an educated guess about what could improve performance.

I don't have any suggestions for tuning optimizer settings other than what's already been discussed in this thread. I will remark that, in my experience, whenever the 3 gradient-based optimizers all converge to the same solution, NPSOL does so with fewer objective-function evaluations.
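
If it helps with the comparison, here is a rough sketch of timing one model under each optimizer your build provides. timeOptimizers is a hypothetical helper written for this post, not part of OpenMx, and it measures wall-clock time rather than the number of objective-function evaluations:

```r
library(OpenMx)

# Hypothetical helper: fits `model` once per optimizer and returns the
# elapsed time in seconds for each, named by optimizer.
timeOptimizers <- function(model, optimizers = mxAvailableOptimizers()) {
  # Remember the current global optimizer and restore it when done.
  old <- getOption("mxOptions")[["Default optimizer"]]
  on.exit(mxOption(model = NULL, key = "Default optimizer", value = old),
          add = TRUE)

  sapply(optimizers, function(opt) {
    mxOption(model = NULL, key = "Default optimizer", value = opt)
    system.time(mxRun(model, silent = TRUE))[["elapsed"]]
  })
}

# Usage, assuming growthCurveModel is your own MxModel with data attached:
# timeOptimizers(growthCurveModel)
```

Note that NPSOL only shows up in the list if your OpenMx build includes it.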

> Is there a way to get multi-threaded performance on Windows?

If your Windows system is configured to build R packages from source, you could modify src/Makevars.win.in in your clone of the OpenMx source repository in order to enable OpenMP. With the correct configuration, you will successfully install an OpenMP-enabled Windows build of OpenMx. However, it will be nearly useless to you: trying to do anything multithreaded with it will eventually crash R. Something like 5 years ago, I tried to figure out why it crashes. I stepped through some multithreaded loop code in the GNU Debugger, and watched as the code overran the bounds of an array before my very eyes. That was when the R developer toolchain for Windows used gcc 4.9; at the time, we assumed the crashing was a compiler bug. But with the release of R 4.0, the toolchain adopted gcc 8. Unfortunately, it is still not possible to compile a multithreaded Windows build of OpenMx that is actually thread-safe (I tried over the summer).

Edit: the current toolchain uses gcc 8, not 9.