Performance regression of OpenMx for simple CFA model

Posted on Fri, 11/12/2021 - 06:17

MaximilianStefan Joined: 12/04/2018

Forums

I specified a simple CFA model (5 factors, 20 Items each), simulated data from it (25 observations for each parameter) and fitted the model in lavaan and OpenMx. Surprisingly, OpenMx takes about 100x longer to fit the model (same starting values in lavaan and OpenMx). I attached an MWE.
The results on my machine are
lavaan: 0.16 seconds/31 iterations
OpenMx: 13.2 seconds/135 iterations

So 5x more iterations and 100x more time needed. I tried different starting values, but the overall picture does not change. OpenMx also tells me there were 29917 evaluations (fit@output$evaluations), which seems way to high. I assume I did something wrong in specifying the OpenMx model - maybe it is using numeric gradients instead of analytic ones?

File attachments

obs_cov.rda

lavmod.rda

Replied on Fri, 11/12/2021 - 10:54

AdminNeale Joined: Mar 01, 2013

A bug probably

Thank you for the post and the working minimal example. At a quick look it seems that the attempts to switch off the Standard Error calculations are not being heeded, since Yes and No result in the same timings. So we are working on it.

Thanks again!

Replied on Fri, 11/12/2021 - 13:47

AdminRobK Joined: Jan 24, 2014

correction

At a quick look it seems that the attempts to switch off the Standard Error calculations are not being heeded, since Yes and No result in the same timings

On closer inspection, that turned out not to be so.

Replied on Fri, 11/12/2021 - 11:05

jpritikin Joined: May 23, 2012

performance

OpenMx doesn't have analytic gradients for RAM models. That's not implemented yet. It looks like OpenMx takes a lot of time estimating the gradient. You can get somewhat better performance with multiple threads by getting rid of `mxOption(NULL, "Default optimizer", "NPSOL")` and using the default SLSQP.

Replied on Fri, 11/12/2021 - 11:35

MaximilianStefan Joined: Dec 04, 2018

Thanks for the helpful

Thanks for the helpful comments - is there a way to specify my model in a different way to get analytic gradients? Or is it just out of scope at the moment?

Another question, SLSQP related: the OpenMx 2.0 psychometrica Paper it says "the open-source NLopt family of optimizers is now selectable" - does this refer to SLSQP? Is there a way to also use the other optimizers from NLopt? Because in our Julia package, we also provide the possibility to use NLopt, and I observed that LBFGS often is faster.

Replied on Fri, 11/12/2021 - 11:56

jpritikin Joined: May 23, 2012

more info

> is there a way to specify my model in a different way to get analytic gradients?

Nope. Analytic gradients are only implemented for multivariate normal models with no latent variables, IFA, and a few other cases. We hope to add gradients for RAM in the future.

> does this refer to SLSQP?

Yes, SLSQP is the BFGS optimizer from NLOPT with some fixes. I've tried to submit these fixes back to the NLOPT project, but upstream seems dormant.

> Is there a way to also use the other optimizers from NLopt?

Not without hacking the C++ code. I tried the LBGFS code from NLopt and it didn't work well for many of the models in our test suite. However, I could imagine that it would work well for some models.

Replied on Fri, 11/12/2021 - 12:18

MaximilianStefan Joined: Dec 04, 2018

Ah, thats interesting - is

Ah, thats interesting - is somebody at the moment working on the analytic gradients for RAM it or is it just a ToDo for the future? Because I recently added them to our package, and I am still thinking about how to optimize the computation. If someone of your team has already put some thought into it, that would be very interesting.

Replied on Fri, 11/12/2021 - 12:58

jpritikin Joined: May 23, 2012

analytic gradients for RAM

It's a ToDo item. Nobody is working on it, but our team published [Efficient Hessian computation using sparse matrix derivatives in RAM notation](https://pure.mpg.de/rest/items/item_2096834/component/file_2096833/cont…)

Replied on Fri, 11/12/2021 - 13:19

MaximilianStefan Joined: Dec 04, 2018

Thanks!

Okay, that answers all of my questions - thanks again! I looked at this paper and implemented it (only for the gradients), but there are many considerations involved - for example, I spend half a day trying to figure out how to multiple a sparse vector with only ones with a dense Matrix in the most efficient way... At the end, I now use a completely different way that is very julia specific. However, I am still interestet in also implementing it in a more general way, so if somebody eventually is going to do it, I would be happy to hear from him/her.

Replied on Mon, 11/15/2021 - 07:38

tbates Joined: Jul 31, 2009

sparse matrices.

yeah "lavaan is near instant" is high up the list of priorities users share.
re sparse matrices, the "Matrix" package supports those, and also seems to transparently support upgrading matrices to sparse as needed. Your rapid implementation sounds intriguing!

i = c(1,3:8); j <- c(2,9,6:10); x <- 7 * (1:7) A = sparseMatrix(i, j, x = x)) # 8 x 10 "dgCMatrix" B= matrix(rnorm(80), 10,8) A %*% B

8 x 8 Matrix of class "dgeMatrix"
           [,1]        [,2]       [,3]      [,4]       [,5]       [,6]       [,7]        [,8]
[1,]  -2.709123  -0.5456542 -3.5437548  5.251409  -8.308948   6.293863  -6.399327    4.967028
[2,]   0.000000   0.0000000  0.0000000  0.000000   0.000000   0.000000   0.000000    0.000000
[3,]  -5.266561 -14.5482652 -0.5334075  4.415528   6.581861  26.175564   4.131042  -12.213783
[4,]  23.121014  22.4814728  0.2270105 17.234296 -11.527732   1.274841  11.982547  -40.122509
[5,] -14.316227   5.7110630 25.3229669 18.008147  -4.401374 -38.642979 -51.525559   14.052480
[6,] -11.444286  88.8746804  3.3778392 23.710228  42.430779  -1.565650 -42.546348   45.501898
[7,] -15.799684 -43.6447957 -1.6002225 13.246583  19.745582  78.526693  12.393126  -36.641350
[8,] -26.397878  37.7386355 13.8257015 73.119657  25.321075  -1.106326 -29.633466 -102.010822


A
8 x 10 sparse Matrix of class "dgCMatrix"
                             
[1,] . 7 . . .  .  .  .  .  .
[2,] . . . . .  .  .  .  .  .
[3,] . . . . .  .  .  . 14  .
[4,] . . . . . 21  .  .  .  .
[5,] . . . . .  . 28  .  .  .
[6,] . . . . .  .  . 35  .  .
[7,] . . . . .  .  .  . 42  .
[8,] . . . . .  .  .  .  . 49

Replied on Mon, 11/15/2021 - 11:33

MaximilianStefan Joined: Dec 04, 2018

re: Sparse Matrices

We also have that in julia (SparseArrays.jl), but it seemed to my like the tasks here are even more special. So for example, when taking the derivative of the A Matrix in RAM notation w.r.t. a parameter, you not only end up with a sparse matrix, but a very sparse square matrix with only ones. If you then multiply this with a dense Matrix, I think most implementations of sparse matrices will just convert the sparse matrix to a dense one and perform the multiplication. But maybe I am overthinking it...^^

Replied on Fri, 06/21/2024 - 10:47

lf-araujo Joined: Nov 25, 2020

This difference is due to

This difference is due to starting values, speed matches if you also auto-start values in OpenMx:
> fit <- mxAutoStart(fit) > fit <- mxRun(fit) Running untitled1 with 210 parameters > fit@output$iterations [1] 10 > fit@output$evaluations [1] 3001 > fit@output$backendTime Time difference of 3.155973 secs

Replied on Fri, 06/21/2024 - 10:48

lf-araujo Joined: Nov 25, 2020

I meant, improves... not

I meant, improves... not matches.

Replied on Fri, 06/21/2024 - 17:36

tbates Joined: Jul 31, 2009

should add autoStart to umxRAM

Hmm: That's a good user tip. I should also add autostart to umxRAM, at least. And maybe the twin models.

umx_time("start") fit = mxAutoStart(model) umx_time("stop") # auto start costs 1 second fit = mxRun(fit) umx_time(fit) # runtime drops to 1.3s (from 13s without autostart) fit = mxRun(model) # 13s

News

Recent Posts

Performance regression of OpenMx for simple CFA model

A bug probably

correction

performance

Thanks for the helpful

more info

Ah, thats interesting - is

analytic gradients for RAM

Thanks!

sparse matrices.

re: Sparse Matrices

This difference is due to

I meant, improves... not

should add autoStart to umxRAM