Performance regression of OpenMx for simple CFA model

Posted on
MaximilianStefan Joined: 12/04/2018
I specified a simple CFA model (5 factors, 20 items each), simulated data from it (25 observations per parameter), and fitted the model in both lavaan and OpenMx. Surprisingly, OpenMx takes about 100x longer to fit the model (with the same starting values in lavaan and OpenMx). I attached an MWE.
The results on my machine are
lavaan: 0.16 seconds/31 iterations
OpenMx: 13.2 seconds/135 iterations

So 5x more iterations and 100x more time needed. I tried different starting values, but the overall picture does not change. OpenMx also tells me there were 29917 evaluations (fit@output$evaluations), which seems way too high. I assume I did something wrong in specifying the OpenMx model - maybe it is using numeric gradients instead of analytic ones?

Replied on Fri, 11/12/2021 - 10:54
AdminNeale Joined: 03/01/2013

Hi

Thank you for the post and the working minimal example. At a quick look it seems that the attempts to switch off the Standard Error calculations are not being heeded, since Yes and No result in the same timings. So we are working on it.
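For reference, the usual way to request this is along these lines (a sketch using the standard mxOption names; the exact calls in the attached MWE may differ):

library(OpenMx)
model <- mxOption(model, "Standard Errors", "No")    # skip standard-error computation
model <- mxOption(model, "Calculate Hessian", "No")  # skip the final Hessian
fit   <- mxRun(model)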

Thanks again!

Replied on Fri, 11/12/2021 - 13:47
AdminRobK Joined: 01/24/2014

In reply to by AdminNeale

> At a quick look it seems that the attempts to switch off the Standard Error calculations are not being heeded, since Yes and No result in the same timings

On closer inspection, that turned out not to be so.

Replied on Fri, 11/12/2021 - 11:05
jpritikin Joined: 05/23/2012

OpenMx doesn't have analytic gradients for RAM models. That's not implemented yet. It looks like OpenMx takes a lot of time estimating the gradient. You can get somewhat better performance with multiple threads by getting rid of `mxOption(NULL, "Default optimizer", "NPSOL")` and using the default SLSQP.
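For example (a sketch; `model` stands for the CFA model from the MWE, and both option names are standard mxOption settings):

library(OpenMx)
# Let OpenMx use its default SLSQP instead of NPSOL, and allow it to
# spread the numerical gradient computation over several threads:
mxOption(NULL, "Default optimizer", "SLSQP")
mxOption(NULL, "Number of Threads", omxDetectCores() - 1)
fit <- mxRun(model)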
Replied on Fri, 11/12/2021 - 11:35
MaximilianStefan Joined: 12/04/2018

Thanks for the helpful comments - is there a way to specify my model in a different way to get analytic gradients? Or is it just out of scope at the moment?

Another question, SLSQP-related: the OpenMx 2.0 Psychometrika paper says "the open-source NLopt family of optimizers is now selectable" - does this refer to SLSQP? Is there a way to also use the other optimizers from NLopt? In our Julia package we also provide the option to use NLopt, and I observed that LBFGS is often faster.

Replied on Fri, 11/12/2021 - 11:56
jpritikin Joined: 05/23/2012

> is there a way to specify my model in a different way to get analytic gradients?

Nope. Analytic gradients are only implemented for multivariate normal models with no latent variables, IFA, and a few other cases. We hope to add gradients for RAM in the future.

> does this refer to SLSQP?

Yes, SLSQP is the BFGS optimizer from NLopt with some fixes. I've tried to submit these fixes back to the NLopt project, but upstream seems dormant.

> Is there a way to also use the other optimizers from NLopt?

Not without hacking the C++ code. I tried the LBFGS code from NLopt and it didn't work well for many of the models in our test suite. However, I could imagine that it would work well for some models.

Replied on Fri, 11/12/2021 - 12:18
MaximilianStefan Joined: 12/04/2018

Ah, that's interesting - is somebody currently working on the analytic gradients for RAM, or is it just a ToDo for the future? I ask because I recently added them to our package and am still thinking about how to optimize the computation. If someone on your team has already put some thought into it, that would be very interesting.
Replied on Fri, 11/12/2021 - 12:58
jpritikin Joined: 05/23/2012

It's a ToDo item. Nobody is working on it, but our team published [Efficient Hessian computation using sparse matrix derivatives in RAM notation](https://pure.mpg.de/rest/items/item_2096834/component/file_2096833/content).
Replied on Fri, 11/12/2021 - 13:19
MaximilianStefan Joined: 12/04/2018

Okay, that answers all of my questions - thanks again! I looked at the paper and implemented it (only for the gradients), but there are many considerations involved - for example, I spent half a day trying to figure out how to multiply a sparse vector containing only ones with a dense matrix in the most efficient way... In the end, I settled on a completely different approach that is very Julia-specific. However, I am still interested in implementing it in a more general way, so if somebody eventually does it, I would be happy to hear from them.
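(To make that sub-problem concrete in R terms - a row vector with ones at positions p times a dense matrix M is just a row selection plus column sums, so no general sparse-dense product is needed; a tiny sketch:)

M <- matrix(rnorm(5 * 4), 5, 4)
p <- c(2, 5)                       # positions of the ones
v <- numeric(5); v[p] <- 1         # the "sparse vector with only ones"
all.equal(as.numeric(v %*% M), colSums(M[p, , drop = FALSE]))  # TRUE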
Replied on Mon, 11/15/2021 - 07:38
tbates Joined: 07/31/2009

yeah "lavaan is near instant" is high up the list of priorities users share.
re sparse matrices, the "Matrix" package supports those, and also seems to transparently support upgrading matrices to sparse as needed. Your rapid implementation sounds intriguing!


library(Matrix)
i <- c(1, 3:8); j <- c(2, 9, 6:10); x <- 7 * (1:7)
A <- sparseMatrix(i, j, x = x)  # 8 x 10 "dgCMatrix"
B <- matrix(rnorm(80), 10, 8)
A %*% B

8 x 8 Matrix of class "dgeMatrix"
           [,1]        [,2]       [,3]      [,4]       [,5]       [,6]       [,7]        [,8]
[1,]  -2.709123  -0.5456542 -3.5437548  5.251409  -8.308948   6.293863  -6.399327    4.967028
[2,]   0.000000   0.0000000  0.0000000  0.000000   0.000000   0.000000   0.000000    0.000000
[3,]  -5.266561 -14.5482652 -0.5334075  4.415528   6.581861  26.175564   4.131042  -12.213783
[4,]  23.121014  22.4814728  0.2270105 17.234296 -11.527732   1.274841  11.982547  -40.122509
[5,] -14.316227   5.7110630 25.3229669 18.008147  -4.401374 -38.642979 -51.525559   14.052480
[6,] -11.444286  88.8746804  3.3778392 23.710228  42.430779  -1.565650 -42.546348   45.501898
[7,] -15.799684 -43.6447957 -1.6002225 13.246583  19.745582  78.526693  12.393126  -36.641350
[8,] -26.397878  37.7386355 13.8257015 73.119657  25.321075  -1.106326 -29.633466 -102.010822


A
8 x 10 sparse Matrix of class "dgCMatrix"
                             
[1,] . 7 . . .  .  .  .  .  .
[2,] . . . . .  .  .  .  .  .
[3,] . . . . .  .  .  . 14  .
[4,] . . . . . 21  .  .  .  .
[5,] . . . . .  . 28  .  .  .
[6,] . . . . .  .  . 35  .  .
[7,] . . . . .  .  .  . 42  .
[8,] . . . . .  .  .  .  . 49

Replied on Mon, 11/15/2021 - 11:33
MaximilianStefan Joined: 12/04/2018

In reply to by tbates

We also have that in Julia (SparseArrays.jl), but it seemed to me like the tasks here are even more special. For example, when taking the derivative of the A matrix in RAM notation w.r.t. a parameter, you not only end up with a sparse matrix, but a very sparse square matrix containing only ones. If you then multiply this with a dense matrix, I think most implementations of sparse matrices will just convert the sparse matrix to a dense one and perform the multiplication. But maybe I am overthinking it...^^
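For illustration, a minimal R sketch of what I mean, using the Matrix package from above and the standard RAM-implied covariance Sigma = F (I - A)^{-1} S (I - A)^{-T} F^T (just a sketch of the formula with a made-up function name, not how OpenMx or our package implements it):

library(Matrix)

# dA/dtheta for a free entry A[i, j] is a single-entry 0/1 sparse matrix,
# so dSigma/dtheta only involves products with that very sparse matrix.
ram_dSigma_dA <- function(A, S, Fm, i, j) {
  E  <- solve(diag(nrow(A)) - A)                          # (I - A)^{-1}
  dA <- sparseMatrix(i = i, j = j, x = 1, dims = dim(A))  # dA/dtheta
  dE <- E %*% dA %*% E                                    # d(I - A)^{-1}/dtheta
  as.matrix(Fm %*% (dE %*% S %*% t(E) + E %*% S %*% t(dE)) %*% t(Fm))
}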
Replied on Fri, 06/21/2024 - 10:47
lf-araujo Joined: 11/25/2020

This difference is due to starting values; the speed matches if you also auto-generate starting values in OpenMx:

> fit <- mxAutoStart(fit)
> fit <- mxRun(fit)
Running untitled1 with 210 parameters
> fit@output$iterations
[1] 10
> fit@output$evaluations
[1] 3001
> fit@output$backendTime
Time difference of 3.155973 secs
Replied on Fri, 06/21/2024 - 17:36
tbates Joined: 07/31/2009

Hmm: That's a good user tip. I should also add autostart to umxRAM, at least. And maybe the twin models.


umx_time("start")
fit = mxAutoStart(model)
umx_time("stop") # auto start costs 1 second
fit = mxRun(fit)
umx_time(fit) # runtime drops to 1.3s (from 13s without autostart)
fit = mxRun(model) # 13s