Attachments: three files (58.95 KB, 301 bytes, 1.88 KB).
I specified a simple CFA model (5 factors with 20 items each), simulated data from it (25 observations per parameter), and fitted the model in both lavaan and OpenMx. Surprisingly, OpenMx takes about 100x longer to fit the model (with the same starting values in lavaan and OpenMx). I attached an MWE.
The results on my machine are:
lavaan: 0.16 seconds/31 iterations
OpenMx: 13.2 seconds/135 iterations
So 5x more iterations and 100x more time needed. I tried different starting values, but the overall picture does not change. OpenMx also tells me there were 29917 evaluations (fit@output$evaluations), which seems way too high. I assume I did something wrong in specifying the OpenMx model - maybe it is using numeric gradients instead of analytic ones?
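Since the attached scripts are not reproduced in the thread, here is a rough sketch of the kind of comparison described above. Everything below is a reconstruction, not the actual MWE: the loading value of 0.7, the sample size, the start values, and the object names are all assumptions.

```r
library(lavaan)
library(OpenMx)

n_fact  <- 5
n_ind   <- 20
latents <- paste0("f", 1:n_fact)
items   <- paste0("x", 1:(n_fact * n_ind))
ind_of  <- function(f) items[((f - 1) * n_ind + 1):(f * n_ind)]

## population model used only to simulate data (all loadings assumed to be 0.7)
pop_syntax <- paste(sapply(1:n_fact, function(f)
  paste0(latents[f], " =~ ", paste0("0.7*", ind_of(f), collapse = " + "))),
  collapse = "\n")
dat <- simulateData(pop_syntax, sample.nobs = 5000)

## lavaan: latent variances fixed to 1, mean structure for parity with raw-data FIML
lav_syntax <- paste(sapply(1:n_fact, function(f)
  paste0(latents[f], " =~ ", paste(ind_of(f), collapse = " + "))),
  collapse = "\n")
t_lav <- system.time(fit_lav <- cfa(lav_syntax, data = dat,
                                    std.lv = TRUE, meanstructure = TRUE))

## OpenMx: the same CFA in RAM notation
loading_paths <- lapply(1:n_fact, function(f)
  mxPath(from = latents[f], to = ind_of(f), values = 0.7, free = TRUE))
mx_model <- do.call(mxModel, c(
  list("cfa", type = "RAM",
       manifestVars = items, latentVars = latents,
       mxData(dat, type = "raw"),
       mxPath(from = items, arrows = 2, values = 0.5, free = TRUE),   # residual variances
       mxPath(from = latents, arrows = 2, values = 1, free = FALSE),  # latent variances fixed
       mxPath(from = latents, arrows = 2, connect = "unique.bivariate",
              values = 0.3, free = TRUE),                             # latent covariances
       mxPath(from = "one", to = items, values = 0, free = TRUE)),    # manifest means
  loading_paths))
t_mx <- system.time(fit_mx <- mxRun(mx_model))

c(lavaan = t_lav["elapsed"], OpenMx = t_mx["elapsed"])
```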
Hi
Thank you for the post and the minimal working example. At a quick glance, it seems that the attempts to switch off the standard error calculations are not being heeded, since Yes and No result in the same timings. So we are working on it.
Thanks again!
On closer inspection, that turned out not to be so.
OpenMx doesn't have analytic gradients for RAM models; that isn't implemented yet, and it looks like OpenMx spends a lot of time estimating the gradient numerically. You can get somewhat better performance with multiple threads by using the default SLSQP instead of NPSOL, i.e. by removing this line:
mxOption(NULL, "Default optimizer", "NPSOL")
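In case it is useful, here is a minimal sketch of that change. It assumes the RAM model object is called mx_model (as in the sketch further up); the thread-count line is an extra illustration, not part of the advice above.

```r
library(OpenMx)
mxOption(NULL, "Default optimizer", "SLSQP")  # or simply delete the NPSOL line
mxOption(NULL, "Number of Threads", 4)        # let the numeric gradient use several threads
fit <- mxRun(mx_model)                        # mx_model: the RAM model from the MWE
```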
Thanks for the helpful comments - is there a way to specify my model in a different way to get analytic gradients? Or is it just out of scope at the moment?
Another question, related to SLSQP: the OpenMx 2.0 Psychometrika paper says "the open-source NLopt family of optimizers is now selectable" - does this refer to SLSQP? Is there a way to also use the other optimizers from NLopt? In our Julia package, we also provide the option to use NLopt, and I have observed that LBFGS is often faster.
> is there a way to specify my model in a different way to get analytic gradients?
Nope. Analytic gradients are only implemented for multivariate normal models with no latent variables, IFA, and a few other cases. We hope to add gradients for RAM in the future.
> does this refer to SLSQP?
Yes, SLSQP is the BFGS optimizer from NLopt with some fixes. I've tried to submit these fixes back to the NLopt project, but upstream seems dormant.
> Is there a way to also use the other optimizers from NLopt?
Not without hacking the C++ code. I tried the LBFGS code from NLopt and it didn't work well for many of the models in our test suite. However, I could imagine that it would work well for some models.
Ah, that's interesting - is somebody currently working on the analytic gradients for RAM, or is it just a ToDo for the future? I ask because I recently added them to our package, and I am still thinking about how to optimize the computation. If someone on your team has already put some thought into it, that would be very interesting.
It's a ToDo item. Nobody is working on it, but our team published "Efficient Hessian computation using sparse matrix derivatives in RAM notation".
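For readers who don't have the paper at hand, the quantities involved are the standard RAM expressions; this is a generic sketch, not the paper's own notation, and the Hessian part of the paper goes further. With $B = (I - A)^{-1}$, the model-implied covariance is

$$
\Sigma(\theta) = F\, B\, S\, B^{\top} F^{\top},
$$

and its derivative with respect to a single parameter $\theta_j$ is

$$
\frac{\partial \Sigma}{\partial \theta_j}
= F \left(
  B \,\frac{\partial A}{\partial \theta_j}\, B\, S\, B^{\top}
  + B \,\frac{\partial S}{\partial \theta_j}\, B^{\top}
  + B\, S\, B^{\top} \left(\frac{\partial A}{\partial \theta_j}\right)^{\!\top} B^{\top}
\right) F^{\top}.
$$

Each $\partial A/\partial \theta_j$ and $\partial S/\partial \theta_j$ contains only zeros and ones (a single entry for a loading or regression, one entry or a symmetric pair for a variance or covariance), which is the sparsity that makes this kind of bookkeeping pay off.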
Okay, that answers all of my questions - thanks again! I looked at this paper and implemented it (only for the gradients), but there are many considerations involved - for example, I spent half a day trying to figure out how to multiply a sparse vector containing only ones with a dense matrix in the most efficient way... In the end, I now use a completely different approach that is very Julia-specific. However, I am still interested in also implementing it in a more general way, so if somebody eventually does it, I would be happy to hear from them.
yeah "lavaan is near instant" is high up the list of priorities users share.
Re sparse matrices: the "Matrix" package supports those, and also seems to transparently upgrade matrices to sparse as needed. Your rapid implementation sounds intriguing!
We also have that in Julia (SparseArrays.jl), but it seemed to me like the task here is even more specialized. For example, when taking the derivative of the A matrix in RAM notation w.r.t. a parameter, you not only end up with a sparse matrix, but with a very sparse square matrix containing only ones. If you then multiply this with a dense matrix, I think most sparse matrix implementations will just convert the sparse matrix to a dense one and perform the multiplication. But maybe I am overthinking it... ^^
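To make that concrete, here is a tiny R illustration using the "Matrix" package mentioned above (the matrix size and index positions are made up). It shows the structure being discussed: when the derivative of A has a single nonzero entry, its product with a dense matrix just copies one row of the dense matrix into one row of the result, so in principle no general-purpose multiplication is needed.

```r
library(Matrix)

m  <- 6
M  <- matrix(rnorm(m * m), m, m)           # stand-in for a dense factor such as B %*% S

## dA/dtheta: a single 1 at position (2, 5), everything else zero
dA <- sparseMatrix(i = 2, j = 5, x = 1, dims = c(m, m))

generic  <- as.matrix(dA %*% M)            # generic sparse-times-dense product

shortcut <- matrix(0, m, m)                # exploit the structure directly:
shortcut[2, ] <- M[5, ]                    # row 5 of M ends up as row 2 of the result

all.equal(generic, shortcut)               # TRUE
```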