Hey,

I am currently implementing a new model specification language for longitudinal panel models. As "backend", I use OpenMx. Everything works but it is painfully slow.

The model specification language uses mean and kernel functions to describe a model on a Gaussian process. The mean function gets some variables as input (for example time) and returns the model implied mean for that input (for example time point). The covariance function gets two inputs (for example time point 1, and time point 2) and returns the model implied covariance for that input (for example covariance between time point1 and time point 2). Thus, model specification in this language consists of formulating parametric mean and covariance functions. The latent growth curve model (with no correlation between intercept and slope) looks like this in this language:

$$m(t;[\mu_{Int},\mu_{Slope}])=\mu_{Int}+\mu_{Slope} \quad k(s,t;[\sigma_{Int}^2,\sigma_{Slope}^2,\sigma_\epsilon^2])=\sigma_{Int}^2+st\sigma_{Slope}^2+\delta(s-t)\sigma_\epsilon^2$$

I use the same parameter names here as in (https://openmx.ssri.psu.edu/docs/openmx/latest/TimeSeries_Path.html). $\delta(x)$ is 1 if $x=0$ and $0$ otherwise.

The model can be defined like that

gppModel('muInt+muSlope*t','varInt+varSlope*t*t!+omxApproxEquals(t,t!,0.0001)*error',myData).

To transform this into an OpenMx model, I use mxExpectationNormal and construct the appropriate covariance matrix and mean vector. That is, every entry of mean vector is an algebra, which is really only 'muI+muS*t' where t is replaced with the appropriate input, which is in the data set (myData). I use the same idea to create the appropriate algebra for every entry of the covariance matrix.

This all works fine. I get the same results using this approach as using the regular path specification approach for all models. However, it is painfully slow (like a couple of thousands times slower than the regular approach). It is also much slower than a matlab implementation of this language I did, which uses a completely different software as backend. I started profiling the code and it seems that, not surprisingly, a lot of time is spent evaluating the algebras. **Any ideas how to make this faster?**

I am not surprised that it is running slow, but it would help diagnosis if you could attach a complete code example that demonstrates the performance issue.

Interesting. Could you give an example of what one of these MxAlgebras looks like, syntax-wise (which might be a dumb question, but I want to make sure I understand what your program does), and how many distinct such algebras need to be computed? Also, about how large are the needed covariance matrices?

Thanks for the swift replies! I was not expecting this. You can run example code doing that

As a reminder, to find the folder where R saves the demos you can do

The issue is that line 19 in 'example2LGCM.R'

is much faster than line 25:

Line 19 fits the LGCM specified via RAM (path specification) matrices and line 25 fits the LGCM specified via the mean and kernel functions (see line 24):

The algebras look like this for the mean vector

and like this for the covariance matrix

There is one for each entry of the mean vector and the covariance matrix. In this case, the covariance matrix is 20x20. With 20x20 the fitting is still a matter of a few seconds. However, as you increase the number of time points and thus the size of the covariance matrix the fitting becomes annoyingly slow.

Why not say

instead of doing it elementwise with

etc? The fewer algebra ops you use, the better the performance. OpenMx doesn't optimize these expression trees. You have to do it yourself.

Your comment goes in the direction I thought about: Making the algebras less redundant.

Probably, I don't comprehend how algebras work in OpenMx well enough to understand your comment. I thought I would need the index because while

for mus2

So what's your question?

Ah, now I understand your comment. Yes, for this example it is a great idea to vectorize! However, I would really like the user to be able to supply the model in this univariate fashion. Thus, to take full advantage of the vectorization idea I would need to parse all univariate expressions to their vectorized equivalent, right? Seems like a lot of work but I might be wrong.

Could an alternative approach be to partially evaluate the algebras once (evaluate muI, muS) and then only repeat the step that is necessary (t)?

It sounds like we need some performance improvements. Suppose you vectorize. Does it help enough? It is actually fast when written the way I suggested? It may not help. Can you provide a larger model (maybe 50-100 observations per row) so it will be easier to find hotspots in performance measurement? I assume you routinely use models this large?

Probably, it is not crucial for now. However, I think this model language is especially suitable for experience sampling data with the number of measurements > 100.

Just change the nP and nT parameters in the example2LGCM.R example. nP is number of persons (rows) and nT number of time points (columns).

I could not test that because I got an error:

I did this to my model:

If if now do

I get the error messeage

The formula in evaluateExpression that causes the error is

where 11 is the starting value for muS and c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,

11, 12, 13, 14, 15, 16, 17, 18, 19) is t. Apparently, the elementwise multiplication in R really assumes that both matrices are of the same size, which would require Robert's workaround?

I pushed v2.8.3-30-g07b72268b to fix this.

I increase nP to 20.

After that, the bulk of the difference is frontend time,

vs

I'm not surprised that algebra processing in the frontend is slow. Furthermore, I don't know how to speed it up. But I suspect we don't actually need to do any algebra processing in the frontend. It's mostly there to provide good error messages. So I suggest we reimplement the sanity checking in the backend. But that's a tedious and thankless project.

One possible approach is to incrementally make the unsafe=TRUE option to mxRun skip more of the checking. That might be the best short-term solution. Julian is wrapping the OpenMx API anyway so there isn't really a concern about models being invalid.

I took an initial stab at this approach, v2.8.3-37-g9c643b866. When enabled, it reduced frontend time by more than 1 second compared to what I posted above. That's not as big a savings as I was hoping for, but it's a start.

Thanks a lot for your this suggestion! I have been silent for while because I have issues switching to the source version of OpenMx. Once I resolved those I will rejoin this discussion.

Both measures (vectorization and unsafe) seem to decrease the time. However, as you write, the effect is not as big as one would hope. It's still <.01 seconds for the regular specification and >4 seconds for the example with nP=20=nT.

I would like to go bring back the idea of partial algebra evaluation. As far as I understand, we spend a lot of time evaluating Algebras. If we take the mean function as an example for simplicity. In principle, we would only need to evaluate 'muI+muS' once and then only replace the corresponding t values for every entry. The same is true for the covariance matrix. So, I would think that a lot of time could be saved using this kind of partial algebra evaluation strategy, which is quite popular in functional programming languages. However, for using it, I probably would need to implement it in OpenMx, right?

For now, I might just accept that it is slow and see if it becomes a big problem. I thought that there might be an easy fix but that does not seem to be the case. Quite amazing how much time you guys, especially Joshua, put into helping me. I really appreciate that!

> I would like to go bring back the idea of partial algebra evaluation.

First you have to benchmark. If you optimize without benchmarking then you don't know whether you changes improved things or not and you may target the wrong code for optimization.

You could use the Krönecker product operator,

`%x%`

, instead of the elementwise multiplication operator.Sorry, I don't follow what you're saying here. But I have a suggestion. Define

`muI`

and`muS`

like this:. Then, simply define

`expMean`

as, which is a 1x3 matrix. You can probably do something similar to define the expected covariance matrix as a single MxAlgebra, without a separate MxAlgebra for each element.

So, this is essentially the same suggestion then Joshua made but with additionally making muI and muS matrices, right? If Joshua's suggestion works, that seems unnecessary, doesn't it?

Initially, I thought that Joshua's suggestion

wouldn'twork, because I didn't realize that you could use the * operator to multiply matrices by scalars in MxAlgebras now. In OpenMx 1.x, you certainly couldn't do that, but I found out from Joshua that it has been possible in OpenMx 2.x for a long time.Yeah, ignore this. I was confused.

Ignore #6 was what I meant.