Performance issues with many algebras

jkarch
Performance issues with many algebras

Hey,

I am currently implementing a new model specification language for longitudinal panel models, with OpenMx as the backend. Everything works, but it is painfully slow.

The model specification language uses mean and kernel functions to describe a model as a Gaussian process. The mean function takes some variables as input (for example, time) and returns the model-implied mean for that input (for example, a time point). The covariance function takes two inputs (for example, time point 1 and time point 2) and returns the model-implied covariance for those inputs (for example, the covariance between time point 1 and time point 2). Thus, model specification in this language consists of formulating parametric mean and covariance functions. The latent growth curve model (with no correlation between intercept and slope) looks like this in this language:

$$m(t;[\mu_{Int},\mu_{Slope}])=\mu_{Int}+\mu_{Slope}t \qquad k(s,t;[\sigma_{Int}^2,\sigma_{Slope}^2,\sigma_\epsilon^2])=\sigma_{Int}^2+st\,\sigma_{Slope}^2+\delta(s-t)\,\sigma_\epsilon^2$$
I use the same parameter names here as in (https://openmx.ssri.psu.edu/docs/openmx/latest/TimeSeries_Path.html). $\delta(x)$ is 1 if $x=0$ and $0$ otherwise.

In this language, the model can be defined like this:

gppModel('muInt+muSlope*t','varInt+varSlope*t*t!+omxApproxEquals(t,t!,0.0001)*error',myData)

To transform this into an OpenMx model, I use mxExpectationNormal and construct the appropriate covariance matrix and mean vector. That is, every entry of the mean vector is an algebra, which is really only 'muI+muS*t' where t is replaced with the appropriate input from the data set (myData). I use the same idea to create the appropriate algebra for every entry of the covariance matrix.
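A minimal sketch of how such per-entry algebras can be generated programmatically (the loop and helper names are illustrative, not the actual gppmr code):

library(OpenMx)

# Build one mxAlgebra per mean entry by splicing the column index into the
# formula string; 'muI', 'muS', and 't' follow the naming used in this thread.
nT <- 3
meanAlgebras <- lapply(seq_len(nT), function(j) {
  mxAlgebraFromString(sprintf("muI + muS * t[1, %d]", j),
                      name = paste0("mus", j))
})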

This all works fine. I get the same results using this approach as with the regular path specification approach for all models. However, it is painfully slow (a couple of thousand times slower than the regular approach). It is also much slower than a MATLAB implementation of this language that I wrote, which uses completely different software as its backend. I started profiling the code, and it seems that, not surprisingly, a lot of time is spent evaluating the algebras. Any ideas how to make this faster?

jpritikin
more complete code

I am not surprised that it is running slow, but it would help diagnosis if you could attach a complete code example that demonstrates the performance issue.

AdminRobK
Interesting. Could you give

Interesting. Could you give an example of what one of these MxAlgebras looks like, syntax-wise (which might be a dumb question, but I want to make sure I understand what your program does), and how many distinct such algebras need to be computed? Also, about how large are the needed covariance matrices?

jkarch
Thanks for the swift replies!

Thanks for the swift replies! I was not expecting this. You can run example code demonstrating the issue like this:

devtools::install_github("karchjd/gppmr@v0.1.1")
demo('example2LGCM',package='gppmr')

As a reminder, to find the folder where R saves the demos, you can run

system.file("demo", package ="gppmr")

The issue is that line 19 in 'example2LGCM.R'

19:  semModel <- mxRun(semModel,silent = TRUE)

is much faster than line 25:

 25: gpModelFit <- gppFit(gpModel)

Line 19 fits the LGCM specified via RAM matrices (path specification), and line 25 fits the LGCM specified via the mean and kernel functions (see line 24):

24:  gpModel <- gppModel('muI+muS*t','varI+covIS*(t+t!)+varS*t*t!+omxApproxEquals(t,t!,0.0001)*sigma',myData)

The algebras look like this for the mean vector

 gpModel$omx$mus1
mxAlgebra 'mus1' 
$formula:  muI + muS * t[1, 1] 
$result: (not yet computed) <0 x 0 matrix>
dimnames: NULL

and like this for the covariance matrix

mxAlgebra 'cov1,1' 
$formula:  varI + covIS * (t[1, 1] + t[1, 1]) + varS * t[1, 1] * t[1, 1] + omxApproxEquals(t[1, 1], t[1, 1], 1e-04) * sigma 
$result: (not yet computed) <0 x 0 matrix>
dimnames: NULL

There is one such algebra for each entry of the mean vector and of the covariance matrix. In this case, the covariance matrix is 20x20, and fitting is still a matter of a few seconds. However, as you increase the number of time points, and thus the size of the covariance matrix, fitting becomes annoyingly slow.

jpritikin
combine algebras?

Why not say

mu=muI+muS * t

instead of doing it elementwise with

mu1=muI+muS * t[1,1]

etc? The fewer algebra ops you use, the better the performance. OpenMx doesn't optimize these expression trees. You have to do it yourself.
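For instance, a sketch of the vectorized mean (names follow the thread; note that whether a 1x1 matrix times a 1xN matrix works with * depends on the OpenMx version, as comes up below):

library(OpenMx)

# One algebra for the whole mean vector instead of one algebra per element.
# 't' is assumed to be a fixed 1 x nT matrix holding the time points.
tMat <- mxMatrix(type="Full", nrow=1, ncol=3, values=c(0, 1, 2), free=FALSE, name="t")
muI  <- mxMatrix(type="Full", nrow=1, ncol=1, free=TRUE, values=10, labels="mui", name="muI")
muS  <- mxMatrix(type="Full", nrow=1, ncol=1, free=TRUE, values=11, labels="mus", name="muS")
expMean <- mxAlgebra(muI + muS * t, name="expMean")  # 1 x nT result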

jkarch
Seems like a good idea

Your comment goes in the direction I was already thinking about: making the algebras less redundant.

Probably I don't understand how algebras work in OpenMx well enough to follow your comment. I thought I would need the index because the formula for mus1 is

gpModel$omx$mus1
mxAlgebra 'mus1' 
$formula:  muI + muS * t[1, 1] 
$result: (not yet computed) <0 x 0 matrix>
dimnames: NULL

while for mus2 it is

gpModel$omx$mus2
mxAlgebra 'mus2' 
$formula:  muI + muS * t[1, 2] 
$result: (not yet computed) <0 x 0 matrix>
dimnames: NULL

jpritikin
huh

So what's your question?

jkarch
Now I get it

Ah, now I understand your comment. Yes, for this example, vectorizing is a great idea! However, I would really like the user to be able to supply the model in this univariate fashion. Thus, to take full advantage of the vectorization idea, I would need to parse all univariate expressions into their vectorized equivalents, right? That seems like a lot of work, but I might be wrong.

Could an alternative approach be to partially evaluate the algebras once (evaluating muI and muS) and then repeat only the step that is necessary (substituting t)?

jpritikin
does vectorizing help?

It sounds like we need some performance improvements. Suppose you vectorize. Does it help enough? Is it actually fast when written the way I suggested? It may not help. Can you provide a larger model (maybe 50-100 observations per row) so it will be easier to find hotspots in performance measurement? I assume you routinely use models this large?

jkarch
Good comments.
  1. Is it really necessary to make it faster?

Probably, it is not crucial for now. However, I think this model language is especially suitable for experience-sampling data, where the number of measurements is often > 100.

  2. Can you provide a model with more time points?

Just change the nP and nT parameters in the example2LGCM.R example. nP is the number of persons (rows) and nT the number of time points (columns); see the snippet at the end of this post.

  3. Is it actually faster?

I could not test that because I got an error:

I did this to my model:

gpModel2$omx$expMean <- NULL
gpModel2$omx <- mxModel(gpModel2$omx,mxAlgebra(muI + muS * t, name="expMean"))

If I now do

mxRun(gpModel2$omx)

I get the error message

Error: The following error occurred while evaluating the subexpression 'GPPM.muS * GPPM.t' during the evaluation of 'expMean' in model 'GPPM' : non-conformable arrays

The formula in evaluateExpression that causes the error is

as.matrix(11) * as.matrix(c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 
11, 12, 13, 14, 15, 16, 17, 18, 19))

where 11 is the starting value for muS and c(0, 1, ..., 19) is t. Apparently, elementwise matrix multiplication in R really does require both matrices to be the same size, which would require Robert's workaround?
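(For item 2 above, concretely, the change is just the two assignments, presumably near the top of example2LGCM.R; the values here are only an example:)

nP <- 20    # number of persons (rows)
nT <- 100   # number of time points (columns)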

jpritikin
various
> Error: The following error occurred while evaluating the subexpression 'GPPM.muS * GPPM.t' during …

I pushed v2.8.3-30-g07b72268b to fix this.

> Just change the nP and nT parameters in the example2LGCM.R example. nP is number of persons (rows) and nT number of time points (columns).

I increased nP to 20.

After that, the bulk of the difference is frontend time,

frontend time: 3.222098 secs 
backend time: 0.4592214 secs 

vs

frontend time: 0.02226472 secs 
backend time: 0.08440924 secs 

I'm not surprised that algebra processing in the frontend is slow. Furthermore, I don't know how to speed it up. But I suspect we don't actually need to do any algebra processing in the frontend. It's mostly there to provide good error messages. So I suggest we reimplement the sanity checking in the backend. But that's a tedious and thankless project.

jpritikin
unsafe

One possible approach is to incrementally make the unsafe=TRUE option to mxRun skip more of the checking. That might be the best short-term solution. Julian is wrapping the OpenMx API anyway so there isn't really a concern about models being invalid.
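For reference, this is how the option is passed (a sketch; 'model' stands for any MxModel):

fit <- mxRun(model, unsafe=TRUE)  # skips some of the frontend sanity checks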

jpritikin
some progress

I took an initial stab at this approach, v2.8.3-37-g9c643b866. When enabled, it reduced frontend time by more than 1 second compared to what I posted above. That's not as big a savings as I was hoping for, but it's a start.

jkarch
Thanks a lot!

Thanks a lot for this suggestion! I have been silent for a while because I am having issues switching to the source version of OpenMx. Once I have resolved those, I will rejoin this discussion.

jkarch
I agree some progress

Both measures (vectorization and unsafe) seem to decrease the time. However, as you write, the effect is not as big as one would hope. It is still <0.01 seconds for the regular specification and >4 seconds for the example with nP = nT = 20.

I would like to bring back the idea of partial algebra evaluation. As far as I understand, we spend a lot of time evaluating algebras. Take the mean function as an example, for simplicity: in principle, we would only need to evaluate 'muI+muS' once and then only replace the corresponding t values for every entry. The same is true for the covariance matrix. So I would think that a lot of time could be saved using this kind of partial-evaluation strategy, which is quite popular in functional programming languages. However, to use it, I would probably need to implement it in OpenMx, right?
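In plain R, the partial-evaluation idea looks like this (illustrative only; this is not how OpenMx evaluates algebras):

# A closure fixes the parameter part once; each call then does only the
# t-dependent work.
makeMean <- function(muI, muS) {
  function(t) muI + muS * t
}
m <- makeMean(10, 11)  # partially evaluate with current parameter values
m(0:19)                # evaluate for the time points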

For now, I might just accept that it is slow and see whether it becomes a big problem. I thought there might be an easy fix, but that does not seem to be the case. It is quite amazing how much time you guys, especially Joshua, have put into helping me. I really appreciate it!

jpritikin
benchmark

> I would like to bring back the idea of partial algebra evaluation.

First you have to benchmark. If you optimize without benchmarking, you don't know whether your changes improved things, and you may target the wrong code for optimization.
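For example, with base R's system.time (a minimal sketch using the two fitting calls from earlier in this thread):

system.time(semModelFit <- mxRun(semModel, silent=TRUE))  # RAM specification
system.time(gpModelFit  <- gppFit(gpModel))               # mean/kernel specification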

AdminRobK
Kronecker

> Apparently, elementwise matrix multiplication in R really does require both matrices to be the same size, which would require Robert's workaround?

You could use the Kronecker product operator, %x%, instead of the elementwise multiplication operator.
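The difference is easy to see in plain R (a sketch of the failing subexpression quoted above):

a <- as.matrix(11)           # 1x1, like GPPM.muS
b <- matrix(0:19, nrow=1)    # 1x20, like GPPM.t
# a * b   # error: non-conformable arrays
a %x% b   # 1x20: the Kronecker product broadcasts the scalar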

AdminRobK
suggestion

Sorry, I don't follow what you're saying here. But I have a suggestion. Define muI and muS like this:

#ncol=3 because there are 3 timepoints in this demo:
mxMatrix(type="Full", nrow=1, ncol=3, free=TRUE, values=10, labels="mui", name="muI")
mxMatrix(type="Full", nrow=1, ncol=3, free=TRUE, values=10, labels="mus", name="muS")

Then simply define expMean as

mxAlgebra(muI + muS * t, name="expMean")

which is a 1x3 matrix. You can probably do something similar to define the expected covariance matrix as a single MxAlgebra, without a separate MxAlgebra for each element.
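One way such a single covariance algebra might look, as a sketch only: assume tRow is a fixed 1 x nT MxMatrix of time points, unitRow a 1 x nT unit matrix, idNT an nT x nT identity matrix, and varI, covIS, varS, sigma 1x1 free MxMatrices (all of these names are made up here):

expCov <- mxAlgebra(
  varI  %x% (t(unitRow) %*% unitRow) +                     # varI in every cell
  covIS %x% (t(tRow) %*% unitRow + t(unitRow) %*% tRow) +  # covIS * (s + t)
  varS  %x% (t(tRow) %*% tRow) +                           # varS * s * t
  sigma %x% idNT,                                          # error variance on the diagonal
  name = "expCov")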

jkarch
Clarify

So this is essentially the same suggestion as Joshua's, but with muI and muS additionally made into matrices, right? If Joshua's suggestion works, that seems unnecessary, doesn't it?

AdminRobK
Initially, I thought that

Initially, I thought that Joshua's suggestion wouldn't work, because I didn't realize that you could use the * operator to multiply matrices by scalars in MxAlgebras now. In OpenMx 1.x, you certainly couldn't do that, but I found out from Joshua that it has been possible in OpenMx 2.x for a long time.

jkarch
Ignore

Yeah, ignore this. I was confused.

jkarch
Ignore

What I meant was: ignore #6.