Performance issues with many algebras

jkarch
Performance issues with many algebras

Hey,

I am currently implementing a new model specification language for longitudinal panel models, with OpenMx as the backend. Everything works, but it is painfully slow.

The model specification language uses mean and kernel functions to describe a model as a Gaussian process. The mean function takes some variables as input (for example, time) and returns the model-implied mean for that input (for example, a time point). The covariance function takes two inputs (for example, time point 1 and time point 2) and returns the model-implied covariance for those inputs (for example, the covariance between time point 1 and time point 2). Thus, model specification in this language consists of formulating parametric mean and covariance functions. The latent growth curve model (with no correlation between intercept and slope) looks like this in this language:

$$m(t;[\mu_{Int},\mu_{Slope}])=\mu_{Int}+\mu_{Slope}t \qquad k(s,t;[\sigma_{Int}^2,\sigma_{Slope}^2,\sigma_\epsilon^2])=\sigma_{Int}^2+st\,\sigma_{Slope}^2+\delta(s-t)\,\sigma_\epsilon^2$$
I use the same parameter names here as in (https://openmx.ssri.psu.edu/docs/openmx/latest/TimeSeries_Path.html). $\delta(x)$ is 1 if $x=0$ and $0$ otherwise.

In this language, the model can be defined like this:

gppModel('muInt+muSlope*t','varInt+varSlope*t*t!+omxApproxEquals(t,t!,0.0001)*error',myData)

To transform this into an OpenMx model, I use mxExpectationNormal and construct the appropriate covariance matrix and mean vector. That is, every entry of the mean vector is an algebra, which is really only 'muI+muS*t' where t is replaced with the appropriate input from the data set (myData). I use the same idea to create the appropriate algebra for every entry of the covariance matrix.
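A minimal sketch of how such per-entry algebras can be generated programmatically (the loop and helper names are illustrative, not the actual gppmr code):

library(OpenMx)

# Build one mxAlgebra per mean entry by splicing the column index into the
# formula string; 'muI', 'muS', and 't' follow the naming used in this thread.
nT <- 3
meanAlgebras <- lapply(seq_len(nT), function(j) {
  mxAlgebraFromString(sprintf("muI + muS * t[1, %d]", j),
                      name = paste0("mus", j))
})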

This all works fine. I get the same results using this approach as with the regular path specification approach for all models. However, it is painfully slow (a couple of thousand times slower than the regular approach). It is also much slower than a MATLAB implementation of this language that I wrote, which uses completely different software as its backend. I started profiling the code, and it seems that, not surprisingly, a lot of time is spent evaluating the algebras. Any ideas how to make this faster?

jpritikin
more complete code

I am not surprised that it is running slow, but it would help diagnosis if you could attach a complete code example that demonstrates the performance issue.

AdminRobK
Interesting. Could you give

Interesting. Could you give an example of what one of these MxAlgebras looks like, syntax-wise (which might be a dumb question, but I want to make sure I understand what your program does), and how many distinct such algebras need to be computed? Also, about how large are the needed covariance matrices?

jkarch
Thanks for the swift replies!

Thanks for the swift replies! I was not expecting this. You can run example code demonstrating the issue like this:

devtools::install_github("karchjd/gppmr@v0.1.1")
demo('example2LGCM',package='gppmr')

As a reminder, to find the folder where R saves the demos, you can run

system.file("demo", package ="gppmr")

The issue is that line 19 in 'example2LGCM.R'

19:  semModel <- mxRun(semModel,silent = TRUE)

is much faster than line 25:

 25: gpModelFit <- gppFit(gpModel)

Line 19 fits the LGCM specified via RAM matrices (path specification), and line 25 fits the LGCM specified via the mean and kernel functions (see line 24):

24:  gpModel <- gppModel('muI+muS*t','varI+covIS*(t+t!)+varS*t*t!+omxApproxEquals(t,t!,0.0001)*sigma',myData)

The algebras look like this for the mean vector

 gpModel$omx$mus1
mxAlgebra 'mus1' 
$formula:  muI + muS * t[1, 1] 
$result: (not yet computed) <0 x 0 matrix>
dimnames: NULL

and like this for the covariance matrix

mxAlgebra 'cov1,1' 
$formula:  varI + covIS * (t[1, 1] + t[1, 1]) + varS * t[1, 1] * t[1, 1] + omxApproxEquals(t[1, 1], t[1, 1], 1e-04) * sigma 
$result: (not yet computed) <0 x 0 matrix>
dimnames: NULL

There is one such algebra for each entry of the mean vector and of the covariance matrix. In this case, the covariance matrix is 20x20, and fitting is still a matter of a few seconds. However, as you increase the number of time points, and thus the size of the covariance matrix, fitting becomes annoyingly slow.

jpritikin
combine algebras?

Why not say

mu=muI+muS * t

instead of doing it elementwise with

mu1=muI+muS * t[1,1]

etc? The fewer algebra ops you use, the better the performance. OpenMx doesn't optimize these expression trees. You have to do it yourself.
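For instance, a sketch of the vectorized mean (names follow the thread; note that whether a 1x1 matrix times a 1xN matrix works with * depends on the OpenMx version, as comes up below):

library(OpenMx)

# One algebra for the whole mean vector instead of one algebra per element.
# 't' is assumed to be a fixed 1 x nT matrix holding the time points.
tMat <- mxMatrix(type="Full", nrow=1, ncol=3, values=c(0, 1, 2), free=FALSE, name="t")
muI  <- mxMatrix(type="Full", nrow=1, ncol=1, free=TRUE, values=10, labels="mui", name="muI")
muS  <- mxMatrix(type="Full", nrow=1, ncol=1, free=TRUE, values=11, labels="mus", name="muS")
expMean <- mxAlgebra(muI + muS * t, name="expMean")  # 1 x nT result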

jkarch
Seems like a good idea

Your comment goes in the direction I was already thinking about: making the algebras less redundant.

Probably I don't understand how algebras work in OpenMx well enough to follow your comment. I thought I would need the index because the formula for mus1 is

gpModel$omx$mus1
mxAlgebra 'mus1' 
$formula:  muI + muS * t[1, 1] 
$result: (not yet computed) <0 x 0 matrix>
dimnames: NULL

while for mus2 it is

gpModel$omx$mus2
mxAlgebra 'mus2' 
$formula:  muI + muS * t[1, 2] 
$result: (not yet computed) <0 x 0 matrix>
dimnames: NULL

jpritikin
huh

So what's your question?

jkarch
Now I get it

Ah, now I understand your comment. Yes, for this example, vectorizing is a great idea! However, I would really like the user to be able to supply the model in this univariate fashion. Thus, to take full advantage of the vectorization idea, I would need to parse all univariate expressions into their vectorized equivalents, right? That seems like a lot of work, but I might be wrong.

Could an alternative approach be to partially evaluate the algebras once (evaluating muI and muS) and then repeat only the step that is necessary (substituting t)?

jpritikin
does vectorizing help?

It sounds like we need some performance improvements. Suppose you vectorize. Does it help enough? Is it actually fast when written the way I suggested? It may not help. Can you provide a larger model (maybe 50-100 observations per row) so it will be easier to find hotspots in performance measurement? I assume you routinely use models this large?

jkarch
Good comments.
  1. Is it really necessary to make it faster?

Probably, it is not crucial for now. However, I think this model language is especially suitable for experience-sampling data, where the number of measurements is often > 100.

  2. Can you provide a model with more time points?

Just change the nP and nT parameters in the example2LGCM.R example. nP is the number of persons (rows) and nT the number of time points (columns); see the snippet at the end of this post.

  3. Is it actually faster?

I could not test that because I got an error:

I did this to my model:

gpModel2$omx$expMean <- NULL
gpModel2$omx <- mxModel(gpModel2$omx,mxAlgebra(muI + muS * t, name="expMean"))

If I now do

mxRun(gpModel2$omx)

I get the error message

Error: The following error occurred while evaluating the subexpression 'GPPM.muS * GPPM.t' during the evaluation of 'expMean' in model 'GPPM' : non-conformable arrays

The formula in evaluateExpression that causes the error is

as.matrix(11) * as.matrix(c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 
11, 12, 13, 14, 15, 16, 17, 18, 19))

where 11 is the starting value for muS and c(0, 1, ..., 19) is t. Apparently, elementwise matrix multiplication in R really does require both matrices to be the same size, which would require Robert's workaround?
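(For item 2 above, concretely, the change is just the two assignments, presumably near the top of example2LGCM.R; the values here are only an example:)

nP <- 20    # number of persons (rows)
nT <- 100   # number of time points (columns)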

jpritikin
various
> Error: The following error occurred while evaluating the subexpression 'GPPM.muS * GPPM.t' during …

I pushed v2.8.3-30-g07b72268b to fix this.

> Just change the nP and nT parameters in the example2LGCM.R example. nP is number of persons (rows) and nT number of time points (columns).

I increased nP to 20.

After that, the bulk of the difference is frontend time,

frontend time: 3.222098 secs 
backend time: 0.4592214 secs 

vs

frontend time: 0.02226472 secs 
backend time: 0.08440924 secs 

I'm not surprised that algebra processing in the frontend is slow. Furthermore, I don't know how to speed it up. But I suspect we don't actually need to do any algebra processing in the frontend. It's mostly there to provide good error messages. So I suggest we reimplement the sanity checking in the backend. But that's a tedious and thankless project.

jpritikin
unsafe

One possible approach is to incrementally make the unsafe=TRUE option to mxRun skip more of the checking. That might be the best short-term solution. Julian is wrapping the OpenMx API anyway so there isn't really a concern about models being invalid.
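For reference, this is how the option is passed (a sketch; 'model' stands for any MxModel):

fit <- mxRun(model, unsafe=TRUE)  # skips some of the frontend sanity checks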

jpritikin
some progress

I took an initial stab at this approach, v2.8.3-37-g9c643b866. When enabled, it reduced frontend time by more than 1 second compared to what I posted above. That's not as big a savings as I was hoping for, but it's a start.

jkarch
Thanks a lot!

Thanks a lot for this suggestion! I have been silent for a while because I am having issues switching to the source version of OpenMx. Once I have resolved those, I will rejoin this discussion.

jkarch
I agree some progress

Both measures (vectorization and unsafe) seem to decrease the time. However, as you write, the effect is not as big as one would hope. It is still <0.01 seconds for the regular specification and >4 seconds for the example with nP = nT = 20.

I would like to bring back the idea of partial algebra evaluation. As far as I understand, we spend a lot of time evaluating algebras. Take the mean function as an example, for simplicity: in principle, we would only need to evaluate 'muI+muS' once and then only replace the corresponding t values for every entry. The same is true for the covariance matrix. So I would think that a lot of time could be saved using this kind of partial-evaluation strategy, which is quite popular in functional programming languages. However, to use it, I would probably need to implement it in OpenMx, right?
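In plain R, the partial-evaluation idea looks like this (illustrative only; this is not how OpenMx evaluates algebras):

# A closure fixes the parameter part once; each call then does only the
# t-dependent work.
makeMean <- function(muI, muS) {
  function(t) muI + muS * t
}
m <- makeMean(10, 11)  # partially evaluate with current parameter values
m(0:19)                # evaluate for the time points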

For now, I might just accept that it is slow and see whether it becomes a big problem. I thought there might be an easy fix, but that does not seem to be the case. It is quite amazing how much time you guys, especially Joshua, have put into helping me. I really appreciate it!

jpritikin
benchmark

> I would like to bring back the idea of partial algebra evaluation.

First you have to benchmark. If you optimize without benchmarking, you don't know whether your changes improved things, and you may target the wrong code for optimization.
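For example, with base R's system.time (a minimal sketch using the two fitting calls from earlier in this thread):

system.time(semModelFit <- mxRun(semModel, silent=TRUE))  # RAM specification
system.time(gpModelFit  <- gppFit(gpModel))               # mean/kernel specification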

AdminRobK
Kronecker

> Apparently, elementwise matrix multiplication in R really does require both matrices to be the same size, which would require Robert's workaround?

You could use the Kronecker product operator, %x%, instead of the elementwise multiplication operator.
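The difference is easy to see in plain R (a sketch of the failing subexpression quoted above):

a <- as.matrix(11)           # 1x1, like GPPM.muS
b <- matrix(0:19, nrow=1)    # 1x20, like GPPM.t
# a * b   # error: non-conformable arrays
a %x% b   # 1x20: the Kronecker product broadcasts the scalar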

AdminRobK
suggestion

Sorry, I don't follow what you're saying here. But I have a suggestion. Define muI and muS like this:

#ncol=3 because there are 3 timepoints in this demo:
mxMatrix(type="Full", nrow=1, ncol=3, free=TRUE, values=10, labels="mui", name="muI")
mxMatrix(type="Full", nrow=1, ncol=3, free=TRUE, values=10, labels="mus", name="muS")

Then simply define expMean as

mxAlgebra(muI + muS * t, name="expMean")

which is a 1x3 matrix. You can probably do something similar to define the expected covariance matrix as a single MxAlgebra, without a separate MxAlgebra for each element.
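One way such a single covariance algebra might look, as a sketch only: assume tRow is a fixed 1 x nT MxMatrix of time points, unitRow a 1 x nT unit matrix, idNT an nT x nT identity matrix, and varI, covIS, varS, sigma 1x1 free MxMatrices (all of these names are made up here):

expCov <- mxAlgebra(
  varI  %x% (t(unitRow) %*% unitRow) +                     # varI in every cell
  covIS %x% (t(tRow) %*% unitRow + t(unitRow) %*% tRow) +  # covIS * (s + t)
  varS  %x% (t(tRow) %*% tRow) +                           # varS * s * t
  sigma %x% idNT,                                          # error variance on the diagonal
  name = "expCov")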

jkarch
Clarify

So this is essentially the same suggestion as Joshua's, but with muI and muS additionally made into matrices, right? If Joshua's suggestion works, that seems unnecessary, doesn't it?

AdminRobK
Initially, I thought that

Initially, I thought that Joshua's suggestion wouldn't work, because I didn't realize that you could use the * operator to multiply matrices by scalars in MxAlgebras now. In OpenMx 1.x, you certainly couldn't do that, but I found out from Joshua that it has been possible in OpenMx 2.x for a long time.

jkarch
Ignore

Yeah, ignore this. I was confused.

jkarch
Ignore

What I meant was: ignore #6.