Performance issues with many algebras

Hey,
I am currently implementing a new model specification language for longitudinal panel models. As "backend", I use OpenMx. Everything works but it is painfully slow.
The model specification language uses mean and kernel functions to describe a model as a Gaussian process. The mean function gets some variables as input (for example, time) and returns the model-implied mean for that input (for example, a time point). The covariance function gets two inputs (for example, time point 1 and time point 2) and returns the model-implied covariance for those inputs (for example, the covariance between time point 1 and time point 2). Thus, model specification in this language consists of formulating parametric mean and covariance functions. The latent growth curve model (with no correlation between intercept and slope) looks like this in this language:
$$m(t;[\mu_{Int},\mu_{Slope}])=\mu_{Int}+\mu_{Slope}t \quad k(s,t;[\sigma_{Int}^2,\sigma_{Slope}^2,\sigma_\epsilon^2])=\sigma_{Int}^2+st\,\sigma_{Slope}^2+\delta(s-t)\sigma_\epsilon^2$$
I use the same parameter names here as in http://openmx.psyc.virginia.edu/docs/openmx/latest/TimeSeries_Path.html. $\delta(x)$ is $1$ if $x=0$ and $0$ otherwise.
The model can be defined like this:
gppModel('muInt+muSlope*t','varInt+varSlope*t*t!+omxApproxEquals(t,t!,0.0001)*error',myData)
To transform this into an OpenMx model, I use mxExpectationNormal and construct the appropriate covariance matrix and mean vector. That is, every entry of the mean vector is an algebra, which is really only 'muI+muS*t' with t replaced by the appropriate input from the data set (myData). I use the same idea to create the appropriate algebra for every entry of the covariance matrix.
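In (simplified) code, the construction is along these lines (a minimal sketch with illustrative names; the actual gppmr code differs):
library(OpenMx)
times <- c(0, 1, 2)  # stand-in for the time points in myData
model <- mxModel("GPPM",
  mxMatrix(type="Full",nrow=1,ncol=1,free=TRUE,values=10,labels="muI",name="muI"),
  mxMatrix(type="Full",nrow=1,ncol=1,free=TRUE,values=1,labels="muS",name="muS"),
  mxMatrix(type="Full",nrow=1,ncol=length(times),values=times,name="t"))
# one MxAlgebra per mean entry, with t replaced by the appropriate input
for (i in seq_along(times)) {
  model <- mxModel(model, mxAlgebraFromString(
    sprintf("muI + muS * t[1,%d]", i), name=sprintf("mus%d", i)))
}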
This all works fine. I get the same results using this approach as using the regular path specification approach for all models. However, it is painfully slow (a couple of thousand times slower than the regular approach). It is also much slower than a Matlab implementation of this language I wrote, which uses a completely different software package as backend. I started profiling the code and it seems that, not surprisingly, a lot of time is spent evaluating the algebras. **Any ideas how to make this faster?**
more complete code
I am not surprised that it is running slow, but it would help diagnosis if you could attach a complete code example that demonstrates the performance issue.
Interesting. Could you give
Interesting. Could you give an example of what one of these MxAlgebras looks like, syntax-wise (which might be a dumb question, but I want to make sure I understand what your program does), and how many distinct such algebras need to be computed? Also, about how large are the needed covariance matrices?
Thanks for the swift replies!
Thanks for the swift replies! I was not expecting this. You can run example code demonstrating the issue like this:
devtools::install_github("karchjd/gppmr@v0.1.1")
demo('example2LGCM',package='gppmr')
As a reminder, to find the folder where R saves the demos you can do
system.file("demo", package ="gppmr")
The issue is that line 19 in 'example2LGCM.R'
19: semModel <- mxRun(semModel,silent = TRUE)
is much faster than line 25:
25: gpModelFit <- gppFit(gpModel)
Line 19 fits the LGCM specified via RAM (path specification) matrices and line 25 fits the LGCM specified via the mean and kernel functions (see line 24):
24: gpModel <- gppModel('muI+muS*t','varI+covIS*(t+t!)+varS*t*t!+omxApproxEquals(t,t!,0.0001)*sigma',myData)
The algebras look like this for the mean vector
gpModel$omx$mus1
mxAlgebra 'mus1'
$formula: muI + muS * t[1, 1]
$result: (not yet computed) <0 x 0 matrix>
dimnames: NULL
and like this for the covariance matrix
mxAlgebra 'cov1,1'
$formula: varI + covIS * (t[1, 1] + t[1, 1]) + varS * t[1, 1] * t[1, 1] + omxApproxEquals(t[1, 1], t[1, 1], 1e-04) * sigma
$result: (not yet computed) <0 x 0 matrix>
dimnames: NULL
There is one for each entry of the mean vector and the covariance matrix. In this case, the covariance matrix is 20x20, and with 20x20 the fitting is still a matter of a few seconds. However, as you increase the number of time points, and thus the size of the covariance matrix, the fitting becomes annoyingly slow.
combine algebras?
Why not say
mxAlgebra(muI + muS * t, name="expMean")
instead of doing it elementwise with
mxAlgebra(muI + muS * t[1,1], name="mus1")
etc.? The fewer algebra ops you use, the better the performance. OpenMx doesn't optimize these expression trees. You have to do it yourself.
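For concreteness, a self-contained sketch of the one-algebra version (illustrative names; this relies on scalar-by-matrix '*', which comes up again below):
library(OpenMx)
nT <- 20
vecModel <- mxModel("vec",
  mxMatrix(type="Full",nrow=1,ncol=1,free=TRUE,values=10,labels="mui",name="muI"),
  mxMatrix(type="Full",nrow=1,ncol=1,free=TRUE,values=1,labels="mus",name="muS"),
  mxMatrix(type="Full",nrow=1,ncol=nT,values=0:(nT-1),name="t"),
  mxAlgebra(muI + muS * t, name="expMean"))  # one algebra instead of nT of them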
Seems like a good idea
Your comment goes in the direction I thought about: Making the algebras less redundant.
Probably I don't comprehend how algebras work in OpenMx well enough to understand your comment. I thought I would need the index because, while mus1 is
gpModel$omx$mus1
mxAlgebra 'mus1'
$formula: muI + muS * t[1, 1]
$result: (not yet computed) <0 x 0 matrix>
dimnames: NULL
mus2 is
gpModel$omx$mus2
mxAlgebra 'mus2'
$formula: muI + muS * t[1, 2]
$result: (not yet computed) <0 x 0 matrix>
dimnames: NULL
In reply to Seems like a good idea by jkarch
huh
So what's your question?
In reply to huh by jpritikin
Now I get it
Ah, now I understand your comment. Yes, for this example it is a great idea to vectorize! However, I would really like the user to be able to supply the model in this univariate fashion. Thus, to take full advantage of the vectorization idea, I would need to parse all univariate expressions into their vectorized equivalents, right? That seems like a lot of work, but I might be wrong.
Could an alternative approach be to partially evaluate the algebras once (evaluate muI, muS) and then only repeat the step that is necessary (t)?
In reply to Now I get it by jkarch
does vectorizing help?
It sounds like we need some performance improvements. Suppose you vectorize: does it help enough? Is it actually fast when written the way I suggested? It may not help. Can you provide a larger model (maybe 50-100 observations per row) so it will be easier to find hotspots in performance measurement? I assume you routinely use models this large?
In reply to does vectorizing help? by jpritikin
Good comments.
1. Is it really necessary to make it faster?
Probably, it is not crucial for now. However, I think this model language is especially suitable for experience-sampling data, where the number of measurements is > 100.
2. Can you provide a model with more time points?
Just change the nP and nT parameters in the example2LGCM.R example. nP is number of persons (rows) and nT number of time points (columns).
3. Is it actually faster?
I could not test that because I got an error. I did this to my model:
gpModel2$omx$expMean <- NULL
gpModel2$omx <- mxModel(gpModel2$omx,mxAlgebra(muI + muS * t, name="expMean"))
If I now do
mxRun(gpModel2$omx)
I get the error message
Error: The following error occurred while evaluating the subexpression 'GPPM.muS * GPPM.t' during the evaluation of 'expMean' in model 'GPPM' : non-conformable arrays
The formula in evaluateExpression that causes the error is
as.matrix(11) * as.matrix(c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19))
where 11 is the starting value for muS and the vector is t. Apparently, elementwise multiplication in R really requires that both matrices are the same size, which would mean I need Robert's workaround?
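The failure is easy to reproduce in plain R (a quick check):
as.matrix(11) * as.matrix(0:19)  # Error: non-conformable arrays
11 * 0:19                        # fine: recycling works on plain vectors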
In reply to Good comments. by jkarch
various
1. Error: The following error occurred while evaluating the subexpression 'GPPM.muS * GPPM.t' during
I pushed v2.8.3-30-g07b72268b to fix this.
2. Just change the nP and nT parameters in the example2LGCM.R example. nP is number of persons (rows) and nT number of time points (columns).
I increased nP to 20.
After that, the bulk of the difference is frontend time:
frontend time: 3.222098 secs
backend time: 0.4592214 secs
vs
frontend time: 0.02226472 secs
backend time: 0.08440924 secs
I'm not surprised that algebra processing in the frontend is slow. Furthermore, I don't know how to speed it up. But I suspect we don't actually need to do any algebra processing in the frontend. It's mostly there to provide good error messages. So I suggest we reimplement the sanity checking in the backend. But that's a tedious and thankless project.
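For reference, these timings can be read off a fitted model. I believe the relevant output slots are frontendTime and backendTime (this may vary across OpenMx versions):
fit <- mxRun(gpModel$omx)
fit$output$frontendTime  # time spent in R, processing the model
fit$output$backendTime   # time spent in the C++ backend/optimizer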
In reply to various by jpritikin
unsafe
One possible approach is to incrementally make the unsafe=TRUE option to mxRun skip more of the checking. That might be the best short-term solution. Julian is wrapping the OpenMx API anyway so there isn't really a concern about models being invalid.
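Once that lands, usage would just be:
fit <- mxRun(gpModel$omx, unsafe=TRUE)  # skip more of the frontend sanity checking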
In reply to unsafe by jpritikin
some progress
I took an initial stab at this approach, v2.8.3-37-g9c643b866. When enabled, it reduced frontend time by more than 1 second compared to what I posted above. That's not as big a savings as I was hoping for, but it's a start.
In reply to some progress by jpritikin
Thanks a lot!
Thanks a lot for this suggestion! I have been silent for a while because I have had issues switching to the source version of OpenMx. Once I have resolved those, I will rejoin this discussion.
In reply to some progress by jpritikin
I agree some progress
Both measures (vectorization and unsafe) seem to decrease the time. However, as you write, the effect is not as big as one would hope. It's still <0.01 seconds for the regular specification and >4 seconds for the example with nP=nT=20.
I would like to bring back the idea of partial algebra evaluation. As far as I understand, we spend a lot of time evaluating algebras. Take the mean function as an example for simplicity: in principle, we would only need to evaluate 'muI+muS' once and then only replace the corresponding t value for every entry. The same is true for the covariance matrix. So I would think that a lot of time could be saved using this kind of partial-evaluation strategy, which is quite popular in functional programming languages. However, to use it, I would probably need to implement it in OpenMx, right?
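To illustrate what I mean in plain R terms (a conceptual sketch only, not OpenMx code):
# Fix the parameter values once...
partialMean <- function(muI, muS) function(t) muI + muS * t
m <- partialMean(10, 1)
# ...then only the t-dependent part needs recomputing per entry
m(0:19)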
For now, I might just accept that it is slow and see if it becomes a big problem. I thought that there might be an easy fix but that does not seem to be the case. Quite amazing how much time you guys, especially Joshua, put into helping me. I really appreciate that!
In reply to I agree some progress by jkarch
benchmark
> I would like to go bring back the idea of partial algebra evaluation.
First you have to benchmark. If you optimize without benchmarking, then you don't know whether your changes improved things or not, and you may target the wrong code for optimization.
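For example, a minimal comparison using the names from example2LGCM.R:
system.time(mxRun(semModel, silent=TRUE))  # RAM/path specification
system.time(gppFit(gpModel))               # mean/kernel specification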
In reply to Good comments. by jkarch
Kronecker
You could use the Kronecker product operator, %x%, instead of the elementwise multiplication operator.
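In plain R, this sidesteps the conformability error quoted above:
as.matrix(11) %x% as.matrix(0:19)  # 1x1 %x% 20x1 yields the scaled 20x1 matrix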
In reply to Seems like a good idea by jkarch
suggestion
Sorry, I don't follow what you're saying here. But I have a suggestion. Define muI and muS like this:
# ncol=3 because there are 3 timepoints in this demo:
mxMatrix(type="Full",nrow=1,ncol=3,free=T,values=10,labels="mui",name="muI")
mxMatrix(type="Full",nrow=1,ncol=3,free=T,values=10,labels="mus",name="muS")
Then, simply define expMean as
mxAlgebra(muI + muS * t, name="expMean")
which is a 1x3 matrix. You can probably do something similar to define the expected covariance matrix as a single MxAlgebra, without a separate MxAlgebra for each element.
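For instance, a hedged sketch of such a single-algebra covariance (all names illustrative; varI, covIS, varS, and sigma are assumed to be 1x1 free MxMatrices elsewhere in the model, and the time points are stored as a column matrix tim to avoid clashing with the transpose function t()):
mxMatrix(type="Full",nrow=3,ncol=1,values=c(0,1,2),name="tim")  # time points
mxMatrix(type="Unit",nrow=3,ncol=1,name="unit")                 # column of ones
mxMatrix(type="Iden",nrow=3,ncol=3,name="Iden")                 # identity
mxAlgebra(varI + covIS * (tim %*% t(unit) + unit %*% t(tim)) +
          varS * (tim %*% t(tim)) + sigma * Iden, name="expCov")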
In reply to suggestion by AdminRobK
Clarify
So this is essentially the same suggestion that Joshua made, but with additionally making muI and muS matrices, right? If Joshua's suggestion works, that seems unnecessary, doesn't it?
In reply to Clarify by jkarch
Initially, I thought that
Initially, I thought that Joshua's suggestion wouldn't work, because I didn't realize that you can now use the * operator to multiply matrices by scalars in MxAlgebras. In OpenMx 1.x you certainly couldn't do that, but I found out from Joshua that it has been possible in OpenMx 2.x for a long time.
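A quick way to verify this (minimal check, illustrative names):
m <- mxModel("demo",
  mxMatrix(type="Full",nrow=1,ncol=1,values=2,name="s"),
  mxMatrix(type="Full",nrow=1,ncol=3,values=c(1,2,3),name="v"),
  mxAlgebra(s * v, name="sv"))
mxEval(sv, m, compute=TRUE)  # 1x3 matrix: 2 4 6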
In reply to Seems like a good idea by jkarch
Ignore
Yeah, ignore this. I was confused.
In reply to Ignore by jkarch
Ignore
Ignore #6 was what I meant.