Another excess memory usage problem: sudden spike when using CSOLNP

Posted on
CharlesD Joined: Apr 30, 2013

CSOLNP is working quite nicely in general, but in a few circumstances things go dramatically wrong: excess memory usage crashes my machine (hard reboot needed) if I don't notice fast enough. This doesn't occur when using NPSOL. With the earlier memory usage issues (http://openmx.psyc.virginia.edu/thread/2551), memory usage increased gradually; in this case the increase is much more sudden.

A problem model can be downloaded from:
https://www.dropbox.com/s/kgpsualdlnaowhs/memprobmodel.RData?dl=0

test <- mxRun(memprobmodel, intervals=T)

Edit: I don't know specifically what causes the issue, but I'm making extensive use of algebras, exponential functions, and definition variables.
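For illustration, here is a minimal sketch (hypothetical names and values, not the actual model) of the kind of construction this involves: an mxAlgebra applying a matrix exponential to a drift matrix scaled by a definition variable read from the data.

library(OpenMx)

# Continuous-time drift matrix with free parameters (hypothetical values)
drift <- mxMatrix("Full", 2, 2, free = TRUE,
                  values = c(-0.5, 0.1, 0.1, -0.5),
                  labels = c("a11", "a21", "a12", "a22"), name = "DRIFT")

# Definition variable: a time interval read row-by-row from the data
dt <- mxMatrix("Full", 1, 1, free = FALSE,
               labels = "data.interval", name = "DT")

# Algebra combining the two: matrix exponential of the interval-scaled drift
discreteDRIFT <- mxAlgebra(omxExponential(DRIFT %x% DT), name = "discreteDRIFT")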

Edit 2: the problem *does* still exist even with the latest updates (2014-08-26). So far I only experience it when calculating confidence intervals. With the above model, after a few minutes of fitting with memory usage at a couple of hundred MB, usage suddenly starts climbing very rapidly. The problem occurs on more than one PC.

Replied on Tue, 08/26/2014 - 12:27
neale Joined: Jul 31, 2009

In reply to by CharlesD

I could not reproduce this fault with OpenMx from SVN r3766, run on a Mac Pro. There was no sign of excessive RAM usage (the machine has 64 GB but reported 53 GB free throughout). For the record, here's the output I got with CSOLNP:


> summary(memprobRun)
Summary of ctsem

free parameters:
name matrix row col Estimate Std.Error lbound ubound
1 drift11 discreteDRIFT 1 1 0.9892736976 8.071140e-05 1
2 drift21 discreteDRIFT 2 1 0.0550340030 NA
3 drift12 discreteDRIFT 1 2 -0.0532963880 NA
4 drift22 discreteDRIFT 2 2 -0.0029642474 4.789469e-06 1
5 diffusion11 discreteDIFFUSION 1 1 0.1334714775 8.688719e-03 0
6 diffusion21 discreteDIFFUSION 2 1 0.0063801653 1.020747e-02
7 diffusion22 discreteDIFFUSION 2 2 0.2673616861 2.224372e-02 0
8 cint1 discreteCINT 1 1 0.4267176984 1.185061e-02
9 cint2 discreteCINT 1 2 3.4857018784 2.580032e-02
10 T1var11 withinphi 1 1 3.8303294374 1.511289e+00 0
11 T1var21 withinphi 2 1 -0.1442955883 5.484717e-01
12 T1var22 withinphi 4 1 0.5387750154 NA 0
13 T1meanV1 T1MEANS 1 1 17.6330892849 2.204301e-01
14 T1meanV2 T1MEANS 2 1 4.5930297322 7.128458e-02
15 traitvar11 discreteTRAITVAR 1 1 0.0005614714 1.897635e-03 0
16 traitvar21 discreteTRAITVAR 2 1 0.0147575449 3.549271e-03
17 traitvar22 discreteTRAITVAR 2 2 0.0083684409 1.051363e-02 0
18 T1traitcov11 T1TRAITCOV 1 1 0.0622806450 1.247996e-01
19 T1traitcov21 T1TRAITCOV 2 1 -0.4751025866 4.229783e-01
20 T1traitcov12 T1TRAITCOV 1 2 -0.1015906162 NA
21 T1traitcov22 T1TRAITCOV 2 2 -0.8484358084 NA

confidence intervals:
lbound estimate ubound note
ctsem.DRIFT[1,1] 0.02070937 0.0289038 0.02084729 !!!
ctsem.DRIFT[2,1] 0.66084469 0.7923929 0.70102875 !!!
ctsem.DRIFT[1,2] -0.81542200 -0.7673743 -0.64010226
ctsem.DRIFT[2,2] -14.30917421 -14.2575784 -1.96650516

observed statistics: 1200
estimated parameters: 21
degrees of freedom: 1179
-2 log likelihood: 1912.234
number of observations: 100
Information Criteria:
| df Penalty | Parameters Penalty | Sample-Size Adjusted
AIC: -445.7662 1954.234 NA
BIC: -3517.2619 2008.942 1942.619
Some of your fit indices are missing.
To get them, fit saturated and independence models, and include them with
summary(yourModel, SaturatedLikelihood=..., IndependenceLikelihood=...).
timestamp: 2014-08-26 13:29:15
wall clock time: 147.219 secs
OpenMx version number: 2.0.0.3766
Need help? See help(mxSummary)

And with NPSOL, which finds a lower minimum (an unusual instance of NPSOL outperforming CSOLNP):


> memprobRun <- mxRun(memprobmodel2, intervals=T)
Running ctsem
> summary(memprobRun)
Summary of ctsem

free parameters:
name matrix row col Estimate Std.Error lbound ubound
1 drift11 discreteDRIFT 1 1 0.48053361 0.058810931 1
2 drift21 discreteDRIFT 2 1 0.07088314 0.067632864
3 drift12 discreteDRIFT 1 2 0.02174985 0.040749825
4 drift22 discreteDRIFT 2 2 0.58355130 0.060506367 1
5 diffusion11 discreteDIFFUSION 1 1 0.10607987 0.007501988 0
6 diffusion21 discreteDIFFUSION 2 1 0.01775367 0.007384163
7 diffusion22 discreteDIFFUSION 2 2 0.20119085 0.014252114 0
8 cint1 discreteCINT 1 1 9.14676984 1.056070080
9 cint2 discreteCINT 1 2 0.60300918 1.216352717
10 T1var11 withinphi 1 1 2.84693038 0.402707390 0
11 T1var21 withinphi 2 1 0.10749578 0.077824140
12 T1var22 withinphi 4 1 0.20862826 0.029504750 0
13 T1meanV1 T1MEANS 1 1 17.70547447 0.168728456
14 T1meanV2 T1MEANS 2 1 4.50301285 0.045675842
15 traitvar11 discreteTRAITVAR 1 1 0.72703940 0.196624735 0
16 traitvar21 discreteTRAITVAR 2 1 -0.08178721 0.096569456
17 traitvar22 discreteTRAITVAR 2 2 0.00000000 0.021245529 0*
18 T1traitcov11 T1TRAITCOV 1 1 1.96383353 0.435687274
19 T1traitcov21 T1TRAITCOV 2 1 -0.31360423 0.347418356
20 T1traitcov12 T1TRAITCOV 1 2 0.07033241 0.057403361
21 T1traitcov22 T1TRAITCOV 2 2 -0.01570925 0.017056963

confidence intervals:
lbound estimate ubound note
ctsem.DRIFT[1,1] -1.02186200 -0.73579299 -0.5211748
ctsem.DRIFT[2,1] 0.05265773 0.13389238 0.3880235
ctsem.DRIFT[1,2] -0.10635909 0.04108366 0.1966068
ctsem.DRIFT[2,2] -0.76725750 -0.54120110 -0.3561074

observed statistics: 1200
estimated parameters: 21
degrees of freedom: 1179
-2 log likelihood: 1673.847
number of observations: 100
Information Criteria:
| df Penalty | Parameters Penalty | Sample-Size Adjusted
AIC: -684.1532 1715.847 NA
BIC: -3755.6488 1770.555 1704.232
Some of your fit indices are missing.
To get them, fit saturated and independence models, and include them with
summary(yourModel, SaturatedLikelihood=..., IndependenceLikelihood=...).
timestamp: 2014-08-26 12:25:25
wall clock time: 361.053 secs
OpenMx version number: 2.0.0.3766
Need help? See help(mxSummary)

Re-running CSOLNP improves the solution somewhat, but it gets stuck again at a -2 log likelihood of 1890.516, and a third run yielded no further improvement. Started from the parameter estimates that NPSOL found, however, CSOLNP was quite happy to stay there and returned standard errors without NAs:


> params <- omxGetParameters(memprobRunNPSOL)
> params
drift11 drift21 drift12 drift22 diffusion11 diffusion21 diffusion22
0.48053361 0.07088314 0.02174985 0.58355130 0.10607987 0.01775367 0.20119085
cint1 cint2 T1var11 T1var21 T1var22 T1meanV1 T1meanV2
9.14676984 0.60300918 2.84693038 0.10749578 0.20862826 17.70547447 4.50301285
traitvar11 traitvar21 traitvar22 T1traitcov11 T1traitcov21 T1traitcov12 T1traitcov22
0.72703940 -0.08178721 0.00000000 1.96383353 -0.31360423 0.07033241 -0.01570925
> npsolution <- omxSetParameters(memprobmodel2,labels=names(params),values=params)
> mxOption(NULL, "Default optimizer", "CSOLNP")
> memprobRunCSOLNP <- mxRun(npsolution,intervals=T)

> summary(memprobRunCSOLNP)
Summary of ctsem

free parameters:
name matrix row col Estimate Std.Error lbound ubound
1 drift11 discreteDRIFT 1 1 4.804554e-01 0.058832219 1
2 drift21 discreteDRIFT 2 1 7.086569e-02 0.067959088
3 drift12 discreteDRIFT 1 2 2.174293e-02 0.040773525
4 drift22 discreteDRIFT 2 2 5.835050e-01 0.060510100 1
5 diffusion11 discreteDIFFUSION 1 1 1.060794e-01 0.007501939 0
6 diffusion21 discreteDIFFUSION 2 1 1.775283e-02 0.007383953
7 diffusion22 discreteDIFFUSION 2 2 2.011832e-01 0.014251074 0
8 cint1 discreteCINT 1 1 9.148188e+00 1.056562609
9 cint2 discreteCINT 1 2 6.035270e-01 1.222148807
10 T1var11 withinphi 1 1 2.847043e+00 0.402804371 0
11 T1var21 withinphi 2 1 1.075018e-01 0.077831659
12 T1var22 withinphi 4 1 2.086286e-01 0.029505046 0
13 T1meanV1 T1MEANS 1 1 1.770547e+01 0.168731787
14 T1meanV2 T1MEANS 2 1 4.503013e+00 0.045675890
15 traitvar11 discreteTRAITVAR 1 1 7.272923e-01 0.196782445 0
16 traitvar21 discreteTRAITVAR 2 1 -8.178111e-02 0.097040229
17 traitvar22 discreteTRAITVAR 2 2 3.552471e-14 0.021339816 0*
18 T1traitcov11 T1TRAITCOV 1 1 1.964354e+00 0.436023851
19 T1traitcov21 T1TRAITCOV 2 1 -3.135785e-01 0.349096086
20 T1traitcov12 T1TRAITCOV 1 2 7.035303e-02 0.057427140
21 T1traitcov22 T1TRAITCOV 2 2 -1.570743e-02 0.017101853

confidence intervals:
lbound estimate ubound note
ctsem.DRIFT[1,1] -1.0155040 -0.73595494 -0.6076324
ctsem.DRIFT[2,1] 0.0526557 0.13387535 0.1379083
ctsem.DRIFT[1,2] -0.1025269 0.04107548 0.1966052
ctsem.DRIFT[2,2] -0.7661326 -0.54127946 -0.3580459

observed statistics: 1200
estimated parameters: 21
degrees of freedom: 1179
-2 log likelihood: 1673.847
number of observations: 100
Information Criteria:
| df Penalty | Parameters Penalty | Sample-Size Adjusted
AIC: -684.1532 1715.847 NA
BIC: -3755.6488 1770.555 1704.232
Some of your fit indices are missing.
To get them, fit saturated and independence models, and include them with
summary(yourModel, SaturatedLikelihood=..., IndependenceLikelihood=...).
timestamp: 2014-08-26 13:50:29
wall clock time: 109.1414 secs
OpenMx version number: 2.0.0.3766
Need help? See help(mxSummary)

Replied on Fri, 08/29/2014 - 13:06
mhunter Joined: Jul 31, 2009

In reply to by RobK

When using 32-bit R 3.1.0 on Windows with the OpenMx beta binary, I don't get any huge memory usage. With R and various background processes running, I'm using 2.24 GB of RAM. Running the example model with intervals=TRUE, usage hangs around 2.25 GB for a while and eventually (probably when doing the intervals) slowly climbs to 2.45 GB. After the model is done, everything goes back down to around 2.25 GB. That corresponds to between 27% and 30% of RAM in use. Nothing out of the ordinary to me; it sounds like I'm not replicating this problem.

Replied on Fri, 08/29/2014 - 14:42
CharlesD Joined: Apr 30, 2013

In reply to by mhunter

OK. I also don't get the issue with 32-bit R; memory usage remains very low. When I switch back to 64-bit, the process uses all the spare physical memory on my laptop (6 GB), and Windows 'commits' 16 GB of virtual memory to it. (I'm not clear on what that commitment actually means: is the memory actually in use, or just reserved in some way? This is according to the Windows 8 Resource Monitor.)

But now I'm embarrassed... in the example I posted, the confidence intervals are set on an algebra. When I correctly set them on the 'discreteDRIFT' matrix rather than the 'DRIFT' algebra (the confusion arose because I've been switching between different parameter sets to work out which optimizes best), things work fine. I'll be surprised, though I won't say it's impossible, if this was the problem in the other cases. I'm impressed that confidence intervals can be estimated on an algebra in the first place. Is that intended?
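For reference, a minimal sketch of the distinction (using the matrix and algebra names from the summaries above): mxCI() accepts either a free-parameter matrix or an algebra by name.

ciOnMatrix <- mxCI("discreteDRIFT")   # intervals on the matrix, as intended here
ciOnAlgebra <- mxCI("DRIFT")          # intervals on the algebra, also supported
memprobmodel <- mxModel(memprobmodel, ciOnMatrix)
test <- mxRun(memprobmodel, intervals = TRUE)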

Replied on Fri, 08/29/2014 - 14:47
neale Joined: Jul 31, 2009

In reply to by CharlesD

Yes, that is a fully intended feature which has been present in classic Mx since 1995 and was designed into OpenMx from its earliest days.

I do hope that the memory issues are solved. Running the problem with Valgrind did not reveal any memory leaks. We really appreciate your input - keep the comments coming!

Replied on Mon, 09/08/2014 - 10:54
RobK Joined: Apr 19, 2011

In reply to by CharlesD

On Friday I was running your memory-problematic model on a 64-bit Windows machine, under a debugger. When I compile without multithreading enabled, it doesn't hog memory, but it does hang indefinitely. I'm trying to figure out where it gets stuck.

EDIT: Actually, I can tell from checkpointing that it's not hanging; it's just running a lot more slowly in debug mode than I thought. I also managed to trigger the memory leak on my 32-bit machine by running Charles' model repeatedly with mxTryHard() (in a build from trunk).
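For anyone trying to reproduce this, a hedged sketch of that step (assuming the model object from the posted .RData file): mxTryHard() refits with randomly perturbed start values, so looping it exercises the optimizer repeatedly.

for (i in 1:10) {
  fit <- mxTryHard(memprobmodel, extraTries = 10)
  print(gc())  # report R-side memory between runs; watch the OS monitor too
}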

Replied on Wed, 08/27/2014 - 16:17
CharlesD Joined: Apr 30, 2013

In reply to by neale

Yes, I seem to encounter quite a lot of starting-value sensitivity with more complex continuous-time models... which makes me think a Bayesian approach might work better, but I'd love to hear any other suggestions or thoughts for dealing with the issue.

Replied on Wed, 08/27/2014 - 04:51
CharlesD Joined: Apr 30, 2013

OK, right: on my machine with more memory the above model also fits, but memory usage still spikes to 6 GB or so, which illustrates what seems (to me) to be the problem (or at least a potential improvement), since NPSOL fits with a steady 100 MB or so. Does memory usage not start climbing rapidly after a few minutes for you two? I'm actually surprised it fits on 32-bit Windows; I would have thought it would definitely hit memory problems. I've been trying to generate a more problematic example but can't at the moment; if I find one that spikes faster or higher, I'll post it.

> mxVersion()
OpenMx version: 2.0.0.0
R version: R version 3.1.1 (2014-07-10)
Platform: x86_64-w64-mingw32
Default optimiser: CSOLNP

This is with commit 9ce8fba on the master branch, on Windows 7 and Windows 8 PCs.

Replied on Wed, 08/27/2014 - 13:15
neale Joined: Jul 31, 2009

In reply to by CharlesD

Charles

I strongly suspect that this is a bug that has already been fixed, and that you are using an outdated version of the beta. Your version number looks odd; it does not include a build number on the end, like this: 2.0.0.3766.

When you say commit 9ce8fba I am confused (though others on the dev team may not be). Were you building from source? The SVN tree is currently at revision 3776.

Cheers
Mike

Replied on Wed, 08/27/2014 - 14:15
CharlesD Joined: Apr 30, 2013

In reply to by neale

I was also surprised by the version number... I have RStudio set up with a project linked via Git to the Gitorious OpenMx repository (which is where the commit reference comes from), and I build by telling RStudio to build (after specifying an additional 'install' argument to the make command). This has worked OK in the past for getting updates; I can see the recent source code, including a recent change to the default summary output in which the optimizer is reported.

Replied on Wed, 08/27/2014 - 14:24
neale Joined: Jul 31, 2009

In reply to by CharlesD

If you could build from the SVN repository version, per http://openmx.psyc.virginia.edu/wiki/howto-build-openmx-source-repository, then I think the problem will go away. And you'll get a sensible version number.

Cheers
Mike

Replied on Wed, 08/27/2014 - 16:12
CharlesD Joined: Apr 30, 2013

In reply to by neale

No change in behaviour; the model still climbs to 6 GB of memory...

OpenMx version: 2.0.0.3777
R version: R version 3.1.1 (2014-07-10)
Platform: x86_64-w64-mingw32
Default optimiser: CSOLNP

Does anybody know if/how I can impose a lower memory limit on 64-bit Windows R? memory.limit() doesn't want to let me decrease it. If I could do this, I assume I would at least avoid the hard reboots on Windows 7 (my Windows 8 machine behaves more gracefully here: instead of bogging down to the point that I can't kill the task, it just pops up a message box complaining about memory usage).
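For reference, a short illustration of the limitation described (64-bit Windows R, circa R 3.1): memory.limit() reports or raises the cap in MB, but refuses requests below the current value.

memory.limit()             # query the current limit, in MB
memory.limit(size = 2048)  # request a lower cap; R warns it cannot decrease the limit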

Replied on Wed, 08/27/2014 - 17:26
jpritikin Joined: May 23, 2012

In reply to by CharlesD

I'm not sure how to impose a memory limit in Windows, but you'll need to impose a limit on the application's memory as a whole. OpenMx does not use R's memory in many cases, so an R-level limit is not going to have much of an effect.
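To illustrate the point: R's own accounting on Windows (memory.size()) covers the R heap, not memory the OpenMx backend allocates directly, so the two can diverge during a run.

memory.size()            # MB currently used by the R heap
memory.size(max = TRUE)  # peak R-heap usage this session
# Compare these against the process total in Task Manager / Resource Monitor.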

Replied on Tue, 09/02/2014 - 14:51
mhunter Joined: Jul 31, 2009

In reply to by CharlesD

I'm getting the same behavior on a Windows 7 64-bit machine running 64-bit R 3.1 with the OpenMx binary. It looks like memory usage quickly and linearly climbs to 100% of RAM once the confidence intervals start. Interestingly, the same machine running the same OpenMx on 32-bit R shows no problem.

Replied on Wed, 09/03/2014 - 10:02
RobK Joined: Apr 19, 2011

In reply to by mhunter

I should have tried to reproduce the problem on the 64-bit Windows machine in my office last week, before I left for the long weekend... Anyhow, I just ran Charles' memprobmodel2 with intervals=T, and R's memory usage began to climb ceaselessly, as he described. So it appears to be something specific to confidence intervals, with CSOLNP, under 64-bit Windows.

FWIW:

> mxVersion()
OpenMx version: 2.0.0.3751
R version: R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32
Default optimiser: CSOLNP