Program gets stuck

Posted on Tue, 06/20/2017 - 17:35

dtofighi Joined: Oct 12, 2009

File attachments

My simulation file (1.93 KB)

sessionInfo file (582 bytes)

Hello,

I am running an 'embarrassingly parallel' simulation study using your excellent OpenMx software. Each replication generates data and fits an unconstrained and a constrained mediation model, where the constraint is a nonlinear function of the model parameters. The problem is that the program (optimizer) appears to get stuck: when I check the CPU usage, it has dropped back to a minimum, as if the program had terminated. To replicate the problem, I am attaching my code (with a parallel random generator seed) as well as the sessionInfo() of the R environment on my computer. I would greatly appreciate your response.

Replied on Wed, 06/21/2017 - 08:39

jpritikin Joined: May 23, 2012

works for me

I attached the output. I'm not sure, but maybe there are some bugs in how R manages processes under Windows.

File attachments

results.csv

Replied on Wed, 06/21/2017 - 11:04

AdminRobK Joined: Jan 24, 2014

I tried your script

I tried your script on my 32-bit Windows system with 4 logical CPUs, and my 64-bit Linux/GNU system with 2 logical CPUs. Your script used approximately the expected CPU load on both machines, but I saw no signs of progress even after waiting a while. So, I interrupted the script in RStudio and killed the processes it had spawned. I tried again with rep reduced from 200 to 20. The script ran very quickly on both systems.

How long should I expect it to take to run 200 replications, anyhow?

Under Windows, I'm running:
OpenMx version: 2.7.11 [GIT v2.7.11-dirty]
R version: R version 3.4.0 (2017-04-21)
Platform: i386-w64-mingw32
Default optimiser: CSOLNP

Under Linux/GNU:
OpenMx version: 2.7.11.59 [GIT v2.7.11-59-g5d1b3b3-dirty]
R version: R version 3.4.0 (2017-04-21)
Platform: x86_64-pc-linux-gnu
Default optimiser: CSOLNP

If the issue is with OpenMx and not with doParallel, the only suggestions that come to my mind are to try using a different optimizer, or to provide each path coefficient with a nontrivial upper and lower bound with the lbound and ubound arguments to mxPath(). In fact, since your MxModel uses an MxConstraint, using SLSQP instead of CSOLNP is advisable in any event.
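To make the two suggestions concrete, here is a minimal sketch, not the poster's actual script. The simulated data, the variable names x, m, and y, the parameter labels, and the bound values of -10 and 10 are all assumptions chosen for illustration; only mxOption(), mxPath() with lbound/ubound, and the SLSQP optimizer come from the discussion above.

```r
library(OpenMx)

# Use SLSQP instead of the default CSOLNP for models fitted in this session
mxOption(NULL, "Default optimizer", "SLSQP")

# Hypothetical data for a simple mediation model x -> m -> y
set.seed(1)
n <- 200
x <- rnorm(n)
m <- 0.4 * x + rnorm(n)
y <- 0.3 * m + 0.1 * x + rnorm(n)
dat <- data.frame(x = x, m = m, y = y)

med <- mxModel(
  "mediation", type = "RAM",
  manifestVars = c("x", "m", "y"),
  # Path coefficients, each with a nontrivial lower and upper bound
  # (the values -10 and 10 are arbitrary illustrative choices)
  mxPath(from = "x", to = "m", arrows = 1, free = TRUE, values = 0.1,
         labels = "a", lbound = -10, ubound = 10),
  mxPath(from = "m", to = "y", arrows = 1, free = TRUE, values = 0.1,
         labels = "b", lbound = -10, ubound = 10),
  mxPath(from = "x", to = "y", arrows = 1, free = TRUE, values = 0.1,
         labels = "cp", lbound = -10, ubound = 10),
  # Variances, bounded away from zero
  mxPath(from = c("x", "m", "y"), arrows = 2, free = TRUE, values = 1,
         labels = c("vx", "vm", "vy"), lbound = 1e-4),
  # Means
  mxPath(from = "one", to = c("x", "m", "y"), arrows = 1, free = TRUE,
         values = 0, labels = c("mx", "mm", "my")),
  mxData(observed = dat, type = "raw")
)

fit <- mxRun(med, silent = TRUE)
est <- omxGetParameters(fit)
print(round(est, 3))
```

The bounds keep the optimizer from wandering into extreme regions of the parameter space; whether they help with the hang reported here is untested.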

Replied on Wed, 06/21/2017 - 11:33

AdminRobK Joined: Jan 24, 2014

I tried again

I tried again with 'rep' set to 200, and with mxOption(NULL,"Default optimizer","SLSQP") after OpenMx is loaded. The script runs OK under Linux, but seems to hang under Windows, with only one of the three child processes still using the CPU. I can't tell if it's "stuck," or if that process's share of models to fit are just taking a long time to optimize.

I would not be surprised if doParallel has some Windows-specific bugs, as Joshua suggested.
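One way to tell a stuck worker from a slow one is to have each worker log when it starts and finishes a replication. The following is a rough sketch that uses the base parallel package rather than doParallel or doSNOW; run_one_rep() is a hypothetical stand-in for the real data generation and model fitting, and the log file name is arbitrary.

```r
library(parallel)

# Hypothetical stand-in for one replication (generate data, fit both models)
run_one_rep <- function(i) {
  x <- rnorm(100)
  data.frame(rep = i, estimate = mean(x))
}

# Wrapper that logs the start and end of each replication with the worker PID
logged_rep <- function(i, f) {
  cat(sprintf("%s pid %d start rep %d\n", format(Sys.time()), Sys.getpid(), i))
  t0 <- proc.time()[["elapsed"]]
  out <- f(i)
  out$seconds <- proc.time()[["elapsed"]] - t0
  cat(sprintf("%s pid %d done  rep %d\n", format(Sys.time()), Sys.getpid(), i))
  out
}

log_file <- file.path(tempdir(), "workers.log")
cl <- makeCluster(2, outfile = log_file)  # worker output is appended to log_file
clusterSetRNGStream(cl, iseed = 1234)     # reproducible parallel RNG streams
results <- do.call(rbind, parLapply(cl, 1:8, logged_rep, f = run_one_rep))
stopCluster(cl)

print(results)
```

A replication that has a "start" line but no matching "done" line in the log after a long wait identifies the worker and the replication number to investigate, and that replication can then be rerun on its own in a single R session.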

Replied on Thu, 06/22/2017 - 12:06

dtofighi Joined: Oct 12, 2009

re-runs

Hello All,

I would like to thank you for all your advice. I modified my code per your suggestions, using the SLSQP optimizer as well. I ran the code using both optimizers on Mac, W10, and Ubuntu. For W10, I used doSNOW instead of doParallel because it allows me to use a progress bar in an interactive session. The results of all my runs were the same: the program did not stop when all the calculations were done. This can clearly be seen when running R on W10 because of the progress bar.

I attached my Windows 10 scripts as well as the ones for Mac. Mac and Ubuntu scripts were the same. I am also attaching all my sessionInfo files.

File attachments

problemCSOLNPWin10.R

problemSLSQPWin10.R

problemCSOLNPMac.R

rsessioninfoCSOLNPubuntu.txt

rsessioninfoCSOLNPMac.txt

rsessioninfoCSOLNPW10.txt

rsessioninfoSLSQPMac.txt

rsessioninfoSLSQPubuntu.txt

rsessioninfoSLSQPW10.txt

problemSLSQPMac.R

Replied on Thu, 06/22/2017 - 13:01

jpritikin Joined: May 23, 2012

w10 only?

Just to clarify, does this failure mode appear on all platforms or only W10?

Replied on Thu, 06/22/2017 - 13:30

dtofighi Joined: Oct 12, 2009

It appears on all platforms.

Replied on Fri, 06/23/2017 - 08:57

jpritikin Joined: May 23, 2012

works for me

I tried problemSLSQPMac.R on Debian Linux and it seems to run fine in less than 5 minutes. It creates output file resultsSLSQMac.csv with 200 lines plus a header line. I am not able to reproduce the hang.

Replied on Fri, 06/30/2017 - 18:10

dtofighi Joined: Oct 12, 2009

Thanks for all your comments.

Thanks for all your comments. With constrained optimization, in my simulation, I get different results between the two optimizers. On average, the results of CSOLNP are much better than those from SLSQP.

Below, I am pasting the output from one of the simulation runs when CSOLNP got stuck:

[0] MxComputeGradientDescent: engine CSOLNP (ID 1) #P=7 gradient=central tol=6.3e-012 constraints=1
[0] resultForTT

[0] 0.005102
[0] resultForTT

[0] 0.000012
[0] resultForTT

[0] 0.000011
[0] resultForTT

[0] 0.000043
[0] MxComputeGradientDescent: engine CSOLNP done, iter=1288 inform=10
[0] MxComputeGradientDescent: engine CSOLNP (ID 1) #P=7 gradient=central tol=6.3e-012 constraints=1
[0] resultForTT

[0] 0.013429
[0] resultForTT

[0] -0.000013
[0] resultForTT

[0] 0.000625
[0] resultForTT

[0] -0.016426
[0] resultForTT

[0] -0.000001

Replied on Fri, 06/30/2017 - 19:05

AdminRobK Joined: Jan 24, 2014

a few more comments

"Thanks for all your comments. With constrained optimization, in my simulation, I get different results between the two optimizers. On average, the results of CSOLNP are much better than those from SLSQP."

The different optimizers have their strengths and weaknesses, but I'm VERY surprised that you're getting better results with CSOLNP than with SLSQP. By what criteria are they better, and are you sure? Our testing has consistently indicated that SLSQP is the best of the three gradient-descent optimizers at handling nonlinear constraints. CSOLNP's biggest strength is in analyses involving ordinal data, for which, relative to the other two, it can sometimes reach a lower fitfunction value and/or reach the minimum in fewer function evaluations.

[0] MxComputeGradientDescent: engine CSOLNP done, iter=1288 inform=10

Status code 10 means that the start values were infeasible. If the start values violate constraints, the optimizer is supposed to try to find a feasible point before beginning its algorithm in full swing. The fact that the attempt ended with status code 10 indicates either (1) that the optimizer was not able to find an initial feasible point, or (2) that the start values are completely outside the parameter space, i.e. the fitfunction evaluates to NaN or Inf or something like that at the start values.
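In a simulation it can help to record the status code of every fit, so that replications ending with code 10 can be inspected or rerun from different start values. Below is a small sketch under stated assumptions: fit_status() is a hypothetical helper, the toy one-variable model is only there to make the example runnable, and treating codes 0 and 1 as acceptable is a common convention rather than something established in this thread.

```r
library(OpenMx)

# Hypothetical helper: summarize how optimization ended for one fitted model
fit_status <- function(fit) {
  code <- fit$output$status$code
  data.frame(
    model = fit@name,
    code = code,
    ok = code %in% c(0, 1),          # assumed convention for "acceptable"
    infeasible_start = code == 10    # start values were infeasible
  )
}

# Toy one-variable model, used only to demonstrate the helper
set.seed(2)
dat <- data.frame(x = rnorm(50))
toy <- mxModel(
  "toy", type = "RAM", manifestVars = "x",
  mxPath(from = "x", arrows = 2, free = TRUE, values = 1,
         labels = "vx", lbound = 1e-4),
  mxPath(from = "one", to = "x", arrows = 1, free = TRUE, values = 0,
         labels = "mx"),
  mxData(observed = dat, type = "raw")
)
fit <- mxRun(toy, silent = TRUE)

s <- fit_status(fit)
print(s)
```

Binding the rows returned by fit_status() across replications gives a table in which the infeasible-start cases can be counted separately from ordinary convergence failures.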

Have you tried using NPSOL as the optimizer? It's a proprietary library, so you would need to install OpenMx from our repository, rather than from CRAN.

On a completely different topic...since you're doing a simulation involving an MxConstraint, you can probably speed up optimization of the constrained model if you provide an analytic Jacobian for the constraint function. Only NPSOL and SLSQP can use one. If you're interested, I could post some suggestions on how to modify your script for that purpose.
