mxRun makes R crashed with character ID column

Posted on
No user picture. dmongin Joined: 08/09/2019
Dear all.

I am using OpenMx to solve differential equation models, thanks to the the example given here https://www.rdocumentation.org/packages/OpenMx/versions/2.12.2/topics/mxExpectationStateSpaceContinuousTime

It works nicely, and I have been able to estimate first order and second order differential equation. Tests in simulation tend to show that estimates are good. But I found a weird bug, that I reproduce below. If I create simple data with 52 time observations for 250 individuals, and with a noisy signal data with (no) dynamics caused by excitation1 vector:

library(data.table)
library(OpenMx)

# make test data
test <- data.table(ID1 =rep(1:250,each = 52),time = rep(1:52,250))
test[,signal := rnorm(.N,0,1), by = ID1]
test[,excitation1 := sample(c(0,1),.N,replace = T,prob = c(0.8,0.2)), by = ID1]

# simple first order model
cdim <- list('signal', c('ksi'))
amat <- mxMatrix('Full', 1, 1, TRUE, -.1, name='A', ubound=0)
cmat <- mxMatrix('Full', 1, 1, FALSE, 1, name='C', dimnames=cdim)
qmat <- mxMatrix('Zero', 1, 1, name='Q')
rmat <- mxMatrix('Diag', 1, 1, TRUE, 0.1, name='R', lbound=1e-6)
xmat <- mxMatrix('Full', 1, 1, TRUE, 0, name='x0', lbound=0, ubound=15) # initial position, should be yO
pmat <- mxMatrix('Diag', 1, 1, FALSE, 1, name='P0')
tmat <- mxMatrix('Full', 1, 1, name='time', labels='data.time') # time
dmat <- mxMatrix('Zero', 1, 1, name='D')
bmat <- mxMatrix('Full', 1, 1,TRUE,.1,name='B',lbound=0)
umat <- mxMatrix('Full', 1, 1, name='u',labels=c('data.excitation1'))

model <- mxModel("ODE1",
amat, bmat, cmat, dmat, qmat, rmat, xmat, pmat, umat, tmat,
mxExpectationSSCT('A', 'B', 'C', 'D', 'Q', 'R', 'x0', 'P0', 'u', 'time'),
mxFitFunctionML(),
mxData(test, 'raw'))
resMx <- mxRun(model)

The model runs with errors. And that is ok, because it is just a test (by the way mxTryHard manage to find propoer starting values). But if I transform the ID column into character:

test[,ID1 := as.character(ID1)]
resMx <- mxRun(model)

then R crashes. And I don't get it, because ID1 does not appear in the model. If I do just a simple character column, nothing happens.

Replied on Thu, 08/15/2019 - 11:18
Picture of user. AdminRobK Joined: 01/24/2014

I reproduce the crash you report. My `mxVersion()`:

OpenMx version: 2.13.2 [GIT v2.13.2]
R version: R version 3.5.2 (2018-12-20)
Platform: x86_64-w64-mingw32
Default optimizer: CSOLNP
NPSOL-enabled?: Yes
OpenMP-enabled?: No

The crash occurs with all 3 of the main optimizers. What is your `mxVersion()` output?

AFAIK, OpenMx has never been tested with a data.table provided as its raw dataset. The fact that the data.table class stores a pointer to itself likely has something to do with why OpenMx crashes when you try to run the MxModel after you've modified the data.table from what it was when passed to `mxData()`.

Replied on Thu, 08/15/2019 - 11:44
Picture of user. AdminRobK Joined: 01/24/2014

I also reproduce the crash on my Linux/GNU development system. `mxVersion()`:

OpenMx version: 2.13.2.132 [GIT v2.13.2-132-gc8942afb-dirty]
R version: R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu
Default optimizer: CSOLNP
NPSOL-enabled?: Yes
OpenMP-enabled?: Yes

The crash is being caused by a segmentation fault. So, it looks like memory-safety is the underlying issue.
Replied on Thu, 08/15/2019 - 12:01
Picture of user. AdminRobK Joined: 01/24/2014

Here's what I found running under a debugger:

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff2c35bfb in omxDoubleDataElement (col=0, row=0, od=0x55555f4b18c0) at omxData.cpp:513
513 else return cd.ptr.intData[row];
(gdb) bt
#0 0x00007ffff2c35bfb in omxDoubleDataElement (col=0, row=0, od=0x55555f4b18c0) at omxData.cpp:513
#1 omxDoubleDataElement (od=0x55555f4b18c0, row=row@entry=0, col=col@entry=0) at omxData.cpp:507
#2 0x00007ffff2d29df0 in omxStateSpaceExpectation::init (this=0x55555f8e0af0) at omxStateSpaceExpectation.cpp:900
#3 0x00007ffff2c9d0ed in omxState::omxCompleteMxExpectationEntities (this=this@entry=0x555556056d20)
at /usr/include/c++/8/bits/stl_vector.h:930
#4 0x00007ffff2c04589 in omxBackend2 (constraints=0x55555d12cf28, matList=0x555560fdcc78, varList=,
algList=0x555560cba318, expectList=, computeList=0x55555ef98228, data=0x55555cd6e480,
intervalList=0x5555555747d0, checkpointList=0x55555ef71df8, options=0x55555eae7840, defvars=0x55555dacf298, silent=false)
at glue.cpp:586
#5 0x00007ffff2c0524d in omxBackend (constraints=0x55555d12cf28, matList=0x555560fdcc78, varList=0x55555e2bcbb8,
algList=0x555560cba318, expectList=0x55555dda54a8, computeList=0x55555ef98228, data=0x55555cd6e480,
intervalList=0x5555555747d0, checkpointList=0x55555ef71df8, options=0x55555eae7840, defvars=0x55555dacf298,
Rsilent=0x5555555766b8) at glue.cpp:758
Replied on Thu, 08/15/2019 - 13:53
Picture of user. AdminRobK Joined: 01/24/2014

It looks like I was wrong about changing the data.table being the cause of the fault. The following script also segfaults at the same place in the backend:

library(data.table)
library(OpenMx)

# make test data
test <- data.table(ID1 =rep(1:250,each = 52),time = rep(1:52,250))
test[,ID1 := as.character(ID1)]
test[,signal := rnorm(.N,0,1), by = ID1]
test[,excitation1 := sample(c(0,1),.N,replace = T,prob = c(0.8,0.2)), by = ID1]

# simple first order model
cdim <- list('signal', c('ksi'))
amat <- mxMatrix('Full', 1, 1, TRUE, -.1, name='A', ubound=0)
cmat <- mxMatrix('Full', 1, 1, FALSE, 1, name='C', dimnames=cdim)
qmat <- mxMatrix('Zero', 1, 1, name='Q')
rmat <- mxMatrix('Diag', 1, 1, TRUE, 0.1, name='R', lbound=1e-6)
xmat <- mxMatrix('Full', 1, 1, TRUE, 0, name='x0', lbound=0, ubound=15) # initial position, should be yO
pmat <- mxMatrix('Diag', 1, 1, FALSE, 1, name='P0')
tmat <- mxMatrix('Full', 1, 1, name='time', labels='data.time') # time
dmat <- mxMatrix('Zero', 1, 1, name='D')
bmat <- mxMatrix('Full', 1, 1,TRUE,.1,name='B',lbound=0)
umat <- mxMatrix('Full', 1, 1, name='u',labels=c('data.excitation1'))

model <- mxModel("ODE1",
amat, bmat, cmat, dmat, qmat, rmat, xmat, pmat, umat, tmat,
mxExpectationSSCT('A', 'B', 'C', 'D', 'Q', 'R', 'x0', 'P0', 'u', 'time'),
mxFitFunctionML(),
mxData(as.data.frame(test), 'raw'))
resMx <- mxRun(model)

Note that `test` has its `ID1` column converted to character strings before it ever goes into `mxData()`, and that it's being cast to a vanilla data.frame when passed to `mxData()`.

I think this is a bug. The logic in backend `omxDoubleDataElement()` assumes that it will always return either a REALSXP or an INTSXP, and never a STRSXP. It should never be looking at a data column of strings in the first place.

Replied on Thu, 08/15/2019 - 14:02
Picture of user. AdminRobK Joined: 01/24/2014

OP, could you clarify what you mean by "If I do just a simple character column, nothing happens"?
Replied on Fri, 08/16/2019 - 14:18
Picture of user. jpritikin Joined: 05/23/2012

As of v2.13.2-152-ga29af6cb8, this code correctly fails. Maybe you meant to write test[,ID1 := as.character(test$ID1)] ?