You are here

Bug in multigroup omxSaturatedModel()

require(OpenMx)
data(demoOneFactor)
latents  = c("G")
manifests = names(demoOneFactor)
 
# 1. Make two models, and nest them in a multigroup
m1 <- mxModel("model1", type = "RAM",
       manifestVars = manifests, latentVars = latents,
       mxPath(from = latents, to = manifests),
       mxPath(from = manifests, arrows = 2),
       mxPath(from = latents, arrows = 2, free = F, values = 1.0),
       mxData(cov(demoOneFactor), type = "cov", numObs = 500)
)
 
m2 <- m1
m2 <- mxRename(m2, 'model2')
 
# 2. Nest them in a multigroup supermodel and run
 
m3 = mxModel("bob", m1, m2, mxFitFunctionMultigroup(c("model1","model2")) )
m3Run = mxRun(m3);
 
# 3 request saturated model
x = omxSaturatedModel(m3Run)
Error in obsdata[selVars, selVars] : subscript out of bounds
 
traceback()
# 2: omxSaturatedModel(x[[grpnames[i]]], run = FALSE)
# 1: omxSaturatedModel(m3)
Reporter: 
Created: 
Thu, 07/24/2014 - 19:07
Updated: 
Wed, 10/22/2014 - 19:07

Comments

With the needed model renaming and many important and helpful changes that Joshua made, this now works.

require(OpenMx)
data(demoOneFactor)
latents  = c("G")
manifests = names(demoOneFactor)
 
# 1. Make two models, and nest them in a multigroup
m1 <- mxModel("model1", type = "RAM",
       manifestVars = manifests, latentVars = latents,
       mxPath(from = latents, to = manifests),
       mxPath(from = manifests, arrows = 2),
       mxPath(from = latents, arrows = 2, free = F, values = 1.0),
       mxData(cov(demoOneFactor), type = "cov", numObs = 500)
)
 
m2 <- m1
m2 <- mxRename(m2, "model2")
 
# 2. Nest them in a multigroup supermodel and run
 
m3 = mxModel("bob", m1, m2, mxFitFunctionMultigroup(c("model1","model2")) )
m3 = mxRun(m3)
x = omxSaturatedModel(m3)

FYI, as of the current beta (not trunk) I get this (new error)

mxVersion()
OpenMx version: 2.0.0.3750
R version: R version 3.1.1 (2014-07-10)

require(OpenMx)
data(demoOneFactor)
latents  = c("G")
manifests = names(demoOneFactor)
 
# 1. Make two models, and nest them in a multigroup
m1 <- mxModel("model1", type = "RAM",
       manifestVars = manifests, latentVars = latents,
       mxPath(from = latents, to = manifests),
       mxPath(from = manifests, arrows = 2),
       mxPath(from = latents, arrows = 2, free = F, values = 1.0),
       mxData(cov(demoOneFactor), type = "cov", numObs = 500)
)
 
m2 <- m1
m2 <- mxRename(m2, "model2")
m3 = mxModel("bob", m1, m2, mxFitFunctionMultigroup(c("model1", "model2")) )
m3 = mxRun(m3)
# Running bob
x = omxSaturatedModel(m3)
The model 'model1' has not been run. So reference models of all the variables in the data will be made.  For reference models of only the variables used in the model, provide the model after it has been run.
Error in mxData(observed = obsdata, type = datatype) : 
  Number of observations must be specified for non-raw data, i.e., add numObs=XXX to mxData()

svn up (r3767 or greater), and re-try. It's fixed.

this seems to be broken again (at least in 2.0.0.3881)

require(OpenMx)
data(demoOneFactor)
latents  = c("G")
manifests = names(demoOneFactor)
 
m1 <- mxModel("model1", type = "RAM", 
    manifestVars = manifests, latentVars = latents, 
    mxPath(from = latents,   to = manifests),
    mxPath(from = manifests, arrows = 2),
    mxPath(from = latents,   arrows = 2, free = F, values = 1.0),
    mxData(cov(demoOneFactor), type = "cov", numObs = 500)
)
 
m2 <- mxRename(m1, "model2")
 
m3 = mxModel("bob", m1, m2,
    mxFitFunctionMultigroup(c("model1", "model2"))
)
 
m3 = mxRun(m3); summary(m3)
# All have been run, of course
m3@.wasRun #[1] TRUE
m3$model1@.wasRun #[1] TRUE
m3$model2@.wasRun #[1] TRUE
 
mRef = omxSaturatedModel(m3)
# The model 'model1' has not been run. So reference models of all the variables in the data will be made.  For reference models of only the variables used in the model, provide the model after it has been run.
# The model 'model2' has not been run. So reference models of all the variables in the data will be made.  For reference models of only the variables used in the model, provide the model after it has been run.

seems the same for mxRefModels

refModelList = mxRefModels(m3)
# The model 'model1' has not been run. So reference models of all the variables in the data will be made.  For reference models of only the variables used in the model, provide the model after it has been run.
# The model 'model2' has not been run. So reference models of all the variables in the data will be made.  For reference models of only the variables used in the model, provide the model after it has been run.

With mxRefModels provided, we should have the full list of stats, but:

summary(m3, refModels = refModelList)
# TLI: NaN
# RMSEA:  0 *(Non-centrality parameter is negative)
# Some of your fit indices are missing.
#   To get them, fit saturated and independence models, and include them with
  # summary(yourModel, SaturatedLikelihood=..., IndependenceLikelihood=...).

If you want the reference models run, you have to pass run=TRUE. The default is mxRefModels(x, run=FALSE)

Hi josh, Yes. However we're not telling the user that the refModels list contains non-run models which in turn provide NA comparator fits.

Issuing model not-run warning messages from mxRefModels and omxSaturatedModel is an error.

Re passing in non-run models as refModels, I guess we should detect when models without fit stats that have not been run are passed into refModels=, then return an error or perhaps a warning?

"The models passed in as have not been run. You must run them to create the fit statistics:
refsRun = mxRefModels(yourModel, run = TRUE)
"

Hm, good point. SVN 3883

nice!

summary(m3, refModels = mRef)
Error in refToLikelihood(satModel) : 
  Reference model must be run to obtain fit indices

Just need to figure out what's changed that mxRefModels thinks the sub-models haven't been run.

I bet that's a bug in Mike Hunter's code. Submodels are looking for model$runstate and not finding it because it's in the container model.

Yes... I see in "R/MxFitFunctionML.R" the comment "runstate is not available for submodels, ugh!" :-)

Instead of

wasRun <- length(model@runstate) != 0

Perhaps:

model@.wasRun instead?

or an omxWasRun(model) helper to make this cleaner for all.

Rob just let me know that what I thought was a bug in mxStandardizeRAMpaths() is actually something up with mxFitFunctionMultigroup(). That's a separate issue, but perhaps worth noting here too.

Using mxFitFunctionMultigroup(), the Hessian of m3 is indeed singular (all 0).

This is apparent in missing SEs in summary(m3), and in a hard error from mxStandardizeRAMpaths(m3, SE = TRUE)

Rob coded up m3 the old way: and this is working fine, so multigroup is doing something errant.

m4 <- mxModel("charlie", m1, m2,
              mxAlgebra(model1.fitfunction + model2.fitfunction, name = "overallfitfunc"),
              mxFitFunctionAlgebra("overallfitfunc")
)
m4run <- mxRun(m4)
mxStandardizeRAMpaths(m4run,T)

The reason I use runstate as the indicator of a model being run is I grab information from the runstate expectation. That's the only place I know of that has information about which variables in a data set are actually used.

If a model has data with the 26 variables A, B, C, ..., X, Y, and Z but only uses X, Y, and Z then the correct saturated model is only of those variables. I use the runstate to figure out which ones to use. This strategy doesn't work for multigroup because the submodels don't have a runstate. However, unless there's a bigger bug with multigroup, there must be some way that multigroup is keeping track of this information. Joshua, any ideas?

There are different solutions. I'm not sure which is the least bad. Maybe we should pass runstate to all the models in a multiple group situation? Or just pass around the list of free variables?

Passing around the list of used variables sounds like the best plan to me.

Does r3913 fix your issue?

mxRun now populates the expectation 'dims' with the variables used. And refModels() checks the expectation 'dims'. There are now two possible warnings thrown. First, if the dims are a single NA, then I complain. Second, if the model wasn't run, then I complain. The behavior is the same in both cases, but the complaint I make is more appropriate. With raw data multigroup models, everything should work and there's now a test to ensure it. With cov data multigroup models, OpenMx will complain about the 'dims' but everything should be fine.

There is an ongoing issue with cov data that the expectation for cov data does not modify the 'dims' slot in the expectation. However, for cov data, I believe we require all variables be modeled. (See below.) So fitting the saturated model to all the covariance variables is likely the correct behavior. It still throws a message, but the message says the 'dims' haven't been set, not that the model hasn't been run.

require(OpenMx)
data(demoOneFactor)
manifests <- names(demoOneFactor)
manifests <- manifests[-1]
latents <- c("G")
factorModel <- mxModel("One Factor",
      type="RAM",
      manifestVars = manifests,
      latentVars = latents,
      mxPath(from=latents, to=manifests),
      mxPath(from=manifests, arrows=2),
      mxPath(from=latents, arrows=2,
            free=FALSE, values=1.0),
      mxData(cov(demoOneFactor), type="cov",
            numObs=500)) # errors here
# Error: The observed data contains the variables: 'x1' that have not been declared in the manifest variables.
#summary(mxRun(factorModel))

Given that for cov data, the ref model with all vars is correct for comparison purposes, Perhaps tailor the message for advanced users who for some reason want to generate a ref model for a subset of the variables.

so instead of this (nb, note minor typo 'expectaion')

x = omxSaturatedModel(m3Run)

The model 'model1' has an expectaion with NA 'dims' slot. So reference models of all the variables in the data will be made. For reference models of only the variables used in the model, populate the 'dims' slot with the variable names to use.
The model 'model2' has an expectaion with NA 'dims' slot. So reference models of all the variables in the data will be made. For reference models of only the variables used in the model, populate the 'dims' slot with the variable names to use.

Maybe just

Reference models of all variables in the data were made. For reference models of only the variables used in the model, populate the 'dims' slot with the variable names to use.

Aside from the typo (fixed in r3934), the wording of the error message is parallel to my indifference curve on this topic. Change it if you like!