Best Practices for 3lvl measurement SEM

Posted on
Ben Joined: 06/20/2023
I'm currently building a 3lvl MSEM that includes measurement models. The Pritikin et al. (2017) paper is relevant and helpful, but whereas they have observed variables on every level, I mainly want to use the levels to control for the between part of the variance of items I measured only on the 1st level. Hence, I want to make sure I don't misspecify my model (see below). Assuming the model is right, I'd also be grateful for tips on how to deal with a non-convex Hessian when trying to fit it.

**The overall goal**: My goal is to build a model with two variables measured on the 1st "individual" level (X, Y), two variables measured on the 2nd "team" level (V, W) and a 3rd "district" level to control for shared variance between teams. All X, Y, V, W will be related through regressions on the 2nd level. I compare my models with lavaan as far as feasible (lavaan isn't able to do 3lvl MSEMs).

**In this step, I start with a simple measurement model of one latent variable, X, measured through n items. I want to build a 3-lvl measurement model to do justice to the team and district structure underlying the data.** After building these models also for Y, V, and W, I want to build the full regression model by adding paths between the variables.

**2 Levels**: I get the single-level regressions and 2lvl-measurement models to agree between OpenMx and lavaan. This means lavaan seems to make the following implicit assumptions:
- fixing first factor loading to 1 while freeing the variance of latent variables (ok)
- fixing the means of variables to zero on the within level (why is this a good choice?)
- adding variables as latent variables to the between level (sure, this is what "levels" is all about)
- adding a path from between-level (latent) variables to the corresponding manifest variables on the within-level, fixed to 1 (ok)

**3 Levels**: I now added a third level ("district") and followed these basic steps. **Please see the attached file.**
There are two modeling choices I'm not 100% sure about. If you know of best practices, I'd be very happy to hear them, as I wasn't able to find any. (E.g., Pritikin et al. link latent variables, as manifest variables differ on each level; I'm not an expert in Mplus, but it seems many of their assumptions regarding these choices are not made explicit when defining the model.)
- should I fix means of the items to zero on the between level? I am assuming different team means, so I opted not to do so (but I also thought there would be different means on the individual level)
- should I link the latent variables corresponding to the items between the district-lvl and the between-lvl (lvls 3 and 2)? Teams are clustered in districts, so this makes sense. (Or do I link lvls 3 and 1 directly as this is the actual relationship I'm interested in?)

With these decisions out of the way, here is the code I used.
**Running the code**, I get the following warning, and the standard errors are through the roof: In model 'within' Optimizer returned a non-zero status code 5. The Hessian at the solution does not appear to be convex. See ?mxCheckIdentification for possible diagnosis (Mx status RED).

Please note that I have used the fitted model to eyeball starting values of the right order of magnitude. Overall, the district-level variances are very small, and several of them turn negative in the estimates. Might this be a reason why mxRun() struggles with the model?

I'm grateful for your support. I hope I've just made a simple mistake somewhere and hope it's not too stupid.


data$team <- as.factor(as.character(data$team))
data$district <- as.factor(as.character(data$district))
dataL1 <- data
dataL2 <- data[!duplicated(data$team),]
dataL3 <- data[!duplicated(data$district),]
mxDataL1 <- mxData(observed=dataL1, type="raw")
mxDataL2 <- mxData(observed=dataL2, type="raw", primaryKey = "team")
mxDataL3 <- mxData(observed=dataL3, type="raw", primaryKey = "district")

nx <- 14
xItems <- paste0("x",1:nx)

# Define Paths
# Paths X lvl 1
pathXItemsVar <- mxPath(from=xItems, arrows=2, free=TRUE, values = diag(var(data[xItems], na.rm=TRUE)), labels=paste0("e_x",1:nx)) # residual variances
pathXItemsMeans <- mxPath(from = "one", to = xItems, free=TRUE, values=1, labels=paste0("m_x",1:nx)) # means items
pathXItemsMeans0 <- mxPath(from = "one", to = xItems, free = FALSE, values=0, labels=paste0("m_x",1:nx)) # manifest means, constrained to 0
pathXLoadings <- mxPath(from="x", to=xItems, free=c(FALSE,rep(TRUE,nx-1)), values=1, labels=paste0("fl_x_w",1:nx)) # factor loadings, first loading fixed to 1
pathXLatentVar <- mxPath(from="x", arrows=2, free=TRUE, values=0.2, labels="var_x") # latent variance

# Paths X lvl 2
pathXItemsVarL2 <- mxPath(from=xItems, arrows=2, free=TRUE, values = 0.1, labels=paste0("e_l2x",1:nx)) # (latent) items variance
pathXItemsMeansL2 <- mxPath(from = "one", to = xItems, free=TRUE, values=1, labels=paste0("m_l2x",1:nx)) # (latent) items means
pathXItemsMeansL20 <- mxPath(from = "one", to = xItems, free=FALSE, values=0, labels=paste0("m_l2x",1:nx)) # (latent) items means, constrained to 0 (not used here)
pathXLoadingsL2 <- mxPath(from = "xTeam", to = xItems,free = c(FALSE,rep(TRUE,nx-1)), values = 1, labels = paste0("fl_x_b",1:nx)) # latent team factor loadings, first fixed to 1; labels differ from the within-level fl_x_w*, so they are not constrained equal across levels
pathXLatentVarL2 <- mxPath(from = "xTeam", arrows = 2, values = 0.1, free = TRUE, labels = "xTeam_var") # latent team factor, variance

# Paths X lvl 3
pathXItemsVarL3 <- mxPath(from=xItems, arrows=2, free=TRUE, values = 0.1, labels=paste0("e_l3x",1:nx)) # (latent) items variance
pathXItemsMeansL3 <- mxPath(from = "one", to = xItems, free=TRUE, values=1, labels=paste0("m_l3x",1:nx)) # (latent) items means
pathXLoadingsL3 <- mxPath(from = "xDistrict", to = xItems, values = 1, free = c(FALSE,rep(TRUE,nx-1)), labels = paste0("fl_x_d",1:nx)) # latent district factor loadings, first fixed to 1; labels differ from the other levels, so they are not constrained equal
pathXLatentVarL3 <- mxPath(from = "xDistrict", arrows = 2, values = 1, free = TRUE, labels = "xDistrict_var") # latent district factor, variance

# OpenMx - 3 lvl
modelXLatentL3 <- mxModel(
  model = "district",
  type = "RAM",
  manifestVars = c(),
  latentVars = c("xDistrict",xItems),
  mxData(observed = dataL3, type="raw", primaryKey = "district"),
  pathXItemsVarL3,
  pathXItemsMeansL3,
  pathXLoadingsL3,
  pathXLatentVarL3
)

modelXLatentL2 <- mxModel(
  model = "between",
  type = "RAM",
  modelXLatentL3,
  manifestVars = c(),
  latentVars = c("xTeam",xItems),
  mxDataL2,
  mxPath(from = paste0("district.",xItems), to = xItems, values=1, free=FALSE, joinKey="district"), #linking (latent) district lvl items
  pathXItemsMeansL2, pathXItemsVarL2,
  pathXLatentVarL2, pathXLoadingsL2
)

modelXLatentL123 <- mxModel(
  model = "within",
  type = "RAM",
  modelXLatentL2,
  manifestVars = c(xItems),
  latentVars = c("x"),
  mxDataL1,
  mxPath(from = paste0("between.",xItems), to = xItems, values=1, free=FALSE, joinKey="team"), #linking (latent) between lvl Items to manifest vars
  pathXItemsMeans0, pathXItemsVar,
  pathXLatentVar, pathXLoadings
)

fitXLatentL123 <- mxRun(modelXLatentL123, intervals = TRUE)
summary(fitXLatentL123)
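
Before fitting, it can help to confirm that the level data frames built at the top of the code above really form a clean hierarchy, since the `joinKey` links rely on unique primary keys and on every lower-level row finding its parent. The following is a minimal base-R sketch on made-up toy data; the data frame `toy` and all values in it are hypothetical, and only the deduplication idiom is taken from the code above.

```r
# Hypothetical toy data: 2 districts, 3 teams, 6 individuals
toy <- data.frame(
  district = c("d1", "d1", "d1", "d1", "d2", "d2"),
  team     = c("t1", "t1", "t2", "t2", "t3", "t3"),
  x1       = c(2.1, 1.9, 3.0, 3.2, 2.5, 2.7),
  stringsAsFactors = FALSE
)

# Same deduplication idiom as in the post
toyL2 <- toy[!duplicated(toy$team), ]
toyL3 <- toy[!duplicated(toy$district), ]

# Each team should belong to exactly one district
districts_per_team <- tapply(toy$district, toy$team,
                             function(d) length(unique(d)))
teams_nested <- all(districts_per_team == 1)

# Primary keys should be unique, and every foreign key should find a match
keys_ok <- !anyDuplicated(toyL2$team) &&
  !anyDuplicated(toyL3$district) &&
  all(toy$team %in% toyL2$team) &&
  all(toyL2$district %in% toyL3$district)

teams_nested
keys_ok
```

If either check fails on the real data (for example, because team labels are reused across districts), the level-2 rows would not map cleanly onto level-3 rows.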

Replied on Fri, 07/21/2023 - 10:53
mhunter Joined: 07/31/2009

Here's an example 2-level factor model: [https://github.com/OpenMx/OpenMx/blob/master/inst/models/nightly/mplus-ex9.6.R](https://github.com/OpenMx/OpenMx/blob/master/inst/models/nightly/mplus-ex9.6.R)

It should be able to get you started.

Here's the core part of the model:

library(OpenMx)

set.seed(1)
ex96 <- suppressWarnings(try(read.table("models/nightly/data/ex9.6.dat")))
if (is(ex96, "try-error")) ex96 <- read.table("data/ex9.6.dat")

ex96$V8 <- as.integer(ex96$V8)
bData <- ex96[!duplicated(ex96$V8), c('V7', 'V8')]
colnames(bData) <- c('w', 'clusterID')
wData <- ex96[,-match(c('V7'), colnames(ex96))]
colnames(wData) <- c(paste0('y', 1:4), paste0('x', 1:2), 'clusterID')

bModel <- mxModel(
  'between', type="RAM",
  mxData(type="raw", observed=bData, primaryKey="clusterID"),
  latentVars = c("lw", "fb"),
  mxPath("one", "lw", labels="data.w", free=FALSE),
  mxPath("fb", arrows=2, labels="psiB"),
  mxPath("lw", 'fb', labels="phi1"))

wModel <- mxModel(
  'within', type="RAM", bModel,
  mxData(type="raw", observed=wData),
  manifestVars = paste0('y', 1:4),
  latentVars = c('fw', paste0("xe", 1:2)),
  mxPath("one", paste0('y', 1:4), values=runif(4),
         labels=paste0("gam0", 1:4)),
  mxPath("one", paste0('xe', 1:2),
         labels=paste0('data.x',1:2), free=FALSE),
  mxPath(paste0('xe', 1:2), "fw",
         labels=paste0('gam', 1:2, '1')),
  mxPath('fw', arrows=2, values=1.1, labels="varFW"),
  mxPath('fw', paste0('y', 1:4), free=c(FALSE, rep(TRUE, 3)),
         values=c(1,runif(3)), labels=paste0("loadW", 1:4)),
  mxPath('between.fb', paste0('y', 1:4), values=c(1,runif(3)),
         free=c(FALSE, rep(TRUE, 3)), labels=paste0("loadB", 1:4),
         joinKey="clusterID"),
  mxPath(paste0('y', 1:4), arrows=2, values=rlnorm(4),
         labels=paste0("thetaW", 1:4)))

To extend this to three levels, you would make another model similar to bModel but for your district level. And then you'd add paths similar to the below:

mxPath('between.fb', paste0('y', 1:4), values=c(1,runif(3)),
       free=c(FALSE, rep(TRUE, 3)), labels=paste0("loadB", 1:4),
       joinKey="clusterID")

but for your district level.
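
In equation form, a three-level version of the setup discussed in this thread could be sketched as follows. This assumes the mirror-latent approach from the original post, with the linking paths between levels fixed to 1. The symbols ($\nu$, $\lambda$, $\eta$, $\varepsilon$, and the level superscripts $W$, $B$, $D$) are notation introduced here for illustration, not OpenMx names. For the item vector of individual $i$ in team $j$ in district $k$:

$$
\begin{aligned}
y_{ijk} &= \nu^{W} + \lambda^{W}\eta^{W}_{ijk} + \varepsilon^{W}_{ijk} + b_{jk},\\
b_{jk} &= \nu^{B} + \lambda^{B}\eta^{B}_{jk} + \varepsilon^{B}_{jk} + d_{k},\\
d_{k} &= \nu^{D} + \lambda^{D}\eta^{D}_{k} + \varepsilon^{D}_{k},\\
\mathrm{E}[y_{ijk}] &= \nu^{W} + \nu^{B} + \nu^{D}.
\end{aligned}
$$

Under these assumptions the item intercepts enter the expected means only through their sum, so freeing them at more than one level leaves them not separately identified. That is one thing worth ruling out when the Hessian is reported as non-convex.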

If that doesn't help, perhaps generate some fake data with mxGenerateData() and we can try to debug that way.
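
As a stand-in while no fitted model is available for `mxGenerateData()`, fake three-level data can also be simulated directly in base R. The sketch below is not `mxGenerateData()`; it draws one common factor with district, team, and individual components and builds 14 items from it. All sample sizes, standard deviations, and loadings are hypothetical choices, and only the column names (`district`, `team`, `x1` to `x14`) follow the original post.

```r
set.seed(123)  # arbitrary seed, for reproducibility only

# Hypothetical design: 10 districts x 5 teams x 8 individuals
n_district <- 10
teams_per_district <- 5
persons_per_team <- 8
nx <- 14

district_ids <- paste0("d", seq_len(n_district))
team_grid <- data.frame(
  district = rep(district_ids, each = teams_per_district),
  team     = paste0("t", seq_len(n_district * teams_per_district)),
  stringsAsFactors = FALSE
)

# One row per individual, carrying team and district IDs
sim <- team_grid[rep(seq_len(nrow(team_grid)), each = persons_per_team), ]
rownames(sim) <- NULL

# Factor components at each level (hypothetical standard deviations)
eta_district <- rnorm(n_district, sd = 0.2)
eta_team     <- rnorm(nrow(team_grid), sd = 0.4)
eta_person   <- rnorm(nrow(sim), sd = 1)
eta <- eta_district[match(sim$district, district_ids)] +
  eta_team[match(sim$team, team_grid$team)] +
  eta_person

# Items: first loading 1, the rest 0.8, plus individual-level residual noise
loadings <- c(1, rep(0.8, nx - 1))
items <- outer(eta, loadings) +
  matrix(rnorm(nrow(sim) * nx, sd = 0.5), ncol = nx)
colnames(items) <- paste0("x", seq_len(nx))

sim <- cbind(sim, items)
str(sim[, 1:4])
```

This simplified generator has no item-specific residuals at the team or district level, so it is only meant as a shareable test bed for debugging, not as a match to any particular model in this thread.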

Replied on Thu, 09/07/2023 - 00:12
Ben Joined: 06/20/2023

In reply to mhunter

Thank you. I am familiar with that example but decided to follow a different approach inspired by this model: https://openmx.ssri.psu.edu/comment/8569#comment-8569. It links the manifest variables on the within level with mirror latent variables on the between level. I wanted to replicate earlier lavaan results, and this is what lavaan is doing.

I am currently taking a step back from this approach and modeling variables on only two levels. I might give the three-level model a go later.