Best Practices for 3lvl measurement SEM

Ben
Joined: 06/20/2023 - 09:18
Attachment: model-xl123.jpg (732.64 KB)

I'm currently building a 3lvl MSEM including measurement models. The Pritikin et al. (2017) paper is relevant and helpful, but whereas they have observed variables on every level, I mainly want to use the levels to control for the between part of the variance of items I measured only on the 1st level. Hence, I want to make sure I don't misspecify my model (see below). Assuming the model is right, I'd also be grateful for tips on how to deal with a non-convex Hessian when trying to fit it.

The overall goal: My goal is to build a model with two variables measured on the 1st "individual" level (X, Y), two variables measured on the 2nd "team" level (V, W) and a 3rd "district" level to control for shared variance between teams. All X, Y, V, W will be related through regressions on the 2nd level. I compare my models with lavaan as far as feasible (lavaan isn't able to do 3lvl MSEMs).

In this step, I start with a simple measurement model of one latent variable, X, measured through n items. I want to build a 3-lvl measurement model to do justice to the team and district structure underlying the data. After building these models also for Y, V, and W, I want to build the full regression model by adding paths between the variables.

2 Levels: I get the single-level regressions and 2lvl-measurement models to agree between OpenMx and lavaan. This means lavaan seems to make the following implicit assumptions:
- fixing first factor loading to 1 while freeing the variance of latent variables (ok)
- fixing the means of variables to zero on the within level (why is this a good choice?)
- adding variables as latent variables to the between level (sure, this is what "levels" is all about)
- adding a path from between-level (latent) variables to the corresponding manifest variables on the within-level, fixed to 1 (ok)
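For reference, the two-level measurement model behind these defaults can be written in lavaan syntax roughly as follows. This is a minimal sketch, not the exact model used here: the four item names and the latent names `x` and `xTeam` are placeholders, and the fit is only attempted if lavaan is installed and a data frame called `data` with a `team` column exists.

```r
# Hypothetical item names x1..x4 (the real model uses 14 items).
xItems <- paste0("x", 1:4)
rhs <- paste(xItems, collapse = " + ")

# Two-level syntax: one factor per level, measured by the same items.
# By default lavaan fixes the first loading of each factor to 1.
twoLevelSyntax <- paste0(
  "level: 1\n",
  "  x =~ ", rhs, "\n",
  "level: 2\n",
  "  xTeam =~ ", rhs, "\n"
)
cat(twoLevelSyntax)

# Fit only when lavaan is available and `data` is an actual data frame
# (assumed to contain the items and a `team` cluster column).
if (requireNamespace("lavaan", quietly = TRUE) && is.data.frame(data)) {
  fit <- lavaan::sem(twoLevelSyntax, data = data, cluster = "team")
  print(lavaan::parameterEstimates(fit))
}
```

The parameter table of the fitted object shows which loadings and intercepts lavaan fixed or freed on each level, which makes the implicit choices listed above visible.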

3 Levels: I now added a third level ("district") and followed these basic steps. Please see the attached file.
There are two modeling choices I'm not 100% sure about. If you know of best practices, I'd be very happy to hear about them, as I wasn't able to find any. (E.g., Pritikin et al. link latent variables, as the manifest variables differ on each level; I'm not an expert in Mplus, but it seems many of their assumptions regarding these choices are not made explicit when defining the model.)
- Should I fix the means of the items to zero on the between level? I am assuming different team means, so I opted not to do so (but I also thought there would be different means on the individual level).
- should I link the latent variables corresponding to the items between the district-lvl and the between-lvl (lvls 3 and 2)? Teams are clustered in districts, so this makes sense. (Or do I link lvls 3 and 1 directly as this is the actual relationship I'm interested in?)

With these decisions out of the way, here is the code I used.
Running the code, I get the following warning, and the standard errors are through the roof: "In model 'within' Optimizer returned a non-zero status code 5. The Hessian at the solution does not appear to be convex. See ?mxCheckIdentification for possible diagnosis (Mx status RED)."

Please note that I have used the fitted model to eyeball starting values of the right order of magnitude. Overall, the district-level variances are very small, and several of them turn negative in the estimates. Might this be a reason why mxRun() struggles with the model?
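One rough way to see how little variance sits on the district level is a descriptive sum-of-squares split of a single item into district, team, and individual parts. The sketch below uses simulated stand-in data (all sizes and standard deviations are made up); with real data, `d` would be replaced by the actual data frame. The shares are only descriptive: the higher-level parts also absorb sampling noise from the levels below, so they are not estimates of the model's variance components.

```r
set.seed(1)

# Simulated stand-in data (hypothetical): 10 districts x 5 teams x 8 people.
district <- rep(1:10, each = 40)
team <- rep(1:50, each = 8)
x1 <- rnorm(10, sd = 0.1)[district] +  # small district effect
  rnorm(50, sd = 0.5)[team] +          # team effect
  rnorm(400, sd = 1)                   # individual-level noise
d <- data.frame(district = factor(district), team = factor(team), x1 = x1)

# Cluster means, repeated for every row of the cluster.
districtMean <- ave(d$x1, d$district)
teamMean <- ave(d$x1, d$team)

# Nested decomposition of each observation's deviation from the grand mean.
partL3 <- districtMean - mean(d$x1)  # district part
partL2 <- teamMean - districtMean    # team within district
partL1 <- d$x1 - teamMean            # individual within team

ss <- c(district = sum(partL3^2), team = sum(partL2^2), individual = sum(partL1^2))
shares <- ss / sum(ss)
print(round(shares, 3))
```

A district share close to zero would be consistent with district-level variance estimates that hover around, or cross, the boundary at zero.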

I'm grateful for your support. I hope I've just made a simple mistake somewhere and hope it's not too stupid.

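library(OpenMx)

# Cluster identifiers are converted to factors so they can serve as primary/join keys
data$team <- as.factor(as.character(data$team))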
data$district <- as.factor(as.character(data$district))
dataL1 <- data
dataL2 <- data[!duplicated(data$team),]
dataL3 <- data[!duplicated(data$district),]
mxDataL1 <- mxData(observed=dataL1, type="raw")
mxDataL2 <- mxData(observed=dataL2, type="raw", primaryKey = "team")
mxDataL3 <- mxData(observed=dataL3, type="raw", primaryKey = "district")
nx <- 14
xItems <- paste0("x",1:nx)
# Define Paths
# Paths X lvl 1
pathXItemsVar <- mxPath(from=xItems, arrows=2, free=TRUE, values = diag(var(data[xItems], na.rm=TRUE)), labels=paste0("e_x",1:nx)) # residual variances
pathXItemsMeans <- mxPath(from = "one", to = xItems, free=TRUE, values=1, labels=paste0("m_x",1:nx)) # means items
pathXItemsMeans0 <- mxPath(from = "one", to = xItems, free = FALSE, values=0, labels=paste0("m_x",1:nx)) # manifest means, constrained to 0
pathXLoadings <- mxPath(from="x", to=xItems, free=c(FALSE,rep(TRUE,nx-1)), values=1, labels=paste0("fl_x_w",1:nx)) # factor loadings, fixing first factor to 1
pathXLatentVar <- mxPath(from="x", arrows=2, free=TRUE, values=0.2, labels="var_x") # latent variance
# Paths X lvl 2
pathXItemsVarL2 <- mxPath(from=xItems, arrows=2, free=TRUE, values = 0.1, labels=paste0("e_l2x",1:nx)) # (latent) items variance
pathXItemsMeansL2 <- mxPath(from = "one", to = xItems, free=TRUE, values=1, labels=paste0("m_l2x",1:nx)) # (latent) items means
pathXItemsMeansL20 <- mxPath(from = "one", to = xItems, free=FALSE, values=0, labels=paste0("m_l2x",1:nx)) # (latent) items means, constrained to 0 (not used here)
pathXLoadingsL2 <- mxPath(from = "xTeam", to = xItems,free = c(FALSE,rep(TRUE,nx-1)), values = 1, labels = paste0("fl_x_b",1:nx)) # latent team factor loadings, first fixed to 1 (own labels, so not equated with the within-level loadings)
pathXLatentVarL2 <- mxPath(from = "xTeam", arrows = 2, values = 0.1, free = TRUE, labels = "xTeam_var") # latent team factor, variance
# Paths X lvl 3
pathXItemsVarL3 <- mxPath(from=xItems, arrows=2, free=TRUE, values = 0.1, labels=paste0("e_l3x",1:nx)) # (latent) items variance 
pathXItemsMeansL3 <- mxPath(from = "one", to = xItems, free=TRUE, values=1, labels=paste0("m_l3x",1:nx)) # (latent) items means
pathXLoadingsL3 <- mxPath(from = "xDistrict", to = xItems, values = 1, free = c(FALSE,rep(TRUE,nx-1)), labels = paste0("fl_x_d",1:nx)) # latent district factor loadings, first fixed to 1 (own labels, so not equated with the within-level loadings)
pathXLatentVarL3 <- mxPath(from = "xDistrict", arrows = 2, values = 1, free = TRUE, labels = "xDistrict_var") # latent district factor, variance
# OpenMx - 3 lvl
modelXLatentL3 <- mxModel(
  model = "district",
  type = "RAM",
  manifestVars = c(),
  latentVars = c("xDistrict",xItems),
  mxData(observed = dataL3, type="raw", primaryKey = "district"),
  # district-level paths as defined above
  pathXItemsMeansL3, pathXItemsVarL3,
  pathXLatentVarL3, pathXLoadingsL3
)
modelXLatentL2 <- mxModel(
  model = "between",
  type = "RAM",
  modelXLatentL3, # district model nested as a submodel
  manifestVars = c(),
  latentVars = c("xTeam",xItems),
  mxDataL2,
  mxPath(from = paste0("district.",xItems), to = xItems, values=1, free=FALSE, joinKey="district"), #linking (latent) district lvl items
  pathXItemsMeansL2, pathXItemsVarL2,
  pathXLatentVarL2, pathXLoadingsL2
)
modelXLatentL123 <- mxModel(
  model = "within",
  type = "RAM",
  modelXLatentL2, # between model nested as a submodel
  manifestVars = c(xItems),
  latentVars = c("x"),
  mxDataL1,
  mxPath(from = paste0("between.",xItems), to = xItems, values=1, free=FALSE, joinKey="team"), #linking (latent) between lvl Items to manifest vars
  pathXItemsMeans0, pathXItemsVar,
  pathXLatentVar, pathXLoadings
)
fitXLatentL123 <- mxRun(modelXLatentL123, intervals = TRUE)
mhunter
Joined: 07/31/2009 - 15:26

Here's an example 2-level factor model:

It should be able to get you started.

Here's the core part of the model

library(OpenMx)

ex96 <- suppressWarnings(try(read.table("models/nightly/data/ex9.6.dat")))
if (is(ex96, "try-error")) ex96 <- read.table("data/ex9.6.dat")
ex96$V8 <- as.integer(ex96$V8)
bData <- ex96[!duplicated(ex96$V8), c('V7', 'V8')]
colnames(bData) <- c('w', 'clusterID')
wData <- ex96[,-match(c('V7'), colnames(ex96))]
colnames(wData) <- c(paste0('y', 1:4), paste0('x', 1:2), 'clusterID')
bModel <- mxModel(
    'between', type="RAM",
    mxData(type="raw", observed=bData, primaryKey="clusterID"),
    latentVars = c("lw", "fb"),
    mxPath("one", "lw", labels="data.w", free=FALSE),
    mxPath("fb", arrows=2, labels="psiB"),
    mxPath("lw", 'fb', labels="phi1"))
wModel <- mxModel(
    'within', type="RAM", bModel,
    mxData(type="raw", observed=wData),
    manifestVars = paste0('y', 1:4),
    latentVars = c('fw', paste0("xe", 1:2)),
    mxPath("one", paste0('y', 1:4), values=runif(4),
       labels=paste0("gam0", 1:4)),
    mxPath("one", paste0('xe', 1:2),
       labels=paste0('data.x',1:2), free=FALSE),
    mxPath(paste0('xe', 1:2), "fw",
       labels=paste0('gam', 1:2, '1')),
    mxPath('fw', arrows=2, values=1.1, labels="varFW"),
    mxPath('fw', paste0('y', 1:4), free=c(FALSE, rep(TRUE, 3)),
       values=c(1,runif(3)), labels=paste0("loadW", 1:4)),
    mxPath('between.fb', paste0('y', 1:4), values=c(1,runif(3)),
       free=c(FALSE, rep(TRUE, 3)), labels=paste0("loadB", 1:4),
       joinKey="clusterID"),
    mxPath(paste0('y', 1:4), arrows=2, values=rlnorm(4),
       labels=paste0("thetaW", 1:4)))

To extend this to three levels, you would make another model similar to bModel but for your district level. And then you'd add paths similar to the below:

mxPath('between.fb', paste0('y', 1:4), values=c(1,runif(3)),
       free=c(FALSE, rep(TRUE, 3)), labels=paste0("loadB", 1:4),
       joinKey="clusterID")

but for your district level.

If that doesn't help, perhaps generate some fake data with mxGenerateData() and we can try to debug that way.

Ben
Joined: 06/20/2023 - 09:18
Thank you

Thank you. I am familiar with that example but decided to follow a different approach inspired by this model. It links the manifest variables on the within level with mirror latent variables on the between level. I wanted to replicate earlier lavaan results, and this is what lavaan is doing.

I am currently taking a step back from this approach and modeling variables on only two levels. I might give this a go later.