Hi all,
I have been trying to replicate the measurement invariance analysis for a CFA model with ordinal data (theta parametrization) from Kline (2015), but I have failed due to convergence issues (even after roughly adjusting the starting values for the thresholds based on lavaan output). I also tried mxAutoStart(). I am wondering if I am missing something.
library(OpenMx)

radloffdata <- read.table("https://www.guilford.com/add/kline/radloff-lavaan.txt",
                          sep = "\t", header = FALSE)
colnames(radloffdata) <- c("x1", "x2", "x3", "x4", "x5", "g")
radloffdata[, 1] <- ordered(radloffdata[, 1])
radloffdata[, 2] <- ordered(radloffdata[, 2])
radloffdata[, 3] <- ordered(radloffdata[, 3])
radloffdata[, 4] <- ordered(radloffdata[, 4])
radloffdata[, 5] <- ordered(radloffdata[, 5])
rawData1 <- subset(radloffdata, g == 1)
rawData2 <- subset(radloffdata, g == 2)

# common parameters
alphas <- mxPath(from = "one", to = c("x1", "x2", "x3", "x4", "x5"), arrows = 1,
                 free = FALSE, values = 0)
varDep <- mxPath(from = "depression", arrows = 2, free = TRUE, values = .5)
lambdas <- mxPath(from = "depression", to = c("x1", "x2", "x3", "x4", "x5"), arrows = 1,
                  free = c(FALSE, TRUE, TRUE, TRUE, TRUE), values = c(1, 1, 1, 1, 1))

# model for group 1
dataRaw1 <- mxData(observed = rawData1, type = "raw")
kappa1 <- mxPath(from = "one", to = "depression", arrows = 1, free = FALSE,
                 values = 0, labels = "kappa1")
thetas1 <- mxPath(from = c("x1", "x2", "x3", "x4", "x5"), arrows = 2,
                  free = FALSE, values = 1)
thresholds1 <- mxThreshold(vars = c("x1", "x2", "x3", "x4", "x5"),
                           nThresh = c(3, 3, 3, 3, 3), free = TRUE,
                           values = c(1, 1.5, 2.5,  1.5, 2, 2.5,  1, 2, 2.5,
                                      .5, 1.5, 2,  1, 2, 2.5),
                           labels = c("t11", "t12", "t13",  "t21", "t22", "t23",
                                      "t31", "t32", "t33",  "t41", "t42", "t43",
                                      "t51", "t52", "t53"))
m1 <- mxModel("m1", type = "RAM",
              manifestVars = c("x1", "x2", "x3", "x4", "x5"),
              latentVars = "depression",
              dataRaw1, alphas, kappa1, lambdas, thetas1, varDep, thresholds1)

# model for group 2
dataRaw2 <- mxData(observed = rawData2, type = "raw")
kappa2 <- mxPath(from = "one", to = "depression", arrows = 1, free = TRUE,
                 values = .5, labels = "kappa2")
thetas2 <- mxPath(from = c("x1", "x2", "x3", "x4", "x5"), arrows = 2,
                  free = TRUE, values = .7)
thresholds2 <- mxThreshold(vars = c("x1", "x2", "x3", "x4", "x5"),
                           nThresh = c(3, 3, 3, 3, 3), free = TRUE,
                           values = c(1, 1.5, 2,  1.5, 2.2, 2.7,  1, 1.5, 2,
                                      .5, .7, 1,  1, 1.5, 2),
                           labels = c("t11", "t12", "t2_13",  "t21", "t2_22", "t2_23",
                                      "t31", "t2_32", "t2_33",  "t41", "t2_42", "t2_43",
                                      "t51", "t2_52", "t2_53"))
m2 <- mxModel("m2", type = "RAM",
              manifestVars = c("x1", "x2", "x3", "x4", "x5"),
              latentVars = "depression",
              dataRaw2, alphas, kappa2, lambdas, thetas2, varDep, thresholds2)

# combine and run models
MGOrd <- mxModel("MG Configural Invariance", m1, m2,
                 mxFitFunctionMultigroup(c("m1.fitfunction", "m2.fitfunction")))
MGOrdFit <- mxRun(MGOrd)
MGOrdRef <- mxRefModels(MGOrdFit, run = TRUE)
summary(MGOrdFit, refModels = MGOrdRef)
https://www.guilford.com/add/kline/radloff-lavaan.r
What is your output from mxVersion()?

Please see the output below:
Hi
I think there are two problems. One is that the model as specified is not identified. This can be deduced from its number of free parameters being greater than the number of parameters in the saturated model. The main reason for under-identification is that when thresholds are estimated, it is not possible to simultaneously estimate the residual variances. Mehta PD, Neale MC, Flay BR: Squeezing interval change from ordinal panel data: latent growth curves with ordinal outcomes. Psychol Methods 2004; 9:301-333 describes how to avoid the issue by fixing the first two thresholds of each item to arbitrary values, which identifies the means and variances. I have modified your script to fix the thresholds (0 and 1 are more commonly used for this purpose, but the values you supplied work as well, and it is a matter of a simple transform to get back to the original z-score scale for the thresholds if needed). You may wish to liberate the parameters for the item means as well - zero is probably a poor choice :) and likely contributes to the poor fit of the model.
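The threshold fix from Mehta et al. can be sketched as follows for one item (a sketch, not the verbatim modification; only the free/values pattern matters, and the labels are taken from the original script):

```r
library(OpenMx)

# Fix the first two thresholds to arbitrary anchor values so that the item's
# residual variance and mean become estimable (Mehta, Neale, & Flay, 2004).
# Shown here for x1; repeat the same pattern for x2-x5.
thresholds1 <- mxThreshold(vars = "x1", nThresh = 3,
                           free   = c(FALSE, FALSE, TRUE),  # first two fixed
                           values = c(0, 1, 2.5),           # 0 and 1 are the common choice
                           labels = c("t11", "t12", "t13"))
```

With the first two thresholds anchored, freeing the corresponding residual variance and mean no longer produces an under-identified item.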
The second issue is that the starting values were not that great, and with ordinal data it can be difficult to reach the solution precisely on the first try. Therefore I substituted mxTryHard() for mxRun(), which seemed to solve the optimization issue. There is also an mxTryHardOrdinal(), which might have worked better, but I forgot to use it here.
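The substitution is a one-line change (a sketch, using the object names from the original script):

```r
# mxTryHard() re-fits the model from randomly perturbed starting values
# until it obtains an acceptable solution (or exhausts its attempts)
MGOrdFit <- mxTryHard(MGOrd)

# mxTryHardOrdinal() does the same with defaults tuned for ordinal data:
# MGOrdFit <- mxTryHardOrdinal(MGOrd)
```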
Thank you so much for helping with this, and for the reference. I thought the across-group equality constraints (not fixed values) on the thresholds would help with identification overall. 20 parameters are estimated in m1 and 26 in m2, but due to 6 equality constraints on thresholds (the first two thresholds of the first item, and the first threshold of each remaining item), 40 parameters are freely estimated overall. The total number of observations is 50. I merely followed the book and the lavaan syntax/output.
Does that mean each model (m1 and m2) should be identified on its own in OpenMx (but not in lavaan or Mplus)?

If we fix the first two thresholds for each item in each group, would this still produce the result for configural invariance, given that those thresholds are not freely estimated?
I am sorry to trouble you with this, but I am just wondering if there is something unique to OpenMx when multiple groups are of interest, or if it is something I overlooked but could not identify.
Please find the lavaan syntax and output for the same problem below.
First, my mxVersion() output:

No. You can indeed use cross-group equality constraints to identify the overall fitted model, and it appears that what you've done actually is sufficient to identify the model. At least, the mxCheckIdentification() function says that your model is identified (both at the start values and at the dubious solution). I also tried using mxCheckIdentification() to check your model under the v2.11.5 release, but it appears there was a bug in that function in the release, which has since been repaired in the source repository. But there appears to be a bug in mxRefModels() as well, which has NOT been repaired since v2.11.5. The saturated model in your case should have 50 free parameters, not 35: fifteen thresholds and 10 polychoric correlations in each of the 2 groups. But here are the free parameters in the "saturated" model:

What's happening in the "saturated" model is that all the thresholds are constrained equal across groups. I verified that by examining MGOrdRef$Saturated$'Saturated m1'$thresholdDeviations and MGOrdRef$Saturated$'Saturated m2'$thresholdDeviations. Both had the same set of labels.

So, now I don't think that identification is your problem. I think it's just a matter of getting an analysis of ordinal data to converge (which can be tricky). I have a few tips that I'll put in a subsequent post.
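The label check described above can be done in a couple of lines (a sketch, assuming the MGOrdRef object from the original script):

```r
# if the two "saturated" sub-models share threshold labels, the thresholds
# are (wrongly) constrained equal across groups
labs1 <- MGOrdRef$Saturated$`Saturated m1`$thresholdDeviations$labels
labs2 <- MGOrdRef$Saturated$`Saturated m2`$thresholdDeviations$labels
identical(labs1, labs2)  # TRUE would confirm the cross-group constraint
```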
This is fixed in our code repository. To get things to work in the current release, you can do this:
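The original workaround code is not reproduced here, but one way to sketch the idea: generate the reference models without running them, relabel the group-2 saturated thresholds so they are no longer shared with group 1, and then run them yourself. The object names follow the original script; the relabeling details are an assumption, not the verbatim fix from this post:

```r
ref <- mxRefModels(MGOrdFit, run = FALSE)
sat <- ref[['Saturated']]

# give group 2's saturated thresholds their own labels (assumed workaround),
# so they are no longer constrained equal to group 1's
td2 <- sat$`Saturated m2`$thresholdDeviations
ok  <- !is.na(td2$labels)
td2$labels[ok] <- paste0('g2_', td2$labels[ok])
sat <- mxModel(sat, mxModel(sat$`Saturated m2`, td2))

ref[['Saturated']]    <- mxRun(sat)
ref[['Independence']] <- mxRun(ref[['Independence']])
summary(MGOrdFit, refModels = ref)
```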
Rob's right - I had thought that the residual variances were free in both groups, but they are free in only one group, which is okay. Thank you for posting the issue, which identified a bug in the saturated-model function. This bug is now repaired in the bleeding-edge source repository.
First, be advised that if you're analyzing ordinal data via maximum likelihood, OpenMx will sometimes report status Red (status code 5 or 6) even when it has found a local minimum. The multivariate-normal probability integral has no closed-form expression, so it must be approximated numerically. The algorithms that approximate it are subject to numerical inaccuracy, and in some cases, the numerical error is of such magnitude that OpenMx cannot be sure it has found a local minimum.
Here are some things you can do (they are not necessarily mutually exclusive):
1. SADMVN, the algorithm OpenMx uses to evaluate the multivariate-normal probability integral, can be given an upper limit, 'mvnRelEps', on how much relative error is acceptable in the probabilities it reports to OpenMx. You can reduce this limit (i.e., request less error from SADMVN) with something like
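For example (a sketch; the factor of 5 is illustrative, not a recommendation):

```r
# tighten SADMVN's relative-error tolerance to one fifth of its current
# value; smaller values mean more accurate integrals but slower fitting
mxOption(NULL, "mvnRelEps", mxOption(NULL, "mvnRelEps") / 5)
```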
before you run your MxModel. Reducing 'mvnRelEps' will make SADMVN, and therefore OpenMx, run more slowly. Be careful: it is possible to set 'mvnRelEps' so low as to make it mathematically impossible for SADMVN to satisfy it. In that case, OpenMx will warn you about it, but maybe after running for a very long time.
2. You can replace mxRun() with mxTryHardOrdinal().

3. You can run your model through mxAutoStart() before running it.
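Tips 2 and 3 can be combined in a few lines (a sketch, assuming the MGOrd model from the script earlier in the thread):

```r
# get data-driven starting values first, then hand the model to the
# ordinal-aware repeated-attempt optimizer instead of a single mxRun()
MGOrd2   <- mxAutoStart(MGOrd)
MGOrdFit <- mxTryHardOrdinal(MGOrd2)
summary(MGOrdFit)
```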