Fitting model with three latent variables and 16 ordinal and four continuous indicators

pehkawn Joined: 05/24/2020
I am having trouble fitting a SEM with three latent variables with the following specification: An endogenous latent variable, created from four continuous manifest indicator variables, is regressed on two exogenous latent variables created from multiple ordinal manifest indicator variables. Prior to creating the SEM model I ran a CFA model with a similar specification, but where the three latent variables covary. I used the resulting factor loading values as a starting point for the SEM model.

However, while the CFA model will return a solution after about 30 min (albeit with warnings), the SEM model does not return a solution. I let the model run for two days before I killed the process.

First of all, is this expected behaviour of OpenMx, or is something wrong with my model causing the kernel to hang? I would have assumed that even if the model does not converge, the process will be interrupted after some number of iterations.
This led me to wonder what may be wrong with my model specification. Some pointers on how I can improve my model would be much appreciated.

Replied on Thu, 03/24/2022 - 11:55
AdminRobK Joined: 01/24/2014

I'm counting 17 ordinal variables in your SEM model. Be advised that the algorithm that evaluates multivariate-normal probability integrals scales poorly with dimension (I believe its running time is exponential in dimension). So, analyzing ordinal data via maximum-likelihood is really only computationally feasible for about 10 or fewer ordinal variables. You might need to fit your models with weighted least squares instead of maximum likelihood.
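
A minimal sketch of that switch, assuming an existing MxModel that already holds raw data via `mxData(..., type = "raw")`. The helper name `refit_with_wls` is hypothetical and not part of OpenMx; only `mxModel()`, `mxFitFunctionWLS()` and `mxRun()` are OpenMx functions.

```r
library(OpenMx)

# Hypothetical helper (not an OpenMx function): swap the fit function of an
# existing MxModel for weighted least squares and refit it.
# Assumes `model` contains raw data, i.e. mxData(..., type = "raw").
refit_with_wls <- function(model, type = c("DWLS", "WLS", "ULS")) {
  type <- match.arg(type)
  # Adding a fit function to an existing model replaces the one it had.
  wls_model <- mxModel(model, mxFitFunctionWLS(type = type))
  mxRun(wls_model)
}
```

Something like `fit_dwls <- refit_with_wls(semModel)` would then refit a model stored under the (assumed) name `semModel` with diagonally weighted least squares.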
Replied on Fri, 03/25/2022 - 19:30
pehkawn Joined: 05/24/2020

In reply to by AdminRobK

Thanks, I should have considered that. However, it was hard to find anything on WLS estimation in the documentation. If I understand correctly, OpenMx uses FIML by default, and it should therefore not be necessary to impute missing data. With WLS, I would think this is necessary. I tested parts of my model with both, and the two methods produce very different estimates, and I would think FIML is the most accurate. Would you recommend any methods/packages for imputing data?
Replied on Mon, 03/28/2022 - 08:36
AdminNeale Joined: 03/01/2013

In reply to by pehkawn

Hi, I'm not a fan of imputing data - it's only as good as the model used to impute it. Multiple imputation is quite a bit of work, and, like everything else, initial optimism about the small number of imputations needed often proves unfounded. There is a mice package for multiple imputation. I'm beginning to think that WLSMV is better for your application, where the correlations have initially been estimated by ML (not pseudo-ML with the thresholds fixed to marginal totals, for this bakes in any biases due to data being MAR not MCAR).
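
For the multiple-imputation route mentioned above, a minimal sketch with the mice package might look as follows. The helper name `impute_datasets` and the defaults `m = 20` and `seed = 1` are assumptions for illustration only. The sketch only produces the completed data sets; fitting the model to each one and pooling the results is not shown.

```r
library(mice)

# Hypothetical helper: create `m` completed copies of a data frame that
# contains missing values, using mice's default imputation methods.
impute_datasets <- function(data, m = 20, seed = 1) {
  imp <- mice(data, m = m, seed = seed, printFlag = FALSE)
  # complete() extracts the i-th imputed data set from the mids object.
  lapply(seq_len(m), function(i) complete(imp, action = i))
}
```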
Replied on Tue, 03/29/2022 - 10:16
pehkawn Joined: 05/24/2020

In reply to by AdminNeale

Thanks for your reply.
I'd avoid imputing data if at all possible. However, it might be necessary, since the estimates vary considerably between WLS and ML.
The approach you describe sounds interesting. I couldn't really find much on WLSMV; from what I could figure out, Mplus uses this method by default for categorical variables. However, I couldn't really figure out how to implement it in OpenMx. Do you have any articles, tutorials, or example scripts you can recommend?
Replied on Wed, 03/30/2022 - 17:59
pehkawn Joined: 05/24/2020

In reply to by AdminNeale

To add to my previous comment, I've been testing a partial model with fewer ordinal variables (to be able to get ML estimates). One thing I noticed is that both ML and WLS produced inconsistent estimates. If I rerun the exact same model without any modifications, I will still get widely varying estimates. In comparison, DWLS estimates seem to produce far more consistent and reasonable outputs (see attachment).
How does WLSMV differ from the "DWLS" option in mxFitFunctionWLS()?
I've been trying to find some information on WLSMV, and [this comment](https://groups.google.com/g/lavaan/c/Nymu7jmVUk8/m/Su_dhLMgBwAJ) claims "WLSMV is just a keyword in the Mplus language that simultaneously requests the DWLS estimator and a mean- and variance-adjusted (MV) chi-squared test statistic. (...) lavaan implements the same as Mplus describes in its technical literature (available on their website), which can be requested using the arguments:

lavaan(..., estimator = "DWLS", se = "robust.sem", test = "scaled.shifted") "

Is there a similar way for implementing WLSMV in OpenMx? I am also curious as to how this fit function treats missing data. Unless it's imputed, I gather it will use listwise deletion?

Replied on Thu, 03/31/2022 - 11:37
AdminRobK Joined: 01/24/2014

In reply to by pehkawn

You do not have to do anything specific to "do WLSMV" in OpenMx. As long as the OpenMx backend has the full weight matrix (which it will if you are using `mxFitFunctionWLS()` with raw data, irrespective of whether you're using `type="DWLS"`, `type="WLS"`, or `type="ULS"`), then, post-`mxRun()`, your MxModel will be populated with the WLSMV chi-square statistic.

As Terrence D. Jorgensen said in the comment you linked, "WLSMV" is a kind of correction to standard errors and the goodness-of-fit chi-square statistic, and not a kind of estimator.

Edit: as long as it has the full weight matrix, OpenMx *always* reports robust standard errors when fitting a model with `mxFitFunctionWLS()`.
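
A minimal sketch of how those quantities could be pulled out after `mxRun()`, assuming a model already fitted with `mxFitFunctionWLS()`. The helper name `wls_fit_summary` is hypothetical, and the element names (`Chi`, `ChiDoF`, `p`, `parameters`) are the ones I believe `summary()` returns for an MxModel, so they are worth checking against your installed version.

```r
library(OpenMx)

# Hypothetical helper: collect the chi-square test and the standard errors
# that summary() reports for an MxModel fitted with mxFitFunctionWLS().
wls_fit_summary <- function(fit) {
  s <- summary(fit)
  list(
    chi_square = s$Chi,      # goodness-of-fit chi-square
    df         = s$ChiDoF,   # its degrees of freedom
    p_value    = s$p,
    estimates  = s$parameters[, c("name", "Estimate", "Std.Error")]
  )
}
```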

Replied on Fri, 04/01/2022 - 10:23
AdminRobK Joined: 01/24/2014

In reply to by pehkawn

> To add to my previous comment, I've been testing a partial model with fewer ordinal variables (to be able to get ML estimates). One thing I noticed is that both ML and WLS produced inconsistent estimates. If I rerun the exact same model without any modifications, I will still get widely varying estimates. In comparison, DWLS estimates seem to produce far more consistent and reasonable outputs (see attachment).

Does the fitfunction value improve (i.e., decrease) when you re-run the MxModel?

OpenMx has two functions that might be useful to you. One, mxCheckIdentification(), checks to see if the model is locally identified. The other, mxTryHardOrdinal(), makes multiple attempts to fit the MxModel, randomly perturbs start values between tries, and returns the result of the "best" try.
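
A minimal sketch combining the two, assuming an existing MxModel. The helper name `check_and_refit` and the default of 10 extra tries are assumptions for illustration; whether the identification check covers every part of a model with thresholds is best confirmed in the function's documentation.

```r
library(OpenMx)

# Hypothetical helper: check local identification, then refit from several
# randomly perturbed sets of start values and keep the best attempt.
check_and_refit <- function(model, extra_tries = 10) {
  id <- mxCheckIdentification(model, details = TRUE)
  if (!id$status) {
    message("Model is not locally identified. Parameters involved: ",
            paste(id$non_identified_parameters, collapse = ", "))
  }
  # extraTries is passed on to mxTryHard(), which mxTryHardOrdinal() wraps.
  fit <- mxTryHardOrdinal(model, extraTries = extra_tries)
  list(identified = id$status, fit = fit)
}
```

If repeated runs of the returned `fit` still give different estimates at the same fit function value, that would point towards an identification problem rather than an optimisation one.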

Replied on Thu, 03/31/2022 - 09:11
pehkawn Joined: 05/24/2020

In reply to by AdminNeale

[Update:]
According to the [OpenMx v. 2.11 release logs](https://openmx.ssri.psu.edu/node/4413), WLSMV was implemented in this update ( mxFitFunctionWLS() ). Am I correct in assuming that this is the same as DWLS in OpenMx (mxFitFunctionWLS(type = "DWLS"))?

This leads me to the latter half of your comment: "(...) WLSMV is better for your application, where the correlations have initially been estimated by ML (...)". Could you elaborate on what you mean by that or how to proceed? Should I build partial models or a CFA estimated by ML, and use these estimates in a full model estimated by WLSMV/DWLS? In such case, I am wondering how to proceed. I already restricted all means and variances of the ordinal indicators and their latent variables to 1. (Only the continuous indicators and their latent variable have freely estimated means and variances.)

According to [Kline (2016, p. 301)](https://books.google.no/books?id=Q61ECgAAQBAJ&lpg=PP1&ots=jFin3pz9sg&dq=kline%202016%20principles%20and%20practice%20structural%20equation%20modeling&lr&pg=PP1#v=onepage&q=kline%202016%20principles%20and%20practice%20structural%20equation%20modeling&f=false), "a standardized solution where all variables have unit variance (1.0), standardized pattern coefficients for simple indicators (they depend on a single factor) are estimated Pearson correlations. In this case, squared standardized pattern coefficients are proportions of explained variance. If a standardized coefficient is .80, for example, then the factor explains $.80^2 = .64$, or 64.0% of the observed variance of that simple indicator."

If this is the case, for any latent variable with only simple indicators, all pattern coefficients $\lambda_i$ should be
$$
0 \leq \lambda_i \leq 1 ,
$$
and
$$
\sum_{i=1}^{n} \lambda_i^2 = 1 ,
$$
which they clearly are not. Should I constrain my model to meet these criteria for a proper standardized solution?
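
For reference, the quoted passage can be read as a decomposition that holds for each indicator separately rather than across indicators. A sketch, assuming a simple indicator $y_i$ with standardized pattern coefficient $\lambda_i$, a factor with unit variance, and a residual variance written here as $\theta_i$ (a symbol not used above):
$$
\operatorname{Var}(y_i) = \lambda_i^2 + \theta_i = 1
\quad\Longrightarrow\quad
\lambda_i^2 = 1 - \theta_i , \qquad |\lambda_i| \leq 1 .
$$
With Kline's example, $\lambda_i = .80$ gives $\lambda_i^2 = .64$ and $\theta_i = .36$. Under this reading each $\lambda_i^2$ is bounded by 1 on its own, and the passage does not by itself imply that the squared coefficients sum to 1 across indicators.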

Replied on Fri, 04/01/2022 - 11:51
mhunter Joined: 07/31/2009

I have sent an email to the OP containing the draft of our OpenMx WLS manuscript, which provides many details about OpenMx, WLS, ML, and ordinal variables. The manuscript has detailed answers to many of the questions implied above.