General strategies to deal with non-convex Hessian matrix

Ben
General strategies to deal with non-convex Hessian matrix

Hello everyone!

I am working on an org psych paper that is currently under review. The model is a fairly complex multilevel mediation SEM with four latent variables (only one is exogenous/independent; the others are endogenous mediators/outcomes). I also include the measurement models for each variable (i.e., the respective items as manifest variables). Until now I have estimated this as a 2-level model using lavaan. For the review I am trying to add a third level to control for further nestedness.

Thank you for providing and maintaining OpenMx; I am grateful there is a stable R package for this task! Thank you also for all the guidance on this forum, which has been of great help so far.

I am building up the model step-wise, both with regard to the levels and to the variables I consider. The results generally match the ones I get with lavaan. Unfortunately, though, I often run into problems with non-convex (non-positive-definite) Hessian matrices, and I'm missing an intuition for troubleshooting them. For example, fixing the regression coefficient between the two variables on both levels of a two-level, two-latent-variable model solved the problem. This was surprising, as the model converged in lavaan even when both parameters were free.

Coming from lavaan, I'm not used to the level of detail OpenMx allows. As I want to be respectful of your time, I want to make sure I've done my homework before I hit you with different models and dozens of lines of code for each. (I also imagine others might run into similar problems with different models, so more general heuristics might be helpful.)

  • Are there any major resources I might have overlooked for troubleshooting this problem?
  • Are there any general requirements related to the Hessian that I have to consider and might be missing? (Too many/few specified paths?)
  • Do you have helpful heuristics I can play around with? (e.g., with regard to starting values?)

(I've checked the example models here, https://openmx.ssri.psu.edu/node/4480, e.g., ex3, lmer-1, xxm-3, ex936. Another thread comparing lavaan and OpenMx here was also very helpful.)

Otherwise, I'm happy to share my code and walk you through the models step by step.

(As the paper is still under review, I'm hesitant to share the full data set. Are there any best practices for dealing with this? Is a partial set of the data enough, or should I generate random data from the observed covariance matrix - which would, however, destroy the multilevel structure?)

AdminRobK
bound condition?

Are one or more of your free parameters near a bound? If so, then you are probably encountering this issue, and you can ignore the warning about the non-PD Hessian.
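To check, something like this minimal sketch should do it, assuming the fitted model object is called fit, that the summary's parameter table carries the lbound/ubound columns (as it typically does), and an arbitrary closeness threshold:

    # Free-parameter table from the summary, including any specified bounds
    pars <- summary(fit)$parameters

    # Distance of each estimate from its nearest specified bound (NA if unbounded)
    near_bound <- pmin(abs(pars$Estimate - pars$lbound),
                       abs(pars$ubound - pars$Estimate), na.rm = TRUE) < 1e-4

    pars[which(near_bound), c("name", "Estimate", "lbound", "ubound")]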

Ben
Don't think so

In many cases, both bounds have not been calculated (they are NA), and I get the triple exclamation marks explained here, https://openmx.ssri.psu.edu/node/4303. Also, many of the standard errors are very high (1-2 orders of magnitude larger than the estimates).

For some reason I also can't replicate the model that worked after fixing the regression coefficient, so I'll prepare the models to share them, as I might just have misspecified something.

Is there something about the data I should be aware of? I have many, quite small clusters, so lavaan warns of low or zero variance in some of them. Can that lead to trouble?
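(For reference, this is roughly how I've been looking at cluster sizes and within-cluster variances outside of either package; dat, cluster3 and y1 are placeholders for my data frame, level-3 grouping variable, and one indicator:)

    # Cluster sizes of the level-3 grouping variable
    table(dat$cluster3)

    # Within-cluster variance of one indicator; zero or NA values flag
    # clusters with no variability (or only one observation)
    sapply(split(dat$y1, dat$cluster3), var, na.rm = TRUE)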

Ben
Missing data issue!?

I got the model to converge after dropping all rows with missing data in advance (listwise deletion). I'm not sure what the problem was "under the hood." Is this a known problem that can occur with missing data?
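(What I did amounts to something like this, with dat standing in for my data frame:)

    # Listwise deletion before handing the data to OpenMx
    dat_complete <- na.omit(dat)
    mxd <- mxData(observed = dat_complete, type = "raw")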

As there were times during troubleshooting when I was on the verge of quitting my OpenMx experiment, and I think others might actually take that step, it would be valuable to document the troubleshooting here. For example, the documentation currently reads as if dealing with missing data is a strength of OpenMx rather than a potential source of problems:

"These days, the standard approach for model fitting applications is to use raw data, which is simply a data table or rectangular file with columns representing variables and rows representing subjects. The primary benefit of this approach is that it handles datasets with missing values very conveniently and appropriately."

"In the case where the dataset is complete, in other words there are no missing data, there is no advantage to using raw data. For our example, we can easily create a covariance matrix based on our data set by using R’s var() function, in the case of analyzing a single variable, or cov() function, when analyzing more than one variable."

For more application-focused people like me, it's good to know what to do on a pragmatic level in a case like this. I guess this is one of those situations where people who have "grown up" with simple tools that take away a lot of decisions now find themselves confronted with more freedom and more responsibility for proper modeling. Thinking about this from a pedagogical angle seems important, though: I know of few people in my field who use OpenMx, even though, from all I've learned so far, it is far more powerful and transparent than many existing tools. I think our field would benefit from a more mindful and explicit approach to modeling, and I therefore hope more people will adopt OpenMx.

I hope this will also help me with the other models. I'll keep you posted :)

AdminNeale
Check positive definiteness of expected covariance matrices.

Hi

There are other threads that suggest how to troubleshoot this type of problem. I feel a bit in the dark due to the lack of code, but one potential issue is that the starting values are bad, so that the likelihood evaluates to zero (or is undefined) and the log-likelihood cannot be calculated. To figure out whether that might be the case, you can extract the expected covariance matrices for the various groups in the model with mxGetExpected(model, 'covariance'). Then examine the eigenvalues of these matrices; if any are negative, you know you are starting in a bad place. The usual fix is to increase the variable-specific variances and decrease the covariances. Possibly, the expected mean is way out of line too, so compare the observed and expected means in terms of their respective standard deviations (square roots of the diagonals of the expected covariance matrices). If the discrepancy is more than 2 standard deviations, adjust the mean or increase the variance so it isn't so great an outlier. Still, all this might be worthless, as 'did not converge' isn't a very informative error message, even though other packages use it. The error messaging from OpenMx is a bit harder to understand, but it's a lot more specific about what might be going wrong.
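A minimal sketch of that check, assuming the (possibly unfitted) model object is called model and dat is the raw data frame; for a multilevel model you may need to run this per submodel:

    # Model-implied covariance matrix and means at the current (start) values
    expCov  <- mxGetExpected(model, "covariance")
    expMean <- as.vector(mxGetExpected(model, "means"))

    # Negative eigenvalues => the starting covariance matrix is not positive definite
    eigen(expCov, only.values = TRUE)$values

    # Observed vs. expected means in units of the model-implied SDs;
    # absolute values much larger than 2 suggest bad starting means
    obsMean <- colMeans(dat[, colnames(expCov)], na.rm = TRUE)
    (obsMean - expMean) / sqrt(diag(expCov))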

HTH

Ben
Thank you very much Neale!

Thank you very much Neale! I am sorry if I overlooked past discussions on this. I had gone through the posts mentioning "Hessian", but I only just learned how to make the search function work for phrases.

I tried your suggestion; negative eigenvalues don't seem to be the issue here. By now, mxRun() does give a result, but it returns a warning about non-zero status code 5 and a non-convex Hessian. The standard errors of my result are gigantic. I also used the estimates as starting values, which didn't solve the problem, so I'm not sure starting values are the issue.

Overall, my third-level estimates, especially of the error terms, are very small (often between .001 and .01) and oftentimes negative. Is it possible that my third-level clusters are just too weak for this analysis to make sense? This is for a review and I am following the editor on it, but in this case I might put my energy into arguing why controlling for the third level doesn't work instead of fighting my way through the models themselves...

lf-araujo
Did you try autostarting the model?

Did you try autostarting the model with mxAutoStart() before running?
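Roughly like this sketch, assuming the unfitted model object is called model:

    # Replace the hand-picked start values with automatically computed ones,
    # then fit as usual
    model <- mxAutoStart(model)
    fit   <- mxRun(model)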

AdminNeale
Underidentified model?

Hi

It seems like your model is underidentified. Large SEs for the estimates and a non-positive-definite Hessian (the covariance matrix of the parameter estimates) are signs of this problem. Maybe try mxCheckIdentification() on your model to figure out which parameters may not be identified? This would perhaps locate parameters for which the sparseness (or absence) of data is too great for them to be estimated. Trying to estimate a variance when only one observation is available will typically be pathological, as the estimate of the variance heads towards zero and the likelihood becomes undefined (the inverse of the variance is needed for this, and x/0 isn't defined).
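A minimal sketch of that call, assuming the model object is called model; as far as I recall, the returned list includes the names of any apparently non-identified parameters:

    id <- mxCheckIdentification(model)
    id$status                     # TRUE if the model appears (locally) identified
    id$non_identified_parameters  # names of parameters flagged as not identified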

HTH!

Ben
mxCheckIdentification, mxAutoStart and multi-level

Thank you for your suggestions. mxAutoStart unfortunately doesn't help, as I am working with a multilevel model and it is currently not implemented for multilevel models.

mxCheckIdentification() is interesting too. I get a long list of non-identified parameters, which are basically all the parameters of my higher-level models. Does this mean I have misspecified all of them, or is this again about the fact that it's a submodel/multilevel model?

I'm not sure it's the model itself. After starting from a 2-level lavaan model and lots of tweaking and trial-and-error, I found start values that worked... but they seem to be very brittle. I lost them after calculating some variants I need to report, for which I made the mistake of changing some of the values again. (I thought I had changed the copied code, but alas, apparently I hadn't.)

It's quite Sisyphean work. I can replicate the model when using the estimates from the earlier mxRun(), which I saved as an .rds file, as start values. But it's not a very elegant solution, and it looks a bit fishy to someone who doesn't know OpenMx. Are the start values saved in the mxRun() object somewhere?
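(For reference, my carry-over workaround amounts to roughly this sketch, where fit.rds holds the earlier fitted model, model is the freshly specified one, and the free parameters are labelled:)

    fit <- readRDS("fit.rds")

    # Copy the earlier estimates into the new model as start values,
    # matching free parameters by their labels
    model <- omxSetParameters(model,
                              labels = names(coef(fit)),
                              values = coef(fit))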

And do you have a heuristic/intuition with regard to starting values? For example, are absolute differences more important, or the right order of magnitude? (Is it more important to start a variance of 2 at 2 rather than at 1, or to be in the right ballpark about whether error terms are .1 or .01?) I suspect that small values are especially dangerous, given that some of the matrices are inverted. It also seems odd to define different start values for the factor loadings of items of the same latent variable, even though they differ only by a factor of 2.

As I've been using lavaan in the past, I have little experience with OpenMx, and I'm not sure whether I'm on the wrong path and missing something obvious, or just extremely lucky to have found something that works at all.

lf-araujo
MWE

At this point, Ben, the best way to help you would be to have access to a minimal working example: a simplified version of your model and a bit of simulated data with characteristics similar to your real data. There is an ancient function I found on this very forum, which I have never used, that is supposed to do exactly that (attached).

File attachments: 
AdminNeale
mxGenerateData() function for de-identifying & anonymizing data

Hi

The built-in mxGenerateData() function is useful for generating data from the covariance and mean structure specified in an mxModel. So if you have real data you don't want to share, use the function to replace the original data in your script so that you can then safely share it. One slight gotcha is that the same problem may not occur with the mxGenerateData'd version of the script. If that is the case, it suggests some anomaly in your data, such as extreme values (e.g., over 4 standard deviations from the model-expected mean, in terms of the square root of the corresponding model-expected variance).
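A minimal sketch, assuming fittedModel is the fitted mxModel containing the real raw data (for multilevel/relational models the simulated data may come back as a list, one element per submodel):

    # Simulate a dataset from the model-implied mean and covariance structure
    fakeData <- mxGenerateData(fittedModel)

    # Swap the simulated data in for the real data before sharing the script
    safeModel <- mxModel(fittedModel, mxData(observed = fakeData, type = "raw"))

    # Alternatively, let OpenMx return the model with its data already replaced
    safeModel <- mxGenerateData(fittedModel, returnModel = TRUE)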