What advice should we give on this error with ordinal data. Reducing the number of categories appears to help at times, but what systematic ways are there to ensure that the data don't produce non-positive definite matrices during the run?

Tue, 03/01/2011 - 12:16

#1
Error: Objective function returned a value of NaN.

Hello!

The previous post is old, but I get the same error message when I run a model with two latent variables: one of them (SES) has two ordinal indicators, the other has four continuous indicators.

I attached the script and here is the output:

# Output

> print(m.thresh.run.summ)

data:
$`Modell mit Thresholdmatrix.data`
 income_cat  isced_max    AVM_1_week       EH_4_week         PHYS_ACTIV       LEIS_ACTIV
 1   : 680   1   :  85   Min.   :  0.00   Min.   :  0.000   Min.   :  0.00   Min.   : 0.00
 2   : 325   2   : 349   1st Qu.:  6.75   1st Qu.:  4.000   1st Qu.: 10.50   1st Qu.:12.54
 3   :1710   3   :1522   Median : 10.50   Median :  7.000   Median : 16.00   Median :21.99
 4   : 768   4   : 833   Mean   : 11.35   Mean   :  9.054   Mean   : 17.77   Mean   :25.32
 5   : 859   5   :1881   3rd Qu.: 15.25   3rd Qu.: 11.000   3rd Qu.: 22.50   3rd Qu.:35.23
 NA's: 337   NA's:   9   Max.   : 56.00   Max.   :120.000   Max.   :114.22   Max.   :89.00
                         NA's   :132.00   NA's   :381.000   NA's   : 94.00   NA's   :53.00

free parameters:
      name matrix        row        col Estimate     Std.Error lbound ubound
1       l1      A income_cat        SES      1.0           NaN
2       l2      A  isced_max        SES      1.0 2.758595e-313
3       g1      A          U        SES      1.0           NaN
4       l3      A AVM_1_week          U      1.0 1.909796e-313
5       l4      A  EH_4_week          U      1.0           NaN
6       l5      A PHYS_ACTIV          U      1.0           NaN
7       l6      A LEIS_ACTIV          U      1.0 4.243992e-313
8       e3      S AVM_1_week AVM_1_week      1.0           NaN
9       e4      S  EH_4_week  EH_4_week      1.0           NaN
10      e5      S PHYS_ACTIV PHYS_ACTIV      1.0           NaN
11      e6      S LEIS_ACTIV LEIS_ACTIV      1.0           NaN
12   mean3      M          1 AVM_1_week      0.1           NaN
13   mean4      M          1  EH_4_week      0.1           NaN
14   mean5      M          1 PHYS_ACTIV      0.1 3.182994e-313
15   mean6      M          1 LEIS_ACTIV      0.1           NaN
16          thresh          1 income_cat      1.0 2.121996e-314
17          thresh          2 income_cat      2.0           NaN
18          thresh          3 income_cat      3.0           NaN
19          thresh          4 income_cat      4.0           NaN
20          thresh          1  isced_max      1.0           NaN
21          thresh          2  isced_max      2.0           NaN
22          thresh          3  isced_max      3.0 6.365987e-314
23          thresh          4  isced_max      4.0 4.456191e-313

observed statistics: 27068

estimated parameters: 23

degrees of freedom: 27045

-2 log likelihood: NaN

saturated -2 log likelihood: NA

number of observations: 4679

chi-square: NA

p: NA

Information Criteria:
     df Penalty Parameters Penalty Sample-Size Adjusted
AIC        NaN                NaN                   NA
BIC        NaN                NaN                  NaN

CFI: NA

TLI: NA

RMSEA: NA

timestamp: 2012-07-20 11:34:34

frontend time: 0.234375 secs

backend time: 0.1875 secs

independent submodels time: 0 secs

wall clock time: 0.421875 secs

cpu time: 0.421875 secs

openmx version number: 1.2.4-2063

Warning message:

The job for model 'Modell mit Thresholdmatrix' exited abnormally with the error message: Objective function returned a value of NaN.

I would appreciate any help on this.

My final goal is to include four latent variables in the model: 'SES' as shown, 'U' as shown, 'G' as a latent variable with one ordinal indicator and 'A' as a latent variable with four continuous indicators.

Thanks in advance

Sophia

It appears that your model crashed at the very first iteration, as none of the starting values have changed (except for those that started at zero, for which OpenMx also tried 0.1). Your starting covariance matrix is positive definite and your thresholds are in increasing order, which rules out the two most common causes of this error. A NaN objective function value typically means that one or more rows of data have a likelihood of zero, which yields a log likelihood of -Inf and kills the optimization.
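The underflow that produces a zero likelihood is easy to demonstrate in base R (illustrative numbers, not taken from this model):

```r
# The normal density of a point far from the model-implied mean
# underflows to exactly zero in double precision
lik <- dnorm(120, mean = 0, sd = 1)  # an observation 120 SDs from the mean
lik                                   # 0
log(lik)                              # -Inf: one such row sinks the whole -2LL
```

Once any row contributes -Inf, the total log likelihood is no longer a finite number the optimizer can work with.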

Do you have any rows in your data that could be outliers? I had a similar issue recently with a mixture model, where the starting values implied that two variables were highly correlated. One individual had extreme values on those two variables (think -1 and +5 SDs from the mean, with a model-expected correlation of .99), and the likelihood of that response was so low that it underflowed to zero. In your case, you have people with scores of 50-120 on your continuous variables, while your starting values expect those variables to have means of zero and variances of 2 or 3.

I'd recommend that you try some more informed starting values. Set all of your initial means to something close to the data (11, 9, 18, 25, or just a constant number around 15) and multiply your factor loadings and residual variances by 4 or 5. Similarly, spread your starting thresholds around zero: you shouldn't expect the thresholds to all be above the variable's mean unless more than half of your sample selects the lowest category. Let us know if that fixes the problem.
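One way to automate this is to take starting means from the observed means and starting thresholds from normal quantiles of the observed cumulative category proportions. A base-R sketch (`dat` and its two columns are stand-ins for the real data set):

```r
# Stand-in data frame with one continuous and one ordinal indicator
dat <- data.frame(
  AVM_1_week = c(6, 10, 12, 15, 20),
  income_cat = ordered(c(1, 3, 3, 4, 5))
)

# Continuous indicators: start means and variances at the observed values
start_mean <- mean(dat$AVM_1_week, na.rm = TRUE)
start_var  <- var(dat$AVM_1_week,  na.rm = TRUE)

# Ordinal indicators: place starting thresholds at qnorm() of the
# cumulative category proportions, so they straddle zero and stay ordered
p <- prop.table(table(dat$income_cat))
start_thresh <- qnorm(cumsum(p)[-length(p)])
```

Thresholds chosen this way are guaranteed to be strictly increasing, and they sit below zero for categories chosen by less than half the sample and above zero otherwise.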

yes, thank you!!!

Setting the starting values of the means to the observed means of the data and multiplying the factor loadings by 4 or 5 did the trick: my model can now be estimated. Where did you see that I should multiply the factor loadings by 4 or 5?

Just a guess based on the approximate scale of the variables in your summary. Remember that your model is being fit to the data covariance matrix, either explicitly with an ML objective or implicitly when using FIML, so you should pick starting values that are as close as possible to the eventual solution. In particular, try to make sure the model-implied variances are reasonably close to the observed variances. You don't have to be exact, but don't imply a variance of 1 when the data variance is 50.
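As a rough sanity check with made-up numbers: for a single indicator in a one-factor model, the implied variance is loading² × factor variance + residual variance, so you can rescale the default start values to match the data's scale:

```r
lambda <- 1   # starting factor loading
psi    <- 1   # starting factor variance
theta  <- 1   # starting residual variance
implied_var  <- lambda^2 * psi + theta   # 2: what the start values imply
observed_var <- 50                        # roughly the scale of the data above

# Rescale the loading and residual so the implied variance matches the data
s      <- sqrt(observed_var / implied_var)
lambda <- lambda * s
theta  <- theta  * s^2
lambda^2 * psi + theta                    # now 50, on the data's scale
```

With these numbers s = 5, which is consistent with the "multiply by 4 or 5" suggestion above.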