You are here

Standard Errors reported as NA for identified model when N=5000 but not N=500 nor N=50

In many different, complicated, models, I have found Standard Errors to return as NA. Here is a simple example! Changing N to something smaller yields sensible-looking SE's. With 5000, where the SE's should be small but not minuscule (this is ordinal data) we get a lot of NA's.

free parameters:
                                      name              matrix row     col   Estimate  Std.Error lbound ubound
1                    thresholdModel.L[1,1]                   L   1       1  0.6910161         NA  -0.99   0.99
2                    thresholdModel.L[2,1]                   L   2       1  0.7280372 0.01354229  -0.99   0.99
3                    thresholdModel.L[3,1]                   L   3       1  0.6919454         NA  -0.99   0.99
4  thresholdModel.thresholdDeviations[1,1] thresholdDeviations   1 banana1 -0.6766820         NA   -Inf       
5  thresholdModel.thresholdDeviations[2,1] thresholdDeviations   2 banana1  0.6919464 0.01488096   0.01       
6  thresholdModel.thresholdDeviations[3,1] thresholdDeviations   3 banana1  0.6686823 0.01630470   0.01       
7  thresholdModel.thresholdDeviations[1,2] thresholdDeviations   1 banana2 -0.6241117 0.01629688   -Inf       
8  thresholdModel.thresholdDeviations[2,2] thresholdDeviations   2 banana2  0.6395418 0.01358104   0.01       
9  thresholdModel.thresholdDeviations[3,2] thresholdDeviations   3 banana2  0.6427120 0.01574627   0.01       
10 thresholdModel.thresholdDeviations[1,3] thresholdDeviations   1 banana3 -0.6534246         NA   -Inf       
11 thresholdModel.thresholdDeviations[2,3] thresholdDeviations   2 banana3  0.6685782         NA   0.01       
12 thresholdModel.thresholdDeviations[3,3] thresholdDeviations   3 banana3  0.6632309 0.01444722   0.01       
 
observed statistics:  15000 
estimated parameters:  12 
degrees of freedom:  14988 
-2 log likelihood:  39035.51 
number of observations:  5000 
Information Criteria: 
     df Penalty Parameters Penalty Sample-Size Adjusted
AIC:   9059.509           39059.51                   NA
BIC: -88620.183           39137.72             39099.58
timestamp: 2014-07-27 14:20:36 
wall clock time: 0.254709 secs 
OpenMx version number: 2.0.0.3651 
AttachmentSize
Binary Data thresholdModel1Factor3Variate.R2.48 KB
Reporter: 
Created: 
Sun, 07/27/2014 - 15:26
Updated: 
Mon, 07/28/2014 - 20:04

Comments

NPSOL does not reach the MLE of 39034.359 for a sample size of 5000. Try CSOLNP. It works better.

So it does! This is perhaps a use case where fine-tuning the step size as NPSOL does automatically is counterproductive. The following syntax

> summary(thresholdModelrun2 <- mxRun(thresholdModelrun))

with NPSOL gets to the ML solution, and the SE's agree with those post-CSOLNP. This raises the question as to what should be recommended to the user when Standard Errors come back as NA. It should probably go in the FAQ, but I would say a message to the effect of:

"Some standard errors are reported as NA. This may be because the model is not identified, or because optimization has not finished successfully. To check optimization completion, try:

summary(refittedModel <- mxRun(fittedModelFromPriorRun))
  • If we could capture what fittedModelFromPriorRun should actually be and substitute it in the error message, so much the better.

Good idea!
NAs in the SE column are a nice warning sign, but users often think OpenMx or their code is broken… when neither may be the case.

Be good to get something like this to run as a note at the end of mxSummary() (perhaps after mxRun also?)

if(any(is.na(model$parameters$Std.Error)){
    message("Some standard errors are reported as NA.\n",
    "This may be because optimization has not finished successfully.\n",
    "To check optimization completion, re-run the model:\n",
    deparse(substitute(model)), " <- mxRun(", deparse(substitute(model)), ")\n",
    "summary(fittedModelFromPriorRun)\n",
    "(note: it is also possible that model ", deparse(substitute(model)), " is not identified)")
}