Standard Errors versus Confidence Intervals
| Attachment | Size |
|---|---|
| example.R | 2.53 KB |
| fake.data_.R | 7.95 KB |
When I run this code (a threshold model for ordinal data), the standard errors are incredibly large, yet the confidence intervals are relatively narrow. Are any of the results trustworthy? I've tried simplifying the thresholds (using labels to reduce the number of parameters), but it doesn't help. I'm not sure why this data is so hard to model.
If I treat the ordinal data as if it were quantitative, I have no trouble fitting a common factor model.
I've also noticed that when I used mxGenerateData, the resulting data is noticeably different from the original. The original ordinal data goes from 0 to 9. mxGenerateData changes this to 1 to 10 - which I could live with, but the percentage of 1s is MUCH larger than the percentage of 0s for each of the seven ordinal variables.
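One quick way to quantify how far the generated data drift from the original is to compare the marginal category proportions directly. A base-R sketch; the vectors `orig` and `fake` below are random placeholders standing in for one ordinal column of the original data and of the mxGenerateData() output:

```r
# Placeholder vectors: original column coded 0-9, generated column coded 1-10.
set.seed(1)
orig <- sample(0:9, 231, replace = TRUE)
fake <- sample(1:10, 231, replace = TRUE)

# Marginal proportions per category; a large mismatch in the lowest
# category would show up in the first column of this table.
props <- rbind(
  original  = prop.table(table(factor(orig, levels = 0:9))),
  generated = prop.table(table(factor(fake, levels = 1:10)))
)
round(props, 3)
```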
Here is the output:
Running oneFactorThresholdModel with 25 parameters
Warning message:
In model 'oneFactorThresholdModel' Optimizer returned a non-zero status code 6. The model does not satisfy the first-order optimality conditions to the required accuracy, and no improved point for the merit function could be found during the final linesearch (Mx status RED)
> summary(oneFactorFit.npsol.fake)
Summary of oneFactorThresholdModel
The model does not satisfy the first-order optimality conditions to the required accuracy, and no improved point for the merit function could be found during the final linesearch (Mx status RED)
free parameters:
name matrix row col Estimate Std.Error A lbound ubound
1 r1 facLoadings 1 1 0.72393105 43.17916 -0.99 0.99
2 r2 facLoadings 2 1 0.71059798 43.03563 -0.99 0.99
3 r3 facLoadings 3 1 0.69776866 41.61558 -0.99 0.99
4 r4 facLoadings 4 1 0.75890452 49.58525 -0.99 0.99
5 r5 facLoadings 5 1 0.70166711 42.71634 -0.99 0.99
6 r6 facLoadings 6 1 0.78463064 53.51735 -0.99 0.99
7 r7 facLoadings 7 1 0.77898024 52.09968 -0.99 0.99
8 k11 thresholdDeviations 1 v1 0.12058928 24.74546 -Inf
9 k21 thresholdDeviations 2 v1 0.26322885 40.79872 0.01
10 k31 thresholdDeviations 3 v1 0.17018005 47.23743 0.01
11 k41 thresholdDeviations 4 v1 0.22078831 40.55031 0.01
12 k51 thresholdDeviations 5 v1 0.22655569 35.78757 0.01
13 k61 thresholdDeviations 6 v1 0.16270037 37.08574 0.01
14 k71 thresholdDeviations 7 v1 0.09345118 NaN ! 0.01
15 k81 thresholdDeviations 8 v1 0.24889435 26.16626 0.01
16 k91 thresholdDeviations 9 v1 0.16995595 27.66385 0.01
17 k12 thresholdDeviations 1 v2 0.13324363 43.75378 -Inf
18 k22 thresholdDeviations 2 v2 0.19559708 111.85217 0.01
19 k32 thresholdDeviations 3 v2 0.15311304 121.41920 0.01
20 k42 thresholdDeviations 4 v2 0.18729238 106.37887 0.01
21 k52 thresholdDeviations 5 v2 0.19904518 97.32996 0.01
22 k62 thresholdDeviations 6 v2 0.23197829 82.46422 0.01
23 k72 thresholdDeviations 7 v2 0.20277574 78.16480 0.01
24 k82 thresholdDeviations 8 v2 0.14844392 80.63950 0.01
25 k92 thresholdDeviations 9 v2 0.25727615 54.20237 0.01
confidence intervals:
lbound estimate ubound note
r1 0.6211308 0.7239311 0.7389540
r2 0.6946362 0.7105980 0.7913712
observed statistics: 1617
estimated parameters: 25
degrees of freedom: 1592
fit value ( -2lnL units ): 4844.447
number of observations: 231
** Information matrix is not positive definite (not at a candidate optimum).
Be suspicious of these results. At minimum, do not trust the standard errors.
Information Criteria:
| df Penalty | Parameters Penalty | Sample-Size Adjusted
AIC: 1660.447 4894.447 NA
BIC: -3819.882 4980.508 4901.272
To get additional fit indices, see help(mxRefModels)
timestamp: 2016-04-17 18:15:04
Wall clock time (HH:MM:SS.hh): 00:05:20.75
optimizer: NPSOL
OpenMx version number: 2.5.2
In reply to Here is the output: by rabil
the curse of dimensionality
Look at the cross-tabulation of the first two columns of fake.data:
> table(fake.data[,1], fake.data[,2])
1 2 3 4 5 6 7 8 9 10
1 89 9 5 6 4 5 1 3 1 3
2 13 3 3 1 0 2 1 0 0 0
3 3 1 0 0 2 4 2 1 0 1
4 6 3 1 2 0 1 1 1 0 2
5 8 2 0 1 0 1 0 0 2 0
6 2 0 0 1 1 0 2 1 1 0
7 1 0 0 1 1 0 0 0 1 0
8 1 1 0 0 2 1 2 1 0 1
9 0 0 1 1 0 0 0 1 1 1
10 3 0 1 0 1 2 3 0 1 0
The abundance of empty cells is only compounded in the seven-variable space where the multivariate normal probability integral is calculated. Basically, I think your model is empirically unidentified, at least partly due to the algorithm being used to fit it. You could try a different estimation method, such as WLS or IFA, or simply treat your variables as continuous rather than ordinal. Another possibility would be to collapse your ordinal categories into a smaller number of categories.
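Collapsing could look something like this (a sketch in base R; the cut points and the data frame `dat` are illustrative, not taken from your data):

```r
# Collapse a 10-level ordinal variable (codes 0-9) into 4 ordered bins.
collapse4 <- function(x) {
  xi <- as.integer(as.character(x))                          # recover numeric codes
  cut(xi, breaks = c(-Inf, 1, 3, 6, Inf), labels = FALSE)    # bins coded 1-4
}

# Illustrative data frame with one mxFactor-style ordinal column
set.seed(2)
dat <- data.frame(v1 = factor(sample(0:9, 231, replace = TRUE), levels = 0:9))
dat$v1c <- collapse4(dat$v1)
table(dat$v1c)   # far fewer sparse cells than the original 10 categories
```

The collapsed column would then be re-declared ordinal with mxFactor(dat$v1c, levels = 1:4) before refitting.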
Something else entirely would be to reparameterize your model so that the first two thresholds of each variable are fixed to 0 and 1, respectively, with the latent mean and variance of each variable freely estimated. Then, regularize estimation of the covariance matrix with an inverse-Wishart prior. Your main model would have two submodels: the one with your data that fits your factor analysis, and another that computes the -2 logdensity of the covariance matrix, given the inverse-Wishart hyperparameters you selected for the prior. Then, your main model would combine the fitfunctions of the two submodels via a multigroup fitfunction.
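For the fixed-thresholds part of that reparameterization, the free/values pattern could be sketched in base R like this (dimensions chosen to match the seven variables and nine thresholds above; the names and start values are illustrative):

```r
# 9 thresholds x 7 variables: rows 1-2 fixed at 0 and 1 per variable,
# rows 3-9 freely estimated.
nThresh <- 9; nVar <- 7
freePat  <- rbind(matrix(FALSE, 2, nVar),
                  matrix(TRUE, nThresh - 2, nVar))
startVal <- matrix(rep((0:(nThresh - 1)) / 2, nVar), nThresh, nVar)
startVal[1, ] <- 0   # first threshold fixed at 0
startVal[2, ] <- 1   # second threshold fixed at 1
```

These matrices would then be handed to mxMatrix() (e.g. mxMatrix(type = "Full", nrow = nThresh, ncol = nVar, free = freePat, values = startVal, name = "thresholds")), with the latent mean and variance of each variable freed; the inverse-Wishart penalty submodel would be built separately.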
In reply to the curse of dimensionality by AdminRobK
Thanks. I tried recoding to a
> mxDataWLS(
+ data=na.omit(real.data),
+ type="WLS"
+ )
Calculating asymptotic summary statistics ...
Error in checkmvArgs(lower = lower, upper = upper, mean = mean, corr = corr, :
‘corr’ is not a correlation matrix
real.data consists of mxFactor results.
mxDataWLS works with fake.data but won't work with any version of the actual data.
As I mentioned before, treating the data as quantitative works without trouble and gives me results almost identical to a lavaan threshold model (which, I believe, uses WLS in this case). If I try to regress the latent variables on a covariate, the model will not complete without a warning message and status code 6. I've been able to fit the full model in lavaan, but I would have liked to compare the results with OpenMx.
In reply to Thanks. I tried recoding to a by rabil
traceback()
Error in checkmvArgs(lower = lower, upper = upper, mean = mean, corr = corr, :
‘corr’ is not a correlation matrix
What does running
traceback()
from the R console tell you? I'm betting the issue has something to do with checking the symmetry of the computed correlation matrix: it finds that some element of the lower triangle isn't exactly equal to the corresponding element of the upper triangle (e.g. 0.3948576 is not exactly equal to 0.3948575), so it throws an error. Having a better idea of the sequence of calls that led to this problem will help me fix it.
Sorry that it's a problem in the first place!
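A minimal reproduction of that suspected failure mode (assuming the strict-versus-tolerant comparison really is the culprit; the values are taken from the example above):

```r
# A matrix that is symmetric only up to floating-point noise
R <- matrix(c(1.0000000, 0.3948576,
              0.3948575, 1.0000000), nrow = 2, byrow = TRUE)
isSymmetric(R)   # FALSE: the off-diagonal mismatch exceeds the default tolerance

# Forcing exact symmetry by mirroring the upper triangle into the lower
R[lower.tri(R)] <- t(R)[lower.tri(R)]
isSymmetric(R)   # TRUE
```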
In reply to traceback() by mhunter
equivalent not equal?
x <- cov2cor(x)                            # rescale the covariance matrix to a correlation matrix
x[lower.tri(x)] <- t(x)[lower.tri(t(x))]   # copy the upper triangle into the lower to force exact symmetry
In reply to Thanks. I tried recoding to a by rabil
bug - report it
This looks like a bug in mxDataWLS(); please report it. I'd share the data confidentially with the lead developer on that function, Michael Hunter.
Best, t