Computing and Reporting fit statistics

A place to discuss current and desired model reporting of fit, df, AIC, etc., and how to make basic functionality happen.
Several models do not report fit statistics. What should be happening here? Do we need to attach them here so that the summary function can be improved? Or should summary report the problems that stopped it from computing the fit (couldn't be sure of df, etc.), so that users can manually set these parameters in the model (df=xxx, ) to allow the existing summary to work?
Front page factor model in pathicCov, matricCov, and matricRaw forms as an example of differences in fit (models attached).
One-factor pathic model of covariance data
Now a Matrix model of the cov data
NB: same model of the same data as the pathic, but RMSEA now suggests terrible fit, p differs etc. etc.
Next try a matrix model with full information
Same model, but raw data. RMSEA now improbably = 0, most statistics not computed
In reply to Front page factor model in by tbates
Are you sure these are actually the same models? The non-pathic models are reporting 0 estimated parameters. That's just a count of the number of elements in the parameter vector passed to the estimator. Zero free parameters means that the estimator just calculated the -2LL of the starting values and returned. Fit stats will be horrible for such a model (unless you are very lucky with your starting values!)
In reply to Front page factor model in by tbates
I tried running the code in ML_vs_FIML.R and the ML matrix example came up with the same results as the pathic example. The FIML example returned an error because "names" was not defined.
Once I set
your code ran and came up with the same parameter estimates and reasonable fit statistics.
In reply to I tried running the code in by Steve
Hi Steve,
Thanks for running the script. I did it again from a clean boot at home and all is well. Can't see how the cov matrix version ever got 0 estimated but as you say... it runs fine.
For the FIML, I must have had the "names" variable set from some other model... A lesson in ensuring all used variables are set in the job, not just hanging around. oops.
For the missing Saturated -2 log likelihood in the FIML model, could mxRun() get this by estimating a saturated model (for means, variances, and covariances of the data)? Then we could have Chi-Square in the FIML model?
In reply to Hi Steve, Thanks for running by tbates
Interesting proposal. Does anyone have objections?
In reply to Interesting proposal. Does by Steve
There are reasons that Mx1 does not fit the saturated model for FIML. One is that fitting the saturated model (a free parameter for every mean, variance and covariance) is computationally intensive. Ideally it should be done only once. Mx1 has two ways of using the results of such a model: i) option issat, which assumes that later in the same script there is a model of interest for which chi-sq fit will be computed; and ii) option saturated=-2lnL,df, which allows the user to supply fit statistics that have been computed in a separate job.
Two, when there are definition variables, what the saturated model actually is becomes obscure, since any parameter of the model might be any polynomial function of any or all of the definition variables.
Note that there isn't a shortcut to computing the likelihood when there are missing data, although it is possible to supply decent starting values as long as the data are MCAR; when data are MAR the observed means, variances and covariances are not the same as the MLEs.
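For contrast, when the data are complete the saturated -2lnL does have a closed form, because the ML estimates are simply the sample means and the ML (divide-by-n) covariance matrix. A minimal base-R sketch, assuming complete multivariate-normal data; saturated_m2ll() is a hypothetical helper for illustration, not an OpenMx or Mx1 function, and it does not handle the missing-data case discussed above:

```r
# Closed-form -2lnL of the saturated multivariate-normal model.
# Assumes complete data (no NAs). saturated_m2ll() is a hypothetical
# helper for illustration, not part of OpenMx or Mx1.
saturated_m2ll <- function(X) {
  X <- as.matrix(X)
  stopifnot(!anyNA(X))
  n <- nrow(X)
  p <- ncol(X)
  # ML covariance estimate divides by n rather than n - 1
  S_ml <- cov(X) * (n - 1) / n
  log_det <- as.numeric(determinant(S_ml, logarithm = TRUE)$modulus)
  list(
    # At the ML estimates the quadratic form sums to n * p
    minus2LL = n * (p * log(2 * pi) + log_det + p),
    # One free parameter per mean, variance and covariance
    estimatedParameters = p + p * (p + 1) / 2
  )
}

# Example on a built-in data set with three complete variables
sat <- saturated_m2ll(mtcars[, c("mpg", "hp", "wt")])
sat$estimatedParameters
```

The chi-square for a model of interest would then be its own -2lnL minus sat$minus2LL, with df equal to the difference in the number of estimated parameters.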
I am not sure whether we should just have an mxCompare(model1,model2,...) function to compare models, or if an additional argument to mxModel(compare=list) might allow for comparative statistics to be computed on the fly. I lean towards mxCompare, albeit at a cost of additional commands.
In reply to There are reasons that Mx1 by neale
I would like to see an mxCompare function. As you say, one of the things it could do is compare to a saturated model.
Tim Brick and I had a conversation yesterday about the saturated model problem. I continue to think that the obscure cases are only maybe 5% of actual usage in SEM. So, we should expect to need to override something that can be automatically computed, and we need to throw an error when a person asks for automatic computation and there are potential pitfalls such as definition variables in the model. For the vast majority of SEMers, an auto saturated model computation (expensive though it might be) would be useful. Whether it should be on by default is another question about which I remain unsure.
In reply to I would like to see an by Steve
Especially as the saturated model can be computed in parallel, so on machines where the user has a few spare cores it will add no time.
Maybe something like
saturated=(T|F) as default, and if the user provides a numeric value, that is used as the saturated value of the objective?
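One way to read that suggestion, as a rough sketch only: resolve_saturated() and its compute_saturated argument are hypothetical names invented here, and nothing below reflects an actual OpenMx argument or default.

```r
# Hypothetical handling of a saturated= argument that accepts either a
# logical switch or a user-supplied numeric -2lnL. Not an OpenMx function.
resolve_saturated <- function(saturated = FALSE, compute_saturated = NULL) {
  if (is.numeric(saturated) && length(saturated) == 1) {
    # User supplied the saturated -2lnL directly, e.g. from a separate job
    return(saturated)
  }
  if (isTRUE(saturated)) {
    # User asked for automatic computation
    if (!is.function(compute_saturated)) {
      stop("saturated = TRUE needs a function that fits the saturated model")
    }
    return(compute_saturated())
  }
  if (identical(saturated, FALSE)) {
    # No saturated fit available, so chi-square cannot be reported
    return(NA_real_)
  }
  stop("saturated must be TRUE, FALSE, or a single numeric value")
}

resolve_saturated(1066)                  # value supplied by the user
resolve_saturated(TRUE, function() 42)   # placeholder fitting function
resolve_saturated(FALSE)                 # nothing to compare against
```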
In reply to There are reasons that Mx1 by neale
Hi Mike,
I'm finally starting with OpenMx, and I must say that I love it!
Have you given any thoughts to the mxCompare function?
In the meantime, is there a way to get the degrees of freedom using mxEval, in the same way that we get the LL? If that's possible, then one can simply compute the LRT and the difference in degrees of freedom, with the corresponding p value, using R commands.
Thanks!
In reply to Hi Mike, I'm finally starting by irebollo
Dear Irene,
You can get the degrees of freedom from summary().
assuming you have a fitted model in "fit"
> a = summary(fit)
> names(a)
[1] "parameters" "Minus2LogLikelihood" "SaturatedLikelihood" "numObs"
[5] "estimatedParameters" "observedStatistics" "degreesOfFreedom" "Chi"
[9] "p" "AIC.Mx" "BIC.Mx" "RMSEA"
[13] "dataSummary" "frontendTime" "backendTime" "mxVersion"
> a$Minus2LogLikelihood
1066
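With those two summary fields in hand, the likelihood ratio test asked about above can be done in plain R. A minimal sketch, assuming two nested models whose summaries expose Minus2LogLikelihood and degreesOfFreedom as in the listing above; lrt_from_summaries() is a hypothetical helper, and the numbers in the example are made up for illustration:

```r
# Likelihood ratio test from two summary-like lists.
# lrt_from_summaries() is a hypothetical helper, not an OpenMx function.
# 'nested' is the more constrained model, 'full' the less constrained one.
lrt_from_summaries <- function(nested, full) {
  diff_ll <- nested$Minus2LogLikelihood - full$Minus2LogLikelihood
  diff_df <- nested$degreesOfFreedom - full$degreesOfFreedom
  stopifnot(diff_df > 0)
  list(
    chisq = diff_ll,
    df = diff_df,
    p = pchisq(diff_ll, df = diff_df, lower.tail = FALSE)
  )
}

# Made-up values standing in for summary(fitNested) and summary(fitFull)
nested <- list(Minus2LogLikelihood = 1066, degreesOfFreedom = 497)
full   <- list(Minus2LogLikelihood = 1060, degreesOfFreedom = 495)
res <- lrt_from_summaries(nested, full)
res$p
```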
In reply to Hi Steve, Thanks for running by tbates
hmm. In at work again and running the covariance model. Building today's R, shutting down and reloading R.app (today's build)... I get a model with 0 df.
I get the same result just copying the front page ALA'+U model into R and pressing return... Baffled, as it worked at home (intel laptop) last night.
Can some other people try this please?
In reply to hmm. In at work again and by tbates
Just did an
svn up
cd trunk
make clean install
R
That installed rev 803.
Copied the model from your post.
I got exactly what I'd expect:
In reply to Just did an svn up cd by Steve
yes, and works fine for me at home (intel macbook).
I wonder if it is something about the PowerPC (that's my work machine) NPSOL?
In reply to yes, and works fine or me at by tbates
No it's not a PPC issue. Your model has been specified with no free parameters. I took the original model and replaced all the free parameters with fixed parameters and I got the same summary statistics as you reported when running the model. Try executing rm(list=ls()) in your workspace before running the model (I'm assuming you don't store critical data or code in your workspace). Hmm, I believe you had another forum post where you assigned T <- FALSE as an example of R weirdness. If that assignment was done in your workspace, that would explain the behavior you are seeing.
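The mechanism is easy to reproduce without touching a real workspace. In R, T and F are ordinary variables in the base package that can be masked, whereas TRUE and FALSE are reserved words. A small sketch, using a throwaway environment (ws, a name chosen here) to stand in for a polluted workspace:

```r
# Simulate a workspace in which someone has run T <- FALSE.
ws <- new.env()
assign("T", FALSE, envir = ws)

# Code written with free = T picks up the masked value...
free_with_T <- eval(quote(T), envir = ws)

# ...while code written with free = TRUE is unaffected, because TRUE is a
# reserved word and cannot be reassigned.
free_with_TRUE <- eval(quote(TRUE), envir = ws)

# Removing the stray object makes T resolve to base::T again.
rm("T", envir = ws)
free_after_rm <- eval(quote(T), envir = ws)

c(free_with_T, free_with_TRUE, free_after_rm)
```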
In reply to No it's not a PPC issue. by mspiegel
hi mike, just came here to report that exact solution... beat me to it.
This has to be a _great_ reason to stop using the variables T and F in place of the native booleans TRUE and FALSE in example code...
In reply to hi mike, just came here to by tbates
The reason I did it on the front page code is... well... it's the front page and I was trying to fit a whole factor model into one column.
I think we should have a short check script for R newbies (and wicked beta testers who do bad things on purpose).
Something like
and so forth so that if you reassign 'T' or 't' or 'F' or 'c', etc. we can catch it.
I mean, we can be verbose and protect people against 'T', but we can't protect people against 't'. So, I'm not sure I even buy the logic of never ever using 'T'.
Instead, let's put together an omnibus script that we recommend bug reporters run to verify that they have a working instance of .RData.
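A rough idea of what such a check could look like, as a sketch only: check_masked_names() and its default list of names are assumptions made here for illustration, and a real omnibus script would presumably check much more than this.

```r
# Report which commonly used base names have been reassigned in a workspace.
# check_masked_names() is a hypothetical helper, not part of OpenMx.
check_masked_names <- function(candidates = c("T", "F", "t", "c"),
                               envir = globalenv()) {
  # inherits = FALSE looks only in 'envir', so the base definitions of
  # T, F, t() and c() are not counted as hits.
  is_masked <- vapply(candidates, exists, logical(1),
                      envir = envir, inherits = FALSE)
  masked <- candidates[is_masked]
  if (length(masked) > 0) {
    message("Workspace reassigns: ", paste(masked, collapse = ", "),
            ". Consider rm() on these before running the model.")
  }
  masked
}

# Demonstration on a throwaway environment rather than the real workspace
demo_ws <- new.env()
assign("T", FALSE, envir = demo_ws)
check_masked_names(envir = demo_ws)
```

Run without arguments, it inspects the global workspace, which is where a stray T <- FALSE restored from .RData would live.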