A function to return fit indices

# starter R code to compute the usual assortment of fit indices
#
# based on (what I understood of):
# Tabachnick & Fidell, pp. 697-702
# Klein, pp. 135-144
# also http://www.tau.ac.il/cc/pages/docs/sas8/stat/chap19/sect34.htm
#
fit.index <- function(indepfit,modelfit) {
indep <- summary(indepfit)
indep.chi <- indep$Chi
indep.df <- indep$degreesOfFreedom
model <- summary(modelfit)
model.chi <- model$Chi
model.df <- model$degreesOfFreedom
model.ml <- as.numeric(modelfit$objective@result)
N <- model$numObs # sample size
N.est.param <- model$estimatedParameters # t
N.manifest <- length(modelfit@manifestVars) # n
observed.cov <- modelfit@data@observed # S
estimate.cov <- modelfit@matrices$S@values[1:N.manifest,1:N.manifest] # C, W
observed.cor <- cov2cor(observed.cov)
estimate.cor <- cov2cor(estimate.cov)
residual.cov <- observed.cov-estimate.cov
residual.cor <- observed.cor-estimate.cor
F0 <- (model.chi-model.df)/N
if (F0<0) { F0 <- 0 }
NFI <- (indep.chi-model.chi)/indep.chi
NNFI <- (indep.chi-indep.df/model.df*model.chi)/(indep.chi-indep.df)
PNFI <- (model.df/indep.df)*NFI
IFI <- (indep.chi-model.chi)/(indep.chi-model.df)
CFI <- 1.0-(model.chi-model.df)/(indep.chi-indep.df)
RMSEA<- sqrt(F0/model.df) # need confidence intervals
MFI <- exp(-0.5*(model.chi-model.df)/N)
GFI <- sum(estimate.cov*estimate.cov)/sum(observed.cov*observed.cov) # definitely not right!
AGFI <- 1.0 - (1.0-GFI)/(1.0-N.est.param/N)
PGFI <- (1.0-N.est.param/N)*GFI
AIC <- model.chi-2*model.df
AIC2 <- model.chi+2*N.est.param
CAIC <- model.chi-(log(N)+1.0)*model.df
CAIC2<- model.chi+(log(N)+1.0)*N.est.param
BIC <- model.chi-log(N)*model.df
BIC2 <- model.chi+log(N)*N.est.param
RMR <- sqrt(2.0*sum((residual.cov*residual.cov)/(N.est.param*(N.est.param+1))))
SRMR <- sqrt(2.0*sum((residual.cor*residual.cor)/(N.est.param*(N.est.param+1)))) # not right!
indices <- rbind(NFI,NNFI,PNFI,IFI,CFI,RMSEA,MFI,GFI,AGFI,PGFI,AIC,AIC2,CAIC,CAIC2,BIC,BIC2,RMR,SRMR)
return(indices)
}
This looks good! We already
-Which of these functions is very important to you? Which do you not care to see?
-Are there any additional functions you'd like to see?
If these all get forwarded into summary, I'd propose something like this:
-Information criteria are put in a data frame, making a 4 x 2 (or 2 x 4) table of AIC, BIC, CBIC and sBIC with both the -2*observedStatistics and +2*parameters corrections
-*FI get put in a data frame because there's a lot of them. If there is no saturated model given, an NA or character message is presented instead
-RMSEA (and its CI) remain their own object
-RMR and SRMR are put together, as I understand SRMR to be the standardized version of RMR. (This will have to wait for an explicit output of the estimated and observed covariances; i have no idea what to do with these with ordinal data).
Log in or register to post comments
In reply to This looks good! We already by Ryne
My impression is that the
By all means it's better to have something that works for the most common cases than wait for something that would be perfectly general for any possible situation (ordinal data or what have you).
Another thing that would be extremely useful is the calculation of standardized coefficients. I have probably misunderstood how that's supposed to be done, because my first attempt is very definitely wrong:
my.std.coef <- function(modelfit) { # accepts result of MxRun
s <- modelfit@matrices$S@values
s[which(modelfit@matrices$A@free)] <- modelfit@matrices$A@values[which(modelfit@matrices$A@free)]
t <- matrix(nrow=nrow(s))
for (i in 1:nrow(s)) {t[i] <- sum(s[i,])}
std <- matrix(nrow=nrow(s),ncol=ncol(s))
rownames(std)=rownames(s)
colnames(std)=colnames(s)
for (i in 1:nrow(s)) {std[i,] <- s[i,]/t[i]}
std <- sqrt(std)
mod <- summary(modelfit)
nparam <- mod$estimatedParameters
stdcoef <- vector(length=nparam)
for (i in 1:nparam) { stdcoef[i] <- std[mod$parameters$row[i], mod$parameters$col[i]] }
z <- mod$parameters$Estimate / mod$parameters$Std.Error
p <- 1-pnorm(z)
augcoef <- cbind(mod$parameters,z,p,stdcoef)
return(augcoef)
}
If you will be my god and fix these so that we can run our usual analyses with OpenMx I'll have one more reason to convince my students to use R for everything!
Log in or register to post comments
In reply to My impression is that the by 0avasns
Thanks for the
I've already written a RAM model standardization function, which can be found here: http://openmx.psyc.virginia.edu/thread/718. It shows up as posted by anonymous for some reason, but let me know if you have any questions.
Log in or register to post comments
In reply to Thanks for the by Ryne
Works like a charm! Except
Log in or register to post comments
In reply to Works like a charm! Except by 0avasns
I guess that was a bit too
Log in or register to post comments
In reply to I guess that was a bit too by 0avasns
That's odd. If you don't
Log in or register to post comments
In reply to That's odd. If you don't by Ryne
Thanks a lot for looking into
> semTwo$coeff
Estimate Std Error z value Pr(>|z|)
b_wrdflu_F 0.9065400 0.07784689 11.645167 0.000000e+00 WRDFLU3 <--- lv_F
b_pseflu_F 0.8461698 0.07447068 11.362455 0.000000e+00 PSEFLU3 <--- lv_F
eps_wrdflu 0.1781848 0.12684788 1.404713 1.601068e-01 WRDFLU3 <--> WRDFLU3
eps_pseflu 0.2839963 0.11156310 2.545611 1.090866e-02 PSEFLU3 <--> PSEFLU3
b_lc_L 0.5033123 0.06281616 8.012466 1.110223e-15 LC3 <--- lv_L
b_token_L 0.4961814 0.06238059 7.954099 1.776357e-15 TOKEN2 <--- lv_L
eps_lc 0.7466788 0.06687948 11.164543 0.000000e+00 LC3 <--> LC3
eps_token 0.7538076 0.06629075 11.371233 0.000000e+00 TOKEN2 <--> TOKEN2
b_rcabc_R 0.2817331 0.09090740 3.099122 1.940949e-03 zRC3abc <--- lv_R
b_rcdef_R 0.3453483 0.11609160 2.974791 2.931878e-03 zRC3def <--- lv_R
eps_rcabc 0.6383103 0.05663863 11.269875 0.000000e+00 zRC3abc <--> zRC3abc
eps_rcdef 0.4565309 0.06581463 6.936617 4.015899e-12 zRC3def <--> zRC3def
b_L_R 1.7222738 0.71833503 2.397591 1.650327e-02 lv_R <--- lv_L
b_F_R 0.5387619 0.20021772 2.690880 7.126376e-03 lv_R <--- lv_F
> mxTwo$parameters
name matrix row col Estimate Std.Error
1 b_wrdflu_F A WRDFLU3 F 0.9065392 0.07791102
2 b_pseflu_F A PSEFLU3 F 0.8461700 0.07451795
3 b_F_R A R F 0.5387590 0.19795745
4 b_lc_L A LC3 L 0.5033116 0.06266909
5 b_token_L A TOKEN2 L 0.4961789 0.06224838
6 b_L_R A R L 1.7222646 0.70604404
7 b_rcabc_R A zRC3abc R 0.2817338 0.08940058
8 b_rcdef_R A zRC3def R 0.3453491 0.11413211
9 eps_wrdflu S WRDFLU3 WRDFLU3 0.1781857 0.12691719
10 eps_pseflu S PSEFLU3 PSEFLU3 0.2839953 0.11162171
11 eps_lc S LC3 LC3 0.7466774 0.06672460
12 eps_token S TOKEN2 TOKEN2 0.7538065 0.06615317
13 eps_rcabc S zRC3abc zRC3abc 0.6383107 0.05662865
14 eps_rcdef S zRC3def zRC3def 0.4565310 0.06571747
> semTwo.std
Std. Estimate
1 1.0000000 lv_F <--> lv_F
2 b_wrdflu_F 0.9065402 WRDFLU3 <--- lv_F
3 b_pseflu_F 0.8461700 PSEFLU3 <--- lv_F
4 eps_wrdflu 0.1781849 WRDFLU3 <--> WRDFLU3
5 eps_pseflu 0.2839964 PSEFLU3 <--> PSEFLU3
6 1.0000000 lv_L <--> lv_L
7 b_lc_L 0.5033118 LC3 <--- lv_L
8 b_token_L 0.4961805 TOKEN2 <--- lv_L
9 eps_lc 0.7466772 LC3 <--> LC3
10 eps_token 0.7538049 TOKEN2 <--> TOKEN2
11 0.2349353 lv_R <--> lv_R
12 b_rcabc_R 0.5883051 zRC3abc <--- lv_R
13 b_rcdef_R 0.7256102 zRC3def <--- lv_R
14 eps_rcabc 0.6538971 zRC3abc <--> zRC3abc
15 eps_rcdef 0.4734898 zRC3def <--> zRC3def
16 b_L_R 0.8347882 lv_R <--- lv_L
17 b_F_R 0.2611386 lv_R <--- lv_F
> mxTwo.std
name matrix row col Std. Estimate Std.Std.Error
1 b_wrdflu_F A WRDFLU3 F 0.9065397 0.07791105
2 b_pseflu_F A PSEFLU3 F 0.8461704 0.07451799
3 b_F_R A R F 0.2611382 0.09595061
4 b_lc_L A LC3 L 0.5033116 0.06266909
5 b_token_L A TOKEN2 L 0.4961789 0.06224838
6 b_L_R A R L 0.8347872 0.34222181
7 b_rcabc_R A zRC3abc R 0.1382145 0.04385865
8 b_rcdef_R A zRC3def R 0.1704727 0.05633838
9 eps_wrdflu S WRDFLU3 WRDFLU3 0.1781858 0.12691731
10 eps_pseflu S PSEFLU3 PSEFLU3 0.2839956 0.11162182
11 eps_lc S LC3 LC3 0.7466774 0.06672460
12 eps_token S TOKEN2 TOKEN2 0.7538065 0.06615317
13 eps_rcabc S zRC3abc zRC3abc 0.6538981 0.05801151
14 eps_rcdef S zRC3def zRC3def 0.4734907 0.06815881
>
(I hope the code shows up correctly with all the angle brackets in there... I don't know how to get the tabs to align in the posting, so I am also attaching the output separately)
The problem is evident in comparing rows 11-13 of semTwo.std to rows 7-8 of mxTwo.std (there's no standardized disturbance of R in mxTwo). The corresponding unstandardized coefficients match fine.
Log in or register to post comments
In reply to Thanks a lot for looking into by 0avasns
Fixed. I'll update and detail
I've altered your code to make comparison easier. I used R's merge() function to allow for more direct comparison of the sem and OpenMx output.
I've also changed all of your '->' assignment operators to '<-'. Left assignment is the standard in every programming language I'm familiar with. Mixing left and right assignment in the same script is especially bad, as it becomes very easy to misread assignment operators.
Log in or register to post comments
In reply to Fixed. I'll update and detail by Ryne
Thank you very much for
(as for the left/right assignment, it is one of the things I really like about R :-) )
Log in or register to post comments
In reply to Fixed. I'll update and detail by Ryne
Fit indices - RMSEA confidence interval, p close
Thanks to the script below (suggested by Athanassios Protopapas and further developed by Paolo Ghisletta, thank you!!!) I was able to obtain a large number of fit indices, but I still can't get the 90% confidence interval for RMSEA and p close (the test of the null hypothesis that RMSEA (in the population) in less than .05). Can anyone help me?
fit.cov <- function(indepfit,modelfit) {
options(scipen=3)
indep <- summary(indepfit)
model <- summary(modelfit)
deviance <- model$Minus2LogLikelihood
Chi <- model$Chi
indep.chi <- indep$Chi
df <- model$degreesOfFreedom
p.Chi <- 1 - pchisq(Chi, df)
Chi.df <- Chi/df
indep.df <- indep$degreesOfFreedom
N <- model$numObs
N.parms <- model$estimatedParameters
N.manifest <- length(modelfit@manifestVars)
q <- (N.manifest*(N.manifest+1))/2
N.latent <- length(modelfit@latentVars)
observed.cov <- modelfit@data@observed
observed.cor <- cov2cor(observed.cov)
A <- modelfit@matrices$A@values
S <- modelfit@matrices$S@values
F <- modelfit@matrices$F@values
I <- diag(N.manifest+N.latent)
estimate.cov <- F %*% (qr.solve(I-A)) %*% S %*% (t(qr.solve(I-A))) %*% t(F)
estimate.cor <- cov2cor(estimate.cov)
Id.manifest <-diag(N.manifest)
residual.cov <- observed.cov-estimate.cov
residual.cor <- observed.cor-estimate.cor
F0 <- (Chi-df)/(N-1)
if (F0<0) {F0 <- 0}
NFI <- (indep.chi-Chi)/indep.chi
NNFI.TLI <- (indep.chi-indep.df/df*Chi)/(indep.chi-indep.df)
PNFI <- (df/indep.df)*NFI
RFI <- 1 - (Chi/df) / (indep.chi/indep.df)
IFI <- (indep.chi-Chi)/(indep.chi-df)
CFI <- 1.0-(Chi-df)/(indep.chi-indep.df)
if (CFI>1) {CFI=1}
PRATIO <- df/indep.df
PCFI <- PRATIO*CFI
NCP <- max((Chi-df),0)
RMSEA<- sqrt(F0/df) # need confidence intervals
MFI <- exp(-0.5*(Chi-df)/N)
GH <- N.manifest / (N.manifest+2*((Chi-df)/(N-1)))
GFI <- 1-(sum(diag(((solve(estimate.cor)%*%observed.cor)-Id.manifest)%*%((solve(estimate.cor)%*%observed.cor)-Id.manifest))) / sum(diag((solve(estimate.cor)%*%observed.cor)%*%(solve(estimate.cor)%*%observed.cor))))
AGFI <- 1 - (q/df)*(1-GFI)
PGFI <- GFI * df/q
AICchi <- Chi+2*N.parms
AICdev <- deviance+2*N.parms
BCCchi <- Chi + 2*N.parms/(N-N.manifest-2)
BCCdev <- deviance + 2*N.parms/(N-N.manifest-2)
BICchi <- Chi+N.parms*log(N)
BICdev <- deviance+N.parms*log(N)
CAICchi <- Chi+N.parms*(log(N)+1)
CAICdev <- deviance+N.parms*(log(N)+1)
ECVIchi <- 1/N*AICchi
ECVIdev <- 1/N*AICdev
MECVIchi <- 1/BCCchi
MECVIdev <- 1/BCCdev
RMR <- sqrt((2*sum(residual.cov^2))/(2*q))
SRMR <- sqrt((2*sum(residual.cor^2))/(2*q))
indices <-
rbind(N,deviance,N.parms,Chi,df,p.Chi,Chi.df,AICchi,AICdev,BCCchi,BCCdev,BICchi,BICdev,CAICchi,CAICdev,RMSEA,SRMR,RMR,GFI,AGFI,PGFI,NFI,RFI,IFI,NNFI.TLI,CFI,PRATIO,PNFI,PCFI,NCP,ECVIchi,ECVIdev,MECVIchi,MECVIdev,MFI,GH)
return(indices)
}
fit.cov(indepfit,modelfit)
Log in or register to post comments
In reply to This looks good! We already by Ryne
CIs of RMSEA
cheers,
Andreas
Log in or register to post comments
So you *are* my god! Thanks
Log in or register to post comments
Update on fit
First, I've noted your observation that AIC and BIC "seemed OK or off by a factor of 2." For some reason, we were multiplying BIC by 0.5. This will be checked in for the next binary release; we should support AIC and BIC with both the -2*df and the +2*parameters corrections soon, as well as sample size adjusted BIC.
The bigger issue pertains to fit statistics like CFI that require not just a saturated likelihood, but an independence model likelihood and df for all models as well. This will take a few more additions to the summary function and optimization, and may take a little time to make sure we get the specification right.
Log in or register to post comments
In reply to Update on fit by Ryne
Thanks a lot for the update.
Log in or register to post comments
In reply to Thanks a lot for the update. by 0avasns
again on fit statistics: I
I updated and extended Oavasns original function (I think I found a couple of minor mistakes and hopefully was able to fix them). Any check by the OpenMx team or anybody else would be appreciated. The code can certainly be simplified. For instance, I ignore how to extract a model's estimated covariance matrix (called 'estimate.cov' below), hence got this with RAM algebra. The Confidence Interval of the RMSEA is still missing.
fit.index <- function(indepfit,modelfit) {
indep <- summary(indepfit)
indep.chi <- indep$Chi
indep.df <- indep$degreesOfFreedom
model <- summary(modelfit)
model.chi <- model$Chi
model.dev <- model$SaturatedLikelihood
model.df <- model$degreesOfFreedom
model.ml <- as.numeric(modelfit$objective@result)
N <- model$numObs
N.est.param <- model$estimatedParameters
N.manifest <- length(modelfit@manifestVars)
N.latent <- length(modelfit@latentVars)
observed.cov <- modelfit@data@observed
A <- modelfit@matrices$A@values
S <- modelfit@matrices$S@values
F <- modelfit@matrices$F@values
I <- diag(N.manifest+N.latent)
estimate.cov <- F %*% (qr.solve(I-A)) %*% S %*% (t(qr.solve(I-A))) %*% t(F)
observed.cor <- cov2cor(observed.cov)
estimate.cor <- cov2cor(estimate.cov)
Id.manifest <-diag(N.manifest)
residual.cov <- observed.cov-estimate.cov
residual.cor <- observed.cor-estimate.cor
F0 <- (model.chi-model.df)/(N-1)
if (F0<0) {F0 <- 0}
NFI <- (indep.chi-model.chi)/indep.chi
NNFI.TLI <- (indep.chi-indep.df/model.df*model.chi)/(indep.chi-indep.df)
PNFI <- (model.df/indep.df)*NFI
IFI <- (indep.chi-model.chi)/(indep.chi-model.df)
CFI <- 1.0-(model.chi-model.df)/(indep.chi-indep.df)
if (CFI>1) {CFI=1}
RMSEA<- sqrt(F0/model.df) # need confidence intervals
MFI <- exp(-0.5*(model.chi-model.df)/N)
GH <- N.manifest / (N.manifest+2*((model.chi-model.df)/(N-1)))
GFI <- 1 - (sum(diag(((solve(estimate.cor)%*%observed.cor)-Id.manifest)%*%((solve(estimate.cor)%*%observed.cor)-Id.manifest))) /
(sum(diag((solve(estimate.cor)%*%observed.cor)%*%(solve(estimate.cor)%*%observed.cor)))))
AGFI <- 1.0 - ((N.manifest*(N.manifest+1))/(2*model.df) * (1-GFI))
PGFI <- (2*model.df)/(N.manifest*(N.manifest+1))*GFI
#AIC <- model.chi-2*model.df
AIC <- model.chi+2*N.est.param
#CAIC <- model.chi-(log(N)+1.0)*model.df
CAIC<- model.chi+(log(N)+1.0)*N.est.param
#BIC <- model.chi-log(N)*model.df
BIC <- model.chi+log(N)*N.est.param
RMR <- sqrt((2*sum(residual.cov^2))/(N.manifest*(N.manifest+1)))
SRMR <- sqrt((2*sum(residual.cor^2))/(N.manifest*(N.manifest+1)))
indices <- rbind(N,model.chi,model.df,NFI,NNFI.TLI,PNFI,IFI,CFI,RMSEA,MFI,GH,GFI,AGFI,PGFI,AIC,CAIC,BIC,RMR,SRMR)
return(indices)
}
Log in or register to post comments
In reply to again on fit statistics: I by paologhisletta
Thanks, Paolo. These formulae
Log in or register to post comments
In reply to Thanks, Paolo. These formulae by Ryne
saturated likelihood
Can I just stick this in there before using his fuction?
Log in or register to post comments
umxFitIndices: A place for all your fit stats
note This is beta code, not tested for robustness to, for instance, multiple group models, missing data, definition variables, matrix-based models. Bang on the code: pull requests to https://github.com/tbates/umx
umx::umxFitIndices(model) returns:
N, deviance, N.parms, Chi, df, p.Chi, Chi.df, AICchi, AICdev, BCCchi, BCCdev, BICchi, BICdev, CAICchi, CAICdev, RMSEA, SRMR, RMR, MAR, SMAR, GFI, AGFI, PGFI, NFI, RFI, IFI, NNFI.TLI, CFI, PRATIO, PNFI, PCFI, NCP, ECVIchi, ECVIdev, MECVIchi, MECVIdev, MFI, GH
Log in or register to post comments