information matrix with IFA models
Posted by falkcarl
Attachment | Size |
---|---|
infotest.R | 6.05 KB |
I'm experimenting with OpenMx's capabilities to calculate the information matrix (and standard errors) in the context of IFA models with priors on some item parameters. For example, see the 3PL model in the documentation: http://openmx.psyc.virginia.edu/docs/OpenMx/latest/ItemFactorAnalysis.html#a-3pl-with-bayesian-priors
Say one makes the following changes to the example inside mxComputeSequence (see also the end of the attached code; a sketch of the resulting plan follows the list):
- Remove mxComputeNumericDeriv()
- Add mxComputeOnce('fitfunction', 'information', "sandwich")
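For concreteness, the modified plan might look like this (a sketch only; the EM step and object names such as itemModel mirror the documentation example):

```r
plan <- mxComputeSequence(list(
  mxComputeEM('itemModel.expectation', 'scores',
              mxComputeNewtonRaphson()),                   # M-step optimizer
  mxComputeOnce('fitfunction', 'information', "sandwich"), # replaces mxComputeNumericDeriv()
  mxComputeStandardError(),                                # SEs from the information matrix
  mxComputeReportDeriv()
))
```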
If I do this, I do get standard errors, but no information matrix in the output. I'm just wondering how the standard errors are calculated in this case and how to obtain the information matrix. I do see a note in the example suggesting that the above approach may not work, or is not recommended, in the context of priors, but at this point I'm just experimenting to figure out how things work.
Many thanks!
info matrix
> Just wondering how the standard errors are calculated in this case
The method in this case is Yuan, Cheng, & Patton (2013).
> the above approach ought not to work or is not recommended in the context of priors
Yeah, any approach based on Louis (1982) is not going to take the priors into account. In simulations, I've found that Oakes (1999) is the best method (fastest and most accurate).
None of the options are really well documented yet. If you want to experiment, you might browse through inst/models/enormous/lib/stderrlib.R, which I have used to run simulations.
In reply to info matrix by jpritikin
ifaTools
In reply to info matrix by jpritikin
I see. So, adding arguments
information="oakes1999", infoArgs=list(fitfunction=c('itemModel.fitfunction','gaussModel.fitfunction','betaModel.fitfunction'))
I'll have to read Oakes and compare to the other approaches with which I'm more familiar. Might need to do a few simulations to see what (if anything) works well with the item models I'm using.
mxComputeReportDeriv() should also return the gradient?
Thank you again and hope you all enjoy some vacation!
In reply to I see. So, adding arguments by falkcarl
oakes
Yes, Oakes will take priors into account. You can compare the accuracy to central difference Richardson extrapolation (mxComputeNumericDeriv).
> information="oakes1999", infoArgs=list(fitfunction=c('itemModel.fitfunction','gaussModel.fitfunction','betaModel.fitfunction'))
I think you only need to provide the top-level multigroup fit function, so
infoArgs=list(fitfunction="demo3pl_xpdse.fitfunction")
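That is, something along these lines (a sketch; demo3pl_xpdse is the container model from the attached script, and the EM step mirrors the documentation example):

```r
plan <- mxComputeSequence(list(
  mxComputeEM('itemModel.expectation', 'scores',
              mxComputeNewtonRaphson(),
              information="oakes1999",
              infoArgs=list(fitfunction="demo3pl_xpdse.fitfunction")),
  mxComputeStandardError(),
  mxComputeReportDeriv()
))
```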
> mxComputeReportDeriv() should also return the gradient?
Yes
> I'll have to read Oakes and compare to the other approaches with which I'm more familiar.
Oakes is elegant. Take a look at src/Compute.cpp line 2222. It's about 40 lines of code.
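For reference, the identity behind it (my paraphrase of Oakes's result, with $Q(\theta' \mid \theta)$ denoting the EM surrogate objective):

$$
\frac{\partial^2 \ell(\theta)}{\partial \theta \, \partial \theta^\top}
= \left[ \frac{\partial^2 Q(\theta' \mid \theta)}{\partial \theta' \, \partial \theta'^\top}
       + \frac{\partial^2 Q(\theta' \mid \theta)}{\partial \theta' \, \partial \theta^\top}
  \right]_{\theta' = \theta}
$$

So the observed-data Hessian comes from derivatives of the M-step objective alone; as I understand the implementation, the mixed second term is obtained by numerically differentiating the E-step gradient.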
In reply to oakes by jpritikin
mxComputeReportDeriv() for
If I understand correctly, the following sums up the current methods for computing the "information matrix" as implemented in OpenMx:
1. Methods that compute the information once at the end
Add something like this to mxComputeSequence():
mxComputeOnce('fitfunction', 'information', 'meat')
- "meat" = empirical cross-product (observed?)
- "bread" = Fisher information matrix (expected?)
- "sandwich" = as the name implies (but expected or observed?)
- "hessian" = M-step Hessian
- Using mxComputeNumericDeriv() instead will do Richardson extrapolation
2. Methods that require keeping track of things across EM cycles (see the sketch after this list).
Add arguments such as the following to mxComputeEM():
information="mr1991",
infoArgs=list(semMethod="mr", fitfunction='fitfunction')
For the information argument:
- "oakes1999" = Oakes (1999); no need to specify semMethod
- "mr1991" = supplemented EM (S-EM)
In combination with "mr1991", a particular S-EM method may be specified using semMethod:
- "mr" = as applied by Cai (2008) to IFA models
- "tian" = as specified by Tian, Cai, Thissen, & Xin (2013)
- "agile" = your method for S-EM
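For example, an S-EM request might look like this sketch (object names as in the documentation example):

```r
mxComputeEM('itemModel.expectation', 'scores',
            mxComputeNewtonRaphson(),
            information="mr1991",
            infoArgs=list(semMethod="mr", fitfunction='fitfunction'))
```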
Hope this helps others and hope I've covered what's available.
Thank you!
In reply to mxComputeReportDeriv() for by falkcarl
derivs and information
My hunch is that the gradient is not returned because the gradient is only computed in the M-step and is not preserved outside of mxComputeEM.
To get the gradient, take a look at inst/models/nightly/ifa-meat-2d.R
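As a sketch of one way to get derivatives reported (hypothetical plan; the what= values follow the pattern in that test file):

```r
# Compute the gradient and Hessian once at the current estimates, then report them.
plan <- mxComputeSequence(list(
  mxComputeOnce('fitfunction', c('gradient', 'hessian')),
  mxComputeReportDeriv()
))
model <- mxModel(model, plan)  # attach the custom compute plan
fit <- mxRun(model)
fit$output$gradient            # gradient at the current estimates
fit$output$hessian             # Hessian at the current estimates
```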
> “meat” = empirical cross-product (observed?)
I am confused about the distinction between observed and expected information. But yeah, "meat" is the empirical cross-product. Mislevy (1984) is implemented for the latent distribution.
Methods "bread" and "sandwich" are implemented according to Yuan, Cheng, & Patton (2013).
With respect to the EM related methods, "mr" and "tian" require the convergence history while "oakes1999" and "agile" do not.
You have covered all the options.
In reply to mxComputeReportDeriv() for by falkcarl
Fisher information
> "bread" = Fisher information matrix (expected?)
> "sandwich" = as the name implies (but expected or observed?)
There are two different ways of defining the Fisher information: (1) the variance of the first derivative of the loglikelihood, and (2) the expectation of the second derivative of the loglikelihood. The true Fisher information is generally unknown, because it requires knowledge of the true parameter values and the true functional form (i.e., family) of the distribution.

If we assume a particular functional form for the distribution and "plug in" our MLEs in place of the true parameter values, we can estimate the true Fisher information from data. If the necessary integrals are analytically tractable, then we can estimate the true Fisher information as the expected information matrix. But sometimes those integrals are not so easy, so we instead use the observed information as an estimator, and calculate it either as a quadratic function of the first derivative of the loglikelihood (proportional to the "empirical cross-product" or "meat") or as a linear function of the second derivative of the loglikelihood (proportional to the Hessian matrix, or "bread"). When the distribution is correctly specified, those two ways of calculating the observed information will on average give the same answer in large samples.

The "sandwich" is `solve(bread)%*%meat%*%solve(bread)`, so if the distribution is correctly specified, then asymptotically the product of the first two factors tends toward an identity matrix and the sandwich merely tends toward the inverted Hessian. But if the distribution is incorrectly specified, it can be shown that the sandwich nonetheless provides a consistent estimator of the repeated-sampling covariance matrix of the MLEs.
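To make that concrete, here is a toy sketch (my own illustration in plain R, not OpenMx code) for estimating a normal mean with known variance:

```r
set.seed(1)
x <- rnorm(500, mean = 2)   # data from N(2, 1)
mu <- mean(x)               # MLE of the mean (variance fixed at 1)
scores <- x - mu            # per-observation first derivative of the loglikelihood
meat  <- sum(scores^2)      # empirical cross-product of the scores ("meat")
bread <- length(x)          # negative Hessian of the loglikelihood: n / sigma^2 ("bread")
sandwich <- solve(bread) %*% meat %*% solve(bread)
c(invHessian = 1 / bread, sandwich = drop(sandwich))
```

With the model correctly specified, the two entries agree up to sampling noise; under misspecification, only the sandwich remains consistent.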
In reply to Fisher information by AdminRobK
Thank you! Do I understand
In reply to Thank you! Do I understand by falkcarl
Re: loop over the model-implied proportions of all possible resp