Confidence intervals are basic to most results sections now, so implementing Mx's 'Intervals' likelihood-based confidence intervals (Neale & Miller, 1997) (plus standard error-based CIs for speed) would be high on the wish-list
maybe something like
mxInterval(..., percent=95, type="ML") where ... consists of matrix addresses
mxInterval(z.A) would mean all cells of A
mxInterval(z.A[1:2,], type="SE") first two rows of A, standard-error based method
Agreed, and I like the syntax you suggest. We might make SE the default since it is faster and when statistical power is sufficient it agrees quite well with the likelihood based approach.
From the standpoint of the mxMatrix, we would now need two other matrices added to the mxMatrix's already lengthy list of matrices, an upper CI and lower CI matrix in the most general case. On top of that, we'd need a slot for the CI type. This needs some discussion in the team meetings.
We also would need to provide pathic versions of the interface.
I guess the CI matrices would be added only when CIs are requested.
In that case, instead of a type slot, might be best allow both sets of CI matrices to be added: that way people could say
mxCI(z.A, "SE");
mxCI(z.A, "ML");
and get both kinds to compare etc.
If I'm reading Tim's suggestion correctly, this works more like mxEval than an mxModel slot. I'm with Steve that we shouldn't add more to mxModel, but would a function that took either (a) parameters and standard errors, or (b) an @output slot work? Is it possible that (a) is already out there in the R community?
Hi Ryne,
I thought CIs will have to be part of mxModel, as asking for confidence intervals causes the optimiser to fit the model, then move each (requested) parameter estimate down (and up), holding the others constant, until fit declines by some number of units related to the confidence requested. But perhaps it could be handled in a separate mxEval-like mxCI() function, which would take the model and current parameters as input?
Right you are. It's going to need to be handed to the optimizer. The question is how should we do that?
I would say that if a model contains an mxCI statement, then a flag is set and when mxRun is finished running the model (i.e., when it would have returned), that flag is checked, and if true, any mxCI statements are executed by forming all the CIs requested into an array, then looping over them: setting the cell to fixed, and then sweeping the value up and down until the fit degrades to the request chi2 change.
The code for choosing new values and homing in on the point where the fit degrades below the desired CI must be in old mx for something to work off.
Hey guys, I just discovered now OpenMx so should rather shut up:-)
But is it a goal in OpenMx to keep names that probably come from MX syntax like mxInterval?
If no, you could maybe consider R generic functions intuitively known to users such as confint()?
Yes, the Mx1 code is there. What it does is to change the fit function so that there is a constraint that the -2lnL is 3.84 (in the case of 95%) higher than it was when the model was initially fitted, and either + or - the parameter in question. That is, we maximize or minimize the parameter in question, subject to the constraint that the -2lnL is worse by some predetermined amount. All parameters are free as in the original model.
Possibly, these revised fit functions won't be too bad to implement with an Algebra type fit function.
For example, using the UnivariateTwinAnalysis_MatrixRaw.R script from the demo directory, we can obtain the likelihood-based CI's as follows:
CIaupper<-mxModel(twinACEFit,mxMatrix("Full",values=mxEval(objective,twinACEFit),name="oldfit"),mxAlgebra(((oldfit+3.84)-(MZ.objective+DZ.objective))^2-A,name="upperCIa"),mxAlgebraObjective("upperCIa"))
CIalower<-mxModel(twinACEFit,mxMatrix("Full",values=mxEval(objective,twinACEFit),name="oldfit"),mxAlgebra(((oldfit+3.84)-(MZ.objective+DZ.objective))^2+A,name="lowerCIa"),mxAlgebraObjective("lowerCIa"))
then running
runCIalower<-mxRun(CIalower)
runCIalower@algebras$A
runCIaupper<-mxRun(CIaupper)
runCIaupper@algebras$A
which are the lower and upper CI's on the computed matrix element A[1,1]
Obviously, making this an automated function would be much nicer for the user! But it's pretty neat that it can be done without any programing on behalf of the OpenMx team.
Thanks mike: those are helpful user functions!
I think it would still be very well worth while to support this core functionality inside an openMx-package supported mxInterval() function, able to be added to an mxModel, calling itself recursively, or as a parameter of mxRun
t
Just a thought too: The univariate example has only a 1*1 matrix to attend too... It would take a lot of each-time programming to scale up to multivariate.
Would be good to have the ability to specify a range of cells to get CIs on.
and to be able to specify CIs on RAM mxPaths by c(name,)
Agreed - syntax of the type you suggest would be a boon to the user. Any takers for writing wrapper functions like this?
Second, on the issue of computation time, you are right, likelihood-based CI's are expensive. They can be done in parallel though, so perhaps this is another area where Drs. Wilde & Kenny may be able to help. With a 48 core laptop on the horizon, it's tempting indeed to make this example run locally in parallel.
With the zero influence I have, I second the notion of CIs in parallel. I'm waiting for hours and hours to get CIs on a common pathway ACE model. A call to snowfall and a host of independent models to calculate CIs would be absolutely fabulous!
-Scott
Ah, that would be useful. In the meantime, you could manually setup independent submodels in order to take advantage of snowfall. Be sure to place only a single confidence interval specification in each submodel. The workflow would be something like:
UPDATE: I forgot to give the submodels new names. The name collision would have resulted in only one submodel stored as a child of the topModel.
That's a great idea, thanks! Given that you replied to my post about 3 minutes after I posted it, I was able to try this last night. Unfortunately I ran into the renaming problem.
The renaming of submodels is a problem, to be sure. My ACE common pathway model has several submodels ('ACE', 'DZ','MZ'), and renaming them for each confidence interval is time-consuming. It seems some kind of built-in parallelizing confidence interval function would save a lot of people a lot of time. As others are saying, reporting confidence intervals is (and should) becoming routine. Each submodel could be automatically renamed to make it easier on the user (e.g., for the first confidence interval requested it could have 'ACE1', 'DZ1', 'MZ1', and so on).
Just thinking out loud. THanks for an extremely useful product thusfar!
-Scott
Ah yes the model naming problem is complicated when independent submodels are used. Here's a function that will take a model and all its submodels and add a prefix to their model names:
So when I tested it using
foo1 <- modelNameAddPrefix(twinACEModel, "foo1_")
the resulting model is named 'foo1_twinACE' and has three submodels 'foo1_common', 'foo1_MZ', 'foo1_DZ'. It's not exactly user-friendly at the moment, but it will help in running the confidence intervals.Oh, plus I found a bug in the function mxRename(). It will currently not rename the appropriate confidence intervals, because confidence intervals had not been implemented when mxRename() was written. But you should be specifying the confidence intervals manually after creating all the submodels for this particular workflow.
UPDATE: Oops, I forgot that child models could refer to entities in the parent models. I had to change the recursion step to apply to the entire nested model. The code has been updated.
ML CI's are inherently computationally intensive, requiring two new optimizations over as many parameters as there are in the model for each interval requested. CI's on a model with 50 parameters (or functions of parameters) would take 100 times as long as the original model. If there are lots of CI's, and raw data are input, then it can be more efficient to use the bootstrap. Here's an example using the dataset from the OpenMx homepage:
require(boot)
mles<-function(dataset,wt){
manifests <- names(dataset)
latents <- c("G")
covwt<-cov.wt(dataset,wt)
mlevals<-mxRun(mxModel("One Factor", type="RAM",
manifestVars = manifests,
latentVars = latents,
mxPath(from=latents, to=manifests),
mxPath(from=manifests, arrows=2),
mxPath(from=latents, arrows=2, free=F, values=1.0),
mxData(covwt$cov, type="cov",
numObs=500)))
return(as.vector(mlevals@output$estimate))}
boot(demoOneFactor,mles,R=1000)
Possibly a wrapper function would facilitate joint use of the boot and OpenMx packages. Note that the mlevals in the above example might be augmented to have the results of algebras included in the return statement.
could someone please post a working CI demo script that includes CIs on both an estimated and a calculated element?
thanks
Same here. Having svn upped this morning, make install, restart R, the CI's are being very coy. OpenMx seems to be doing the work but discarding the CI's from summary and output. See below, non-compliant script attached.
> require(OpenMx)
> data(demoOneFactor)
> factorModel <- mxModel("One Factor",
+ mxMatrix("Full", 5, 1, values=0.2,free=T, name="A"),
+ mxMatrix("Symm", 1, 1, values=1,free=F, name="L"),
+ mxMatrix("Diag", 5, 5, values=1,free=T, name="U"),
+ mxAlgebra(A %% L %% t(A) + U, name="R"),
+ mxCI(c('A[1,1]','R[1,1]')),
+ mxMLObjective("R", dimnames = names(demoOneFactor)),
+ mxData(cov(demoOneFactor), type="cov", numObs=500))
> factorModelOuput <- (mxRun(factorModel,intervals=T))
Running One Factor
Warning messages:
1: In model 'One Factor' while computing the lower bound for 'One Factor.A[1,1]' NPSOL returned a non-zero status code 6. The model does not satisfy the first-order optimality conditions to the required accuracy, and no improved point for the merit function could be found during the final linesearch (Mx status RED)
2: In model 'One Factor' while computing the lower bound for 'One Factor.R[1,1]' NPSOL returned a non-zero status code 6. The model does not satisfy the first-order optimality conditions to the required accuracy, and no improved point for the merit function could be found during the final linesearch (Mx status RED)
3: In model 'One Factor' while computing the upper bound for 'One Factor.A[1,1]' NPSOL returned a non-zero status code 6. The model does not satisfy the first-order optimality conditions to the required accuracy, and no improved point for the merit function could be found during the final linesearch (Mx status RED)
4: In model 'One Factor' while computing the upper bound for 'One Factor.R[1,1]' NPSOL returned a non-zero status code 6. The model does not satisfy the first-order optimality conditions to the required accuracy, and no improved point for the merit function could be found during the final linesearch (Mx status RED)
> factorModelSummary <- summary(factorModelOutput)
> factorModelOutput@output$confidenceIntervals
[,1] [,2]
>
And that's it. Summary is no help either.
Umm, I'm getting confidence intervals.
R 2.11.0 and OpenMx subversion revision 1297. I'll logon to some other machines and try running it on older versions of R. Works on R 2.9.2 and 2.10.1 also.
I don't think it's an R version thing - I am at 2.10.1
?OpenMx indicates Package OpenMx version 0.3.3-1264
but I suspect that the version number is only updated with binary releases. It coincides with what I get during the build:
tar: trunk/build/OpenMx_0.3.3-1264.tar: Can't add archive to itself
and I'm pretty sure the building is going according to plan:
Michael-Neales-Macbook-Pro-4% svn up
At revision 1297.
Michael-Neales-Macbook-Pro-4% cd trunk
Michael-Neales-Macbook-Pro-4% sudo make install
I retried this procedure after quitting R, then fired up R again and still no CI's in the summary:
> require(OpenMx)
> data(demoOneFactor)
> factorModel <- mxModel("One Factor",
+ mxMatrix("Full", 5, 1, values=0.2,free=T, name="A"),
+ mxMatrix("Symm", 1, 1, values=1,free=F, name="L"),
+ mxMatrix("Diag", 5, 5, values=1,free=T, name="U"),
+ mxAlgebra(A %% L %% t(A) + U, name="R"),
+ mxCI(c('A[1,1]','R[1,1]')),
+ mxMLObjective("R", dimnames = names(demoOneFactor)),
+ mxData(cov(demoOneFactor), type="cov", numObs=500))
> factorModelOuput <- (mxRun(factorModel,intervals=T))
> factorModelSummary <- summary(factorModelOutput)
> factorModelSummary
data:
$
One Factor.data
$
One Factor.data
$covx1 x2 x3 x4 x5
x1 0.1985443 0.1999953 0.2311884 0.2783865 0.3155943
x2 0.1999953 0.2916950 0.2924566 0.3515298 0.4019234
x3 0.2311884 0.2924566 0.3740354 0.4061291 0.4573587
x4 0.2783865 0.3515298 0.4061291 0.5332788 0.5610769
x5 0.3155943 0.4019234 0.4573587 0.5610769 0.6703023
free parameters:
name matrix row col Estimate Std.Error
1 A x1 G 0.39715212 0.015549769
2 A x2 G 0.50366111 0.018232514
3 A x3 G 0.57724141 0.020448402
4 A x4 G 0.70277369 0.024011418
5 A x5 G 0.79624998 0.026669452
6 S x1 x1 0.04081419 0.002812717
7 S x2 x2 0.03801999 0.002805794
8 S x3 x3 0.04082718 0.003152308
9 S x4 x4 0.03938706 0.003408875
10 S x5 x5 0.03628712 0.003678561
observed statistics: 15
estimated parameters: 10
degrees of freedom: 5
-2 log likelihood: -3648.281
saturated -2 log likelihood: -3655.665
number of observations: 500
chi-square: 7.384002
p: 0.1936117
AIC (Mx): -2.615998
BIC (Mx): -11.84452
adjusted BIC:
RMSEA: 0.03088043
timestamp: 2010-07-01 10:24:30
frontend time: 0.08384323 secs
backend time: 0.01688194 secs
independent submodels time: 5.698204e-05 secs
wall clock time: 0.1007822 secs
cpu time: 0.1007822 secs
openmx version number: 0.3.3-1264
Weird, huh?
Same errors as Mike. I'm on "R version 2.11.0 (2010-04-22)"
svn info tells me: Revision: 1298
Er, those seem like different errors. At least I get factorModelOutput to exist. It just lacks the confidence intervals bit. Trying nuclear option:
rm(list = ls())
DOES help.
Now I get the errors reported by Tim! But I am able to see the CI's if I don't go via the summary object. So basically the CI's are causing the creation of a summary object to fail, and my earlier confusion was caused by my inspection of a historical summary object of the same name (created by an earlier version of OpenMx which didn't have this bug).
Yes, having something fail and leave you with the last result in place of the new one you requested is such a trap I've wondered if mxRun shouldn't output a simple error string when it fails to at least stamp on the old output.
Copy/pasting Mike's input yields an error because of the typo:
factorModelOuput <- (mxRun(factorModel,intervals=T))
. The "t" is missing in Output.Oops! Sorry. Much ado about nothing. However, glad to have it resolved and to be able to attach a working script as a correct response to the inquiry from Sarah for a script with CI's on both free parameter matrix elements and mxAlgebra() computed elements.
Could you clarify this syntax for me a bit? I'm constructing a bifactor model I need CIs for and mxCI is impossible to run (raw data from HRS, large and lots of missing data). I'm just not sure how to incorporate the boot function.
How did you end up solving the question of univariate and multivariate confidence intervals? Any link to example scripts which provide CI's in the multivariate case?
Can I check that I understand this correctly and that CI's are possible (although fiddly) in the multivariate case?
Based on the code Mike wrote above, If I have 4 variables and a best fit AE model, then by specifying A for each variable (a11a11, a22a22, etc), I should be able to obtain corresponding CI's, as in;
A <- mxEval(a11*a11, multiCholAEFit)
CIaupper<-mxModel(multiCholAEFit,mxMatrix("Full",values=mxEval(objective, multiCholAEFit),name="oldfit"),mxAlgebra(((oldfit+3.84)-(MZ.objective+DZ.objective+Sib.objective))^2-A,name="upperCIa"),mxAlgebraObjective("upperCIa"))
CIalower<-mxModel(multiCholAEFit,mxMatrix("Full",values=mxEval(objective, multiCholAEFit),name="oldfit"),mxAlgebra(((oldfit+3.84)-(MZ.objective+DZ.objective+Sib.objective))^2+A,name="lowerCIa"),mxAlgebraObjective("lowerCIa"))
runCIalower<-mxRun(CIalower)
runCIalower@algebras$A
runCIaupper<-mxRun(CIaupper)
runCIaupper@algebras$A
If this is right, like Cathy and Marcel's recent post, I keep running into code reds (even when re-running an optimised fit).
I wouldn't use mxEval() for this purpose. Rather I would create a matrix within the multiCholAEFit using an mxAlgebra(). I don't know if there's a problem with mxEval() but I usually think of that as a sort of after-the-fact computation, and I'm not sure if it gets calculated during optimization. If it does, since it will parse any R expression, I would expect it to be slower than using mxAlgebra. If it does not, then I would expect code reds and no change to the value in matrix A.
Also, please be aware that a better interface for requesting & estimating confidence intervals is due to be released shortly.
That's correct. mxEval() evaluates its expression using the current state of the model, and cannot be updated during a run of a model. Use mxAlgebra() inside a model to create an expression that will be re-calculated as the optimizer traverses the search space.
And yes, a confidence interval interface will be released in OpenMx 1.0.