Confidence Intervals

Posted on
a9mike Joined: 06/28/2013
I'm not sure of the best way to get confidence intervals for my estimates. I'm doing a bifactor model with a large dataset (HRS) with lots of missing data, and mxCI is taking impossibly long (on the order of days).

Anyone have suggestions? Would bootstrapping be faster? If so, what would that script look like? (I've never bootstrapped before)

Replied on Thu, 01/30/2014 - 19:54
bwiernik Joined: 01/30/2014

Bifactor models typically have issues with likelihood-based confidence intervals, especially if one of your group factors shouldn't really be there (i.e., it's entirely defined by the specific variance of one or two items, because the items that load on the group factor really only load on the general factor).

You're right that you should use bootstrapping. If you're not familiar with bootstrapping, the idea is that you create a sampling distribution for a statistic by repeatedly drawing samples with replacement from your data and then computing the statistic for each redrawn sample.

You have two major choices when it comes to bootstrapping. If you have raw data, you can use empirical bootstrapping, where you draw samples from your actual data and run models on them. As an alternative, if you are willing to accept distributional assumptions, you can combine bootstrapping with Monte Carlo simulation and generate simulated samples based on the expected or observed covariance matrix from your model. This is your only option if you don't have raw data, and it has also received quite a bit of support in simulation studies.
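In miniature, one replication of each flavour looks like this (just a sketch; myData stands in for your raw data matrix and expCov for your model's expected covariance matrix):

library(MASS) # for mvrnorm()
N <- nrow(myData)
# Empirical bootstrap: resample rows of the raw data with replacement
bootData <- myData[sample.int(N, size = N, replace = TRUE), ]
# Parametric (Monte Carlo) bootstrap: simulate data from a covariance matrix
simData <- mvrnorm(N, mu = rep(0, ncol(expCov)), Sigma = expCov)

Either way, you refit the model to each bootstrap sample and collect the parameter estimates.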

Here is a set of functions to generate bootstrapped confidence intervals for your model.

First, this is a helper function to extract the expected covariance matrix from an mxModel:
umxCov <- function(model, latent = TRUE, manifest = TRUE) {
    # Assumes a RAM-type model with A (asymmetric) and S (symmetric) matrices
    mA <- mxEval(A, model)
    mS <- mxEval(S, model)
    mI <- diag(1, nrow(mA))
    mE <- solve(mI - mA)
    mCov <- mE %*% mS %*% t(mE) # the model-implied covariance matrix: (I - A)^-1 S [(I - A)^-1]'
    mV <- NULL
    if (latent) mV <- model@latentVars
    if (manifest) mV <- c(mV, model@manifestVars)
    return(mCov[mV, mV]) # return only the selected variables
}
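For example, once you have a fitted RAM model (the name myFit below is just a placeholder), you can pull the expected covariances among the manifest variables like this:

expCov <- umxCov(myFit, latent = FALSE, manifest = TRUE)
round(expCov, 3)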

Second, here is the actual bootstrap function. It has these parameters:
- model is an optimized mxModel
- samp is the raw data matrix used to estimate model (only needed for the empirical bootstrap)
- type is the kind of bootstrap you want to run. "par.expected" and "par.observed" use parametric Monte Carlo bootstrapping based on your expected and observed covariance matrices, respectively. "empirical" uses empirical bootstrapping based on samp.
- std specifies whether you want CIs for unstandardized or standardized parameters
- rep is the number of bootstrap samples to compute
- conf is the confidence level, as a percentage (e.g., 95)
- dat specifies whether you want to store the bootstrapped parameter estimates in the output (useful if you want to do several different analyses, such as mediation analyses)
umxCIpboot = function(model, samp = NULL, type = c("par.expected", "par.observed", "empirical"), std = TRUE, rep = 1000, conf = 95, dat = FALSE) {
    require(MASS) # for mvrnorm(), used by the parametric bootstraps
    # Choose the covariance matrix to simulate from (parametric types only)
    if (type == "par.expected") {
        exp = umxCov(model, latent = FALSE)
    }
    if (type == "par.observed") {
        if (model$data@type == "raw") {
            exp = var(mxEval(data, model))
        } else if (model$data@type == "sscp") {
            exp = mxEval(data, model) / (model$data@numObs - 1)
        } else {
            exp = mxEval(data, model)
        }
    }
    N = round(model@data@numObs)
    # Seed the results with the point estimates from the fitted model;
    # column 5 of summary()$parameters holds the raw estimates, and adding 2*std shifts to the standardized column when std = TRUE
    pard = t(data.frame("mod" = summary(model)$parameters[, 5 + 2*std], row.names = summary(model)$parameters[, 1]))
    pb = txtProgressBar(min = 0, max = rep, label = "Computing confidence intervals", style = 3)
#####
if(type=="empirical") {
if(length(samp)==0) {
if(model$data@type=="raw") samp = mxEval(data,model) else stop("No raw data supplied for empirical bootstrap.")
}
for(i in 1:rep){
bsample.i = sample.int(N,size=N,replace=TRUE)
bsample = var(samp[bsample.i,])
mod = mxRun(mxModel(model,mxData(observed=bsample,type="cov",numObs=N)),silent=TRUE)
pard = rbind(pard,summary(mod)$parameters[,5+2*std])
rownames(pard)[nrow(pard)]=i
setTxtProgressBar(pb,i)
}
}
else {
for(i in 1:rep){
bsample = var(mvrnorm(N,rep(0,nrow(exp)),exp))
mod = mxRun(mxModel(model,mxData(observed=bsample,type="cov",numObs=N)),silent=TRUE)
pard = rbind(pard,summary(mod)$parameters[,5+2*std])
rownames(pard)[nrow(pard)]=i
setTxtProgressBar(pb,i)
}
}
    # Percentile confidence limits from the bootstrap distribution of each parameter
    low = (1 - conf/100) / 2
    upp = (1 - conf/100) / 2 + conf/100
    LL = apply(pard, 2, FUN = quantile, probs = low) # lower limit of confidence interval
    UL = apply(pard, 2, FUN = quantile, probs = upp) # upper limit of confidence interval
    LL4 = round(LL, 4)
    UL4 = round(UL, 4)
    ci = cbind(LL4, UL4)
    colnames(ci) = c(paste(low*100, "%", sep = ""), paste(upp*100, "%", sep = ""))
    p = summary(model)$parameters[, c(1, 2, 3, 4, 5:6 + 2*std)]
    if (dat) return(list("Type" = type, "bootdat" = data.frame(pard), "CI" = cbind(p, ci)))
    return(list("CI" = cbind(p, ci)))
}
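As a usage sketch (the model name bifactorFit below is just a placeholder), you might run:

bootCI = umxCIpboot(bifactorFit, type = "par.expected", std = TRUE, rep = 1000, conf = 95)
bootCI$CI

The empirical bootstrap works the same way, but also needs the raw data matrix passed in as samp.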

Replied on Wed, 02/05/2014 - 15:56
tbates Joined: 07/31/2009

In reply to by bwiernik

That's a nice pair of functions!

I've added them to the umx library with some modifications, crediting this conversation and your name. Comments on all umx package functions are welcome.


library(devtools)
install_github("tbates/umx")
library(umx)
?umxCI_boot #shows usage example
?umxGetExpectedCov

Replied on Sun, 03/01/2015 - 11:36
martonandko Joined: 02/19/2015

In reply to by tbates

I'm trying to get bootstrapped CIs for the estimated parameters (A, C, E) from Neale's posted twin ACE script (http://openmx.psyc.virginia.edu/sites/default/files/UnivariateTwinAnalysis_MatrixRaw-3.R) using the umxCI_boot function, but I get the following error message:

umxCI_boot(model = twinACEModel, type = "par.observed")
Error in umxCI_boot(model = twinACEModel, type = "par.observed") :
trying to get slot "type" from an object of a basic class ("NULL") with no slots

What am I doing wrong? What is the difference between the type = "par.observed" and type = "par.expected" options?

Replied on Sun, 03/01/2015 - 14:40
martonandko Joined: 02/19/2015

In reply to by mhunter

Thank you for your quick reply to both of my questions! Unfortunately, I still get the same error message as before after updating OpenMx and umx. I'm attaching the model and the fitted object (I just changed the file extension to .txt so I could upload it). Could someone have a look at it and see what the problem might be?
Replied on Tue, 03/24/2015 - 18:56
bwiernik Joined: 01/30/2014

In reply to by martonandko

The umxCI_boot function is looking for an MxData object in the model from which to run the bootstraps. Neither of the two models you included in your file have anything in their data slot. You can see this by running:

model$data

Without a data object, the bootstrap function can't simulate samples or resample data points to do the bootstrap. How did you estimate twinACEFit without an mxData statement?
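(For reference, in multi-group twin scripts the data usually sits inside the group submodels rather than in the container model. Assuming the usual MZ/DZ submodel names, which is a guess on my part, the check would look like:

twinACEModel$data    # NULL at the top level
twinACEModel$MZ$data # mxData object inside the MZ submodel
twinACEModel$DZ$data # and inside the DZ submodel
)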

Replied on Wed, 03/25/2015 - 17:29
bwiernik Joined: 01/30/2014

In reply to by RobK

I see. I don't do behavioral genetics, so I haven't had much need to use submodels. When I wrote the umxCI_boot function, I didn't have submodels in mind. The function looks for the parameters and data objects for the top-level model. At present, it doesn't work with mxModel objects where the data and parameters are stored in different submodels. I can't make any promises as to if or when I will have the time to expand its functionality, though I know tbates always welcomes patches and pull requests for the umx package. https://github.com/tbates/umx
Replied on Sun, 03/29/2015 - 04:51
martonandko Joined: 02/19/2015

In reply to by bwiernik

Thank you all for looking into the problem! So basically, I could just run my model in a for loop, resampling my original sample with replacement (say, 2000 times), calculate the path coefficients and standardized variance components each time, and then take the 2.5% and 97.5% percentiles of those distributions to get the CIs? Is there an easy way to replace only the raw data of the submodels? I tried using the omxSetParameters command but can't get it to work, so I end up redefining the whole model in every iteration.
Replied on Sun, 03/29/2015 - 11:40
neale Joined: 07/31/2009

In reply to by martonandko

In, e.g., a factor model, there can be some invariance to sign: a factor model in which all loadings are positive gives the same fit as one in which all loadings are negative. You would want to, e.g., constrain the first loading to be positive to avoid very wide (incorrect) estimated bootstrap CIs.
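A minimal sketch of that constraint (the parameter label "l1" is hypothetical; substitute the label of your first loading):

model <- omxSetParameters(model, labels = "l1", lbound = 0)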

To replace data, I would use something like

newBootStrapSampleMxData <- mxData(observed = newStuff, type = "raw")
mymodel <- mxModel(mymodel, newBootStrapSampleMxData)
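In a twin model the same idea applies per submodel. A rough sketch (the submodel names MZ and DZ and the data frames mzData and dzData follow the usual twin-script conventions and are assumptions here):

nBoot <- 2000
bootPars <- vector("list", nBoot)
for (i in 1:nBoot) {
    mzBoot <- mzData[sample.int(nrow(mzData), replace = TRUE), ]
    dzBoot <- dzData[sample.int(nrow(dzData), replace = TRUE), ]
    bootModel <- mxModel(twinACEModel,
                         mxModel(twinACEModel$MZ, mxData(observed = mzBoot, type = "raw")),
                         mxModel(twinACEModel$DZ, mxData(observed = dzBoot, type = "raw")))
    bootFit <- mxRun(bootModel, silent = TRUE)
    bootPars[[i]] <- omxGetParameters(bootFit)
}
# Percentile CIs from the bootstrap distribution of each free parameter
apply(do.call(rbind, bootPars), 2, quantile, probs = c(0.025, 0.975))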

Replied on Tue, 06/02/2015 - 10:57
martonandko Joined: 02/19/2015

In reply to by neale

Thanks a lot for straightening things out! For univariate twin ACE/ADE models, would calculating standardized path coefficients (A, C/D, E) for each sample resolve this problem, since they are always positive?
In the case of small samples, where a bootstrap sample might have a distorted distribution and the model might fail to converge, is it reasonable to include only the bootstrap samples for which the model converged, i.e., where the Mx status is Green?
Replied on Mon, 02/03/2014 - 17:40
neale Joined: 07/31/2009

I assume you have not been able to obtain conventional standard errors? If they look reasonable, they can be used to derive confidence intervals. If they are not available, then take a look at identification. Although the model is likely identified in principle, it may suffer from empirical under-identification because two factors are essentially identical in structure. A signature of this would be paths estimated at or close to zero. Possibly, fixing one or two of them to zero would clear up the standard error calculation and alleviate the need to estimate likelihood-based or bootstrap CIs.
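A sketch of that fix (the label "G2_x5" is hypothetical; use the label of whichever path is estimated near zero in your model):

model <- omxSetParameters(model, labels = "G2_x5", free = FALSE, values = 0)
fit <- mxRun(model)
summary(fit) # check whether the standard errors now look reasonable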

With factor models, there is a risk that factor loadings from a factor could all flip sign, and provide identical fit. This type of under-identification could mess up all three types of confidence interval estimation, but I think you are immune to that potential problem here because you have fixed a factor loading to 1 and estimated factor variances instead.