bootstrap coverage probability and the bootstrap replication

Veronica_echo's picture
Joined: 02/23/2018 - 01:57
bootstrap coverage probability and the bootstrap replication

Hi everyone,

I am conducting a simulation study of a growth model and would like to evaluate its bootstrap coverage probability (CP). I kept the number of simulation replications at 1000 and set the number of bootstrap replications to 1000 and 2000, respectively. The results seemed weird: the CPs with 1000 bootstrap replications (all of them fell within (0.93, 0.97)) were much better than those with 2000 bootstrap replications (some CPs were quite low, say 0.86). Any advice about this issue? Should I increase the number of bootstrap replications to a larger number, say 5000? Thank you in advance.

AdminRobK's picture
Joined: 01/24/2014 - 12:15
Increase bootstrap replications

If your target coverage probability is 0.95 (as your post suggests), then increasing the number of bootstrap replications to 5000 makes sense.
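
As a rough yardstick for judging the reported CPs, here is a back-of-the-envelope calculation. It assumes the 1000 simulation replications are independent and that the true coverage equals the nominal $p = 0.95$; the symbols $R$ (number of simulation replications) and $\widehat{\mathrm{CP}}$ are notation introduced only for this note. Under those assumptions, the Monte Carlo standard error of an estimated coverage probability is:

```latex
\mathrm{SE}\bigl(\widehat{\mathrm{CP}}\bigr)
  = \sqrt{\frac{p\,(1-p)}{R}}
  = \sqrt{\frac{0.95 \times 0.05}{1000}}
  \approx 0.0069
```

By that yardstick, values in (0.93, 0.97) lie within about three standard errors of 0.95, whereas 0.86 is roughly 13 standard errors below it, so simulation noise alone is unlikely to account for it. This only covers the error from the finite number of simulation replications, not the error from the finite number of bootstrap replications.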

Veronica_echo's picture
Thanks, and one more question

Thanks for your kind advice! I am trying it now. I have one more question about the simulation study. While collecting 1,000 effective replications (i.e., no errors, no warnings), I got 283 errors (details are in the attachment) and 14 warnings (code 6). May I know some possible solutions to this issue? I am using the "true" values of the parameters (the ones I set to generate the data) as initial values when fitting the model, so I assume nothing can be done for that part. I am using the default optimizer; do I need to try the other two? Any advice would be appreciated!

AdminRobK's picture
some tips
> While collecting 1,000 effective replications (i.e., no errors, no warnings), I got 283 errors (details are in the attachment) and 14 warnings (code 6). May I know some possible solutions to this issue?

If you're using mxRun() to initially (i.e., before any bootstrapping or jackknifing) run the model in each replication, you could replace mxRun() with mxTryHard() or one of its wrappers. If you do, you'll probably want to read the man page for mxTryHard().

Are you running your simulation in a 'for' loop? If so, you could instead run it in a 'while' loop, which will keep running until it gives you 1000 "effective" replications, e.g.,

i <- 1
while (i <= 1000) {
  # do simulation stuff here;
  # be sure to define the logical variable `modelRanWell` somewhere in here:
  # TRUE if there were no errors or warnings, FALSE otherwise
  if (modelRanWell) {
    i <- i + 1
  }
}
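
One way the `modelRanWell` flag in the loop above might be set is with base R's tryCatch(). The following is only a minimal sketch: `fitOneReplication()` is a hypothetical stand-in that fails or warns at random, where a real study would generate a dataset and call mxRun() or mxTryHard(); `nTarget`, `results`, and `attempts` are likewise names made up for illustration.

```r
set.seed(1)

# Hypothetical stand-in for "simulate data and fit the model". It randomly
# signals an error or a warning so that the loop logic can be exercised.
fitOneReplication <- function() {
  u <- runif(1)
  if (u < 0.20) stop("simulated fitting error")
  if (u < 0.25) warning("simulated optimizer warning")
  rnorm(1)  # placeholder for the estimates of interest
}

nTarget  <- 20  # 1000 in the thread; kept small here
results  <- vector("list", nTarget)
attempts <- 0
i <- 1
while (i <= nTarget) {
  attempts <- attempts + 1
  # tryCatch() returns NULL here if the fit signals an error or a warning
  fit <- tryCatch(fitOneReplication(),
                  error   = function(e) NULL,
                  warning = function(w) NULL)
  modelRanWell <- !is.null(fit)
  if (modelRanWell) {
    results[[i]] <- fit
    i <- i + 1
  }
}
cat("effective replications:", nTarget, "of", attempts, "attempts\n")
```

Counting `attempts` alongside the effective replications makes it easy to report the failure rate, which is worth doing because discarding failed replications can itself affect the estimated coverage.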

> I am using the "true" values of the parameters (the ones I set to generate the data) as initial values when fitting the model, so I assume nothing can be done for that part.

I don't know what sort of model you're using to generate and fit to data. But, the error message in your attachment makes it sound as though the model-expected covariance matrix was non-PD at the start values. Using mxTryHard() should help in cases where the start values are poor for the current dataset. You could also calculate empirical start values from the dataset in each replication, but I can't make any specific suggestions there without knowing more about the model you're fitting.

> I am using the default optimizer; do I need to try the other two?

None of the 3 main optimizers can get off the ground if the covariance matrix is non-PD at the start values, and only 14 warnings in 1000 replications is pretty good. It sounds as though CSOLNP, which is the on-load default optimizer, is working well for you. Using mxTryHard(), if you're not already, should really cut down on the number of errors and warnings you get.

Veronica_echo's picture
Thanks for your kind and

Thanks for your kind and prompt advice. I am going to use mxTryHard() instead of mxRun() to make multiple attempts. For the simulation, I am using a repeat loop with the try() function, which I guess is similar to the while loop. I am fitting a growth curve model; would empirical initial values be helpful? Thank you very much!

AdminRobK's picture
sounds good
> For the simulation, I am using a repeat loop with the try() function, which I guess is similar to the while loop.

OK. I guess your script just breaks out of the loop eventually, when some criterion is satisfied?

> I am fitting a growth curve model; would empirical initial values be helpful?

I bet you could get really good start values via lmer(), from package 'lme4'.

I'm a bit surprised that fitting a growth-curve model at its true parameter values, to data generated under that model, would lead to a non-PD covariance matrix.

Veronica_echo's picture
Thanks for your kind and prompt reply.

Yes, when the number of effective replications reaches 1000, it breaks out of the loop.

My growth model has definition variables; might that explain the non-PD issue? If so, could I reduce the number of errors by narrowing the range of the definition variables (current setting: scaled, equally spaced time points $t_{j}$, with $dv \sim \mathrm{unif}(t_{j}-0.45,\, t_{j}+0.45)$)?

When I use lme4::lmer() or nlme::nlme(), should I use "reml" instead of "ml"?
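
For concreteness, the definition-variable setting described above might be generated as follows. This is only a sketch: the number of subjects, the number of occasions, and the occasion values 0 to 4 are assumptions made for illustration; only the half-width of 0.45 comes from the setting stated in this post.

```r
set.seed(2)

nSubjects <- 5     # assumed for illustration
tj        <- 0:4   # assumed equally spaced measurement occasions
halfWidth <- 0.45  # half-width of the uniform window, as stated above

# One row per subject, one column per occasion; column j holds t_j
tMat <- matrix(rep(tj, each = nSubjects), nrow = nSubjects)

# Each individual measurement time is drawn uniformly from
# (t_j - halfWidth, t_j + halfWidth)
jitter     <- matrix(runif(nSubjects * length(tj), -halfWidth, halfWidth),
                     nrow = nSubjects)
indivTimes <- tMat + jitter
colnames(indivTimes) <- paste0("T", seq_along(tj))

round(indivTimes, 2)
```

Narrowing the range, as asked above, would correspond to lowering `halfWidth`.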

AdminRobK's picture
lme4; definition variables
> When I use lme4::lmer() or nlme::nlme(), should I use "reml" instead of "ml"?

"ml" is actually closer to what OpenMx does than "reml", but it shouldn't matter much for your purposes, since you're just trying to get start values.
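
A minimal sketch of what getting start values from lmer() could look like, assuming a linear growth curve with a random intercept and random slope. The simulated data, sample size, parameter values, and the names `startMeans`, `startCov`, and `startResid` are all made up for illustration; your actual model may need a different formula.

```r
library(lme4)
set.seed(3)

# Simulate a small linear growth dataset (all values are assumed)
nSubjects <- 200
tj   <- 0:4
id   <- rep(seq_len(nSubjects), each = length(tj))
time <- rep(tj, times = nSubjects)
b0   <- rnorm(nSubjects, mean = 10, sd = 2)    # random intercepts
b1   <- rnorm(nSubjects, mean = 1,  sd = 0.5)  # random slopes
y    <- b0[id] + b1[id] * time + rnorm(length(id), sd = 1)
d    <- data.frame(id = factor(id), time = time, y = y)

# REML = FALSE requests maximum likelihood ("ml") rather than REML
fit <- lmer(y ~ time + (time | id), data = d, REML = FALSE)

startMeans <- fixef(fit)        # mean intercept and mean slope
startCov   <- VarCorr(fit)$id   # 2 x 2 covariance of intercept and slope
startResid <- sigma(fit)^2      # residual variance

print(startMeans)
print(startCov)
print(startResid)
```

With individually varying measurement times, the `time` column would hold each subject's actual times (the definition variables) rather than the nominal occasions.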

> My growth model has definition variables; might that explain the non-PD issue?

I'm not sure, though I doubt it.