Hi everyone,
I am conducting a simulation study of a growth model and would like to evaluate its bootstrap coverage probability (CP). I set the number of simulation replications to 1000, and tried 1000 and 2000 bootstrap replications, respectively. The results seem weird: the CPs with 1000 bootstrap replications (all between 0.93 and 0.97) were much better than those with 2000 (some CPs were quite low, say 0.86). Any advice about this issue? Should I increase the number of bootstrap replications to a larger number, say 5000? Thank you in advance.
If your target coverage probability is 0.95 (as your post suggests), then increasing the number of bootstrap replications to 5000 makes sense.
Thanks for your kind advice! I am trying it now. I have one more question about the simulation study. In the process of obtaining 1,000 effective replications (i.e., no errors, no warnings), I got 283 errors (details can be found in the attachment) and 14 warnings (status code 6). May I know some possible solutions to this issue? I am using the "true" values of the parameters (the ones I set to generate the data) as initial values when fitting the model, so I assume nothing can be done for that part. I am using the default optimizer; do I need to try the other two? Any advice would be appreciated!
If you're using mxRun() to initially (i.e., before any bootstrapping or jackknifing) run the model in each replication, you could replace mxRun() with mxTryHard() or one of its wrappers. If you do, you'll probably want to read the man page for mxTryHard().

Are you running your simulation in a 'for' loop? If so, you could instead run it in a 'while' loop, which will keep running until it gives you 1000 "effective" replications.
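A minimal sketch of that 'while'-loop idea, with placeholders for things I can't know from your post: `myModel` stands for your fitted MxModel and `genData()` for whatever function you use to simulate one dataset.

```r
library(OpenMx)

# Keep fitting until we have 1000 "effective" (error- and warning-free) fits.
# 'myModel' and 'genData()' are placeholders for your own model and
# data-generating function.
nEffective <- 0
results <- list()
while (nEffective < 1000) {
  dat <- genData()  # simulate one dataset
  fit <- try(mxRun(mxModel(myModel, mxData(dat, type = "raw"))),
             silent = TRUE)
  if (inherits(fit, "try-error")) next     # error: discard and redraw
  if (fit$output$status$code > 1) next     # nonzero/red status: discard too
  nEffective <- nEffective + 1
  results[[nEffective]] <- fit
}
```

(Status codes 0 and 1 are the "OK" codes; anything above that is treated here as a warning and the replication is redrawn.)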
I don't know what sort of model you're using to generate and fit the data. But the error message in your attachment makes it sound as though the model-expected covariance matrix was non-PD (not positive-definite) at the start values. Using mxTryHard() should help in cases where the start values are poor for the current dataset. You could also calculate empirical start values from the dataset in each replication, but I can't make any specific suggestions there without knowing more about the model you're fitting.

None of the 3 main optimizers can get off the ground if the covariance matrix is non-PD at the start values, and only 14 warnings in 1000 replications is pretty good. It sounds as though CSOLNP, which is the on-load default optimizer, is working well for you. Using mxTryHard(), if you're not already, should really cut down on the number of errors and warnings you get.

Thanks for your kind and prompt advice. I am going to use mxTryHard() instead of mxRun() to make multiple tries. For the simulation, I am using a repeat loop with the try() function, which I guess is similar to the while loop. I am fitting a growth curve model; would empirical initial values be helpful? Thank you very much!
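For what it's worth, the repeat/try() version might look something like this (again a sketch, with `myModel` and `genData()` as hypothetical placeholders for your model and data-generating function), now with mxTryHard() swapped in for mxRun():

```r
library(OpenMx)

# repeat-loop equivalent of the 'while' loop: stop once 1000 effective
# replications are accumulated. 'myModel' and 'genData()' are placeholders.
nEffective <- 0
repeat {
  dat <- genData()
  fit <- try(mxTryHard(mxModel(myModel, mxData(dat, type = "raw"))),
             silent = TRUE)
  ok <- !inherits(fit, "try-error") && fit$output$status$code <= 1
  if (ok) nEffective <- nEffective + 1
  if (nEffective >= 1000) break  # criterion satisfied: leave the loop
}
```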
OK. I guess your script just breaks out of the loop eventually, when some criterion is satisfied?

I bet you could get really good start values via lmer(), from package 'lme4'.

I'm a bit surprised that fitting a growth-curve model at its true parameter values, to data generated under that model, would lead to a non-PD covariance matrix.
Yes, when the number of effective replications reaches 1000, it breaks out of the loop.

My growth model has definition variables; might that explain the non-PD issue? If so, could I reduce the number of errors by decreasing the range of the definition variables (current setting: scaled, equally-spaced times with $dv \sim \mathrm{Unif}(t_{j}-0.45,\, t_{j}+0.45)$)?
When I use lme4::lmer() or nlme::nlme(), should I use "reml" instead of "ml"?
"ml" is actually closer to what OpenMx does than "reml", but it shouldn't matter much for your purposes, since you're just trying to get start values.
I'm not sure, though I doubt it.