Hi all,
I am running a trivariate Cholesky ADE model with one binary and two continuous variables. As profile-based confidence intervals took more than 24h to run and a lot of them were missing I decided to try bootstrapping. I ran mxBootstrap with 1000 replicates to get 95% CI and all looked fine except CI's for dominant genetic correlation rD. I get the following error:
> mxEval(rD, boot$MZ)
[,1] [,2] [,3]
[1,] 1.00000000 -0.56438885 -0.57273008
[2,] -0.56438885 1.00000000 -0.32630007
[3,] -0.57273008 -0.32630007 1.00000000> mxBootstrapEval(rD, boot, bq=c(0.025,0.975))
Error: The following error occurred while evaluating the subexpression 'solve(sqrt(ADE.I * ADE.D))' during the evaluation of 'rD' in model 'ADE' : Lapack routine dgesv: system is exactly singular: U[1,1] = 0
rD was defined as following:
corD <- mxAlgebra( expression=solve(sqrt(I*D))%&%D, name ="rD" )
If I am not mistaken the reason for that is the fact that the first element of D was estimated as zero. But I don't know how to get around this problem. Running 1000 replicates for this bootstrap took more than 17 hours, so I would really like to avoid rerunning it again, but I guess this is not possible?
Thank you in advance!
Julia
Maybe. I would say that the reason is that at least one of the diagonal elements of
D
was estimated as zero (or possibly that one of the off-diagonal elements ofD
was estimated as equal to its two corresponding diagonal elements). If you doyou can see the estimates of
D
's elements from each bootstrap replication, and verify whether some of the diagonal elements are being estimated at zero. If I correctly understand howomxBootstrapEval()
works, then each row of its output corresponds to a bootstrap replication, and the elements of each row of output are the elements ofD
, in column-major order.As a workaround, you could try
to get ordinary bootstrap quantile intervals.
Perhaps
mxBootstrapEval()
needs to be made more robust to this sort of thing.Thank you, Rob! There were indeed diagonal elements (first elements) in D which were estimated as zeros. In five replications.
When I ran what you sugested next, I get the following:
I am not sure how reliable those CI's are. Would that be possible/ok to remove those replications where the first element of D got estimated as zero from the CI estimation?
I don't know how much D variance you're estimating in each of the three traits, but if it's not very much, then you can get legitimate confidence intervals for the dominance-genetic correlations that span most of [-1,1].
Since it's only 5 out of 1000 replications, I doubt it will make a difference if you drop them. In effect, the
na.rm=TRUE
is supposed to already be dropping theNA
s from the columns ofrDboot
.Thanks a lot, Rob! The D variance is indeed small.
I think I will try to increase the number of replications to see whether confidence intervals will get a bit more narrow. Is there any reference which say how many of them it is good to have?
I think sometime when I was in grad school I read a publication by Bradley Efron that had rules-of-thumb for setting B, the number of bootstrap replication, but I've never been able to find it. I'll refer you to the following two references, which contain discussion about how to select a value of B to achieve a desired level of Monte-Carlo precision:
As discussed in those two books, B can be calculated as a function of sample size, the desired coverage probability of the interval, and certain properties of the true data-generating distribution or of the statistic's true sampling distribution.