System is exactly singular

Posted on
Picture of user. Julia Joined: 03/29/2012
Hi all,

I am running a trivariate Cholesky ADE model with one binary and two continuous variables. As profile-based confidence intervals took more than 24h to run and a lot of them were missing I decided to try bootstrapping. I ran mxBootstrap with 1000 replicates to get 95% CI and all looked fine except CI's for dominant genetic correlation rD. I get the following error:

> mxEval(rD, boot$MZ)
[,1] [,2] [,3]
[1,] 1.00000000 -0.56438885 -0.57273008
[2,] -0.56438885 1.00000000 -0.32630007
[3,] -0.57273008 -0.32630007 1.00000000

> mxBootstrapEval(rD, boot, bq=c(0.025,0.975))
Error: The following error occurred while evaluating the subexpression 'solve(sqrt(ADE.I * ADE.D))' during the evaluation of 'rD' in model 'ADE' : Lapack routine dgesv: system is exactly singular: U[1,1] = 0

rD was defined as following:

corD <- mxAlgebra( expression=solve(sqrt(I*D))%&%D, name ="rD" )

If I am not mistaken the reason for that is the fact that the first element of D was estimated as zero. But I don't know how to get around this problem. Running 1000 replicates for this bootstrap took more than 17 hours, so I would really like to avoid rerunning it again, but I guess this is not possible?

Thank you in advance!
Julia

Replied on Tue, 05/22/2018 - 15:52
Picture of user. AdminRobK Joined: 01/24/2014

If I am not mistaken the reason for that is the fact that the first element of D was estimated as zero.

Maybe. I would say that the reason is that at least one of the diagonal elements of D was estimated as zero (or possibly that one of the off-diagonal elements of D was estimated as equal to its two corresponding diagonal elements). If you do

omxBootstrapEval(D, boot)

you can see the estimates of D's elements from each bootstrap replication, and verify whether some of the diagonal elements are being estimated at zero. If I correctly understand how omxBootstrapEval() works, then each row of its output corresponds to a bootstrap replication, and the elements of each row of output are the elements of D, in column-major order.

As a workaround, you could try

rDboot <- omxBootstrapEval(cov2cor(D),boot)
apply(rDboot, 2, quantile, probs=c(0.025,0.975), type=1, na.rm=TRUE)

to get ordinary bootstrap quantile intervals.

Perhaps mxBootstrapEval() needs to be made more robust to this sort of thing.

Replied on Wed, 05/23/2018 - 14:27
Picture of user. Julia Joined: 03/29/2012

In reply to by AdminRobK

Thank you, Rob! There were indeed diagonal elements (first elements) in D which were estimated as zeros. In five replications.

> dummy = omxBootstrapEval(D, boot)
> which(dummy[,1]==0)
[1] 77 87 125 972 976
> which(dummy[,5]==0)
integer(0)
> which(dummy[,9]==0)
integer(0)

When I ran what you sugested next, I get the following:

> rDboot <- omxBootstrapEval(cov2cor(D),boot)
Warning messages:
1: In cov2cor(c(0, 0, 0, 0, 0.0436062325150783, 0.0158601213564099, :
diag(.) had 0 or NA entries; non-finite result is doubtful
2: In cov2cor(c(0, 0, 0, 0, 0.0602353527164806, 0.0225896913128754, :
diag(.) had 0 or NA entries; non-finite result is doubtful
3: In cov2cor(c(0, 0, 0, 0, 0.0105604458655485, 0.00794684550788506, :
diag(.) had 0 or NA entries; non-finite result is doubtful
4: In cov2cor(c(0, 0, 0, 0, 0.0433256522219743, 0.0104454428727477, :
diag(.) had 0 or NA entries; non-finite result is doubtful
5: In cov2cor(c(0, 0, 0, 0, 0.00102029470219143, 0.00657858226215936, :
diag(.) had 0 or NA entries; non-finite result is doubtful
> apply(rDboot, 2, quantile, probs=c(0.025,0.975), type=1, na.rm=TRUE)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
2.5% 1 -0.9995622 -0.9999974 -0.9995622 1 -0.9999944 -0.9999974 -0.9999944 1
97.5% 1 0.9999901 0.4048525 0.9999901 1 0.9950811 0.4048525 0.9950811 1

I am not sure how reliable those CI's are. Would that be possible/ok to remove those replications where the first element of D got estimated as zero from the CI estimation?

Replied on Wed, 05/23/2018 - 15:32
Picture of user. AdminRobK Joined: 01/24/2014

In reply to by Julia

I am not sure how reliable those CI's are.

I don't know how much D variance you're estimating in each of the three traits, but if it's not very much, then you can get legitimate confidence intervals for the dominance-genetic correlations that span most of [-1,1].

Would that be possible/ok to remove those replications where the first element of D got estimated as zero from the CI estimation?

Since it's only 5 out of 1000 replications, I doubt it will make a difference if you drop them. In effect, the na.rm=TRUE is supposed to already be dropping the NAs from the columns of rDboot.

Replied on Fri, 05/25/2018 - 14:22
Picture of user. Julia Joined: 03/29/2012

In reply to by AdminRobK

Thanks a lot, Rob! The D variance is indeed small.

I think I will try to increase the number of replications to see whether confidence intervals will get a bit more narrow. Is there any reference which say how many of them it is good to have?

Replied on Mon, 05/28/2018 - 12:36
Picture of user. AdminRobK Joined: 01/24/2014

In reply to by Julia

Is there any reference which say how many of them it is good to have?

I think sometime when I was in grad school I read a publication by Bradley Efron that had rules-of-thumb for setting B, the number of bootstrap replication, but I've never been able to find it. I'll refer you to the following two references, which contain discussion about how to select a value of B to achieve a desired level of Monte-Carlo precision:

  • Efron, B., & Tibshirani, R. J. (1994). An Introduction to the Bootstrap. Boca Raton: Taylor & Francis Group.
  • Shao, J., & Tu, D. (1995). The Jackknife and Bootstrap. New York: Springer-Verlag New York, Inc.

    As discussed in those two books, B can be calculated as a function of sample size, the desired coverage probability of the interval, and certain properties of the true data-generating distribution or of the statistic's true sampling distribution.