CI estimation of Trivariate Cholesky model

Posted on
No user picture. Liz Joined: 04/20/2016
Recently I was trying a trivariate Cholesky model with the latest version of OpenMx. However, when I estimated the standardized CI of parameters, it reported “Error in runHelper(model, frontendStart, intervals, silent, suppressWarnings, : NLOPT fatal error -1”. Then I followed the advices of Rob replied to another topic that change the optimizer from “SLSQP” to “NPSOL”. Although it run smoothly, the results contained minus value and bigger than 1 that were very strange, such as:

CholACE.StandardizedA[1,1] 4.835575e-02 0.4772359455 8.043944e-01
CholACE.StandardizedA[2,1] -3.001322e+03 1.6422971828 1.889364e+03
CholACE.StandardizedA[3,1] -4.236242e+02 -0.2316164474 1.479296e+00

The genetic correlation seemed also unreasonable:

CholACE.CorA[1,1] 1.000000e+00 1.0000000000 1.000000e+00 !!!
CholACE.CorA[2,1] -9.999988e-01 -0.9999978673 9.999902e-01
CholACE.CorA[3,1] -9.999983e-01 0.9999635762 9.999960e-01

I also checked my dataset which contained no singular value. I totally have no idea about this situation. Could anyone help me find the problems of my codes (as follow)? Many thanks!

meVals <- c(0.5,0.5,0.5) # start value for means
meanLabs <- paste(Vars,"mean",sep="") # create labels for means
paVal <- .6 # start value for path coefficient
paLBo <- .0001 # start value for lower bounds
paValD <- vech(diag(paVal,nv,nv)) # start values for diagonal of covariance matrix
paLBoD <- diag(paLBo,nv,nv) # lower bounds for diagonal of covariance matrix
paLBoD[lower.tri(paLBoD)] <- -10 # lower bounds for below diagonal elements
paLBoD[upper.tri(paLBoD)] <- NA # lower bounds for above diagonal elements

# Create Labels for Lower Triangular Matrices
aLabs <- paste("a",rev(nv+1-sequence(1:nv)),rep(1:nv,nv:1),sep="_")
cLabs <- paste("c",rev(nv+1-sequence(1:nv)),rep(1:nv,nv:1),sep="_")
eLabs <- paste("e",rev(nv+1-sequence(1:nv)),rep(1:nv,nv:1),sep="_")

# Matrices declared to store a, c, and e Path Coefficients
pathA <- mxMatrix( type="Lower", nrow=nv, ncol=nv, free=TRUE,
values=paValD, labels=aLabs, lbound=paLBoD, name="a" )
pathC <- mxMatrix( type="Lower", nrow=nv, ncol=nv, free=TRUE,
values=paValD, labels=cLabs, lbound=paLBoD, name="c" )
pathE <- mxMatrix( type="Lower", nrow=nv, ncol=nv, free=TRUE,
values=paValD, labels=eLabs, lbound=paLBoD, name="e" )

# Matrices generated to hold A, C, and E computed Variance Components
covA <- mxAlgebra( expression=a %*% t(a), name="A" )
covC <- mxAlgebra( expression=c %*% t(c), name="C" )
covE <- mxAlgebra( expression=e %*% t(e), name="E" )

# Algebra to compute total variances and standard deviations (diagonal only)
covP <- mxAlgebra( expression=A+C+E, name="V" )
matI <- mxMatrix( type="Iden", nrow=nv, ncol=nv, name="I")
invSD <- mxAlgebra( expression=solve(sqrt(I*V)), name="iSD")

# Algebra for expected Mean and Variance/Covariance Matrices in MZ & DZ twins
meanG <- mxMatrix( type="Full", nrow=1, ncol=nv, free=TRUE,
values=meVals, labels=meanLabs, name="Mean" )
meanT <- mxAlgebra( expression= cbind(Mean,Mean), name="expMean" )
covMZ <- mxAlgebra( expression=
rbind( cbind(A+C+E , A+C),
cbind(A+C , A+C+E)), name="expCovMZ" )
covDZ <- mxAlgebra( expression=
rbind( cbind(A+C+E , 0.5%x%A+C),
cbind(0.5%x%A+C , A+C+E)), name="expCovDZ" )

# Standeraized parameters and CI
sta <- mxAlgebra(iSD %*% a, name="sta")
stc <- mxAlgebra(iSD %*% c, name="stc")
ste <- mxAlgebra(iSD %*% e, name="ste")
StandardizedA <- mxAlgebra(A/V, name="StandardizedA")
StandardizedC <- mxAlgebra(C/V, name="StandardizedC")
StandardizedE <- mxAlgebra(E/V, name="StandardizedE")
CorA <- mxAlgebra(solve(sqrt(I*A)) %&% A, name="CorA")
CorC <- mxAlgebra(solve(sqrt(I*C)) %&% C, name="CorC")
CorE <- mxAlgebra(solve(sqrt(I*E)) %&% E, name="CorE")
CorP <- mxAlgebra(solve(sqrt(I*V)) %&% V, name="CorP")

CI <- mxCI(c('sta', 'stc', 'ste','StandardizedA', 'StandardizedC', 'StandardizedE','CorA', 'CorC', 'CorE', 'CorP'))

# Data objects for Multiple Groups
dataMZ <- mxData( observed=mzData, type="raw" )
dataDZ <- mxData( observed=dzData, type="raw" )

# Objective objects for Multiple Groups
objMZ <- mxExpectationNormal( covariance="expCovMZ", means="expMean",
dimnames=selVars )
objDZ <- mxExpectationNormal( covariance="expCovDZ", means="expMean",
dimnames=selVars )
funML <- mxFitFunctionML()

# Combine Groups
pars <- list( pathA, pathC, pathE, covA, covC, covE, covP,
matI, invSD, meanG, meanT, sta, stc, ste,
StandardizedA, StandardizedC, StandardizedE,
CorA, CorC, CorE, CorP)
modelMZ <- mxModel( pars, covMZ, dataMZ, objMZ, funML, name="MZ" )
modelDZ <- mxModel( pars, covDZ, dataDZ, objDZ, funML, name="DZ" )
fitML <- mxFitFunctionMultigroup(c("MZ.fitfunction","DZ.fitfunction") )
CholAceModel <- mxModel( "CholACE", pars, modelMZ, modelDZ, fitML, CI)

Replied on Fri, 04/22/2016 - 11:31
Picture of user. AdminRobK Joined: 01/24/2014

Although it run smoothly, the results contained minus value and bigger than 1 that were very strange, such as:

CholACE.StandardizedA[1,1] 4.835575e-02 0.4772359455 8.043944e-01
CholACE.StandardizedA[2,1] -3.001322e+03 1.6422971828 1.889364e+03
CholACE.StandardizedA[3,1] -4.236242e+02 -0.2316164474 1.479296e+00

Your MxAlgebra StandardizedA is defined as A/V, which is the A matrix elementwise-divided by the phenotypic covariance matrix. Only the diagonal elements of StandardizedA--elements [1,1], [2,2], and [3,3]--are readily interpretable; they are, of course, the standardized variance components (i.e., variance proportions). Elements [2,1] and [3,1] of StandardizedA are not bounded between 0 and 1.

The genetic correlation seemed also unreasonable:

CholACE.CorA[1,1] 1.000000e+00 1.0000000000 1.000000e+00 !!!
CholACE.CorA[2,1] -9.999988e-01 -0.9999978673 9.999902e-01
CholACE.CorA[3,1] -9.999983e-01 0.9999635762 9.999960e-01

If you're looking for the genetic correlations among the three phenotypes, you need to be looking at elements [2,1], [3,1], and [3,2] (or equivalently, [1,2], [1,3], and [2,3]) of CorA. The CI you report for the [1,1] element is correct because that's a diagonal element of a correlation matrix, and is therefore fixed at 1.0. The other two appear OK, in that they are not flagged with "!!!". How large is your sample? If it is relatively small, it's quite possible to get an extreme estimate of genetic correlations, and for a valid CI for the genetic correlations to span nearly the whole parameter space--see this post for a similar situation.

Replied on Sat, 04/23/2016 - 01:18
No user picture. Liz Joined: 04/20/2016

In reply to by AdminRobK

Many thanks for your kind suggestions, Robk. So the script is correct and the results also make sense? If so, how could I estimate the CIs of a31, a21 and a32, changing the MxAlgebra StandarizedA?

And yes, my sample size is very small that contains only 20 pairs of MZ and 15 pairs of DZ. How does the sample size affect the genetic correlations? Is it nessesary to calculate the statistical power? Thanks~ :D

Replied on Sun, 04/24/2016 - 13:01
Picture of user. AdminRobK Joined: 01/24/2014

In reply to by Liz

how could I estimate the CIs of a31, a21 and a32, changing the MxAlgebra StandarizedA?

What do you mean by "a31," "a21," and "a32?" It looks like you're requesting CIs for all the elements of StandardizedA and CorA (and for that matter, sta), so you should be seeing entries for the elements you want in the confidence-interval output. Right?

And yes, my sample size is very small that contains only 20 pairs of MZ and 15 pairs of DZ. How does the sample size affect the genetic correlations? Is it nessesary to calculate the statistical power?

That's tiny! With a sample that small, you're going to get very statistically imprecise estimates of the parameters of interest, so it's not that surprising that you're seeing extremely wide CIs for the genetic correlations.

Replied on Sun, 04/24/2016 - 22:39
No user picture. Liz Joined: 04/20/2016

In reply to by AdminRobK

Thanks. Refers to the "a21" an "e21", I mean the genetic or unique environmental contribution from the 1st phenotype to the 2nd phenotype in the Cholesky decomposition model. In the bivariate or multivariate model, how can we decompose the genetic contribution to the 2nd phenotype into two parts: the part from the 1st phenotype, and the part from the 2nd phenotype itself (I mean the path coefficients). In my trivariate cholesky model I found that the sum of of the standarized a2, c2 and e2 of the 2nd phenotype already equals to 1 that means the genetic contribution from the 1st phenotype to the 2nd phenotype is unclear?
Replied on Mon, 04/25/2016 - 10:37
Picture of user. AdminRobK Joined: 01/24/2014

In reply to by Liz

Refers to the "a21" an "e21", I mean the genetic or unique environmental contribution from the 1st phenotype to the 2nd phenotype in the Cholesky decomposition model. In the bivariate or multivariate model, how can we decompose the genetic contribution to the 2nd phenotype into two parts: the part from the 1st phenotype, and the part from the 2nd phenotype itself (I mean the path coefficients).

The standardized path coefficients of the Cholesky parameterization are in your MxAlgebras sta, stc, and ste. It sounds like you're specifically interested in the [2,1] element of sta and ste.

In my trivariate cholesky model I found that the sum of of the standarized a2, c2 and e2 of the 2nd phenotype already equals to 1 that means the genetic contribution from the 1st phenotype to the 2nd phenotype is unclear?

I'm not sure what you mean here. If you're summing standardized variance components (from the diagonals of StandardizedA, StandardizedC, and StandardizedE), you're going to get a sum of 1.0 for sure.

Replied on Fri, 04/29/2016 - 09:10
No user picture. Liz Joined: 04/20/2016

In reply to by AdminRobK

BTW, how could I calculated the contribution of genes to phenotypic correlation between two phenotypes in the OpenMx script (rpha, not the genetic correlation rg). I know the fomula is "rpha = (a1*rg*a2)/rph" or "rpha = (sqrt(a12^2+a22^2)*rg*a11)/rph". However, the rpha I calculated by hand was bigger than 1, is that reseanable, or the fomula was wrong? Thanks.
Replied on Tue, 05/03/2016 - 10:54
Picture of user. AdminRobK Joined: 01/24/2014

In reply to by Liz

I think you have the formula wrong. According to Posthuma (2009), the additive-genetic contribution to the phenotypic correlation between traits #1 and #2 is given by

Note that none of the path coefficients from the Cholesky factorization appear in the above formula; only the genetic correlation and the (raw) variance components do.

Source: Posthuma, D. (2009). Multivariate genetic analysis. In Kim, Y-K. (Ed.), Handbook of Behavior Genetics (pp. 47-59). New York: Springer Science+Business Media, LLC.

Replied on Tue, 05/03/2016 - 22:30
No user picture. Liz Joined: 04/20/2016

In reply to by AdminRobK

Many thanks! I have readed the relevant part in the book you mentioned, it is so helpful to me.
Replied on Sun, 04/24/2016 - 23:31
No user picture. Liz Joined: 04/20/2016

Many thanks, Rob. I think I have known how to extract the standarized path coefficients and the coresponding CIs from the results.