Dear OpenMx experts,

May I ask what's the difference between multi-variate ACE models based on paths (Cholesky decomposition) and models based on variance, such as advantages and disadvantages of the two methods? I was told that the models based on variance and covariance will not suffer from the type I error issues, by compared to the Cholesky decomposition, why? Will the variance-based multi-variate ACE models better than the Cholesky decomposition-based models? Could I model a multi-variate ACE model based on variance in OpenMx? Any demo codes would be appreciated.

Thanks and best wishes,

Zhi

The two are equivalent parameterizations. Under the "Cholesky" parameterization, the free parameters that define the model-expected covariance matrix are the elements of the Cholesky factors of the

A,C, andEcovariance matrices. Under the "direct-symmetric" parameterization, the free parameters that define the model-expected covariance matrix are the elements of theA,C, andEcovariance matrices themselves. Much of the time--probably, most of the time--results will not differ between the two parameterizations. But, sometimes resultswilldiffer, because the Cholesky parameterization imposes an implicit constraint that the direct-symmetric parameterization does not, namely, that theA,C, andEcovariance matrices must be positive-definite. But, the model-expected covariance matrix remains within the parameter space of the multivariate-normal distribution as long as it itself is positive-definite, whether or not theA,C, andEcovariance matrices are positive-definite.Thus, the Cholesky parameterization introduces a "boundary condition," which can interfere with the usual asymptotic maximum-likelihood theory. One of the first (perhaps first) researchers to recognize how the Cholesky parameterization can interfere with Type I error rates was Greg Carey, in a 2005 paper, "Cholesky Problems". This year's Boulder Workshop was the first year we taught the direct-symmetric parameterization.

It is true that the direct-symmetric parameterization can result in negative estimates of variance components, but that is not necessarily a bad thing. I'd say the only real advantage the Cholesky parameterization has is its interpretability.

See here for a collection of twin-modeling scripts for one phenotype and one covariate, maintained by Hermine Maes. The scripts that are described as "estimating variance components" use the direct-symmetric parameterization, and those described as "estimating path coefficients" use the Cholesky parameterization.

Thank you very much for your always useful and detailed advices, Rob. May I ask another question which may be not related to this post? When I summarized a full trivariate Cholesky ACE model in OpenMx, the CIs of all the parameters were NA, however, I could get the CIs in the submodels such as AE or CE. In addition, if I add the flag "verbose = T" in the summarization of the full ACE model, I could also get the CIs of most parameters estimation. May I know why and if the CIs of the full model after adding "verbose = T" were trustable? (Actually I have found some previous related posts on this question, however, I cannot reach them not matter what the search engine or computer I use.)

Best wishes,

Zhi

They are not trustworthy, which is why they are not presented in the non-verbose

`summary()`

output. However, they are not necessarily wrong, either, but there is reason to believe that they MAY be wrong. The reason for why each confidence limit may be wrong is given in the 'diagnostic' column of the CI details table. Which optimizer are you using?The best way to search the OpenMx website is to enter your search terms into the search text box in the upper right of each page, click the "Search" button, and then when the search results page opens, click on "Content" above the text box near the top of the page.

Thank you very much, Rob. I have tried both the NPSOL and SLSQP, and their results were similar that the CIs were missing. Here is the version of R and OpenMx I use now:

OpenMx version: 2.11.5 [GIT v2.11.5]

R version: R version 3.5.1 (2018-07-02)

Platform: x86_64-w64-mingw32

Default optimiser: SLSQP

NPSOL-enabled?: Yes

OpenMP-enabled?: No

Please see attached for the part of CIs in my codes. I could run them smoothly and the results seemed reasonable, excepted for the missing CIs. Could you please kindly tell me if there is something wrong about my codes? Thank you very much.

sta <- mxAlgebra(iSD %% a, name="sta") #standarized path coefficients of A

stc <- mxAlgebra(iSD %% c, name="stc")

ste <- mxAlgebra(iSD %% e, name="ste")

StandardizedA <- mxAlgebra(A/V, name="StandardizedA")

StandardizedC <- mxAlgebra(C/V, name="StandardizedC")

StandardizedE <- mxAlgebra(E/V, name="StandardizedE")

CorA <- mxAlgebra(solve(sqrt(IA)) %&% A, name="CorA")

CorC <- mxAlgebra(solve(sqrt(IC)) %&% C, name="CorC")

CorE <- mxAlgebra(solve(sqrt(IE)) %&% E, name="CorE")

CorP <- mxAlgebra(solve(sqrt(I*V)) %&% V, name="CorP")

CI <- mxCI(c('sta', 'stc', 'ste','StandardizedA', 'StandardizedC', 'StandardizedE','CorA', 'CorC', 'CorE', 'CorP'))

dataMZ <- mxData( observed=mzData, type="raw" )

dataDZ <- mxData( observed=dzData, type="raw" )

objMZ <- mxExpectationNormal( covariance="expCovMZ", means="expMean",

dimnames=selVars )

objDZ <- mxExpectationNormal( covariance="expCovDZ", means="expMean",

dimnames=selVars )

funML <- mxFitFunctionML()

pars <- list( pathA, pathC, pathE, covA, covC, covE, covP,

matI, invSD, meanG, meanT, sta, stc, ste,

StandardizedA, StandardizedC, StandardizedE,

CorA, CorC, CorE, CorP)

modelMZ <- mxModel( pars, covMZ, dataMZ, objMZ, funML, name="MZ" )

modelDZ <- mxModel( pars, covDZ, dataDZ, objDZ, funML, name="DZ" )

fitML <- mxFitFunctionMultigroup(c("MZ.fitfunction","DZ.fitfunction") )

CholAceModel <- mxModel( "CholACE", pars, modelMZ, modelDZ, fitML, CI)

I'd need to see more of your script, as well as the verbose

`summary()`

output, to be able to give any advice.Thanks, Rob. Please see below for my script:

The results I got from this script were:

CholACE.StandardizedA[1,1] NA 0.34696886 NA !!!

CholACE.StandardizedA[2,2] NA 0.57933647 NA !!!

CholACE.StandardizedA[3,3] NA 0.27260377 0.5327416 !!!

CholACE.StandardizedC[1,1] NA 0.07631260 NA !!!

CholACE.StandardizedC[2,2] NA 0.02342186 NA !!!

CholACE.StandardizedC[3,3] NA 0.01657537 NA !!!

CholACE.StandardizedE[1,1] NA 0.57671854 NA !!!

CholACE.StandardizedE[2,2] NA 0.39724167 NA !!!

CholACE.StandardizedE[3,3] NA 0.71082086 NA !!!

After I added the verbose, I got the missing CIs:

parameter side value fit diagnostic statusCode

CholACE.StandardizedA[1,1] lower 0.00576769 1388.242545 active box constraint OK

CholACE.StandardizedA[1,1] upper 0.620436827 1388.2481 active box constraint OK

CholACE.StandardizedA[2,2] lower 0.220852389 1388.264244 active box constraint nonzero gradient/red

CholACE.StandardizedA[2,2] upper 0.74937414 1388.245092 active box constraint nonzero gradient/red

CholACE.StandardizedA[3,3] lower 0.015417542 1388.244836 active box constraint nonzero gradient/red

CholACE.StandardizedA[3,3] upper 0.532741586 1388.251373 success nonzero gradient/red

CholACE.StandardizedC[1,1] lower 8.69E-09 1388.23726 active box constraint nonzero gradient/red

CholACE.StandardizedC[1,1] upper 0.402550673 1388.252648 active box constraint nonzero gradient/red

CholACE.StandardizedC[2,2] lower 9.07E-09 1388.23726 active box constraint nonzero gradient/red

CholACE.StandardizedC[2,2] upper 0.290058001 1388.25808 active box constraint nonzero gradient/red

CholACE.StandardizedC[3,3] lower 9.24E-09 1388.23726 active box constraint nonzero gradient/red

CholACE.StandardizedC[3,3] upper 0.236906447 1388.253135 active box constraint nonzero gradient/red

CholACE.StandardizedE[1,1] lower 0.37856246 1388.247897 active box constraint nonzero gradient/red

CholACE.StandardizedE[1,1] upper 0.824304393 1388.252819 active box constraint nonzero gradient/red

CholACE.StandardizedE[2,2] lower 0.248666688 1388.244859 active box constraint nonzero gradient/red

CholACE.StandardizedE[2,2] upper 0.616918327 1388.253597 active box constraint nonzero gradient/red

CholACE.StandardizedE[3,3] lower 0.464477104 1388.251145 active box constraint nonzero gradient/red

CholACE.StandardizedE[3,3] upper 0.961257369 1388.246413 active box constraint nonzero gradient/red

All the CI diagnostics say "active box constraint", which means that one or more free parameters are close to their lbounds or ubounds. Active box constraints represent a boundary condition (described upthread in the context of the Cholesky parameterization), and therefore mean that the confidence limit in question may be wrong. Try eliminating the

`lbound`

s from this code:Thank you very much, Rob. It works now, although the results are very close to the "verbose = T". May I ask how many kinds of diagnostic of CI results in openMx, and how to deal with them? Sorry I did not find it in the manual.

I have also tried the direct-symmetric parameterization. However, the results were hard to interpret:

Results of Cholesky decomposition:

CholACE.StandardizedA[1,1] 5.7629863e-03 3.4696840e-01 0.620436834

CholACE.StandardizedA[2,2] 2.2085240e-01 5.7933417e-01 0.749374149

CholACE.StandardizedA[3,3] 1.5417533e-02 2.7260201e-01 0.532741610

Results of direct-symmetric parameterization:

twoACEvc.StandardizedA[1,1] -0.129377109 0.632041104 1.390309000

twoACEvc.StandardizedA[2,2] 0.580567884 1.316408772 2.054897056

twoACEvc.StandardizedA[3,3] 0.236514247 1.057349468 1.794373138

The parameter estimation of direct-symmetric parameterization could be higher than 1. How could I know the heritability and its significance of phenotype in the variance-based model? Based on the above CIs, can I draw a conclusion that the first phenotype is heritable in the Cholesky decomposition, but not in direct-symmetric parameterization? Looking forward to your kind suggestions.

See the code in the source repository, in src/ComputeGD.cpp , starting at line 1541. Of those eight CI diagnostics, I have only ever seen 4 occur in practice: "success", "alpha level not reached", "bound infeasible", and "active box constraint".

There doesn't look like there is any evidence for heritability of the first phenotype. Even under the Cholesky parameterization, the lower confidence limit is essentially zero (it can't actually reach zero because of the implicit constraint inherent to the Cholesky parameterization).

Thanks Rob. The parameter estimations were the results of the full ACE model. However, if the sub-models were introduced the AE model would survived. In this condition, which one would be more trustable, the CIs of the full ACE or the model comparison between the full genetic model and its sub-models?

How could we get the heritability from the direct-symmetric parameterization and explain the results? The parameters estimation were higher than 1 and the sum of variance of A, C and E seemed not equal to 1.

Assuming that the full ACE model and the submodels converged, and that all of the CIs succeeded, then I would say both are informative. The model-comparison chi-square for ACE versus (say) AE is an omnibus test of the null hypothesis that there is no

C, that is, that the 6(?)C-related parameters are all zero. The CIs instead represent inference about one quantity at a time.I'd like to see those results.

Thanks. So the results of CIs and model comparisons are independent to each other, could I understand it in this way? However, if the AE model survives and the CI of A in AE does not contain zero, could I conclude that this phenotype is heritable even the CI of A in the full ACE contains zero?

Please see below for the results of the direct-symmetric parameterization (the first phenotype):

twoACEvc.StandardizedA[1,1] -0.129377109 0.632041104 1.390309000

twoACEvc.StandardizedC[1,1] -0.851222975 -0.192160561 0.409548567

twoACEvc.StandardizedE[1,1] 0.369370659 0.560119458 0.825937713

How to explain the negative estimates and quantify the heritability based on this results? In addition, the correlation matrix of C were all NA:

twoACEvc.CorC[1,1] NA NaN NA !!!

twoACEvc.CorC[2,1] NA NaN NA !!!

twoACEvc.CorC[3,1] NA NaN NA !!!

twoACEvc.CorC[1,2] NA NaN NA !!!

twoACEvc.CorC[2,2] NA NaN NA !!!

twoACEvc.CorC[3,2] NA NaN NA !!!

twoACEvc.CorC[1,3] NA NaN NA !!!

twoACEvc.CorC[2,3] NA NaN NA !!!

twoACEvc.CorC[3,3] NA NaN NA !!!

I have tried "verbose = T", the diagnostic was "alpha level not reached". Could this be caused by negative estimates of factor C?

If you're going to report and interpret results from a single "best" model, then my usual recommendation is to report point estimates and confidence intervals from the model with the smallest AIC.

That sounds like a reasonable interpretation to me.

The fact that the first phenotype's

c² is estimated as negative in the ACE model is a signal that you should consider modeling the first phenotype as ADE. I would guess that yes, theCcorrelation matrix is full of`NA`

s due to the negative estimate of the first phenotype'sCvariance component.Hi! My similar bivariate ACE model results were as below:

confidence intervals:

CholCE.corC[2,1] NA 1.000000e+00 NA !!!

CholCE.corC[1,2] NA 1.000000e+00 NA !!!

CholCE.corC[2,2] NA 1.000000e+00 NA !!!

CholCE.corE[1,1] NA 1.000000e+00 NA !!!

CholCE.corE[2,1] NA -8.420869e-02 NA !!!

CholCE.corE[1,2] NA -8.420869e-02 NA !!!

CholCE.corE[2,2] NA 1.000000e+00 NA !!!

CI details:

58 CholCE.corC[1,1] 1.00000000 upper 1156.910 5.010251 0.10257746 -4.200825e-02

59 CholCE.corC[2,1] 0.96243518 lower 1153.141 5.238043 0.09977091 -2.814626e-02

60 CholCE.corC[2,1] 1.00000000 upper 1153.069 5.238043 0.09978666 -5.515351e-07

61 CholCE.corC[1,2] 1.00000000 lower 1153.069 5.238043 0.09978666 -5.515351e-07

62 CholCE.corC[1,2] 1.00000000 upper 1153.069 5.238043 0.09978666 -5.515351e-07

63 CholCE.corC[2,2] 1.00000000 lower 1153.069 5.238043 0.09978666 -5.515351e-07

64 CholCE.corC[2,2] 1.00000000 upper 1153.069 5.238043 0.09978666 -5.515351e-07

65 CholCE.corE[1,1] 1.00000000 lower 1153.069 5.238043 0.09978666 -5.515351e-07

66 CholCE.corE[1,1] 1.00000000 upper 1153.069 5.238043 0.09978666 -5.515351e-07

I am confused why there is no CholCE.corE[1,2] & CholCE.corE[2,2] results in CI details. How can I get it ?

Here is my scripts:

*****Here is another part of the results:

58 success infeasible non-linear constraint

59 active box constraint infeasible non-linear constraint

60 alpha level not reached iteration limit/blue

61 alpha level not reached iteration limit/blue

62 alpha level not reached iteration limit/blue

63 alpha level not reached iteration limit/blue

64 alpha level not reached iteration limit/blue

65 alpha level not reached iteration limit/blue

66 alpha level not reached iteration limit/blue

[ reached getOption("max.print") -- omitted 14 rows ]

I don't know how to understand this part?

Wish for your help！

Thank you!

You must be running an out-of-date version of OpenMx. As of version 2.11 (see release announcement here), the columns of the "CI details" table have been reordered to make the table easier to read, and a bug that caused a lot of confidence limits to fail with "iteration limit/blue" was fixed. What is your

`mxVersion()`

output?I bet the missing rows are among the 14 that got cut off. Update to OpenMx v2.11.5. Then, find the CI-details table inside

`CholCeSumm`

, and print only its first 7 columns. If you restore a saved workspace, you'll have to at least re-create`CholCeSumm`

under version 2.11.5.Note that if 'corE' is a proper correlation matrix, 'corE[2,2]' is going to be fixed at 1. Also, the CI diagnostic 'active box constraint' is a signal that there's at least one parameter's

`lbound`

or`ubound`

interfering with confidence-limit optimization.Thank you Rob. I have tried the variance-based ADE model and found that if the parameter estimate is negative, then the genetic/environmental correlation would be NAs. I think you are right about that the negative estimates do influence the estimation of correlation matrix. My question is what we could get from the variance-based ACE/ADE model, except for that it may be not susceptible to the type I error? From the path-based ACE/ADE model, we could quantify the heritability that to which extent genes or environments influence the phenotype, because the estimates of A, D, C, E are positive in the path-based models. How could I we interpret the negative estimates in the variance-based models? In the estimates of variance-based ACE model below, I think we could not conclude that the heritability of this phenotype is 0.63, the ratio of factor A (heritability?) "included" by compared with the path-based ACE model. And how could we interpret the negative estimate of the common environmental factor C?

twoACEvc.StandardizedA[1,1] -0.129377109 0.632041104 1.390309000

twoACEvc.StandardizedC[1,1] -0.851222975 -0.192160561 0.409548567

twoACEvc.StandardizedE[1,1] 0.369370659 0.560119458 0.825937713

It could not be otherwise. Turning a covariance matrix into its corresponding correlation matrix involves taking the square roots of the covariance matrix's diagonal elements. If a diagonal element is negative, its

`sqrt()`

will be`NaN`

, and the`NaN`

s will propagate into at least part of the resulting correlation matrix via multiplication and addition.The simple answer is that you can't really interpret them. They're a signal that the model being fitted is a poor choice for the data. Are you getting negative variance-component estimates in your "AIC-best" model?

I see now. All the variance-component estimates in my best-fitted model were positive. So the variance-based model may help us clarify the fit of certain model and the significance of parameter estimates. We still need the path-based model to quantify the heritability. Thank you very much for your kind help, Rob.

Thank you!Rob!

I have update OpenMx to latest release. However, the results didn't change.

I don't know how to use "[ reached getOption("max.print") -- omitted 14 rows ]" to see the omitted part. Could you please give me an example?

Try something like

`CholCeSumm$CIdetails[,1:7]`

. If that doesn't work, increase the R option, with something like`options(max.print=2000)`

.One more question: When I tried the Cholesky Decomposition back, I found that the shared environmental correlations were all 1, although their CIs contain zero, if this situation is interpretable?

CholACE.CorC[1,1] 1.00000000 1.00000000 1.000000000 !!!

CholACE.CorC[2,1] -1.00000000 1.00000000 1.000000000

CholACE.CorC[3,1] -1.00000000 1.00000000 1.000000000

CholACE.CorC[1,2] -1.00000000 1.00000000 1.000000000

CholACE.CorC[2,2] 1.00000000 1.00000000 1.000000000 !!!

CholACE.CorC[3,2] -1.00000000 1.00000000 1.000000000

CholACE.CorC[1,3] -1.00000000 1.00000000 1.000000000

CholACE.CorC[2,3] -1.00000000 1.00000000 1.000000000

CholACE.CorC[3,3] 1.00000000 1.00000000 1.000000000 !!!

That is certainly a possible result if there is very little shared-environmental variance in the phenotypes, or if very little of their covariance is due to the shared environment. What this result is telling you is that your estimates of the shared-environmental correlations have essentially zero statistical precision.

Very relevant to the topic of this thread: Type I Error Rates and Parameter Bias in Multivariate Behavioral Genetic Models.

Abstract: