You are here

estimating covariance matrix for saturated multivariate ACE

7 posts / 0 new
Last post
Dorothy Bishop's picture
Offline
Joined: 02/04/2010 - 02:20
estimating covariance matrix for saturated multivariate ACE

This is an elementary nonurgent question but an answer would help me understand OpenMx better.
In trying to teach myself a bit more I have been playing around with a couple of multivariate ACE twin scripts. There's one by Hermine Maes that I downloaded from:

http://www.vipbg.vcu.edu/~vipbg/OpenMx/MultivariateTwin_MatrixRaw.R

and another I think by Steve Boker that is in the slides from
http://www.vipbg.vcu.edu/~vipbg/OpenMx/OpenMxSession8.pdf

They differ in how they specify covariance matrices for the saturated model:

The Maes script gives:
mxMatrix( type="Lower", nrow=ntv, ncol=ntv, free=TRUE, values=pathstartvalue, name="CholMZ" ),
mxAlgebra( expression=CholMZ %*% t(CholMZ), name="ExpCovMZ" ),
mxAlgebra( expression=diag2vec(ExpCovMZ), name="ExpVarMZ"),

The Boker script gives:
mxMatrix( "Diag", ntv, ntv, free=T, values=meanestvalue, lbound=.001, name="EVmzf" ),
mxMatrix( "Stand", ntv, ntv, free=T, values=.5, lb=-.99, ub=.99, name="ERmzf" ),
mxAlgebra( expression= EVmzf %% ERmzf %% EVmzf, name="ECmzf" ),

I assume they are mathematically equivalent, with the Boker script estimating the correlation matrix (ERmzf) with the Stand matrix.
My problem is that when I try to run the Boker script, I get:

Error: The job for model 'multi' exited abnormally with the error message: Expected covariance matrix is not positive definite

I understand what the error message is saying – I'm just wondering why you'd want to estimate the covariance matrix this way, if it can give a 'not positive definite' result?

Or maybe this was an example to demonstrate the perils of not using a regular Cholesky decomposition ?

It's possible that the dataset I'm using is not the one it was written for.

neale's picture
Offline
Joined: 07/31/2009 - 15:14
Hi Dorothy I would say that

Hi Dorothy

I would say that these models are not mathematically equivalent, since, as you have discovered, the correlation + SD approach can generate non-positive definite matrices, whereas the lower triangular (Cholesky) approach cannot, at least as long as the diagonal elements are bounded to be strictly positive.

My sense is that the two models have different intents. One is to estimate the covariance matrix, the other is to estimate the correlation matrix. Both are saturated, in the sense that there are as many parameters used to model the covariance structure as there are observed variances and covariances. They will usually yield the same -2lnL, but the Cholesky is more robust, and it is possible to compute the correlation matrix post-hoc from the Cholesky-generated expected covariance matrix. Note, however, that standard errors of the correlations are obtained directly (via summary()) in the correlation approach, whereas they are not in the Cholesky.

Dorothy Bishop's picture
Offline
Joined: 02/04/2010 - 02:20
Thanks. Admirably clear as

Thanks. Admirably clear as always.
I've not enjoyed trying to do the algebra to compute standard errors of terms derived from the estimated paths so can see the wisdom of getting this estimated directly.

Sanja's picture
Offline
Joined: 12/07/2009 - 05:52
Hi, I'm trying to estimate a

Hi,
I'm trying to estimate a 24x24 covariance matrix using FIML in OpenMx, but am not sure whether OpenMx can handle 24 variables (it runs for about an hour and then kind of crashes). If this is the problem, could classic Mx handle it?
Thanks,
Sanja

Sanja's picture
Offline
Joined: 12/07/2009 - 05:52
#Btw I more-less use

Btw I more-less use Hermine's script from Boulder:

library(OpenMx)
library(psych)
source("GenEpiHelperFunctions.R")
nv=12
ntv=nv*2
data=round(t(matrix(scan('s1.dat',skip=1,na.strings='-999'),ntv+1,)),2)
dim(data)

colnames(data)=c('zyg',paste(rep('rak_',6),rep(c('v_','nv_'),each=3),rep('1_',6),rep(c('5','7','10'),2),sep=''),paste(rep(c('wisc_R_','wais_III_'),each=3),rep(c('VCI_','POI_','FDI_'),2),rep(c('1_12','1_18'),each=3),sep=''),paste(rep('rak_',6),rep(c('v_','nv_'),each=3),rep('2_',6),rep(c('5','7','10'),2),sep=''),paste(rep(c('wisc_R_','wais_III_'),each=3),rep(c('VCI_','POI_','FDI_'),2),rep(c('2_12','2_18'),each=3),sep=''))

mzData <- subset(data, data[,1]==1|data[,1]==3,colnames(data)[-1]);dim(mzData)
dzData <- subset(data, data[,1]==2|data[,1]==4|data[,1]==5|data[,1]==6,colnames(data)[-1]);dim(dzData)

Fit Multivariate Saturated Model

-----------------------------------------------------------------------

multTwinSatModel <- mxModel("multTwinSat",
mxModel("MZ",
mxMatrix( type="Lower", nrow=ntv, ncol=ntv, free=T, values=5, name="CholMZ" ),
mxAlgebra( expression=CholMZ %*% t(CholMZ), name="expCovMZ" ),
mxMatrix( type="Full", nrow=1, ncol=ntv, free=T, values=20, name="expMeanMZ" ),
mxData( observed=mzData, type="raw" ),
mxFIMLObjective( covariance="expCovMZ", means="expMeanMZ", dimnames=dimnames(mzData)[[2]]),

Algebras needed for equality constraints

mxAlgebra( expression=expMeanMZ[1,1:nv], name="expMeanMZt1"),
mxAlgebra( expression=expMeanMZ[1,(nv+1):ntv], name="expMeanMZt2"),
mxAlgebra( expression=t(diag2vec(expCovMZ)), name="expVarMZ"),
mxAlgebra( expression=expVarMZ[1,1:nv], name="expVarMZt1"),
mxAlgebra( expression=expVarMZ[1,(nv+1):ntv], name="expVarMZt2")
),
mxModel("DZ",
mxMatrix( type="Lower", nrow=ntv, ncol=ntv, free=T, values=5, name="CholDZ" ),
mxAlgebra( expression=CholDZ %*% t(CholDZ), name="expCovDZ" ),
mxMatrix( type="Full", nrow=1, ncol=ntv, free=T, values=20, name="expMeanDZ" ),
mxData( observed=dzData, type="raw" ),
mxFIMLObjective( covariance="expCovDZ", means="expMeanDZ", dimnames=dimnames(dzData)[[2]]),

Algebra's needed for equality constraints

mxAlgebra( expression=expMeanDZ[1,1:nv], name="expMeanDZt1"),
mxAlgebra( expression=expMeanDZ[1,(nv+1):ntv], name="expMeanDZt2"),
mxAlgebra( expression=t(diag2vec(expCovDZ)), name="expVarDZ"),
mxAlgebra( expression=expVarDZ[1,1:nv], name="expVarDZt1"),
mxAlgebra( expression=expVarDZ[1,(nv+1):ntv], name="expVarDZt2")
),
mxAlgebra( MZ.objective + DZ.objective, name="minus2sumll" ),
mxAlgebraObjective("minus2sumll")
)

multTwinSatFit <- mxRun(multTwinSatModel)
multTwinSatSumm <- summary(multTwinSatFit)
multTwinSatSumm

neale's picture
Offline
Joined: 07/31/2009 - 15:14
So, 24 means and 24*25/2=300

So, 24 means and 24*25/2=300 covariances gives 324 parameters altogether. It will certainly take a long time in either OpenMx or ClassicMx. Inverting 24x24 matrices takes a while for each function evaluation, and the optimizer has to invert 324x324 matrices every now and then during its search for the solution.

At present, OpenMx has not been optimized for speed, and it will likely take longer than ClassicMx.

Sanja's picture
Offline
Joined: 12/07/2009 - 05:52
Thanks! So it is doable but

Thanks! So it is doable but just takes a long time. I'll run it in classic Mx then.
Sanja