estimating covariance matrix for saturated multivariate ACE

Posted on Wed, 03/24/2010 - 12:47

Dorothy Bishop Joined: 02/04/2010

Forums

This is an elementary nonurgent question but an answer would help me understand OpenMx better.
In trying to teach myself a bit more I have been playing around with a couple of multivariate ACE twin scripts. There's one by Hermine Maes that I downloaded from:

http://www.vipbg.vcu.edu/~vipbg/OpenMx/MultivariateTwin_MatrixRaw.R

and another I think by Steve Boker that is in the slides from
http://www.vipbg.vcu.edu/~vipbg/OpenMx/OpenMxSession8.pdf

They differ in how they specify covariance matrices for the saturated model:

The Maes script gives:
mxMatrix( type="Lower", nrow=ntv, ncol=ntv, free=TRUE, values=pathstartvalue, name="CholMZ" ),
mxAlgebra( expression=CholMZ %*% t(CholMZ), name="ExpCovMZ" ),
mxAlgebra( expression=diag2vec(ExpCovMZ), name="ExpVarMZ"),

The Boker script gives:
mxMatrix( "Diag", ntv, ntv, free=T, values=meanestvalue, lbound=.001, name="EVmzf" ),
mxMatrix( "Stand", ntv, ntv, free=T, values=.5, lb=-.99, ub=.99, name="ERmzf" ),
mxAlgebra( expression= EVmzf %*% ERmzf %*% EVmzf, name="ECmzf" ),

I assume they are mathematically equivalent, with the Boker script estimating the correlation matrix (ERmzf) with the Stand matrix.
My problem is that when I try to run the Boker script, I get:

Error: The job for model 'multi' exited abnormally with the error message: Expected covariance matrix is not positive definite

I understand what the error message is saying – I'm just wondering why you'd want to estimate the covariance matrix this way, if it can give a 'not positive definite' result?

Or maybe this was an example to demonstrate the perils of not using a regular Cholesky decomposition ?

It's possible that the dataset I'm using is not the one it was written for.

Replied on Thu, 03/25/2010 - 13:16

neale Joined: Jul 31, 2009

Hi Dorothy I would say that

Hi Dorothy

I would say that these models are not mathematically equivalent, since, as you have discovered, the correlation + SD approach can generate non-positive definite matrices, whereas the lower triangular (Cholesky) approach cannot, at least as long as the diagonal elements are bounded to be strictly positive.

My sense is that the two models have different intents. One is to estimate the covariance matrix, the other is to estimate the correlation matrix. Both are saturated, in the sense that there are as many parameters used to model the covariance structure as there are observed variances and covariances. They will usually yield the same -2lnL, but the Cholesky is more robust, and it is possible to compute the correlation matrix post-hoc from the Cholesky-generated expected covariance matrix. Note, however, that standard errors of the correlations are obtained directly (via summary()) in the correlation approach, whereas they are not in the Cholesky.

Log in or register to post comments

Replied on Thu, 03/25/2010 - 14:42

Dorothy Bishop Joined: Feb 04, 2010

Thanks. Admirably clear as

Thanks. Admirably clear as always.
I've not enjoyed trying to do the algebra to compute standard errors of terms derived from the estimated paths so can see the wisdom of getting this estimated directly.

Log in or register to post comments

Replied on Tue, 03/30/2010 - 09:36

Sanja Joined: Dec 07, 2009

Hi, I'm trying to estimate a

Hi,
I'm trying to estimate a 24x24 covariance matrix using FIML in OpenMx, but am not sure whether OpenMx can handle 24 variables (it runs for about an hour and then kind of crashes). If this is the problem, could classic Mx handle it?
Thanks,
Sanja

Log in or register to post comments

Replied on Tue, 03/30/2010 - 09:58

Sanja Joined: Dec 07, 2009

#Btw I more-less use

#Btw I more-less use Hermine's script from Boulder:

library(OpenMx)
library(psych)
source("GenEpiHelperFunctions.R")
nv=12
ntv=nv*2
data=round(t(matrix(scan('s1.dat',skip=1,na.strings='-999'),ntv+1,)),2)
dim(data)

colnames(data)=c('zyg',paste(rep('rak_',6),rep(c('v_','nv_'),each=3),rep('1_',6),rep(c('5','7','10'),2),sep=''),paste(rep(c('wisc_R_','wais_III_'),each=3),rep(c('VCI_','POI_','FDI_'),2),rep(c('1_12','1_18'),each=3),sep=''),paste(rep('rak_',6),rep(c('v_','nv_'),each=3),rep('2_',6),rep(c('5','7','10'),2),sep=''),paste(rep(c('wisc_R_','wais_III_'),each=3),rep(c('VCI_','POI_','FDI_'),2),rep(c('2_12','2_18'),each=3),sep=''))

mzData <- subset(data, data[,1]==1|data[,1]==3,colnames(data)[-1]);dim(mzData)
dzData <- subset(data, data[,1]==2|data[,1]==4|data[,1]==5|data[,1]==6,colnames(data)[-1]);dim(dzData)

# Fit Multivariate Saturated Model
# -----------------------------------------------------------------------
multTwinSatModel <- mxModel("multTwinSat",
mxModel("MZ",
mxMatrix( type="Lower", nrow=ntv, ncol=ntv, free=T, values=5, name="CholMZ" ),
mxAlgebra( expression=CholMZ %*% t(CholMZ), name="expCovMZ" ),
mxMatrix( type="Full", nrow=1, ncol=ntv, free=T, values=20, name="expMeanMZ" ),
mxData( observed=mzData, type="raw" ),
mxFIMLObjective( covariance="expCovMZ", means="expMeanMZ", dimnames=dimnames(mzData)[[2]]),
# Algebras needed for equality constraints
mxAlgebra( expression=expMeanMZ[1,1:nv], name="expMeanMZt1"),
mxAlgebra( expression=expMeanMZ[1,(nv+1):ntv], name="expMeanMZt2"),
mxAlgebra( expression=t(diag2vec(expCovMZ)), name="expVarMZ"),
mxAlgebra( expression=expVarMZ[1,1:nv], name="expVarMZt1"),
mxAlgebra( expression=expVarMZ[1,(nv+1):ntv], name="expVarMZt2")
),
mxModel("DZ",
mxMatrix( type="Lower", nrow=ntv, ncol=ntv, free=T, values=5, name="CholDZ" ),
mxAlgebra( expression=CholDZ %*% t(CholDZ), name="expCovDZ" ),
mxMatrix( type="Full", nrow=1, ncol=ntv, free=T, values=20, name="expMeanDZ" ),
mxData( observed=dzData, type="raw" ),
mxFIMLObjective( covariance="expCovDZ", means="expMeanDZ", dimnames=dimnames(dzData)[[2]]),
# Algebra's needed for equality constraints
mxAlgebra( expression=expMeanDZ[1,1:nv], name="expMeanDZt1"),
mxAlgebra( expression=expMeanDZ[1,(nv+1):ntv], name="expMeanDZt2"),
mxAlgebra( expression=t(diag2vec(expCovDZ)), name="expVarDZ"),
mxAlgebra( expression=expVarDZ[1,1:nv], name="expVarDZt1"),
mxAlgebra( expression=expVarDZ[1,(nv+1):ntv], name="expVarDZt2")
),
mxAlgebra( MZ.objective + DZ.objective, name="minus2sumll" ),
mxAlgebraObjective("minus2sumll")
)

multTwinSatFit <- mxRun(multTwinSatModel)
multTwinSatSumm <- summary(multTwinSatFit)
multTwinSatSumm

Log in or register to post comments

Replied on Tue, 03/30/2010 - 09:56

neale Joined: Jul 31, 2009

So, 24 means and 24*25/2=300

So, 24 means and 24*25/2=300 covariances gives 324 parameters altogether. It will certainly take a long time in either OpenMx or ClassicMx. Inverting 24x24 matrices takes a while for each function evaluation, and the optimizer has to invert 324x324 matrices every now and then during its search for the solution.

At present, OpenMx has not been optimized for speed, and it will likely take longer than ClassicMx.

Log in or register to post comments

Replied on Tue, 03/30/2010 - 10:01

Sanja Joined: Dec 07, 2009

Thanks! So it is doable but

Thanks! So it is doable but just takes a long time. I'll run it in classic Mx then.
Sanja

Log in or register to post comments

Log in or register to post comments

News

Recent Posts

estimating covariance matrix for saturated multivariate ACE

Hi Dorothy I would say that

Thanks. Admirably clear as

Hi, I'm trying to estimate a

#Btw I more-less use

So, 24 means and 24*25/2=300

Thanks! So it is doable but

News

Recent Posts