Hello everyone,
I'm a graduate student currently learning OpenMX, and I am unfortunately still a novice at OpenMX and genetic modeling in general. I've so far successfully run multivariate Cholesky analyses with OpenMX in R studio, and gotten interpretable data.
However, I am now trying to drop parameters on my current Cholesky model, to get a "best fit" reduced model for my data. I am having trouble understanding how I might do this; I understand it would involve fixing certain parameters at zero, but given my code, I am completely at a loss as to how I might go about this.
If anyone has the time, would they be able to help me understand how? Or, alternatively, point me in the direction of a Cholesky model code that could easily fix parameters? My thanks; my code is shown below.
nvar <- 4 # number of variables
tnvar <-8 # number of variablesmax family size
nlower <-tnvar(tnvar+1)/2 # number of free elements in a lower matrix tnvar*tnvar
Vars <- c("v1","v2","v3","v4")
selVars <- paste(Vars,c(rep(1,nvar),rep(2,nvar)),sep="")
This yields a list of the 6 variables for analysis
v11 v21 v31 v41 v12 v22 v32 v42
mzData <- as.matrix(subset(twindata, zygosity==1, selVars))
dzData <- as.matrix(subset(twindata, zygosity==2, selVars))
Print Descriptive Statistics
-------------------------------------------------------------------------
summary(mzData)
summary(dzData)
Fit Multivariate ACE Model with RawData and Matrices Input
ACE_Cholesky_Model <- mxModel("ACE_Cholesky",
mxModel("ACE",
Matrices a, c, and e to store a, c, and e path coefficients
mxMatrix( type="Lower", nrow=nvar, ncol=nvar, free=TRUE, values=.6, name="a" ),
mxMatrix( type="Lower", nrow=nvar, ncol=nvar, free=TRUE, values=.6, name="c" ),
mxMatrix( type="Lower", nrow=nvar, ncol=nvar, free=TRUE, values=.6, name="e" ),
Matrices A, C, and E compute variance components
mxAlgebra( a %% t(a), name="A" ),
mxAlgebra( c %% t(c), name="C" ),
mxAlgebra( e %*% t(e), name="E" ),
Algebra to compute total variances and standard deviations (diagonal only)
mxAlgebra( A+C+E, name="V" ),
mxMatrix( type="Iden", nrow=nvar, ncol=nvar, name="I"),
mxAlgebra( solve(sqrt(I*V)), name="iSD"),
Matrix & Algebra for expected means vector
mxMatrix( type="Full", nrow=1, ncol=nvar, free=TRUE, values= 0, name="Mean" ),
mxAlgebra( cbind(Mean,Mean), name="expMean"),
Algebra for expected variance/covariance matrix in MZ
mxAlgebra( rbind ( cbind(A+C+E , A+C),
cbind(A+C , A+C+E)), name="expCovMZ" ),
Algebra for expected variance/covariance matrix in DZ
mxAlgebra( rbind( cbind(A+C+E , 0.5%x%A+C),
cbind(0.5%x%A+C , A+C+E)), name="expCovDZ" )
),
mxModel("MZ",
mxData( observed=mzData, type="raw" ),
mxFIMLObjective( covariance="ACE.expCovMZ", means="ACE.expMean", dimnames=selVars )
),
mxModel("DZ",
mxData( observed=dzData, type="raw" ),
mxFIMLObjective( covariance="ACE.expCovDZ", means="ACE.expMean", dimnames=selVars )
),
mxAlgebra( MZ.objective + DZ.objective, name="modelfit" ),
mxAlgebraObjective("modelfit")
)
ACE_Cholesky_Fit <- mxRun(ACE_Cholesky_Model)
ACE_Cholesky_Summ <- summary(ACE_Cholesky_Fit)
ACE_Cholesky_Summ
Generate Multivariate Cholesky ACE Output
parameterSpecifications(ACE_Cholesky_Fit)
expectedMeansCovariances(ACE_Cholesky_Fit)
tableFitStatistics(ACE_Cholesky_Fit)
ACEpathMatrices <- c("ACE.a","ACE.c","ACE.e","ACE.iSD","ACE.iSD %% ACE.a","ACE.iSD %% ACE.c","ACE.iSD %*% ACE.e")
ACEpathLabels <- c("pathEst_a","pathEst_c","pathEst_e","sd","stPathEst_a","stPathEst_c","stPathEst_e")
formatOutputMatrices(ACE_Cholesky_Fit,ACEpathMatrices,ACEpathLabels,Vars,3)
ACEcovMatrices <- c("ACE.A","ACE.C","ACE.E","ACE.V","ACE.A/ACE.V","ACE.C/ACE.V","ACE.E/ACE.V")
ACEcovLabels <- c("covComp_A","covComp_C","covComp_E","Var","stCovComp_A","stCovComp_C","stCovComp_E")
formatOutputMatrices(ACE_Cholesky_Fit,ACEcovMatrices,ACEcovLabels,Vars,3)
Genetic and Environmental Correlations
ACEcorMatrices <- c("solve(sqrt(ACE.IACE.A)) %% ACE.A %% solve(sqrt(ACE.IACE.A))",
"solve(sqrt(ACE.IACE.C)) %% ACE.C %% solve(sqrt(ACE.IACE.C))",
"solve(sqrt(ACE.IACE.E)) %% ACE.E %% solve(sqrt(ACE.IACE.E))",
"solve(sqrt(ACE.IACE.V)) %% ACE.V %% solve(sqrt(ACE.IACE.V))")
ACEcorLabels <- c("corA","corC","corE","cor")
formatOutputMatrices(ACE_Cholesky_Fit,ACEcorMatrices,ACEcorLabels, Vars, 4)
If possible, try not to cut down a Cholesky decomposition to find the best fit. The Cholesky is typically not a model per se but just an algorithmic trick to keep the predicted covariance matrix (and indeed the genetic/environmental ones) positive definite. It would be much better to start from a common pathway model with a given number of factors, and reduce parameters from there. Better still would be to have a hypothesized factor structure, and see if that fits as parsimoniously (say by AIC) as the Cholesky.
Note that the model in his script fits a tetraphenotype ACE twin model. The only "Cholesky" part is that each 4x4 variance component (A, C, & E) is parametrized in terms of its Cholesky factorization.
It sounds like you want to use
omxSetParameters()
. If you have the OpenMx package loaded into R, you can search for help on this function with?omxSetParameters
For instance, suppose you want to drop the shared-environmental "C" component. To use
omxSetParameters()
, you would need to know the labels of the parameters you want to drop. You don't assign any labels to your lower-triangular "c" matrix, but some labels should automatically have been assigned to its free parameters at runtime, so you should see something if you enterACE_Cholesky_Fit$c@labels
at the R prompt. The call toomxSetParameters()
would look something likeHere's an example for a similar case, except involving only one phenotype.
BTW, lengthy R scripts posted to the forums are much easier to read if you attach them to your post as a text file, or if you enclose the syntax in the appropriate HTML tags.
My thanks to the both of you for your insight. The model I'm running is actually a three factor model; I just adapted the Cholesky code for a four factor model that I found on the forums, changing the number of Vars to 3, and total Vars to 6.
The reason I'm using a Cholesky and trying to cut it down to a "best fit" is that I'm trying to replicate the methods of a paper which had previously studied the same thing. This paper is also the only paper at the moment which studies the same construct with genetic analyses.
I will certainly try the ?omxSetParameters. With three factors, however, do you suppose a Common Pathways model would be significantly more fruitful, or indeed, easier, in terms of getting a best fit model?
Thanks so much again.
I've tried using the omxSetParameters function, but there doesn't appear to be any labels. Closer inspection shows that the "labels" within the ACE_Cholesky_Fit are all "NA", which stops me from setting any parameters in the first place.
>ACE_Cholesky_Fit$c@labels
Error: trying to get slot "labels" from an object of a basic class ("NULL") with no slots
Is there a way to go about labelling each pathway so I can fix an individual parameter? As I see it, for example I'm trying to fix only the "stPathEst_c2" of the Cholesky output, and then run it again in that state. Here's an example of my output:
[1] "Matrix ACE.iSD %*% ACE.c"
stPathEst_c1 stPathEst_c2
ffswanhim 0.418 0.000
ffswanin 0.270 0.034
LogSCT 0.117 0.091
stPathEst_c3
ffswanhim 0.000
ffswanin 0.000
LogSCT 0.000
Is "stPathEst_c2" a label I could just go ahead and fix with a code? Like
ACE_Cholesky_Model <- omxSetParameters(ACE_Cholesky_Fit, labels = stpathEst_c2, free=FALSE, values=0, name"ACE_Cholesky") ?
But then, how would I even fix more specific pathways, like just c2 for "LogSCT"?
Sorry, I was mistaken in thinking that OpenMx would assign labels to unlabeled free parameters at runtime. If the umx library has a convenient function to add labels to your matrices, then I second Tim Bates' suggestion above to use umx.
Or, you could do it manually, e.g. define your "c" matrix as
I missed the fact that your MxModel "ACE_Cholesky" contains a submodel "ACE." To access the labels of your "c" matrix, I think you would (assuming you're using OpenMx version 1.x) do
ACE_Cholesky_Fit$ACE$c@labels
.Everything in your script after
ACE_Cholesky_Summ <- summary(ACE_Cholesky_Fit)
appears to be output formatting, and is extraneous to OpenMx's actual model-fitting. So, no, "stPathEst_c2" is not a label that OpenMx would understand.Is "LogSCT" one of your phenotypes?
One way to label the cells in your model is to use the umx library of helper functions.
It contains umxLabel() wbich, given a matrix model as input labels each cell with the matrix name, row and column number like "a_r1_c2"
Play with it and its examples a bit to see how it works.
http://tbates.github.io