Model including groups with both binary and continuous/ordinal data

Posted on Thu, 12/08/2011 - 19:58

ebejer Joined: 03/18/2010

Attachment

Size

twinModels.R

9.21 KB

Forums

Behavioral Genetics Models

Hi,

I have heard that it is now possible to run a model estimating genetic and environmental effects with twin data that includes separate variables with binary and continuous (or ordinal?) coding.

I have been running a thresholds model with age and sex as covariates (which I will attach) and I would like to include the same behavioural measures from different data sets in the one model.

Two different scales have been used to collect this information (binary and continuousish), and I want to equate the A, C and E paths across datasets with the differing scales. I have been told this will necessitate the inclusion of a group for each dataset. But I am also unsure how to do this.

Is this possible and can you help? - and forgive me if I am missing some important technical detail.

Best wishes
Jane

Replied on Thu, 12/08/2011 - 21:24

neale Joined: Jul 31, 2009

Can be done but there's likely a problem.

Jane

Since the scales are different, there is likely to be differences in mean and variance between the two. Possibly, what you want to know is whether the proportions of A C and E are the same for the two scales?

The approach you take depends on whether the two scales were obtained from the same or different pairs of twins (i.e. the same sample for both scales or different samples). In the former case, you would not increase the number of groups, but make a multivariate model. In the latter - different samples - case you would add groups. Equating the proportions of variance could be done at least two ways. The simplest would be to use algebras to compute, e.g., A/(A+C+E) for the two measures, and to add a non-linear constraint that equates the two (all three could be cbind()'ed on each side of the ==, but since they are proportions, two would be sufficient). In the case of measures on the same sample, you would set up a multivariate analysis (say with a Cholesky model) and equate the two diagonal elements of the (now 2x2) A/(A+C+E) matrix and the C/(A+C+E) matrix. Extracting the elements from the diagonal could be done with A[1,1] type notation, so you might try A[1,1]/(A[1,1]+C[1,1}+E[1,1] == A[2,2]/(A[2,2]+C[2,2]+E[2,2]. I think. If it doesn't work then, well then we'll learn something making it do so :).

HTH
Mike

Replied on Sat, 12/10/2011 - 03:26

ebejer Joined: Mar 18, 2010

I am using data from two

I am using data from two separate samples so I need to include separate calculation groups. And I do want to compare the genetic and environmental effects across the approximately same variables in the two samples. One variable represents a normal distribution (continuous) and the other is represented with a binary code.

Unfortunately I haven't been able to find a model running both continuous and ordinal /binary data including separate groups. So what I have done is put together a mishmash of script that I though might work - but I don't really understand how the matrices will combine and the use of certain code. I will attach my script.

There is a matrix used to specify the values for each a, c and e path (cholSVnv) I'm not sure why - and I don't understand the abbreviation. I'm also unsure where and how to add additional groups. I thought perhaps as two sets of mz and dz models (one for each sample). And I'm not really up to interpreting the meaning of your comments yet Mike, sorry.

Can you please give me a tip on (1) how to add a calculation group, (2) whether I need a lower cholesky for 2 vars to represent the a, c and e paths (if that's what they are), and (3) what I need to specify in the objective function to include both variables in the model and (4) the order of how the matrices combine to provide the output.

perhaps I'm asking too much?

Jane

Replied on Sat, 12/10/2011 - 14:36

neale Joined: Jul 31, 2009

Not too much but I'm a bit pressed for time just now

Jane

If you have two univariate scripts, one for the ordinal, and one for the continuous data, I would be tempted to try combining them with something like the following

jointACE <- mxModel(aceOrd, aceCont, mxAlgebraObjective(aceOrd.Objective+aceCont.Objective))

at least this would get you started. In fact I think you are on the right track with your script as is, but you need an mxAlgebraObjective to combine the objective functions from all four groups. Something like

mxAlgebra( expression=MZ1.objective+DZ1.objective+MZ2.objective+DZ2.objective, name="minus2sumloglikelihood" ),
mxAlgebraObjective("minus2sumloglikelihood")

Adding the mxConstraint would be needed to equate the heritabilities, per previous post.

Replied on Mon, 12/12/2011 - 23:21

ebejer Joined: Mar 18, 2010

Hello again, I have found a

Hello again,

I have found a model that seems to combine the two separate datasets, however it returns a code red and there are two SD's that are quite large - I have played around with the starting values and the threshold and what I have in the script appears to work best. I am wanting to get some feedback on the 'reasonableness' of the model I have specified and what I might need to change to change the code red status and reduce the standard errors. Any suggestions will be greatly appreciated (I'll attach the output and script below).

Thanks,
Jane

Replied on Wed, 12/14/2011 - 22:18

smedland Joined: Aug 04, 2009

We have a version of the

We have a version of the script that works now. One data set has continuous data and the other has a binary measure of the same trait. To identify the ordinal model we have fixed e at 1 and estimated a and c.
The only way I can think of to 'equate' the standardised A,C and E estimates is through the mxConstraint function.
This does work and the results look sensible. But because we've used mxConstraint we lose the SE.
We can still get CI but it would be nice to have the se as well.

Can anyone think of an alturnate apporach?

thanks
Sarah (on behalf of Jane)

Replied on Fri, 12/09/2011 - 05:07

ebejer Joined: Mar 18, 2010

Thanks Mike, I'll start

Thanks Mike, I'll start testing models, you may here from me again soon.
j

Replied on Thu, 12/15/2011 - 02:08

neale Joined: Jul 31, 2009

Aha

Set E to 1 for continuous variable also - and estimate an SD parameter for the continuous variable. So you pre- and post-multiply the continuous expCov by diag matrix with sd on diagonal. Then you have the option to equate A and C parameters without nonlinear constraint.

Replied on Fri, 06/15/2018 - 18:41

massimiliano.orri Joined: Jun 11, 2018

Problem with bivariate model

Hi,
I am new to openMx, and I cannot figure out what is wrong in my model. I am trying to estimate a bivariate model in which one variable is continuous and the other is binary (for which I preliminarily used mxFactor). When I try to estimate the model, I obtain an error which says that my continuous variable is not categorical and that I should use mxFactor()...
I am pretty sure something is wrong in my model specification.
Here my model. Thanks for any help!
Massimiliano


# Continuous variable

nvC        <- 1       # number of continuous variables

ntvC       <- nv*2    # number of total continuous variables

selVarsC   <- c('varC')
# Binary variable

nvB        <- 1       # number of binary variables

ntvB       <- nv*2    # number of total binary variables

selVarsB   <- c('varB')
# Set Starting Values

svMe      <- 7                      # start value for means

svTh      <- 0

svPa      <- .2                        # start value for path coefficient

svPe      <- .5                        # start value for path coefficient for e
# ----------------------------------------------------------------------------------------------------------------------

# PREPARE MODEL
# Create Algebra for expected Mean Matrices

meanG     <- mxMatrix( type="Full", nrow=1, ncol=ntvC, free=TRUE, values=svMe, labels=labVars("mean",selVarsC), name="meanG" )
# Create Algebra for expected Thresholds Matrices

threshold 	<- mxMatrix( type="Full", nrow=1, ncol=ntvB, free=TRUE, values=svTh, labels=labVars("thre", selVarsB), name="threshold" )
# Create Matrices for Variance Components

covA      <- mxMatrix( type="Symm", nrow=nv, ncol=nv, free=TRUE, values=valDiag(svPa,nv), labels=labLower("VA",nv), name="VA" )

covC      <- mxMatrix( type="Symm", nrow=nv, ncol=nv, free=TRUE, values=valDiag(svPa,nv), labels=labLower("VC",nv), name="VC" )

covE      <- mxMatrix( type="Symm", nrow=nv, ncol=nv, free=TRUE, values=valDiag(svPe,nv), labels=labLower("VE",nv), name="VE" )
# Create Algebra for expected Threshold Matrices

expThresh   <- mxAlgebra( expression= threshold , name="expThresh" )
# Create Algebra for expected Variance/Covariance Matrices in MZ & DZ twins

covP      <- mxAlgebra( expression= VA+VC+VE, name="V" )

covMZ     <- mxAlgebra( expression= VA+VC, name="cMZ" )

covDZ     <- mxAlgebra( expression= 0.5%x%VA+ VC, name="cDZ" )

expCovMZ  <- mxAlgebra( expression= rbind( cbind(V, cMZ), cbind(t(cMZ), V)), name="expCovMZ" )

expCovDZ  <- mxAlgebra( expression= rbind( cbind(V, cDZ), cbind(t(cDZ), V)), name="expCovDZ" )
# Create Data Objects for Multiple Groups

dataMZ    <- mxData( observed=mzData, type="raw" )

dataDZ    <- mxData( observed=dzData, type="raw" )
# Create Expectation Objects for Multiple Groups

expMZ     <- mxExpectationNormal( covariance="expCovMZ", means="meanG", dimnames=selVars ,	thresholds="threshold")

expDZ     <- mxExpectationNormal( covariance="expCovDZ", means="meanG", dimnames=selVars ,	thresholds="threshold")

funML     <- mxFitFunctionML()
# Create Model Objects for Multiple Groups

pars      <- list( meanG, threshold, covA, covC, covE, covP )

modelMZ   <- mxModel( pars, covMZ, expCovMZ, dataMZ, expMZ, funML, name="MZ" )

modelDZ   <- mxModel( pars, covDZ, expCovDZ, dataDZ, expDZ, funML, name="DZ" )

multi     <- mxFitFunctionMultigroup( c("MZ","DZ") )
# Create Algebra for Standardization

matI      <- mxMatrix( type="Iden", nrow=nv, ncol=nv, name="I" )

invSD     <- mxAlgebra( expression=solve(sqrt(I*V)), name="iSD" )
# Calculate genetic and environmental correlations

corA      <- mxAlgebra( expression=solve(sqrt(I*VA))%&%VA, name ="rA" ) #cov2cor()

corC      <- mxAlgebra( expression=solve(sqrt(I*VC))%&%VC, name ="rC" )

corE      <- mxAlgebra( expression=solve(sqrt(I*VE))%&%VE, name ="rE" )
# Calculate Phenotypic Correlation
corP      <- mxAlgebra (expression=solve(sqrt(I*V)) %*% V %*% solve(sqrt(I*V)), name="rP")
# Create Algebra for Variance Components

rowUV     <- rep('UV',nv)

colUV     <- rep(c('VA','VC','VE'),each=nv)

estUV     <- mxAlgebra( expression=cbind(VA,VC,VE), name="UV", dimnames=list(rowUV,colUV) )

rowSV     <- rep('SV',nv)

colSV     <- rep(c('SA','SC','SE'),each=nv)

estSV     <- mxAlgebra( expression=cbind(VA/V,VC/V,VE/V), name="SV", dimnames=list(rowSV,colSV) )
# Create Confidence Interval Objects

ciUV      <- mxCI( c("UV") )

ciSV      <- mxCI( c("SV") )
# Build Model with Confidence Intervals

calc      <- list( matI, invSD, corA, corC, corE,corP, estUV, estSV, ciUV, ciSV )

modelACE  <- mxModel( "twoACEvc", pars, modelMZ, modelDZ, multi, calc )
# ----------------------------------------------------------------------------------------------------------------------

# RUN MODEL
# Run ACE Model

fitACE    <- mxRun( modelACE, intervals=T )

sumACE    <- summary( fitACE )

Replied on Mon, 06/18/2018 - 10:11

AdminRobK Joined: Jan 24, 2014

selVars?

You're passing dimnames=selVars as argument to mxExpectationNormal(), but object selVars is nowhere defined in your script. But the fact that your script runs at all means that selVars existed in your R workspace. I bet selVars, however it was defined, doesn't contain the column names for your mxFactor'd binary variable.

Replied on Wed, 06/20/2018 - 11:35

massimiliano.orri Joined: Jun 11, 2018

Follow-up question

Thank you very much Rob. You are right, it was indeed because of that. I fixed it.

A follow up question, as I am still struggling specifying this model...I think my problem is that I don't understand how to specify the expectationObject when I have 1 continuous and 1 binary variable. The dimension of the matrices and the dimnames I specified is always wrong.

If I specify only the 2 binary variables (for twin 1 and twin2) in the dimnames option, it seems that I am missing 2 variables, because the expected covariance matrix has 4 rows and columns. I specified as follows:
varsC <- c('y1') nvC <- 1 # number of continuous variables ntvC <- nvC*2 # number of total continuous variables selVarsC <- paste(varsC,c(rep("_T1",nvC),rep("_T2",nvC)),sep="")


varsB   <- c('u1')

nvB        <- 1       # number of binary variables

ntvB       <- nvB*2    # number of total binary variables

selVarsB   <- paste(varsB,c(rep("_T1",nvB),rep("_T2",nvB)),sep="")
# Create Matrices for expected Mean Matrices

meanG     <- mxMatrix( type="Full", nrow=1, ncol=ntvC, free=TRUE, values=svMe, labels=labVars("mean",selVarsC), name="meanG" )
# Create Matrices for expected Thresholds Matrices

threshold 	<- mxMatrix( type="Full", nrow=1, ncol=ntvB, free=TRUE, values=svTh, labels=labVars("thre",selVarsB), name="threshold" )
[...]

# Create Expectation Objects for Multiple Groups expMZ <- mxExpectationNormal( covariance="expCovMZ", means="meanG", dimnames=c(selVarsB), thresholds="threshold") expDZ <- mxExpectationNormal( covariance="expCovDZ", means="meanG", dimnames=c(selVarsB), thresholds="threshold")
and the error I obtains is:
Error: The expected covariance matrix associated with the expectation function in model 'MZ' is not of the same length as the 'dimnames' argument provided by the expectation function. The 'dimnames' argument is of length 2 and the expected covariance matrix has 4 rows and columns.

If I specify all my variables in the dimnames option (as follows):
expMZ <- mxExpectationNormal( covariance="expCovMZ", means="meanG", dimnames=c(selVarsB,selVarsC), thresholds="threshold") expDZ <- mxExpectationNormal( covariance="expCovDZ", means="meanG", dimnames=c(selVarsB,selVarsC), thresholds="threshold")

I obtain the following error, of course:
Error: The expected means vector associated with the expectation function in model 'MZ' is not of the same length as the 'dimnames' argument provided by the expectation function. The 'dimnames' argument is of length 4 and the expected means vector has 2 columns.

Should I perhaps have different expectation Objects for continuous and binary variables?
Or how can I specify that 2 variables are referring to the mean matrix while the 2 other variables to the threshold matrix?

I looked for an exemple of the openMx documentation, but I could not resolve the issue so far.

Thanks again for any help!

Replied on Wed, 06/20/2018 - 11:55

massimiliano.orri Joined: Jun 11, 2018

Maybe I solved the issue

Perhaps I should specify the threshnames argument in the mxExpectationNormal function:
It seems that now it's running...

# Create Expectation Objects for Multiple Groups expMZ <- mxExpectationNormal( covariance="expCovMZ", means="meanG", dimnames=c(selVarsB,selVarsC), thresholds="threshold", threshnames=selVarsB) expDZ <- mxExpectationNormal( covariance="expCovDZ", means="meanG", dimnames=c(selVarsB,selVarsC), thresholds="threshold", threshnames=selVarsB) funML <- mxFitFunctionML()

Replied on Mon, 06/25/2018 - 14:08

AdminRobK Joined: Jan 24, 2014

still not right

Perhaps I should specify the threshnames argument in the mxExpectationNormal function

That's right.

Do I understand correctly that the dimnames you're passing to mxExpectationNormal() are, in order, "u1_T1", "u1_T2", "y1_T1", "y2_T2"? If so, something's wrong. From the way your script is written, it should be both traits for twin #1 followed by both traits for twin #2, e.g. "u1_T1", "y1_T1", "u1_T2", "y1_T2".

I also don't see anything in your script that fixes either the mean or the threshold for the binary trait, nor anything that identifies the scale (variance) of the binary trait. Just to be clear: when you analyze a threshold variable in OpenMx, the variable is treated as though it reflects an unobserved, normally distributed continuum that has been "chopped" into two or more ordered categories. What is really being modeled is that latent Gaussian trait. In the binary case, in order to identify your model, you need to assume some value for either the latent trait's mean or the value of the threshold, and you need to assume a value for some parameter that scales the latent trait. I can't really make any specific suggestions about that until I'm sure that you have the dimnames for your expectations correct. I am actually surprised that your syntax runs with this,
meanG <- mxMatrix( type="Full", nrow=1, ncol=ntvC, free=TRUE, values=svMe, labels=labVars("mean",selVarsC), name="meanG" )
, which only has 2 columns when it ought to have 4 (i.e., the binary variables need model-expected means, too).

News

Recent Posts

Model including groups with both binary and continuous/ordinal data

Can be done but there's likely a problem.

I am using data from two

Not too much but I'm a bit pressed for time just now

Hello again, I have found a

We have a version of the

Thanks Mike, I'll start

Aha

Problem with bivariate model

selVars?

Follow-up question

Maybe I solved the issue

still not right

News

Recent Posts