ACE Model

Posted on
No user picture. Noandria1 Joined: 01/06/2017
Hey guys. I am trying to do an ACE Model with my own twin data for my master thesis.
So far I had not a lot to do with statistics and also the Program R was new for me, so I spent a lot of time to get into the topic.
As basic skript I used http://openmx.psyc.virginia.edu/docs/OpenMx/2.6.7/GeneticEpi_Matrix.html

I have some questions for you and hope someone can help me with them.

1. I tested the skript with the standard twinData set and tried to achieve the same output as they did in the paper: Estimating Heritability from Twin Studies (from Karin Verweij). I was able to reconstruct the same results expect the c^2 value was different. Could someone tell me how to get the correct c^2 value (compare skript)?

2. In the attachment you will find a file with some variables from my twin study. How do I determine the starting values? Can I use the Falconer's formula to set the start values?

PS: Skript and dataset are attached! Duration 1 stands for sleep duration in twin 1 and duration 2 for the sleep duration twin 2.

I would be greatful for your help.
Best
Andrea

Replied on Wed, 01/18/2017 - 13:39
Picture of user. AdminRobK Joined: 01/24/2014

1. I tested the skript with the standard twinData set and tried to achieve the same output as they did in the paper: Estimating Heritability from Twin Studies (from Karin Verweij). I was able to reconstruct the same results expect the c^2 value was different. Could someone tell me how to get the correct c^2 value (compare skript)?

How different is the c² value from what's in the paper? Also, which version of OpenMx are you running? The paper you reference looks to be from 2011, so its authors would have been using OpenMx v1.x. If you're running v2.2 or newer, then the default optimizer would be different from what it was in v1.x. If you installed OpenMx from our website (and not from CRAN), you can switch to the old optimizer by adding

mxOption(NULL,"Default optimizer","NPSOL")

to your script at some point after loading OpenMx. Also, does the paper say what operating system and CPU its authors used? It's possible to get somewhat different results if you're not using the same platform as they (although NPSOL behaves quite consistently across platforms). If you think that the optimizer is stopping at a local minimum, you can try replacing mxRun() with mxTryHard() in your script.

2. In the attachment you will find a file with some variables from my twin study. How do I determine the starting values? Can I use the Falconer's formula to set the start values?

The start values just need to be reasonable. Since the free parameters in your case are the square roots of the biometric variance components, the square root of the phenotypic variance divided by 3 should be reasonable. You could also use Falconer-style start values calculated from the MZ and DZ covariances. If finding good start values is an issue (I don't think it will be), you could try replacing mxRun() with mxTryHard() in your script.

Edit: you might also be interested in the umxACE() function from the 'umx' package.

Replied on Wed, 01/18/2017 - 15:32
No user picture. Noandria1 Joined: 01/06/2017

In reply to by AdminRobK

First of all, thank you very much for your response. Nobody from our institute worked before with ACE Models and openMx so I was kind of lost.

1. The c² is 2.332136e^-14 instead of 4.377829 e^-14. I installed OpenMx now from our website and add the recommended optimizer code after the require(OpenMx) command. Funny enough I get know the value 7.337437e^12 for c². And also the numbers for the estimated a11, c11 and e11 have the wrong sign ;( But I don't think it is to important because for my results I only have to make a table with -2LL, df, AIC, A, C and E values (and therefore I read it is common to use the standardized ones). They say nothing about the platform they used.

2. I somehow don't understand how it doesn't really matter which starting values I choose (they just need to be reasonable).
Is it okey if I choose for svPa <- 0.422 (Falconer-style start)? What about svMe? Could you make an example with my data how you would make it? (Sorry it is hard to understand for me, I am quite new (2 month) in this field)

I am working on a data set with 10 DZ and 15 MZ twin pairs. I measured 8 different parameters for sleep. Is it right if I do for each parameter a different ACE,CE,AE model (choosing the best according to AIC value) or am I barking up the wrong tree?
(Sorry may a stupid question, but my social enviroment only gave me the data and are not able to help me with the analysis)

Replied on Wed, 01/18/2017 - 17:05
Picture of user. AdminRobK Joined: 01/24/2014

In reply to by Noandria1

1. The c² is 2.332136e^-14 instead of 4.377829 e^-14. I installed OpenMx now from our website and add the recommended optimizer code after the require(OpenMx) command. Funny enough I get know the value 7.337437e^12 for c². And also the numbers for the estimated a11, c11 and e11 have the wrong sign ;( But I don't think it is to important because for my results I only have to make a table with -2LL, df, AIC, A, C and E values (and therefore I read it is common to use the standardized ones). They say nothing about the platform they used.

That discrepancy in c² values is too small to be of concern. All three of those values are basically zero [edit: I think? Should 7.337437e^12 actually be 7.337437e^-12?] Also, the path coefficients labeled "a11", "c11", and "e11" are square roots of the corresponding variance components, so their sign is immaterial.

2. I somehow don't understand how it doesn't really matter which starting values I choose (they just need to be reasonable).

The optimizer is trying to find values of the free parameters that minimize the fitfunction, -2logL (equivalent to maximizing the likelihood). It has to start its search someplace, namely, at the start values we give it. Giving it reasonable start values helps it find a minimum (i.e, a solution) more quickly., and make it more likely that the minimum it finds is the global minimum, out of the whole parameter space. It's possible (depending on the difficulty of the optimization problem) for the optimizer to reach the same minimum from different sets of reasonable start values. By "reasonable" start values, we mean start values that are somewhat close to the free parameter values at the solution, that satisfy all explicit and implicit constraints on the optimization problem, and which do not create numerical difficulties for the optimizer. You might be interested in some relevant presentations from an OpenMx workshop, here and here.

Is it okey if I choose for svPa <- 0.422 (Falconer-style start)? What about svMe? Could you make an example with my data how you would make it? (Sorry it is hard to understand for me, I am quite new (2 month) in this field)

I can't even get as far as the descriptive statistics in your script without making some changes. See below:

# zyg as factor
Data$zyg <- factor(Data$zyg, levels= c(1:2),labels= c("MZ", "DZ")) #<--Probably unnecessary
#OpenMx does not allow periods in data column names:
colnames(Data) <- c("Subjects","zyg","duration_1","duration_2","quality_1","quality_2")

### loading packages
require(OpenMx)
require(psych)

# Load Data
describe(Data, skew=F)

# Prepare Data
Vars <- 'duration' # choosing variable
nv <- 1 # nv = number of variables
ntv <- nv*2 # number of total variables (number types of twins(2) * number of variables
selVars <- c("duration_1", "duration_2")# selecting variables according to colume name
selVars

# Select Data for Analysis
mzData <- subset(Data, zyg=="MZ", selVars) #<--'zyg==1' may have worked in a previous version of R...
dzData <- subset(Data, zyg=="DZ", selVars)

# Generate Descriptive Statistics
colMeans(mzData)
colMeans(dzData,na.rm=TRUE)
mzcov <- cov(mzData,use="complete")
dzcov <- cov(dzData,use="complete")

# Set Starting Values
svMe <- mean(c(Data$duration_1,Data$duration_2),na.rm=T) # start value for means
svPa <- sqrt(2*(mzcov[1,2]-dzcov[1,2]))
svPc <- sqrt(2*dzcov[1,2]-mzcov[1,2])
svPe <- sqrt(mzcov[2,2]-mzcov[1,2])

# ACE Model
# Matrices declared to store a, c, and e Path Coefficients
pathA <- mxMatrix( type="Full", nrow=nv, ncol=nv,
free=TRUE, values=svPa, label="a11", name="a" ) # matrix names with 11 referring to the first row and column of the matrix
pathC <- mxMatrix( type="Full", nrow=nv, ncol=nv,
free=TRUE, values=svPc, label="c11", name="c" )
pathE <- mxMatrix( type="Full", nrow=nv, ncol=nv,
free=TRUE, values=svPe, label="e11", name="e" )

I am working on a data set with 10 DZ and 15 MZ twin pairs. I measured 8 different parameters for sleep. Is it right if I do for each parameter a different ACE,CE,AE model (choosing the best according to AIC value) or am I barking up the wrong tree?
(Sorry may a stupid question, but my social enviroment only gave me the data and are not able to help me with the analysis)

Ordinarily I'd recommend doing a multitrait analysis of all 8 phenotypes, but your sample size is so small that the approach you suggest here--doing 8 separate monophenotype analyses--is probably best in your case.

Also, unless your twins are all the same sex and have minimal variation in age, you're going to want to adjust for age and sex in your model's means.

Replied on Wed, 01/18/2017 - 17:55
No user picture. Noandria1 Joined: 01/06/2017

In reply to by AdminRobK

Thanks Rob for your excellent explications, you are great.
The small data set was as the reason I was thinking about doing 8 separate monphenotype analyses. The focus on our study was the duration of measurement (6 months of measured nights by actigraphy). I also planned to do the ACE analysis separate for weekends and weekdays (to see if genetic effects on the parameters are more domiant during weekends). As I mentioned before I am the first person from our Institute working with SEM. I don't have really a contact person I can discuss the best steps with. Do you think my considerations regarding comparing monphenotype ACE for weekend and weekdays make sense?

All the twins were between 12.25 and 16.5 years old and as well different genders. It seems that I have to adjust for age and sex as well.

I really appreciate your help!