Theoretical question

luna.clara
Joined: 12/23/2012 - 05:28
Theoretical question

Hi all,
I have a theoretical question.
In what ways are apparent qualitative and quantitative sex differences addressed, or not addressed, by simply adjusting for sex in a multivariate model?
The thing is that, in separate univariate models, I observed that 3 of my 6 variables (the ones going into the multivariate model) showed quantitative or qualitative sex differences. I then adjusted for sex and age in the multivariate model, but I'd like to explain in what way those genetic sex differences in some of the variables are, or are not, addressed in the multivariate model just by adjusting for sex. Does that make sense?
Many thanks in advance!

luna.clara
Joined: 12/23/2012 - 05:28
Any tips for the question

Any tips for the question above?
Many thanks!

AdminHunter
Joined: 03/01/2013 - 11:03
more information?

Without being more familiar with certain BG terms or a more detailed explanation from you of how you "adjusted" for sex and age, I wouldn't be able to speculate.

luna.clara
Joined: 12/23/2012 - 05:28
Thanks for all this help and

Thanks for all this help and information.

Sorry that my question was not very informative.

First I performed the qualitative and quantitative models for each of my variables one by one (univariate analysis). And I saw that some, but not all, of the variables accounted for genetic sex differences.

Second, I did a multivariate twin analysis but using the residuals of each continuous variable after being adjusted by gender and age in a regression model (following other previous papers in the same field).

My question is: in what sense are those genetic sex differences (found in the univariate analysis) addressed, or not addressed, by simply adjusting for sex in the multivariate analysis? How can I justify this?

RobK
Joined: 04/19/2011 - 21:00
Let me try again

First, when you say "some, but not all, of the variables accounted for genetic sex differences," you mean that you observed evidence of genetic sex differences for some, but not all, of the phenotypes, correct? I'll assume so, but if not, then I'm still a bit confused about the kind of analysis you're doing.

For reasons explained in McGue & Bouchard (1984), it is usually advisable to adjust for sex (and age) when conducting biometric analysis of twin data. Imagine if the phenotype were height. Males are taller than females on average, so if you ignored the main effect of sex, the phenotypic correlation for MZ twins (and same-sex DZ twins) would be spuriously high. If you didn't adjust for sex in your monophenotype analyses, then their results are harder to interpret, especially if you have both same-sex and opposite-sex DZ twin pairs in your sample, which seems to be the case.
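
To make the height example concrete, here is a minimal simulation sketch in Python (not OpenMx). The within-sex twin correlation of 0.5, the mean difference of 1.5 SD, the sample size, and all function names are assumptions chosen purely for illustration.

```python
import numpy as np

def simulate_same_sex_pairs(n_pairs=5000, within_sex_r=0.5,
                            sex_mean_diff=1.5, seed=0):
    """Simulate same-sex twin pairs whose true within-sex correlation is known."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, within_sex_r], [within_sex_r, 1.0]])
    scores = rng.multivariate_normal([0.0, 0.0], cov, size=n_pairs)
    male = rng.integers(0, 2, size=n_pairs)       # 1 = male pair, 0 = female pair
    scores = scores + sex_mean_diff * male[:, None]  # main effect of sex on the mean
    return scores, male

def twin_correlation(scores):
    """Pearson correlation between twin 1 and twin 2 scores."""
    return np.corrcoef(scores[:, 0], scores[:, 1])[0, 1]

def adjust_for_sex(scores, male):
    """Subtract each sex's own mean (equivalent to regressing on a sex dummy)."""
    adjusted = scores.copy()
    for s in (0, 1):
        mask = male == s
        adjusted[mask] -= scores[mask].mean()
    return adjusted

scores, male = simulate_same_sex_pairs()
raw_r = twin_correlation(scores)                        # sex ignored
adj_r = twin_correlation(adjust_for_sex(scores, male))  # sex partialled out
print(f"ignoring sex: r = {raw_r:.2f}; adjusting for sex: r = {adj_r:.2f}")
```

Under these assumed values, the unadjusted correlation should land well above the true 0.5 (roughly 1.0625 / 1.5625 = 0.68 in expectation), while the adjusted one should recover something close to 0.5.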

But, the genetic sex differences you refer to are distinct from the main effect of sex on the phenotypic mean (as in a regression). They reflect sex-related differences in the residual variance-covariance structure, once the main effect of sex has been partialled out. If quantitative sex differences are present, then the decomposition of the residual variance into A, C, and E (or whatever) components is not the same for males and females. If qualitative sex differences are present, then the genetically mediated phenotypic resemblance between OS-DZ twins will be less than that between same-sex DZ twins.
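
As a rough illustration of the quantitative case, the following Python sketch (not OpenMx) simulates MZ pairs in which the sexes differ both in mean and in the A/E split, then regresses out sex. The variance components (0.7/0.3 for females, 0.3/0.7 for males), the mean difference of 1.0, and the helper name are all assumptions made up for the example.

```python
import numpy as np

def simulate_mz_pairs(n_pairs, a, e, mean, rng):
    """MZ pairs: additive-genetic factor A is shared, E is unique to each twin."""
    A = rng.standard_normal(n_pairs)
    E = rng.standard_normal((n_pairs, 2))
    return mean + a * A[:, None] + e * E

rng = np.random.default_rng(1)
# Same total variance (1.0) in both sexes, but a different decomposition.
females = simulate_mz_pairs(4000, a=np.sqrt(0.7), e=np.sqrt(0.3), mean=0.0, rng=rng)
males = simulate_mz_pairs(4000, a=np.sqrt(0.3), e=np.sqrt(0.7), mean=1.0, rng=rng)

# Residualize the phenotype on sex with ordinary least squares.
y = np.concatenate([females.ravel(), males.ravel()])
sex = np.concatenate([np.zeros(females.size), np.ones(males.size)])
X = np.column_stack([np.ones_like(sex), sex])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# Put the residuals back into (pair, twin) layout for each sex.
resid_f = resid[:females.size].reshape(-1, 2)
resid_m = resid[females.size:].reshape(-1, 2)
r_f = np.corrcoef(resid_f[:, 0], resid_f[:, 1])[0, 1]
r_m = np.corrcoef(resid_m[:, 0], resid_m[:, 1])[0, 1]
print(f"estimated sex effect on the mean: {beta[1]:.2f}")
print(f"MZ correlation of residuals: females {r_f:.2f}, males {r_m:.2f}")
```

The regression removes the mean difference, but the MZ correlations of the residuals should still differ between the sexes, which is the point: the adjustment acts on the means, not on the variance-covariance structure.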

In summary, partialling out the main effect of sex is something you should generally be doing in the first place, and if you didn't do so in your monophenotype analyses, I'm not sure how much you should trust their results. However, I don't see any reason why you couldn't do the "multivariate" analysis you describe if you want to investigate quantitative and qualitative sex differences.

Does that clear things up for you?

RobK
Joined: 04/19/2011 - 21:00
Technical note

This is a technical note only tangentially related to your question, so I'm putting it in a separate post.

"using the residuals of each continuous variable after being adjusted by gender and age in a regression model (following other previous papers in the same field)."

While it is certainly true that this approach has frequently been used by behavior geneticists, it is a suboptimal, 1980s-era way of doing the analysis, and better alternatives exist. Instead, I suggest incorporating the regression onto age and sex into your main analysis; in OpenMx, you would use them as "definition variables," and condition the model-expected means on them.
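
To show what "conditioning the model-expected means" amounts to, here is a bare-bones sketch of the idea in Python rather than OpenMx. It is not how OpenMx is called; it only writes out a -2 log-likelihood for twin pairs in which each twin's expected mean depends on that twin's own sex and age. The function name, parameter layout, and simulated values are all assumptions for illustration, and no optimizer is included.

```python
import numpy as np

def pair_neg2_loglik(params, y, sex, age):
    """-2 log-likelihood of twin pairs under a bivariate normal model.

    y, sex, age have shape (n_pairs, 2). Each twin's expected mean is
    conditioned on that twin's own covariate values; the covariance
    matrix is shared by all pairs.
    """
    b0, b_sex, b_age, var, cov = params
    mu = b0 + b_sex * sex + b_age * age          # row-specific expected means
    d = y - mu
    sigma = np.array([[var, cov], [cov, var]])
    sigma_inv = np.linalg.inv(sigma)
    _, logdet = np.linalg.slogdet(sigma)
    quad = np.einsum("ij,jk,ik->i", d, sigma_inv, d)  # d' Sigma^-1 d per pair
    return np.sum(quad + logdet + 2 * np.log(2 * np.pi))

# Simulated example data (all values are made up).
rng = np.random.default_rng(2)
n_pairs = 2000
age = np.repeat(rng.uniform(20, 60, size=(n_pairs, 1)), 2, axis=1)  # twins share age
sex = rng.integers(0, 2, size=(n_pairs, 2)).astype(float)           # allows OS pairs
noise = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], size=n_pairs)
y = 1.0 + 0.8 * sex + 0.03 * age + noise

# Means conditioned on sex and age, versus a single grand mean.
fit_conditioned = pair_neg2_loglik((1.0, 0.8, 0.03, 1.0, 0.5), y, sex, age)
fit_unconditioned = pair_neg2_loglik(
    (y.mean(), 0.0, 0.0, y.var(), np.cov(y[:, 0], y[:, 1])[0, 1]), y, sex, age)
print(fit_conditioned < fit_unconditioned)
```

In a real analysis the regression coefficients and the variance components would be estimated jointly by minimizing this kind of function, with the covariance matrix further decomposed into biometric components.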

tbates
Joined: 07/31/2009 - 14:25
no missing allowed in definition variables...

Bearing in mind that when the model contains definition variables no missing values are allowed in any rows of the covariates. This usually results in numerous whole rows being dropped along with the precious phenotype data, and even twins with no missingness but where the co-twin is missing a covariate, no?

As the list of covariates grows, it's common in lm()-type analyses to end up with big reductions in n, which is what we'd like to avoid with FIML.

It's common in regression world to see advocacy for multiple imputation as the best response in that case.

PS: Is there a paper showing that using a definition variable to model the means generates outcomes that differ from simply running the analysis on the residuals? If so, I'd like to read it, if not, that sounds like a low-hanging paper to write which would likely be cited a few hundred times (Bouchard = 400+ and counting) if accompanied with example models...

RobK
Joined: 04/19/2011 - 21:00
Missing data
"Bearing in mind that when the model contains definition variables no missing values are allowed in any rows of the covariates. This usually results in numerous whole rows being dropped along with the precious phenotype data, and even twins with no missingness but where the co-twin is missing a covariate, no?"

I deal with that problem by setting the missing definition variable to a "pseudo-missing" value--something like -999. Then, I set the corresponding phenotype value to NA. Then, with FIML, OpenMx works around that missing value on the phenotype, and the pseudo-missing value on the definition variable is ignored. This is a trick Matt McGue came up with for classic Mx. The idea is that if twin B's age (or whatever) is missing, we cannot condition the model for twin B's phenotype on his/her age, so we cannot use that phenotype score. It amounts to person-wise, rather than pairwise, deletion of incomplete cases, which is what OLS regression would do anyhow.
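
A minimal sketch of that recoding step, written in Python with NumPy rather than R, just to show the logic. The function name, the array layout (one row per pair, one column per twin), and the toy values are assumptions for illustration; -999 is simply the placeholder mentioned above.

```python
import numpy as np

def apply_pseudo_missing(phenotype, covariate, placeholder=-999.0):
    """Recode missing covariates so that only the affected twin is dropped.

    Wherever a twin's covariate is missing, the covariate gets a placeholder
    value and that same twin's phenotype is set to missing (NaN).
    """
    phenotype = np.array(phenotype, dtype=float)  # np.array copies the input
    covariate = np.array(covariate, dtype=float)
    missing = np.isnan(covariate)
    phenotype[missing] = np.nan       # this twin's score cannot be conditioned
    covariate[missing] = placeholder  # no NaN left in the definition variable
    return phenotype, covariate

# Toy data: rows are twin pairs, columns are twin A and twin B.
pheno = [[1.2, 0.8], [0.5, 1.1], [2.0, 1.7]]
age = [[30.0, 30.0], [41.0, np.nan], [np.nan, 25.0]]

new_pheno, new_age = apply_pseudo_missing(pheno, age)
print(new_pheno)
print(new_age)
```

In the toy data, the co-twin with an observed covariate keeps his or her phenotype score, so the pair is not lost as a whole.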

If you are only interested in the main effects of definition variables, a better strategy (from a missing data standpoint) is to treat them as random regressors, and model their covariance with the phenotypes as a regression of the phenotypes onto them. They will no longer be "definition variables" as far as OpenMx is concerned, and you will instead be modeling the joint distribution of phenotypes and covariates. This will work around the missing values on those covariates via FIML.

I agree that there is a lot to be said in favor of dealing with missing data via multiple imputation, though I have never used it in a twin analysis. Even so, incorporating the regression into the biometric analysis doesn't preclude using multiple imputation. How is the multiple imputation usually done, in your experience?

I am not aware of any publication that speaks to this matter, but it is one of a few "bad habits of twin researchers" I have thought about writing a paper or giving a talk about. I'm prepared to make statistical-theoretical arguments against separately residualizing the phenotype prior to the biometric analysis. However, from a practical standpoint, I suspect that it is relatively benign, in that doing it that way versus doing it my way won't usually make a large difference in the results. But that's something that would be best answered via Monte Carlo experiments...

RobK
Joined: 04/19/2011 - 21:00
Hi, Clara. I'll take a stab

Hi, Clara. I'll take a stab at answering your question. I'm pretty sure I understand what you're asking, but not 100% sure, so you might have to clarify (Clara-fy? haha!) a bit.

When you "adjust for sex," you are incorporating sex into the model for the phenotypic means. This is essentially the same thing as regressing the phenotypes onto sex. In contrast, the sex differences to which you refer both pertain to the model for the variance (i.e., the variance-covariance structure). If quantitative sex effects are present, then the decomposition of the phenotypic variance into A, C, and E (or whatever) components is not the same for males and females. For instance, perhaps the additive-genetic component is larger, and the shared-environmental component is smaller, for females than for males.
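
In symbols, the distinction can be sketched roughly as follows. The notation is an assumption introduced here for illustration: y_ij is the phenotype of twin j in pair i, the betas are regression coefficients, and the m and f subscripts mark male-specific and female-specific path coefficients.

```latex
\[
\begin{aligned}
\mathrm{E}[y_{ij}] &= \beta_0 + \beta_{\text{sex}}\,\text{sex}_{ij} + \beta_{\text{age}}\,\text{age}_{ij} \\
\mathrm{Var}(y_{ij} \mid \text{male}) &= a_m^2 + c_m^2 + e_m^2 \\
\mathrm{Var}(y_{ij} \mid \text{female}) &= a_f^2 + c_f^2 + e_f^2
\end{aligned}
\]
```

Adjusting for sex only touches the first line. Quantitative sex differences mean that the second and third lines have different components, and no adjustment of the means changes that.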

Qualitative sex effects are detectable if opposite-sex DZ twin pairs are represented in the sample. Specifically, if the genetically-mediated phenotypic resemblance between OS-DZ twins is smaller than for same-sex DZ twins, then the interpretation is that qualitatively different genetic polymorphisms underlie the phenotype for males and females. That is, the genetically-mediated resemblance between OS-DZ twins is deflated because there are mutations that are trait-relevant for males but not for females, and/or vice versa. I suppose in principle a similar analysis could be done for shared-environmentally mediated resemblance, though I don't recall ever seeing it done.
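
One common way to write this down, given here as a sketch under stated assumptions rather than as the model you must use, is in terms of the expected DZ covariances. The notation is introduced for illustration: a and c are additive-genetic and shared-environmental path coefficients, with subscripts m and f for males and females, and r_g is the correlation between the genetic factors acting in males and in females. Shared-environmental factors are assumed to be perfectly correlated across the sexes.

```latex
\[
\begin{aligned}
\mathrm{Cov}(\text{DZ, male--male}) &= \tfrac{1}{2}\,a_m^2 + c_m^2 \\
\mathrm{Cov}(\text{DZ, female--female}) &= \tfrac{1}{2}\,a_f^2 + c_f^2 \\
\mathrm{Cov}(\text{DZ, opposite-sex}) &= \tfrac{1}{2}\,r_g\,a_m a_f + c_m c_f,
\qquad 0 \le r_g \le 1
\end{aligned}
\]
```

With r_g = 1 there are no qualitative genetic sex differences; r_g < 1 deflates the genetically mediated OS-DZ resemblance. The shared-environmental analogue would put a correlation in front of the c_m c_f term instead, and with twin data alone the two are generally not both estimable at once.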

Does that help?