heritability estimation from gene expression data
Hi,
I am new to the field of twin studies and am hoping for help and support from this forum.
I want to analyse gene expression data from twins in order to estimate heritability. I have seen many studies with other phenotypes but not many with gene expression data. Has anyone worked with expression data? Can anyone point me in the right direction?
Thanks
Nobody knows :(
In reply to Nobody knows :( by pinki999
Good Thing!
What form do the expression data take? I imagine a very large number of observations from an expression array, rather like the analysis of voxel-level data from MRI scans of the brain. Most of the task is to feed the expression measures through a pipeline (an R function) and summarize the results (perhaps as a heatmap plot, like individual expression arrays).
Do you have a specific question?
In reply to Good Thing! by neale
Hi, The main focus is to
The main focus is to carry out the heritability analysis. In the case of gene expression data, I think we will get an estimate for each gene. I am not sure what the data should look like if I want to carry out the heritability analysis. Currently my data are in a matrix format, with each row corresponding to a gene and each column corresponding to a sample. I also have zygosity, age and gender information. It would be great if anyone could suggest how to proceed. An example in this direction would be a great start for me.
Thank you
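To make the data layout described above concrete, here is a minimal sketch, in Python with only the standard library, of going from one row of a genes-by-samples matrix to twin pairs and a rough per-gene estimate. It uses Falconer's formula, h2 = 2 * (rMZ - rDZ), which is a swapped-in shortcut and not the lmer() or OpenMx modelling discussed elsewhere in this thread. All names (`expr`, `pair_ids`, `zygosity`, the "MZ"/"DZ" codes) are assumptions for illustration, and covariates such as age and sex are not adjusted for here.

```python
from statistics import mean


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def twin_pairs(values, pair_ids, zygosity):
    """Group one gene's values into (twin1, twin2) tuples, keyed by zygosity.

    values, pair_ids and zygosity are parallel lists, one entry per sample
    (i.e. per column of the expression matrix). Incomplete pairs are dropped.
    """
    by_pair = {}
    for value, pid, zyg in zip(values, pair_ids, zygosity):
        by_pair.setdefault((zyg, pid), []).append(value)
    pairs = {}
    for (zyg, _pid), vals in by_pair.items():
        if len(vals) == 2:
            pairs.setdefault(zyg, []).append((vals[0], vals[1]))
    return pairs


def double_entry_r(pairs):
    """Twin correlation with each pair entered in both orders,
    so the arbitrary twin1/twin2 labelling does not matter."""
    xs = [a for a, b in pairs] + [b for a, b in pairs]
    ys = [b for a, b in pairs] + [a for a, b in pairs]
    return pearson(xs, ys)


def falconer_h2(values, pair_ids, zygosity):
    """Rough heritability for one gene: 2 * (r_MZ - r_DZ)."""
    pairs = twin_pairs(values, pair_ids, zygosity)
    return 2.0 * (double_entry_r(pairs["MZ"]) - double_entry_r(pairs["DZ"]))


def heritability_by_gene(expr, pair_ids, zygosity):
    """expr maps gene name -> list of expression values (one matrix row)."""
    return {gene: falconer_h2(row, pair_ids, zygosity)
            for gene, row in expr.items()}
```

Calling `heritability_by_gene(expr, pair_ids, zygosity)` returns one number per gene, which matches the expectation above of one estimate per gene. Falconer estimates are not constrained to lie between 0 and 1, so a likelihood-based variance components model is usually the better choice for a real analysis.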
In reply to Hi, The main focus is to by pinki999
It has been done, but not in
The implementation was done in R using lmer(). The methodology is based on:
Visscher, P.M., Benyamin, B. & White, I. The use of linear mixed models to estimate variance components from data on twin pairs by maximum likelihood. Twin Res. 7, 670–674 (2004).
Cheers,
Ana.
In reply to It has been done, but not in by anav
It could be done in OpenMx though
We have discussed keeping the backend objects (a sort of pre-compiled mxModel) to skip the time spent sanity-checking the model in the front end for every analysis. This would be efficient with either MRI data, or say fitting the same model with a large number of SNPs along the genome, or your case of expression data. The advent of big data does make this more of a priority.
The localization step in Visscher et al., which uses pairwise identity-by-descent (IBD) sharing estimates for particular regions of the genome, could also be done in OpenMx using a definition variable for the IBD sharing (see this script of the aforementioned course). Also possible would be the slightly better (and statistically more powerful) mixture distribution approach, in which the probabilities that sib pairs are IBD 0, 1 or 2 are used as definition variables (see Eaves et al. 1996). In either approach it is possible to test for joint linkage and association using identity-by-state (IBS) measures as a covariate on the phenotype. This would effectively separate the association with the measured locus from the linkage with nearby loci.
The relative merits of variance components estimation via structural equation modeling or via lmer() are worth discussing, but I haven't the time to fill up "this margin" with elements of that.
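As a rough illustration of the two linkage approaches mentioned above, the following Python sketch writes out the expected sib-pair covariance when the estimated IBD proportion acts as a per-pair "definition variable", and the mixture likelihood over IBD 0, 1 and 2. It assumes the usual additive variance components parameterization (QTL variance `q`, residual additive genetic `a`, shared environment `c`, unique environment `e`); these names and functions are illustrative assumptions, not code from the linked course script or from OpenMx.

```python
import math


def sib_cov(pi_hat, q, a, c, e):
    """Expected 2x2 covariance matrix for a sib pair.

    pi_hat is the proportion of alleles shared IBD at the locus (0 to 1);
    it varies from pair to pair, which is the role of a definition variable.
    """
    var = q + a + c + e
    cov = pi_hat * q + 0.5 * a + c
    return [[var, cov], [cov, var]]


def bvn_logpdf(y1, y2, mu, sigma):
    """Log density of a bivariate normal with equal means and variances."""
    var, cov = sigma[0][0], sigma[0][1]
    det = var * var - cov * cov
    d1, d2 = y1 - mu, y2 - mu
    quad = (var * d1 * d1 - 2.0 * cov * d1 * d2 + var * d2 * d2) / det
    return -math.log(2.0 * math.pi) - 0.5 * math.log(det) - 0.5 * quad


def pihat_loglik(y1, y2, mu, pi_hat, q, a, c, e):
    """Pi-hat approach: a single covariance matrix per pair."""
    return bvn_logpdf(y1, y2, mu, sib_cov(pi_hat, q, a, c, e))


def mixture_loglik(y1, y2, mu, ibd_probs, q, a, c, e):
    """Mixture approach: weight the likelihoods under IBD 0, 1 and 2
    (sharing proportions 0, 0.5 and 1) by the pair's IBD probabilities."""
    shares = (0.0, 0.5, 1.0)
    total = sum(
        p * math.exp(bvn_logpdf(y1, y2, mu, sib_cov(s, q, a, c, e)))
        for p, s in zip(ibd_probs, shares)
    )
    return math.log(total)
```

Summing either log-likelihood over pairs and maximizing it with and without `q` fixed at zero would give a likelihood-ratio test for linkage at the locus. When a pair's IBD state is known with certainty, the two functions agree.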
In reply to It could be done in OpenMx though by neale
The link to the script doesn't work.
Also, maybe you could help me: I am trying to replicate the analysis from the Grundenberg paper in OpenMx. However, the main advantage of lmer() is that we can include random variables in the model. I have searched around, but I am not sure that is possible in OpenMx yet. Could you confirm whether random variables can be added to the models?
Thank you,
In reply to The link to the script doesn't work. by anav
Oops
http://www.vipbg.vcu.edu/HGEN619_2013/twinAeqg.R
This is from the April 22nd session of this course: http://www.vipbg.vcu.edu/HGEN619_2013/HGEN619_OpenMx.shtml
Yes, covariates can be included. I'd say that an advantage of a regression-based method is likely to be speed, compared to numerical optimization in OpenMx. The disadvantage would seem to be that one can't specify models for linkage/association with latent variables.