Hi!

I am currently interested in understanding how well a large number of variables (20-40) explains an outcome in comparison to another set of variables. These variables do not represent a construct in the psychometric sense, but can be broadly viewed as a certain class of influences. On the phenotypic level this is quite straightforward: I can fit regression models and look at (incremental) R2 values, or train a LASSO or something similar on a training set and compare the predictive R2 values on the validation set.
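For illustration, the phenotypic comparison could be sketched roughly as follows. This is only a minimal sketch on simulated data, with plain least squares standing in for LASSO; the sample size, effect sizes, and all names (`X_a`, `X_b`, `validation_r2`) are hypothetical and not taken from any real dataset.

```python
import numpy as np

# Simulated stand-in data (assumption): set A has 30 weak predictors,
# set B has 5 stronger ones, and both contribute to the outcome y.
rng = np.random.default_rng(0)
n, p_a, p_b = 2000, 30, 5
X_a = rng.normal(size=(n, p_a))
X_b = rng.normal(size=(n, p_b))
y = X_a @ np.full(p_a, 0.1) + X_b @ np.full(p_b, 0.3) + rng.normal(size=n)

# First half is the training set, second half the validation set.
train = np.arange(n) < n // 2
valid = ~train


def validation_r2(X, y, train, valid):
    """Fit OLS on the training rows and return R2 on the validation rows."""
    Xd = np.column_stack([np.ones(len(X)), X])  # add an intercept column
    coef, *_ = np.linalg.lstsq(Xd[train], y[train], rcond=None)
    pred = Xd[valid] @ coef
    ss_res = np.sum((y[valid] - pred) ** 2)
    ss_tot = np.sum((y[valid] - y[valid].mean()) ** 2)
    return 1.0 - ss_res / ss_tot


r2_a = validation_r2(X_a, y, train, valid)
r2_b = validation_r2(X_b, y, train, valid)
r2_ab = validation_r2(np.column_stack([X_a, X_b]), y, train, valid)

# Incremental R2 of each set over and above the other set.
incr_a = r2_ab - r2_b
incr_b = r2_ab - r2_a

print(f"R2 set A: {r2_a:.3f}, R2 set B: {r2_b:.3f}, R2 both: {r2_ab:.3f}")
print(f"Incremental R2 of A over B: {incr_a:.3f}, of B over A: {incr_b:.3f}")
```

With real data, the least-squares fit would be replaced by whatever penalised model is actually used; the comparison logic stays the same.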

Ideally, however, I would like to go further and decompose these covariances using a multivariate ACE model, so that I can find out whether, for example, my 20-40 variables jointly explain much of A but almost none of C or E, especially compared to the other set of variables. Unfortunately, I am not sure what the best approach would be. In theory I could run a full multivariate ACE model, but this would be prohibitively expensive to fit and take an extremely long time. Moreover, I don't really need most of the parameters that blow up the model. A more pragmatic solution might therefore be to create a single best predictor from my regression/LASSO based on all 20-40 variables and then fit just a bivariate ACE model (or a trivariate one, if I do the same for the other set of variables in question). However, I suspect that this might not be a valid way of doing things.
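For concreteness, the quantity of interest in the bivariate case could be written as below. This is only a sketch under the usual Cholesky parameterisation, with the composite predictor ordered first and the outcome second; the symbols ($a_{11}$, $a_{21}$, $a_{22}$) are notation introduced here for illustration, and the sketch says nothing about whether a composite estimated in the same sample is a valid input.

```latex
% Additive genetic covariance matrix of (predictor, outcome)
\Sigma_A = L_A L_A^{\top},
\qquad
L_A = \begin{pmatrix} a_{11} & 0 \\ a_{21} & a_{22} \end{pmatrix}

% Additive genetic variance of the outcome
\operatorname{Var}_A(\text{outcome}) = a_{21}^{2} + a_{22}^{2}

% Share of the outcome's A variance that is shared with the predictor
\frac{a_{21}^{2}}{a_{21}^{2} + a_{22}^{2}}
```

The same ratio, built from the corresponding paths $c_{21}, c_{22}$ and $e_{21}, e_{22}$, would give the shares for C and E, which is the comparison described above.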

My situation doesn't strike me as particularly unusual, and I guess that something similar happens to researchers all the time, though I was unable to find a dedicated discussion of it in the literature. Did I miss something basic?

Thanks for your help!

Tobias