when to standardize?

Benny
when to standardize?
AdminNeale
Data or Estimates?

Standardizing data is popular, but there are some gotchas. Suppose one standardizes a variable in a model for MZ and DZ twin pairs. One would want to standardize all scores with respect to the same sample mean and standard deviation. Doing so yields means and variances that differ slightly between twin 1 and twin 2 and between MZ and DZ pairs: all four data vectors (T1 and T2 for MZ and DZ) have sample means and standard deviations close to, but not exactly, zero and one respectively. This is the correct situation for analysis, because model fit then reflects in part the inconsistencies between what are usually replicate statistics (we observe 4 means and 4 variances). When those inconsistencies are great enough to cause model failure, it may point to a failure of the distributional assumption (data that are not multivariate normal). Or it might be due to something systematic like sibling interaction, which predicts different total variances for MZ vs. DZ pairs. Being able to learn about such possibilities is, IMO, one great advantage of model-fitting.
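
To make the pooled standardization concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the simulated scores, the twin correlations (0.7 for MZ, 0.4 for DZ), and the helper names `simulate_pairs` and `standardize` are made up and do not come from any real dataset or from OpenMx.

```python
import math
import random
import statistics

random.seed(2020)  # fixed seed so the illustration is reproducible

def simulate_pairs(n_pairs, r, mean=50.0, sd=10.0):
    """Hypothetical twin pairs with twin correlation r and a common total SD."""
    shared_sd = sd * math.sqrt(r)        # part shared by both twins
    unique_sd = sd * math.sqrt(1.0 - r)  # part unique to each twin
    pairs = []
    for _ in range(n_pairs):
        shared = random.gauss(0.0, shared_sd)
        t1 = mean + shared + random.gauss(0.0, unique_sd)
        t2 = mean + shared + random.gauss(0.0, unique_sd)
        pairs.append((t1, t2))
    return pairs

mz = simulate_pairs(200, r=0.7)  # assumed MZ correlation
dz = simulate_pairs(200, r=0.4)  # assumed DZ correlation

# One mean and one SD, computed over every score in the sample
# (both twins, both zygosity groups).
all_scores = [score for pair in mz + dz for score in pair]
pooled_mean = statistics.fmean(all_scores)
pooled_sd = statistics.pstdev(all_scores)

def standardize(pairs):
    """Standardize both twins' scores against the pooled mean and SD."""
    return [((t1 - pooled_mean) / pooled_sd, (t2 - pooled_mean) / pooled_sd)
            for t1, t2 in pairs]

mz_z, dz_z = standardize(mz), standardize(dz)
all_z = [score for pair in mz_z + dz_z for score in pair]

# Mean and SD of each of the four data vectors after standardization.
summary = {}
for label, pairs in (("MZ", mz_z), ("DZ", dz_z)):
    for twin in (0, 1):
        column = [pair[twin] for pair in pairs]
        summary[(label, f"T{twin + 1}")] = (statistics.fmean(column),
                                            statistics.pstdev(column))

for key, (m, s) in summary.items():
    print(key, round(m, 3), round(s, 3))
```

The pooled sample comes out with mean zero and SD one by construction, while each of the four vectors lands near, but not exactly on, those values. Those small discrepancies are the information the model fit gets to use.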

Standardizing results is a more common procedure, but we should recognize that it can sometimes lead to apparent inconsistencies in multiple-group work. Suppose we simply estimate the two variances and the covariance for two variables, and do so in two groups where the covariance (but not the variances) is constrained to be equal across groups. If the predicted variances differ between the groups, then the correlation will differ between the groups, which may not be desired. Fear not, however: there are ways to equate the correlation across two groups even if the variances and covariances differ. One way would be to add a non-linear constraint that calculates both correlations (with cov2cor, say) and equates them. Another would be to respecify the model as a correlation matrix pre- and post-multiplied by a diagonal matrix of standard deviation parameters.
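
A small numeric sketch of both points, in plain Python rather than OpenMx. The parameter values and the helper names `cov_to_cor` and `cov_from_cor` are hypothetical; the second helper just writes out the product D R D (diagonal SD matrix, correlation matrix, diagonal SD matrix) for the two-variable case.

```python
import math

def cov_to_cor(var_x, var_y, cov_xy):
    """Correlation implied by two variances and a covariance."""
    return cov_xy / math.sqrt(var_x * var_y)

# Case 1: covariance equated across groups, variances free (made-up values).
shared_cov = 0.5
r_group1 = cov_to_cor(1.0, 1.0, shared_cov)
r_group2 = cov_to_cor(2.0, 2.0, shared_cov)
print(r_group1, r_group2)  # same covariance, different correlations

def cov_from_cor(r, sd_x, sd_y):
    """2x2 covariance matrix D R D built from a correlation and two SDs."""
    return [[sd_x * sd_x, sd_x * r * sd_y],
            [sd_y * r * sd_x, sd_y * sd_y]]

# Case 2: correlation equated across groups, SDs free (made-up values).
shared_r = 0.4
sigma1 = cov_from_cor(shared_r, 1.0, 1.5)
sigma2 = cov_from_cor(shared_r, 2.0, 0.8)

# Recover the correlation from each group's covariance matrix.
r1 = cov_to_cor(sigma1[0][0], sigma1[1][1], sigma1[0][1])
r2 = cov_to_cor(sigma2[0][0], sigma2[1][1], sigma2[0][1])
print(sigma1[0][1], sigma2[0][1])  # covariances differ
print(r1, r2)                      # correlations agree
```

With the D R D parameterization the equality is placed on the correlation itself, so no non-linear constraint is needed; the price is that the free parameters are standard deviations rather than variances.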

This may be the longest response on record relative to the length of the original post (0 characters) :).

Benny

Oh, I am so sorry! Somehow, I managed to delete my original post. Thanks a lot for your answer! I think my question was whether the data for a twin model should be standardized separately by zygosity, i.e., whether to standardize MZ data with respect to the MZ mean and variance and DZ data with respect to the DZ mean and variance. I understand your answer as a recommendation not to use zygosity-specific standardization.

AdminNeale
Correct, don't standardize MZ and DZ separately

As noted, MZ and DZ total variances sometimes differ. This can occur when siblings have an effect on each other and the MZ and DZ correlations differ to begin with (due to, e.g., genetic sources of variance). To be able to test such models, the metric should be the same for both MZ and DZ groups.
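
A toy illustration of why the metric matters, again in plain Python. The scores are made up, with the MZ group deliberately given the larger spread; `zscores` is a hypothetical helper. Standardizing each zygosity group against its own mean and SD wipes out the MZ/DZ variance difference, while pooled standardization preserves it.

```python
import statistics

# Hypothetical raw scores; MZ spread is larger on purpose.
mz_scores = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
dz_scores = [5.0, 6.0, 7.0, 8.0, 9.0, 10.0]

def zscores(values, mean, sd):
    """Standardize values against the supplied mean and SD."""
    return [(v - mean) / sd for v in values]

def variance_ratio(mz, dz):
    """MZ variance divided by DZ variance."""
    return statistics.pvariance(mz) / statistics.pvariance(dz)

# Zygosity-specific: each group uses its own mean and SD.
mz_own = zscores(mz_scores, statistics.fmean(mz_scores), statistics.pstdev(mz_scores))
dz_own = zscores(dz_scores, statistics.fmean(dz_scores), statistics.pstdev(dz_scores))

# Pooled: both groups use the mean and SD of the combined sample.
pooled = mz_scores + dz_scores
pooled_mean = statistics.fmean(pooled)
pooled_sd = statistics.pstdev(pooled)
mz_pooled = zscores(mz_scores, pooled_mean, pooled_sd)
dz_pooled = zscores(dz_scores, pooled_mean, pooled_sd)

ratio_raw = variance_ratio(mz_scores, dz_scores)
ratio_own = variance_ratio(mz_own, dz_own)
ratio_pooled = variance_ratio(mz_pooled, dz_pooled)

print("raw:", ratio_raw)
print("zygosity-specific:", ratio_own)
print("pooled:", ratio_pooled)
```

After zygosity-specific standardization the ratio collapses to one, so a model that predicts unequal MZ and DZ variances has nothing left to be tested against. Pooled standardization is a single linear rescaling of every score, so the raw ratio survives.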

Benny
Thank you!

Great, thank you for the clarification!