Attachment | Size |
---|---|
estimates.txt | 1.71 KB |

Although this is relevant to both the Behavioral Genetics and OpenMx General Help forums, the question is fairly general, so I figured it should be posted here.

I'm currently fitting a multivariate twin model to ordinal data in OpenMx. The eventual goal is to look at common and independent pathway models for 8 ordinal measures. Before scaling up to that, I started with just 5 of the variables. Fitting an ACE model took roughly 10 minutes and returned a code RED. I then ran Ryne's start value loop (THANK YOU), with a few modifications, to see whether I could either clear the code RED or at least find a best model over the 50 iterations. Every run resulted in a code RED, and the -2LL differed from run to run.

I scaled back again, to 3 variables, and found similar issues. Running time was shorter, but I was still getting code RED. This time, 8 of the 50 start value runs came up error-free or code GREEN with the same answer, so I can at least feel confident about the estimates from that best model.
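To make the "8 of 50 runs agree" check concrete, here is a minimal sketch of how the results of a start value loop could be summarized once the status code and -2LL of each run have been collected. It is written in Python purely for illustration and is not part of Ryne's loop. The function name `summarize_runs`, the tolerance, and the example values are all hypothetical, and treating status codes 0 and 1 as "clean" is an assumption that should be checked against the optimizer in use.

```python
from typing import List, Optional, Tuple

# ASSUMPTION: status code 0 means no error and 1 means "code GREEN";
# every other code (for example a code RED) is treated as a failed run.
CLEAN_CODES = {0, 1}


def summarize_runs(
    runs: List[Tuple[int, float]], tol: float = 1e-3
) -> Optional[Tuple[float, int, int]]:
    """Summarize (status_code, minus2LL) pairs from a start value loop.

    Returns (best -2LL among clean runs, number of clean runs,
    number of clean runs within `tol` of the best), or None if
    no run finished cleanly.
    """
    clean = [m2ll for code, m2ll in runs if code in CLEAN_CODES]
    if not clean:
        return None
    best = min(clean)
    agreeing = sum(1 for m2ll in clean if abs(m2ll - best) <= tol)
    return best, len(clean), agreeing


# Hypothetical results, NOT taken from the models described in this post.
runs = [(6, 5123.40), (0, 5101.25), (1, 5101.25), (6, 5110.90), (0, 5104.00)]
summary = summarize_runs(runs)
print(summary)
```

If several clean runs land on the same minimum -2LL from different start values, that is some reassurance that the solution is not a local minimum. If no run is clean, the function returns `None` rather than picking a "best" model from failed fits.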

My coworker asked whether I could fit the model to the polychoric correlation matrix instead of the raw data to get around some of these issues. I generated MZ and DZ correlation matrices using hetcor in the polycor package. You can see how the MZ matrix compares to the correlation matrix from the Saturated model:

MZ Correlation from Saturated Model

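| | var1A | var2A | var3A | var1B | var2B | var3B |
|---|---|---|---|---|---|---|
| 1A | 1.000 | 0.256 | 0.284 | 0.626 | 0.051 | 0.343 |
| 2A | 0.256 | 1.000 | 0.448 | 0.107 | 0.418 | 0.383 |
| 3A | 0.284 | 0.448 | 1.000 | 0.079 | 0.287 | 0.721 |
| 1B | 0.626 | 0.107 | 0.079 | 1.000 | 0.285 | 0.326 |
| 2B | 0.051 | 0.418 | 0.287 | 0.285 | 1.000 | 0.543 |
| 3B | 0.343 | 0.383 | 0.721 | 0.326 | 0.543 | 1.000 |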

MZ Correlation from hetcor

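| | var1A | var2A | var3A | var1B | var2B | var3B |
|---|---|---|---|---|---|---|
| 1A | 1.000 | 0.279 | 0.374 | 0.627 | 0.075 | 0.380 |
| 2A | 0.279 | 1.000 | 0.436 | 0.101 | 0.421 | 0.356 |
| 3A | 0.374 | 0.436 | 1.000 | 0.143 | 0.265 | 0.733 |
| 1B | 0.627 | 0.101 | 0.143 | 1.000 | 0.264 | 0.325 |
| 2B | 0.075 | 0.421 | 0.265 | 0.264 | 1.000 | 0.549 |
| 3B | 0.380 | 0.356 | 0.733 | 0.325 | 0.549 | 1.000 |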
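For anyone who wants to quantify how far apart the two MZ matrices are, here is a small sketch that finds the largest absolute difference between corresponding off-diagonal entries. It is in Python only for illustration (the models themselves were fit in OpenMx), the values are copied from the two matrices above, and the helper name `largest_gap` is made up for this example.

```python
labels = ["1A", "2A", "3A", "1B", "2B", "3B"]

# MZ correlations from the Saturated model (copied from above).
saturated = [
    [1.000, 0.256, 0.284, 0.626, 0.051, 0.343],
    [0.256, 1.000, 0.448, 0.107, 0.418, 0.383],
    [0.284, 0.448, 1.000, 0.079, 0.287, 0.721],
    [0.626, 0.107, 0.079, 1.000, 0.285, 0.326],
    [0.051, 0.418, 0.287, 0.285, 1.000, 0.543],
    [0.343, 0.383, 0.721, 0.326, 0.543, 1.000],
]

# MZ correlations from hetcor (copied from above).
hetcor = [
    [1.000, 0.279, 0.374, 0.627, 0.075, 0.380],
    [0.279, 1.000, 0.436, 0.101, 0.421, 0.356],
    [0.374, 0.436, 1.000, 0.143, 0.265, 0.733],
    [0.627, 0.101, 0.143, 1.000, 0.264, 0.325],
    [0.075, 0.421, 0.265, 0.264, 1.000, 0.549],
    [0.380, 0.356, 0.733, 0.325, 0.549, 1.000],
]


def largest_gap(a, b, names):
    """Return (gap, row name, column name) for the largest absolute
    difference between two symmetric matrices, upper triangle only."""
    best = (0.0, "", "")
    for i in range(len(a)):
        for j in range(i + 1, len(a)):
            gap = abs(a[i][j] - b[i][j])
            if gap > best[0]:
                best = (gap, names[i], names[j])
    return best


gap, row, col = largest_gap(saturated, hetcor, labels)
print(f"largest gap: {gap:.3f} between {row} and {col}")
```

On these values the largest gap is 0.090, for the 1A and 3A pair. Most of the other entries differ by less than 0.03.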

Fitting the model to the correlation matrices took a matter of seconds and didn't give an error code. (Attached are the estimates from the "best" model after a series of 50 start value runs alongside those from the correlation matrix run. Some estimates differ by more than 0.1.)

Finally, my question: Is it valid to use the correlation matrix like this even though we have the raw data? I'm a bit concerned given the difference in estimates, but I know a lot of people fit structural equation models using correlation or covariance matrices. Any insight would be appreciated.

We've already tried something similar with just 4 variables, and I couldn't get rid of the code RED or get agreement in the -2LL from the start value loop. I see this problem getting much, much worse as we add variables, so it would be lovely if we could spend a fraction of the time fitting to correlation matrices instead.