Expected covariance matrix is non-positive-definite.

Attachment | Size |
---|---|
ExpCovPOSDEF.R | 6.6 KB |
DZcor.csv | 2.56 KB |
MZcor.csv | 2.58 KB |
I'm attempting to fit a 'saturated' model for pathway analysis, working with a twin data set. Compared against a true saturated model, all of our path diagrams fit the data horribly (p-values < 10^-10), but I believe I'm being penalized because we constrain the pathways to be the same for all 4 groups: within pairs (twin A and twin B) and across zygosity (MZ and DZ). So my goal was to build a constrained 'saturated' model instead, equating the necessary elements within each covariance matrix and across the two covariance matrices.
I've spent a lot of time playing with starting values - choosing some data driven values (which required some massaging to be positive definite to start) and some rather dumb values (off diagonal elements to .5 and diagonal elements to 1).
Is this truly just a starting value problem? Or are my constraints causing the issue? Any advice would be appreciated. I've attached faked correlation matrices for both groups and my code - which I admit is probably horrible to look at and try to parse through. I feel like my error is likely conceptual and completely based on some horrible misunderstanding on my part.
Thanks!
This is a starting values
(a,a), (a,b), (a,c), (b,b), (b,c), (c,c)
That means, if you give a list of A1-A5 with A1-A5, the first five paths put in are A1 with itself, A1 with A2, etc. Then the next four are A2 with itself and then with A3-A5. If you're thinking of the covariance matrix of A1-A5, 'unique.pairs' first does the first column top to bottom, then the second column from the second row to the bottom, and so on. In short, mxPath reads the lower triangular matrix by column (column major).
The lower.tri and upper.tri functions that lowerTriangle and upperTriangle must call are also column major. However, column major operations on the transposed matrices you put into these functions are effectively row major. If you look at the withinS set of starting values, you'll notice that it starts c(1, xxx, 1...). That means you're assigning the correlation between A1 and A3 to 1, which I presume you meant for the A2 variance. Stop using the t() function and your problems should go away. Your starting values look OK; you just have to put them in the right places.
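To make the ordering point concrete, here is a minimal base-R sketch. The 3-variable labels are made up for illustration and are not taken from the attached script; it uses only lower.tri, upper.tri and t, not the lowerTriangle/upperTriangle wrappers.

```r
# Illustrative labels only (hypothetical, not from ExpCovPOSDEF.R).
vars <- c("A1", "A2", "A3")

# pair[i, j] is "<row variable>~<column variable>".
pair <- outer(vars, vars, paste, sep = "~")

# Column-major walk of the lower triangle: the order described above
# for 'unique.pairs' (first column top to bottom, then the second, ...).
lower_order <- pair[lower.tri(pair, diag = TRUE)]

# Column-major walk of the upper triangle: variance, covariance, variance, ...
upper_order <- pair[upper.tri(pair, diag = TRUE)]

# Lower triangle of the transpose: reads the original matrix row by row.
tp <- t(pair)
transposed_order <- tp[lower.tri(tp, diag = TRUE)]

print(lower_order)
print(upper_order)
print(transposed_order)
```

In the lower-triangle order the third slot is the A3 with A1 pair, whereas in the upper-triangle order it is the A2 variance. A start vector that begins c(1, x, 1, ...) therefore fits the second ordering, not the first.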
In reply to This is a starting values by Ryne
Not a transpose issue, I promise. :)
I did iterations where the starting values were identical to the original correlation matrix, with the constraint that paths sharing a label also shared a starting value. This resulted in a non-positive-definite matrix of starting values, regardless of whether I started with the MZ correlations, the DZ correlations or an average. That is why I had the line adding a constant along the diagonal, but not to the first element, which was the fixed term (I'm not sure whether I still need the fixed term for the saturated model, or whether it even matters, since I've told OpenMx that I'm feeding it correlation matrices).
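For readers following along, here is a rough base-R sketch of the two steps described: checking a starting matrix for positive definiteness and adding a constant to the diagonal. The helper name is_pos_def, the 3 x 3 matrix and the 0.05 target are all hypothetical; none of it comes from the attached script or data.

```r
# Hypothetical helper: TRUE when every eigenvalue of a symmetric
# matrix is larger than a small tolerance.
is_pos_def <- function(m, tol = 1e-8) {
  ev <- eigen(m, symmetric = TRUE, only.values = TRUE)$values
  all(ev > tol)
}

# Made-up, correlation-like starting values that are NOT positive definite.
start <- matrix(c( 1.0, 0.9, -0.9,
                   0.9, 1.0,  0.9,
                  -0.9, 0.9,  1.0), nrow = 3, byrow = TRUE)

print(is_pos_def(start))

# Add just enough to the diagonal that the smallest eigenvalue becomes
# roughly 0.05 (an arbitrary target chosen for this sketch).
min_ev <- min(eigen(start, symmetric = TRUE, only.values = TRUE)$values)
ridge  <- max(0, 0.05 - min_ev)
nudged <- start + diag(ridge, nrow(start))

print(is_pos_def(nudged))
```

Note that this shifts every diagonal element, so the nudged matrix no longer has unit variances. Skipping some diagonal elements, as with the fixed term mentioned above, means the eigenvalue check needs to be rerun on the result.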
Any other ideas to overcome this error code?
In reply to Not a transpose issue, I promise. :) by kspoon
I looked into this a little
We can make incredibly simple starting values (variances=1, covariances=0) like so:
ModelSAT$MZ$S@values <- diag(12)
ModelSAT$DZ$S@values <- diag(12)
When I do so, the error goes away and I get a converged solution (-2LL = 5547.74). As the whole thing only took 1.4 seconds with those lousy starting values, I wouldn't worry that much about finding better ones.
In reply to I looked into this a little by Ryne
Uuuuuugh.
I know that in my 4 hours of struggles, I made sure to check for positive definiteness by doing eigen(ModelSAT$MZ$S@values), but was still getting the error. Who knows. At this point, I'm just immensely happy it's been resolved.
In reply to I looked into this a little by Ryne
let's check this into the error message
General Structural Equation Question
How would you work twinness into a pathway analysis with some family-level and some individual-level variables?
As you can see, my original thought was to account for the MZ and DZ groups separately, and to model paths between each variable in the A twin and the corresponding variable in the B twin, allowing those to differ between the groups (MZ and DZ) while the within-twin paths were constrained to be the same for all individuals. However, we are only interested in within-individual paths, so I shouldn't need to even add those correlations as parameters - right?
I still feel like the two group design is helpful since the correlations across twins differ between MZs and DZs, but is it really necessary? And I'm correct to run everything pairwise because of the family-level variables, right? Otherwise, I could split it up into 4 groups MZ-A, MZ-B, DZ-A, and DZ-B, but I also kill my degrees of freedom doing that...
Thanks!