# Multivariate Independent Pathway with Multiple Occasions

Joined: 07/14/2016 - 20:50

I have been analyzing achievement/cognitive test data from the NLSY79 Children data set. I have constructed sibling-cousin pairs for each of the five tests (PPVT, digit memory, reading comprehension, reading recognition and math) at 5 occasions (ages 5-6, 7-8, 9-10, 11-12 and 13-15). I have successfully run an independent pathway model for domain scores averaged over occasions (zero to 4), and 5 separate models, one for each age. Still, there are several reasons why a combined model using all the data is preferable to running a separate script for each.

However, when I run the model I get the following error message:
Running IndACE with 173 parameters
Error: The job for model 'IndACE' exited abnormally with the error message: fit is not finite (The continuous part of the
model implied covariance (loc2) is not positive definite in data 'G7.data' row 203. Detail:
covariance is too large to print # 38x38)
In model 'IndACE' Optimizer returned a non-zero status code 10. Starting values are not feasible. Consider
mxTryHard()

I don't understand the message, but I know it is bad. I am not sure what the problem is: it could be the intercepts, the path or matrix specifications, too many parameters to estimate for such a large model (although many are fixed at zero), or too many NAs. The start values worked for the runs mentioned above, and this model is identified according to OpenMx. mxTryHard() returns errors and doesn't even begin.

So I have attached the script. The data set is over 2MB, so I can't attach it in this forum, but I can send a CSV file (3.7MB) by email or leave it in a directory.

Joined: 01/24/2014 - 12:15
check the covariance matrices

I infer from the comments in your script that excluding group 7 doesn't make any difference. Is that correct, or will your attempt at running modelInd work if you drop group 7? I ask because the error message refers to the data in group 7, though I don't see from your script that the expected covariance matrices depend on definition variables.

I think a good place to start is to inspect the model-expected covariance matrices at the start values, for instance:

 mxGetExpected(modelInd$G1, "covariance") #<--Does it look reasonable?
 eigen(mxGetExpected(modelInd$G1, "covariance"), only.values=TRUE) #<--Are all of its eigenvalues positive?

My suspicion is that you've inadvertently fixed a diagonal element of one of those matrices to zero.

Joined: 07/14/2016 - 20:50
Independent Pathway Multiple Occasions

Yes, excluding group 7 doesn't make any difference. The same error comes up for Group 6, etc.
The covariances look fine, but there are negative eigenvalues, so the matrix is not positive definite. I thought the problem arose because Digit is missing for all observations at age 5/6, but after excluding the age 5/6 observations for all tests (PPVT, comp, recog and math as well) the same error appeared.
I think the status code 10 error has something to do with the cross-age, cross-test covariances (e.g. PPVT at age 7 with math at age 13), which are not an issue in the separate analyses by age or when I averaged the scores across ages. I don't see how these covariances are catered for in my model. I think you are right that it is the covariance matrices, so there is likely to be a specification problem in dealing with those elements.
The data set is now small enough to send: I deleted observations with more than 35 NaNs as well as those with no age 5/6 data. The edited OpenMx script is also attached.

Joined: 01/24/2014 - 12:15

I tried running your updated script. I'm not 100% sure I'm reproducing your MxModel's setup, because there were a few lines early in the script that depended on datasets other than the one you attached. I had to remove group 2 from the MxModel entirely, since its dataset had zero rows. I also had to change line 266 to this:

values=c(mean_ppvt, mean_digit, mean_comp, mean_recog, mean_math

You definitely seem to have too many free parameters in 'ns' and 'fs'. Those are currently both 20x5 matrices in which all elements are free parameters! Other than that, the starting values corresponding to variance in Digit Span seem rather low.

Joined: 01/24/2014 - 12:15
Re-specify unique-variance matrices

Actually, I don't think changing start values is going to help. I've attached what the unique non-shared-environmental variance matrix looks like at the start values. The problem is that each row and column has off-diagonal elements equal to the main-diagonal element.

The common-variance matrices are all equal to the outer product of a vector with itself (as is the case in this kind of model) and are therefore singular. Thus, the unique-variance matrices must be specified so that the sum of the common- and unique-variance matrices ("V", in your script) is positive-definite. That won't be the case here, because as%*%t(as) and es%*%t(es) (I'm ignoring the unidentified ns%*%t(ns) and fs%*%t(fs)) are also singular.

I'm guessing the motivation for your specification of the unique-variance matrices is to allow "correlated residuals" within timepoints (ages), but you'll have to do that a different way. The current specification won't work for ANY set of start values.
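
The rank argument above can be checked numerically. The following is a minimal base-R sketch, not the poster's script: the dimensions (20 variables, 20x5 "specific" matrices) follow the thread, but every object name and value is made up for illustration, and only the A and E parts are kept, as in the reply. It compares full 20x5 specific-path matrices with the diagonal alternative.

```r
set.seed(1)
nv <- 20  # 5 tests x 4 occasions, as in the thread
nf <- 5   # columns of the original 20x5 specific-path matrices

# Common paths: one column each, so each outer product has rank 1.
ac <- matrix(runif(nv, 0.3, 0.8), nrow = nv, ncol = 1)
ec <- matrix(runif(nv, 0.3, 0.8), nrow = nv, ncol = 1)

# Hypothetical stand-in for the original specification: full 20x5 matrices.
as_full <- matrix(runif(nv * nf, 0.3, 0.8), nrow = nv, ncol = nf)
es_full <- matrix(runif(nv * nf, 0.3, 0.8), nrow = nv, ncol = nf)
V_full <- ac %*% t(ac) + ec %*% t(ec) +
  as_full %*% t(as_full) + es_full %*% t(es_full)

# Diagonal specific paths: each product is a full-rank diagonal matrix.
as_diag <- diag(0.5, nv)
es_diag <- diag(0.5, nv)
V_diag <- ac %*% t(ac) + ec %*% t(ec) +
  as_diag %*% t(as_diag) + es_diag %*% t(es_diag)

eig_full <- eigen(V_full, symmetric = TRUE, only.values = TRUE)$values
eig_diag <- eigen(V_diag, symmetric = TRUE, only.values = TRUE)$values

rank_full    <- sum(eig_full > 1e-8)  # numerical rank of the 20x20 matrix
min_eig_full <- min(eig_full)         # essentially zero: not positive definite
min_eig_diag <- min(eig_diag)         # strictly positive

print(c(rank_full = rank_full, min_eig_full = min_eig_full, min_eig_diag = min_eig_diag))
```

With only these four terms, the rank of `V_full` can be at most 1 + 1 + 5 + 5 = 12, so a 20x20 matrix built this way cannot be positive definite whatever the values are, whereas the diagonal version adds a positive amount to every eigenvalue.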

Joined: 07/14/2016 - 20:50
Matrices

Thanks for that. I think I understand the problem. It lies with the 20 by 5 matrices: although they are conformable, multiplying each by its transpose (as%*%t(as)) produces unwanted off-diagonal elements. The solution is to use diagonal matrices for the specific parts of the model, which is why the simpler models ran OK.
This works:
pathAs <- mxMatrix( type="Diag", nrow=nv, ncol=nv, free=TRUE,
                    values=c(st_a_PPVT,st_a_digit,st_a_comp,st_a_recog,st_a_math,
                             st_a_PPVT,st_a_digit,st_a_comp,st_a_recog,st_a_math,
                             st_a_PPVT,st_a_digit,st_a_comp,st_a_recog,st_a_math,
                             st_a_PPVT,st_a_digit,st_a_comp,st_a_recog,st_a_math),
                    labels=AsLabs, name="as" )
And the same for the others.

I really wanted 5 specific factors, one for each domain but I can't see how this can be done using square matrices.

Thanks again
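
One possible way to get five domain-specific factors without a square matrix is a 20x5 loading matrix with a fixed zero pattern, so that each variable loads only on the factor for its own domain, combined with a diagonal matrix of residual paths that keeps the expected covariance positive definite. This was not proposed in the thread. The sketch below uses base R only, assumes the variables are ordered test-within-age (PPVT, digit, comp, recog, math, repeated for four ages), and uses made-up values and names throughout.

```r
n_age <- 4
n_dom <- 5
nv <- n_age * n_dom
domains <- c("PPVT", "digit", "comp", "recog", "math")

# Stack four 5x5 identity blocks: variable k loads only on its own domain factor.
free_pattern <- kronecker(matrix(1, n_age, 1), diag(n_dom)) == 1
dimnames(free_pattern) <- list(
  paste0(rep(domains, times = n_age), "_t", rep(seq_len(n_age), each = n_dom)),
  domains
)

# Hypothetical start values: 0.5 where a loading is free, 0 where it is fixed.
load_vals  <- ifelse(free_pattern, 0.5, 0)
# Diagonal occasion-specific residual paths (hypothetical value 0.4).
resid_vals <- diag(0.4, nv)

# Implied "specific" covariance for one variance component.
V_specific <- load_vals %*% t(load_vals) + resid_vals %*% t(resid_vals)
min_eig <- min(eigen(V_specific, symmetric = TRUE, only.values = TRUE)$values)

print(free_pattern[1:6, ])
print(min_eig)
```

In OpenMx, a logical matrix like `free_pattern` can be passed as the `free` argument of a `type="Full"` `mxMatrix`, with `load_vals` as `values`. Whether the resulting model is identified alongside the common factors would still need checking, for example with `mxCheckIdentification()`.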

Joined: 07/14/2016 - 20:50
Speed of analyses, Memory, Chip etc.

The model described above has been churning away since last Tuesday and is still running this morning (Monday). I have asked for confidence intervals, which I know increases the run time. When I don't ask for confidence limits it takes about 2 days.
My computer has an i7 but only 8GB of RAM and is nearly 3 years old. Would a new laptop, say an upper-level HP or Dell, help much? I assume 16GB instead of 8GB would help, but would 32GB make much difference? (I am not interested in gaming.) What about an SSD, or the chip? Is there any advice on appropriate specifications for analyzing large data sets with complex models including CIs, or is this something I just have to live with?

> modelInd2 <- mxModel(modelInd, name="Modx" )
>
> modInd2 <- omxSetParameters(modelInd2,
+ labels=c("as_6_6", "as_10_10", "as_14_14", "as_15_15", "as_19_19", "as_20_20", "as_24_24",
+ "ec_1_1", "ec_5_1", "ec_7_1"
+ )
+ , free=FALSE, values=0 )
> fitmodInd2 <- mxRun (modInd2, intervals = TRUE)
Running Modx with 144 parameters
CSOLNP 1354886 -240.205 -4.821e-007

Joined: 05/24/2012 - 00:35
how many cores?

Which platform are you on? If you can get a Linux box then you can speed up your model quite a bit by using multiple cores. Use,
mxOption(NULL, 'Number of Threads', parallel::detectCores())
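
A slightly more defensive version of the same call, as a sketch: `parallel::detectCores()` can return `NA` on some platforms, and leaving one core free for the rest of the system is an assumption made here, not advice from the thread. The `requireNamespace()` guard only keeps the snippet runnable on machines where OpenMx is not installed.

```r
# Number of cores reported by the operating system (may be NA).
n_cores <- parallel::detectCores()
if (is.na(n_cores)) n_cores <- 1L

# Assumption: keep one core free so the machine stays responsive.
n_threads <- max(1L, as.integer(n_cores) - 1L)

# Set the global OpenMx option only if the package is available.
if (requireNamespace("OpenMx", quietly = TRUE)) {
  OpenMx::mxOption(NULL, "Number of Threads", n_threads)
}

print(n_threads)
```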

Joined: 07/14/2016 - 20:50
Independent Pathway Multiple Occasions NAs for SE

I am not sure how to proceed when the SE estimates are NA but all the other estimates are statistically significant.
This occurred after I had removed all statistically non-significant parameter estimates.

fitmodInd2 <- mxRun (modInd2, intervals = FALSE)

As I said, I am not sure what to do next, except to estimate smaller models.

Joined: 01/24/2014 - 12:15

Standard errors of NA are a sign of exactly what the warning message is telling you about: the Hessian matrix is not positive-definite at the solution...which means that the solution is unlikely to be a local minimum of the fitfunction. Without carefully examining your script, the only advice I can give is to use mxTryHard() in place of mxRun().

A non-PD Hessian can be a sign of underidentification, so you might want to also try mxCheckIdentification(), both at the start values and at the solution.
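
To illustrate the link between underidentification and a non-PD Hessian, here is a toy base-R sketch that has nothing to do with the model in this thread. The fit function is hypothetical and depends on its two parameters only through their sum, so the parameters are not separately identified and the Hessian has a zero eigenvalue at any solution.

```r
# Hypothetical fit function: only the sum p[1] + p[2] is determined by the "data".
fit_fn <- function(p) (p[1] + p[2] - 3)^2
# Its analytic gradient (identical in both coordinates).
fit_gr <- function(p) rep(2 * (p[1] + p[2] - 3), 2)

# One of infinitely many minimizers: any pair summing to 3 fits equally well.
solution <- c(1, 2)

# Numerical Hessian at the solution, obtained by differencing the gradient.
H <- optimHess(solution, fit_fn, fit_gr)
hess_eigen <- eigen(H, symmetric = TRUE, only.values = TRUE)$values

# Treat eigenvalues that are tiny relative to the largest one as zero.
is_pd <- all(hess_eigen > 1e-6 * max(abs(hess_eigen)))

print(hess_eigen)
print(is_pd)
```

Standard errors come from the inverse of the Hessian, so a zero (or negative) eigenvalue like this one is the kind of situation in which they are reported as NA.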

BTW, are you a FORTRAN programmer? I didn't know until I saw your script that R parses ** as ^. R's documentation for its arithmetic operators says that behavior was undocumented for years.