Attachment | Size |
---|---|
The principal code used for bivariate ACE model | 36.64 KB |
The code used to carry out batch process | 5.29 KB |
table output | 38.33 KB |
Hi all,
I'm new with OpenMx. I have been trying to use a code obtained from others to carry out a bivariate ACE model. However, I encountered a situation that most of the confidence intervals of the model fit are NA. It seems this topic has been discussed thoroughly on this forum. But I still cannot confirm my problem by myself. I tried switching the optimizer to “SLSQP” but didn't solve the problem. In addition to mxCI, I found there are several alternatives, discussed on this forum, to estimate confidence intervals. Before trying the other options, I want to make sure if it is okay to proceed without confidence intervals. As discussed at (https://openmx.ssri.psu.edu/node/4269), there might be three reasons for the missing value of confidence intervals. I wonder if the fact that the confidence interval is too narrow means the estimate is okay? Or how can I set the limit to verify it? As suggested in the abovementioned discussion, I cannot carry out the analysis. So wish somebody can help me to examine the problem. Thanks a lot.
ps: I attach the code I used to this post.
Below is my mxVersion:
mxVersion()
OpenMx version: 2.17.3 [GIT v2.17.3]
R version: R version 3.6.2 (2019-12-12)
Platform: x86_64-w64-mingw32
Default optimizer: CSOLNP
NPSOL-enabled?: No
OpenMP-enabled?: No
The Results I got:
summary(CholAeFit,verbose=T)
Summary of CholAE
see attachment "ace output.r"
I can't really tell what the issue might be here, because the CI details table in your post is not the whole table, and is hard to read due to the way the text wraps. If you store the
summary()
output of an MxModel object as another object, e.g.,, you can extract the CI details table from it. Save it as a text file or .csv, and post it as an attachment. Its first 7 columns are the most important.
In the meantime, I can offer some generic advice.
First, a lot of your confidence limits have diagnostic "active box constraint". A simple thing you can do about that is to get rid of the
lbound
argument when you constructpathA
,pathC
, andpathE
.Second, you may want to consider using an alternate parameterization of your models--the "direct-symmetric" parameterization. For example, see 'twoACE5c25raFac.R' on this page. There is a lot of prior discussion on these forums about the two parameterizations. You currently use the "Cholesky" parameterization, in which variance components are the squares of one-headed path coefficients; the alternative is the "direct-symmetric" parameterization, under which the variance components are themselves free parameters of the model. There's a good chance that switching parameterizations will eliminate most of your problematic confidence limits, but your results may be much less interpretable.
Third, one of your models got a status RED. Try replacing
mxRun()
withmxTryHard()
for at least that model. Note thatmxTryHard()
behaves stochastically, so you might want to useset.seed()
to make your results reproducible.Finally, do you know about the 'umx' package? You might find it really useful!
Thanks for your kindness and patience. Your advice is very informative and helpful. Since I'm new to the code, it took me some time to try to understand and try to fix the problem according to your advice.
I got rid of the lbound argument of pathA, pathC, and pathE, it helped me solve most of the cases without a value for the confidence intervals. However, there are still some confidence limits left as NA.
I tried mxTryHard().
I have tried to save the summary results to a text file without confirming if I performed the procedure correctly.
I have been trying to understand direct-symmetric & switching parameterizations. However, it is difficult for me to understand it thoroughly in a short time.
The reference to umx also opened a new area for me to study. I am still learning it.
I attached the results to this post. If possible, could you help me to take a further examination? I want to know if it is okay to proceed with current results.
Thanks for the informative comments. I am grateful for that.
The CSV file you posted is no more legible than your OP. Your code for making the file should be something like this:
That will create CSOLNP.csv in R's working directory.
It looks like I still haven't found the results report correctly.
I have CholAceModel &
< CholAceFit <- mxRun(CholAceModel, intervals = TRUE)>,
Performing
summary(CholAceModel, verbose=T) , or
summary(CholAceFit), $$
turns out that the model$CIdetails is NULL,
while $CI shows the values of lbound, estimate, ubound and note for each parameter.
Use
modelsumm$CIdetail
, notmodelsumm$CIdetails
.It is so nice of you. I am quite new with SEM and OpenMx. Thanks for your guidance keeping me upbeat about what I am gonna do.
I attached the $CIdetail to this post. The name of the file indicates which optimizer was used to produce the results. Wish you can help me to figure out the problems. Thanks!
It looks as though the only really tricky reference quantities are the off-diagonals of 'PropVA', 'PropVC', and 'PropVE'. They're the only ones that don't succeed with CSOLNP. There are other quantities that fail with SLSQP, but they're easily explainable. First, the [1,2] elements of the path matrices are fixed to 0, so 0 is their correct upper and lower confidence limit. Likewise, the diagonal elements of the correlation matrices are necessarily fixed to 1, so 1 is their correct upper and lower confidence limit. And, the off-diagonal elements of the correlation matrices can only take values between -1 and 1, so it looks as though in this case those are their correct upper and lower confidence limits.
I don't know what to make of the off-diagonal 'Prop*' elements. Both optimizers seem to really have difficulty with them (are you able to install an NPSOL-enabled build of OpenMx on your system?). If you really need confidence intervals for those elements, you might need to use their standard errors to do so. You can get standard errors for them with
mxSE()
, but you should run your model throughimxRobustSE()
first if you can.Alternately, you could bootstrap.
It is an informative post, which I have been trying to learn from it.
I cannot install the optimizer NPSOL. At first, it seemed my R cannot connect to (http://openmx.ssri.psu.edu/getOpenMx.R). Actually, it is true. However, I tried to copy the command on the internet page to R command window to run it, it turned out that the OpenMx is not available (perhaps for the R version 3.6.2 or 3.5.2).
I am now studying SEM with youtube videos. It is really a pleasure to learn more about it. I will try the methods you mentioned, mxSE and bootstrap, to estimate the Prop* parameter.
It is really a pleasant environment giving kind messages and detailed suggestions.
All of a sudden, a question was coming out. Given the bivariate model I used, I wonder if the model only estimates the path from latent variable A1 to the observed measurement 1 (such as structural connectivity), the path from latent variable A1 to the observed measurement 2 ( such as functional connectivity), and the path from latent variable 2 to the observed measurement 2, but the path from latent variable A2 to the observed measurement 1?--This question came out of my mind while I was watching a video to understand Cholesky bivariate model.
That path is not estimated under the Cholesky parameterization. The model would be unidentified if that path were a free parameter.
I was trying to run the batch process of a bivariate ACE model for structural connectivity (DTI) and functional connectivity. However, the process abnormally terminated after hours. I suppose it might have something to do with the input data containing NaN values?
data shouldn't contain NaN (NA is fine)
Possibly that should be caught by OpenMx as invalid data rather than running for hours...?
Thank you for your advice. I have replaced the NaN with NA and tried what Robert said to use mxTryHard instead of mxRun to fit the model. It looked like going well so far.
Since I have a large dataset to run the bivariate model, it might be the reason for running hours. I need to run 30,135 bivariate models for 30, 135 pairs of features.
Are you calculating confidence intervals for the ACE, AE, and CE models as in the script in your OP? There is a known issue concerning CI calculations being unnecessarily slow. Also, you can use
omxRunCI()
to calculate CIs for a model you've already run.Since you're trying to automate running all your models as a batch, you should make your code more error-tolerant. For example:
Also, consider replacing
mxRun()
withmxTryHard()
.I doubt the presence of
NaN
in your data explains why your CE model terminated abnormally. Instead, what I think is happening is that, once you drop A from the ACE model, the model-expected DZ covariance matrix is computationally singular, and the fitfunction is evaluated as non-finite. The optimizer can cope with non-finite fit in the middle of optimization, but not right at the beginning. You will need to tweak the values of the free parameters for your CE model before running it. Or, again, usemxTryHard()
in place ofmxRun()
.Thanks a lot. It is a comprehensive explanation! Without knowing the exact reason, I feel like your inference. It was easy for me to replace mxRun with mxTryHard. But I haven't figured out how to follow your instruction that tweaking the values of the free parameters.
I want to solve the questions myself. In addition to asking questions on this forum, it is necessary for me to take time to know more about SEM and OpenMx. I wonder if you could recommend me some study materials. At present, I can only use OpenMx manual and the resources on this forum. I really want to take time to learn more about SEM and OpenMx, wishing I can debug the code myself at last. Appreciated if you could recommend me some study materials. Thanks.
You might get some value from
1. The umx paper. Bates, T. C., Maes, H., & Neale, M. C. (2019). umx: Twin and Path-Based Structural Equation Modeling in R. Twin Res Hum Genet, 22(1), 27-41. doi:10.1017/thg.2019.2
2. Examples in
?umx
package.3. Online umx tutorial at https://tbates.github.io If anything needs improving, let me know: feedback good bad or ugly greatly valued!)
Thanks a lot. I quite like these references.
Hi Yovan
The Boulder workshops have a lot of material, including presentations and accompanying scripts and exercise materials. A reasonable place to start is https://www.colorado.edu/ibg/international-workshop/2020-international-statistical-genetics-workshop/workshop-2020-file
Hi Micheal
Thanks for the sharing. I am trying to google if there are related videos but haven't found any. I will make use of the materials after some video study, which might help me to better understand the materials.