Lower bound confidence interval NaN from time to time

Posted on Mon, 06/14/2021 - 15:56

mirusem Joined: 11/19/2018

Forums

Hi,

I was wondering about some lower bound NaN values I've come across (or also when, for example the AE model has C set to 0, but the variance component has a lower bound CI NaN value even though all of them should have a value of 0 (lower bound, estimate, upper bound)). Specifically, I have had NaN values for the A standardized variance component lower bound of the confidence interval from time to time. Does anyone know why this may be? Is there a work-around on how to adjust it? It's for relatively few estimates, but it does come up.

I am using the latest OpenMx version and it seems to be running fine otherwise in general (on a univariate model, etc.).

I know there is also calculating a CI via bootstrapping, but I am not entirely sure how to do that as of yet (but maybe that would lend itself toward resolving this issue).

Definitely appreciate any advice!

Replied on Tue, 06/15/2021 - 11:59

AdminJosh Joined: Dec 12, 2012

verbose output?

Have you looked at the `summary(..., verbose=TRUE)` output?

Replied on Tue, 06/15/2021 - 14:15

mirusem Joined: Nov 19, 2018

Thanks for the reply! I just

Thanks for the reply! I just included it and it lists the following:

R[write to console]: OpenMx version: 2.19.5 [GIT v2.19.5]
R version: R version 4.1.0 (2021-05-18)
Platform: x86_64-conda-linux-gnu
Default optimizer: SLSQP
NPSOL-enabled?: No
OpenMP-enabled?: Yes

I thought NPSOL was the default optimizer (at some point, though maybe I am mistaken)? Not sure if that may have to do with it. It also gives NaN values for the upper CI bound for the C estimate (when C is fixed to 0) at times as well.

Replied on Tue, 06/15/2021 - 18:18

AdminJosh Joined: Dec 12, 2012

verbose CI output

That looks fine. I think SLSQP is used for confidence intervals regardless of which optimizer you select. The output I was after is the detailed CI output. Here's what it outputs by default,

```
confidence intervals:
lbound estimate ubound note
common.A[1,1] 0.5566175 6.173024e-01 0.68400870
common.C[1,1] NA 2.406416e-13 0.05269798 !!!
common.E[1,1] 0.1537491 1.730463e-01 0.19563705
```

and here's the detailed output with `summary(model, verbose=TRUE)`:

```
CI details:
parameter side value fit diagnostic statusCode method a c e mean
1 common.A[1,1] lower 0.55661745 4071.507 success OK neale-miller-1997 0.7460680 5.399907e-04 0.4244517 21.39288
2 common.A[1,1] upper 0.68400870 4071.519 success OK neale-miller-1997 0.8270482 4.229033e-06 0.4095962 21.39341
3 common.C[1,1] lower 0.00000000 4067.663 alpha level not reached infeasible non-linear constraint neale-miller-1997 0.7856859 0.000000e+00 0.4159883 21.39293
4 common.C[1,1] upper 0.05269798 4071.549 success infeasible non-linear constraint neale-miller-1997 0.7560895 2.295604e-01 0.4181163 21.39237
5 common.E[1,1] lower 0.15374906 4071.505 success infeasible non-linear constraint neale-miller-1997 0.7968068 2.489554e-08 0.3921085 21.39306
6 common.E[1,1] upper 0.19563705 4071.512 success infeasible non-linear constraint neale-miller-1997 0.7729641 9.786281e-08 0.4423088 21.39289
```

Replied on Tue, 06/15/2021 - 18:59

mirusem Joined: Nov 19, 2018

Oh I see, yeah that looks

Oh I see, yeah that looks very in-depth, I didn't print it out, so now I see it. Thanks a lot for the clarification. Here are two outputs (for a C and A lower bound respectively where it's NaN) (hopefully the formatting is okay):

1) (for C lower bound NaN)

confidence intervals:
lbound estimate ubound note
MZ.StdVarComp[1,1] 0.07422233 0.3950067 0.6354808
MZ.StdVarComp[2,1] NA 0.0000000 0.0000000 !!!
MZ.StdVarComp[3,1] 0.36451917 0.6049933 0.9257777

CI details:
parameter side value fit diagnostic
1 MZ.StdVarComp[1,1] lower 0.07422233 -37.95743 success
2 MZ.StdVarComp[1,1] upper 0.63548083 -37.95903 success
3 MZ.StdVarComp[2,1] lower 0.00000000 -41.80563 alpha level not reached
4 MZ.StdVarComp[2,1] upper 0.00000000 -37.93677 success
5 MZ.StdVarComp[3,1] lower 0.36451917 -37.95903 success
6 MZ.StdVarComp[3,1] upper 0.92577767 -37.95743 success
statusCode method a e
1 OK neale-miller-1997 0.05779659 0.2041213
2 OK neale-miller-1997 0.18255742 0.1382638
3 infeasible non-linear constraint neale-miller-1997 0.13495746 0.1670206
4 infeasible non-linear constraint neale-miller-1997 0.17279164 0.1425522
5 infeasible non-linear constraint neale-miller-1997 0.18255742 0.1382638
6 infeasible non-linear constraint neale-miller-1997 0.05779659 0.2041213

2) for the A component case:

confidence intervals:
lbound estimate ubound note
MZ.StdVarComp[1,1] NA 4.752728e-19 0.184599 !!!
MZ.StdVarComp[2,1] 0.0000000 0.000000e+00 0.000000 !!!
MZ.StdVarComp[3,1] 0.8153433 1.000000e+00 1.000000

CI details:
parameter side value fit diagnostic
1 MZ.StdVarComp[1,1] lower 2.842761e-41 -252.6202 alpha level not reached
2 MZ.StdVarComp[1,1] upper 1.845990e-01 -248.7728 success
3 MZ.StdVarComp[2,1] lower 0.000000e+00 -248.7756 success
4 MZ.StdVarComp[2,1] upper 0.000000e+00 -248.7447 success
5 MZ.StdVarComp[3,1] lower 8.153433e-01 -248.7712 success
6 MZ.StdVarComp[3,1] upper 1.000000e+00 -248.8087 success
statusCode method a e
1 infeasible non-linear constraint neale-miller-1997 -5.366488e-22 0.10065144
2 infeasible non-linear constraint neale-miller-1997 -4.407308e-02 0.09262844
3 infeasible non-linear constraint neale-miller-1997 -2.513257e-02 0.10927582
4 infeasible non-linear constraint neale-miller-1997 -2.463694e-02 0.10243448
5 infeasible non-linear constraint neale-miller-1997 -4.407965e-02 0.09262449
6 infeasible non-linear constraint neale-miller-1997 -3.161669e-10 0.09930529

Replied on Tue, 06/15/2021 - 19:37

AdminJosh Joined: Dec 12, 2012

CI interpretation

> 1) (for C lower bound NaN)

If the upper bound is zero then you can probably just regard the lower bound as zero. The algorithm is very particular and wants to find the correct amount of misfit, but the model is already backed up into a corner and the optimizer gets stuck.

> 2) for the A component case:

This is similar to the first case. Here, the optimizer got closer the target fit of -248.77, but didn't quite make it because the parameters got cornered again. 2.842761e-41 can be regarded as zero.

It looks like you're using the ACE model that does not allow variance components to go negative. This model is miscalibrated and will result in biased intervals. For better inference, you should use the model that allows the variance components to go negative. You can truncate the intervals at the [0,1] interpretable region for reporting.

Replied on Wed, 06/16/2021 - 20:17

mirusem Joined: Nov 19, 2018

Got it. That makes a lot of

Got it. That makes a lot of sense. I guess I can go with that in this case. I really appreciate the clarification and suggestion. Is that normal practice to go ahead and just turn NaN values to zero (in that circumstance), etc.? Otherwise, I can just list that as being what was done in this circumstance (unless there are other workarounds).

Do you have any references on setting it so the variance components are allowed to go negative? I am not too familiar with this, and have seen some posts, but am not entirely sure where would be the best place to look.

Replied on Thu, 06/17/2021 - 11:59

AdminRobK Joined: Jan 24, 2014

reference

Got it. That makes a lot of sense. I guess I can go with that in this case. I really appreciate the clarification and suggestion. Is that normal practice to go ahead and just turn NaN values to zero (in that circumstance), etc.? Otherwise, I can just list that as being what was done in this circumstance (unless there are other workarounds).

If a model fixes the shared-environmental component to zero, then under that model, the lower confidence limit for the shared-environmental component is trivially zero (as is the upper confidence limit).

Do you have any references on setting it so the variance components are allowed to go negative? I am not too familiar with this, and have seen some posts, but am not entirely sure where would be the best place to look.

See here.

Replied on Mon, 06/21/2021 - 13:24

AdminNeale Joined: Mar 01, 2013

Verhulst & Neale

There's a paper in Behavior Genetics about estimating unbounded A C and E variance components, instead of the usual implicitly-bounded path coefficient specification, which constrains variance components to be non-negative. It's here: https://pubmed.ncbi.nlm.nih.gov/30569348/
Please let us know what you think, and if there are any remaining questions we could answer that would help you further.

Replied on Fri, 06/25/2021 - 05:33

mirusem Joined: Nov 19, 2018

Thank you both (and I

Thank you both (and I apologize for the delay in response). That definitely makes sense. I still need to read the paper more carefully, but I understand the need / benefits for it. Ideally I would like to use it given its benefits, so I am trying to figure out what best way to incorporate it may be. I usually have been using path based implementations of the univariate ACE, and from what I've seen the direct variance approach can either be done in umx or there is a script which is more non-path based (which would require more familiarity, though both do).

As for umx (I am new to this), would it be sufficient to use the umxACEv function, along with the umxConfint function (to get the standardized estimates + standardized CIs)? Or is there some other pointer to look at? That's what I've gathered from one of the tutorials and also the documentation of it.

And if I am to go with this other (direct variance approach) approach, in order to limit the range of the bounds, would I just do that after obtaining the estimates and associated CIs?

And, lastly, it is okay to just approximate the bounds which are very close to 0 but the optimizer fails (like was said above, outside of the triviality of it already being constrained to 0)? So for example the A estimate?

I really appreciate all the great amount of help you all have given!

Replied on Tue, 07/06/2021 - 09:16

mirusem Joined: Nov 19, 2018

So, I figured out most of

So, I figured out most of these answers. One question that is coming up, is I am using umx (just for convenience at this point), and I run into an issue as in this thread: https://openmx.ssri.psu.edu/node/4593 for at least one of the estimates. Is it possible to adjust either using umxModify or umxACE the lower bound for the C parameter as in that post? Or is that strictly limited to doing it directly without umx (because it cannot be adjusted with umx as that's not a feature)? I know there is a umx_scale() function, which maybe can help (to a degree with this)--on that note, though, does that not affect the ultimate result and bias it in ways?

I appreciate it. If not, I hope there is a way for umx to skip the estimate instead of crashing, but I am not sure that is feasible, also.

Replied on Tue, 07/06/2021 - 11:52

mirusem Joined: Nov 19, 2018

So it looks like this

So it looks like this crashing issue ends up being an issue a lot when it comes to submodels (AE/CE). For now, since I am not sure how that would be addressed consistently using the direct variance approach (without biasing it in some sense), I will just stick with the older approach that is biased since it doesn't throw any errors. But I would definitely be curious if there is a consistent / fair work-around for the above which still is using the direct variance approach based on any of your advice, since that would ultimately be unbiased relative the older approach which would be ideal.

If anyone has any advice, I appreciate it as always!

Replied on Tue, 07/06/2021 - 12:02

jpritikin Joined: May 23, 2012

crashing?

In the thread you reference, the problem is starting values. If you use a large enough error variance then you should be able to get the model to optimize. In contrast to the referenced thread, I would not add lower or upper bounds since the whole point of the variance parameterization is to not have these bounds. When you report the confidence interval, you can apply the bounds. For example, if the A variance is estimated as -.2 to .2 then you can report the interval [0,.2].

Replied on Tue, 07/06/2021 - 12:34

mirusem Joined: Nov 19, 2018

Hi jpritikin, thank you for

Hi jpritikin, thank you for the response! That is helpful (as to how to apply the bounds after the fact).

Hmm, if I am using umx, I am not sure of how to get large enough error variance. By error, do you mean the environmental variance in this case (or just in general the variance that needs to be large for the MZ/DZ twins)? Or would this be something to explore by multiplying the actual values of the data by 100 or so (and would that be okay to do with no other alterations, or would that cause issues elsewhere if it's not renormalized, so to speak)?

I am not sure if the freeToStart, or value variables of umxModify are relevant in this instance (with the freeToStart parameter, for example, or tryHard which didn't seem to work too well). Also maybe xmuValues or xmu_starts would be relevant in this case?

For reference, here are two errors I am explicitly getting:

Error: The job for model 'AE' exited abnormally with the error message: fit is not finite (The continuous part of the model implied covariance (loc2) is not positive definite in data 'MZ.data' row 20. Detail:
covariance = matrix(c( # 2x2
0.0132678091792278, -0.0151264409931152
, -0.0151264409931152, 0.0132678091792278), byrow=TRUE, nrow=2, ncol=2)
)

and

Error: The job for model 'CE' exited abnormally with the error message: fit is not finite (The continuous part of the model implied covariance (loc2) is not positive definite in data 'DZ.data' row 53. Detail:
covariance = matrix(c( # 2x2
0.142828309693162, 0.18385055831094
, 0.18385055831094, 0.142828309693162), byrow=TRUE, nrow=2, ncol=2)
)

Replied on Tue, 07/06/2021 - 13:11

mirusem Joined: Nov 19, 2018

I just used the umxModify

I just used the umxModify function to set the starting value for E (just as you would for any fixed parameter, before running, and it seems to work for CE (I will try also for AE). If that sounds right to you, let me know. Otherwise, for now it sounds about right to me. Thanks so much!!

Replied on Tue, 07/06/2021 - 13:30

jpritikin Joined: May 23, 2012

starting values

Yeah, that sounds like the correct approach. Based on what you wrote, I'm not sure that you realize that umx models are OpenMx models. If you do `umxACEv(..., autoRun=FALSE)` then you get the OpenMx model which you can adjust before running.

Replied on Tue, 07/06/2021 - 13:38

mirusem Joined: Nov 19, 2018

That's actually exactly what

That's actually exactly what I did (so I had two lines one with FALSE and the other with not) :) but I am sure my understanding could always be better (umx is still quite new to me).
One question that came to mind is, does it matter if one identifies the regex parameter for any given component of interest as A_r.c. vs A_r1c1? I saw this post https://openmx.ssri.psu.edu/node/4229 where it's used to drop the entire free parameter set, but I am not sure if it makes any technical difference.

Replied on Tue, 07/06/2021 - 14:14

jpritikin Joined: May 23, 2012

regex

Regex matching is just for ease of use. You can always list out all the parameters one-by-one.

Replied on Tue, 07/06/2021 - 19:00

mirusem Joined: Nov 19, 2018

I see. I was actually

I see. I was actually wondering if there is any difference between the notation for removing all of C (C_r.c.) versus just one part of the matrix (C_r1c1), but I looked into it and I guess that since it is symmetric it is okay (is my guess)?

Also, in the path version of OpenMx the starting values I did by taking, say V = sqrt(phenotypic variance across all twin pairs)/3. Should this equivalently be set for all the parameters A, C and E in umx/is it possible/needed, with the direct variance approach? Right now I have only directly set the E parameter as needed for the subnested models for AE/CE to 0.5 in umxmodify, but am wondering if there is a more systematic approach (i.e. should the 0.5 just be replaced with the V listed above)? Since I know the start value can affect the ultimate CI bounds at the very least.

And, can this be set prior to running ACEv? I am not sure if the xmu_start_value_list() function is relevant.

Finally, is there a way to suppress warnings, or the attempt to print to browser for umx?

Replied on Tue, 07/06/2021 - 19:14

mirusem Joined: Nov 19, 2018

quick follow-up

My mistake, from here it looks like since it is concerned directly with variance the square root shouldn't be taken, right? https://github.com/tbates/umx/issues/38

But does this mean that it's already included directly in umx internally? If not, is there a rule of thumb then for making E large or small (when it comes to doing the subnested models), or any starting values in general when doing ACEv?

And the warning suppression + browser suppression may still be helpful (since it prints out browser related information even if I have options(Browser = 'NULL').

Replied on Tue, 07/06/2021 - 19:15

jpritikin Joined: May 23, 2012

starting values

Regarding starting values, the critical thing is to start with a larger enough error/environmental variance. If your error variance is too small then residuals will get ridiculously large Z scores which can cause optimization failure.

Replied on Tue, 07/06/2021 - 19:22

mirusem Joined: Nov 19, 2018

Got it, I see. So just

Got it, I see. So just arbitrarily set it high I am guessing (when it comes to say, umxModify since that's where the issue comes up). The only other question I have related to this is can this be done prior to running umxACEv/is it needed, or should it only be done using umxModify() when it comes to the submodels (using regex, etc. for example)?

Thanks so much for the quick responses!

And I will try and submit a bug within the week for sure regarding the output issue.

Replied on Tue, 07/06/2021 - 19:17

jpritikin Joined: May 23, 2012

warnings etc

If you can't figure out how to suppress umx output then you might want to file a bug in the [umx](https://github.com/tbates/umx) github.

Replied on Tue, 07/06/2021 - 19:44

tbates Joined: Jul 31, 2009

umxModify, umx_set_silent, umx_set_auto_plot, umxSetParameters

> Is there any difference in `umxModify` between regex = "C_r.c." and update="C_r1c1"

As the matrix is symmetric these are equivalent. It's always easy to check what you've done with
`m1$top$C`, or `parameters(m1)` , or, for path-based models, try `tmx_show(m1)` - it shows all the matrix properties in nice browser tables with roll-overs for properties.

> Do I need to set start values in umx models?

No - umx takes care this for you. But if you want to, you can set them directly. They are just parameters, so just set them: for instance if you wondered about sensitivity to the start value for C, just set the C values quite high to start, e.g. see what the parameters are with `parameters(m1)`, and set with, e.g. `umxSetParameters(m1, "C_r1c1", values=1)`

> Finally, is there a way to suppress warnings, or the attempt to print to browser for umx?

Yes:

```Rsplus
umx_set_silent(TRUE)
umx_set_auto_plot(FALSE)
```

Replied on Tue, 07/06/2021 - 20:44

mirusem Joined: Nov 19, 2018

Thanks so much for the

Thanks so much for the response!

I just figured out about the auto_plot before I saw this, that is exactly what one of them was (and I had umx_set_silent prior too). The only thing that is left at the moment seems to be as a result of the xmu functions (as I found online to match what I am getting). One of them isn't a warning, but the other is. I am wondering if it might be possible to disable these kinds of messages since I am looking into quite a few phenotypes.

The specific popups are from: xmu_show_fit_or_comparison which automatically outputs the log likelihood estimate (this isn't as major but everything adds into computation in terms of print out), and more apparently from
xmu_check_variance:

Polite note: Variance of variable(s) '' and '' is < 0.1.
You might want to express the variable in smaller units, e.g. multiply to use cm instead of metres.
Alternatively umx_scale() for data already in long-format, or umx_scale_wide_twin_data for wide data might be useful.

Given that the phenotypes I am working with are already in their native form, I see this note a lot, and am not sure if it could be suppressed.

And thanks a lot for those clarifications--that all makes plenty of sense.

Replied on Tue, 07/06/2021 - 14:29

AdminRobK Joined: Jan 24, 2014

I partially disagree

In contrast to the referenced thread, I would not add lower or upper bounds since the whole point of the variance parameterization is to not have these bounds.

I partially disagree, in that it's not a bad idea to use a bound to ensure that the _E_ variance is strictly positive.

For example, if the A variance is estimated as -.2 to .2 then you can report the interval [0,.2].

Won't that cause CIs to have smaller coverage probability than they're supposed to?

Replied on Tue, 07/06/2021 - 15:57

jpritikin Joined: May 23, 2012

coverage probability

> > For example, if the A variance is estimated as -.2 to .2 then you can report the interval [0,.2].
>
> Won't that cause CIs to have smaller coverage probability than they're supposed to?

No because variance proportions are proportions. The true values are always between 0 and 1. Or you could regard values outside of 0 and 1 as rejections of the model. For example, if DZ twins are more correlated than MZ twins then there is something else going on besides genetic effects. Hence, it is inappropriate to use the classical ACE model to analyze such data.

Replied on Tue, 07/20/2021 - 03:12

mirusem Joined: Nov 19, 2018

Lower/Upper bound for E with direct variance approach

Is it normal to get NaN values (instead of 1) for the upper/lower bound of the E model? This comes up frequently, with Code 3. I am not sure if this is the same case as mentioned by AdminRobK where C is trivially 0, etc. I am guessing this is the case, though and can be safely regarded as 1 (upper or lower) with corresponding log likelihood that is produced?

Replied on Tue, 07/20/2021 - 03:29

mirusem Joined: Nov 19, 2018

Example

As an example:

lbound estimate ubound lbound Code ubound Code
top.A_std[1,1] NA 0 0 NA 3
top.C_std[1,1] NA 0 0 NA 3
top.E_std[1,1] 1 1 NA 3 NA

Replied on Tue, 07/20/2021 - 04:11

mirusem Joined: Nov 19, 2018

Phenotypes where E lower bound gets code 3 in alternative models

And, one last question hopefully (outside of the E bound question):
I get very few code 3 NaNs for the lower bound of the E estimate in any model in general (AE, etc.) of the ones I select for. Sometimes this is fixable by a change in the E starting value (a bit higher than what I had already set and not too high in certain cases, though this doesn't always work), and sometimes it is fixable by changing the seed. Are these alterations okay to do in this circumstance, even though it's not necessarily consistent with the rest of what I would be using for the rest of the phenotypes? I definitely appreciate it!

Replied on Tue, 07/20/2021 - 05:01

mirusem Joined: Nov 19, 2018

CSOLNP optimizer

I also had type 6 error codes in a few cases (absent upper bound of an A estimate, for example), and along with the type 3 error codes, I could fix these by switching the optimizer to CSOLNP. I guess, again, going to the other question, would this be acceptable even if the vast majority of estimates involve SLSQP as the primary optimizer (it's a bit inconsistent across phenotypes)?

The packages are really nice :)

Replied on Tue, 07/20/2021 - 12:34

AdminRobK Joined: Jan 24, 2014

confidence intervals

Is it normal to get NaN values (instead of 1) for the upper/lower bound of the E model? This comes up frequently, with Code 3. I am not sure if this is the same case as mentioned by AdminRobK where C is trivially 0, etc. I am guessing this is the case, though and can be safely regarded as 1 (upper or lower) with corresponding log likelihood that is produced?

Yes, in an _E_-only model, the upper and lower limits of the confidence interval for the standardized _E_ variance component are trivially 1 (because the standardized _E_ component is fixed to 1 under that model).

And, one last question hopefully (outside of the E bound question):
I get very few code 3 NaNs for the lower bound of the E estimate in any model in general (AE, etc.) of the ones I select for. Sometimes this is fixable by a change in the E starting value (a bit higher than what I had already set and not too high in certain cases, though this doesn't always work), and sometimes it is fixable by changing the seed. Are these alterations okay to do in this circumstance, even though it's not necessarily consistent with the rest of what I would be using for the rest of the phenotypes? I definitely appreciate it!

I also had type 6 error codes in a few cases (absent upper bound of an A estimate, for example), and along with the type 3 error codes, I could fix these by switching the optimizer to CSOLNP. I guess, again, going to the other question, would this be acceptable even if the vast majority of estimates involve SLSQP as the primary optimizer (it's a bit inconsistent across phenotypes)?

If you're getting different results for your CIs by changing the start values, the RNG seed, and/or the optimizer, then I'm concerned that you're also getting a different solution in the primary optimization (i.e., to find the MLE). The fact that changing the start values apparently affects your CI results is especially concerning, since every confidence-limit search begins at the MLE and not at the initial start values. Have you checked whether or not you're getting substantially equivalent point estimates, standard errors, and -2logL each time you try? You might want to first run your MxModel with `intervals=FALSE` to get a good initial solution, and then use omxRunCI() to subsequently get confidence intervals.

Replied on Tue, 07/20/2021 - 12:58

mirusem Joined: Nov 19, 2018

Thanks a lot for the reply.

Thanks a lot for the reply. Definitely helpful.

From what I can tell, it looks like the point estimates are stable (I will look into this more, though). The - 2logL seems consistent as well. The only thing that seems to change is, for example, when changing the E start value, I will get a different lower bound CI (not upper bound, for example) in the cases in which this did not succeed for say, the AE model. Here is an example where it doesn't succeed for the upper bound of the ACE model (info column is -2logLL top row, and AIC second row). This is with no change in the starting value.

lbound estimate ubound note info
top.A_std[1,1] -1.062900 -0.245001 NaN !!! 139.933203
top.C_std[1,1] -0.182275 0.492349 1.059306 -271.866406
top.E_std[1,1] 0.489673 0.752652 1.078756 0.000000

This is the case when I change the optimizer to CSOLNP

lbound estimate ubound note info
top.A_std[1,1] -1.064765 -0.245001 0.571626 139.933203
top.C_std[1,1] -0.186698 0.492349 1.060668 -271.866406
top.E_std[1,1] 0.489466 0.752652 1.078679 0.000000

When I look into the AE model with the error and change the starting value for E (from 0.5 to 0.8) for an estimate I get this:

lbound estimate ubound note info
top.A_std[1,1] 0.179462 0.430153 0.625025 36.693141
top.C_std[1,1] 0.000000 0.000000 0.000000 !!! -67.386281
top.E_std[1,1] NaN 0.569847 0.820538 !!! 0.000000

to this:
lbound estimate ubound note info
top.A_std[1,1] 0.179462 0.430153 0.625025 36.693141
top.C_std[1,1] 0.000000 0.000000 0.000000 !!! -67.386281
top.E_std[1,1] 0.393639 0.569847 0.820538 0.000000

and if I change the above (same AE model case) to the CSOLNP optimizer (instead of the starting value to 0.8, so keep that at 0.5), I get:

lbound estimate ubound note info
top.A_std[1,1] 0.178853 0.430153 0.624640 36.693141
top.C_std[1,1] 0.000000 0.000000 0.000000 !!! -67.386281
top.E_std[1,1] 0.375360 0.569847 0.821147 0.000000

Curious about your insight on this.

Replied on Tue, 07/20/2021 - 16:13

mirusem Joined: Nov 19, 2018

top row - log likelihood

My mistake, small correction: top row/last column is just the log likelihood.

Replied on Tue, 07/20/2021 - 21:03

mirusem Joined: Nov 19, 2018

umx equivalent

Outside of the rest of the examples I posted, etc. is there an equivalent option for what you are suggesting specifically in umx? All I know of is the addCI parameter when running the ACE model / variants, so I am not sure how that method translates from OpenMx to the umx interface, etc. So far I haven't changed anything in that respect. The optimizer seems to fail for very few cases, specifically with respect to the Cis (usually it's just one bound of one parameter estimate that it has difficulty with). But out of the phenotypes I am checking, it's relatively few where this occurs (though ideally there would be none).

Thanks a lot for all of the help--I really appreciate it.

Replied on Tue, 07/20/2021 - 13:21

mirusem Joined: Nov 19, 2018

Another phenotype example if perturbed

For another estimate (which didn't have any errors), these are the following differences it has if I change the optimizer/starting E value/seed (below with a hashtag):

lbound estimate ubound note info
top.A_std[1,1] 0.063975 0.342624 0.565213 50.340513
top.C_std[1,1] 0.000000 0.000000 0.000000 !!! -94.681026
top.E_std[1,1] 0.434787 0.657376 0.936025 0.000000
#E_start = 0.5, CSOLNP

lbound estimate ubound note info
top.A_std[1,1] 0.063876 0.342624 0.565174 50.340513
top.C_std[1,1] 0.000000 0.000000 0.000000 !!! -94.681026
top.E_std[1,1] 0.434826 0.657376 0.936090 0.000000
#E_start = 0.5, default optimizer

lbound estimate ubound note info
top.A_std[1,1] 0.063897 0.342624 0.565175 50.340513
top.C_std[1,1] 0.000000 0.000000 0.000000 !!! -94.681026
top.E_std[1,1] 0.434825 0.657376 0.936082 0.000000
#E_start = 0.8, default optimizer

Replied on Tue, 07/20/2021 - 17:36

mirusem Joined: Nov 19, 2018

More comprehensive output

Here is a more comprehensive version (with standard errors). It does look like it's relatively pretty consistent even with SEs (if I am not mistaken).

Here is an example of an estimate where there is a CI bound NaN error.
#Estimate with error (NaN lowerbound, AE).

# seed = first, optimizer = SLSQP, E = 0.5
free parameters:
name matrix row col Estimate Std.Error A
1 expMean_var1 top.expMean means var1 0.60409619 0.018456489
2 A_r1c1 top.A 1 1 0.01609547 0.005227711
3 E_r1c1 top.E 1 1 0.02132252 0.004326436
lbound estimate ubound note info
top.A_std[1,1] 0.179462 0.430153 0.625025 36.693141
top.C_std[1,1] 0.000000 0.000000 0.000000 !!! -67.386281
top.E_std[1,1] NaN 0.569847 0.820538 !!! 0.000000

# seed = first, optimizer = SLSQP, E = 0.8
name matrix row col Estimate Std.Error A
1 expMean_var1 top.expMean means var1 0.60409620 0.018456490
2 A_r1c1 top.A 1 1 0.01609547 0.005227712
3 E_r1c1 top.E 1 1 0.02132252 0.004326436
lbound estimate ubound note info
top.A_std[1,1] 0.179462 0.430153 0.625025 36.693141
top.C_std[1,1] 0.000000 0.000000 0.000000 !!! -67.386281
top.E_std[1,1] 0.393639 0.569847 0.820538 0.000000

# seed = second, optimizer = SLSQP, E = 0.5
free parameters:
name matrix row col Estimate Std.Error A
1 expMean_var1 top.expMean means var1 0.60409619 0.018456489
2 A_r1c1 top.A 1 1 0.01609547 0.005227711
3 E_r1c1 top.E 1 1 0.02132252 0.004326436
lbound estimate ubound note info
top.A_std[1,1] 0.179462 0.430153 0.625016 36.693141
top.C_std[1,1] 0.000000 0.000000 0.000000 !!! -67.386281
top.E_std[1,1] 0.375239 0.569847 0.820538 0.000000

# seed = first, optimizer = CSOLNP, E = 0.5
free parameters:
name matrix row col Estimate Std.Error A
1 expMean_var1 top.expMean means var1 0.60409614 0.018456474
2 A_r1c1 top.A 1 1 0.01609544 0.005227694
3 E_r1c1 top.E 1 1 0.02132248 0.004326420

lbound estimate ubound note info
top.A_std[1,1] 0.179462 0.430153 0.624903 36.693141
top.C_std[1,1] 0.000000 0.000000 0.000000 !!! -67.386281
top.E_std[1,1] 0.375201 0.569847 0.820538 0.000000

Here is an example of the working case:

#Working estimate (AE).

# seed = first, optimizer = SLSQP, E = 0.5

ree parameters:
name matrix row col Estimate Std.Error A
1 expMean_var1 top.expMean means var1 0.42486799 0.016192053
2 A_r1c1 top.A 1 1 0.01035623 0.004402426
3 E_r1c1 top.E 1 1 0.01986998 0.004057314

lbound estimate ubound note info
top.A_std[1,1] 0.063876 0.342624 0.565174 50.340513
top.C_std[1,1] 0.000000 0.000000 0.000000 !!! -94.681026
top.E_std[1,1] 0.434826 0.657376 0.936090 0.000000

# seed = first, optimizer = SLSQP, E = 0.8
free parameters:
name matrix row col Estimate Std.Error A
1 expMean_var1 top.expMean means var1 0.42486797 0.016192049
2 A_r1c1 top.A 1 1 0.01035622 0.004402423
3 E_r1c1 top.E 1 1 0.01986997 0.004057314

lbound estimate ubound note info
top.A_std[1,1] 0.063897 0.342624 0.565175 50.340513
top.C_std[1,1] 0.000000 0.000000 0.000000 !!! -94.681026
top.E_std[1,1] 0.434825 0.657376 0.936082 0.000000

# seed = second, optimizer = SLSQP, E = 0.5
name matrix row col Estimate Std.Error A
1 expMean_var1 top.expMean means var1 0.42486799 0.016192053
2 A_r1c1 top.A 1 1 0.01035623 0.004402426
3 E_r1c1 top.E 1 1 0.01986998 0.004057314
lbound estimate ubound note info
top.A_std[1,1] 0.063876 0.342624 0.565174 50.340513
top.C_std[1,1] 0.000000 0.000000 0.000000 !!! -94.681026
top.E_std[1,1] 0.434826 0.657376 0.936090 0.000000

# seed = first, optimizer = CSOLNP, E = 0.5
name matrix row col Estimate Std.Error A
1 expMean_var1 top.expMean means var1 0.42486797 0.016192034
2 A_r1c1 top.A 1 1 0.01035620 0.004402407
3 E_r1c1 top.E 1 1 0.01986994 0.004057299
lbound estimate ubound note info
top.A_std[1,1] 0.063819 0.342624 0.565174 50.340513
top.C_std[1,1] 0.000000 0.000000 0.000000 !!! -94.681026
top.E_std[1,1] 0.434826 0.657376 0.936118 0.000000

Based on this, should there be any concerns as to switching the optimizer, or maybe an idea as to what really is going on in the few cases in which one of the CI bounds are not calculated?

Replied on Wed, 07/21/2021 - 10:55

AdminRobK Joined: Jan 24, 2014

One miscellaneous hint I

One miscellaneous hint I forgot to mention yesterday: if you're using SLSQP, try setting mxOption "Feasibility tolerance" to a smaller value than its on-load default (say, 1e-3 or 1e-4). That might help, at least as of OpenMx v2.19. What is your `mxVersion()` output, come to think of it? **Edit:** Never mind. I just saw your `mxVersion()` output upthread.

Based on this, should there be any concerns as to switching the optimizer, or maybe an idea as to what really is going on in the few cases in which one of the CI bounds are not calculated?

I do not think there are any concerns relating to switching the optimizer or changing the RNG seed. Neither of those things is considerably changing the point estimates or standard errors, right? But, from what you've included in your posts, I can't really tell what's going on when a confidence limit is reported as `NaN`. I would need to see at least the first seven columns of the 'CI details' table (which prints when you use `summary()` with argument `verbose=TRUE`). I would also need the -2logL at the MLE (which you seem to have intended to include in your posts?). The information in your posts is not very easy to read, either. The tables would be easier to read if they displayed in a fixed-width font, which can be done with Markdown or with HTML tags.

Outside of the rest of the examples I posted, etc. is there an equivalent option for what you are suggesting specifically in umx?

I don't know. Sorry.

Replied on Wed, 07/21/2021 - 15:46

mirusem Joined: Nov 19, 2018

That sounds promising, and it

That sounds promising, and it does work with the alternate optimizer. I have tried changing the tolerance using mxOptions() but it doesn't look like that makes a difference (setting it as such mxOption(key="Feasibility tolerance", value=1e-3)), unless umx might be overriding this, etc (which I wouldn't think it would, but I don't know).

I've never really typed in html, takes more time, but yeah I agree it looks much nicer (and I personally also didn't like how it was saving prior).

confidence intervals:

lbound	estimate	ubound	note
top.A_std[1,1]	0.1794616	0.4301532	0.6244231
top.C_std[1,1]	0.0000000	0.0000000	0.0000000	!!!
top.E_std[1,1]	NA	0.5698468	0.8205384	!!!

CI Details:

parameter	side	value	fit	diagnostic
1 top.A_std[1,1]	lower	0.1794616	-69.54434	success
2 top.A_std[1,1]	upper	0.6244231	-69.54440	success
1 top.C_std[1,1]	lower	0.0000000	-69.54471	success
1 top.C_std[1,1]	upper	0.0000000	-69.54464	success
1 top.E_std[1,1]	lower	0.4012878	-69.82063	alpha level not reached
1 top.E_std[1,1]	upper	0.8205384	-69.54434	success

StatusCode	method	expMean_var1	A_r1c1	E_r1c1
1 OK	neale-miller-1997	0.6039050	0.006549955	0.02994786
2 OK	neale-miller-1997	0.6042243	0.026113065	0.01570644
3 OK	neale-miller-1997	0.6330415	0.010966001	0.02533962
4 OK	neale-miller-1997	0.6384895	0.018864979	0.02289753
5 infeasible non-linear constraint	neale-miller-1997	0.6040962	0.022071510	0.01479346
6 infeasible non-linear constraint	neale-miller-1997	0.6039050	0.006549955	0.02994786

This is when it fails.
Also there is this information:
Model Statistics:

Parameters	Degrees of Freedom	Fit (-2lnL units)
3	141	-73.38628

And, outside of that, it looks like the std errors are NA under the "free parameters" column, specifically when I run summary(verbose = TRUE) on the umxConfint result (which is what gives the results/CIs above). But when I run summary(verbose=TRUE) on just the model prior to umxConfint, the SEs (albeit no CIs) are stable (and can be referred to in the other post, etc.).

And no worries--you've helped me a lot! Hopefully this last bit will give a bit of closure. I will say switching the optimizer fixes the issue though for those few estimates this occurs.

Replied on Wed, 07/21/2021 - 20:38

mirusem Joined: Nov 19, 2018

Optimizer for CI vs optimizer for the model itself including CI

It looks like I can change the optimizer specifically for the CI or also in general for the model estimate / calculation itself, and maybe these two would give slightly different output. Is there a preference of doing either, or is there a benefit, etc. since I would only be doing this for a few phenotypes out of the majority, anyway?

Replied on Thu, 07/22/2021 - 12:04

AdminRobK Joined: Jan 24, 2014

confidence intervals

I've never really typed in html, takes more time, but yeah I agree it looks much nicer (and I personally also didn't like how it was saving prior).

OK, now I can see what's wrong with the lower confidence limit for _E_. The optimizer was unable to adequately worsen the -2logL. For a 95% CI, the target worsening of fit at both limits is about 3.841. For _E_'s lower limit, the worsening was only by about 3.566, which isn't too far off.

It looks like I can change the optimizer specifically for the CI or also in general for the model estimate / calculation itself, and maybe these two would give slightly different output. Is there a preference of doing either, or is there a benefit, etc. since I would only be doing this for a few phenotypes out of the majority, anyway?

It's basically impossible to give a general recommendation about that. Do whatever seems to work best for you.

Replied on Thu, 07/22/2021 - 14:13

mirusem Joined: Nov 19, 2018

Got it--that all makes sense.

Got it--that all makes sense. Thanks for all of the help!!

Replied on Thu, 07/22/2021 - 15:26

AdminRobK Joined: Jan 24, 2014

News

Lower bound confidence interval NaN from time to time

News