# Weird model comparison results when using WLS

Attachment | Size |
---|---|

WLS_error_SATtests.R | 4.55 KB |

WLS_error_ACEvAE.R | 5.52 KB |

**UPDATE!**

**I have found the issue**. A colleague tried to reproduce my error with the code below, unsuccessfully. For him, the results were as expected: *SBchisq* showed the simple difference in chi-squares, and the comparison was not significant. We therefore concluded that this is a bug which is only apparent in the most recent version of OpenMx - yet to be released as a binary. I run an OpenMx-version compiled from source code (OpenMx version number: 2.20.6.19), as the [bug fix I depend on](https://openmx.ssri.psu.edu/node/4812) is yet to be released as a binary.

Thanks nonetheless!

------------------------------------------------------------------------

Hi everyone

I'm currently building an extended children-of-twins-and-siblings model with partly dichotomous data. The number of free parameters (>10) make maximum likelihood unfeasible (i.e., the processing time is too long), so I therefore use the WLS fit function instead.

I have gotten to the point where I am comparing nested models, but unfortunately, I get results that I do not quite understand. Despite a higher AIC, the unrestricted model is seemingly significantly better than the restricted one. When using WLS, the `mxCompare()`

function reports *SBchisq* instead of *-2LL* (i.e., *the Satorra-Bentler (2001) scaled difference chi-squared test statistics*). On the [info page for mxCompare](https://rdrr.io/cran/OpenMx/man/mxCompare.html), it says that *"Under full weighted least squares, the Satorra-Bentler chi-squared value is equal to the difference in the model chi-squared values"*. Despite using full weighted least squares, the reported *SBchisq* is very much larger than the difference between the chi squares of the two models.

I have reproduced the problem with a simple simulation. I have simulated binary data where the AE model is the true model. I then run and compare an ACE model with an AE model. As expected, the C-component in the ACE model has confidence intervals that contain zero, and the AE model has lower AIC indicating better fit. Despite this, `mxCompare()`

reports that the ACE model is significantly better than the AE-model. The true chi square difference is 0.156, but *SBchisq* reports 777.55 (which is what the p-value seems to be based on). Surely this is not correct?

base comparison ep chisq df AIC SBchisq diffdf p

1 ACE

2 ACE AE 2 13.64318 4 17.64318 777.5465 1 4.110884e-171

The code is reproduced at the end of the post.

**So the question is:** Why am I getting the wrong p-value? Have I misspecified something, or is it a bug? Or is it something I don't understand about chi-square tests and WLS which means that this is indeed the true result? Is there a reason I cannot simply take the difference in chi squares and test it manually, like this:?

> pchisq(0.15551, 1, lower.tail = FALSE)

[1] 0.6933245

Thank you very much for your help. I really appreciate the help that is available from this forum

Sincerely

Hans Fredrik

Here is the code (also uploaded as "WLS_error_ACEvAE.R"):

library(OpenMx)

mxOption(key="Default optimizer", value="NPSOL")

```
```# PREPARE HELPER FUNCTION

print.CI <- function(fit) {

x <- fit$output$confidenceIntervals

rownames(x) <- colnames(fit$est_var$result)

return(round(x,2))

}

# ----------------------------------------------------------------------------------------------------------------------

# PREPARE DATA (SIMULATION)

# Set parameters

set.seed(123456789)

n <- 10000 # n/2 per zygosity

VA1 <- 0.4 # VE1 = 1-VA1

cut <- 0.8 # z-value

# Prepares data frame

dt <- data.frame(matrix(NA,nrow=n,ncol=3,dimnames = list(NULL,c("x1","x2","rel"))))

# Simulates genetic components

A1_mz <- MASS::mvrnorm(n/2,c(0,0), matrix(c(1,1,

1,1),2,2,T))

A1_dz <- MASS::mvrnorm(n/2,c(0,0), matrix(c( 1,.5,

.5, 1),2,2,T))

# Weighs genetic component and adds environmental component.

dt[1:(n/2) , 1:2] <- sqrt(VA1)*A1_mz + rnorm(n,0,sqrt(1-VA1))

dt[(n/2+1):n , 1:2] <- sqrt(VA1)*A1_dz + rnorm(n,0,sqrt(1-VA1))

# Adds relatedness coefficients to data frame

dt[,"rel"] <- c(rep(1,n/2),rep(0.5,n/2))

# Dichotomises the data

dt[,1:2] <- ifelse(dt[,1:2] > cut,1,0)

# Select Variables for Analysis

selVars <- c('x1','x2') # list of variables names

ntv <- length(selVars) # number of total variables

# Converts to factor variables

dt[,selVars] <- mxFactor(dt[,selVars], levels = c(0,1))

# ----------------------------------------------------------------------------------------------------------------------

# PREPARE MODEL

#####################################################

#### Creates Data Objects ####

#####################################################

dataMZ <- mxData( observed=subset(dt, select = selVars, subset = rel == 1), type="raw" )

dataDZ <- mxData( observed=subset(dt, select = selVars, subset = rel == 0.5), type="raw" )

#####################################################

#### Creates Free Parameters ####

#####################################################

varA1 <- mxMatrix(name="VA1", type="Symm", nrow=1, ncol=1, free=TRUE, values=0, label="est_VA1" )

varC1 <- mxMatrix(name="VC1", type="Symm", nrow=1, ncol=1, free=TRUE, values=0, label="est_VC1" )

varE1 <- mxMatrix(name="VE1", type="Symm", nrow=1, ncol=1, free=FALSE, values=1, label="est_VE1" )

#######################################################

#### Creates Algebra for Expected (Co-)Variances ####

#######################################################

var1 <- mxAlgebra(expression = VA1+VC1+VE1, name="var1" )

covMZ <- mxAlgebra(expression = VA1+VC1, name="covMZ")

covDZ <- mxAlgebra(expression = 0.5*VA1+VC1, name="covDZ")

#######################################################

#### Creates Expectation Objects ####

#######################################################

expCovMZ <- mxAlgebra( expression= rbind( cbind( var1, covMZ),

cbind( covMZ, var1)), name="expCovMZ" )

expCovDZ <- mxAlgebra( expression= rbind( cbind( var1, covDZ),

cbind( covDZ, var1)), name="expCovDZ" )

threG <- mxMatrix( type="Full", nrow=1, ncol=ntv, free=TRUE, values=0, labels="threshold", name="threG")

meanG <- mxMatrix( type="Full", nrow=1, ncol=ntv, free=FALSE, values=0, labels="mean", name="meanG" )

expMZ <- mxExpectationNormal( covariance = "expCovMZ",

means = "meanG",

thresholds = "threG",

dimnames = selVars )

expDZ <- mxExpectationNormal( covariance = "expCovDZ",

means = "meanG",

thresholds = "threG",

dimnames = selVars )

#######################################################

#### FINAL PREPARATIONS ####

#######################################################

# Creates standardized estimates + confidence intervals

est_var <- mxAlgebra(expression=cbind(VA1/var1, VC1/var1, VE1/var1, threshold/sqrt(var1)),

name="est_var", dimnames=list("est",c("VA1","VC1","VE1","threshold")) )

CIs <- mxCI(c("est_var"))

fitFun <- mxFitFunctionWLS("WLS")

modelMZ <- mxModel(threG, meanG, varA1, varC1, varE1, var1, covMZ, expCovMZ, expMZ, dataMZ, fitFun, name = "MZ")

modelDZ <- mxModel(threG, meanG, varA1, varC1, varE1, var1, covDZ, expCovDZ, expDZ, dataDZ, fitFun, name = "DZ")

multi <- mxFitFunctionMultigroup( c("MZ","DZ") )

#######################################################

#### Run first model ####

#######################################################

model1 <- mxModel(threG, meanG, expCovMZ, expCovDZ,

varA1, varC1, varE1, var1, covMZ, covDZ,

modelMZ, modelDZ, multi, CIs, est_var,

name="ACE" )

fit1 <- mxRun(model1, intervals = T)

print.CI(fit1)

#######################################################

#### Prepare and run constrained model ####

#######################################################

model2 <- mxModel(name = "AE", fit1)

model2 <- omxSetParameters( model2, label="est_VC1", free=FALSE, values=0 )

fit2 <- mxRun(model2, intervals = T)

print.CI(fit2)

# Compares models

mxCompare(fit1,fit2)

I have also attached a similar code ("WLS_error_SATtests.R") which compares a saturated model with equality-constrained thresholds. This simulation has similar problems, but the *SBchisq* and p-value is returned as NaN (Not a number). I guess this has something to do with the chi square for the saturated model being 0.

## Hmm ... looking into this

Thanks for the clear code and working example! I'll have to take some time to check into this. The Satorra-Bentler chi-squared value seems way off, but I'm not sure what's causing it yet. The

`mxCompare()`

behavior for WLS models is still relatively new; I've tested it, but not exhaustively. There is mostly likely a bug that I didn't catch in my initial testing.At least one mystery is easy to solve: the p-value is for the

`SBchisq`

, not the chi-squared in the`chisq`

column.> pchisq(777.5465, 1, lower.tail = FALSE)

[1] 4.110957e-171

Satorra and Bentler (2001), found that this scaled difference chi-squared behaved better in smaller samples than the standard chi-squared, and performed equally well-for large samples.

I'll update you on this thread as I find out more!

Cheers,

Mike Hunter

Log in or register to post comments

In reply to Hmm ... looking into this by mhunter

## Thanks!

Cheers!

Log in or register to post comments

## Fixed

Log in or register to post comments