Row wise log-likelihood contributions from classes

Posted on
Picture of user. maugavilla Joined: 10/19/2021
Hi all

We are working on calculating Vuong LRT between mixture models from OpenMx. Specifically, we are adjusting the function vuongtest() from the package nonnest2.

But we are stuck in one step so far, getting the row-wise log-likelihood contributions to the overall model LL. Because of the structure on OpenMx we are able to get the row-wise contributions for each class, but we havent been able to see how these are combined into the overall model fit

Here is an example, were we get to the point of saving the row-wise contibutions, so far my sum of the row contibutions gets us -4680.073 instead of the overal LL -4251.208

Could someone point us in the right way to combine the class contributions to get the total row-wise contributions?


library(tidySEM)

## Import and prepare data
dat <- rio::import("https://stats.idre.ucla.edu/wp-content/uploads/2016/02/lca1.dat")
dat <- dat[,-1]

dat <- as.data.frame(lapply(dat, ordered))

## use tidySEM to write 2 class LCA model
res0 <- mx_lca(dat, classes = 2, run=F)

## add rowDiagnostics to save row-wise results for each class
res0$class1$fitfunction$rowDiagnostics <- T
res0$class2$fitfunction$rowDiagnostics <- T

## estimate
rr <- mxTryHardOrdinal(res0)
summary(rr, verbose=T)

## see class proportions
wgts <- rr$expectation$output$weights

## see the individual likelihood contributions for each class
lc1 <- rr@submodels[["class1"]]$fitfunction$result
lc2 <- rr@submodels[["class2"]]$fitfunction$result

## overall model fit
rr$fitfunction$result ## -2LL = 8502.416
rr$fitfunction$result*-.5 ## LL = -4251.208

cc <- cbind(l1=log(lc1)*wgts[1],
l2=log(lc2)*wgts[2])

sum(rowSums(cc )) ## -4680.073

Replied on Wed, 06/28/2023 - 14:33
Picture of user. mhunter Joined: 07/31/2009

The row-wise (minus two) log likelihood in a mixture model is the (minus two) log of the weighted sum of the row likelihoods (i.e., densities) from the mixture classes. I think people get confused between the likelihood and the log likelihood. You also have to be careful about the sum of the logs (incorrect) versus the log of the sum (correct) which are not generally equal.

If l1i is the row likelihood for row i in class 1, then -2*log(l1i) is the minus two log likelihood for row i in class 1. See the subject line for the equation. Below is the updated correct code.


library(tidySEM)

## Import and prepare data
dat <- rio::import("https://stats.idre.ucla.edu/wp-content/uploads/2016/02/lca1.dat")
dat <- dat[,-1]

dat <- as.data.frame(lapply(dat, ordered))

## use tidySEM to write 2 class LCA model
res0 <- mx_lca(dat, classes = 2, run=F)

## add rowDiagnostics to save row-wise results for each class
res0$class1$fitfunction$rowDiagnostics <- T
res0$class2$fitfunction$rowDiagnostics <- T

## estimate
rr <- mxTryHardOrdinal(res0)
summary(rr, verbose=T)

## see class proportions
wgts <- rr$expectation$output$weights

## see the individual likelihood contributions for each class
lc1 <- attr(mxEval(class1.fitfunction, rr), 'likelihoods')
lc2 <- attr(mxEval(class2.fitfunction, rr), 'likelihoods')

## overall model fit
-2*logLik(rr) ## -2LL = 8502.416
logLik(rr) ## LL = -4251.208

cc <- cbind(l1=lc1*wgts[1],
l2=lc2*wgts[2])

# rowSums(cc) is the full row-wise likelihood
# log(rowSums(cc)) is the full row-wise log likelihood
# sum(log(rowSums(cc))) is the full model log likelihood

sum(log(rowSums(cc))) ## -4251.208