# Multilevel Logistic Regression

8 posts / 0 new
Offline
Joined: 02/10/2014 - 16:18
Multilevel Logistic Regression

Hi,

I'm wondering if I could get some tips on how to fit a multilevel logistic regression. I can fit a multilevel model with a continuous outcome without any problems. However, once I set up a threshold, I get an error code of 5 or 6. I am trying to fit a three-level model, so I went with the example posted at https://github.com/OpenMx/OpenMx/blob/3cb6593af0754be134e107cf4f42315d8c18db2b/inst/models/passing/xxm-3.R and modified the outcome to be binary. The code is below:

data('Pastes')
Pastes\$strength = if_else(Pastes\$strength < 60, 1, 0)
Pastes\$strength = mxFactor(Pastes\$strength, levels = c(0, 1))

batch <- mxModel(
'batch', type="RAM",
latentVars = c('batch'),
mxData(data.frame(batch=unique(Pastes$batch)), 'raw', primaryKey='batch'), mxPath('batch', arrows=2, values=1)) sample <- mxModel( 'sample', type="RAM", batch, latentVars = c('sample'), mxData(as.data.frame(Pastes[!duplicated(Pastes$sample), c('batch','sample')]),
'raw', primaryKey='sample'),
mxPath('sample', arrows=2, values=1),
mxPath('batch.batch', 'sample', values=1, free=FALSE, joinKey='batch'))

strength <- mxModel(
'strength', type='RAM', sample,
manifestVars = c('strength'),
mxData(Pastes, 'raw'),
mxPath('one', 'strength'),
mxPath('strength', arrows=2, values=1),
mxThreshold('strength', nThresh = 1, values = 1),
mxPath('sample.sample', 'strength', free=FALSE, values=1, joinKey="sample"))

strength <- mxRun(strength)

I assumed it was because I am estimating the residual for strength, but if I fix the the residual to 1, with

  mxPath('strength', arrows=2, values=1, free = F),

I get the model to converge, but the parameters are totally off from the equivalent model with lme4

(fm01 <- glmer(strength ~ 1 + (1 | sample) + (1 | batch), Pastes, family = binomial(link = 'logit')))

In fact, it appears that I get a negative variance for sample.

I get identical results to lme4 when I don't treat strength as a factor.

Any help would be greatly appreciated.

Offline
Joined: 01/24/2014 - 12:15
OpenMx's multilevel feature

OpenMx's multilevel feature doesn't work with threshold variables. I'm surprised your code runs. It should raise an error instead. That might be a bug.

Offline
Joined: 02/10/2014 - 16:18
re: OpenMx's multilevel feature

Thanks for the reply! Is there any way to run a multilevel logistic regression in OpenMx?

Offline
Joined: 01/24/2014 - 12:15
kluge
Is there any way to run a multilevel logistic regression in OpenMx?

Currently, there's no easy way. It might be possible to kluge something together with an algebra fitfunction and mxEvaluateOnGrid(), but I've never actually tried.

Offline
Joined: 02/10/2014 - 16:18
Resources

Thanks! Do you have any resources you could point me towards to learn how to do that?

Offline
Joined: 03/01/2013 - 14:09
Likely yes

Sam - yes, there is probably is a way via user-defined fit functions, but these might prove tedious to implement.

It is difficult to see how the OpenMx approach to multilevel would work with ordinal data under the threshold model, because to my knowledge there aren't shortcuts (like marginal maximum likelihood) built in, which in turn would require numerical integration over as many dimensions as there are people within a batch.

Offline
Joined: 02/10/2014 - 16:18
Resources

Thanks! Do you have any resources/examples I could look into to learn how to code that? Also, would a probit model be equally as difficult? The reason I want to use OpenMx, as opposed to lme4, is that I have a substantial amount of missing data (~10-20%), and I wanted to take advantage of OpenMx's Full Information Maximum Likelihood.

Offline
Joined: 01/24/2014 - 12:15
use multiple imputation or go Bayes
Also, would a probit model be equally as difficult?

Holding all else constant, doing a probit model in OpenMx would be easier. Doing a non-multilevel probit regression in OpenMx is pretty straightforward.

The reason I want to use OpenMx, as opposed to lme4, is that I have a substantial amount of missing data (~10-20%), and I wanted to take advantage of OpenMx's Full Information Maximum Likelihood.

In OpenMx, if you model the binary response variable conditional on the regressors (i.e., you treat the covariates as "definition variables"), you won't gain any benefit. Observations with missing data on the response will just be ignored, as they would be in lme4. And if you try to run an MxModel when you have NAs on covariates, you'll get an error. The only way OpenMx's FIML estimation is gong to help you is if you model the joint distribution of the response and the covariance, and I'm not sure how to do that in OpenMx given the multilevel nature of your data, and the fact that the response is binary (it would be fairly simple to do with non-multilevel data).

I would suggest one of two things. One is to use multiple imputation to deal with your missing data problem, and stick with lme4 to actually fit your model. The other alternative is to use MCMC Bayes, which treats missing values the way it treats unknown parameters: as random variables. In particular, MCMC Bayes is especially useful for really complicated generalized linear mixed-effects models that just won't converge when fitted by maximum likelihood.