
Two-part mixed-effects model for longitudinal data with a mixture distribution for the random effects

Vivifj03
Joined: 04/25/2019 - 22:14

Hi all,
My outcome is a semicontinuous variable measured over time, with many zero values. Because the nonzero part is continuous, I cannot use Poisson or other zero-inflated count models. Instead, I am implementing a two-part model. The first part is a logistic mixed-effects model for the occurrence of zeros. The second part is a linear mixed-effects model for the nonzero values. There are two main issues. First, the random effects of the two parts are correlated. Second, the distribution of the random effects is a mixture of normal distributions.
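
For concreteness, a generic two-part specification of the kind described above could be written as follows. This is only a sketch: every symbol here ($z_{ij}$, $x_{ij}$, $\alpha$, $\beta$, $a_i$, $b_i$, $\sigma_e^2$, $\pi_k$, $\mu_k$, $\Sigma_k$, $K$) is notation assumed for illustration, with $i$ indexing subjects and $j$ measurement occasions, and it is not taken from the poster's own write-up.

$$
\begin{aligned}
z_{ij} &= 1\{y_{ij} > 0\}, \\
\operatorname{logit} P(z_{ij} = 1 \mid a_i) &= x_{ij}^\top \alpha + a_i, \\
y_{ij} \mid (z_{ij} = 1,\, b_i) &\sim N\!\left(x_{ij}^\top \beta + b_i,\ \sigma_e^2\right), \\
(a_i, b_i)^\top &\sim \sum_{k=1}^{K} \pi_k \, N_2(\mu_k, \Sigma_k), \qquad \sum_{k=1}^{K} \pi_k = 1 .
\end{aligned}
$$

In this notation, the correlation between the two parts sits in the off-diagonal elements of each $\Sigma_k$, and the heterogeneity sits in the mixing proportions $\pi_k$.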

Before the mixture of normals entered the model, I was using SAS NLMIXED, which maximizes the likelihood by quasi-Newton optimization, with the likelihood approximated by adaptive Gaussian quadrature. Some publications use this approach because it is less complicated than EM (the E step requires the expectation of nonlinear quantities with respect to a nonstandard distribution, and the M step cannot be expressed in closed form) and faster than a Bayesian approach.
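
To illustrate what approximating the likelihood by quadrature means, here is a minimal Python sketch for the logistic part only, with a single normal random intercept. It uses a fixed 3-point Gauss-Hermite rule, not the adaptive version that NLMIXED uses (the adaptive version recenters and rescales the nodes for each subject), and the function names and arguments (`normal_expectation`, `subject_likelihood`, `z`, `eta`, `sigma`) are hypothetical.

```python
import math

# 3-point Gauss-Hermite rule for the weight function exp(-x^2).
# These nodes and weights are exact for polynomials up to degree 5.
GH_NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
GH_WEIGHTS = (
    math.sqrt(math.pi) / 6.0,
    2.0 * math.sqrt(math.pi) / 3.0,
    math.sqrt(math.pi) / 6.0,
)


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def normal_expectation(f, sigma):
    """Approximate E[f(b)] for b ~ N(0, sigma^2).

    Uses the change of variable b = sqrt(2) * sigma * x, which turns the
    normal density into the Gauss-Hermite weight exp(-x^2) / sqrt(pi).
    """
    total = sum(
        w * f(math.sqrt(2.0) * sigma * x) for x, w in zip(GH_NODES, GH_WEIGHTS)
    )
    return total / math.sqrt(math.pi)


def subject_likelihood(z, eta, sigma):
    """Marginal likelihood of one subject's zero/nonzero indicators.

    z     : list of 0/1 indicators (1 = nonzero outcome), one per occasion
    eta   : list of fixed-effect linear predictors, one per occasion
    sigma : standard deviation of the random intercept (assumed normal)
    """
    def conditional(b):
        lik = 1.0
        for z_j, eta_j in zip(z, eta):
            p = expit(eta_j + b)
            lik *= p if z_j == 1 else 1.0 - p
        return lik

    return normal_expectation(conditional, sigma)
```

A call such as `subject_likelihood([1, 0, 1], [0.3, 0.1, -0.2], 1.5)` returns one subject's contribution; the full log-likelihood is the sum of the logs over subjects. Three nodes are far too few for real use and were chosen only because their values can be written down exactly.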

Now that I am considering a mixture of normals for the random-effects distribution, I am having trouble using SAS NLMIXED because I do not know which normal component each observation belongs to, something that is not a big concern when using EM.

So I am looking for ideas! Somebody suggested that a path model might be an option, so I was wondering whether anybody has used OpenMx to fit two-part models with mixed effects.
Thanks

AdminRobK
Joined: 01/24/2014 - 12:15
interesting question

Interesting question. Could you say more about the random effects?

mxEvaluateOnGrid() can be used to do something like quadrature.

"Now that I am considering the mixture of normals for the distribution of the random effects, I have troubles using the SAS NLMIXED since I do not have information about the membership to the normal distributions"

So you need the mixture proportions to be global rather than specific to each observation?

Vivifj03
Joined: 04/25/2019 - 22:14
Thanks for looking at this!

Hi,
I attached a small write-up of my model. I am mainly interested in the fixed-effects estimates in both parts (the logistic and linear mixed models). However, when I ran the model without the mixture distribution for the random effects, the fit was poor, and the density plot of the residuals looked bimodal. That is how the heterogeneity model came into my problem.
So far, I have seen the EM algorithm used for estimation when a mixture of normals is involved, using an unobserved indicator variable (which I call $\Delta_{ij}$ in my model). EM handles this in the E step by taking the expectation over all the unobserved $\Delta_{ij}$.
My goal is not to classify each observation, but I need the mixture to get better performance from my model.
Now, since the random effects of the two parts are correlated, the likelihood cannot be evaluated exactly because of the intractable integrals. SAS NLMIXED offers a "fast" estimation of the parameters by approximating the integrals. But when I tried to include the normal mixture, I also had to supply $\Delta_{ij}$, which is unknown. While writing this, it occurs to me that I might create a random Delta inside NLMIXED, but I would like to see other ideas, particularly in R, since I will have to extend this model and coding it in R might be easier for me.

AdminRobK
Joined: 01/24/2014 - 12:15
some remarks

What will the total dimension of the random effects be? What is it about the data that called for random effects in the first place?

You are probably already aware, but using random effects with a non-identity link function (such as in logistic regression) complicates interpretation of the fixed effects (see attachment).
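
A small numerical illustration of that point, as a sketch under assumed values: with a normal random intercept on the logit scale, the probability for a subject with random effect zero differs from the probability averaged over the population. The function name `marginal_probability`, the plain midpoint rule, and the grid settings are all choices made here for illustration.

```python
import math


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def marginal_probability(eta, sigma, n_grid=4001, width=8.0):
    """Average expit(eta + b) over b ~ N(0, sigma^2).

    Uses a midpoint rule on [-width * sigma, width * sigma]; this is a
    crude but transparent stand-in for proper quadrature.
    """
    if sigma == 0.0:
        return expit(eta)
    step = 2.0 * width * sigma / n_grid
    total = 0.0
    for k in range(n_grid):
        b = -width * sigma + (k + 0.5) * step
        density = math.exp(-0.5 * (b / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        total += expit(eta + b) * density * step
    return total


# Conditional (subject-specific, b = 0) versus marginal (population-averaged)
# probability for an assumed linear predictor of 2 and random-intercept SD of 2.
conditional = expit(2.0)
marginal = marginal_probability(2.0, 2.0)
```

Here `marginal` lies between 0.5 and `conditional`: averaging over the random intercept pulls the probability toward 0.5, so a fixed-effect coefficient in the logistic part describes a subject-specific effect, not a population-averaged one.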

"So far, I've seen that the EM algorithm is used for estimating the likelihood when the mixture of normals is involved, using an unobserved indicator variable (I called $\Delta_{ij}$ in my model). I saw that the EM handle this situation in the E step taking the expectation over all the unobserved $\Delta_{ij}$.
I guess that my goal isn't to clasify each observation, but I need it to get better performance of my model."

If you don't care about classifying individual observations, then it isn't necessary to use EM to fit a mixture model in OpenMx. Consider mm2 in this script.
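
The reason EM is not required can be sketched in a few lines of Python. The unobserved indicators never need to be supplied, because summing the component densities weighted by the mixing proportions already integrates them out, and the resulting log-likelihood can be handed to any general-purpose optimizer. This is not the script referred to above; the names `normal_logpdf` and `mixture_loglik` and the toy data are assumptions for illustration, and the example uses directly observed values rather than random effects.

```python
import math


def normal_logpdf(y, mean, sd):
    """Log density of N(mean, sd^2) at y."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(sd) - 0.5 * ((y - mean) / sd) ** 2


def mixture_loglik(ys, weights, means, sds):
    """Log-likelihood of ys under a finite mixture of normals.

    The component membership of each observation is summed out, so no
    indicator variable appears anywhere. A log-sum-exp keeps it stable.
    """
    total = 0.0
    for y in ys:
        terms = [
            math.log(w) + normal_logpdf(y, m, s)
            for w, m, s in zip(weights, means, sds)
        ]
        top = max(terms)
        total += top + math.log(sum(math.exp(t - top) for t in terms))
    return total


# Toy bimodal data (made up for this sketch).
ys = [-2.1, -1.9, -2.0, 2.0, 2.1, 1.9]
two_component = mixture_loglik(ys, [0.5, 0.5], [-2.0, 2.0], [0.5, 0.5])
one_component = mixture_loglik(ys, [1.0], [0.0], [2.0])
```

On these data `two_component` is larger than `one_component`. For the two-part model, the same idea would apply one level up: each component density would be replaced by the subject's likelihood integrated over the random effects under that component.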