You are here

Plans for Mixture model package - help in the process

9 posts / 0 new
Last post
maugavilla's picture
Offline
Joined: 10/19/2021 - 09:48
Plans for Mixture model package - help in the process

Hi all

As part of a bigger project we are working with applications of mixture models. And we have found that no R package provides all the functionality we need. The only programs so far with all the functionality is LatentGold, but this is a propietary and have working with the license has been complicated

For these reasons we decided to find a way to work as many of these models as possible in R, and found that the most promising way is to develop a package around OpenMx for the types of models we need. So, now we are the process of developing the models, and then have the user friendly functions around it

I wanted to start this thread to ask questions as I find go on this process

So far I have developed Latent Class Analysis (LCA) models for continuous, binary, ordered and nominal indicators. These results are close to the results from LatentGold

The first question I have is about the optimizers, so far I have worked with the mxExpectationMixture() function, which uses the quasi-newton optimization routines (if I understand properly). But in LatentGold they use EM with Newton-Raphson (page 56)

Could you help me with these?
- Do you think think this optmization methods would lead to substantial result differences?
- I saw in the mxComputeEM() example how to estimate a LCA with EM. But I cant figure out how to add the mxComputeNewtonRaphson() to it, in a similar way as it is done for IRT models mxComputeEM('expectation', 'scores', mxComputeNewtonRaphson())

Thank you

AdminRobK's picture
Offline
Joined: 01/24/2014 - 12:15
Newton-Raphson

First off, let me say that I'm pleased to hear of your interest in developing a package around OpenMx. OpenMx has a lot of capabilities that rarely see use, since only a power user would know how to write a script to use them.

You probably know this already, but OpenMx's Newton-Raphson optimizer requires analytic first and second derivatives of the fitfunction being optimized. Presently, only two kinds of models are able to automatically provide such analytic derivatives to the optimizer: IRT models (as you saw), and models that use the GREML expectation and fitfunction. However, mxFitFunctionAlgebra() has support for user-provided analytic fitfunction derivatives. For perspective, let's take a look at a modification of the example syntax for mxComputeEM() :

mm <- mxModel(
    "Mixture", data4mx, class1, class2,
    mxAlgebra((1-Posteriors) * Class1.fitfunction, name="PL1"),
    mxAlgebra(Posteriors * Class2.fitfunction, name="PL2"),
    mxAlgebra(PL1 + PL2, name="PL"),
    mxAlgebra(PL2 / PL,  recompute='onDemand',
              initial=matrix(runif(N,.4,.6), nrow=N, ncol = 1), name="Posteriors"),
    mxAlgebra(-2*sum(log(PL)), name="FF"),
    mxAlgebra(
    # write some expression here that evaluates to a row vector containing the first partial derivatives of 'FF' w/r/t all free parameters 
    name="grad"),
mxAlgebra(
    # write some expression here that evaluates to a matrix of second partial derivatives of 'FF' w/r/t all free parameters 
    name="hess"),
    mxFitFunctionAlgebra(algebra="FF",gradient="grad",hessian="hess"),
    mxComputeEM(
      estep=mxComputeOnce("Mixture.Posteriors"),
      mstep=mxComputeNewtonRaphson(fitfunction="Mixture.fitfunction")))

The above should work, once the comments have been replaced with appropriate MxAlgebra expressions. However, I have never tried anything quite like this before.

As to whether or not it makes a difference to use Newton-Raphson versus quasi-Newton ("gradient descent")...? Well, since Newton-Raphson uses analytic first and second fitfunction derivatives, it will typically be faster (fewer major iterations, fewer fitfunction evaluations) and more numerically accurate. However, OpenMx's Newton-Raphson optimizer does not handle lower and upper bounds ("box constraints") on parameters very well, and is incompatible with explicit MxConstraints. In contrast, OpenMx's three quasi-Newton optimizers have full support for constrained optimization. Also, note that the quasi-Newton optimizers can all optionally use analytic first derivatives of the fitfunction.

maugavilla's picture
Offline
Joined: 10/19/2021 - 09:48
Newton-Raphson and HMM

I think the same, we want to make some of these complex models more accesible.

Thank you, I see how to make some examples from these models run with Newton-Raphson, but it would take me longer than I want for the first edition of the package. So that it works for all the models I want to make available. Will work on that addition for a future version.

On another issue that is higher priority in our models, is the addition of Hiddent Markov Models (HMM). I am working on applying HMM with multiple subjects in time series. The example from the mxExpectationHiddenMarkov() function is applied with a single time series

start_prob <- c(.2,.4,.4)
transition_prob <- matrix(c(.8, .1, .1,
                            .3, .6, .1,
                            .1, .3, .6), 3, 3)
noise <- .05
 
# simulate a trajectory
state <- sample.int(3, 1, prob=transition_prob %*% start_prob)
trail <- c(state)
 
for (rep in 1:500) {
  state <- sample.int(3, 1, prob=transition_prob[,state])
  trail <- c(trail, state)
}
 
# add noise
trailN <- sapply(trail, function(v) rnorm(1, mean=v, sd=sqrt(noise)))
 
classes <- list()
 
for (cl in 1:3) {
  classes[[cl]] <- mxModel(paste0("cl", cl), type="RAM",
                           manifestVars=c("ob"),
                           mxPath("one", "ob", value=cl, free=FALSE),
                           mxPath("ob", arrows=2, value=noise, free=FALSE),
                           mxFitFunctionML(vector=TRUE))
}
 
m1 <-  
  mxModel("hmm", classes,
          mxData(data.frame(ob=trailN), "raw"),
          mxMatrix(nrow=3, ncol=1, free=c(F,T,T),
                   lbound=0.001, ubound=.99,
                   labels=paste0('i',1:3), name="initial"),
          mxMatrix(nrow=length(classes), ncol=length(classes),
                   labels=paste0('t', 1:(length(classes) * length(classes))),
                   name="transition"),
          mxExpectationHiddenMarkov(
            components=sapply(classes, function(m) m$name),
            initial="initial",
            transition="transition", scale="softmax"),
          mxFitFunctionML())
 
m1$transition$free[1:(length(classes)-1), 1:length(classes)] <- TRUE
 
m1 <- mxRun(m1)
summary(m1)
m1$expectation$output

But for most scenarios we have multiple subjects time series. My idea was to define the classes as random intercept models as multilevel models based on the time variables. Or should the HMM be define in another way with multiple subjects?

But when I define the model like this I get an error

Running hmm with 9 parameters
Error in runHelper(model, frontendStart, intervals, silent, suppressWarnings,  : 
  hmm.fitfunction: component class1.fitfunction must be in probability units

Here is my model so far

dat <- rio::import("schizophrenia_markov.sav")
head(dat)
 
dat$time <- as.factor(dat$time)
 
nclass <- 2
class <- list()
for(cl in 1:nclass){
  class[[cl]] <- mxModel(paste0("class",cl), 
                         type='RAM', 
                         mxModel(paste0('time',cl), type="RAM",
                           latentVars = c('time'),
                           mxData(data.frame(time=unique(dat$time)), 'raw', primaryKey='time'),
                           mxPath('time', arrows=2, values=1)),
                         manifestVars = c('DEP'),
                         mxData(dat[,c("DEP","time")], 'raw'),
                         mxPath('one', 'DEP'),
                         mxPath('DEP', arrows=2, values=1),
                         mxPath(paste0('time',cl,'.time'), 'DEP', values=1, 
                                free=FALSE, joinKey='time'),
                         mxFitFunctionML(vector=TRUE)
                         )
}
 
 
hmm <- mxModel("hmm", class,
               mxData(dat[,c("DEP","time")], 'raw'),
               mxMatrix(nrow=length(class), ncol=1, free=c(F,T),
                        lbound=0.001, ubound=.999, values=1,
                        labels=paste0('i',1:length(class)), 
                        name="initial"),
               mxMatrix(nrow=length(class), ncol=length(class),
                        labels=paste0('t', 1:(length(class) * length(class))),
                        name="transition"),
               mxExpectationHiddenMarkov(
                 components=sapply(class, function(m) m$name),
                 initial="initial",
                 transition="transition", scale="softmax"),
               mxFitFunctionML())
 
hmm$transition$free[1:(length(class)-1), 1:length(class)] <- TRUE
 
hmmFit <- mxRun(hmm)
summary(hmmFit)

Thanks for all the help!

File attachments: 
jpritikin's picture
Offline
Joined: 05/24/2012 - 00:35
multilevel

One way to get this model working is to group all the observations that happen at the same time into a single row. This would remove the multilevel structure from the data. Each row would be independent. The way you coded it, the rows are not independent because there are multiple rows that happened at the same time.

maugavilla's picture
Offline
Joined: 10/19/2021 - 09:48
HMM with multiple subjects and multiple time points

Hi

I was able to make the random intefcept mode work with the wide data format

library(tidyr)
library(OpenMx)
 
dat <- rio::import("schizophrenia_markov.sav")
head(dat)
 
###### random intercept SEM structure
dat$time <- as.factor(dat$time)
 
dat2 <- spread(dat[,c("ID","DEP","time")], key="time", value="DEP")
head(dat2)
colnames(dat2) <- c("ID", paste0("t",0:6))
 
dat3 <- data.frame(dat2[,2:8])
 
dataRaw <- mxData( observed=dat3, type="raw" )
vars <- colnames(dat3)
vars
 
nclass <- 2
class <- list()
for(cl in 1:nclass){
  class[[cl]] <- mxModel(paste0("class",cl), 
                         type="RAM",
                         latentVars = c("int"),
                         manifestVars=vars,
                         dataRaw, 
                         mxPath( from="int", arrows=2,
                                 free=T, values=1),
                         mxPath( from="one", to="int", arrows=1,
                                 free=TRUE, values=1 ),
                         mxPath( from="int", to=vars, arrows=1,
                                 free=F, values=1 ),
                         mxPath( from=vars, arrows=2,
                                 free=T, values=rep(0, length(vars)), 
                                 labels=paste("e",cl),
                                 lbound=0.001
                                 ),
                         mxPath( from="one", to=vars,
                                 arrows=1,
                                 free=F, values=rep(0, length(vars)),
                                 labels=c(paste0("mean",cl,1:length(vars))) ),
                         mxFitFunctionML(vector=T)
                         )
}
 
hmm <- mxModel("hmm", class,
               dataRaw,
               mxMatrix(nrow=length(class), ncol=1, free=c(F,T),
                        lbound=0.0001, ubound=.9999, 
                        values=1,
                        labels=paste0('i',1:length(class)), 
                        name="initial"),
               mxMatrix(nrow=length(class), ncol=length(class),
                        labels=paste0('t', 1:(length(class) * length(class))),
                        name="transition"),
               mxExpectationHiddenMarkov(
                 components=sapply(class, function(m) m$name),
                 initial="initial",
                 transition="transition", scale="softmax"),
               mxFitFunctionML())
 
hmm$transition$free[1:(length(class)-1), 1:length(class)] <- TRUE
 
hmmFit <- mxRun(hmm)
hmmFit <- mxTryHard(hmm, extraTries=50, maxMajorIter=5000, exhaustive=T)
 
summary(hmmFit)
hmmFit$expectation$output

But this is still not the model I am looking to replicate from HMM. As the model I am looking to replicate is to have multiple subjects with multiple time points. From here the "initial" would be the proportions in each latent class at the first time point, and the "transition" would be the transition between latent classes over time.
So, as another option I attempted to use the long format data, and add the time variable as second level predictor to the initial and transition. But this didnt work as I could only add the time predictor to the initial states and from the data, instead of the second level structure model (similar to this post )

dat$time <- as.factor(dat$time)
 
nclass <- 2
class <- list()
for(cl in 1:nclass){
  class[[cl]] <- mxModel(paste0("class",cl), 
                         type='RAM',
                         manifestVars = c('DEP'),
                         mxData(dat[,c("DEP","time")], 'raw'),
                         mxPath('one', 'DEP'),
                         mxPath('DEP', arrows=2, values=1),
                         #mxPath(paste0('time',cl,'.time'), 'DEP', values=1,free=FALSE,joinKey='time'),
                         mxFitFunctionML(vector=TRUE)
  )
}
 
hmm <- mxModel("hmm", class,
               ## MLM model structure
               mxModel(paste0('time'), type="RAM",
                       latentVars = c('time'),
                       mxData(data.frame(time=unique(as.factor(dat$time))), 'raw', 
                              primaryKey='time'),
                       mxPath('time', arrows=2, values=1)),
               mxData(dat[,c("DEP","time")], 'raw'),
               mxMatrix( type = "Full", nrow = 2, ncol = 2,
                         free=c(TRUE, FALSE,TRUE,TRUE), values=1,
                         labels = c("p11","p21", "p12", "p22"), 
                         name = "initialM" ),
               mxMatrix(nrow=2, ncol=1, labels=c(NA, "data.time"), values=1, 
                        name="initialV"),#,joinKey='time'),
               mxAlgebra(initialM %*% initialV, name="initial"),
               mxMatrix(nrow=length(class), ncol=length(class),
                        labels=paste0('t', 1:(length(class) * length(class))),
                        name="transitionM"),
               mxMatrix(nrow=2, ncol=2, labels=c("data.time", "data.time","data.time", "data.time"), values=1, 
                        name="transitionV"),#,joinKey='time'),
               mxAlgebra(transitionM %*% transitionV, name="transition"),
               #mxPath('time.time', 'initial', free=FALSE, values=1, joinKey="time"),
               mxExpectationHiddenMarkov(
                 components=sapply(class, function(m) m$name),
                 initial="initial",
                 transition="transition", scale="softmax"),
               mxFitFunctionML())
 
hmm$transitionM$free[1:(length(class)-1), 1:length(class)] <- TRUE
 
hmmFit <- mxTryHard(hmm, extraTries=50, maxMajorIter=5000, exhaustive=T)
 
summary(hmmFit)
hmmFit$expectation$output

From the schizophrenia_markov.sav, the results I am trying to rplicate are
Initial states
Class[=0]
1 2
0.9919 0.0081

Transitions
Class
Class[-1] 1 2
1 0.8351 0.1649
2 0.0013 0.9987

Not sure how to set up the mxExpectationHiddenMarkov() correctly here

Appreciate the help

File attachments: 
AdminNeale's picture
Offline
Joined: 03/01/2013 - 14:09
Similar to this post?

The “this post” you refer to is just a test script. Can you please edit the link?

maugavilla's picture
Offline
Joined: 10/19/2021 - 09:48
Right link, wrong wording

Sorry for the lack of clarity. Did menat to include that link, as the example code for multilevel regression. But use the wrong wording calling it a post

Trying to apply the multilevel structure in the HMM for estimation with multiple subjects and multiple time points

maugavilla's picture
Offline
Joined: 10/19/2021 - 09:48
R package in course

After talking with colleagues, I found that Caspar van Lissa has added some user friendly mixture modeling with OpenMx in his package tidySEM.

I am working with him on improving and adding more features to this package, including tutorial vignettes.

But I havent been able to work out a Hidden Markov Model when I have multiple items over time, and estimate the transitions between latent states.

Have try:
- random intercept model
- transpose the data
- CFA over time
- growth curve

But none of this have match the expected basic results for a HMM. Would anyone here have a new idea on how to implement this?

Once I have a working example we can translate this into the tidySEM package

Thank you

mhunter's picture
Offline
Joined: 07/31/2009 - 15:26
See example script

If I understand correctly what you're after, then there's an example script for that. It's https://github.com/OpenMx/OpenMx/blob/master/inst/models/passing/HMM-multigroup.R which is a multi-subject HMM test script. Essentially, each person has a tall format data set with different rows being different times, and different columns being different variables. Each person has a hidden Markov model that happens to be the same model. Then the people are combined into a multigroup model where each "group" is a person.

One thing to be careful about: free parameters with the same labels are constrained to be equal. If you leave the free parameters unlabeled, they will be different. If you leave your parameters unlabeled by accident, then you could end up estimating separate free parameters for everyone, and it could be quite a large number of parameters.

Good luck!