# Plans for Mixture model package - help in the process

As part of a larger project we are working with applications of mixture models, and we have found that no R package provides all the functionality we need. The only program so far with all the functionality is LatentGold, but it is proprietary and working with the license has been complicated.

For these reasons we decided to find a way to fit as many of these models as possible in R, and found that the most promising approach is to develop a package around OpenMx for the types of models we need. So, we are now in the process of developing the models, and will then build user-friendly functions around them.

I wanted to start this thread to ask questions as I go through this process.

So far I have developed Latent Class Analysis (LCA) models for continuous, binary, ordered, and nominal indicators. These results are close to the results from LatentGold.

The first question I have is about the optimizers. So far I have worked with the **mxExpectationMixture()** function, which uses the quasi-Newton optimization routines (if I understand correctly). But LatentGold uses [EM with Newton-Raphson](https://www.statisticalinnovations.com/wp-content/uploads/LGtechnical.pdf) (page 56).

Could you help me with the following questions?

- Do you think these optimization methods would lead to substantially different results?

- I saw in the **mxComputeEM()** example how to estimate an [LCA with EM](https://rdrr.io/cran/OpenMx/man/mxComputeEM.html). But I can't figure out how to add **mxComputeNewtonRaphson()** to it, in a similar way as is done for [IRT models](https://vipbg.vcu.edu/vipbg/OpenMx2/docs//OpenMx/latest/ItemFactorAnalysis.html): **mxComputeEM('expectation', 'scores', mxComputeNewtonRaphson())**

Thank you

## Newton-Raphson

You probably know this already, but OpenMx's Newton-Raphson optimizer requires analytic first and second derivatives of the fitfunction being optimized. Presently, only two kinds of models are able to automatically provide such analytic derivatives to the optimizer: IRT models (as you saw), and models that use the GREML expectation and fitfunction. However, `mxFitFunctionAlgebra()` has support for user-provided analytic fitfunction derivatives. For perspective, let's take a look at a modification of the example syntax for `mxComputeEM()`:

```r
mm <- mxModel(
  "Mixture", data4mx, class1, class2,
  mxAlgebra((1-Posteriors) * Class1.fitfunction, name="PL1"),
  mxAlgebra(Posteriors * Class2.fitfunction, name="PL2"),
  mxAlgebra(PL1 + PL2, name="PL"),
  mxAlgebra(PL2 / PL, recompute='onDemand',
            initial=matrix(runif(N,.4,.6), nrow=N, ncol=1), name="Posteriors"),
  mxAlgebra(-2*sum(log(PL)), name="FF"),
  mxAlgebra(
    # write some expression here that evaluates to a row vector containing
    # the first partial derivatives of 'FF' w/r/t all free parameters
    name="grad"),
  mxAlgebra(
    # write some expression here that evaluates to a matrix of second
    # partial derivatives of 'FF' w/r/t all free parameters
    name="hess"),
  mxFitFunctionAlgebra(algebra="FF", gradient="grad", hessian="hess"),
  mxComputeEM(
    estep=mxComputeOnce("Mixture.Posteriors"),
    mstep=mxComputeNewtonRaphson(fitfunction="Mixture.fitfunction")))
```

The above should work, once the comments have been replaced with appropriate MxAlgebra expressions. However, I have never tried anything quite like this before.

As to whether or not it makes a difference to use Newton-Raphson versus quasi-Newton ("gradient descent")...? Well, since Newton-Raphson uses analytic first and second fitfunction derivatives, it will typically be faster (fewer major iterations, fewer fitfunction evaluations) and more numerically accurate. However, OpenMx's Newton-Raphson optimizer does not handle lower and upper bounds ("box constraints") on parameters very well, and is incompatible with explicit MxConstraints. In contrast, OpenMx's three quasi-Newton optimizers have full support for constrained optimization. Also, note that the quasi-Newton optimizers can all optionally use analytic first derivatives of the fitfunction.
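For completeness: which of the quasi-Newton optimizers OpenMx uses can be changed globally with `mxOption()`. This is just a configuration fragment, not a full model:

```r
library(OpenMx)
# Switch the default gradient-based optimizer; "CSOLNP" and "SLSQP" ship
# with OpenMx, while "NPSOL" is only available in builds that include it.
mxOption(NULL, "Default optimizer", "SLSQP")
```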


In reply to Newton-Raphson by AdminRobK

## Newton-Raphson and HMM

Thank you, I see how to make some examples of these models run with Newton-Raphson, but making it work for all the models I want to make available would take longer than I have for the first edition of the package. I will work on that addition for a future version.

Another issue that is higher priority in our models is the addition of Hidden Markov Models (HMM). I am working on applying HMMs to time series with multiple subjects. The example from the **mxExpectationHiddenMarkov()** function is applied to a single time series:

```r
start_prob <- c(.2,.4,.4)
transition_prob <- matrix(c(.8, .1, .1,
                            .3, .6, .1,
                            .1, .3, .6), 3, 3)
noise <- .05

# simulate a trajectory
state <- sample.int(3, 1, prob=transition_prob %*% start_prob)
trail <- c(state)
for (rep in 1:500) {
  state <- sample.int(3, 1, prob=transition_prob[,state])
  trail <- c(trail, state)
}

# add noise
trailN <- sapply(trail, function(v) rnorm(1, mean=v, sd=sqrt(noise)))

classes <- list()
for (cl in 1:3) {
  classes[[cl]] <- mxModel(paste0("cl", cl), type="RAM",
                           manifestVars=c("ob"),
                           mxPath("one", "ob", value=cl, free=FALSE),
                           mxPath("ob", arrows=2, value=noise, free=FALSE),
                           mxFitFunctionML(vector=TRUE))
}

m1 <- mxModel("hmm", classes,
              mxData(data.frame(ob=trailN), "raw"),
              mxMatrix(nrow=3, ncol=1, free=c(F,T,T),
                       lbound=0.001, ubound=.99,
                       labels=paste0('i',1:3), name="initial"),
              mxMatrix(nrow=length(classes), ncol=length(classes),
                       labels=paste0('t', 1:(length(classes) * length(classes))),
                       name="transition"),
              mxExpectationHiddenMarkov(
                components=sapply(classes, function(m) m$name),
                initial="initial",
                transition="transition", scale="softmax"),
              mxFitFunctionML())

m1$transition$free[1:(length(classes)-1), 1:length(classes)] <- TRUE
m1 <- mxRun(m1)
summary(m1)
m1$expectation$output
```
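Since `scale="softmax"` appears in both matrices, here is a minimal base-R sketch (no OpenMx needed; the numbers are made up) of how softmax scaling maps the unconstrained free parameters to probabilities. If I read the example's convention correctly (cf. `transition_prob[,state]` above), each column of the transition matrix is the distribution over next states given the current state, which is also why only the first `nclass-1` rows are freed:

```r
# Hypothetical unconstrained parameters for a 3-state transition matrix;
# each column corresponds to the current state, with the last row fixed at 0
# as the reference category.
raw <- matrix(c(0,  1.5, -0.5,
                0, -1.0,  0.2,
                0,  0.3,  1.1), nrow=3, ncol=3, byrow=TRUE)

softmax <- function(x) exp(x) / sum(exp(x))

# Column-wise softmax yields a proper transition matrix:
probs <- apply(raw, 2, softmax)
colSums(probs)  # each column sums to 1
```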

But in most of our scenarios we have time series from multiple subjects. My idea was to define the classes as random-intercept multilevel models based on the time variable. Or should the HMM be defined in another way for multiple subjects?

But when I define the model like this, I get an error:

```
Running hmm with 9 parameters
Error in runHelper(model, frontendStart, intervals, silent, suppressWarnings, :
  hmm.fitfunction: component class1.fitfunction must be in probability units
```

Here is my model so far:

```r
dat <- rio::import("schizophrenia_markov.sav")
head(dat)
dat$time <- as.factor(dat$time)

nclass <- 2
class <- list()
for(cl in 1:nclass){
  class[[cl]] <- mxModel(paste0("class",cl),
                         type='RAM',
                         mxModel(paste0('time',cl), type="RAM",
                                 latentVars = c('time'),
                                 mxData(data.frame(time=unique(dat$time)), 'raw',
                                        primaryKey='time'),
                                 mxPath('time', arrows=2, values=1)),
                         manifestVars = c('DEP'),
                         mxData(dat[,c("DEP","time")], 'raw'),
                         mxPath('one', 'DEP'),
                         mxPath('DEP', arrows=2, values=1),
                         mxPath(paste0('time',cl,'.time'), 'DEP', values=1,
                                free=FALSE, joinKey='time'),
                         mxFitFunctionML(vector=TRUE))
}

hmm <- mxModel("hmm", class,
               mxData(dat[,c("DEP","time")], 'raw'),
               mxMatrix(nrow=length(class), ncol=1, free=c(F,T),
                        lbound=0.001, ubound=.999, values=1,
                        labels=paste0('i',1:length(class)),
                        name="initial"),
               mxMatrix(nrow=length(class), ncol=length(class),
                        labels=paste0('t', 1:(length(class) * length(class))),
                        name="transition"),
               mxExpectationHiddenMarkov(
                 components=sapply(class, function(m) m$name),
                 initial="initial",
                 transition="transition", scale="softmax"),
               mxFitFunctionML())

hmm$transition$free[1:(length(class)-1), 1:length(class)] <- TRUE
hmmFit <- mxRun(hmm)
summary(hmmFit)
```

Thanks for all the help!


## multilevel


In reply to multilevel by jpritikin

## HMM with multiple subjects and multiple time points

I was able to make the random intercept model work with the wide data format:

```r
library(tidyr)
library(OpenMx)

dat <- rio::import("schizophrenia_markov.sav")
head(dat)

##### random intercept SEM structure
dat$time <- as.factor(dat$time)
dat2 <- spread(dat[,c("ID","DEP","time")], key="time", value="DEP")
head(dat2)
colnames(dat2) <- c("ID", paste0("t",0:6))
dat3 <- data.frame(dat2[,2:8])

dataRaw <- mxData( observed=dat3, type="raw" )
vars <- colnames(dat3)
vars

nclass <- 2
class <- list()
for(cl in 1:nclass){
  class[[cl]] <- mxModel(paste0("class",cl),
                         type="RAM",
                         latentVars = c("int"),
                         manifestVars=vars,
                         dataRaw,
                         mxPath( from="int", arrows=2,
                                 free=TRUE, values=1 ),
                         mxPath( from="one", to="int", arrows=1,
                                 free=TRUE, values=1 ),
                         mxPath( from="int", to=vars, arrows=1,
                                 free=FALSE, values=1 ),
                         mxPath( from=vars, arrows=2,
                                 free=TRUE, values=rep(0, length(vars)),
                                 labels=paste0("e",cl),  # paste0, so the label has no space
                                 lbound=0.001 ),
                         mxPath( from="one", to=vars, arrows=1,
                                 free=FALSE, values=rep(0, length(vars)),
                                 labels=paste0("mean",cl,1:length(vars)) ),
                         mxFitFunctionML(vector=TRUE))
}

hmm <- mxModel("hmm", class,
               dataRaw,
               mxMatrix(nrow=length(class), ncol=1, free=c(F,T),
                        lbound=0.0001, ubound=.9999, values=1,
                        labels=paste0('i',1:length(class)),
                        name="initial"),
               mxMatrix(nrow=length(class), ncol=length(class),
                        labels=paste0('t', 1:(length(class) * length(class))),
                        name="transition"),
               mxExpectationHiddenMarkov(
                 components=sapply(class, function(m) m$name),
                 initial="initial",
                 transition="transition", scale="softmax"),
               mxFitFunctionML())

hmm$transition$free[1:(length(class)-1), 1:length(class)] <- TRUE
hmmFit <- mxTryHard(hmm, extraTries=50, maxMajorIter=5000, exhaustive=TRUE)
summary(hmmFit)
hmmFit$expectation$output
```

But this is still not the model I am trying to replicate with an HMM, as the model I am looking to replicate has multiple subjects with multiple time points. There, the "initial" would be the proportions in each latent class at the first time point, and the "transition" would be the transitions between latent classes over time.
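To spell out what I mean by "initial" and "transition" in the multi-subject case, here is a base-R sketch of the forward algorithm for one subject, with hypothetical numbers (columns of `transition` index the current state, matching the single-series example's convention). With multiple subjects, each subject's marginal likelihood would be computed this way and the subjects combined into the overall likelihood:

```r
initial    <- c(.9, .1)                 # class proportions at the first time point
transition <- matrix(c(.8, .2,
                       .1, .9), 2, 2)   # [next, current]: each column sums to 1
# Hypothetical per-state observation likelihoods for one subject (states x time):
lik <- matrix(c(.7, .2,
                .6, .3,
                .1, .8), nrow=2)

alpha <- initial * lik[, 1]             # joint prob. of state and first observation
for (t in 2:ncol(lik)) {
  alpha <- as.vector(transition %*% alpha) * lik[, t]
}
sum(alpha)                              # this subject's marginal likelihood
```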

So, as another option, I attempted to use the long-format data and add the time variable as a second-level predictor of the initial and transition probabilities. But this didn't work, as I could only add the time predictor to the initial states, and only from the data, instead of from the second-level structural model (similar to this [post](https://github.com/OpenMx/OpenMx/blob/3cb6593af0754be134e107cf4f42315d8c18db2b/inst/models/passing/xxm-3.R)):

```r
dat$time <- as.factor(dat$time)

nclass <- 2
class <- list()
for(cl in 1:nclass){
  class[[cl]] <- mxModel(paste0("class",cl),
                         type='RAM',
                         manifestVars = c('DEP'),
                         mxData(dat[,c("DEP","time")], 'raw'),
                         mxPath('one', 'DEP'),
                         mxPath('DEP', arrows=2, values=1),
                         #mxPath(paste0('time',cl,'.time'), 'DEP', values=1,
                         #       free=FALSE, joinKey='time'),
                         mxFitFunctionML(vector=TRUE))
}

hmm <- mxModel("hmm", class,
               ## MLM model structure
               mxModel('time', type="RAM",
                       latentVars = c('time'),
                       mxData(data.frame(time=unique(as.factor(dat$time))), 'raw',
                              primaryKey='time'),
                       mxPath('time', arrows=2, values=1)),
               mxData(dat[,c("DEP","time")], 'raw'),
               mxMatrix( type = "Full", nrow = 2, ncol = 2,
                         free=c(TRUE, FALSE, TRUE, TRUE), values=1,
                         labels = c("p11", "p21", "p12", "p22"),
                         name = "initialM" ),
               mxMatrix(nrow=2, ncol=1, labels=c(NA, "data.time"), values=1,
                        name="initialV"),  #, joinKey='time'),
               mxAlgebra(initialM %*% initialV, name="initial"),
               mxMatrix(nrow=length(class), ncol=length(class),
                        labels=paste0('t', 1:(length(class) * length(class))),
                        name="transitionM"),
               mxMatrix(nrow=2, ncol=2,
                        labels=c("data.time", "data.time", "data.time", "data.time"),
                        values=1, name="transitionV"),  #, joinKey='time'),
               mxAlgebra(transitionM %*% transitionV, name="transition"),
               #mxPath('time.time', 'initial', free=FALSE, values=1, joinKey="time"),
               mxExpectationHiddenMarkov(
                 components=sapply(class, function(m) m$name),
                 initial="initial",
                 transition="transition", scale="softmax"),
               mxFitFunctionML())

hmm$transitionM$free[1:(length(class)-1), 1:length(class)] <- TRUE
hmmFit <- mxTryHard(hmm, extraTries=50, maxMajorIter=5000, exhaustive=TRUE)
summary(hmmFit)
hmmFit$expectation$output
```

From the **schizophrenia_markov.sav** data, the results I am trying to replicate are:

```
Initial states
  Class[=0]      1        2
              0.9919   0.0081

Transitions
                 Class
  Class[-1]      1        2
      1       0.8351   0.1649
      2       0.0013   0.9987
```

I am not sure how to set up **mxExpectationHiddenMarkov()** correctly here.

Appreciate the help


## Similar to this post?


In reply to Similar to this post? by AdminNeale

## Right link, wrong wording

I am trying to apply the multilevel structure in the HMM for estimation with multiple subjects and multiple time points.


## R package in course

I am working with him on improving this package and adding more features, including tutorial vignettes.

But I haven't been able to work out a Hidden Markov Model when I have multiple items over time, estimating the transitions between latent states.

I have tried:

- a random intercept model

- transposing the data

- a CFA over time

- a growth curve

But none of these has matched the expected basic results for an HMM. Would anyone here have a new idea on how to implement this?

Once I have a working example, we can translate it into the tidySEM package.

Thank you


## See example script

One thing to be careful about: free parameters with the same labels are constrained to be equal. If you leave the free parameters unlabeled, they will be different. If you leave your parameters unlabeled by accident, then you could end up estimating separate free parameters for everyone, and it could be quite a large number of parameters.
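The point about labels can be illustrated with a base-R sketch (the label names here are just examples, not from the script above): repeating one label collapses the corresponding paths into a single estimated parameter, while distinct labels keep them separate:

```r
# Seven residual-variance labels over seven time points:
shared   <- rep("e1", 7)      # same label everywhere -> one equated parameter
distinct <- paste0("e", 1:7)  # unique labels -> seven separate parameters

length(unique(shared))    # number of parameters actually estimated: 1
length(unique(distinct))  # number of parameters actually estimated: 7
```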

Good luck!
