Plans for a mixture model package - help with the process
As part of a bigger project we are working with applications of mixture models, and we have found that no R package provides all the functionality we need. The only program so far with all the functionality is LatentGold, but it is proprietary and working with the license has been complicated.
For these reasons we decided to find a way to fit as many of these models as possible in R, and found that the most promising approach is to develop a package around OpenMx for the types of models we need. So, we are now in the process of developing the models, and will then build user-friendly functions around them.
I wanted to start this thread to ask questions as I go through this process.
So far I have developed Latent Class Analysis (LCA) models for continuous, binary, ordered, and nominal indicators. The results are close to those from LatentGold.
The first question I have is about the optimizers. So far I have worked with the **mxExpectationMixture()** function, which uses the quasi-Newton optimization routines (if I understand properly). But LatentGold uses [EM with Newton-Raphson](https://www.statisticalinnovations.com/wp-content/uploads/LGtechnical.pdf) (page 56).
Could you help me with these questions?
- Do you think these optimization methods would lead to substantial differences in results?
- I saw in the **mxComputeEM()** example how to estimate an [LCA with EM](https://rdrr.io/cran/OpenMx/man/mxComputeEM.html). But I can't figure out how to add **mxComputeNewtonRaphson()** to it, in a similar way as is done for [IRT models](https://vipbg.vcu.edu/vipbg/OpenMx2/docs//OpenMx/latest/ItemFactorAnalysis.html): **mxComputeEM('expectation', 'scores', mxComputeNewtonRaphson())**
Thank you
Newton-Raphson
You probably know this already, but OpenMx's Newton-Raphson optimizer requires analytic first and second derivatives of the fitfunction being optimized. Presently, only two kinds of models are able to automatically provide such analytic derivatives to the optimizer: IRT models (as you saw), and models that use the GREML expectation and fitfunction. However, `mxFitFunctionAlgebra()` has support for user-provided analytic fitfunction derivatives. For perspective, let's take a look at a modification of the example syntax for `mxComputeEM()`:
mm <- mxModel(
  "Mixture", data4mx, class1, class2,
  mxAlgebra((1-Posteriors) * Class1.fitfunction, name="PL1"),
  mxAlgebra(Posteriors * Class2.fitfunction, name="PL2"),
  mxAlgebra(PL1 + PL2, name="PL"),
  mxAlgebra(PL2 / PL, recompute='onDemand',
            initial=matrix(runif(N,.4,.6), nrow=N, ncol=1), name="Posteriors"),
  mxAlgebra(-2*sum(log(PL)), name="FF"),
  mxAlgebra(
    # write some expression here that evaluates to a row vector containing
    # the first partial derivatives of 'FF' w/r/t all free parameters
    name="grad"),
  mxAlgebra(
    # write some expression here that evaluates to a matrix of second
    # partial derivatives of 'FF' w/r/t all free parameters
    name="hess"),
  mxFitFunctionAlgebra(algebra="FF", gradient="grad", hessian="hess"),
  mxComputeEM(
    estep=mxComputeOnce("Mixture.Posteriors"),
    mstep=mxComputeNewtonRaphson(fitfunction="Mixture.fitfunction")))
The above should work, once the comments have been replaced with appropriate MxAlgebra expressions. However, I have never tried anything quite like this before.
As to whether it makes a difference to use Newton-Raphson versus a quasi-Newton (gradient-based) optimizer: since Newton-Raphson uses analytic first and second fitfunction derivatives, it will typically be faster (fewer major iterations, fewer fitfunction evaluations) and more numerically accurate. However, OpenMx's Newton-Raphson optimizer does not handle lower and upper bounds ("box constraints") on parameters very well, and is incompatible with explicit MxConstraints. In contrast, OpenMx's three quasi-Newton optimizers have full support for constrained optimization. Also, note that the quasi-Newton optimizers can all optionally use analytic first derivatives of the fitfunction.
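For intuition about the structure that `mxComputeEM()` orchestrates, here is a self-contained base-R sketch of EM for a two-class univariate Gaussian mixture. The data and starting values are made up for illustration. In this toy case the M-step has a closed form; in the `mxComputeEM()` example above, `mxComputeNewtonRaphson()` plays the M-step role because no closed form is available:

```r
set.seed(1)
# toy data: two-class univariate Gaussian mixture (all values made up)
y <- c(rnorm(200, mean = 0, sd = 1), rnorm(200, mean = 4, sd = 1))

p  <- 0.5          # class-2 mixing proportion
mu <- c(-1, 5)     # crude starting means
s2 <- c(1, 1)      # starting variances

for (iter in 1:200) {
  # E-step: posterior probability that each observation belongs to class 2
  d1   <- (1 - p) * dnorm(y, mu[1], sqrt(s2[1]))
  d2   <- p * dnorm(y, mu[2], sqrt(s2[2]))
  post <- d2 / (d1 + d2)

  # M-step: closed-form weighted ML updates; this is the step an inner
  # optimizer would perform iteratively when no closed form exists
  p  <- mean(post)
  mu <- c(weighted.mean(y, 1 - post), weighted.mean(y, post))
  s2 <- c(sum((1 - post) * (y - mu[1])^2) / sum(1 - post),
          sum(post       * (y - mu[2])^2) / sum(post))
}

round(c(p = p, mu1 = mu[1], mu2 = mu[2]), 2)
```

The estimates should land near the generating values (mixing proportion about 0.5, means near 0 and 4).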
In reply to Newton-Raphson by AdminRobK
Newton-Raphson and HMM
Thank you. I see how to make some examples of these models run with Newton-Raphson, but getting it to work for all the models I want to make available would take longer than I want for the first edition of the package. I will work on that addition for a future version.
On another issue that is higher priority for our models: the addition of Hidden Markov Models (HMM). I am working on applying HMMs to time series with multiple subjects. The example from the **mxExpectationHiddenMarkov()** function is applied to a single time series:
start_prob <- c(.2, .4, .4)
transition_prob <- matrix(c(.8, .1, .1,
                            .3, .6, .1,
                            .1, .3, .6), 3, 3)
noise <- .05

# simulate a trajectory
state <- sample.int(3, 1, prob=transition_prob %*% start_prob)
trail <- c(state)
for (rep in 1:500) {
  state <- sample.int(3, 1, prob=transition_prob[,state])
  trail <- c(trail, state)
}

# add noise
trailN <- sapply(trail, function(v) rnorm(1, mean=v, sd=sqrt(noise)))
classes <- list()
for (cl in 1:3) {
  classes[[cl]] <- mxModel(paste0("cl", cl), type="RAM",
                           manifestVars=c("ob"),
                           mxPath("one", "ob", value=cl, free=FALSE),
                           mxPath("ob", arrows=2, value=noise, free=FALSE),
                           mxFitFunctionML(vector=TRUE))
}
m1 <- mxModel(
  "hmm", classes,
  mxData(data.frame(ob=trailN), "raw"),
  mxMatrix(nrow=3, ncol=1, free=c(FALSE,TRUE,TRUE),
           lbound=0.001, ubound=.99,
           labels=paste0('i',1:3), name="initial"),
  mxMatrix(nrow=length(classes), ncol=length(classes),
           labels=paste0('t', 1:(length(classes) * length(classes))),
           name="transition"),
  mxExpectationHiddenMarkov(
    components=sapply(classes, function(m) m$name),
    initial="initial",
    transition="transition", scale="softmax"),
  mxFitFunctionML())

m1$transition$free[1:(length(classes)-1), 1:length(classes)] <- TRUE
m1 <- mxRun(m1)
summary(m1)
m1$expectation$output
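To build test data for the multiple-subject case, the single-trajectory simulation above can be repeated once per subject. A base-R sketch (the subject and time counts are arbitrary; note that, unlike the single-series example, I draw the initial state directly from `start_prob` rather than from `transition_prob %*% start_prob`):

```r
set.seed(1)
start_prob <- c(.2, .4, .4)
transition_prob <- matrix(c(.8, .1, .1,
                            .3, .6, .1,
                            .1, .3, .6), 3, 3)  # columns sum to 1; column = current state
noise  <- .05
n_subj <- 50
n_time <- 20

sim_one <- function() {
  state <- sample.int(3, 1, prob = start_prob)  # initial state drawn from start_prob
  trail <- state
  for (t in 2:n_time) {
    state <- sample.int(3, 1, prob = transition_prob[, state])
    trail <- c(trail, state)
  }
  rnorm(n_time, mean = trail, sd = sqrt(noise))  # add observation noise
}

# one row per subject, one column per time point (wide format)
wide <- t(replicate(n_subj, sim_one()))
dim(wide)  # 50 x 20
```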
But in most of our scenarios we have time series from multiple subjects. My idea was to define the classes as random-intercept multilevel models based on the time variable. Or should the HMM be defined in some other way for multiple subjects?
But when I define the model like this I get an error:
Running hmm with 9 parameters
Error in runHelper(model, frontendStart, intervals, silent, suppressWarnings, :
hmm.fitfunction: component class1.fitfunction must be in probability units
Here is my model so far:
dat <- rio::import("schizophrenia_markov.sav")
head(dat)
dat$time <- as.factor(dat$time)

nclass <- 2
class <- list()
for (cl in 1:nclass) {
  class[[cl]] <- mxModel(
    paste0("class", cl),
    type='RAM',
    mxModel(paste0('time', cl), type="RAM",
            latentVars=c('time'),
            mxData(data.frame(time=unique(dat$time)), 'raw', primaryKey='time'),
            mxPath('time', arrows=2, values=1)),
    manifestVars=c('DEP'),
    mxData(dat[,c("DEP","time")], 'raw'),
    mxPath('one', 'DEP'),
    mxPath('DEP', arrows=2, values=1),
    mxPath(paste0('time', cl, '.time'), 'DEP', values=1,
           free=FALSE, joinKey='time'),
    mxFitFunctionML(vector=TRUE)
  )
}
hmm <- mxModel(
  "hmm", class,
  mxData(dat[,c("DEP","time")], 'raw'),
  mxMatrix(nrow=length(class), ncol=1, free=c(FALSE,TRUE),
           lbound=0.001, ubound=.999, values=1,
           labels=paste0('i',1:length(class)),
           name="initial"),
  mxMatrix(nrow=length(class), ncol=length(class),
           labels=paste0('t', 1:(length(class) * length(class))),
           name="transition"),
  mxExpectationHiddenMarkov(
    components=sapply(class, function(m) m$name),
    initial="initial",
    transition="transition", scale="softmax"),
  mxFitFunctionML())

hmm$transition$free[1:(length(class)-1), 1:length(class)] <- TRUE
hmmFit <- mxRun(hmm)
summary(hmmFit)
Thanks for all the help!
multilevel
In reply to multilevel by jpritikin
HMM with multiple subjects and multiple time points
I was able to make the random intercept model work with the wide data format:
library(tidyr)
library(OpenMx)

dat <- rio::import("schizophrenia_markov.sav")
head(dat)

###### random intercept SEM structure
dat$time <- as.factor(dat$time)
dat2 <- spread(dat[,c("ID","DEP","time")], key="time", value="DEP")
head(dat2)
colnames(dat2) <- c("ID", paste0("t",0:6))
dat3 <- data.frame(dat2[,2:8])
dataRaw <- mxData(observed=dat3, type="raw")
vars <- colnames(dat3)
vars
nclass <- 2
class <- list()
for (cl in 1:nclass) {
  class[[cl]] <- mxModel(
    paste0("class", cl),
    type="RAM",
    latentVars=c("int"),
    manifestVars=vars,
    dataRaw,
    mxPath(from="int", arrows=2,
           free=TRUE, values=1),
    mxPath(from="one", to="int", arrows=1,
           free=TRUE, values=1),
    mxPath(from="int", to=vars, arrows=1,
           free=FALSE, values=1),
    mxPath(from=vars, arrows=2,
           free=TRUE, values=rep(0, length(vars)),
           labels=paste0("e", cl),
           lbound=0.001),
    mxPath(from="one", to=vars, arrows=1,
           free=FALSE, values=rep(0, length(vars)),
           labels=paste0("mean", cl, 1:length(vars))),
    mxFitFunctionML(vector=TRUE)
  )
}
hmm <- mxModel(
  "hmm", class,
  dataRaw,
  mxMatrix(nrow=length(class), ncol=1, free=c(FALSE,TRUE),
           lbound=0.0001, ubound=.9999, values=1,
           labels=paste0('i',1:length(class)),
           name="initial"),
  mxMatrix(nrow=length(class), ncol=length(class),
           labels=paste0('t', 1:(length(class) * length(class))),
           name="transition"),
  mxExpectationHiddenMarkov(
    components=sapply(class, function(m) m$name),
    initial="initial",
    transition="transition", scale="softmax"),
  mxFitFunctionML())

hmm$transition$free[1:(length(class)-1), 1:length(class)] <- TRUE
hmmFit <- mxRun(hmm)
hmmFit <- mxTryHard(hmm, extraTries=50, maxMajorIter=5000, exhaustive=TRUE)
summary(hmmFit)
hmmFit$expectation$output
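As an aside, `tidyr::spread()` is superseded by `tidyr::pivot_wider()`, and the same long-to-wide step can be done in base R with `stats::reshape()`. A sketch with a small made-up data frame, since `schizophrenia_markov.sav` is not included here:

```r
# synthetic stand-in for the long-format data (ID, time, DEP)
dat_long <- data.frame(ID   = rep(1:3, each = 3),
                       time = rep(0:2, times = 3),
                       DEP  = rnorm(9))

# base-R equivalent of spread(dat_long, key = "time", value = "DEP")
dat_wide <- reshape(dat_long, idvar = "ID", timevar = "time",
                    direction = "wide")
colnames(dat_wide) <- c("ID", paste0("t", 0:2))
head(dat_wide)  # one row per ID, columns t0..t2
```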
But this is still not the model I am trying to replicate with the HMM. The model I want has multiple subjects with multiple time points, where "initial" gives the proportions in each latent class at the first time point, and "transition" gives the transitions between latent classes over time.
So, as another option I attempted to use the long-format data and add the time variable as a second-level predictor of the initial and transition probabilities. But this didn't work, as I could only add the time predictor to the initial states from the data, rather than from the second-level structural model (similar to this [post](https://github.com/OpenMx/OpenMx/blob/3cb6593af0754be134e107cf4f42315d8c18db2b/inst/models/passing/xxm-3.R)):
dat$time <- as.factor(dat$time)

nclass <- 2
class <- list()
for (cl in 1:nclass) {
  class[[cl]] <- mxModel(
    paste0("class", cl),
    type='RAM',
    manifestVars=c('DEP'),
    mxData(dat[,c("DEP","time")], 'raw'),
    mxPath('one', 'DEP'),
    mxPath('DEP', arrows=2, values=1),
    #mxPath(paste0('time',cl,'.time'), 'DEP', values=1, free=FALSE, joinKey='time'),
    mxFitFunctionML(vector=TRUE)
  )
}

hmm <- mxModel(
  "hmm", class,
  ## MLM model structure
  mxModel('time', type="RAM",
          latentVars=c('time'),
          mxData(data.frame(time=unique(as.factor(dat$time))), 'raw',
                 primaryKey='time'),
          mxPath('time', arrows=2, values=1)),
  mxData(dat[,c("DEP","time")], 'raw'),
  mxMatrix(type="Full", nrow=2, ncol=2,
           free=c(TRUE, FALSE, TRUE, TRUE), values=1,
           labels=c("p11","p21","p12","p22"),
           name="initialM"),
  mxMatrix(nrow=2, ncol=1, labels=c(NA, "data.time"), values=1,
           name="initialV"),  #, joinKey='time')
  mxAlgebra(initialM %*% initialV, name="initial"),
  mxMatrix(nrow=length(class), ncol=length(class),
           labels=paste0('t', 1:(length(class) * length(class))),
           name="transitionM"),
  mxMatrix(nrow=2, ncol=2,
           labels=c("data.time", "data.time", "data.time", "data.time"),
           values=1, name="transitionV"),  #, joinKey='time')
  mxAlgebra(transitionM %*% transitionV, name="transition"),
  #mxPath('time.time', 'initial', free=FALSE, values=1, joinKey="time"),
  mxExpectationHiddenMarkov(
    components=sapply(class, function(m) m$name),
    initial="initial",
    transition="transition", scale="softmax"),
  mxFitFunctionML())

hmm$transitionM$free[1:(length(class)-1), 1:length(class)] <- TRUE
hmmFit <- mxTryHard(hmm, extraTries=50, maxMajorIter=5000, exhaustive=TRUE)
summary(hmmFit)
hmmFit$expectation$output
From the **schizophrenia_markov.sav** data, the results I am trying to replicate are:
Initial states

             1        2
Class[=0]    0.9919   0.0081

Transitions

             Class
Class[-1]    1        2
1            0.8351   0.1649
2            0.0013   0.9987
I am not sure how to set up **mxExpectationHiddenMarkov()** correctly here.
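One thing that may help with interpretation: with scale="softmax", the free cells of the transition matrix are on an unconstrained (multinomial-logit) scale, and probabilities are recovered by a softmax within each column, with the fixed cell acting as the reference. A base-R sketch of that mapping; the raw values below are made up, chosen only so the result lands near the target probabilities above, and whether OpenMx normalizes by column or by row should be verified against its documentation:

```r
# column-wise softmax: maps unconstrained reals to columns that sum to 1
softmax_cols <- function(m) apply(m, 2, function(x) exp(x) / sum(exp(x)))

raw <- matrix(c(1.62,  0,    # column 1: free cell, fixed reference at 0
                -6.65, 0),   # column 2: free cell, fixed reference at 0
              nrow = 2)
round(softmax_cols(raw), 4)  # columns near (0.835, 0.165) and (0.0013, 0.9987)
```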
Appreciate the help
Similar to this post?
In reply to Similar to this post? by AdminNeale
Right link, wrong wording
I am trying to apply the multilevel structure in the HMM for estimation with multiple subjects and multiple time points.
R package in progress
I am working with him on improving and adding more features to this package, including tutorial vignettes.
But I haven't been able to work out a Hidden Markov Model where I have multiple items over time and need to estimate the transitions between latent states.
I have tried:
- random intercept model
- transpose the data
- CFA over time
- growth curve
But none of these have matched the expected basic results for an HMM. Would anyone here have an idea of how to implement this?
Once I have a working example we can translate it into the tidySEM package.
Thank you
See example script
One thing to be careful about: free parameters that share a label are constrained to be equal, while unlabeled free parameters are all distinct. If you leave parameters unlabeled by accident, you can end up estimating a separate free parameter for every row or subject, which can add up to quite a large number of parameters.
Good luck!