Posterior Class Probabilities and Entropy: Example
Attachment | Size |
---|---|
gmm.R | 6.77 KB |
The first function calculates posterior class probabilities for a mixture model based on the following equation:
P(class=k|i, p_k, L_k) = p_k * L_ik / sum_k=1^m (p_k * L_ik)
where classes 1 through M are indexed by k, persons are indexed by i, p_k is the model estimated probability of being in class k and L_k is a vector of person specific likelihood for class k. Posterior class probabilities are given by Bayes theorum, with the equation above coming from Ramaswamy, Desarbo, Reibstein, and Robinson, 1993. In words, the individual class probabilities are the ratio of that person's raw likelihood for any one class divided by the sum of their likelihoods across all classes, each weighted by the model's estimated class probabilities for the sample.
The function below takes two arguments: an MxModel that was used to fit a mixture model, and the name of the matrix or algebra in that model that defines the class probabilities. It is assumed that the submodels in the provided model are the models for each of the classes that use FIML or RowObjective and return a vector of likelihoods. For instance, if you fit a mixture model with two classes, your structure should be that some object (''gmm'') that contains a model ("GrowthMixtureModel") with two submodels ("class1" and "class2") representing the models for each class. There should be a matrix in the parent model ("GrowthMixtureModel") that contains class probabilities.
indClassProbs <- function(model, classProbs, round=NA){
# this function takes a mixture model in OpenMx
# and returns the posterior class probabilities
# using Bayes rule, individual person-class likelihoods
# and the model class probability matrix, as described in
# Ramaswamy, Desarbo, Reibstein, and Robinson, 1993
cp <- mxEval(classProbs, model)
cp2 <- as.vector(cp)
cps <- diag(length(cp2))
diag(cps) <- cp2
subs <- model@submodels
if(min(dim(cp))!=1)stop("Class probabilities matrix must be a row or column vector.")
if(max(dim(cp))==1)stop("Class probabilities matrix must contain two or more classes.")
of <- function(num){
return(mxEval(objective, subs[[num]]))
}
rl <- sapply(1:length(names(subs)), of)
raw <- (rl%*%cps)
tot <- 1/apply(raw, 1, sum)
div <- matrix(rep(tot, length(cp2)), ncol=length(cp2))
icp <- raw * div
if (is.numeric(round)){icp <- round(icp, round)}
return(icp)
}
The output of this function is a persons x classes matrix of posterior class probabilities.
Next is a function for entropy (again, formula cribbed from Ramaswamy, Desarbo, Reibstein, and Robinson, 1993), which takes a persons x classes matrix of posterior class probabilities and calculates entropy. That equation is:
1-sum_i=1^n sum_k=1^M (-P_ik * ln(P_ik))/(n*ln(k))
And the code:
entropy <- function(classProbs){
# this function takes a matrix of class probabilities from a
# mixture model with people in rows and classes in columns
# and returns the entropy statistic, as given by
# Ramaswamy, Desarbo, Reibstein, and Robinson 1993
n <- dim(classProbs)[1]
k <- dim(classProbs)[2]
e <- 1-sum(-classProbs*log(classProbs))/(n*log(k))
return(e)
}
Entropy is scaled between zero and one, and I don't believe there are consistent criterion for what constitutes "good" entropy, but maybe some users can chime in.
I have questions to prompt discussion:
-Does everything look correct?
-What other mixture-specific stats do we want?
-Is the assumed structure for mixture models tenable?
-Is there a better way to make math in forum posts? The equations above are pretty unreadable.
Thanks for this contribution.
A small remark: variable t1 in line 140 is uninitialized.
Log in or register to post comments
In reply to Thanks for this contribution. by lands101
Thanks! t1 should be icp[,1].
Log in or register to post comments
In reply to Thanks! t1 should be icp[,1]. by Ryne
update gmm.R?
Here is the error message I now get:
> indClassProbs(gmm.fit)
Error in mxEval(classProbs, model) :
'expression' argument is mandatory in call to mxEval function
Log in or register to post comments
In reply to update gmm.R? by paologhisletta
Oops. There was insufficient
indClassProbs <- function(model, classProbs, round=NA){
# this function takes a mixture model in OpenMx
# and returns the posterior class probabilities
# using Bayes rule, individual person-class likelihoods
# and the model class probability matrix, as described in
# Ramaswamy, Desarbo, Reibstein, and Robinson, 1993
if (missing(model) || !(isS4(model) && is(model, "MxModel"))) {
stop("'model' argument must be an MxModel object")
}
if (missing(classProbs) || !(is.character(classProbs) && length(classProbs) == 1)) {
stop("'classProbs' argument must be a character string")
}
cp <- eval(substitute(mxEval(x, model), list(x = as.symbol(classProbs))))
cp2 <- as.vector(cp)
cps <- diag(length(cp2))
diag(cps) <- cp2
subs <- model@submodels
if(min(dim(cp))!=1)stop("Class probabilities matrix must be a row or column vector.")
if(max(dim(cp))==1)stop("Class probabilities matrix must contain two or more classes.")
of <- function(num) {
return(mxEval(objective, subs[[num]]))
}
rl <- sapply(1:length(names(subs)), of)
raw <- (rl %*% cps)
tot <- 1 / apply(raw, 1, sum)
div <- matrix(rep(tot, length(cp2)), ncol=length(cp2))
icp <- raw * div
if (is.numeric(round)){icp <- round(icp, round)}
return(icp)
}
Log in or register to post comments
NIDA workshop to
http://openmx.psyc.virginia.edu/workshops
Log in or register to post comments
LaTeX online compiler
This image is generated by the following link:
http://latex.codecogs.com/gif.latex?P(class=k|i,%20p_k,%20L_k)%20=%20\frac{%20\left%20(%20p_k%20\cdot%20L_{ik}%20\right%20)}{\sum_{j=1}^m%20(p_j%20\cdot%20L_{ij})
The drawbacks with this method are, that this image is now rendered every time someone sees this post and whenever they decide to stop the service, all image links will be broken. So it might be more helpful to generate the picture, download it, and attach it to the post. Nevertheless, inline LaTeX in the forum would be awesome!
Log in or register to post comments