Posterior Class Probabilities and Entropy: Example

Posted on Wed, 10/20/2010 - 14:10

Ryne Joined: 07/31/2009

Attachment

Size

gmm.R

6.77 KB

Forums

Mixture Distribution and Latent Class Models

Here's an example growth mixture model from the NIDA workshop this week. I've copied the two functions I wrote and placed them in the text of this post for further discussion. Everything else in the code has been added to the GMM demo and documentation pages for the next binary release.

The first function calculates posterior class probabilities for a mixture model based on the following equation:

P(class=k|i, p_k, L_k) = p_k * L_ik / sum_k=1^m (p_k * L_ik)

where classes 1 through M are indexed by k, persons are indexed by i, p_k is the model estimated probability of being in class k and L_k is a vector of person specific likelihood for class k. Posterior class probabilities are given by Bayes theorum, with the equation above coming from Ramaswamy, Desarbo, Reibstein, and Robinson, 1993. In words, the individual class probabilities are the ratio of that person's raw likelihood for any one class divided by the sum of their likelihoods across all classes, each weighted by the model's estimated class probabilities for the sample.

The function below takes two arguments: an MxModel that was used to fit a mixture model, and the name of the matrix or algebra in that model that defines the class probabilities. It is assumed that the submodels in the provided model are the models for each of the classes that use FIML or RowObjective and return a vector of likelihoods. For instance, if you fit a mixture model with two classes, your structure should be that some object (''gmm'') that contains a model ("GrowthMixtureModel") with two submodels ("class1" and "class2") representing the models for each class. There should be a matrix in the parent model ("GrowthMixtureModel") that contains class probabilities.

indClassProbs <- function(model, classProbs, round=NA){ # this function takes a mixture model in OpenMx # and returns the posterior class probabilities # using Bayes rule, individual person-class likelihoods # and the model class probability matrix, as described in # Ramaswamy, Desarbo, Reibstein, and Robinson, 1993 cp <- mxEval(classProbs, model) cp2 <- as.vector(cp) cps <- diag(length(cp2)) diag(cps) <- cp2 subs <- model@submodels if(min(dim(cp))!=1)stop("Class probabilities matrix must be a row or column vector.") if(max(dim(cp))==1)stop("Class probabilities matrix must contain two or more classes.") of <- function(num){ return(mxEval(objective, subs[[num]])) } rl <- sapply(1:length(names(subs)), of) raw <- (rl%*%cps) tot <- 1/apply(raw, 1, sum) div <- matrix(rep(tot, length(cp2)), ncol=length(cp2)) icp <- raw * div if (is.numeric(round)){icp <- round(icp, round)} return(icp) }


The output of this function is a persons x classes matrix of posterior class probabilities.
Next is a function for entropy (again, formula cribbed from Ramaswamy, Desarbo, Reibstein, and Robinson, 1993), which takes a persons x classes matrix of posterior class probabilities and calculates entropy. That equation is:
1-sum_i=1^n sum_k=1^M (-P_ik * ln(P_ik))/(n*ln(k))
And the code:


entropy <- function(classProbs){

	# this function takes a matrix of class probabilities from a

	# mixture model with people in rows and classes in columns

	# and returns the entropy statistic, as given by

	# Ramaswamy, Desarbo, Reibstein, and Robinson 1993

	n <- dim(classProbs)[1]

	k <- dim(classProbs)[2]

	e <- 1-sum(-classProbs*log(classProbs))/(n*log(k))

	return(e)

	}

Entropy is scaled between zero and one, and I don't believe there are consistent criterion for what constitutes "good" entropy, but maybe some users can chime in.
I have questions to prompt discussion:

-Does everything look correct?

-What other mixture-specific stats do we want?

-Is the assumed structure for mixture models tenable?

-Is there a better way to make math in forum posts? The equations above are pretty unreadable.


      
  
    

    

  
    
      
        Replied on Mon, 02/14/2011 - 04:59
      
    
    
      
                      
                                              lands101 Joined: Oct 15, 2009
                            
    
    
      
                
                      
            Thanks for this contribution.
            
                    
            Thanks for this contribution. It works as advertised.
A small remark: variable t1 in line 140 is uninitialized.

      

    
Log in or register to post comments
        
      
    
  


  

    

  
    
      
        Replied on Mon, 02/14/2011 - 09:45
      
    
    
      
                      
                                              Ryne Joined: Jul 31, 2009
                            
    
    
      
                  In reply to Thanks for this contribution. by lands101
                
                      
            Thanks! t1 should be icp[,1].
            
                    
            Thanks! t1 should be icp[,1]. I'll replace the file.

      

    
Log in or register to post comments
        
      
    
  


  

    

  
    
      
        Replied on Thu, 03/29/2012 - 09:37
      
    
    
      
                      
                                              paologhisletta Joined: Jan 19, 2010
                            
    
    
      
                  In reply to Thanks! t1 should be icp[,1]. by Ryne
                
                      
            update gmm.R?
            
                    
            Does Ryne's function 'gmm.R' need updated? It used to work with older version of OpenMx, but it doesn't anymore with v. 1.2.2-1986.
Here is the error message I now get:

> indClassProbs(gmm.fit)

Error in mxEval(classProbs, model) :

  'expression' argument is mandatory in call to mxEval function

      

    
Log in or register to post comments
        
      
    
  


  

    

  
    
      
        Replied on Thu, 03/29/2012 - 10:28
      
    
    
      
                      
                                              mspiegel Joined: Jul 31, 2009
                            
    
    
      
                  In reply to update gmm.R? by paologhisletta
                
                      
            Oops. There was insufficient
            
                    
            Oops. There was insufficient magic to the internal mxEval() call. Don't forget that the function indClassProbs requires two arguments, and the second argument is the name of the classProbs MxMatrix or MxAlgebra in your model. Here's an updated version of the script:
indClassProbs <- function(model, classProbs, round=NA){

	# this function takes a mixture model in OpenMx

	# and returns the posterior class probabilities

	# using Bayes rule, individual person-class likelihoods

	# and the model class probability matrix, as described in

	# Ramaswamy, Desarbo, Reibstein, and Robinson, 1993

	if (missing(model) || !(isS4(model) && is(model, "MxModel"))) {

		stop("'model' argument must be an MxModel object")

	}

	if (missing(classProbs) || !(is.character(classProbs) && length(classProbs) == 1)) {

		stop("'classProbs' argument must be a character string")

	}

	cp <- eval(substitute(mxEval(x, model), list(x = as.symbol(classProbs))))

	cp2 <- as.vector(cp)

	cps <- diag(length(cp2))

	diag(cps) <- cp2

	subs <- model@submodels

	if(min(dim(cp))!=1)stop("Class probabilities matrix must be a row or column vector.")

	if(max(dim(cp))==1)stop("Class probabilities matrix must contain two or more classes.")

	of <- function(num) {

		return(mxEval(objective, subs[[num]]))

	}

	rl <- sapply(1:length(names(subs)), of)

	raw <- (rl %*% cps)

	tot <- 1 / apply(raw, 1, sum)

	div <- matrix(rep(tot, length(cp2)), ncol=length(cp2))

	icp <- raw * div

	if (is.numeric(round)){icp <- round(icp, round)}

	return(icp)

}

      

    
Log in or register to post comments
        
      
    
  

  

    

  
    
      
        Replied on Mon, 02/14/2011 - 09:58
      
    
    
      
                      
                                              tbates Joined: Jul 31, 2009
                            
    
    
      
                
                      
            NIDA workshop to
            
                    
            NIDA workshop to here?

http://openmx.psyc.virginia.edu/workshops

      

    
Log in or register to post comments
        
      
    
  

  

    

  
    
      
        Replied on Thu, 03/29/2012 - 11:39
      
    
    
      
                      
                                              brandmaier Joined: Feb 04, 2010
                            
    
    
      
                
                      
            LaTeX online compiler
            
                    
            Answering the last question: Maybe your forum software supports inline LaTex? If not, we can use a LaTeX generator like http://www.codecogs.com/latex/eqneditor.php to generate formulas and attach them as a picture. For example, I can simply add the formula as linked image (I allowed to alter the formula a little bit to avoid the double usage of 'k'):

This image is generated by the following link:

http://latex.codecogs.com/gif.latex?P(class=k|i,%20p_k,%20L_k)%20=%20\frac{%20\left%20(%20p_k%20\cdot%20L_{ik}%20\right%20)}{\sum_{j=1}^m%20(p_j%20\cdot%20L_{ij})
The drawbacks with this method are, that this image is now rendered every time someone sees this post and whenever they decide to stop the service, all image links will be broken. So it might be more helpful to generate the picture, download it, and attach it to the post. Nevertheless, inline LaTeX in the forum would be awesome!

      

    
Log in or register to post comments
        
      
    
  



  

  
    
    
Log in or register to post comments


             
       
            
    
  
    
      
  
      Search
        

        

    





  

  
    
      
              Login

      
  

  
       News
    
      
  
    
      
      
      
  
  

          OpenMx 2.22 released!
          OpenMx 2.21.13 released!
          Problem with Satorra-Bentler Chi-Squared in WLS
    
  



    
  
          


  

  
       Recent Posts
    
      Interpreting bivariate ACE output
Bivariate model — non-linear association
This forum is about the metaSEM package for meta-analysis
plotting the mediation model
Cluster-robust SEs in State Space Model
More