You are here

Poisson Counts

8 posts / 0 new
Last post
rabil's picture
Offline
Joined: 01/14/2010 - 16:47
Poisson Counts

I've asked about this before but I don't think I ever received any replies. Can OpenMx handle Poisson counts as observed variables? I have counts that are too small to treat as continuous Normally distributed outcomes and I need a way to model them. Any insights would be greatly appreciated.

salarim1's picture
salarim1 (not verified)
poisson gmm

hi
I want to add poisson likelihood function in growth mixture model for dependent variable.I need to model count variable.
Any insights would be greatly appreciated.

RobK's picture
Offline
Joined: 04/19/2011 - 21:00
Yes, but it's not built-in

Yes, OpenMx can utilize the Poisson likelihood function, but it is not built-in. You will have to set up your model using an algebra objective or an R objective. See my examples below. Note that if your count variable takes relatively few unique values in your sample, you could consider ordinalizing it and doing the analysis as an ordinal thresholds model instead.

Let's first simulate some data for some simple examples:

dat <- cbind( rpois(1000,1), 1, rnorm(1000) )
dimnames(dat) <- list(NULL,c("y","x0","x1"))

If we want to do a simple Poisson regression, setting up the model to use a row-algebra objective would look something like this:

PoisAlgModel <- mxModel(
  "PoissonAlgebra",
  mxData(observed=dat,type="raw"),
  mxMatrix(type="Full",nrow=2,ncol=1,free=T,values=c(1,0),labels=c("b0","b1"),name="b"),
  mxMatrix(type="Full",nrow=1,ncol=2,free=F,labels=c("data.x0","data.x1"),name="xt"),
  mxAlgebra( exp(xt%*%b), name="yhat", dimnames=list(NULL,"y")),
  mxAlgebra( -2*filteredDataRow*log(yhat) + 2*yhat, name="rowAlgebra" ),
  mxAlgebra( sum(rowResults), name="reduceAlgebra" ),
  mxRowObjective(rowAlgebra="rowAlgebra",reduceAlgebra="reduceAlgebra",dimnames=c("y"))
)
PoisAlgFit <- mxRun(PoisAlgModel)

If we want to do the same analysis, but with an R objective, the code would look something like this:

Poisobj <- function(model,state){
  modeldata <- model$data@observed
  b0 <- model$b@values[1,1]
  b1 <- model$b@values[2,1]
  return(-2*sum(apply(modeldata,1,function(row){dpois(x=row[1],lambda=exp(b0*row[2] + b1*row[3]),log=T)})))
}
PoisRModel <- mxModel(
  "PoissonAlgebra",
  mxData(observed=dat,type="raw"),
  mxMatrix(type="Full",nrow=2,ncol=1,free=T,values=c(1,0),labels=c("b0","b1"),name="b"),
  mxRObjective(Poisobj)
)
PoisRFit <- mxRun(PoisRModel)

Note that the two models' likelihood functions are not equal, but merely proportional. So, the parameter estimates should be the same (or very very close), but the value of the -2loglikelihood won't be. It sounds as though you have a longitudinal mixture analysis in mind, so naturally, your objective function will be way more complicated than the ones I demonstrated here, but I hope I've given enough to get you started. I might be able to help you more if you say more about the analysis you want to do.

Finally, I feel I certainly should warn you about the problem of overdispersion, in case you don't know. If your count variable's variance appreciably exceeds its mean, even after conditioning on covariates, you should not trust the standard errors, confidence intervals, and likelihood-ratio tests from models using the classic Poisson likelihood. You would need to adjust those inferential results appropriately, or use a different objective function. The point estimates of your regression coefficients won't be biased by the overdispersion, though.

sinytsang's picture
Offline
Joined: 11/08/2014 - 20:00
How about zero-inflated Poisson?

I have poisson count data that is zero-inflated. I was wondering if OpenMx can handle that as well?

Thanks!

RobK's picture
Offline
Joined: 04/19/2011 - 21:00
Sure.

No reason why not. The zero-inflated Poisson distribution simply has a slightly different PMF. Zero-inflated Poisson is a mixture distribution, but since it's a discrete mixture, it's pretty easy to work with.

Let's first simulate some data for some simple examples:

u <- runif(1000,0,1)
v <- rpois(1000,1)
y <- ifelse(u<=0.4,0,v)
dat <- cbind( y, 1, rnorm(1000) )
dimnames(dat) <- list(NULL,c("y","x0","x1"))

In the upcoming OpenMx 2.0, it will be practical to do this analysis with a row algebra fitfunction. I'll assume you're still using 1.3.2/1.4, so setting up the model with an R objective would look like this:

ZIPoisobj <- function(model,state){
  modeldata <- model$data@observed
  b0 <- model$b@values[1,1]
  b1 <- model$b@values[2,1]
  gam0 <- model$gamma@values[1,1]
  gam1 <- model$gamma@values[2,1]
  lambda <- exp(b0*modeldata[,2] + b1*modeldata[,3])
  w <- exp(gam0*modeldata[,2] + gam1*modeldata[,3]) / (1+exp(gam0*modeldata[,2] + gam1*modeldata[,3]))
  iszero <- modeldata[,1]==0
  return(-2*sum(log(iszero*w + (1-w)*dpois(modeldata[,1],lambda))))
}
ZIPoisModel <- mxModel(
  "ZIPois",
  mxData(observed=dat,type="raw"),
  mxMatrix(type="Full",nrow=2,ncol=1,free=T,values=c(1,0),labels=c("b0","b1"),name="b"),
  mxMatrix(type="Full",nrow=2,ncol=1,free=T,values=c(1,0),labels=c("gam0","gam1"),name="gamma"),
  mxRObjective(ZIPoisobj)
)
ZIPoisRun <- mxRun(ZIPoisModel)

Note that the distribution has two parameters: the parameter of the Poisson mixture component (lambda) and the zero-inflation proportion (w). Both can be conditioned upon covariates, as I've done here, using a logit link for the zero-inflation proportion. In general, you can use different sets of covariates for the two parameters.

Edit: I might be able to be more helpful if you mentioned a few more details about your dataset and the analysis you're trying to do.

AdminRobK's picture
Offline
Joined: 01/24/2014 - 12:15
Some things germane to the thread topic

Currently, we plan to implement built-in support for count distributions in a future version of OpenMx, made available via the UI for IFA models.

There is a demo in the OpenMx test suite that shows how to do a generalized estimating equations (GEE) analysis of repeated measures on an overdispersed count variable. It includes a loop to calculate robust standard errors from the "sandwich estimator" of the regression coefficients' repeated-sampling covariance matrix. However, the specification in OpenMx is rather cumbersome, and it will usually be easier to do GEE in software designed specifically for it (e.g., the R package 'gee').

Also, I have a recent paper on the use of multivariate discrete distributions to estimate biometric variance components from overdispersed count data collected in a twin study.

bwiernik's picture
Offline
Joined: 01/30/2014 - 19:39
Has built-in support for

Has built-in support for count distributions been implemented yet? I'd like to do some IFA analyses of count variables (i.e., https://doi.org/10.3102/1076998610375838), and I'm hoping I might be able to avoid writing my own fit function.

AdminRobK's picture
Offline
Joined: 01/24/2014 - 12:15
not IFA yet; ordinal-threshold hack

OpenMx's IFA module does not yet have support for count distributions. However, the density and cumulative-probability functions for the Poisson and negative binomial are usable in MxAlgebras. I have attached a script I wrote back in March that fits an ACE twin model to two normally distributed phenotypes and one count phenotype. The count phenotype follows a negative-binomial distribution, but is presumed to reflect a latent, standard-normal "liability" trait. The count variable is actually treated as an ordinal-threshold variable, with one threshold for each possible count within the range of counts observed in the sample. But, the thresholds are not themselves free parameters, but instead are functions of the two parameters of the negative-binomial distribution. The negative-binomial cumulative-probability function and the standard-normal quantile function are used to map each distinct count to a corresponding standard-normal quantile.

File attachments: