Zero-inflated Poisson counts using mxThreshold

bwiernik
Joined: 01/30/2014 - 19:39

I'm working with a latent variable model and I'd like to predict a negative binomial count response variable. I found the clever kludge that @AdminRobK previously shared, which uses thresholds fixed to the parametric quantiles of a negative binomial distribution (#7856; 180313.R). I think this approach will work well for me. The issue I've got is that my dependent variable is clearly zero-inflated. I've also got some minor differences in the number of observations per participant that I'd like to incorporate as an offset.
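For readers who have not seen the threshold approach, here is a minimal base-R sketch of the general idea: cumulative negative binomial probabilities are mapped onto fixed thresholds of a standard-normal liability. It is not the code from 180313.R; the variable names and the values (a small `max_count` of 10, `size = 4`, `mu = 1`) are assumptions chosen only for illustration.

```r
# Illustrative values only (not taken from the thread's data)
max_count <- 10   # counts at or above this are pooled into the top category
size <- 4         # negative binomial dispersion, as in ?pnbinom
mu <- 1           # negative binomial mean, as in ?pnbinom

# Cumulative probability of a count <= k, for k = 0 .. max_count - 1
cum_prob <- pnbinom(0:(max_count - 1), size = size, mu = mu)

# Fixed thresholds on the latent standard-normal scale: count k is observed
# when the liability falls between threshold k and threshold k + 1
thresholds <- qnorm(cum_prob)
thresholds
```

With these thresholds held fixed, predictors of the latent liability shift the whole count distribution, which is what makes the approach usable inside a latent variable model.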

I'm not exactly sure how to marry a zero-inflation model with the threshold parameterization of the negative binomial that Rob used there. Can anyone offer a suggestion?

For the offset, with this parameterization, would it be correct to just add the log(number of observations) as a predictor in my model with a fixed coefficient of 1.0?

Thanks!

AdminRobK
Joined: 01/24/2014 - 12:15
built-in functionality

Hi, Brenton! As of OpenMx v2.18, that kludge of mine has built-in support, and it includes zero-inflation. The relevant function is mxMarginalNegativeBinomial(), which has an argument for the zero-inflation proportion, zeroInf. This test script demonstrates its use (in the OpenMx source repository, it's tests/testthat/test-discrete.R).

Could you say a bit more about the offset to which you refer?

bwiernik
Joined: 01/30/2014 - 19:39

Awesome! That's great!

For the offset: I've got substance use days for participants, but the time periods for each participant vary somewhat--some participants have 25 days of data, some 30, etc. If I were fitting an observed variable Poisson model, I would use a linear predictor like this:

log(count_dv) = log(days_observed) + β0 + β1x1 + β2x2 + …

So log(days_observed) is an offset/variable with coefficient fixed to 1.0.
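As a point of comparison, the observed-variable version of this offset can be sketched with base R's glm(). The linear predictor is for the log of the expected count, so log(days_observed) enters with its coefficient fixed at 1 through offset(). The data below are simulated, and the generating values (-1 and 0.5) are assumptions made up for the example; this does not show how the offset behaves under the OpenMx threshold parameterization.

```r
set.seed(1)
n <- 500
days_observed <- sample(25:30, n, replace = TRUE)  # varying exposure
x1 <- rnorm(n)

# Expected use days per observed day, then counts scaled by exposure
rate <- exp(-1 + 0.5 * x1)
count_dv <- rpois(n, lambda = days_observed * rate)

# offset() adds log(days_observed) to the linear predictor with no
# estimated coefficient, i.e. a coefficient fixed at 1.0
fit <- glm(count_dv ~ x1 + offset(log(days_observed)), family = poisson)
coef(fit)
```

The estimated coefficients are then on the per-day rate scale rather than the raw count scale.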

AdminRobK
Joined: 01/24/2014 - 12:15
sounds reasonable
> For the offset: I've got substance use days for participants, but the time periods for each participant vary somewhat--some participants have 25 days of data, some 30, etc. If I were fitting an observed variable Poisson model, I would use a linear predictor like this:

> log(count_dv) = log(days_observed) + β0 + β1x1 + β2x2 + …

> So log(days_observed) is an offset/variable with coefficient fixed to 1.0.

OK, I think I get it. Sounds reasonable.

bwiernik
Joined: 01/30/2014 - 19:39

I want to be sure I'm interpreting these results correctly. Below, use_days_1_30_Use and use_days_1_30_Incar are specified with:

umxPath(v1m0 = criteria),
       mxMarginalNegativeBinomial(criteria,
                                  maxCount = 30, size = 4, mu = 1, zeroInf = .01),

The Estimates for the Delta → use_days paths are regression coefficients on the normal latent "liability" variable underlying the negative binomial indicators, correct?

Do the Discrete parameters 1, 2, 3 correspond to the zero-inflation, size/dispersion, and mu parameters, respectively? (Could those be given informative labels in the output?) Are size and prob/mu parameterized as in ?pnbinom (that is NB2)? And the zero-inflation parameter is the marginal probability of 0 parameter from the binomial model?

     matrix                 row                 col      Estimate   Std.Error
10        A   use_days_1_30_Use use_days_1_30_Incar -2.799097e-01 0.139894372
17        A   use_days_1_30_Use               Delta  4.135083e-01 0.211776268
18        A use_days_1_30_Incar               Delta  3.062733e-01 0.184349627
31        S               Delta               Delta  5.157468e-01 0.061836412
35 Discrete                   1   use_days_1_30_Use   0.000000000 0.212945574
36 Discrete                   2   use_days_1_30_Use   0.055341620 0.011971673
37 Discrete                   3   use_days_1_30_Use   3.577466689 1.244645467
38 Discrete                   1 use_days_1_30_Incar   0.772313030 0.027908882
39 Discrete                   2 use_days_1_30_Incar   1.225494822 0.392580227
40 Discrete                   3 use_days_1_30_Incar  44.754645867 9.818826942

Thanks!

AdminJosh
Joined: 12/12/2012 - 12:15
Discrete parameters

> Do the Discrete parameters 1, 2, 3 correspond to the zero-inflation, size/dispersion, and mu parameters, respectively? (Could those be given informative labels in the output?)

You can easily figure it out using code like this:

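library(OpenMx)
# Minimal RAM model with a single negative binomial indicator
m1 = mxModel("test", type="RAM", manifestVars = c('x1'),
        mxMarginalNegativeBinomial(
          "x1", maxCount = 30, size = 4, mu = 1, zeroInf = .01))
# See which row of the Discrete matrix holds each starting value
m1$Discrete$values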

Where the values show up in mxMatrix Discrete tells you the meaning of the parameter. You can add labels to the Discrete$labels matrix to get labelled output in the summary table.

> Are size and prob/mu parameterized as in ?pnbinom (that is NB2)?

You can use mxGetExpected(model, "thresholds") to check whether the proportions match what you expect.
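A rough base-R sketch of that kind of check follows. It assumes the expected thresholds are on a standard-normal scale, so that pnorm() turns them into cumulative proportions; the `thr` vector here is a stand-in computed directly from pnbinom() rather than actual mxGetExpected() output, and all names and values are illustrative. It also shows how R's own `mu` and `prob` arguments relate, which is the NB2 form documented in ?pnbinom.

```r
size <- 4
mu <- 1
k <- 0:9

# R's conversion between the mean and probability parameterizations
prob <- size / (size + mu)
same_pmf <- all.equal(dnbinom(k, size = size, mu = mu),
                      dnbinom(k, size = size, prob = prob))

# Stand-in for one column of expected thresholds (assumption: in practice
# this would be taken from the fitted model instead)
thr <- qnorm(pnbinom(k, size = size, mu = mu))

# Thresholds -> cumulative proportions -> proportion at each count 0..9
implied <- diff(c(0, pnorm(thr)))
expected <- dnbinom(k, size = size, mu = mu)
max_gap <- max(abs(implied - expected))

same_pmf
max_gap
```

If the proportions implied by a fitted model's thresholds line up with dnbinom() at the estimated size and mu, that supports reading the estimates in the ?pnbinom parameterization.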

AdminJosh
Joined: 12/12/2012 - 12:15
zero-inflation

> And the zero-inflation parameter is the marginal probability of 0 parameter from the binomial model?

Actually I think it is the extra probability of zero beyond what the binomial model would predict.
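To make that distinction concrete, here is a small base-R sketch of the usual zero-inflated mixture, in which the zero-inflation proportion is an extra point mass at zero added on top of the zeros the count distribution already produces. Whether OpenMx defines zeroInf exactly this way is an assumption; `dzinb` and `zero_inf` are hypothetical names, and the values are illustrative.

```r
# Zero-inflated negative binomial probability mass function (sketch).
# zero_inf is the extra probability of a structural zero.
dzinb <- function(k, size, mu, zero_inf) {
  nb <- dnbinom(k, size = size, mu = mu)
  ifelse(k == 0,
         zero_inf + (1 - zero_inf) * nb,  # structural zeros plus NB zeros
         (1 - zero_inf) * nb)             # positive counts are scaled down
}

size <- 4
mu <- 1
zero_inf <- 0.2

p0_nb <- dnbinom(0, size = size, mu = mu)   # zeros from the NB alone
p0_zinb <- dzinb(0, size, mu, zero_inf)     # marginal probability of zero
c(nb_only = p0_nb, zero_inflated = p0_zinb)
```

Under this form the marginal probability of a zero is larger than zero_inf itself, and setting zero_inf to 0 recovers the plain negative binomial.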

bwiernik
Joined: 01/30/2014 - 19:39

Also, is it possible to disable the zero-inflation parameter?

AdminJosh
Joined: 12/12/2012 - 12:15
disable the zero-inflation parameter

Yes, just fix it at 0 or whatever value you prefer.

bwiernik
Joined: 01/30/2014 - 19:39

Ah, of course.