Negative Binomial Models
Negative binomial regression is used to estimate count models when Poisson estimation is inappropriate because of overdispersion (which is most of the time). In a Poisson distribution the mean and variance are equal. When the variance is greater than the mean, the distribution is said to display overdispersion. The nbreg command estimates an ancillary parameter α, which measures the degree of overdispersion. For computational purposes, Stata estimates lnα, which is then converted to α. When α is zero, the negative binomial distribution reduces to the Poisson. The larger α is, the greater the amount of overdispersion in the data.
When there is overdispersion, the Poisson estimates are inefficient and their standard errors are biased downward, yielding spuriously large z-values.
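An informal way to see whether overdispersion is a concern is to compare the sample mean and variance of the count outcome before fitting any model. The sketch below assumes the lahigh data and the daysabs outcome used in the example later in this section. It is only an unconditional check, not a formal test.

```stata
* Informal overdispersion check: compare the mean and variance of the count.
* Assumes the lahigh dataset and the outcome daysabs used in the example below.
use http://www.gseis.ucla.edu/courses/data/lahigh, clear
summarize daysabs
* summarize leaves the mean in r(mean) and the variance in r(Var)
display "variance/mean ratio = " r(Var)/r(mean)
```

A ratio well above 1 suggests that the Poisson assumption of equal mean and variance is doubtful for these data.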
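The negative binomial distribution, in the mean-dispersion form that nbreg uses by default, is given by

$$\Pr(Y = y \mid \mu, \alpha) = \frac{\Gamma(y + \alpha^{-1})}{\Gamma(y + 1)\,\Gamma(\alpha^{-1})} \left(\frac{\alpha^{-1}}{\alpha^{-1} + \mu}\right)^{\alpha^{-1}} \left(\frac{\mu}{\alpha^{-1} + \mu}\right)^{y}$$

where μ = exp(xb) is the expected count. The variance is μ + αμ², which exceeds the mean whenever α > 0.
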
Negative Binomial Example
We will continue with the lahigh dataset.
use http://www.gseis.ucla.edu/courses/data/lahigh
nbreg daysabs gender langnce
Negative binomial regression                      Number of obs   =        316
                                                  LR chi2(2)      =      20.63
                                                  Prob > chi2     =     0.0000
Log likelihood =  -880.9274                       Pseudo R2       =     0.0116
------------------------------------------------------------------------------
     daysabs |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      gender |  -.4312069   .1396913    -3.09   0.002    -.7049968   -.1574169
     langnce |  -.0156493   .0039485    -3.96   0.000    -.0233882   -.0079104
       _cons |    2.70344   .2292762    11.79   0.000     2.254067    3.152813
-------------+----------------------------------------------------------------
    /lnalpha |     .25394    .095509                      .0667457    .4411342
-------------+----------------------------------------------------------------
       alpha |   1.289094   .1231201                      1.069024    1.554469
------------------------------------------------------------------------------
Likelihood ratio test of alpha=0:  chibar2(01) = 1337.86 Prob>=chibar2 = 0.000
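Two pieces of the output above can be reproduced by hand: alpha is the exponential of /lnalpha, and the likelihood ratio test of alpha=0 compares the negative binomial log likelihood with that of a Poisson model fit to the same variables. The following is a minimal sketch, assuming the lahigh data are in memory. The scalar name ll_pois is only illustrative.

```stata
* alpha = exp(lnalpha); should agree with the alpha row (1.289094) up to rounding
display exp(.25394)

* LR statistic for alpha=0: twice the difference in log likelihoods.
* ll_pois is an illustrative scalar name holding the Poisson log likelihood.
quietly poisson daysabs gender langnce
scalar ll_pois = e(ll)
quietly nbreg daysabs gender langnce
* compare the result with chibar2(01) in the output above
display 2*(e(ll) - ll_pois)
```

The Poisson model is fit first so that the nbreg results remain the active estimates for the commands that follow.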
nbreg, irr
Negative binomial regression                      Number of obs   =        316
                                                  LR chi2(2)      =      20.63
                                                  Prob > chi2     =     0.0000
Log likelihood =  -880.9274                       Pseudo R2       =     0.0116
------------------------------------------------------------------------------
     daysabs |        IRR   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      gender |   .6497245   .0907609    -3.09   0.002     .4941101    .8543478
     langnce |   .9844725   .0038872    -3.96   0.000     .9768832    .9921208
-------------+----------------------------------------------------------------
    /lnalpha |     .25394    .095509                      .0667457    .4411342
-------------+----------------------------------------------------------------
       alpha |   1.289094   .1231201                      1.069024    1.554469
------------------------------------------------------------------------------
Likelihood ratio test of alpha=0:  chibar2(01) = 1337.86 Prob>=chibar2 = 0.000
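The incidence rate ratios are the exponentiated coefficients from the first nbreg table, so they can be checked with display. This is a small sketch. The last two lines assume the nbreg results are still the active estimates.

```stata
* IRR = exp(coefficient); values are copied from the coefficient table
display exp(-.4312069)
display exp(-.0156493)
* The same quantities from the stored coefficients after nbreg
display exp(_b[gender])
display exp(_b[langnce])
```

The results should match the IRR column (.6497245 and .9844725) up to rounding. An IRR of about .65 for gender means the expected number of days absent is about 35% lower when gender equals 1, holding langnce constant.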
listcoef
nbreg (N=316): Factor Change in Expected Count
Observed SD: 7.4490028
------------------------------------------------------------------
 daysabs |         b        z   P>|z|      e^b  e^bStdX      SDofX
---------+--------------------------------------------------------
  gender |  -0.43121   -3.087   0.002   0.6497   0.8058     0.5006
 langnce |  -0.01565   -3.963   0.000   0.9845   0.7552    17.9392
---------+--------------------------------------------------------
ln alpha |   0.25394    2.659
------------------------------------------------------------------
listcoef, percent
nbreg (N=316): Percentage Change in Expected Count
Observed SD: 7.4490028
----------------------------------------------------------------------
     daysabs |         b        z   P>|z|        %    %StdX      SDofX
-------------+--------------------------------------------------------
      gender |  -0.43121   -3.087   0.002    -35.0    -19.4     0.5006
     langnce |  -0.01565   -3.963   0.000     -1.6    -24.5    17.9392
-------------+--------------------------------------------------------
    ln alpha |   0.25394
       alpha |   1.28909   SE(alpha) = 0.12312
----------------------------------------------------------------------
LR test of alpha=0: 1337.86   Prob>=LRX2 = 0.000
----------------------------------------------------------------------
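The percentage columns reported by listcoef follow directly from the coefficients: the % column is 100*(exp(b) - 1), and the %StdX column applies the same formula to b multiplied by the standard deviation of x. The sketch below uses the rounded values printed in the table.

```stata
* Percent change in the expected count for a one-unit change in x
display 100*(exp(-0.43121) - 1)
display 100*(exp(-0.01565) - 1)
* Percent change for a one standard deviation change in x
display 100*(exp(-0.43121*0.5006) - 1)
display 100*(exp(-0.01565*17.9392) - 1)
```

After rounding to one decimal place these should reproduce -35.0, -1.6, -19.4 and -24.5.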
mfx compute
Marginal effects after nbreg
      y  = predicted number of events (predict)
         =  5.5280363
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
  gender*|  -2.389162      .79423   -3.01   0.003  -3.94582 -.832503   .487342
 langnce |  -.0865098      .02241   -3.86   0.000  -.130442 -.042578   50.0638
------------------------------------------------------------------------------
(*) dy/dx is for discrete change of dummy variable from 0 to 1
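For a continuous predictor in a model with an exponential mean, the marginal effect is the coefficient times the predicted count, b*exp(xb). The sketch below checks the langnce row using numbers from the output. The final line assumes Stata 11 or later, where margins supersedes mfx, and assumes the nbreg results are still active.

```stata
* Marginal effect of langnce at the means: coefficient * predicted count
display -.0156493*5.5280363

* Stata 11 and later: margins replaces mfx for this calculation
margins, dydx(langnce) atmeans
```

The display line should give about -.0865, matching the dy/dx entry for langnce. The gender row is a discrete change from 0 to 1 rather than a derivative, so it is not reproduced by this formula.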
prchange
nbreg: Changes in Predicted Rate for daysabs
         min->max      0->1     -+1/2    -+sd/2  MargEfct
 gender   -2.3892   -2.3892   -2.4022   -1.1957   -2.3837
langnce   -9.3413   -0.1879   -0.0865   -1.5570   -0.0865
exp(xb):   5.5280
         gender  langnce
    x=  .487342  50.0638
sd(x)=  .500633  17.9392
prtab gender
nbreg: Predicted rates for daysabs
----------------------
gender | Prediction
----------+-----------
female | 6.8208
male | 4.4316
----------------------
gender langnce
x= .48734177 50.063794
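The predicted rates in this table can be reproduced from the coefficients: each is exp(xb) with gender set to 0 or 1 and langnce held at its mean of 50.063794. The sketch below uses the values printed in the nbreg output.

```stata
* Predicted rate = exp(_cons + b_gender*gender + b_langnce*mean(langnce))
* gender = 0 (female)
display exp(2.70344 - .4312069*0 - .0156493*50.063794)
* gender = 1 (male)
display exp(2.70344 - .4312069*1 - .0156493*50.063794)
* Difference of the two tabled predictions: the 0->1 discrete change
display 4.4316 - 6.8208
```

The last line gives -2.3892, which is the discrete change for gender reported by both mfx and prchange.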
prtab langnce
nbreg: Predicted rates for daysabs
----------------------
ctbs lang |
nce | Prediction
----------+-----------
1.007114 | 11.9119
6.748048 | 10.8883
10.39049 | 10.2850
13.13055 | 9.8533
15.35938 | 9.5156
17.25647 | 9.2372
20.40919 | 8.7926
21.7637 | 8.6081
23.01052 | 8.4418
24.16932 | 8.2901
25.25478 | 8.1505
26.2782 | 8.0210
27.24847 | 7.9001
28.17271 | 7.7867
29.05672 | 7.6797
29.90528 | 7.5784
30.72241 | 7.4821
31.5115 | 7.3903
32.27546 | 7.3024
33.01677 | 7.2182
34.43988 | 7.0592
35.12527 | 6.9839
35.79525 | 6.9111
36.45115 | 6.8405
37.09416 | 6.7720
37.72536 | 6.7054
38.34572 | 6.6407
38.95612 | 6.5775
39.55739 | 6.5159
40.15026 | 6.4558
40.73543 | 6.3969
41.31353 | 6.3393
41.88515 | 6.2828
42.45086 | 6.2275
43.01117 | 6.1731
43.56657 | 6.1197
44.11754 | 6.0671
44.66451 | 6.0154
45.2079 | 5.9645
45.74812 | 5.9143
46.28556 | 5.8647
46.82059 | 5.8158
47.35357 | 5.7675
47.88486 | 5.7198
48.41482 | 5.6725
48.94376 | 5.6258
49.47205 | 5.5795
50 | 5.5336
50.52795 | 5.4880
51.05624 | 5.4428
51.58518 | 5.3980
52.11514 | 5.3534
52.64643 | 5.3091
53.17941 | 5.2650
53.71444 | 5.2211
54.25188 | 5.1773
54.7921 | 5.1338
55.33549 | 5.0903
55.88246 | 5.0469
56.43343 | 5.0036
56.98883 | 4.9603
57.54914 | 4.9170
58.11485 | 4.8736
58.68647 | 4.8302
59.26457 | 4.7867
59.84974 | 4.7431
60.44261 | 4.6993
61.04388 | 4.6553
62.27464 | 4.5665
63.54885 | 4.4763
64.20476 | 4.4306
64.87473 | 4.3844
65.56011 | 4.3376
66.26239 | 4.2902
66.98323 | 4.2421
67.72454 | 4.1932
68.48849 | 4.1433
69.27759 | 4.0925
70.09472 | 4.0405
70.94328 | 3.9872
71.82729 | 3.9324
73.72179 | 3.8175
74.74522 | 3.7569
78.2363 | 3.5571
79.59081 | 3.4825
81.08016 | 3.4023
82.74353 | 3.3149
84.64062 | 3.2179
86.86945 | 3.1076
89.60951 | 2.9772
93.25195 | 2.8122
98.99289 | 2.5706
----------------------
gender langnce
x= .48734177 50.063794
prtab langnce gender
nbreg: Predicted rates for daysabs
----------------------------
ctbs lang | gender
nce | female male
----------+-----------------
1.007114 | 14.6975 9.5493
6.748048 | 13.4347 8.7288
10.39049 | 12.6903 8.2452
13.13055 | 12.1576 7.8991
15.35938 | 11.7409 7.6283
17.25647 | 11.3974 7.4052
20.40919 | 10.8488 7.0487
21.7637 | 10.6212 6.9009
23.01052 | 10.4160 6.7675
24.16932 | 10.2288 6.6459
25.25478 | 10.0565 6.5340
26.2782 | 9.8967 6.4302
27.24847 | 9.7476 6.3333
28.17271 | 9.6076 6.2423
29.05672 | 9.4756 6.1565
29.90528 | 9.3506 6.0753
30.72241 | 9.2318 5.9981
31.5115 | 9.1185 5.9245
32.27546 | 9.0102 5.8541
33.01677 | 8.9062 5.7866
34.43988 | 8.7101 5.6592
35.12527 | 8.6172 5.5988
35.79525 | 8.5273 5.5404
36.45115 | 8.4402 5.4838
37.09416 | 8.3557 5.4289
37.72536 | 8.2736 5.3755
38.34572 | 8.1936 5.3236
38.95612 | 8.1157 5.2730
39.55739 | 8.0397 5.2236
40.15026 | 7.9655 5.1754
40.73543 | 7.8929 5.1282
41.31353 | 7.8218 5.0820
41.88515 | 7.7521 5.0367
42.45086 | 7.6838 4.9924
43.01117 | 7.6167 4.9488
43.56657 | 7.5508 4.9059
44.11754 | 7.4860 4.8638
44.66451 | 7.4222 4.8224
45.2079 | 7.3593 4.7815
45.74812 | 7.2974 4.7413
46.28556 | 7.2363 4.7016
46.82059 | 7.1759 4.6624
47.35357 | 7.1163 4.6236
47.88486 | 7.0574 4.5854
48.41482 | 6.9991 4.5475
48.94376 | 6.9414 4.5100
49.47205 | 6.8843 4.4729
50 | 6.8276 4.4361
50.52795 | 6.7714 4.3996
51.05624 | 6.7157 4.3633
51.58518 | 6.6603 4.3274
52.11514 | 6.6053 4.2916
52.64643 | 6.5506 4.2561
53.17941 | 6.4962 4.2208
53.71444 | 6.4421 4.1856
54.25188 | 6.3881 4.1505
54.7921 | 6.3343 4.1156
55.33549 | 6.2807 4.0807
55.88246 | 6.2272 4.0459
56.43343 | 6.1737 4.0112
56.98883 | 6.1203 3.9765
57.54914 | 6.0668 3.9418
58.11485 | 6.0134 3.9070
58.68647 | 5.9598 3.8722
59.26457 | 5.9061 3.8374
59.84974 | 5.8523 3.8024
60.44261 | 5.7983 3.7673
61.04388 | 5.7440 3.7320
62.27464 | 5.6344 3.6608
63.54885 | 5.5231 3.5885
64.20476 | 5.4667 3.5519
64.87473 | 5.4097 3.5148
65.56011 | 5.3520 3.4773
66.26239 | 5.2935 3.4393
66.98323 | 5.2341 3.4007
67.72454 | 5.1738 3.3615
68.48849 | 5.1123 3.3216
69.27759 | 5.0495 3.2808
70.09472 | 4.9854 3.2391
70.94328 | 4.9196 3.1964
71.82729 | 4.8520 3.1525
73.72179 | 4.7103 3.0604
74.74522 | 4.6354 3.0118
78.2363 | 4.3890 2.8516
79.59081 | 4.2969 2.7918
81.08016 | 4.1979 2.7275
82.74353 | 4.0901 2.6574
84.64062 | 3.9704 2.5797
86.86945 | 3.8343 2.4913
89.60951 | 3.6734 2.3867
93.25195 | 3.4699 2.2545
98.99289 | 3.1717 2.0607
----------------------------
gender langnce
x= .48734177 50.063794
Generalized Negative Binomial
It is possible to estimate a generalized version of the negative binomial model. The gnbreg command allows lnα to be modeled as a function of one or more variables.
nbreg daysabs gender langnce if school==1, nolog
Negative binomial regression                      Number of obs   =        159
                                                  LR chi2(2)      =      11.63
                                                  Prob > chi2     =     0.0030
Log likelihood = -495.81829                       Pseudo R2       =     0.0116
------------------------------------------------------------------------------
     daysabs |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      gender |  -.5643986   .1653818    -3.41   0.001     -.888541   -.2402562
     langnce |  -.0061867   .0047883    -1.29   0.196    -.0155717    .0031982
       _cons |   2.627209   .2509819    10.47   0.000     2.135294    3.119125
-------------+----------------------------------------------------------------
    /lnalpha |  -.0919676   .1319768                     -.3506374    .1667022
-------------+----------------------------------------------------------------
       alpha |   .9121347   .1203806                      .7042391    1.181402
------------------------------------------------------------------------------
Likelihood-ratio test of alpha=0:  chibar2(01) = 680.70 Prob>=chibar2 = 0.000
nbreg daysabs gender langnce if school==2, nolog
Negative binomial regression                      Number of obs   =        157
                                                  LR chi2(2)      =       2.56
                                                  Prob > chi2     =     0.2778
Log likelihood = -367.07632                       Pseudo R2       =     0.0035
------------------------------------------------------------------------------
     daysabs |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      gender |  -.2016785   .2224413    -0.91   0.365    -.6376554    .2342985
     langnce |  -.0102065   .0067666    -1.51   0.131    -.0234687    .0030558
       _cons |   1.898195   .4318886     4.40   0.000     1.051709    2.744681
-------------+----------------------------------------------------------------
    /lnalpha |   .4011526   .1464837                      .1140498    .6882554
-------------+----------------------------------------------------------------
       alpha |   1.493545     .21878                      1.120808     1.99024
------------------------------------------------------------------------------
Likelihood-ratio test of alpha=0:  chibar2(01) = 449.37 Prob>=chibar2 = 0.000
gnbreg daysabs gender langnce, lnalpha(school)
Generalized negative binomial regression          Number of obs   =        316
                                                  LR chi2(2)      =      20.22
                                                  Prob > chi2     =     0.0000
Log likelihood = -876.90377                       Pseudo R2       =     0.0114
------------------------------------------------------------------------------
     daysabs |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
daysabs      |
      gender |  -.4731664   .1386795    -3.41   0.001    -.7449732   -.2013596
     langnce |  -.0142358    .003965    -3.59   0.000     -.022007   -.0064646
       _cons |   2.733251   .2213494    12.35   0.000     2.299414    3.167088
-------------+----------------------------------------------------------------
lnalpha      |
      school |   .5881709   .2058519     2.86   0.004     .1847085    .9916333
       _cons |  -.6092282   .3159621    -1.93   0.054    -1.228502    .0100461
------------------------------------------------------------------------------
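The lnalpha equation above implies a separate overdispersion parameter for each school: ln(alpha) is the lnalpha constant plus the school coefficient times the value of school. The sketch below assumes school is coded 1 and 2, as in the if school==1 and if school==2 conditions used earlier.

```stata
* Implied alpha for school 1: exp(_cons + b_school*1)
display exp(-.6092282 + .5881709*1)
* Implied alpha for school 2: exp(_cons + b_school*2)
display exp(-.6092282 + .5881709*2)
```

These come to roughly 0.98 and 1.76. They point in the same direction as the alphas from the two separate nbreg fits (.91 and 1.49) but are not identical, because gnbreg holds the gender and langnce coefficients equal across schools.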
gnbreg, irr
Generalized negative binomial regression          Number of obs   =        316
                                                  LR chi2(2)      =      20.22
                                                  Prob > chi2     =     0.0000
Log likelihood = -876.90377                       Pseudo R2       =     0.0114
------------------------------------------------------------------------------
     daysabs |        IRR   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      gender |   .6230264    .086401    -3.41   0.001      .474747    .8176184
     langnce |   .9858651   .0039089    -3.59   0.000     .9782334    .9935563
-------------+----------------------------------------------------------------
     lnalpha |  (type gnbreg to see ln(alpha) coefficient estimates)
------------------------------------------------------------------------------
display -2*-876.90377
1753.8075
mfx compute
Marginal effects after gnbreg
      y  = predicted number of events (predict)
         =  5.9892103
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
  gender*|  -2.843323      .87245   -3.26   0.001   -4.5533 -1.13335   .487342
 langnce |  -.0852611      .02344   -3.64   0.000  -.131207 -.039316   50.0638
  school |  (no effect)                                                1.49684
------------------------------------------------------------------------------
(*) dy/dx is for discrete change of dummy variable from 0 to 1
gnbreg daysabs gender langnce school, lnalpha(school) nolog
Generalized negative binomial regression          Number of obs   =        316
                                                  LR chi2(3)      =      45.88
                                                  Prob > chi2     =     0.0000
Log likelihood = -864.07066                       Pseudo R2       =     0.0259
------------------------------------------------------------------------------
     daysabs |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
daysabs      |
      gender |  -.4321288   .1333284    -3.24   0.001    -.6934476     -.17081
     langnce |  -.0076396   .0039562    -1.93   0.053    -.0153937    .0001144
      school |  -.7655276   .1437565    -5.33   0.000    -1.047285   -.4837701
       _cons |   3.387622   .2465595    13.74   0.000     2.904374     3.87087
-------------+----------------------------------------------------------------
lnalpha      |
      school |   .5002206   .1980169     2.53   0.012     .1121145    .8883267
       _cons |  -.5857683   .3032156    -1.93   0.053     -1.18006    .0085234
------------------------------------------------------------------------------
gnbreg, irr
Generalized negative binomial regression          Number of obs   =        316
                                                  LR chi2(3)      =      45.88
                                                  Prob > chi2     =     0.0000
Log likelihood = -864.07066                       Pseudo R2       =     0.0259
------------------------------------------------------------------------------
     daysabs |        IRR   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      gender |   .6491258   .0865469    -3.24   0.001     .4998498    .8429818
     langnce |   .9923895   .0039261    -1.93   0.053     .9847242    1.000114
      school |   .4650885   .0668595    -5.33   0.000     .3508891    .6164549
-------------+----------------------------------------------------------------
     lnalpha |  (type gnbreg to see ln(alpha) coefficient estimates)
------------------------------------------------------------------------------
display -2*-864.07066
1728.1413
display -2*(-876.90377-(-864.07066))
25.66622
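The value 25.66622 is the likelihood ratio statistic comparing the two gnbreg models, which differ only by the addition of school to the mean equation, so it can be referred to a chi-square distribution with 1 degree of freedom. The sketch below computes the p-value and then repeats the comparison with lrtest. The stored-estimate names m1 and m2 are only illustrative.

```stata
* p-value for the LR statistic, assuming 1 degree of freedom
display chi2tail(1, 25.66622)

* The same comparison from stored estimates (m1, m2 are illustrative names)
quietly gnbreg daysabs gender langnce, lnalpha(school)
estimates store m1
quietly gnbreg daysabs gender langnce school, lnalpha(school)
estimates store m2
lrtest m1 m2
```

Using lrtest avoids copying log likelihoods by hand and reports the degrees of freedom along with the test.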
Categorical Data Analysis Course
Phil Ender