To a certain extent, relative risk is a more intuitive concept than the odds ratio. When events are rare (probability less than 0.1), odds ratios and relative risks are nearly equal. But what about situations in which events are not rare? We will present two methods of obtaining relative risk using several of Stata's estimation commands along with their equivalent glm commands.
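The near-equality for rare events can be seen from the definitions. The following is a short sketch; the symbols p_1 and p_0 (the probability of the event in the two groups being compared) are notation introduced here for illustration and are not used elsewhere on this page.

```latex
\[
RR = \frac{p_1}{p_0},
\qquad
OR = \frac{p_1/(1-p_1)}{p_0/(1-p_0)}
   = RR \times \frac{1-p_0}{1-p_1}
\]
```

When both probabilities are small, the factor (1 - p_0)/(1 - p_1) is close to 1, so the odds ratio is close to the relative risk. As the probabilities grow, the factor moves away from 1 and the two measures diverge.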
Acknowledgements: Numerous contributors to Statalist, and Karla Lindquist of UCSF.
use http://www.gseis.ucla.edu/courses/data/honors
tabulate female honors, nolabel
           |        honors
    female |         0          1 |     Total
-----------+----------------------+----------
         0 |        73         18 |        91
         1 |        74         35 |       109
-----------+----------------------+----------
     Total |       147         53 |       200
display "odds ratio = " (73*35)/(18*74)
odds ratio = 1.9181682
display "relative risk = " (35/109)/(18/91)
relative risk = 1.6233435
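As a cross-check on the hand calculations, Stata's epidemiological table commands can report the same two quantities together with confidence intervals. This is only a sketch, assuming the immediate commands csi and cci are available in your version of Stata; the four cell counts are typed in from the table above in the order exposed cases, unexposed cases, exposed noncases, unexposed noncases, with female = 1 treated as "exposed".

```stata
* Risk ratio from the 2x2 counts: (35/109)/(18/91)
csi 35 18 74 73

* Odds ratio from the same counts: (35*73)/(18*74)
cci 35 18 74 73

* The same tables built directly from the data in memory
cs honors female
cc honors female
```

The point estimates should agree with the values displayed above; the confidence intervals need not match those from the regression commands exactly, because they are computed by different methods.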
First we will compute the odds ratio using logistic regression followed by the equivalent
glm command. With glm the eform option displays the exponentiated coefficient.
logit honors female, or nolog
Logit estimates Number of obs = 200
LR chi2(1) = 3.94
Prob > chi2 = 0.0473
Log likelihood = -113.6769 Pseudo R2 = 0.0170
------------------------------------------------------------------------------
honors | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
female | 1.918168 .6400451 1.95 0.051 .9973827 3.689024
------------------------------------------------------------------------------
glm honors female, fam(binom) link(logit) eform nolog
Generalized linear models No. of obs = 200
Optimization : ML: Newton-Raphson Residual df = 198
Scale parameter = 1
Deviance = 227.3538087 (1/df) Deviance = 1.148252
Pearson = 200 (1/df) Pearson = 1.010101
Variance function: V(u) = u*(1-u) [Bernoulli]
Link function : g(u) = ln(u/(1-u)) [Logit]
Standard errors : OIM
Log likelihood = -113.6769044 AIC = 1.156769
BIC = -821.7130299
------------------------------------------------------------------------------
honors | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
female | 1.918168 .6400451 1.95 0.051 .9973827 3.689024
------------------------------------------------------------------------------
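The or and eform options only change how the result is displayed; the model itself is estimated on the log-odds scale. A minimal sketch of recovering the odds ratio by hand from the stored coefficient:

```stata
* Fit the logit model and show the coefficient on the log-odds scale
logit honors female, nolog

* Exponentiate the stored coefficient to recover the odds ratio
display "odds ratio = " exp(_b[female])
```

The displayed value should reproduce the odds ratio for female reported in the output above.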
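Next, we will compute the risk ratio using binary regression (binreg with the rr option), which
fits a binomial model with a log link. This model is notorious for poor convergence because the
log link does not keep fitted probabilities below 1. The equivalent glm command keeps the binomial
family but switches to the log link.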
binreg honors female, rr nolog
Residual df = 198 No. of obs = 200
Pearson X2 = 200 Deviance = 227.3538
Dispersion = 1.010101 Dispersion = 1.148252
Bernoulli distribution, log link
------------------------------------------------------------------------------
| EIM
honors | Risk Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
female | 1.623344 .4105603 1.92 0.055 .9888554 2.664944
------------------------------------------------------------------------------
glm honors female, fam(binom) link(log) eform nolog
Generalized linear models No. of obs = 200
Optimization : ML: Newton-Raphson Residual df = 198
Scale parameter = 1
Deviance = 227.3538087 (1/df) Deviance = 1.148252
Pearson = 199.9999895 (1/df) Pearson = 1.010101
Variance function: V(u) = u*(1-u) [Bernoulli]
Link function : g(u) = ln(u) [Log]
Standard errors : OIM
Log likelihood = -113.6769044 AIC = 1.156769
BIC = -821.7130299
------------------------------------------------------------------------------
honors | Risk Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
female | 1.623343 .4105603 1.92 0.055 .9888551 2.664944
------------------------------------------------------------------------------
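One way to see whether a log-link binomial model is near trouble is to inspect its fitted probabilities. The following is a sketch only; phat is a hypothetical variable name chosen for this illustration.

```stata
* Refit the log-binomial model without displaying the output
quietly glm honors female, fam(binom) link(log)

* Fitted probabilities (mu is the default statistic for predict after glm)
predict phat, mu

* Every fitted value should lie strictly between 0 and 1
summarize phat
count if phat >= 1
```

With a single binary predictor the fitted values are just the two observed proportions, so nothing can go wrong here. The check becomes more informative once continuous predictors are added to the model.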
Next, we will use Poisson regression to accomplish the same thing. In order to obtain reasonable
standard errors we need to include the robust option with poisson. For glm, we need
to change the family from binomial to poisson while leaving the link at log. Stata labels the exponentiated coefficient IRR (incidence-rate ratio); with a 0/1 outcome it is interpreted as the relative risk.
This use of Poisson regression to obtain relative risk is from an article by Guangyong Zou (A Modified Poisson Regression Approach to Prospective Studies with Binary Data. Am J Epidemiol 2004; 159(7):702-6). This "modified Poisson" approach is interesting in that each observation is only a 0/1 event, not the count variable typically found in Poisson models.
poisson honors female, irr robust nolog
Poisson regression Number of obs = 200
Wald chi2(1) = 3.65
Prob > chi2 = 0.0560
Log pseudo-likelihood = -121.92877 Pseudo R2 = 0.0118
------------------------------------------------------------------------------
| Robust
honors | IRR Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
female | 1.623344 .4115907 1.91 0.056 .987626 2.668261
------------------------------------------------------------------------------
glm honors female, fam(poisson) link(log) robust eform nolog
Generalized linear models No. of obs = 200
Optimization : ML: Newton-Raphson Residual df = 198
Scale parameter = 1
Deviance = 137.8575464 (1/df) Deviance = .6962502
Pearson = 146.9999999 (1/df) Pearson = .7424242
Variance function: V(u) = u [Poisson]
Link function : g(u) = ln(u) [Log]
Standard errors : Sandwich
Log pseudo-likelihood = -121.9287732 AIC = 1.239288
BIC =-911.2092922
------------------------------------------------------------------------------
| Robust
honors | IRR Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
female | 1.623344 .4115907 1.91 0.056 .987626 2.668261
------------------------------------------------------------------------------
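To see what the robust option contributes, the same Poisson model can be fit with and without it. This is a sketch, assuming a version of Stata that accepts vce(robust), the newer spelling of the robust option used above.

```stata
* Model-based (default) standard errors: these assume the variance equals
* the mean, which overstates the variance of a 0/1 outcome
poisson honors female, irr nolog

* Robust (sandwich) standard errors, as in the modified Poisson approach
poisson honors female, irr vce(robust) nolog
```

The point estimate is the same in both runs; only the standard error, test statistic, and confidence interval change. The model-based interval is generally the wider of the two for binary data.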
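We will conclude by running an example which includes several continuous predictors. This
is the type of model that often fails to converge using binary regression. In the output below, the
binomial model with a log link stops after 50 iterations with the message "convergence not achieved",
and its near-zero and missing standard errors show that the estimates should not be interpreted.
The Poisson model with robust standard errors converges within a few iterations.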
glm honors female lang math science ses, fam(binom) link(log) eform
Iteration 0: log likelihood = -143.93712 (not concave)
Iteration 1: log likelihood = -138.17315 (not concave)
*** output deleted ***
Iteration 49: log likelihood = -137.87877 (not concave)
Iteration 50: log likelihood = -137.87877 (not concave)
convergence not achieved
Generalized linear models No. of obs = 200
Optimization : ML: Newton-Raphson Residual df = 194
Scale parameter = 1
Deviance = 275.7575387 (1/df) Deviance = 1.421431
Pearson = 300000129.1 (1/df) Pearson = 1546392
Variance function: V(u) = u*(1-u) [Bernoulli]
Link function : g(u) = ln(u) [Log]
Standard errors : OIM
Log likelihood = -137.8787694 AIC = 1.438788
BIC = -752.1160304
------------------------------------------------------------------------------
honors | Risk Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
female | 1.045888 .1405277 0.33 0.738 .8037407 1.360988
lang | 1.024339 2.74e-09 . 0.000 1.024339 1.024339
math | 1.032622 3.65e-09 . 0.000 1.032622 1.032622
science | 1.022528 6.65e-09 . 0.000 1.022528 1.022528
ses | 1.004854 . . . . .
------------------------------------------------------------------------------
glm honors female lang math science ses, fam(poisson) link(log) robust eform
Iteration 0: log pseudo-likelihood = -99.421803
Iteration 1: log pseudo-likelihood = -96.828386
Iteration 2: log pseudo-likelihood = -96.820156
Iteration 3: log pseudo-likelihood = -96.820155
Generalized linear models No. of obs = 200
Optimization : ML: Newton-Raphson Residual df = 194
Scale parameter = 1
Deviance = 87.64030952 (1/df) Deviance = .4517542
Pearson = 110.2542715 (1/df) Pearson = .568321
Variance function: V(u) = u [Poisson]
Link function : g(u) = ln(u) [Log]
Standard errors : Sandwich
Log pseudo-likelihood = -96.82015476 AIC = 1.028202
BIC =-940.2332596
------------------------------------------------------------------------------
| Robust
honors | IRR Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
female | 2.012414 .4427827 3.18 0.001 1.307468 3.097444
lang | 1.032885 .0139128 2.40 0.016 1.005973 1.060516
math | 1.059176 .0183383 3.32 0.001 1.023837 1.095736
science | 1.030488 .0179599 1.72 0.085 .9958814 1.066297
ses | 1.058971 .157016 0.39 0.699 .7919076 1.416099
------------------------------------------------------------------------------
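The comparison above suggests a simple workflow for a do-file: try the log-binomial model first and fall back to the modified Poisson model when it does not converge. The following is a sketch only. It assumes that your version of glm stores the convergence flag in e(converged) and accepts the iterate() and vce(robust) options; the limit of 50 iterations is an arbitrary choice for illustration.

```stata
* Try the log-binomial model, capping the number of iterations
glm honors female lang math science ses, fam(binom) link(log) eform iterate(50)

* If it did not converge, fit the modified Poisson model instead
if e(converged) == 0 {
    display "log-binomial model did not converge; fitting modified Poisson"
    glm honors female lang math science ses, fam(poisson) link(log) vce(robust) eform nolog
}
```

Whichever model ends up being reported, the choice should be stated alongside the results, since the two approaches estimate the same relative risks but use different variance assumptions.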
Categorical Data Analysis Course
Phil Ender