The multinomial logistic regression unit stated that the model is equivalent to simultaneous tests of k-1 comparisons versus a reference group using binary logistic regression. This is similar to what happens when a multicategory predictor variable is used as a right-hand side variable. It also allows us to illustrate left/right equivalency, that is, the equivalence of using variables as either left-hand side (lhs) or right-hand side (rhs) variables.
Example 1
Consider the following example using two dichotomous variables, female and public, from the High School and Beyond dataset (hsb2).
use http://www.gseis.ucla.edu/courses/data/hsb2
generate public=schtyp==1
tabulate public female
           |        female
    public |      male     female |     Total
-----------+----------------------+----------
         0 |        14         18 |        32
         1 |        77         91 |       168
-----------+----------------------+----------
     Total |        91        109 |       200
display (14*91)/(77*18) /* odds ratio */
.91919192
tabulate female public
           |        public
    female |         0          1 |     Total
-----------+----------------------+----------
      male |        14         77 |        91
    female |        18         91 |       109
-----------+----------------------+----------
     Total |        32        168 |       200
display (14*91)/(18*77) /* odds ratio */
.91919192
logit public female
Logit estimates Number of obs = 200
LR chi2(1) = 0.05
Prob > chi2 = 0.8281
Log likelihood = -87.910407 Pseudo R2 = 0.0003
------------------------------------------------------------------------------
public | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
female | -.0842603 .3885359 -0.22 0.828 -.8457767 .677256
_cons | 1.704748 .2905436 5.87 0.000 1.135293 2.274203
------------------------------------------------------------------------------
listcoef
Odds of: 1 (public) vs 0 (private)
----------------------------------------------------------------------
public | b z P>|z| e^b e^bStdX SDofX
-------------+--------------------------------------------------------
female | -0.08426 -0.217 0.828 0.9192 0.9588 0.4992
----------------------------------------------------------------------
logit female public
Logit estimates Number of obs = 200
LR chi2(1) = 0.05
Prob > chi2 = 0.8281
Log likelihood = -137.79477 Pseudo R2 = 0.0002
------------------------------------------------------------------------------
female | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
public | -.0842603 .3885359 -0.22 0.828 -.8457767 .677256
_cons | .2513144 .3563483 0.71 0.481 -.4471154 .9497443
------------------------------------------------------------------------------
listcoef
Odds of: female vs male
----------------------------------------------------------------------
female | b z P>|z| e^b e^bStdX SDofX
-------------+--------------------------------------------------------
public | -0.08426 -0.217 0.828 0.9192 0.9695 0.3675
----------------------------------------------------------------------
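The equivalence in Example 1 can be cross-checked by hand from the 2x2 table. The sketch below is in Python rather than Stata and is only an illustration, not part of the original analysis. It uses the cell counts printed above and assumes that, for this saturated model, the slope equals the log of the table's odds ratio and each intercept equals the log odds of the outcome in the predictor's reference category. The variable names are made up for the illustration.

```python
import math

# Cell counts copied from the tabulate output above.
# Keys are (public, female); values are frequencies.
counts = {(0, 0): 14, (0, 1): 18, (1, 0): 77, (1, 1): 91}

# Odds ratio of the 2x2 table. Swapping rows and columns (transposing)
# multiplies the same pairs of cells, so the value cannot change.
odds_ratio = (counts[(0, 0)] * counts[(1, 1)]) / (counts[(1, 0)] * counts[(0, 1)])
odds_ratio_transposed = (counts[(0, 0)] * counts[(1, 1)]) / (counts[(0, 1)] * counts[(1, 0)])

# The slope in either logit model is the log of that odds ratio.
slope = math.log(odds_ratio)

# The intercepts differ between the two models:
# logit public female -> log odds of public among males (female == 0)
cons_public_on_female = math.log(counts[(1, 0)] / counts[(0, 0)])
# logit female public -> log odds of female among private schools (public == 0)
cons_female_on_public = math.log(counts[(0, 1)] / counts[(0, 0)])

print(f"odds ratio                  = {odds_ratio:.8f}")
print(f"slope (both models)         = {slope:.7f}")
print(f"_cons, logit public female  = {cons_public_on_female:.6f}")
print(f"_cons, logit female public  = {cons_female_on_public:.7f}")
```

The slope and its test are shared by both models, while the intercepts and log likelihoods are not, because each model describes the marginal distribution of a different outcome.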
Example 2
Now, we will use a dichotomous variable (female) and a multicategory variable (ses).
tabulate ses female, row
           |        female
       ses |      male     female |     Total
-----------+----------------------+----------
       low |        15         32 |        47
           |     31.91      68.09 |    100.00
-----------+----------------------+----------
    middle |        47         48 |        95
           |     49.47      50.53 |    100.00
-----------+----------------------+----------
      high |        29         29 |        58
           |     50.00      50.00 |    100.00
-----------+----------------------+----------
     Total |        91        109 |       200
           |     45.50      54.50 |    100.00
tabulate female ses, col
           |               ses
    female |       low     middle       high |     Total
-----------+---------------------------------+----------
      male |        15         47         29 |        91
           |     31.91      49.47      50.00 |     45.50
-----------+---------------------------------+----------
    female |        32         48         29 |       109
           |     68.09      50.53      50.00 |     54.50
-----------+---------------------------------+----------
     Total |        47         95         58 |       200
           |    100.00     100.00     100.00 |    100.00
mlogit ses female, base(1)
Multinomial regression Number of obs = 200
LR chi2(2) = 4.68
Prob > chi2 = 0.0964
Log likelihood = -208.24309 Pseudo R2 = 0.0111
------------------------------------------------------------------------------
ses | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
middle |
female | -.7366323 .3742013 -1.97 0.049 -1.470053 -.0032113
_cons | 1.142097 .2965523 3.85 0.000 .5608656 1.723329
-------------+----------------------------------------------------------------
high |
female | -.7576857 .4085121 -1.85 0.064 -1.558355 .0429834
_cons | .6592456 .31804 2.07 0.038 .0358988 1.282592
------------------------------------------------------------------------------
(Outcome ses==low is the comparison group)
xi: logit female i.ses
Logit estimates Number of obs = 200
LR chi2(2) = 4.68
Prob > chi2 = 0.0964
Log likelihood = -135.47889 Pseudo R2 = 0.0170
------------------------------------------------------------------------------
female | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
_Ises_2 | -.7366323 .3742013 -1.97 0.049 -1.470053 -.0032113
_Ises_3 | -.7576857 .4085122 -1.85 0.064 -1.558355 .0429834
_cons | .7576857 .3129164 2.42 0.015 .1443808 1.370991
------------------------------------------------------------------------------
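The matching slopes in Example 2 follow directly from the ses by female table. The Python sketch below is an illustrative cross-check, not Stata output. It assumes the counts printed above and treats each slope as a log odds ratio against the low ses group; all names in it are hypothetical.

```python
import math

# Counts from the tabulate output above: ses level -> (male, female).
counts = {"low": (15, 32), "middle": (47, 48), "high": (29, 29)}
male_low, female_low = counts["low"]

slopes = {}       # shared by mlogit ses female and logit female i.ses
mlogit_cons = {}  # intercepts of mlogit ses female, base(1)
for level in ("middle", "high"):
    male, female = counts[level]
    # Log odds ratio of this ses level versus low, female versus male.
    slopes[level] = math.log((female / male) / (female_low / male_low))
    # Log odds of this ses level versus low among males (female == 0).
    mlogit_cons[level] = math.log(male / male_low)

# Intercept of logit female i.ses: log odds of female in the low ses group.
logit_cons = math.log(female_low / male_low)

for level in ("middle", "high"):
    print(f"{level:>6}: slope = {slopes[level]:.7f}, mlogit _cons = {mlogit_cons[level]:.6f}")
print(f"logit _cons = {logit_cons:.7f}")
```

As in Example 1, only the slopes carry over between the two models. The intercepts answer different questions, so they differ.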
Example 3
We can do the same thing using two multicategory variables as lhs and rhs variables, in this case, ses and prog.
xi: mlogit ses i.prog, base(1)
Multinomial regression Number of obs = 200
LR chi2(4) = 16.78
Prob > chi2 = 0.0021
Log likelihood = -202.19105 Pseudo R2 = 0.0398
------------------------------------------------------------------------------
ses | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
middle |
_Iprog_2 | .6166071 .4334269 1.42 0.155 -.232894 1.466108
_Iprog_3 | .725937 .4775892 1.52 0.129 -.2101205 1.661995
_cons | .2231436 .3354102 0.67 0.506 -.4342484 .8805355
-------------+----------------------------------------------------------------
high |
_Iprog_2 | 1.368595 .5000522 2.74 0.006 .3885105 2.348679
_Iprog_3 | .0363676 .6322987 0.06 0.954 -1.202915 1.27565
_cons | -.5753641 .4166667 -1.38 0.167 -1.392016 .2412875
------------------------------------------------------------------------------
(Outcome ses==low is the comparison group)
xi: mlogit prog i.ses, base(1)
Multinomial regression Number of obs = 200
LR chi2(4) = 16.78
Prob > chi2 = 0.0021
Log likelihood = -195.70519 Pseudo R2 = 0.0411
------------------------------------------------------------------------------
prog | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
academic |
_Ises_2 | .6166071 .4334269 1.42 0.155 -.232894 1.466108
_Ises_3 | 1.368595 .5000522 2.74 0.006 .3885105 2.348679
_cons | .1718503 .3393104 0.51 0.613 -.493186 .8368865
-------------+----------------------------------------------------------------
vocation |
_Ises_2 | .725937 .4775892 1.52 0.129 -.2101205 1.661995
_Ises_3 | .0363676 .6322987 0.06 0.954 -1.202915 1.27565
_cons | -.2876821 .3818813 -0.75 0.451 -1.036156 .4607915
------------------------------------------------------------------------------
(Outcome prog==general is the comparison group)
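In Example 3 the four slopes reappear in the second model with rows and columns exchanged. The Python sketch below simply re-enters the printed coefficients and checks that pattern. It assumes that _Iprog_2 and _Iprog_3 stand for academic and vocation, and that _Ises_2 and _Ises_3 stand for middle and high, as the outcome labels in the two tables suggest. The dictionary names are made up for the illustration.

```python
import math

# Slopes copied from "xi: mlogit ses i.prog, base(1)".
# Layout: ses outcome -> prog predictor level -> coefficient.
ses_on_prog = {
    "middle": {"academic": 0.6166071, "vocation": 0.725937},
    "high": {"academic": 1.368595, "vocation": 0.0363676},
}

# Slopes copied from "xi: mlogit prog i.ses, base(1)".
# Layout: prog outcome -> ses predictor level -> coefficient.
prog_on_ses = {
    "academic": {"middle": 0.6166071, "high": 1.368595},
    "vocation": {"middle": 0.725937, "high": 0.0363676},
}

# Left/right equivalency: the coefficient for (ses level, prog level) is the
# same whichever variable is the outcome, so one table is the transpose of
# the other.
transposed = all(
    math.isclose(ses_on_prog[s][p], prog_on_ses[p][s])
    for s in ses_on_prog
    for p in prog_on_ses
)
print("coefficient tables are transposes of each other:", transposed)
```

Each shared coefficient is the log odds ratio for one 2x2 subtable formed by a non-reference ses level, a non-reference prog level, and the two reference categories.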
Example 4
Finally, let's try this with a categorical and a continuous variable.
anova write prog
Number of obs = 200 R-squared = 0.1776
Root MSE = 8.63918 Adj R-squared = 0.1693
Source | Partial SS df MS F Prob > F
-----------+----------------------------------------------------
Model | 3175.69786 2 1587.84893 21.27 0.0000
|
prog | 3175.69786 2 1587.84893 21.27 0.0000
|
Residual | 14703.1771 197 74.635417
-----------+----------------------------------------------------
Total | 17878.875 199 89.843593
daoneway write, by(prog)
One-way Disciminant Function Analysis
Observations = 200
Variables = 1
Groups = 3
                  Pct of    Cum    Canonical   After   Wilks'
 Fcn  Eigenvalue  Variance  Pct      Corr       Fcn    Lambda   Chi-square  df  P-value
                                             |   0     0.82238    38.525     2   0.0000
   1     0.2160    100.00   100.00   0.4215  |
[output omitted]
display "approximate F-ratio = " 38.525/2
approximate F-ratio = 19.2625
mlogit prog write
Multinomial regression Number of obs = 200
LR chi2(2) = 37.17
Prob > chi2 = 0.0000
Log likelihood = -185.51084 Pseudo R2 = 0.0911
------------------------------------------------------------------------------
prog | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
general |
write | -.0660079 .0210013 -3.14 0.002 -.1071696 -.0248461
_cons | 2.71249 1.132801 2.39 0.017 .4922405 4.932739
-------------+----------------------------------------------------------------
vocation |
write | -.1178089 .0216186 -5.45 0.000 -.1601806 -.0754372
_cons | 5.358994 1.115256 4.81 0.000 3.173132 7.544856
------------------------------------------------------------------------------
(Outcome prog==academic is the comparison group)
display "approximate F-ratio = " 37.17/2
approximate F-ratio = 18.585
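The three analyses in Example 4 are tied together by the anova sums of squares. The Python sketch below is an illustrative cross-check built only from numbers printed above. It assumes that the discriminant analysis chi-square is Bartlett's approximation, -(N - 1 - (p + g)/2) ln(lambda), which reproduces the printed value but is an assumption about what daoneway computes. Variable names are hypothetical.

```python
import math

# Values copied from the anova table above.
n_obs, n_groups, n_vars = 200, 3, 1
ss_model, ss_resid = 3175.69786, 14703.1771
ss_total = ss_model + ss_resid
df_model, df_resid = n_groups - 1, n_obs - n_groups

r_squared = ss_model / ss_total
f_ratio = (ss_model / df_model) / (ss_resid / df_resid)

# Discriminant analysis quantities expressed through the same sums of squares.
wilks_lambda = ss_resid / ss_total        # equals 1 - R-squared
eigenvalue = ss_model / ss_resid
canonical_corr = math.sqrt(r_squared)

# Bartlett's chi-square approximation (assumed, see the note above).
bartlett_chi2 = -(n_obs - 1 - (n_vars + n_groups) / 2) * math.log(wilks_lambda)

# Dividing a chi-square by its degrees of freedom gives a rough F-like ratio.
approx_f_da = bartlett_chi2 / df_model
approx_f_mlogit = 37.17 / df_model        # LR chi2(2) from mlogit prog write

print(f"R-squared      = {r_squared:.4f}")
print(f"F              = {f_ratio:.2f}")
print(f"Wilks' lambda  = {wilks_lambda:.5f}")
print(f"eigenvalue     = {eigenvalue:.4f}")
print(f"canonical corr = {canonical_corr:.4f}")
print(f"Bartlett chi2  = {bartlett_chi2:.3f}")
print(f"approximate F  = {approx_f_da:.4f} (discriminant), {approx_f_mlogit:.3f} (mlogit)")
```

The two approximate F-ratios land in the neighborhood of the anova F without matching it exactly, which is the sense in which the lhs and rhs analyses agree when one of the variables is continuous.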
Categorical Data Analysis Course
Phil Ender