After laying out all of the theoretical foundation for logistic regression, it must be admitted that for many models there is very little difference between the OLS results and the logistic regression results. Here is a small example in which the two approaches agree closely.
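Both techniques fit the same kind of linear index, b0 + b1*x1 + ... + bk*xk; they differ in how that index is mapped into the probability that the response equals one. OLS (the linear probability model) uses the index directly,

P(y=1) = Xb,

which can fall below 0 or above 1, while logit passes the index through the inverse-logit function,

P(y=1) = 1/(1 + exp(-Xb)),

which always lies strictly between 0 and 1. Whenever most of the fitted indexes land in the middle of the inverse-logit curve, where it is nearly linear, the two sets of predicted probabilities will be very close.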
First Example
use http://www.gseis.ucla.edu/courses/data/honors
regress honors lang female
Source | SS df MS Number of obs = 200
-------------+------------------------------ F( 2, 197) = 35.85
Model | 10.3957196 2 5.19785982 Prob > F = 0.0000
Residual | 28.5592804 197 .144970966 R-squared = 0.2669
-------------+------------------------------ Adj R-squared = 0.2594
Total | 38.955 199 .195753769 Root MSE = .38075
------------------------------------------------------------------------------
honors | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lang | .0214989 .0026362 8.16 0.000 .0163001 .0266977
female | .1467375 .054142 2.71 0.007 .0399652 .2535098
_cons | -.9378584 .1448623 -6.47 0.000 -1.223538 -.6521786
------------------------------------------------------------------------------
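In the linear probability model the coefficients are read directly as changes in the predicted probability: each additional point of lang adds about .021, and female students run about .147 higher at any given lang score. For instance, the fitted value for a female student with lang equal to 52 (an illustrative value) can be computed by hand:

display -.9378584 + .0214989*52 + .1467375*1   /* about .33 */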
predict p1
logit honors lang female, nolog
Logit estimates Number of obs = 200
LR chi2(2) = 60.40
Prob > chi2 = 0.0000
Log likelihood = -85.44372 Pseudo R2 = 0.2612
------------------------------------------------------------------------------
honors | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lang | .1443657 .0233337 6.19 0.000 .0986325 .1900989
female | 1.120926 .4081028 2.75 0.006 .321059 1.920793
_cons | -9.603365 1.426404 -6.73 0.000 -12.39906 -6.807665
------------------------------------------------------------------------------
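As an aside, not part of the original log: the logit coefficients are on the log-odds scale, so exponentiating them yields odds ratios. Right after the logit command one could type

display exp(_b[female])   /* about 3.07: at a given lang score, females have roughly triple the odds of honors */

The logistic command fits the same model and reports odds ratios directly.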
predict p2
summarize p1 p2
Variable | Obs Mean Std. Dev. Min Max
-------------+-----------------------------------------------------
p1 | 200 .265 .2285603 -.2713931 .8427939
p2 | 200 .265 .2408362 .0058933 .9233922
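The summary shows why the logit predictions stay in range. After logit, predict computes 1/(1 + exp(-xb)), which is strictly between 0 and 1, whereas the OLS predictions are unbounded (the minimum here is -.27). A quick check, with xb2 and p2check as illustrative variable names:

predict xb2, xb                        /* linear index from the logit model */
generate p2check = 1/(1 + exp(-xb2))   /* reproduces p2 */
summarize p2 p2check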
corr p1 p2
(obs=200)
| p1 p2
-------------+------------------
p1 | 1.0000
p2 | 0.9490 1.0000
list p1 p2 in 1/20
p1 p2
1. .0473354 .0545689
2. -.1423999 .0139005
3. -.1638987 .0120544
4. .283823 .2202598
5. .5633085 .6485339
6. .0725889 .0563498
7. -.0386602 .0313819
8. .1763286 .1206826
9. .0473354 .0545689
10. -.1891523 .0116561
11. .3268208 .2738011
12. .1800833 .1094523
13. .0295912 .0428232
14. -.060159 .0272784
15. .1548298 .106182
16. .2193264 .1548247
17. .24458 .1593262
18. .2193264 .1548247
19. .4343152 .4369393
20. .1548298 .106182
/* classification tables: classify as 1 when the predicted probability exceeds .5 */
generate c1 = p1>.5
generate c2 = p2>.5
tabulate c1 c2
| c2
c1 | 0 1 | Total
-----------+----------------------+----------
0 | 164 4 | 168
1 | 0 32 | 32
-----------+----------------------+----------
Total | 164 36 | 200

Note the out-of-range predictions (negative values) in the example above.
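One way to quantify the problem is to count the out-of-range cases directly; from the summary above, the OLS predictions in this example fall below zero but never above one (the maximum is .84):

count if p1 < 0   /* OLS predictions below zero */
count if p1 > 1   /* none here */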
Next, let's look at a counterexample in which OLS and logistic regression produce noticeably different results. Note in particular the coefficient on enroll: positive and nonsignificant under OLS, but negative and statistically significant under logit.
Counterexample
use http://www.gseis.ucla.edu/courses/data/apilog, clear
regress hiqual enroll meals avg_ed
Source | SS df MS Number of obs = 1149
-------------+------------------------------ F( 3, 1145) = 522.92
Model | 145.625156 3 48.5417186 Prob > F = 0.0000
Residual | 106.287812 1145 .092827783 R-squared = 0.5781
-------------+------------------------------ Adj R-squared = 0.5770
Total | 251.912968 1148 .219436383 Root MSE = .30468
------------------------------------------------------------------------------
hiqual | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
enroll | .0000233 .0000521 0.45 0.654 -.0000789 .0001255
meals | -.0077637 .0005286 -14.69 0.000 -.0088009 -.0067266
avg_ed | .1697195 .0210965 8.04 0.000 .1283274 .2111116
_cons | .2513082 .083063 3.03 0.003 .0883353 .414281
------------------------------------------------------------------------------
predict p1
(51 missing values generated)
logit hiqual enroll meals avg_ed, nolog
Logit estimates Number of obs = 1149
LR chi2(3) = 917.65
Prob > chi2 = 0.0000
Log likelihood = -265.40191 Pseudo R2 = 0.6335
------------------------------------------------------------------------------
hiqual | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
enroll | -.0019593 .000735 -2.67 0.008 -.0033999 -.0005187
meals | -.0785112 .0076189 -10.30 0.000 -.093444 -.0635784
avg_ed | 2.148565 .299792 7.17 0.000 1.560984 2.736147
_cons | -3.302163 1.030206 -3.21 0.001 -5.32133 -1.282996
------------------------------------------------------------------------------
predict p2
(51 missing values generated)
summarize p1 p2
Variable | Obs Mean Std. Dev. Min Max
-------------+-----------------------------------------------------
p1 | 1149 .3246301 .3561617 -.3461998 1.101522
p2 | 1149 .3246301 .3848081 .000036 .9986064
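Here the OLS predictions run out of range in both directions (minimum -.35, maximum 1.10), while the logit predictions stay strictly inside (0,1). To count the violations, the qualifier p1 < . excludes the 51 missing values, which Stata treats as larger than any number:

count if p1 < 0            /* OLS predictions below zero */
count if p1 > 1 & p1 < .   /* OLS predictions above one */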
corr p1 p2
(obs=1149)
| p1 p2
-------------+------------------
p1 | 1.0000
p2 | 0.9256 1.0000
list p1 p2 in 1/20
p1 p2
1. .0686489 .0087564
2. -.2424376 .0002072
3. .4535227 .348036
4. .5269313 .4345559
5. .9753819 .9948776
6. -.0621362 .0028704
7. -.0452434 .0019718
8. .4313931 .2918692
9. .2042937 .0150685
10. .7003989 .8609163
11. . .
12. -.0974959 .0009283
13. .5642942 .6845942
14. .1766382 .0204579
15. .4774918 .3685073
16. -.1551605 .0002245
17. .8235874 .9654739
18. .0041572 .0047186
19. -.0109633 .0019114
20. -.1417836 .0005978
/* classification tables: the if qualifier excludes the 51 missing predictions, which Stata would otherwise treat as greater than .5 */
generate c1 = p1>.5 if p1~=.
generate c2 = p2>.5 if p1~=.
tabulate c1 c2
| c2
c1 | 0 1 | Total
-----------+----------------------+----------
0 | 750 0 | 750
1 | 30 369 | 399
-----------+----------------------+----------
Total | 780 369 | 1149
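A natural follow-up, not shown in the original log, is to compare each classification with the observed outcome rather than with the other classification; the row option adds row percentages:

tabulate hiqual c1, row   /* OLS-based classification versus observed hiqual */
tabulate hiqual c2, row   /* logit-based classification versus observed hiqual */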
Categorical Data Analysis Course
Phil Ender