Linear Statistical Models: Regression

Predicted Scores, Residuals and Other Goodies


Predicted scores are the values predicted from the linear regression model and are often denoted by Y' or Yhat. Residual scores, or just residuals for short, are the differences between the observed scores and the predicted scores (observed minus predicted). Residuals can be standardized in several different ways, including what are known as Studentized residuals.

Leverage measures how extreme an observation's score is on the predictor variable and will be denoted as lev. When an observation has both a large residual and high leverage, it is said to be influential. Cook's D is one measure of an observation's influence.

Here is how you can obtain predicted scores, residuals, leverage and Cook's D using Stata.

use http://www.philender.com/courses/data/hsbdemo, clear
  
regress science math
  
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  1,   198) =  130.81
       Model |  7760.55791     1  7760.55791           Prob > F      =  0.0000
    Residual |  11746.9421   198  59.3279904           R-squared     =  0.3978
-------------+------------------------------           Adj R-squared =  0.3948
       Total |    19507.50   199  98.0276382           Root MSE      =  7.7025

------------------------------------------------------------------------------
     science |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        math |     .66658   .0582822    11.44   0.000     .5516466    .7815135
       _cons |   16.75789   3.116229     5.38   0.000     10.61264    22.90315
------------------------------------------------------------------------------
  
generate pre1 = 16.75789 + .66658*math
  
predict pre2
  
list pre1 pre2 in 1/20
  
          pre1       pre2
  1.  44.08767   44.08768
  2.  52.08663   52.08664
  3.  52.75321   52.75322
  4.  48.08715   48.08715
  5.  54.75295   54.75296
  6.  50.75347   50.75348
  7.  44.75425   44.75426
  8.  46.75399   46.75399
  9.  52.75321   52.75322
 10.  51.42005   51.42006
 11.  50.75347   50.75348
 12.  50.75347   50.75348
 13.  64.08507   64.08508
 14.  54.75295   54.75296
 15.  50.08689   50.08689
 16.  45.42083   45.42084
 17.  50.75347   50.75348
 18.  56.75269    56.7527
 19.  58.08585   58.08586
 20.  54.75295   54.75296
  
corr pre1 pre2
(obs=200)
  
             |     pre1     pre2
-------------+------------------
        pre1 |   1.0000
        pre2 |   1.0000   1.0000
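
The small differences between pre1 and pre2 in the listing appear to come mostly from typing the coefficients with only a few decimal places. As a minimal sketch, assuming the hsbdemo data are still loaded, the predicted scores can instead be built from the stored coefficients _b[_cons] and _b[math], which keep full precision. The variable names pre_b and pre_p are illustrative and are not part of the original session.

```stata
* Assumes hsbdemo is already loaded; pre_b and pre_p are illustrative names.
quietly regress science math

* Predicted score built from the stored, full-precision coefficients
generate double pre_b = _b[_cons] + _b[math]*math

* Predicted score from predict, stored in double precision for comparison
predict double pre_p

* Stops with an error if the two versions disagree beyond rounding
assert reldif(pre_b, pre_p) < 1e-6

list pre_b pre_p in 1/5
```

Using _b[] instead of retyped numbers also means the command does not need to be edited if the model is refit on different data.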
  
generate res1 = science - pre1
  
predict res2, resid
  
list res1 res2 in 1/20
  
          res1       res2
  1.  2.912331   2.912324
  2.  10.91337   10.91336
  3.  5.246792   5.246784
  4.  4.912849   4.912844
  5. -1.752949  -1.752956
  6.  12.24653   12.24652
  7.   8.24575   8.245745
  8.  -7.75399  -7.753996
  9.  5.246792   5.246784
 10. -1.420052  -1.420056
 11.  2.246529   2.246524
 12.  12.24653   12.24652
 13. -3.085068  -3.085076
 14.  .2470512    .247044
 15. -19.08689  -19.08689
 16.   4.57917   4.579165
 17. -.7534714  -.7534758
 18.  1.247311   1.247304
 19.  -3.08585  -3.085856
 20. -1.752949  -1.752956
  
predict rsta, rsta
  
predict rstu, rstu
  
list res1 res2 rsta rstu in 1/20
  
          res1       res2       rsta       rstu
  1.  2.912331   2.912324   .3805392   .3797159
  2.  10.91337   10.91336   1.420427    1.42411
  3.  5.246792   5.246784   .6829278   .6820047
  4.  4.912849   4.912844    .640015   .6390582
  5. -1.752949  -1.752956  -.2282794  -.2277322
  6.  12.24653   12.24652   1.594062   1.600334
  7.   8.24575   8.245745   1.076735   1.077171
  8.  -7.75399  -7.753996  -1.010918  -1.010974
  9.  5.246792   5.246784   .6829278   .6820047
 10. -1.420052  -1.420056  -.1848286  -.1843772
 11.  2.246529   2.246524   .2924176   .2917413
 12.  12.24653   12.24652   1.594062   1.600334
 13. -3.085068  -3.085076  -.4054857  -.4046285
 14.  .2470512    .247044   .0321714   .0320901
 15. -19.08689  -19.08689  -2.484742  -2.518029
 16.   4.57917   4.579165   .5975997    .596627
 17. -.7534714  -.7534758  -.0980758  -.0978302
 18.  1.247311   1.247304   .1625953    .162195
 19.  -3.08585  -3.085856  -.4026527  -.4017991
 20. -1.752949  -1.752956  -.2282794  -.2277322
  
corr res1 res2 rsta rstu
(obs=200)

             |     res1     res2     rsta     rstu
-------------+------------------------------------
        res1 |   1.0000
        res2 |   1.0000   1.0000
        rsta |   1.0000   1.0000   1.0000
        rstu |   1.0000   1.0000   1.0000   1.0000
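
To make the two kinds of standardized residuals less of a black box, here is a sketch that rebuilds them from the usual textbook formulas. It assumes the hsbdemo data are still loaded and that Stata's rstandard and rstudent options follow these formulas; the assert lines check that assumption rather than take it for granted. All variable names ending in _s or _p are illustrative. As a spot check against the listing, observation 15 gives -2.484742 * sqrt(197/(198 - 2.484742^2)), which is about -2.518 and agrees with the listed rstu.

```stata
* Assumes hsbdemo is already loaded; names ending in _s and _p are illustrative.
quietly regress science math

predict double e_s, residuals       // raw residuals
predict double h_s, leverage        // leverage (hat values)
predict double rsta_p, rstandard    // Stata's standardized residuals
predict double rstu_p, rstudent     // Stata's studentized residuals

* Standardized residual: e / (s * sqrt(1 - h)), where s is the root MSE
generate double rsta_s = e_s / (e(rmse)*sqrt(1 - h_s))

* Studentized residual: same idea, but s is re-estimated with the
* observation left out; algebraically this equals
*   rsta * sqrt((df_r - 1) / (df_r - rsta^2)),  df_r = residual df
generate double rstu_s = rsta_s * sqrt((e(df_r) - 1)/(e(df_r) - rsta_s^2))

* Stop with an error if the hand-built versions disagree with predict
assert reldif(rsta_s, rsta_p) < 1e-6
assert reldif(rstu_s, rstu_p) < 1e-6

list rsta_s rsta_p rstu_s rstu_p in 1/5
```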
  
predict lev, leverage
  
predict d, cooksd
  
sort d
  
list rsta lev d in -20/l
  
          rsta        lev          d
181. -1.404299   .0114879    .011459
182. -1.662786   .0083463   .0116353
183. -1.576623    .009279   .0116406
184.   1.42586   .0127641    .013143
185. -2.137993   .0057607   .0132424
186.  2.158743   .0057607   .0135007
187.  -1.53488   .0114879   .0136892
188.  2.246003   .0060859   .0154442
189. -2.484742   .0054006   .0167619
190. -1.796041   .0114879   .0187439
191. -1.495091   .0167983   .0190953
192.  2.420326   .0066418   .0195839
193.  1.733267     .01566   .0238973
194. -1.799162   .0152117   .0250004
195. -1.971433   .0127641   .0251248
196. -1.414706   .0264486    .027186
197. -2.620104    .009279   .0321482
198.  2.298569   .0141548   .0379298
199. -2.453299   .0152117   .0464843
200.  3.403156    .022826   .1352672
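
Leverage and Cook's D can also be rebuilt by hand, which shows how a residual and a leverage value combine into one influence measure. The sketch below assumes the hsbdemo data are still loaded and uses the standard formulas for a regression with a single predictor; the names ending in _c and _p are illustrative. As a spot check against the listing, the last observation gives 3.403156^2 * .022826 / ((1 - .022826) * 2), which is about 0.1353 and agrees with the listed d.

```stata
* Assumes hsbdemo is already loaded; names ending in _c and _p are illustrative.
quietly regress science math

predict double e_c, residuals
predict double lev_p, leverage      // Stata's leverage, for comparison
predict double d_p, cooksd          // Stata's Cook's D, for comparison

* Leverage with one predictor:
*   1/n + (x - mean(x))^2 / sum of squared deviations of x
quietly summarize math
generate double lev_c = 1/r(N) + (math - r(mean))^2 / ((r(N) - 1)*r(Var))

* Standardized residual, needed for Cook's D
generate double rsta_c = e_c / (e(rmse)*sqrt(1 - lev_c))

* Cook's D: (rsta^2 / k) * lev / (1 - lev),
* where k = number of estimated coefficients (slope plus constant = 2 here)
generate double d_c = rsta_c^2 * lev_c / ((1 - lev_c)*(e(df_m) + 1))

* Stop with an error if the hand-built versions disagree with predict
assert reldif(lev_c, lev_p) < 1e-6
assert reldif(d_c, d_p) < 1e-6

list lev_c lev_p d_c d_p in 1/5
```

The formula makes the earlier definition concrete: Cook's D is large only when the squared standardized residual and the ratio lev/(1 - lev) are both sizable.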


Linear Statistical Models Course

Phil Ender, 12Jan03