Linear Statistical Models: Regression

Data Transformation


Purpose of Transformations

  1. To linearize the regression model.
  2. To stabilize variance (reduce heterogeneity of variance, "heteroscedasticity").
  3. To normalize variables.
  • Some transformations serve more than one purpose. For example, a transformation that linearizes a relationship may also help to normalize the variable.

    Transformations May be Necessary Due to:

    Variables to be Transformed

    Major Drawbacks

    Log Transformation

    1. To linearize a regression model with a consistently increasing slope.

    2. To stabilize variance when the variance of the residuals increases markedly with increasing Y.

    3. To normalize Y when the distribution of the residuals is positively skewed.

    Stata Example

    
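    /* load the example data and plot the raw relationship */
    use http://www.philender.com/courses/data/lntrans, clear
    
    scatter y x, msym(oh) jitter(1)
    
    /* log-transform y; in Stata, log() is the natural logarithm */
    generate z = log(y)
    
    scatter z x, msym(oh) jitter(1)
    
    /* regress the transformed response on x */
    regress z x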
    
      Source |       SS       df       MS                  Number of obs =      50
    ---------+------------------------------               F(  1,    48) = 2916.35
       Model |  365.874096     1  365.874096               Prob > F      =  0.0000
    Residual |  6.02190025    48  .125456255               R-squared     =  0.9838
    ---------+------------------------------               Adj R-squared =  0.9835
       Total |  371.895996    49  7.58971421               Root MSE      =   .3542
    
    ------------------------------------------------------------------------------
           z |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
    ---------+--------------------------------------------------------------------
           x |   .9417895   .0174395     54.003   0.000        .906725     .976854
       _cons |    .906511   .1082093      8.377   0.000       .6889417     1.12408
    ------------------------------------------------------------------------------
    
    
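    /* fitted values on the log scale */
    predict p
    
    twoway (scatter z x, msym(oh) jitter(1))(line p x, sort)
    
    /* back-transform the fitted values to the original scale of y */
    generate p2 = exp(p)
    
    twoway (scatter y x, msym(oh) jitter(1))(line p2 x, sort)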
    
    
    
    /* now transform x instead of y */
    
    generate xt = exp(x)
    
    scatter y xt, msym(oh) jitter(1)
    
    
    
    regress y xt
    
      Source |       SS       df       MS                  Number of obs =      50
    ---------+------------------------------               F(  1,    48) =  650.09
       Model |  4.3685e+09     1  4.3685e+09               Prob > F      =  0.0000
    Residual |   322552812    48  6719850.24               R-squared     =  0.9312
    ---------+------------------------------               Adj R-squared =  0.9298
       Total |  4.6911e+09    49  95736235.2               Root MSE      =  2592.3
    
    ------------------------------------------------------------------------------
           y |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
    ---------+--------------------------------------------------------------------
          xt |   1.409637   .0552866     25.497   0.000       1.298476    1.520799
       _cons |   493.3881    414.134      1.191   0.239       -339.284     1326.06
    ------------------------------------------------------------------------------
    
    rvfplot, yline(0) msym(oh)
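
    Because z = log(y), a straight line a + b*x on the log scale corresponds to the exponential curve y = exp(a) * exp(b*x) on the original scale. The following is a minimal sketch, assuming the same lntrans data, that displays the back-transformed coefficients of the log model and draws its residual-versus-fitted plot for comparison with the plot from the xt model. The display labels are illustrative only.

    ```stata
    /* refit the log model so that _b[] refers to it */
    use http://www.philender.com/courses/data/lntrans, clear
    generate z = log(y)
    quietly regress z x

    /* back-transform the coefficients to the original scale of y */
    display "predicted y at x = 0:        " exp(_b[_cons])
    display "multiplier per unit of x:    " exp(_b[x])

    /* residuals versus fitted values for the log model */
    rvfplot, yline(0) msym(oh)
    ```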
    
    
    Square Root (SQRT) Transformation

    Used to stabilize variance when the variance is proportional to the mean of Y, especially when Y approximately follows a Poisson distribution (e.g., counts).
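
    A minimal simulation sketch of this variance-stabilizing effect follows. The group means (4 and 25), the seed, and the variable names mu, y, and sy are illustrative assumptions, not part of the course data, and rpoisson() requires a reasonably recent Stata release.

    ```stata
    /* simulate Poisson counts with two different means */
    clear
    set seed 12345
    set obs 1000
    generate mu = cond(_n <= 500, 4, 25)
    generate y  = rpoisson(mu)

    /* square root transformation */
    generate sy = sqrt(y)

    /* compare the mean and variance of y and sqrt(y) within each group */
    tabstat y sy, by(mu) statistics(mean variance)
    ```

    If the sketch behaves as intended, the variance of y should track the group mean, while the variance of sy should be of similar size in both groups.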

    Reciprocal Transformation

    To stabilize variance when it is proportional to the 4th power of the mean of Y, i.e., when the variance increases hugely above some threshold of Y. The purpose is to minimize the effect of large values of Y. Transformed large Ys will be close to zero, so large increases in Y result in only trivial decreases in the transformed value Y'.
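
    The compression of large values can be checked with simple arithmetic. In this small sketch the values 10, 20, 1000, and 2000 are arbitrary illustrations: doubling Y lowers Y' = 1/Y by 0.05 when Y is small but by only 0.0005 when Y is large.

    ```stata
    /* decrease in 1/Y when Y doubles from 10 to 20 */
    display 1/10 - 1/20

    /* decrease in 1/Y when Y doubles from 1000 to 2000 */
    display 1/1000 - 1/2000
    ```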

    Square Transformation

    1. Linearize when the relationship of Y to X is curvilinear downward, i.e., the slope decreases as X increases.

    2. Stabilize variance when it decreases with the mean of Y.

    3. Normalize Y when distribution of residuals is negatively skewed.
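
    A minimal sketch of the linearizing use, on simulated data rather than course data. The generating model y = sqrt(10 + 5*x + e), the seed, and the variable names are assumptions chosen so that the slope of Y on X flattens as X increases and Y squared is exactly linear in X.

    ```stata
    /* simulate a relationship whose slope decreases as x increases */
    clear
    set seed 2024
    set obs 200
    generate x = 10*runiform()
    generate y = sqrt(10 + 5*x + rnormal())

    /* square transformation of the response */
    generate ysq = y^2

    /* the transformed response should be close to linear in x */
    regress ysq x
    rvfplot, yline(0)
    ```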

    Arcsin-Root Transformation

    Poisson Distribution

  • Poisson Examples

    Binomial Distribution

    Negative Binomial Distribution

    An Example

    What to do if you can't figure out which transformation to use?

  • Ladder of powers: Applies each of the following transformations and tests the result for normality: Y^3, Y^2, Y, sqrt(Y), ln(Y), 1/sqrt(Y), 1/Y, 1/Y^2, 1/Y^3. In Stata the command is:
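
    A minimal sketch, assuming the built-in ladder command of current Stata releases and, for illustration, the variable y from the lntrans data used in the log example. The companion command gladder draws a histogram for each rung of the ladder.

    ```stata
    use http://www.philender.com/courses/data/lntrans, clear

    /* normality test for each transformation on the ladder */
    ladder y

    /* histograms of each transformation, for visual comparison */
    gladder y
    ```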

  • Box-Cox transformation: Finds the value u for the transformation (y^u - 1)/u that normalizes the transformed variable; as u approaches 0 the transformation becomes ln(y). The values being transformed must be strictly positive, that is, greater than zero. In Stata the command is:
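
    A hedged sketch using two commands available in current Stata releases, which may differ from the command the course originally used: bcskew0 searches for the power that gives the transformed variable zero skewness, and boxcox estimates the power by maximum likelihood within a regression. The lntrans data and the new variable name ybc are assumptions for illustration.

    ```stata
    use http://www.philender.com/courses/data/lntrans, clear

    /* power that drives the skewness of the transformed y to zero */
    bcskew0 ybc = y

    /* maximum-likelihood Box-Cox power for y in a regression on x */
    boxcox y x, model(lhsonly)
    ```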


    Linear Statistical Models Course

    Phil Ender, 18dec99