Four Possibilities
use http://www.philender.com/courses/data/hsbdemo, clear
regress write read
Source | SS df MS Number of obs = 200
-------------+------------------------------ F( 1, 198) = 109.52
Model | 6367.42127 1 6367.42127 Prob > F = 0.0000
Residual | 11511.4537 198 58.1386552 R-squared = 0.3561
-------------+------------------------------ Adj R-squared = 0.3529
Total | 17878.875 199 89.843593 Root MSE = 7.6249
------------------------------------------------------------------------------
write | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
read | .5517051 .0527178 10.47 0.000 .4477445 .6556656
_cons | 23.95944 2.805744 8.54 0.000 18.42647 29.49242
------------------------------------------------------------------------------
twoway (scatter write read, msym(oh))(lfit write read), legend(off)
regress write read female
Source | SS df MS Number of obs = 200
-------------+------------------------------ F( 2, 197) = 77.21
Model | 7856.32118 2 3928.16059 Prob > F = 0.0000
Residual | 10022.5538 197 50.8759077 R-squared = 0.4394
-------------+------------------------------ Adj R-squared = 0.4337
Total | 17878.875 199 89.843593 Root MSE = 7.1327
------------------------------------------------------------------------------
write | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
read | .5658869 .0493849 11.46 0.000 .468496 .6632778
female | 5.486894 1.014261 5.41 0.000 3.48669 7.487098
_cons | 20.22837 2.713756 7.45 0.000 14.87663 25.58011
------------------------------------------------------------------------------
predict p2
sort female p2
scatter write p2 read, msym(oh i) con(. L) sort

generate fxr = female*read
regress write c.read##i.female
Source | SS df MS Number of obs = 200
-------------+------------------------------ F( 3, 196) = 52.31
Model | 7949.6163 3 2649.8721 Prob > F = 0.0000
Residual | 9929.2587 196 50.6594831 R-squared = 0.4446
-------------+------------------------------ Adj R-squared = 0.4361
Total | 17878.875 199 89.843593 Root MSE = 7.1175
------------------------------------------------------------------------------
write | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
read | .6360156 .0714073 8.91 0.000 .4951904 .7768408
1.female | 12.49063 5.259266 2.37 0.019 2.118614 22.86265
|
female#|
c.read |
1 | -.133902 .0986707 -1.36 0.176 -.3284945 .0606905
|
_cons | 16.52388 3.845114 4.30 0.000 8.940769 24.10699
------------------------------------------------------------------------------
twoway (scatter write read, msym(oh)) (lfit write read if female==0) ///
(lfit write read if female==1), legend(off)
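The coefficients of the interaction model map directly onto two separate regression lines, one per group. A quick sketch in Python (not part of the original Stata session; coefficients copied from the regress output above):

```python
# Coefficients from the c.read##i.female regression above.
b_read  = 0.6360156    # slope for the reference group (female==0)
b_fem   = 12.49063     # intercept shift for female==1
b_inter = -0.133902    # slope shift for female==1 (the interaction term)
b_cons  = 16.52388     # intercept for female==0

# The female==0 line uses the base coefficients...
male_int, male_slope = b_cons, b_read
# ...while the female==1 line adds both shift terms.
female_int, female_slope = b_cons + b_fem, b_read + b_inter

print(f"female==0: write = {male_int:.5f} + {male_slope:.7f}*read")
print(f"female==1: write = {female_int:.5f} + {female_slope:.7f}*read")
```

Because the interaction term is not significant (p = 0.176), the two slopes do not differ reliably, which motivates the common-slope model used in classical ANCOVA.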
Classical ANCOVA
Assumptions

Selecting a Covariate
Logic of ANCOVA

The ANCOVA model can be written,

    y_ij = mu + a_j + b*(x_ij - xbar) + e_ij

Which may be rewritten:

    y_ij - b*(x_ij - xbar) = mu + a_j + e_ij

That is, ANCOVA is equivalent to an analysis of variance performed on scores that have been adjusted for the covariate.

Homogeneity of Regression

Steps in ANCOVA
Numerical Example: coded using Effect Coding
input id y c1 c2 grp v1 v2 v3
1 6 1 6 1 1 0 0
2 9 1 7 1 1 0 0
3 8 2 15 1 1 0 0
4 8 3 13 1 1 0 0
5 12 3 18 1 1 0 0
6 12 4 9 1 1 0 0
7 10 4 16 1 1 0 0
8 8 5 10 1 1 0 0
9 12 5 16 1 1 0 0
10 13 6 18 1 1 0 0
11 13 4 12 2 0 1 0
12 16 4 12 2 0 1 0
13 15 5 17 2 0 1 0
14 16 6 9 2 0 1 0
15 19 6 20 2 0 1 0
16 17 8 18 2 0 1 0
17 19 8 16 2 0 1 0
18 23 9 20 2 0 1 0
19 19 10 10 2 0 1 0
20 22 10 17 2 0 1 0
21 20 7 8 3 0 0 1
22 22 7 14 3 0 0 1
23 24 9 11 3 0 0 1
24 26 9 11 3 0 0 1
25 24 10 16 3 0 0 1
26 25 11 20 3 0 0 1
27 28 11 19 3 0 0 1
28 27 12 19 3 0 0 1
29 29 13 12 3 0 0 1
30 26 13 16 3 0 0 1
31 27 7 16 4 -1 -1 -1
32 28 8 10 4 -1 -1 -1
33 25 8 13 4 -1 -1 -1
34 27 9 7 4 -1 -1 -1
35 31 9 15 4 -1 -1 -1
36 29 10 20 4 -1 -1 -1
37 32 10 16 4 -1 -1 -1
38 30 12 21 4 -1 -1 -1
39 32 12 15 4 -1 -1 -1
40 33 14 21 4 -1 -1 -1
end
/* using regress with factor variables */
regress y i.grp##c.c1
Source | SS df MS Number of obs = 40
-------------+------------------------------ F( 7, 32) = 109.27
Model | 2382.23359 7 340.319084 Prob > F = 0.0000
Residual | 99.6664092 32 3.11457529 R-squared = 0.9598
-------------+------------------------------ Adj R-squared = 0.9511
Total | 2481.9 39 63.6384615 Root MSE = 1.7648
------------------------------------------------------------------------------
y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
grp |
2 | 3.435985 2.272924 1.51 0.140 -1.19381 8.06578
3 | 7.650473 3.069013 2.49 0.018 1.399098 13.90185
4 | 13.34207 3.017006 4.42 0.000 7.196633 19.48752
|
c1 | .9015152 .3434768 2.62 0.013 .2018757 1.601155
|
grp#c.c1 |
2 | .2026515 .4276252 0.47 0.639 -.6683925 1.073696
3 | .1489436 .4352144 0.34 0.734 -.7375591 1.035446
4 | .0402098 .4365514 0.09 0.927 -.8490164 .929436
|
_cons | 6.734848 1.29432 5.20 0.000 4.098405 9.371292
------------------------------------------------------------------------------
testparm grp#c.c1
( 1) 2.grp#c.c1 = 0
( 2) 3.grp#c.c1 = 0
( 3) 4.grp#c.c1 = 0
F( 3, 32) = 0.11
Prob > F = 0.9550
/* using anova */
anova y i.grp##c.c1
Number of obs = 40 R-squared = 0.9598
Root MSE = 1.76482 Adj R-squared = 0.9511
Source | Partial SS df MS F Prob > F
-----------+----------------------------------------------------
Model | 2382.23359 7 340.319084 109.27 0.0000
|
grp | 70.1635717 3 23.3878572 7.51 0.0006
c1 | 152.279387 1 152.279387 48.89 0.0000
grp#c1 | 1.00618243 3 .335394144 0.11 0.9550
|
Residual | 99.6664092 32 3.11457529
-----------+----------------------------------------------------
Total | 2481.9 39 63.6384615
/* without interaction -- note: without a c. prefix anova treats
   c1 as categorical (13 df below); the regress that follows
   enters c1 as a continuous covariate */
anova y i.grp c1
Number of obs = 40 R-squared = 0.9681
Root MSE = 1.85482 Adj R-squared = 0.9459
Source | Partial SS df MS F Prob > F
-----------+----------------------------------------------------
Model | 2402.77139 16 150.173212 43.65 0.0000
|
grp | 270.804721 3 90.2682404 26.24 0.0000
c1 | 186.671388 13 14.3593375 4.17 0.0014
|
Residual | 79.1286121 23 3.44037444
-----------+----------------------------------------------------
Total | 2481.9 39 63.6384615
/* back to regress with factor variables */
regress y i.grp c1
Source | SS df MS Number of obs = 40
-------------+------------------------------ F( 4, 35) = 206.97
Model | 2381.22741 4 595.306852 Prob > F = 0.0000
Residual | 100.672592 35 2.87635976 R-squared = 0.9594
-------------+------------------------------ Adj R-squared = 0.9548
Total | 2481.9 39 63.6384615 Root MSE = 1.696
------------------------------------------------------------------------------
y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
grp |
2 | 4.453014 .8983061 4.96 0.000 2.629356 6.276673
3 | 8.411249 1.184014 7.10 0.000 6.007572 10.81493
4 | 13.01516 1.1535 11.28 0.000 10.67344 15.35689
|
c1 | 1.013052 .1337037 7.58 0.000 .7416185 1.284485
_cons | 6.355625 .703058 9.04 0.000 4.928341 7.782908
------------------------------------------------------------------------------
testparm i.grp
( 1) 2.grp = 0
( 2) 3.grp = 0
( 3) 4.grp = 0
F( 3, 35) = 48.19
Prob > F = 0.0000
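The homogeneity-of-regression F reported earlier by testparm and anova can be reproduced by comparing the residual sums of squares of the full (interaction) and reduced (common-slope) regressions above. A sketch in Python, with the SS and df hard-coded from those outputs:

```python
# Residual SS and df from the two regressions above.
ss_full, df_full = 99.6664092, 32     # regress y i.grp##c.c1
ss_red,  df_red  = 100.672592, 35     # regress y i.grp c1

# F = (change in residual SS per dropped df) / (full-model mean square error)
df_change = df_red - df_full                    # the 3 interaction terms
F = ((ss_red - ss_full) / df_change) / (ss_full / df_full)
print(f"F({df_change}, {df_full}) = {F:.2f}")   # 0.11, as reported by testparm
```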
/* compute original means */
table grp, contents(mean y)
----------+-----------
grp | mean(y)
----------+-----------
1 | 9.8
2 | 17.9
3 | 25.1
4 | 29.4
----------+-----------
/* compute adjusted means using margins */
/* margins will work with either regress or anova */
margins grp, asbalanced
Predictive margins Number of obs = 40
Model VCE : OLS
Expression : Linear prediction, predict()
------------------------------------------------------------------------------
| Delta-method
| Margin Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
grp |
1 | 14.08014 .7789391 18.08 0.000 12.55345 15.60684
2 | 18.53316 .5427882 34.14 0.000 17.46931 19.597
3 | 22.49139 .6373144 35.29 0.000 21.24228 23.74051
4 | 27.09531 .6165704 43.95 0.000 25.88685 28.30376
------------------------------------------------------------------------------
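These adjusted means can be verified by hand. A sketch in Python (not part of the original Stata session) using the 40 observations entered above: the common ANCOVA slope is the pooled within-group slope, and each adjusted mean moves the observed group mean to the grand mean of the covariate.

```python
# (c1, y) pairs by group, copied from the input data listing above.
data = {
    1: [(1,6),(1,9),(2,8),(3,8),(3,12),(4,12),(4,10),(5,8),(5,12),(6,13)],
    2: [(4,13),(4,16),(5,15),(6,16),(6,19),(8,17),(8,19),(9,23),(10,19),(10,22)],
    3: [(7,20),(7,22),(9,24),(9,26),(10,24),(11,25),(11,28),(12,27),(13,29),(13,26)],
    4: [(7,27),(8,28),(8,25),(9,27),(9,31),(10,29),(10,32),(12,30),(12,32),(14,33)],
}

def mean(v):
    return sum(v) / len(v)

# Pooled within-group slope: b_w = sum_j Sxy_j / sum_j Sxx_j
sxy = sxx = 0.0
for obs in data.values():
    xb, yb = mean([x for x, _ in obs]), mean([y for _, y in obs])
    sxy += sum((x - xb) * (y - yb) for x, y in obs)
    sxx += sum((x - xb) ** 2 for x, _ in obs)
b_w = sxy / sxx                       # ~1.013052, the c1 coefficient above

# Adjusted mean_j = ybar_j - b_w * (xbar_j - grand mean of c1)
grand_x = mean([x for obs in data.values() for x, _ in obs])
adjusted = {}
for g, obs in data.items():
    xb, yb = mean([x for x, _ in obs]), mean([y for _, y in obs])
    adjusted[g] = yb - b_w * (xb - grand_x)
    print(g, round(adjusted[g], 5))   # matches the margins output above
```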
Regression Equation

Separate Intercepts

Computing Adjusted Means

Multiple Covariates
Numerical Example
Same data as the example above, now adding a second covariate (c2), its interaction with group, and the corresponding interaction terms:
/* run regression */
regress y c.c1##grp c.c2##grp
Source | SS df MS Number of obs = 40
-------------+------------------------------ F( 11, 28) = 71.35
Model | 2396.40189 11 217.854717 Prob > F = 0.0000
Residual | 85.4981125 28 3.05350402 R-squared = 0.9656
-------------+------------------------------ Adj R-squared = 0.9520
Total | 2481.9 39 63.6384615 Root MSE = 1.7474
------------------------------------------------------------------------------
y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
c1 | .6228412 .4106184 1.52 0.141 -.2182726 1.463955
|
grp |
2 | 1.748814 3.135003 0.56 0.581 -4.672948 8.170576
3 | 9.113772 3.358697 2.71 0.011 2.233793 15.99375
4 | 14.68458 3.254535 4.51 0.000 8.017967 21.35119
|
grp#c.c1 |
2 | .3730242 .4858476 0.77 0.449 -.6221895 1.368238
3 | .4238656 .517841 0.82 0.420 -.6368836 1.484615
4 | .2512126 .5316271 0.47 0.640 -.8377761 1.340201
|
c2 | .1896132 .1565606 1.21 0.236 -.1310866 .5103131
|
grp#c.c2 |
2 | .0703098 .2157489 0.33 0.747 -.3716317 .5122513
3 | -.1858784 .2318612 -0.80 0.429 -.6608245 .2890676
4 | -.1372108 .2240572 -0.61 0.545 -.5961711 .3217495
|
_cons | 5.255291 1.770547 2.97 0.006 1.62849 8.882092
------------------------------------------------------------------------------
/* test homogeneity of regression slopes */
testparm grp#c.c1 grp#c.c2
( 1) 2.grp#c.c1 = 0
( 2) 3.grp#c.c1 = 0
( 3) 4.grp#c.c1 = 0
( 4) 2.grp#c.c2 = 0
( 5) 3.grp#c.c2 = 0
( 6) 4.grp#c.c2 = 0
F( 6, 28) = 0.43
Prob > F = 0.8553
/* using anova */
anova y c.c1##grp c.c2##grp
Number of obs = 40 R-squared = 0.9656
Root MSE = 1.74743 Adj R-squared = 0.9520
Source | Partial SS df MS F Prob > F
-----------+----------------------------------------------------
Model | 2396.40189 11 217.854717 71.35 0.0000
|
c1 | 85.0803048 1 85.0803048 27.86 0.0000
grp | 73.5988644 3 24.5329548 8.03 0.0005
grp#c1 | 2.40458727 3 .80152909 0.26 0.8518
c2 | 7.69362841 1 7.69362841 2.52 0.1237
grp#c2 | 5.14302215 3 1.71434072 0.56 0.6449
|
Residual | 85.4981125 28 3.05350402
-----------+----------------------------------------------------
Total | 2481.9 39 63.6384615
test grp#c.c1 grp#c.c2
Source | Partial SS df MS F Prob > F
--------------+----------------------------------------------------
grp#c1 grp#c2 | 7.80432184 6 1.30072031 0.43 0.8553
Residual | 85.4981125 28 3.05350402
/* anova without interaction */
anova y c.c1 c.c2 grp
Number of obs = 40 R-squared = 0.9624
Root MSE = 1.65656 Adj R-squared = 0.9569
Source | Partial SS df MS F Prob > F
-----------+----------------------------------------------------
Model | 2388.59757 5 477.719513 174.08 0.0000
|
c1 | 98.974038 1 98.974038 36.07 0.0000
c2 | 7.37015734 1 7.37015734 2.69 0.1105
grp | 420.189396 3 140.063132 51.04 0.0000
|
Residual | 93.3024343 34 2.74418925
-----------+----------------------------------------------------
Total | 2481.9 39 63.6384615
/* rerun as regression */
regress
Source | SS df MS Number of obs = 40
-------------+------------------------------ F( 5, 34) = 174.08
Model | 2388.59757 5 477.719513 Prob > F = 0.0000
Residual | 93.3024343 34 2.74418925 R-squared = 0.9624
-------------+------------------------------ Adj R-squared = 0.9569
Total | 2481.9 39 63.6384615 Root MSE = 1.6566
------------------------------------------------------------------------------
y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
c1 | .8952525 .1490706 6.01 0.000 .5923047 1.1982
c2 | .1199612 .0731997 1.64 0.110 -.0287985 .2687209
|
grp |
2 | 4.60118 .8820702 5.22 0.000 2.808598 6.393762
3 | 8.996353 1.210347 7.43 0.000 6.536631 11.45607
4 | 13.46896 1.160214 11.61 0.000 11.11112 15.8268
|
_cons | 5.220638 .9753057 5.35 0.000 3.238579 7.202698
------------------------------------------------------------------------------
/* compute original means again */
table grp, contents(mean y)
----------+-----------
grp | mean(y)
----------+-----------
1 | 9.8
2 | 17.9
3 | 25.1
4 | 29.4
----------+-----------
/* compute adjusted means using margins */
margins grp, asbalanced
Predictive margins Number of obs = 40
Expression : Linear prediction, predict()
------------------------------------------------------------------------------
| Delta-method
| Margin Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
grp |
1 | 13.78338 .7820854 17.62 0.000 12.25052 15.31624
2 | 18.38456 .537869 34.18 0.000 17.33035 19.43876
3 | 22.77973 .646886 35.21 0.000 21.51186 24.0476
4 | 27.25234 .6098128 44.69 0.000 26.05713 28.44755
------------------------------------------------------------------------------
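With two covariates the same logic applies, except the common slopes come from solving the pooled within-group normal equations (a 2x2 system here). A Python sketch (outside the original Stata session) using the raw data listed earlier:

```python
# (c1, c2, y) triples by group, copied from the input data listing.
data = {
    1: [(1,6,6),(1,7,9),(2,15,8),(3,13,8),(3,18,12),(4,9,12),(4,16,10),
        (5,10,8),(5,16,12),(6,18,13)],
    2: [(4,12,13),(4,12,16),(5,17,15),(6,9,16),(6,20,19),(8,18,17),(8,16,19),
        (9,20,23),(10,10,19),(10,17,22)],
    3: [(7,8,20),(7,14,22),(9,11,24),(9,11,26),(10,16,24),(11,20,25),(11,19,28),
        (12,19,27),(13,12,29),(13,16,26)],
    4: [(7,16,27),(8,10,28),(8,13,25),(9,7,27),(9,15,31),(10,20,29),(10,16,32),
        (12,21,30),(12,15,32),(14,21,33)],
}

def mean(v):
    return sum(v) / len(v)

# Pooled within-group SSCP; solving W b = w gives the common partial slopes.
W11 = W12 = W22 = w1 = w2 = 0.0
for obs in data.values():
    m1, m2, my = (mean([o[k] for o in obs]) for k in range(3))
    for x1, x2, y in obs:
        d1, d2, dy = x1 - m1, x2 - m2, y - my
        W11 += d1 * d1; W12 += d1 * d2; W22 += d2 * d2
        w1 += d1 * dy;  w2 += d2 * dy

det = W11 * W22 - W12 * W12
b1 = (w1 * W22 - W12 * w2) / det      # ~0.8952525, the c1 coefficient above
b2 = (W11 * w2 - W12 * w1) / det      # ~0.1199612, the c2 coefficient above

# Adjusted mean_j = ybar_j - b1*(x1bar_j - grand1) - b2*(x2bar_j - grand2)
g1 = mean([o[0] for obs in data.values() for o in obs])
g2 = mean([o[1] for obs in data.values() for o in obs])
adjusted = {}
for g, obs in data.items():
    m1, m2, my = (mean([o[k] for o in obs]) for k in range(3))
    adjusted[g] = my - b1 * (m1 - g1) - b2 * (m2 - g2)
    print(g, round(adjusted[g], 5))   # matches the margins output above
```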
Regression Equation

Separate Intercepts

Interpretational Problems
Specification Error
Extrapolation Errors
Differential Growth
Nonlinearity
Measurement Error
Stata Example
These data are from a 1996 study (Gregoire, Kumar, Everitt, Henderson & Studd; also in Rabe-Hesketh & Everitt, 1999) on the efficacy of estrogen patches in treating postpartum depression. Women were randomly assigned to either a placebo control group (group=0, n=27) or an estrogen patch group (group=1, n=34). Prior to the first treatment, all patients took the Edinburgh Postnatal Depression Scale (EPDS). EPDS data were collected monthly for six months once the treatment began, and an average depression score was computed for each subject. Higher scores on the EPDS indicate higher levels of depression.
use http://www.philender.com/courses/data/depress1, clear
describe
Contains data from depress1.dta
obs: 61
vars: 4 18 Feb 2000 11:21
size: 1,220 (99.8% of memory free)
-------------------------------------------------------------------------------
1. subj float %9.0g
2. dep float %9.0g post-treatment depression score
3. pre float %9.0g pre-treatment depression score
4. group float %14.0g gl treatment group
-------------------------------------------------------------------------------
codebook group
group --------------------------------------------------------- treatment group
type: numeric (float)
label: gl
range: [0,1] units: 1
unique values: 2 coded missing: 0 / 61
tabulation: Freq. Numeric Label
27 0 placebo patch
34 1 estrogen patch
summarize pre dep
Variable | Obs Mean Std. Dev. Min Max
---------+-----------------------------------------------------
pre | 61 21.04033 3.722975 15 28
dep | 61 12.41284 5.407777 2 26.5
ttest pre, by(group)
Two-sample t test with equal variances
------------------------------------------------------------------------------
Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval]
---------+--------------------------------------------------------------------
placebo | 27 20.77778 .7611158 3.954874 19.21328 22.34227
estrogen | 34 21.24882 .61301 3.574432 20.00165 22.496
---------+--------------------------------------------------------------------
combined | 61 21.04033 .476678 3.722975 20.08683 21.99383
---------+--------------------------------------------------------------------
diff | -.4710457 .9658499 -2.403707 1.461615
------------------------------------------------------------------------------
Degrees of freedom: 59
Ho: mean(placebo ) - mean(estrogen) = diff = 0
Ha: diff < 0 Ha: diff ~= 0 Ha: diff > 0
t = -0.4877 t = -0.4877 t = -0.4877
P < t = 0.3138 P > |t| = 0.6276 P > t = 0.6862
pwcorr dep pre, sig
| dep pre
-------------+------------------
dep | 1.0000
|
|
pre | 0.2920 1.0000
| 0.0224
|
/* a quick-and-dirty scatterplot */
plot dep pre
26.5 +
p | *
o |
s | *
t | *
- | *
t |
r | *
e | * * * * * *
a | * * *
t | * * * * * *
m | * *
e | * * *
n | * *
t | * * * * * *
| * *
d | * * * *
e | * * *
p | * * *
r | * * *
2 + * *
+----------------------------------------------------------------+
15 pre-treatment depression score 28
/* analysis without covariate */
regress dep i.group
Source | SS df MS Number of obs = 61
-------------+------------------------------ F( 1, 59) = 10.54
Model | 265.972224 1 265.972224 Prob > F = 0.0019
Residual | 1488.67078 59 25.2317081 R-squared = 0.1516
-------------+------------------------------ Adj R-squared = 0.1372
Total | 1754.643 60 29.2440501 Root MSE = 5.0231
------------------------------------------------------------------------------
dep | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
1.group | -4.20399 1.294842 -3.25 0.002 -6.794964 -1.613017
_cons | 14.75605 .9666994 15.26 0.000 12.82169 16.69041
------------------------------------------------------------------------------
margins group, asbalanced
Adjusted predictions Number of obs = 61
Model VCE : OLS
Expression : Linear prediction, predict()
------------------------------------------------------------------------------
| Delta-method
| Margin Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
group |
0 | 14.75605 .9666994 15.26 0.000 12.86135 16.65075
1 | 10.55206 .8614575 12.25 0.000 8.863633 12.24048
------------------------------------------------------------------------------
The ANCOVA
/* test for treatment by covariate interaction */
anova dep c.pre##group
Number of obs = 61 R-squared = 0.2541
Root MSE = 4.79187 Adj R-squared = 0.2148
Source | Partial SS df MS F Prob > F
-----------+----------------------------------------------------
Model | 445.80933 3 148.60311 6.47 0.0008
|
pre | 177.442758 1 177.442758 7.73 0.0074
group | 1.49167086 1 1.49167086 0.06 0.7997
group#pre | 3.19588284 1 3.19588284 0.14 0.7105
|
Residual | 1308.83367 57 22.9619943
-----------+----------------------------------------------------
Total | 1754.643 60 29.2440501
regress dep pre i.group
Source | SS df MS Number of obs = 61
-------------+------------------------------ F( 2, 58) = 9.78
Model | 442.613448 2 221.306724 Prob > F = 0.0002
Residual | 1312.02956 58 22.6211993 R-squared = 0.2523
-------------+------------------------------ Adj R-squared = 0.2265
Total | 1754.643 60 29.2440501 Root MSE = 4.7562
------------------------------------------------------------------------------
dep | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
pre | .4618001 .1652592 2.79 0.007 .1309977 .7926024
1.group | -4.421519 1.2285 -3.60 0.001 -6.880629 -1.96241
_cons | 5.16087 3.553626 1.45 0.152 -1.952484 12.27422
------------------------------------------------------------------------------
table group, contents(mean dep)
---------------+-----------
treatment |
group | mean(dep)
---------------+-----------
placebo patch | 14.75605
estrogen patch | 10.55206
---------------+-----------
margins group, asbalanced
Predictive margins Number of obs = 61
Model VCE : OLS
Expression : Linear prediction, predict()
------------------------------------------------------------------------------
| Delta-method
| Margin Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
group |
0 | 14.87729 .9163541 16.24 0.000 13.08127 16.67332
1 | 10.45578 .8164047 12.81 0.000 8.855652 12.0559
------------------------------------------------------------------------------
Interpretation
The interaction between the covariate (pre) and the treatment (group) was not significant, implying that we have homogeneity of regression slopes. In the final regression model, both the covariate and the treatment were statistically significant. Women with higher pretest depression scores remained higher after treatment: each point increase on the pretest was associated with about a .46 point increase in the predicted posttest score.
The effect of the estrogen patch was also significant. Women using the treatment patch had predicted depression scores almost 4.5 points lower than women using the placebo patch.
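As in the earlier numerical example, the adjusted means can be reproduced by hand from quantities already shown. A sketch in Python with the group means and common slope copied from the Stata output above:

```python
# Quantities copied from the output above.
b_pre   = 0.4618001                   # slope on pre from the ANCOVA regression
pre_bar = {0: 20.77778, 1: 21.24882}  # group means of pre (ttest output)
dep_bar = {0: 14.75605, 1: 10.55206}  # group means of dep (table output)
grand   = 21.04033                    # overall mean of pre (summarize output)

# Adjusted mean_j = depbar_j - b_pre * (prebar_j - grand mean of pre)
adj = {g: dep_bar[g] - b_pre * (pre_bar[g] - grand) for g in (0, 1)}
for g in (0, 1):
    print(g, round(adj[g], 5))        # close to margins: 14.87729 and 10.45578
```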
Another Stata Example
These data examine a reading instruction program called "reading recovery." Students were randomly assigned to two treatment groups: a control group that received standard reading instruction (treat = 0, n = 43) and a reading recovery group (treat = 1, n = 32).
Two pretests were administered at the beginning of the year. The first (pre1) consisted of dictation tasks, and the second (pre2) measured early literacy skills. After four months of remedial reading instruction, the students were administered a standardized test of reading skills (post).
We will begin by examining the variables and determining whether the treatment groups differ on the pretest measures.
use http://www.philender.com/courses/data/readexp, clear
describe
Contains data from readexp.dta
obs: 75
vars: 6 21 Dec 2000 21:29
size: 2,100 (99.8% of memory free)
-------------------------------------------------------------------------------
1. id float %9.0g
2. school float %9.0g
3. treat float %9.0g
4. pre1 float %9.0g
5. pre2 float %9.0g
6. post float %9.0g
-------------------------------------------------------------------------------
tabulate treat
treat | Freq. Percent Cum.
------------+-----------------------------------
0 | 43 57.33 57.33
1 | 32 42.67 100.00
------------+-----------------------------------
Total | 75 100.00
summarize pre1 pre2 post
Variable | Obs Mean Std. Dev. Min Max
---------+-----------------------------------------------------
pre1 | 75 8.44 7.571711 0 31
pre2 | 75 39.05333 18.32506 7 88
post | 75 33.26667 11.11528 12 64
corr pre1 pre2 post
(obs=75)
| pre1 pre2 post
---------+---------------------------
pre1 | 1.0000
pre2 | 0.6017 1.0000
post | 0.3202 0.5522 1.0000
stem pre1 if treat==0, lines(2)
Stem-and-leaf plot for pre1
0* | 00001222333344444
0. | 555667789
1* | 0134
1. | 568899
2* | 013
2. | 569
3* | 1
stem pre1 if treat==1, lines(2)
Stem-and-leaf plot for pre1
0* | 0001222223344
0. | 5556678889
1* | 000111
1. | 66
2* | 1
stem pre2 if treat==0, lines(1)
Stem-and-leaf plot for pre2
0* | 7
1* | 0668889
2* | 0013356899
3* | 0146789
4* | 13346
5* | 01225679
6* | 1366
7* |
8* | 8
stem pre2 if treat==1, lines(1)
Stem-and-leaf plot for pre2
0* | 9
1* | 367
2* | 12379
3* | 02358
4* | 001677
5* | 00123469
6* | 17
7* |
8* | 24
ttest pre1, by(treat)
Two-sample t test with equal variances
------------------------------------------------------------------------------
Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval]
---------+--------------------------------------------------------------------
0 | 43 9.883721 1.337014 8.767389 7.185517 12.58192
1 | 32 6.5 .9002688 5.092689 4.66389 8.33611
---------+--------------------------------------------------------------------
combined | 75 8.44 .8743059 7.571711 6.697908 10.18209
---------+--------------------------------------------------------------------
diff | 3.383721 1.735173 -.0744738 6.841916
------------------------------------------------------------------------------
Degrees of freedom: 73
Ho: mean(0) - mean(1) = diff = 0
Ha: diff < 0 Ha: diff ~= 0 Ha: diff > 0
t = 1.9501 t = 1.9501 t = 1.9501
P < t = 0.9725 P > |t| = 0.0550 P > t = 0.0275
ttest pre2, by(treat)
Two-sample t test with equal variances
------------------------------------------------------------------------------
Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval]
---------+--------------------------------------------------------------------
0 | 43 37.30233 2.769381 18.16005 31.71349 42.89116
1 | 32 41.40625 3.282671 18.56959 34.7112 48.1013
---------+--------------------------------------------------------------------
combined | 75 39.05333 2.115996 18.32506 34.83712 43.26955
---------+--------------------------------------------------------------------
diff | -4.103924 4.280596 -12.63514 4.427291
------------------------------------------------------------------------------
Degrees of freedom: 73
Ho: mean(0) - mean(1) = diff = 0
Ha: diff < 0 Ha: diff ~= 0 Ha: diff > 0
t = -0.9587 t = -0.9587 t = -0.9587
P < t = 0.1704 P > |t| = 0.3409 P > t = 0.8296
ranksum pre1, by(treat)
Two-sample Wilcoxon rank-sum (Mann-Whitney) test
treat | obs rank sum expected
---------+---------------------------------
0 | 43 1748 1634
1 | 32 1102 1216
---------+---------------------------------
combined | 75 2850 2850
unadjusted variance 8714.67
adjustment for ties -39.54
----------
adjusted variance 8675.12
Ho: pre1(treat==0) = pre1(treat==1)
z = 1.224
Prob > |z| = 0.2210
ranksum pre2, by(treat)
Two-sample Wilcoxon rank-sum (Mann-Whitney) test
treat | obs rank sum expected
---------+---------------------------------
0 | 43 1547.5 1634
1 | 32 1302.5 1216
---------+---------------------------------
combined | 75 2850 2850
unadjusted variance 8714.67
adjustment for ties -4.71
----------
adjusted variance 8709.96
Ho: pre2(treat==0) = pre2(treat==1)
z = -0.927
Prob > |z| = 0.3540
Interpretation
Because the pretest distributions show a good deal of skewness (particularly pre1) and some differences in shape between the groups, Mann-Whitney tests (the ranksum command) are preferable to Student's t tests for examining pretest differences between the groups.
Now, let's conduct the analysis of covariance.
/* without covariates */
regress post treat
Source | SS df MS Number of obs = 75
---------+------------------------------ F( 1, 73) = 8.39
Model | 942.050388 1 942.050388 Prob > F = 0.0050
Residual | 8200.61628 73 112.337209 R-squared = 0.1030
---------+------------------------------ Adj R-squared = 0.0908
Total | 9142.66667 74 123.54955 Root MSE = 10.599
------------------------------------------------------------------------------
post | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
treat | 7.165698 2.474476 2.896 0.005 2.234075 12.09732
_cons | 30.2093 1.616321 18.690 0.000 26.98798 33.43063
------------------------------------------------------------------------------
/* test treatment by slope interaction */
anova post c.pre1##treat c.pre2##treat
Number of obs = 75 R-squared = 0.3989
Root MSE = 8.92437 Adj R-squared = 0.3554
Source | Partial SS df MS F Prob > F
-----------+----------------------------------------------------
Model | 3647.21028 5 729.442056 9.16 0.0000
|
pre1 | .66873253 1 .66873253 0.01 0.9273
treat | 22.9971541 1 22.9971541 0.29 0.5928
treat#pre1 | 137.179391 1 137.179391 1.72 0.1937
pre2 | 1190.78252 1 1190.78252 14.95 0.0002
treat#pre2 | 144.906484 1 144.906484 1.82 0.1818
|
Residual | 5495.45639 69 79.6442955
-----------+----------------------------------------------------
Total | 9142.66667 74 123.54955
test treat#c.pre1 treat#c.pre2
Source | Partial SS df MS F Prob > F
----------------------+----------------------------------------------------
treat#pre1 treat#pre2 | 168.245679 2 84.1228394 1.06 0.3533
Residual | 5495.45639 69 79.6442955
/* the ancova */
regress post pre1 pre2 i.treat
Source | SS df MS Number of obs = 75
-------------+------------------------------ F( 3, 71) = 14.54
Model | 3478.9646 3 1159.65487 Prob > F = 0.0000
Residual | 5663.70207 71 79.7704516 R-squared = 0.3805
-------------+------------------------------ Adj R-squared = 0.3543
Total | 9142.66667 74 123.54955 Root MSE = 8.9314
------------------------------------------------------------------------------
post | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
pre1 | .1698471 .1843945 0.92 0.360 -.1978251 .5375193
pre2 | .2726797 .0747458 3.65 0.001 .1236409 .4217185
1.treat | 6.621356 2.253639 2.94 0.004 2.127727 11.11499
_cons | 18.35899 2.525568 7.27 0.000 13.32315 23.39483
------------------------------------------------------------------------------
regress post pre2 i.treat
Source | SS df MS Number of obs = 75
-------------+------------------------------ F( 2, 72) = 21.43
Model | 3411.28428 2 1705.64214 Prob > F = 0.0000
Residual | 5731.38239 72 79.6025332 R-squared = 0.3731
-------------+------------------------------ Adj R-squared = 0.3557
Total | 9142.66667 74 123.54955 Root MSE = 8.922
------------------------------------------------------------------------------
post | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
pre2 | .3172027 .0569533 5.57 0.000 .2036683 .4307371
1.treat | 5.863922 2.096051 2.80 0.007 1.68552 10.04232
_cons | 18.3769 2.522833 7.28 0.000 13.34773 23.40608
------------------------------------------------------------------------------
/* unadjusted means */
tabstat post, by(treat)
Summary for variables: post
by categories of: treat
treat | mean
---------+----------
0 | 30.2093
1 | 37.375
---------+----------
Total | 33.26667
--------------------
margins treat, asbalanced
Predictive margins Number of obs = 75
Model VCE : OLS
Expression : Linear prediction, predict()
------------------------------------------------------------------------------
| Delta-method
| Margin Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
treat |
0 | 30.76473 1.364246 22.55 0.000 28.09085 33.4386
1 | 36.62865 1.582889 23.14 0.000 33.52624 39.73105
------------------------------------------------------------------------------
Interpretation
The interactions between the covariates (pre1 & pre2) and the treatment (treat) were not significant, implying that we have homogeneity of regression slopes. In the regression model with no interactions, pre1 was not significant and was dropped from the analysis. Both the remaining covariate (pre2) and the treatment were statistically significant. Students with higher pretest scores tended to have higher posttest scores: each point increase on pre2 was associated with about a .32 point increase in the predicted posttest score.
The effect of reading recovery was also significant while controlling for the initial level of pre2. Students receiving the treatment had predicted posttest scores almost 5.9 points higher than students in the control group. The predicted difference without the covariate was approximately 7.2 points; including the covariate reduced the estimated treatment effect.
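The shrinkage of the treatment effect has a simple arithmetic explanation: with a common slope, the difference in adjusted means equals the raw difference in post means minus the slope times the difference in pre2 means (the treated group happened to start somewhat higher on pre2). A sketch in Python with the numbers copied from the output above:

```python
# Quantities copied from the output above.
raw_diff = 37.375 - 30.2093            # unadjusted difference in post means
b_pre2   = 0.3172027                   # common slope on pre2 (ANCOVA regression)
pre2_bar = {0: 37.30233, 1: 41.40625}  # group means of pre2 (ttest output)

# Adjusted difference = raw difference - slope * (difference in pre2 means)
adj_diff = raw_diff - b_pre2 * (pre2_bar[1] - pre2_bar[0])
print(round(adj_diff, 4))              # ~5.8639, the 1.treat coefficient above
```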
Linear Statistical Models Course
Phil Ender, 24sep10, 22Feb00