It's a multivariate world after all...
There are many situations in which you will have access to more than one outcome variable. In those situations you have three options:
The Manova Linear Model
Hypotheses
Assumptions
1. The sets of observations are independent of one another.
2. The variables within each group come from multivariate normal populations.
3. The variance-covariance matrices for each group are equal in the population.
Schematic with Example Data
              Level
Group 1    Group 2    Group 3
y11        y21        y31
...        ...        ...
y1n        y2n        y3n

y11        y21        y31
...        ...        ...
y1n        y2n        y3n

y11        y21        y31
...        ...        ...
y1n        y2n        y3n
Stata Computer Example
input y1 y2 y3 grp
19.6 5.15 9.5 1
15.4 5.75 9.1 1
22.3 4.35 3.3 1
24.3 7.55 5.0 1
22.5 8.50 6.0 1
20.5 10.25 5.0 1
14.1 5.95 18.8 1
13.0 6.30 16.5 1
14.1 5.45 8.9 1
16.7 3.75 6.0 1
16.8 5.10 7.4 1
17.1 9.00 7.5 2
15.7 5.30 8.5 2
14.9 9.85 6.0 2
19.7 3.60 2.9 2
17.2 4.05 0.2 2
16.0 4.40 2.6 2
12.8 7.15 7.0 2
13.6 7.25 3.2 2
14.2 5.30 6.2 2
13.1 3.10 5.5 2
16.5 2.40 6.6 2
16.0 4.55 2.9 3
12.5 2.65 0.7 3
18.5 6.50 5.3 3
19.2 4.85 8.3 3
12.0 8.75 9.0 3
13.0 5.20 10.3 3
11.9 4.75 8.5 3
12.0 5.85 9.5 3
19.8 2.85 2.3 3
16.5 6.55 3.3 3
17.4 6.60 1.9 3
end
sort grp
by grp: summarize y1 y2 y3
-> grp= 1

Variable |     Obs        Mean   Std. Dev.       Min        Max
---------+-----------------------------------------------------
      y1 |      11    18.11818   3.903797         13       24.3
      y2 |      11    6.190909   1.899713       3.75      10.25
      y3 |      11    8.681818   4.863089        3.3       18.8

-> grp= 2

Variable |     Obs        Mean   Std. Dev.       Min        Max
---------+-----------------------------------------------------
      y1 |      11    15.52727   2.075616       12.8       19.7
      y2 |      11    5.581818   2.434263        2.4       9.85
      y3 |      11    5.109091   2.531187         .2        8.5

-> grp= 3

Variable |     Obs        Mean   Std. Dev.       Min        Max
---------+-----------------------------------------------------
      y1 |      11    15.34545   3.138268       11.9       19.8
      y2 |      11    5.372727   1.759029       2.65       8.75
      y3 |      11    5.636364   3.546907         .7       10.3
manova y1 y2 y3 = grp
Number of obs = 33

W = Wilks' lambda     L = Lawley-Hotelling trace
P = Pillai's trace    R = Roy's largest root

    Source |  Statistic     df   F(df1,    df2) =   F   Prob>F
-----------+--------------------------------------------------
       grp | W   0.5258      2     6.0    56.0     3.54 0.0049 e
           | P   0.4767            6.0    58.0     3.02 0.0122 a
           | L   0.8972            6.0    54.0     4.04 0.0021 a
           | R   0.8920            3.0    29.0     8.62 0.0003 u
           |--------------------------------------------------
  Residual |                30
-----------+--------------------------------------------------
     Total |                32
--------------------------------------------------------------
           e = exact, a = approximate, u = upper bound on F
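As a rough check on how the F column relates to the four statistics, the following Python sketch converts each statistic to its F value using the standard conversions for s = 2. It is an illustration only and not part of the Stata session: the function names are hypothetical, s = 2 is assumed to be min(number of outcomes, hypothesis df) = min(3, 2), and the inputs are the rounded values printed above, so the results agree with the table only to within rounding.

```python
import math


def wilks_to_f(lam, df1, df2):
    """F for Wilks' lambda; this form is exact when s = 2."""
    root = math.sqrt(lam)
    return (1.0 - root) / root * df2 / df1


def pillai_to_f(pillai, s, df1, df2):
    """Approximate F for Pillai's trace."""
    return pillai / (s - pillai) * df2 / df1


def lawley_to_f(lawley, s, df1, df2):
    """Approximate F for the Lawley-Hotelling trace."""
    return lawley / s * df2 / df1


def roy_to_f(roy, df1, df2):
    """Upper bound on F for Roy's largest root (taken as an eigenvalue)."""
    return roy * df2 / df1


s = 2  # assumption: min(3 outcomes, 2 hypothesis df)

# Statistics and degrees of freedom copied from the manova output above.
f_values = {
    "W": wilks_to_f(0.5258, 6.0, 56.0),
    "P": pillai_to_f(0.4767, s, 6.0, 58.0),
    "L": lawley_to_f(0.8972, s, 6.0, 54.0),
    "R": roy_to_f(0.8920, 3.0, 29.0),
}

for name, f in f_values.items():
    print(f"{name}: F is approximately {f:.3f}")
```

Because the inputs are rounded to four decimals, a computed F can differ from the printed one in the second decimal place.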
forvalues i=1/3 {
display
display "anova for y`i'"
display
anova y`i' grp
}
anova for y1

              Number of obs =      33     R-squared     =  0.1526
              Root MSE      = 3.13031     Adj R-squared =  0.0961

    Source |  Partial SS    df       MS           F     Prob > F
-----------+----------------------------------------------------
     Model |  52.9242378     2  26.4621189       2.70     0.0835
           |
       grp |  52.9242378     2  26.4621189       2.70     0.0835
           |
  Residual |  293.965442    30  9.79884808
-----------+----------------------------------------------------
     Total |   346.88968    32  10.8403025
anova for y2

              Number of obs =      33     R-squared     =  0.0305
              Root MSE      = 2.05173     Adj R-squared = -0.0341

    Source |  Partial SS    df       MS           F     Prob > F
-----------+----------------------------------------------------
     Model |  3.97515121     2   1.9875756       0.47     0.6282
           |
       grp |  3.97515121     2   1.9875756       0.47     0.6282
           |
  Residual |  126.287277    30  4.20957589
-----------+----------------------------------------------------
     Total |  130.262428    32  4.07070087
anova for y3

              Number of obs =      33     R-squared     =  0.1610
              Root MSE      = 3.76993     Adj R-squared =  0.1051

    Source |  Partial SS    df       MS           F     Prob > F
-----------+----------------------------------------------------
     Model |  81.8296936     2  40.9148468       2.88     0.0718
           |
       grp |  81.8296936     2  40.9148468       2.88     0.0718
           |
  Residual |  426.370896    30  14.2123632
-----------+----------------------------------------------------
     Total |   508.20059    32  15.8812684

Interestingly, the multivariate test was significant (Wilks' lambda = 0.5258, p = 0.0049), but none of the univariate F-ratios were (p = 0.0835, 0.6282, and 0.0718 for y1, y2, and y3). A better tool for examining the multivariate effects is a set of simultaneous confidence intervals.
simulci y1 y2 y3, by(grp) cv(.31)
s=2  m=0  n=13  cv= .31

group variable: grp

pairwise                            simultaneous
comparison        difference    confidence intervals

dv: y1
grp 1 vs grp 2       2.591*        0.292     4.889
grp 1 vs grp 3       2.773*        0.474     5.071
grp 2 vs grp 3       0.182        -2.117     2.480

dv: y2
grp 1 vs grp 2       0.609        -0.897     2.116
grp 1 vs grp 3       0.818        -0.688     2.325
grp 2 vs grp 3       0.209        -1.297     1.716

dv: y3
grp 1 vs grp 2       3.573*        0.805     6.341
grp 1 vs grp 3       3.045*        0.277     5.814
grp 2 vs grp 3      -0.527        -3.295     2.241
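These results show that y1 and y3 differ significantly between groups 1 and 2 and between groups 1 and 3: the starred intervals exclude zero. None of the pairwise differences on y2 is significant, nor is any difference between groups 2 and 3.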
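A comparison is flagged when its simultaneous confidence interval excludes zero. The short Python sketch below applies that check to the nine intervals typed in from the simulci output above. The variable and function names are illustrative only and are not part of simulci.

```python
# (dv, comparison, difference, lower, upper), copied from the simulci output.
intervals = [
    ("y1", "1 vs 2", 2.591, 0.292, 4.889),
    ("y1", "1 vs 3", 2.773, 0.474, 5.071),
    ("y1", "2 vs 3", 0.182, -2.117, 2.480),
    ("y2", "1 vs 2", 0.609, -0.897, 2.116),
    ("y2", "1 vs 3", 0.818, -0.688, 2.325),
    ("y2", "2 vs 3", 0.209, -1.297, 1.716),
    ("y3", "1 vs 2", 3.573, 0.805, 6.341),
    ("y3", "1 vs 3", 3.045, 0.277, 5.814),
    ("y3", "2 vs 3", -0.527, -3.295, 2.241),
]


def excludes_zero(lower, upper):
    """True when the whole interval lies on one side of zero."""
    return lower > 0 or upper < 0


significant = [
    (dv, pair)
    for dv, pair, _difference, lower, upper in intervals
    if excludes_zero(lower, upper)
]

for dv, pair in significant:
    print(f"{dv}: grp {pair} differs significantly")
```

The four comparisons it reports are the same four that carry an asterisk in the output.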
Example Using HSB2
use http://www.philender.com/courses/data/hsb2, clear
manova read write math science = prog
Number of obs = 200

W = Wilks' lambda     L = Lawley-Hotelling trace
P = Pillai's trace    R = Roy's largest root

    Source |  Statistic     df   F(df1,    df2) =   F   Prob>F
-----------+--------------------------------------------------
      prog | W   0.6942      2     8.0   388.0     9.71 0.0000 e
           | P   0.3134            8.0   390.0     9.06 0.0000 a
           | L   0.4296            8.0   386.0    10.36 0.0000 a
           | R   0.4023            4.0   195.0    19.61 0.0000 u
           |--------------------------------------------------
  Residual |               197
-----------+--------------------------------------------------
     Total |               199
--------------------------------------------------------------
           e = exact, a = approximate, u = upper bound on F
simulci read write math science, by(prog) cv(.075)
s=2  m=.5  n=96  cv= .075

group variable: prog

pairwise                              simultaneous
comparison          difference    confidence intervals

dv: read
prog 1 vs prog 2      -6.406       -13.876     1.063
prog 1 vs prog 3       3.556        -3.914    11.025
prog 2 vs prog 3       9.962*        2.492    17.431

dv: write
prog 1 vs prog 2      -4.924       -11.829     1.982
prog 1 vs prog 3       4.573        -2.332    11.479
prog 2 vs prog 3       9.497*        2.592    16.403

dv: math
prog 1 vs prog 2      -6.711*      -13.319    -0.103
prog 1 vs prog 3       3.602        -3.006    10.210
prog 2 vs prog 3      10.313*        3.705    16.921

dv: science
prog 1 vs prog 2      -1.356        -9.000     6.289
prog 1 vs prog 3       5.224        -2.420    12.869
prog 2 vs prog 3       6.580        -1.065    14.225
Factorial Manova Example
use http://www.philender.com/courses/data/hsb2, clear
manova read math science = female prog female#prog
Number of obs = 200

W = Wilks' lambda     L = Lawley-Hotelling trace
P = Pillai's trace    R = Roy's largest root

     Source |  Statistic     df   F(df1,    df2) =   F   Prob>F
------------+--------------------------------------------------
      Model | W   0.6719      5    15.0   530.4     5.48 0.0000 a
            | P   0.3516           15.0   582.0     5.15 0.0000 a
            | L   0.4541           15.0   572.0     5.77 0.0000 a
            | R   0.3665            5.0   194.0    14.22 0.0000 u
            |--------------------------------------------------
   Residual |               194
------------+--------------------------------------------------
     female | W   0.9823      1     3.0   192.0     1.15 0.3283 e
            | P   0.0177            3.0   192.0     1.15 0.3283 e
            | L   0.0180            3.0   192.0     1.15 0.3283 e
            | R   0.0180            3.0   192.0     1.15 0.3283 e
            |--------------------------------------------------
       prog | W   0.7177      2     6.0   384.0    11.55 0.0000 e
            | P   0.2892            6.0   386.0    10.87 0.0000 a
            | L   0.3839            6.0   382.0    12.22 0.0000 a
            | R   0.3573            3.0   193.0    22.99 0.0000 u
            |--------------------------------------------------
female#prog | W   0.9586      2     6.0   384.0     1.37 0.2273 e
            | P   0.0416            6.0   386.0     1.37 0.2268 a
            | L   0.0429            6.0   382.0     1.36 0.2278 a
            | R   0.0353            3.0   193.0     2.27 0.0819 u
            |--------------------------------------------------
   Residual |               194
------------+--------------------------------------------------
      Total |               199
---------------------------------------------------------------
            e = exact, a = approximate, u = upper bound on F
Of the three effects, only the multivariate test of the prog main effect was statistically significant; neither the female main effect nor the female#prog interaction was.
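The four female rows share one exact F because that effect has a single hypothesis degree of freedom, so there is only one nonzero eigenvalue and all four statistics are functions of it. The Python sketch below illustrates this under two assumptions: that the R row reports the largest eigenvalue itself (which the equal L and R values for female suggest), and that the printed 0.0180 is close enough to the unrounded eigenvalue. The function name is hypothetical.

```python
def multivariate_statistics(eigenvalues):
    """Compute the four MANOVA test statistics from a list of eigenvalues."""
    wilks = 1.0
    pillai = 0.0
    for ev in eigenvalues:
        wilks /= 1.0 + ev          # product of 1 / (1 + eigenvalue)
        pillai += ev / (1.0 + ev)  # sum of eigenvalue / (1 + eigenvalue)
    return {
        "W": wilks,
        "P": pillai,
        "L": sum(eigenvalues),  # Lawley-Hotelling trace
        "R": max(eigenvalues),  # Roy's largest root
    }


# Single eigenvalue for the female effect, read from the L and R rows above.
female = multivariate_statistics([0.0180])

# With one eigenvalue, F = eigenvalue * df2 / df1, using df from the table.
f_female = female["R"] * 192.0 / 3.0

for name, value in female.items():
    print(f"{name} is approximately {value:.4f}")
print(f"F is approximately {f_female:.2f}")
```

With two or more nonzero eigenvalues, as for prog, the statistics no longer reduce to one another, which is why their F values differ.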
Linear Statistical Models Course
Phil Ender, 17sep00, 26apr00