Discriminant analysis, also known as linear discriminant function analysis, combines aspects of multivariate analysis of varicance with the ability to classify observations into known categories. It is a useful adjunct in helping to interpret the results of manova. It provides an alternative to multinomial logistic regression that will function adequately with somewhat smaller sample sizes. Like so many multivariate techniques it does require that the dependent variables come from a multivariate normal distribution.
What does discriminant analysis do?
In the beginning...
Let Y be a linear combination of p dependent variables, such that

Total deviation SSCP
Let T be the Total deviation SSCP matrix, which is obtained from

Within Groups deviation SSCP
The within group sums of squares and cross products matrix is expressed as:

Between Groups deviation SSCP

where

which has n1 rows of means from group 1, n2 rows of means from groups 2, nk rows of means from group k, and where

However, it is easier to obtain B by taking T - W.
The between groups sums of squares for the linear combination is:

Discriminant Analysis
We wish to select the elements of v such that
is
a maximum.
This occurs when (B - λW)v = 0. Thus, discriminant analysis reduces to finding the eigenvalues and eigenvectors of W-1B which is often written E-1H.
Computational Note
W-1B is generally non-symmetric. Most matrix languages cannot compute the eigenvalues and vectors unless the matrix is symmetric. Therefore, you must perform some additional computations to obtain the eigenvalues and eigenvectors.
mat l = cholesky(W) /* Cholesky decomposition */ mat u = l' /* u equals l transpose */ mat a = inv(l)*B*inv(u) /* a is symmetric */ mat symeigen uv e = a /* e will have the eigenvalues of W-1B */ mat v = inv(u)*uv /* v will have the eigenvectors of W-1B */
Aside
The cholesky function performs the Cholesky decomposition of a matrix such that if L = cholesky(A) then L'L = A, where L is lower triangular matrix. The formula could also be be written LU = A where U is upper triangular. Matrix A must be symmetric and nonnegative definite.
Number of Dimensions
Trimming
Standardized Discriminant Weights
Standardized discriminant weights or coefficients are used when all the variables are in standard score from.
Standardized discriminant coefficients are one of the pieces used in the interpretation of the discriminant analysis.
wii is a diagonal matrix whose elements are the square roots of the diagonal elements of W.
Then let 
sv is the vector of standardized discriminant coefficients.
Structure Matrix
The structure matrix gives the correlations between the variables and the discriminant functions.
The structure matrix is another of the pieces used in the interpretation of the discriminant analysis.
Computing the structure matrix.
mat E = V'*W*V mat F = inv(sqrt(diag(E))) mat G = inv(sqet(diag(W))) mat A = G*W*V*F
A is the structure matrix.
Canonical Correlations

Wilks' Lambda

Using Canonical Correlations
Let t = 1 - Rc2
then,

2-Group Case
>Chi-square

with degrees of freedom:
let q = k-1 For Chisquare1 = pq For Chisquare2 = (p-1)*(q-1) For Chisquare3 = (p-2)*(q-2)
2-Group Example
There are new discriminant analyse procedures in Stata 10. We will be using the candisc or discrim lda command for these examples. candisc is a wrapper program for discrim lad which includes a number of commonly used options.
use http://www.philender.com/courses/data/honors, clear
describe
Contains data from http://www.gseis.ucla.edu/courses/data/honors.dta
obs: 200
vars: 7 14 Dec 2001 09:19
size: 6,400 (99.9% of memory free)
-------------------------------------------------------------------------------
storage display value
variable name type format label variable label
-------------------------------------------------------------------------------
id float %9.0g
female float %9.0g fl
ses float %9.0g sl
lang float %9.0g language test score
math float %9.0g math score
science float %9.0g science score
honors float %9.0g
candisc female lang, group(honors)
Canonical linear discriminant analysis
| | Like-
| Canon. Eigen- Variance | lihood
Fcn | Corr. value Prop. Cumul. | Ratio F df1 df2 Prob>F
----+---------------------------------+------------------------------------
1 | 0.5166 .364005 1.0000 1.0000 | 0.7331 35.854 2 197 0.0000 e
---------------------------------------------------------------------------
Ho: this and smaller canon. corr. are zero; e = exact F
Standardized canonical discriminant function coefficients
| function1
-------------+-----------
female | .3706778
lang | .9822288
Canonical structure
| function1
-------------+-----------
female | .2328554
lang | .9302168
Group means on canonical variables
honors | function1
-------------+-----------
0 | -.3604546
1 | .9997513
Resubstitution classification summary
+---------+
| Key |
|---------|
| Number |
| Percent |
+---------+
| Classified
True honors | 0 1 | Total
-------------+----------------+-------
0 | 114 33 | 147
| 77.55 22.45 | 100.00
| |
1 | 16 37 | 53
| 30.19 69.81 | 100.00
-------------+----------------+-------
Total | 130 70 | 200
| 65.00 35.00 | 100.00
| |
Priors | 0.5000 0.5000 |
/* display raw discriminant coefficients */
matrix list e(L_unstd)
e(L_unstd)[3,1]
function1
female .74791853
lang .10957945
_cons -6.1309503
predict pr, pr
predict dsc, dsc
twoway line pr dsc, sort
3-Group Example
use http://www.philender.com/courses/data/hsb2, clear
(highschool and beyond (200 cases))
/* equivalent to one-way anova */
candisc write, group(prog) notable nostruct
Canonical linear discriminant analysis
| | Like-
| Canon. Eigen- Variance | lihood
Fcn | Corr. value Prop. Cumul. | Ratio F df1 df2 Prob>F
----+---------------------------------+------------------------------------
1 | 0.4215 .215987 1.0000 1.0000 | 0.8224 21.275 2 197 0.0000 e
---------------------------------------------------------------------------
Ho: this and smaller canon. corr. are zero; e = exact F
Standardized canonical discriminant function coefficients
| function1
-------------+-----------
write | 1
Group means on canonical variables
prog | function1
-------------+-----------
general | -.1668754
academic | .4030641
vocation | -.6962467
/* compare with */
anova write prog
Number of obs = 200 R-squared = 0.1776
Root MSE = 8.63918 Adj R-squared = 0.1693
Source | Partial SS df MS F Prob > F
-----------+----------------------------------------------------
Model | 3175.69786 2 1587.84893 21.27 0.0000
|
prog | 3175.69786 2 1587.84893 21.27 0.0000
|
Residual | 14703.1771 197 74.635417
-----------+----------------------------------------------------
Total | 17878.875 199 89.843593
display sqrt(e(r2))
.42145333
/* equivalent to one-way manova */
candisc write read math, group(prog) nostruct
Canonical linear discriminant analysis
| | Like-
| Canon. Eigen- Variance | lihood
Fcn | Corr. value Prop. Cumul. | Ratio F df1 df2 Prob>F
----+---------------------------------+------------------------------------
1 | 0.5125 .356283 0.9874 0.9874 | 0.7340 10.87 6 390 0.0000 e
2 | 0.0672 .004543 0.0126 1.0000 | 0.9955 .44518 2 196 0.6414 e
---------------------------------------------------------------------------
Ho: this and smaller canon. corr. are zero; e = exact F
Standardized canonical discriminant function coefficients
| function1 function2
-------------+----------------------
write | .3310784 1.183414
read | .2728524 -.4097932
math | .5815538 -.655658
Group means on canonical variables
prog | function1 function2
-------------+----------------------
general | -.3120021 .1190423
academic | .5358515 -.0196809
vocation | -.8444861 -.0658081
Resubstitution classification summary
+---------+
| Key |
|---------|
| Number |
| Percent |
+---------+
| Classified
True prog | general academic vocation | Total
-------------+------------------------------+---------
general | 11 17 17 | 45
| 24.44 37.78 37.78 | 100.00
| |
academic | 18 68 19 | 105
| 17.14 64.76 18.10 | 100.00
| |
vocation | 14 7 29 | 50
| 28.00 14.00 58.00 | 100.00
-------------+------------------------------+---------
Total | 43 92 65 | 200
| 21.50 46.00 32.50 | 100.00
| |
Priors | 0.3333 0.3333 0.3333 |
/* display manova table */
estat manova
Number of obs = 200
W = Wilks' lambda L = Lawley-Hotelling trace
P = Pillai's trace R = Roy's largest root
Source | Statistic df F(df1, df2) = F Prob>F
-----------+--------------------------------------------------
prog | W 0.7340 2 6.0 390.0 10.87 0.0000 e
| P 0.2672 6.0 392.0 10.08 0.0000 a
| L 0.3608 6.0 388.0 11.67 0.0000 a
| R 0.3563 3.0 196.0 23.28 0.0000 u
|--------------------------------------------------
Residual | 197
-----------+--------------------------------------------------
Total | 199
--------------------------------------------------------------
e = exact, a = approximate, u = upper bound on F
label define sel 1 "G" 2 "A" 3 "V", modify
scoreplot, msymbol(i)
/* compare discriminant analysis with manova */
manova write read math = prog
Number of obs = 200
W = Wilks' lambda L = Lawley-Hotelling trace
P = Pillai's trace R = Roy's largest root
Source | Statistic df F(df1, df2) = F Prob>F
-----------+--------------------------------------------------
prog | W 0.7340 2 6.0 390.0 10.87 0.0000 e
| P 0.2672 6.0 392.0 10.08 0.0000 a
| L 0.3608 6.0 388.0 11.67 0.0000 a
| R 0.3563 3.0 196.0 23.28 0.0000 u
|--------------------------------------------------
Residual | 197
-----------+--------------------------------------------------
Total | 199
--------------------------------------------------------------
e = exact, a = approximate, u = upper bound on F
Fisher's Iris DataThis example makes use of the classic Iris data that R. A. Fisher used in developing the linear discriminant function.
use http://www.philender.com/courses/data/iris, clear
describe
Contains data from http://www.gseis.ucla.edu/courses/data/iris.dta
obs: 150
vars: 6 17 Feb 2000 14:30
size: 4,200 (99.9% of memory free)
-------------------------------------------------------------------------------
storage display value
variable name type format label variable label
-------------------------------------------------------------------------------
case float %9.0g
type float %10.0g tl type of iris
sl float %9.0g sepal length
sw float %9.0g sepal width
pl float %9.0g petal length
pw float %9.0g petal width
-------------------------------------------------------------------------------
Sorted by: type
tabulate type
type of |
iris | Freq. Percent Cum.
------------+-----------------------------------
setosa | 50 33.33 33.33
versicolor | 50 33.33 66.67
virginica | 50 33.33 100.00
------------+-----------------------------------
Total | 150 100.00
candisc sl sw pl pw, group(type)
Canonical linear discriminant analysis
| | Like-
| Canon. Eigen- Variance | lihood
Fcn | Corr. value Prop. Cumul. | Ratio F df1 df2 Prob>F
----+---------------------------------+------------------------------------
1 | 0.9848 32.1919 0.9912 0.9912 | 0.0234 199.15 8 288 0.0000 e
2 | 0.4712 .285391 0.0088 1.0000 | 0.7780 13.794 3 145 0.0000 e
---------------------------------------------------------------------------
Ho: this and smaller canon. corr. are zero; e = exact F
Standardized canonical discriminant function coefficients
| function1 function2
-------------+----------------------
sl | -.4269549 -.0124077
sw | -.5212416 -.7352612
pl | .9472573 .4010379
pw | .5751607 -.5810398
Canonical structure
| function1 function2
-------------+----------------------
sl | .2225959 -.3108118
sw | -.1190115 -.8636809
pl | .7060654 -.1677014
pw | .6331779 -.7372421
Group means on canonical variables
type | function1 function2
-------------+----------------------
setosa | -7.6076 -.215133
versicolor | 1.825049 .7278996
virginica | 5.78255 -.5127666
Resubstitution classification summary
+---------+
| Key |
|---------|
| Number |
| Percent |
+---------+
| Classified
True type | setosa versicolor virginica | Total
-------------+------------------------------------+-----------
setosa | 50 0 0 | 50
| 100.00 0.00 0.00 | 100.00
| |
versicolor | 0 48 2 | 50
| 0.00 96.00 4.00 | 100.00
| |
virginica | 0 1 49 | 50
| 0.00 2.00 98.00 | 100.00
-------------+------------------------------------+-----------
Total | 50 49 51 | 150
| 33.33 32.67 34.00 | 100.00
| |
Priors | 0.3333 0.3333 0.3333 |
estat manova
Number of obs = 150
W = Wilks' lambda L = Lawley-Hotelling trace
P = Pillai's trace R = Roy's largest root
Source | Statistic df F(df1, df2) = F Prob>F
-----------+--------------------------------------------------
type | W 0.0234 2 8.0 288.0 199.15 0.0000 e
| P 1.1919 8.0 290.0 53.47 0.0000 a
| L 32.4773 8.0 286.0 580.53 0.0000 a
| R 32.1919 4.0 145.0 1166.96 0.0000 u
|--------------------------------------------------
Residual | 147
-----------+--------------------------------------------------
Total | 149
--------------------------------------------------------------
e = exact, a = approximate, u = upper bound on F
label define tl 1 "S" 2 "C" 3 "V", modify
scoreplot, msymbol(i)

One More Example
use http://www.philender.com/courses/data/hsb2
egen grp = group(prog female)
/* findit tablist */
tablist grp prog female, sort(v) nol clean
grp prog female Freq
1 1 0 21
2 1 1 24
3 2 0 47
4 2 1 58
5 3 0 23
6 3 1 27
candisc read write math science socst, group(grp)
Canonical linear discriminant analysis
| | Like-
| Canon. Eigen- Variance | lihood
Fcn | Corr. value Prop. Cumul. | Ratio F df1 df2 Prob>F
----+---------------------------------+------------------------------------
1 | 0.5782 .502254 0.6132 0.6132 | 0.4975 5.8498 25 707.3 0.0000 a
2 | 0.4363 .23516 0.2871 0.9003 | 0.7474 3.6511 16 584.2 0.0000 a
3 | 0.2197 .050705 0.0619 0.9622 | 0.9231 1.7353 9 467.4 0.0786 a
4 | 0.1728 .030785 0.0376 0.9997 | 0.9699 1.4843 4 386 0.2062 e
5 | 0.0144 .000207 0.0003 1.0000 | 0.9998 .04025 1 194 0.8412 e
---------------------------------------------------------------------------
Ho: this and smaller canon. corr. are zero; e = exact F, a = approximate F
Standardized canonical discriminant function coefficients
| function1 function2 function3 function4 function5
-------------+-------------------------------------------------------
read | -.1161169 -.4829687 .1886365 1.321818 .1538771
write | -.6602812 1.128027 -.212974 .1360569 .4061492
math | -.463602 -.5339929 .8257989 -.7658781 .3940945
science | .6107802 -.3927407 -1.125132 -.3845829 .2800101
socst | -.3563766 -.225085 -.2689487 -.3519051 -1.088045
Canonical structure
| function1 function2 function3 function4 function5
-------------+-------------------------------------------------------
read | -.5809872 -.5235904 -.2656326 .5296293 .1929744
write | -.8095514 .2094842 -.443327 .0312908 .3212865
math | -.6837577 -.5135297 .0156334 -.2991848 .4230919
science | -.2778489 -.4341668 -.7449201 -.1238876 .4050248
socst | -.7035231 -.2935881 -.3890976 -.0536654 -.5143777
Group means on canonical variables
grp | function1 function2 function3 function4 function5
-------------+-------------------------------------------------------
1 | .6640051 -.4587528 -.4351828 .2093051 -.0166562
2 | .1637273 .6219646 -.1688396 -.3557218 -.0124101
3 | -.3608076 -.4947248 -.0729075 -.1062484 .0169581
4 | -.7733276 .0895048 .1406045 .0935383 -.0099369
5 | 1.315385 -.3706178 .4039811 -.0541723 -.0050263
6 | .506798 .7885798 -.0307032 .1835678 .020094
Resubstitution classification summary
+---------+
| Key |
|---------|
| Number |
| Percent |
+---------+
| Classified
True grp | 1 2 3 4 5 6 | Total
-------------+------------------------------------------------+-------
1 | 5 2 3 1 7 3 | 21
| 23.81 9.52 14.29 4.76 33.33 14.29 | 100.00
| |
2 | 0 10 1 5 4 4 | 24
| 0.00 41.67 4.17 20.83 16.67 16.67 | 100.00
| |
3 | 5 6 17 13 3 3 | 47
| 10.64 12.77 36.17 27.66 6.38 6.38 | 100.00
| |
4 | 1 9 11 25 4 8 | 58
| 1.72 15.52 18.97 43.10 6.90 13.79 | 100.00
| |
5 | 5 4 1 1 12 0 | 23
| 21.74 17.39 4.35 4.35 52.17 0.00 | 100.00
| |
6 | 3 4 2 2 4 12 | 27
| 11.11 14.81 7.41 7.41 14.81 44.44 | 100.00
-------------+------------------------------------------------+-------
Total | 19 35 35 47 34 30 | 200
| 9.50 17.50 17.50 23.50 17.00 15.00 | 100.00
| |
Priors | 0.1667 0.1667 0.1667 0.1667 0.1667 0.1667 |
estat manova
Number of obs = 200
W = Wilks' lambda L = Lawley-Hotelling trace
P = Pillai's trace R = Roy's largest root
Source | Statistic df F(df1, df2) = F Prob>F
-----------+--------------------------------------------------
grp | W 0.4975 5 25.0 707.3 5.85 0.0000 a
| P 0.6031 25.0 970.0 5.32 0.0000 a
| L 0.8191 25.0 942.0 6.17 0.0000 a
| R 0.5023 5.0 194.0 19.49 0.0000 u
|--------------------------------------------------
Residual | 194
-----------+--------------------------------------------------
Total | 199
--------------------------------------------------------------
e = exact, a = approximate, u = upper bound on F
Multivariate Course Page
Phil Ender, 13jul07, 28oct05, 23apr05, 29Jan98