Applied Multivariate Analysis
Seppo Pynnonen
Department of Mathematics and Statistics, University of Vaasa, Finland
Spring 2017
Seppo Pynnonen Applied Multivariate Analysis
Discriminant analysis
Discriminant Analysis
1 Discriminant analysis
Background
General Setup for the Discriminant Analysis
Descriptive Discriminant Analysis
Number of Discriminant Functions
Background
Example 1
Consider the following data on financial ratios for solvent and bankrupt
companies.
Financial Ratios of Bankrupt and Solvent Companies, Altman (1968)
Source: Morrison (1990). Multivariate Statistical Methods,
3rd ed. McGraw-Hill
X1 = Working Capital / Total Assets
X2 = Retained Earnings / Total Assets
X3 = Earnings Before Interest and Taxes / Total Assets
X4 = Market Value of Equity / Total Value of Liabilities
X5 = Sales / Total Assets
Group: 1 = Bankrupt, 2 = Solvent
Group X1 X2 X3 X4 X5
1 36.7 -62.8 -89.5 54.1 1.7
1 24.0 3.3 -3.5 20.9 1.1
1 -61.6 -120.8 -103.2 24.7 2.5
1 -1.0 -18.1 -28.8 36.2 1.1
1 18.9 -3.8 -50.6 26.4 0.9
1 -57.2 -61.2 -56.6 11.0 1.7
1 3.0 -20.3 -17.4 8.0 1.0
1 -5.1 -194.5 -25.8 6.5 0.5
1 17.9 20.8 -4.3 22.6 1.0
1 5.4 -106.1 -22.9 23.8 1.5
1 23.0 -39.4 -35.7 69.1 1.2
1 -67.6 -164.1 -17.7 8.7 1.3
1 -185.1 -308.9 -65.8 35.7 0.8
1 13.5 7.2 -22.6 96.1 2.0
1 -5.7 -118.3 -34.2 21.7 1.5
1 72.4 -185.9 -280.0 12.5 6.7
1 17.0 -34.6 -19.4 35.5 3.4
1 -31.2 -27.9 6.3 7.0 1.3
1 14.1 -48.2 6.8 16.6 1.6
1 -60.6 -49.2 -17.2 7.2 0.3
1 26.2 -19.2 -36.7 90.4 0.8
1 7.0 -18.1 -6.5 16.5 0.9
1 53.1 -98.0 -20.8 26.6 1.7
1 -17.2 -129.0 -14.2 267.9 1.3
1 32.7 -4.0 -15.8 177.4 2.1
1 26.7 -8.7 -36.3 32.5 2.8
1 -7.7 -59.2 -12.8 21.3 2.1
1 18.0 -13.1 -17.6 14.6 0.9
1 2.0 -38.0 1.6 7.7 1.2
1 -35.3 -57.9 0.7 13.7 0.8
1 5.1 -8.8 -9.1 100.9 0.9
1 0.0 -64.7 -4.0 0.7 0.1
1 25.2 -11.4 4.8 7.0 0.9
2 35.2 43.0 16.4 99.1 1.3
2 38.8 47.0 16.0 126.5 1.9
2 14.0 -3.3 4.0 91.7 2.7
2 55.1 35.0 20.8 72.3 1.9
2 59.3 46.7 12.6 724.1 0.9
2 33.6 20.8 12.5 152.8 2.4
2 52.8 33.0 23.6 475.9 1.5
2 45.6 26.1 10.4 287.9 2.1
2 47.4 68.6 13.8 581.3 1.6
2 40.0 37.3 33.4 228.8 3.5
2 69.0 59.0 23.1 406.0 5.5
2 34.2 49.6 23.8 126.6 1.9
2 47.0 12.5 7.0 53.4 1.8
2 15.4 37.3 34.1 570.1 1.5
2 56.9 35.3 4.2 240.3 0.9
2 43.8 49.5 25.1 115.0 2.6
2 20.7 18.1 13.5 63.1 4.0
2 33.8 31.4 15.7 144.8 1.9
2 35.8 21.5 -14.4 90.0 1.0
2 24.4 8.5 5.8 149.1 1.5
2 48.9 40.6 5.8 82.0 1.8
2 49.9 34.6 26.4 310.0 1.8
2 54.8 19.9 26.7 239.9 2.3
2 39.0 17.4 12.6 60.5 1.3
2 53.0 54.7 14.6 771.7 1.7
2 20.1 53.5 20.6 307.5 1.1
2 53.7 35.6 26.4 289.5 2.0
2 46.1 39.4 30.5 700.0 1.9
2 48.3 53.1 7.1 164.4 1.9
2 46.7 39.8 13.8 229.1 1.2
2 60.3 59.5 7.0 226.6 2.0
2 17.9 16.3 20.4 105.6 1.0
2 24.7 21.7 -7.8 118.6 1.6
Relevant questions then are:
How do the companies in these two groups differ from each other?
Which ratios best discriminate the groups?
Are the ratios useful for predicting bankruptcies?
Partial answers can be obtained by examining the variables one at a
time.
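For example, sample statistics can be computed for each group; the group means and standard deviations are listed under Simple Statistics in the SAS output of Example 2.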
Some graphics may also be helpful. [Figure not reproduced]
A more complete use of the group separation information, however, is provided by discriminant analysis (DA).
General Setup for the Discriminant Analysis
Discriminant analysis is used for two purposes:
(1) describing major differences among the groups, and
(2) classifying subjects on the basis of measurements.
Descriptive Discriminant Analysis
The starting setup:
p variables
q mutually exclusive groups
The goal of the descriptive DA is:
Form k new variables such that
1 The new variables are uncorrelated.
2 The first new variable has the best discriminating power w.r.t. the given groups. The second new variable has the second best discriminating power and is uncorrelated with the first one, the third has the third best discriminating power and is uncorrelated with the previous ones, etc.
Remark 1
k ≤ min(p, q − 1). For example, if q = 2 then k = min(p, 1) = 1.
More precisely, suppose we have observations on random variables $x_1, \ldots, x_p$ from $q$ groups.
Then the $j$th discriminant function is defined as a linear combination of the original variables
$y_j = a_{j1} x_1 + \cdots + a_{jp} x_p$, (1)
such that $\mathrm{corr}[y_j, y_\ell] = 0$ for $j \neq \ell$, and $y_1$ has the best discriminating power, $y_2$ the second best, and so on.
Remark 2
In the basic case the assumption is that the groups differ only with respect to the means of the variables.
Consequently, the variances of the variables and the correlations between them
are assumed to be the same across the groups (the groups have similar covariance
structures).
The idea in deriving the discriminant functions is to divide the total variation into between-group and within-group variation
$T = B + W$, (2)
where $T$ denotes the total covariance matrix, $B$ the between-group covariance matrix, and $W$ the within-group covariance matrix.
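The decomposition (2) can be checked numerically. The following is a minimal sketch on hypothetical toy data (the numbers and the helper names `mean_vector`, `sscp`, and `add` are illustrative assumptions, not part of the course material). It works with sums of squares and cross-products (SSCP) matrices, for which the identity holds exactly; covariance matrices follow by dividing each SSCP matrix by its own degrees of freedom.

```python
# Hypothetical toy data: two variables observed in three groups.
groups = {
    "g1": [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0)],
    "g2": [(4.0, 6.0), (5.0, 4.0), (6.0, 5.0), (5.0, 5.0)],
    "g3": [(8.0, 1.0), (9.0, 3.0), (7.0, 2.0)],
}


def mean_vector(rows):
    """Column means of a list of observation tuples."""
    n = len(rows)
    return [sum(row[j] for row in rows) / n for j in range(len(rows[0]))]


def sscp(rows, center):
    """Sum of squares and cross-products of the rows around `center`."""
    p = len(center)
    out = [[0.0] * p for _ in range(p)]
    for row in rows:
        dev = [row[j] - center[j] for j in range(p)]
        for a in range(p):
            for b in range(p):
                out[a][b] += dev[a] * dev[b]
    return out


def add(x, y):
    """Element-wise sum of two square matrices."""
    return [[x[a][b] + y[a][b] for b in range(len(x))] for a in range(len(x))]


all_rows = [row for rows in groups.values() for row in rows]
grand_mean = mean_vector(all_rows)
p = len(grand_mean)

T = sscp(all_rows, grand_mean)          # total SSCP
W = [[0.0] * p for _ in range(p)]       # within-group SSCP
B = [[0.0] * p for _ in range(p)]       # between-group SSCP
for rows in groups.values():
    group_mean = mean_vector(rows)
    W = add(W, sscp(rows, group_mean))
    # n_i copies of the group mean give n_i * (m_i - m)(m_i - m)'.
    B = add(B, sscp([group_mean] * len(rows), grand_mean))

max_gap = max(abs(T[a][b] - (B[a][b] + W[a][b]))
              for a in range(p) for b in range(p))
print("largest |T - (B + W)| entry:", max_gap)
```

The printed gap should be zero up to floating-point rounding, whatever data are plugged in.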
Technically the problem again reduces to an eigenvalue problem.
In this case the eigenvalues are extracted from the matrix
$BW^{-1}$. (3)
The eigenvectors of $W^{-1}B$, which has the same eigenvalues, form the coefficients of the discriminant functions $y_j$, $j = 1, \ldots, k$, with $k = \min(q - 1, p)$.
The functions are called canonical discriminant functions.
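For two groups the eigenvalue problem has a closed-form solution, which makes it easy to check by hand. The sketch below uses hypothetical toy data with two variables (an assumption made so that a 2 x 2 inverse suffices and no linear algebra library is needed). It verifies that the vector $a = W^{-1}(m_1 - m_2)$ satisfies $W^{-1}Ba = \lambda a$, where $W$ and $B$ are SSCP matrices and $m_1$, $m_2$ are the group mean vectors.

```python
# Hypothetical toy data: two variables, two groups.
g1 = [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0), (2.0, 2.0)]
g2 = [(4.0, 6.0), (5.0, 4.0), (6.0, 5.0), (5.0, 7.0)]


def mean_vector(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def within_sscp(rows):
    """SSCP matrix of the rows around their own mean."""
    m = mean_vector(rows)
    out = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - m[0], r[1] - m[1]]
        for a in range(2):
            for b in range(2):
                out[a][b] += d[a] * d[b]
    return out


def matvec(mat, v):
    return [mat[0][0] * v[0] + mat[0][1] * v[1],
            mat[1][0] * v[0] + mat[1][1] * v[1]]


n1, n2 = len(g1), len(g2)
m1, m2 = mean_vector(g1), mean_vector(g2)
d = [m1[0] - m2[0], m1[1] - m2[1]]

w1, w2 = within_sscp(g1), within_sscp(g2)
W = [[w1[a][b] + w2[a][b] for b in range(2)] for a in range(2)]

# With two groups the between-group SSCP matrix has rank one.
c = n1 * n2 / (n1 + n2)
B = [[c * d[a] * d[b] for b in range(2)] for a in range(2)]

det = W[0][0] * W[1][1] - W[0][1] * W[1][0]
W_inv = [[W[1][1] / det, -W[0][1] / det],
         [-W[1][0] / det, W[0][0] / det]]

a_vec = matvec(W_inv, d)                          # discriminant coefficients
lam = c * (d[0] * a_vec[0] + d[1] * a_vec[1])     # the single non-zero eigenvalue

lhs = matvec(W_inv, matvec(B, a_vec))             # W^{-1} B a
rhs = [lam * a_vec[0], lam * a_vec[1]]            # lambda * a
print("eigenvalue:", lam)
print("W^-1 B a:", lhs, " lambda a:", rhs)
```

The coefficient vector is determined only up to scale; packages such as SAS rescale it so that the pooled within-class variance of the discriminant function equals one.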
Example 2
Consider the bankruptcy data. The analysis can be run with SAS proc candisc or in SPSS (Analyze
→ Classify → Discriminant). The SAS results are shown below.
Example: Discriminant analysis applied to bankrupt data
Canonical Discriminant Analysis
66 Observations 65 DF Total
5 Variables 64 DF Within Classes
2 Classes 1 DF Between Classes
Class Level Information
GROUP Frequency Weight Proportion
1 33 33.0000 0.500000
2 33 33.0000 0.500000
Canonical Discriminant Analysis Within-Class Covariance Matrices
GROUP = 1 DF = 32
Variable X1 X2 X3 X4 X5
X1 2104.5659 1834.1637 -266.4029 249.8980 18.0357
X2 1834.1637 5085.4767 1632.2018 177.7665 -15.6653
X3 -266.4029 1632.2018 2637.1822 168.3066 -46.6066
X4 249.8980 177.7665 168.3066 3018.2188 1.6108
X5 18.0357 -15.6653 -46.6066 1.6108 1.3509
GROUP = 2 DF = 32
Variable X1 X2 X3 X4 X5
X1 201.986 117.413 16.740 974.165 1.921
X2 117.413 272.496 52.076 1630.092 0.879
X3 16.740 52.076 118.108 814.591 2.762
X4 974.165 1630.092 814.591 42669.190 -14.529
X5 1.921 0.879 2.762 -14.529 0.865
Canonical Discriminant Analysis
Simple Statistics
Total-Sample
Variable N Mean Variance Std Dev
X1 66 19.28485 1632 40.39972
X2 66 -13.63485 5064 71.15836
X3 66 -8.23182 1920 43.81308
X4 66 147.35909 34186 184.89362
X5 66 1.72121 1.13924 1.06735
GROUP = 1
Variable N Mean Variance Std Dev
X1 33 -2.83030 2105 45.87555
X2 33 -62.51212 5085 71.31253
X3 33 -31.78182 2637 51.35350
X4 33 40.04545 3018 54.93832
X5 33 1.50303 1.35093 1.16229
GROUP = 2
Variable N Mean Variance Std Dev
X1 33 41.40000 201.98563 14.21216
X2 33 35.24242 272.49627 16.50746
X3 33 15.31818 118.10841 10.86777
X4 33 254.67273 42669 206.56522
X5 33 1.93939 0.86496 0.93003
Univariate Test Statistics
F Statistics, Num DF= 1 Den DF= 64
            Total      Pooled     Between                 RSQ/
Variable    STD        STD        STD        R-Squared    (1-RSQ)
X1          40.3997    33.9599    31.2755    0.304266     0.4373
X2          71.1584    51.7589    69.1229    0.479063     0.9196
X3          43.8131    37.1166    33.3047    0.293363     0.4152
X4          184.8936   151.1413   151.7644   0.342055     0.5199
X5          1.0673     1.0526     0.3086     0.042428     0.0443
Univariate Test Statistics
Variable F Pr > F
X1 27.9892 0.0001
X2 58.8555 0.0001
X3 26.5698 0.0001
X4 33.2726 0.0001
X5 2.8357 0.0971
Average R-Squared: Unweighted = 0.2922351
Weighted by Variance = 0.3546308
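The univariate F statistics and R-squared values above can be reproduced from the group means and variances alone. The following is a minimal sketch using the figures from the Simple Statistics output; the function name `univariate_f` is an illustrative assumption.

```python
# (mean group 1, mean group 2, variance group 1, variance group 2),
# copied from the Simple Statistics output.
summary = {
    "X1": (-2.83030, 41.40000, 2104.5659, 201.98563),
    "X2": (-62.51212, 35.24242, 5085.4767, 272.49627),
    "X3": (-31.78182, 15.31818, 2637.1822, 118.10841),
    "X4": (40.04545, 254.67273, 3018.2188, 42669.190),
    "X5": (1.50303, 1.93939, 1.35093, 0.86496),
}
n1 = n2 = 33  # observations per group


def univariate_f(m1, m2, v1, v2, n1, n2):
    """One-way ANOVA F statistic and R-squared for two groups."""
    grand = (n1 * m1 + n2 * m2) / (n1 + n2)
    ss_between = n1 * (m1 - grand) ** 2 + n2 * (m2 - grand) ** 2  # 1 df
    ss_within = (n1 - 1) * v1 + (n2 - 1) * v2                     # n1 + n2 - 2 df
    f_stat = ss_between / (ss_within / (n1 + n2 - 2))
    r_squared = ss_between / (ss_between + ss_within)
    return f_stat, r_squared


results = {name: univariate_f(*vals, n1, n2) for name, vals in summary.items()}
for name, (f_stat, r_squared) in results.items():
    print(f"{name}: F = {f_stat:.4f}, R-squared = {r_squared:.6f}")
```

Note that F equals RSQ/(1-RSQ) multiplied by the denominator degrees of freedom (64), which is why SAS prints that ratio next to R-squared.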
Multivariate Statistics and Exact F Statistics
S=1 M=1.5 N=29
Statistic Value F Num DF Den DF Pr > F
Wilks’ Lambda 0.369760775 20.4534 5 60 0.0001
Pillai’s Trace 0.630239225 20.4534 5 60 0.0001
Hotelling-Lawley Trace 1.704451275 20.4534 5 60 0.0001
Roy’s Greatest Root 1.704451275 20.4534 5 60 0.0001
Example: Discriminant analysis applied to bankrupt data
Canonical Discriminant Analysis
     Canonical      Adjusted Canonical   Approx Standard   Squared Canonical
     Correlation    Correlation          Error             Correlation
1    0.793876       0.781803             0.045863          0.630239
Eigenvalues of INV(E)*H
= CanRsq/(1-CanRsq)
Eigenvalue Difference Proportion Cumulative
1 1.7045 . 1.0000 1.0000
Test of H0: The canonical correlations in the
current row and all that follow are zero
Likelihood
Ratio Approx F Num DF Den DF Pr > F
1 0.36976078 20.4534 5 60 0.0001
NOTE: The F statistic is exact.
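All four multivariate statistics are functions of the eigenvalues. As a minimal sketch (the function name `multivariate_stats` is an illustrative assumption), the code below recomputes them from the single eigenvalue reported above, together with the exact F statistic that applies in the two-group case.

```python
def multivariate_stats(eigenvalues):
    """Wilks, Pillai, Hotelling-Lawley and Roy statistics from eigenvalues."""
    wilks = 1.0
    for lam in eigenvalues:
        wilks *= 1.0 / (1.0 + lam)
    pillai = sum(lam / (1.0 + lam) for lam in eigenvalues)
    hotelling_lawley = sum(eigenvalues)
    roy = max(eigenvalues)  # SAS reports the largest eigenvalue itself
    return wilks, pillai, hotelling_lawley, roy


# Eigenvalue from the bankruptcy output (one discriminant function).
wilks, pillai, hotelling_lawley, roy = multivariate_stats([1.704451275])

# Exact F for two groups: numerator df = p, denominator df = N - p - 1.
n_total, p = 66, 5
num_df = p
den_df = n_total - p - 1
f_exact = (1.0 - wilks) / wilks * den_df / num_df

print(f"Wilks' Lambda = {wilks:.9f}")
print(f"Pillai's Trace = {pillai:.9f}")
print(f"F = {f_exact:.4f} with ({num_df}, {den_df}) df")
```

With a single eigenvalue, Pillai's trace equals the squared canonical correlation and Wilks' lambda equals one minus it, which is why all four tests give the same F here.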
Total Canonical Structure
CAN1
X1 0.694823
X2 0.871854
X3 0.682260
X4 0.736708
X5 0.259462
Between Canonical Structure
CAN1
X1 1.000000
X2 1.000000
X3 1.000000
X4 1.000000
X5 1.000000
Pooled Within Canonical Structure
CAN1
X1 0.506539
X2 0.734533
X3 0.493528
X4 0.552283
X5 0.161231
Total-Sample Standardized Canonical Coefficients
CAN1
X1 0.1404518774
X2 0.6028563830
X3 0.6695203123
X4 0.5616859665
X5 0.5320432994
Pooled Within-Class Standardized Canonical Coefficients
CAN1
X1 0.1180635365
X2 0.4385036080
X3 0.5671902048
X4 0.4591503359
X5 0.5246858501
Raw Canonical Coefficients
CAN1
X1 0.0034765558
X2 0.0084720383
X3 0.0152812900
X4 0.0030378872
X5 0.4984713894
Class Means on Canonical Variables
GROUP CAN1
1 -1.285613175
2 1.285613175
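The raw coefficients can be used to score individual companies. The sketch below assumes that the canonical variable is centered at the total-sample means, which is consistent with the class means reported above (scoring the two group mean vectors reproduces roughly -1.2856 and 1.2856). The function name `canonical_score` is an illustrative assumption.

```python
# Raw canonical coefficients and total-sample means from the SAS output.
raw_coef = [0.0034765558, 0.0084720383, 0.0152812900, 0.0030378872, 0.4984713894]
total_mean = [19.28485, -13.63485, -8.23182, 147.35909, 1.72121]

# Group mean vectors (X1..X5) from the Simple Statistics output.
mean_bankrupt = [-2.83030, -62.51212, -31.78182, 40.04545, 1.50303]
mean_solvent = [41.40000, 35.24242, 15.31818, 254.67273, 1.93939]


def canonical_score(x):
    """CAN1 score: raw coefficients applied to mean-centered ratios."""
    return sum(a * (xj - mj) for a, xj, mj in zip(raw_coef, x, total_mean))


score_bankrupt_mean = canonical_score(mean_bankrupt)
score_solvent_mean = canonical_score(mean_solvent)

# First bankrupt company in the data table of Example 1.
first_company = [36.7, -62.8, -89.5, 54.1, 1.7]
score_first = canonical_score(first_company)

print(f"group 1 mean score: {score_bankrupt_mean:.4f}")
print(f"group 2 mean score: {score_solvent_mean:.4f}")
print(f"first company score: {score_first:.4f}")
```

The first company scores roughly -1.89, on the bankrupt side of zero, the midpoint of the two class means.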
The output includes several coefficient matrices.
The structure matrices give the correlations of the original variables with the discriminant function.
The most useful of these for interpretation purposes is the pooled within canonical structure.
In the case of multiple groups the between canonical structure may also give useful additional information.
This structure tells how the group means of the variables and the group means of the discriminant
functions are correlated. With only two groups these correlations are necessarily ±1, as in the output above.
The standardized coefficients are obtained by multiplying the raw coefficients by the standard deviations of the variables.
These coefficients tell the marginal effect of the (standardized) variable on the discriminant function.
Labeling the discriminant function is based on those variables having the largest correlations and the largest standardized coefficients.
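The relation between the raw and the standardized coefficients can be checked against the SAS output of Example 2: each raw coefficient multiplied by the corresponding standard deviation (total-sample or pooled within-class) reproduces the standardized coefficient. A minimal sketch, with all numbers copied from that output:

```python
variables = ["X1", "X2", "X3", "X4", "X5"]
raw_coef = [0.0034765558, 0.0084720383, 0.0152812900, 0.0030378872, 0.4984713894]

# Standard deviations from the Simple Statistics and Univariate Test tables.
total_std = [40.39972, 71.15836, 43.81308, 184.89362, 1.06735]
pooled_std = [33.9599, 51.7589, 37.1166, 151.1413, 1.0526]

# Standardized coefficients as printed by SAS, for comparison.
sas_total = [0.1404518774, 0.6028563830, 0.6695203123, 0.5616859665, 0.5320432994]
sas_pooled = [0.1180635365, 0.4385036080, 0.5671902048, 0.4591503359, 0.5246858501]

std_total = [a * s for a, s in zip(raw_coef, total_std)]
std_pooled = [a * s for a, s in zip(raw_coef, pooled_std)]

for name, t, p in zip(variables, std_total, std_pooled):
    print(f"{name}: total-sample {t:.6f}, pooled within-class {p:.6f}")
```

Small differences in the last decimals come from the rounded standard deviations in the printed tables.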
Example 3
From the pooled within canonical structure we observe:
X2 (Retained earnings / Total assets) has the highest correlation with the discriminant function.
X4 (Market value of equity / Total value of liabilities), X1
(Working capital / Total assets), and X3 (Earnings before interest and taxes / Total assets) have the next highest.
X5 (Sales / Total assets) has a small correlation, but it has a large standardized coefficient.
Summing up, profitability and a high market value are the
properties that protect a company from bankruptcy.
It should be noted that the basic assumption in discriminant analysis is that the variables are normally distributed in each of the groups, and that the covariance matrices are the same.
The former assumption is harder to test. The latter is easier (in SPSS select Box's M from the options).
If the covariance matrices are not the same, linear discriminant function analysis is invalid.
One should then move to quadratic discriminant function analysis.
This method, however, is intended for classification purposes.
Example 4
Testing for the equality of the population covariance matrices.
$H_0 : \Sigma_1 = \Sigma_2$, (4)
where $\Sigma_i$ is the population covariance matrix of population $i$ ($i = 1, 2$).
SPSS gives the result: Test Chi-Square Value = 186.18 with 15 degrees of freedom and p-value = 0.0001.
We observe that the null hypothesis is rejected, hence the analysis results
should be interpreted with caution.
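The test can be sketched directly from the within-class covariance matrices printed in Example 2. The code below computes Box's M with its usual chi-square approximation; that the software output quoted above uses exactly this variant is an assumption, and the helper `log_det` is illustrative. The degrees of freedom are $p(p+1)(q-1)/2 = 15$, and the statistic is compared with 24.996, the 5 % critical value of the chi-square distribution with 15 degrees of freedom.

```python
import math

# Within-class covariance matrices copied from the SAS output of Example 2.
S1 = [
    [2104.5659, 1834.1637, -266.4029, 249.8980, 18.0357],
    [1834.1637, 5085.4767, 1632.2018, 177.7665, -15.6653],
    [-266.4029, 1632.2018, 2637.1822, 168.3066, -46.6066],
    [249.8980, 177.7665, 168.3066, 3018.2188, 1.6108],
    [18.0357, -15.6653, -46.6066, 1.6108, 1.3509],
]
S2 = [
    [201.986, 117.413, 16.740, 974.165, 1.921],
    [117.413, 272.496, 52.076, 1630.092, 0.879],
    [16.740, 52.076, 118.108, 814.591, 2.762],
    [974.165, 1630.092, 814.591, 42669.190, -14.529],
    [1.921, 0.879, 2.762, -14.529, 0.865],
]


def log_det(matrix):
    """log|det| by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    total = 0.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            raise ValueError("singular matrix")
        a[col], a[pivot] = a[pivot], a[col]
        total += math.log(abs(a[col][col]))
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return total


p, q = 5, 2
df1, df2 = 33 - 1, 33 - 1           # within-group degrees of freedom
df_pooled = df1 + df2

pooled = [[(df1 * S1[i][j] + df2 * S2[i][j]) / df_pooled for j in range(p)]
          for i in range(p)]

box_m = df_pooled * log_det(pooled) - df1 * log_det(S1) - df2 * log_det(S2)
correction = ((2 * p * p + 3 * p - 1) / (6.0 * (p + 1) * (q - 1))
              * (1.0 / df1 + 1.0 / df2 - 1.0 / df_pooled))
chi_square = (1.0 - correction) * box_m
df = p * (p + 1) * (q - 1) // 2

print(f"chi-square = {chi_square:.2f} with {df} degrees of freedom")
print("reject H0 at the 5 % level:", chi_square > 24.996)
```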
Number of Discriminant Functions
In the case of multiple groups (q > 2) the question is: in how many dimensions do the groups differ?
In the case of two groups this is not a major problem, because the groups can differ in only one dimension.
Generally, however, there can be more discriminating dimensions if q > 2.
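The upper bound on the number of discriminating dimensions follows Remark 1, $k \le \min(p, q - 1)$. As a small sketch (the function name is an illustrative assumption), applied to the three data sets used in these notes:

```python
def max_discriminant_functions(p, q):
    """Upper bound min(p, q - 1) on the number of discriminant functions."""
    if p < 1 or q < 2:
        raise ValueError("need at least one variable and two groups")
    return min(p, q - 1)


# (number of variables p, number of groups q)
examples = {
    "bankruptcy data": (5, 2),
    "iris data": (4, 3),
    "four-group reorganization data": (5, 4),
}
bounds = {name: max_discriminant_functions(p, q) for name, (p, q) in examples.items()}
for name, k in bounds.items():
    print(f"{name}: at most {k} discriminant function(s)")
```

How many of these dimensions are statistically significant is a separate question, answered by the sequential tests in the output.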
Example 5
The following data are a classic example concerning three species of iris (Setosa, Versicolor, and Virginica).
The following measurements were made:
SL: Sepal length
SW: Sepal width
PL: Petal length
PW: Petal width
The data step below reads the measurements; the CANDISC procedure is then applied to them.
title;
data iris;
title 'Discriminant Analysis of Fisher (1936) Iris Data';
length species $ 10; /* long enough for 'VERSICOLOR' */
input sepallen sepalwid petallen petalwid spec_no @@;
if spec_no=1 then species='SETOSA';
if spec_no=2 then species='VERSICOLOR';
if spec_no=3 then species='VIRGINICA';
label sepallen='Sepal Length in mm.'
sepalwid='Sepal Width in mm.'
petallen='Petal Length in mm.'
petalwid='Petal Width in mm.';
datalines;
50 33 14 02 1 64 28 56 22 3 65 28 46 15 2 67 31 56 24 3
63 28 51 15 3 46 34 14 03 1 69 31 51 23 3 62 22 45 15 2
59 32 48 18 2 46 36 10 02 1 61 30 46 14 2 60 27 51 16 2
65 30 52 20 3 56 25 39 11 2 65 30 55 18 3 58 27 51 19 3
68 32 59 23 3 51 33 17 05 1 57 28 45 13 2 62 34 54 23 3
77 38 67 22 3 63 33 47 16 2 67 33 57 25 3 76 30 66 21 3
49 25 45 17 3 55 35 13 02 1 67 30 52 23 3 70 32 47 14 2
64 32 45 15 2 61 28 40 13 2 48 31 16 02 1 59 30 51 18 3
55 24 38 11 2 63 25 50 19 3 64 32 53 23 3 52 34 14 02 1
49 36 14 01 1 54 30 45 15 2 79 38 64 20 3 44 32 13 02 1
67 33 57 21 3 50 35 16 06 1 58 26 40 12 2 44 30 13 02 1
77 28 67 20 3 63 27 49 18 3 47 32 16 02 1 55 26 44 12 2
50 23 33 10 2 72 32 60 18 3 48 30 14 03 1 51 38 16 02 1
61 30 49 18 3 48 34 19 02 1 50 30 16 02 1 50 32 12 02 1
61 26 56 14 3 64 28 56 21 3 43 30 11 01 1 58 40 12 02 1
51 38 19 04 1 67 31 44 14 2 62 28 48 18 3 49 30 14 02 1
51 35 14 02 1 56 30 45 15 2 58 27 41 10 2 50 34 16 04 1
.
.
.
;
title 'Canonical Discriminant Analysis of IRIS data';
proc candisc data = iris;
class species;
var sepallen--petalwid;
run;
Which gives the results:
Canonical Discriminant Analysis of IRIS data
Canonical Discriminant Analysis
150 Observations 149 DF Total
4 Variables 147 DF Within Classes
3 Classes 2 DF Between Classes
Class Level Information
SPECIES Frequency Weight Proportion
SETOSA 50 50.0000 0.333333
VERSICOLOR 50 50.0000 0.333333
VIRGINICA 50 50.0000 0.333333
Canonical Discriminant Analysis
Multivariate Statistics and F Approximations
S=2 M=0.5 N=71
Statistic Value F Num DF Den DF Pr > F
Wilks’ Lambda 0.023438631 199.145 8 288 0.0001
Pillai’s Trace 1.191898825 53.4665 8 290 0.0001
Hotelling-Lawley Trace 32.47732024 580.532 8 286 0.0001
Roy’s Greatest Root 32.1919292 1166.96 4 145 0.0001
NOTE: F Statistic for Roy’s Greatest Root is an upper bound.
NOTE: F Statistic for Wilks’ Lambda is exact.
     Canonical      Adjusted Canonical   Approx Standard   Squared Canonical
     Correlation    Correlation          Error             Correlation
1    0.984821       0.984508             0.002468          0.969872
2    0.471197       0.461445             0.063734          0.222027
Eigenvalues of INV(E)*H
= CanRsq/(1-CanRsq)
Eigenvalue Difference Proportion Cumulative
1 32.1919 31.9065 0.9912 0.9912
2 0.2854 . 0.0088 1.0000
Test of H0: The canonical correlations in the
current row and all that follow are zero
Likelihood
Ratio Approx F Num DF Den DF Pr > F
1 0.02343863 199.1453 8 288 0.0001
2 0.77797337 13.7939 3 145 0.0001
Total Canonical Structure
CAN1 CAN2
SEPALLEN 0.791888 0.217593 Sepal Length in mm.
SEPALWID -0.530759 0.757989 Sepal Width in mm.
PETALLEN 0.984951 0.046037 Petal Length in mm.
PETALWID 0.972812 0.222902 Petal Width in mm.
Between Canonical Structure
CAN1 CAN2
SEPALLEN 0.991468 0.130348 Sepal Length in mm.
SEPALWID -0.825658 0.564171 Sepal Width in mm.
PETALLEN 0.999750 0.022358 Petal Length in mm.
PETALWID 0.994044 0.108977 Petal Width in mm.
Pooled Within Canonical Structure
CAN1 CAN2
SEPALLEN 0.222596 0.310812 Sepal Length in mm.
SEPALWID -0.119012 0.863681 Sepal Width in mm.
PETALLEN 0.706065 0.167701 Petal Length in mm.
PETALWID 0.633178 0.737242 Petal Width in mm.
Total-Sample Standardized Canonical Coefficients
CAN1 CAN2
SEPALLEN -0.686779533 0.019958173 Sepal Length in mm.
SEPALWID -0.668825075 0.943441829 Sepal Width in mm.
PETALLEN 3.885795047 -1.645118866 Petal Length in mm.
PETALWID 2.142238715 2.164135931 Petal Width in mm.
Pooled Within-Class Standardized Canonical Coefficients
CAN1 CAN2
SEPALLEN -.4269548486 0.0124075316 Sepal Length in mm.
SEPALWID -.5212416758 0.7352613085 Sepal Width in mm.
PETALLEN 0.9472572487 -.4010378190 Petal Length in mm.
PETALWID 0.5751607719 0.5810398645 Petal Width in mm.
Raw Canonical Coefficients
CAN1 CAN2
SEPALLEN -.0829377642 0.0024102149 Sepal Length in mm.
SEPALWID -.1534473068 0.2164521235 Sepal Width in mm.
PETALLEN 0.2201211656 -.0931921210 Petal Length in mm.
PETALWID 0.2810460309 0.2839187853 Petal Width in mm.
Class Means on Canonical Variables
SPECIES CAN1 CAN2
SETOSA -7.607599927 0.215133017
VERSICOLOR 1.825049490 -0.727899622
VIRGINICA 5.782550437 0.512766605
Wilks' lambda test indicates that there are two statistically significant discriminators at the five percent level.
In general, the hypotheses to be tested are analogous to those in factor analysis:
$H_0$: the number of discriminators $= m$
$H_1$: more are needed (5)
On the basis of the pooled within canonical structure, the first discriminator indicates that the species differ with respect to the overall size of the petals, and the second that the species also differ with respect to the width of the sepals and petals.
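The sequential likelihood ratio statistics in the iris output can be recomputed from the squared canonical correlations. The sketch below is a minimal illustration (the function name `sequential_wilks` is an assumption): the statistic in row m is the product of (1 - CanRsq) over row m and all rows that follow, and it tests whether those canonical correlations are all zero.

```python
# Squared canonical correlations from the iris output.
can_rsq = [0.969872, 0.222027]


def sequential_wilks(squared_correlations):
    """Likelihood ratio for 'rows m, m+1, ... are all zero', for each m."""
    ratios = []
    for m in range(len(squared_correlations)):
        value = 1.0
        for r in squared_correlations[m:]:
            value *= 1.0 - r
        ratios.append(value)
    return ratios


eigenvalues = [r / (1.0 - r) for r in can_rsq]       # CanRsq / (1 - CanRsq)
proportions = [lam / sum(eigenvalues) for lam in eigenvalues]
ratios = sequential_wilks(can_rsq)

# With one remaining root the F statistic is exact: num df 3, den df 145.
f_second = (1.0 - ratios[1]) / ratios[1] * 145 / 3

for m, (ratio, prop) in enumerate(zip(ratios, proportions), start=1):
    print(f"row {m}: likelihood ratio = {ratio:.8f}, proportion = {prop:.4f}")
print(f"F for row 2 = {f_second:.4f}")
```

The first discriminator accounts for about 99 % of the between-group separation, yet the second is still significant, which is why both are retained.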
Example 9.6: Bankruptcy risk and signal to reorganization of a company (Laitinen, Luoma, Pynnonen 1996, UV, Discussion Papers 200)
Thus we have four groups.
Sample statistics:
Table 7. Descriptive statistics of groups for estimation data.
            B1 (n=20)         B2 (n=20)         N3 (n=17)         N4 (n=23)        F for eq.
Variable    Mean    Std Dev   Mean    Std Dev   Mean    Std Dev   Mean    Std Dev  of means
ROI         -10.24  8.60      3.52    5.59      2.27    7.14      12.02   5.96     37.66***
TCF         -13.32  10.83     0.13    2.31      0.97    5.00      6.47    5.67     32.48***
QRA         0.58    0.39      0.57    0.55      1.14    0.70      0.85    0.42     4.95**
SCA         -0.61   20.22     -4.75   18.79     13.62   13.19     23.13   19.55    10.39***
DSR         1.09    0.55      0.69    0.25      0.88    0.34      0.57    0.28     7.62***
** = significant at level 0.01
*** = significant at level 0.001
Number of canonical discriminant functions:
The results indicate that the third canonical discriminant function is also
statistically significant.
Canonical structure and standardized coefficients:
Table 11. Canonical structure and standardized canonical coefficients (both pooled within).
            Canonical structure*        Standardized coefficient
Variable    CAN1     CAN2     CAN3      CAN1     CAN2     CAN3
ROI         0.702    0.036    0.004     0.717    0.013    -0.737
TCF         0.643    0.059    0.467     0.372    -0.458   0.983
QRA         0.101    0.513    0.653     -0.061   0.563    0.661
SCA         0.252    0.773    -0.168    0.169    0.946    -0.522
DSR         -0.306   0.203    0.149     -0.722   0.034    0.16
*Correlation coefficients between original variables and canonical variables.
Interpretation of the discriminant functions:
Group differences:
CAN1, the financial performance, shows that financial performance is the main characteristic differentiating healthy and bankrupt firms (as expected).
CAN2, the contrast between dynamic liquidity and static ratios, is the characteristic differentiating reorganizable non-bankrupt and reorganizable bankrupt firms.
CAN3, the contrast between liquidity and the other ratios, differentiates reorganizable
non-bankrupt firms and healthy firms. The distinction is probably due to
the fact that non-bankrupt firms may have cash reserves (high liquidity),
but do not use them profitably.