Multiple Discriminant Analysis

Copyright © 2010 Pearson Education, Inc., publishing as Prentice-Hall.

Chapter 5: Multiple Discriminant Analysis


LEARNING OBJECTIVES

Upon completing this chapter, you should be able to do the following:

• State the circumstances under which a linear discriminant analysis should be used instead of multiple regression.

• Identify the major issues relating to types of variables used and sample size required in the application of discriminant analysis.

• Understand the assumptions underlying discriminant analysis in assessing its appropriateness for a particular problem.




LEARNING OBJECTIVES continued . . .

Upon completing this chapter, you should be able to do the following:

• Describe the two computation approaches for discriminant analysis and the method for assessing overall model fit.

• Explain what a classification matrix is and how to develop one, and describe the ways to evaluate the predictive accuracy of the discriminant function.

• Tell how to identify independent variables with discriminatory power.

• Justify the use of a split-sample approach for validation.




Multiple discriminant analysis (MDA) is an appropriate technique when the dependent variable is categorical (nominal or nonmetric) and the independent variables are metric. The single dependent variable can have two, three or more categories.

Discriminant Analysis Defined

Examples:
• Gender: Male vs. Female
• Heavy Users vs. Light Users
• Purchasers vs. Non-purchasers
• Good Credit Risk vs. Poor Credit Risk
• Member vs. Non-Member
• Low, medium, high
• Attorney, Physician or Professor


• MDA derives a linear combination of two (or more) independent variables that will discriminate between objects or groups defined a priori.

• A weight is calculated for each IV in the variate; the weighted combination is also known as the “discriminant function”.

• MDA derives the variate that best distinguishes between the a priori groups.

• MDA sets the variate’s weights to maximize between-group variance relative to within-group variance.



• For each observation we can obtain a discriminant Z score.

• The average Z score for a group gives its centroid.

• Classification is done using cutting scores, which are derived from the group centroids.

• The statistical significance of the discriminant function is assessed using the distance between the group centroids.

• Logistic regression (LR) is similar to two-group discriminant analysis.

Discriminant Function

Z = W1X1 + W2X2 + W3X3 + … + WnXn

Z = Discriminant score

Wi = Discriminant weight for independent variable i (i = 1, …, n)

Xi = Independent variable i
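The discriminant function can be illustrated with a minimal sketch. The weights below are hypothetical values chosen only for illustration; they are not estimates from any example in this chapter.

```python
def discriminant_score(weights, values):
    """Return Z = W1*X1 + W2*X2 + ... + Wn*Xn for one observation."""
    if len(weights) != len(values):
        raise ValueError("weights and values must have the same length")
    return sum(w * x for w, x in zip(weights, values))

# Hypothetical discriminant weights (assumed, not estimated from data).
weights = [0.5, 0.3, -0.1]
# Ratings of one respondent on three independent variables.
observation = [8, 9, 6]

z = discriminant_score(weights, observation)
print(round(z, 2))
```

Each observation gets one Z score per discriminant function; averaging the Z scores within a group gives that group's centroid.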


KitchenAid Survey Results for the Evaluation* of a New Consumer Product

Purchase Intention              Subject Number   X1 Durability   X2 Performance   X3 Style
Group 1: Would purchase               1                8                9              6
                                      2                6                7              5
                                      3               10                6              3
                                      4                9                4              4
                                      5                4                8              2
Group Mean                                            7.4              6.8            4.0
Group 2: Would not purchase           6                5                4              7
                                      7                3                7              2
                                      8                4                5              5
                                      9                2                4              3
                                     10                2                2              2
Group Mean                                            3.2              4.4            3.8
Difference between group means                        4.2              2.4            0.2

*Evaluations made on a 0 (very poor) to 10 (excellent) rating scale.
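The group means and their differences in the KitchenAid survey can be reproduced with a short sketch. The ratings are the ten respondents' scores on X1, X2 and X3; the variable and function names are chosen here for illustration.

```python
from statistics import mean

# Ratings (X1 durability, X2 performance, X3 style) for each respondent.
would_purchase = [(8, 9, 6), (6, 7, 5), (10, 6, 3), (9, 4, 4), (4, 8, 2)]
would_not_purchase = [(5, 4, 7), (3, 7, 2), (4, 5, 5), (2, 4, 3), (2, 2, 2)]

def group_means(rows):
    """Mean of each variable (column) across the respondents of one group."""
    return [mean(column) for column in zip(*rows)]

means_1 = group_means(would_purchase)
means_2 = group_means(would_not_purchase)
differences = [round(a - b, 1) for a, b in zip(means_1, means_2)]
print(means_1, means_2, differences)
```

The largest difference is on X1, which suggests durability separates the two groups best when each variable is looked at on its own.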

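Univariate Representation of Discriminant Z Scores

[Figure: two panels, each showing the distributions of discriminant Z scores for groups A and B along a Z axis.]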


Graphic Illustration of Two-Group Discriminant Analysis

[Figure: groups A and B plotted on axes X1 and X2, with their projections A’ and B’ onto the discriminant axis Z.]


Discriminant Analysis Decision Process

Stage 1: Objectives of Discriminant Analysis

Stage 2: Research Design for Discriminant Analysis

Stage 3: Assumptions of Discriminant Analysis

Stage 4: Estimation of the Discriminant Model and Assessing Overall Fit

Stage 5: Interpretation of the Results

Stage 6: Validation of the Results

Stage 1: Objectives of Discriminant Analysis

Discriminant analysis can address any of the following questions:

• Determining whether statistically significant differences exist between the average score profile on a set of variables for two (or more) defined groups.

• Determining which of the IVs account the most for the differences in the average score profiles of the two or more groups.

• Establishing procedures for classifying statistical units (individuals or objects) into groups on the basis of their scores on a set of IVs.

• Establishing the number and composition of the dimensions of discrimination between groups formed from the set of IVs.

Stage 1: Illustrative example

A company has two locations to serve customers:
• North America
• Outside North America

Management is interested in any differences in perceptions between the customers served by the two locations.

Two sets of variables can be identified:
• X6 to X18: customers' perceptions on thirteen characteristics
• X4: company location, with two categories (North America, outside North America)

Multiple discriminant analysis is to be used.

Objectives (Example)

Find any differences in customer perception that may occur between two geographical areas

Stage 2: Research Design for Discriminant Analysis

Selection of dependent and independent variables

Sample size (total and per variable)

Sample division for validation

Selecting Dependent and Independent Variables

How many categories in the dependent variable?

Converting metric variables
• Most Common Approach
  – Use the metric scale responses to develop nonmetric categories. For example, use a question asking the typical number of soft drinks consumed per day and develop a three-category variable of 0 drinks for non-users, 1 to 5 for light users, and more than 5 for heavy users.
• Polar Extremes Approach
  – Compares only the two extreme groups and excludes the middle group(s).
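Both conversion approaches can be sketched in a few lines. This is an illustration only: the responses are hypothetical, and it assumes a response of exactly 5 drinks is counted as a light user.

```python
def usage_category(drinks_per_day):
    """Map a metric response to a three-category nonmetric variable."""
    if drinks_per_day == 0:
        return "non-user"
    if drinks_per_day <= 5:  # assumption: exactly 5 counts as a light user
        return "light user"
    return "heavy user"

def polar_extremes(labeled_rows):
    """Keep only the two extreme groups and drop the middle group."""
    return [row for row in labeled_rows if row[1] in ("non-user", "heavy user")]

responses = [0, 2, 7, 5, 0, 9, 1]  # hypothetical drinks per day
labeled = [(d, usage_category(d)) for d in responses]
extremes = polar_extremes(labeled)
print(extremes)
```

The polar extremes approach trades sample size for sharper group differences, since every light user is excluded from the analysis.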

Sample Size

Overall sample size

Sample size per category

Division of the sample

Creating the subsamples

What if the overall sample is too small?

Rules of Thumb 5-1

Discriminant Analysis Design

The dependent variable must be nonmetric, representing groups of objects that are expected to differ on the independent variables.

Choose a dependent variable that:
• Best represents group differences of interest,
• Defines groups that are substantially different, and
• Minimizes the number of categories while still meeting the research objectives.

In converting a metric variable to a nonmetric scale for use as the dependent variable, consider using extreme groups to maximize the group differences.

Rules of Thumb 5-1 continued…

Independent variables must identify differences between at least two groups to be of any use in discriminant analysis

The sample size must be large enough to:

• Have 20 cases per independent variable, with a minimum recommended level of 5 observations per variable.

• Have at least one more observation per group than the number of independent variables, but striving for at least 20 cases per group.

• Have a large enough sample to divide it into an estimation and holdout sample, each meeting the above requirements.
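The sample-size rules of thumb can be turned into a simple checklist. The sketch below is one possible reading of the rules; the function name and the example design (groups of 45 and 55 with 8 independent variables) are hypothetical.

```python
def check_sample_size(group_sizes, n_independent_vars):
    """Compare a research design against the sample-size rules of thumb."""
    total = sum(group_sizes)
    smallest = min(group_sizes)
    cases_per_variable = total / n_independent_vars
    return {
        "cases_per_variable": cases_per_variable,
        "meets_minimum_5_per_variable": cases_per_variable >= 5,
        "meets_recommended_20_per_variable": cases_per_variable >= 20,
        "smallest_group_exceeds_variable_count": smallest > n_independent_vars,
        "smallest_group_has_20_cases": smallest >= 20,
    }

# Hypothetical design: two groups of 45 and 55 cases, 8 independent variables.
report = check_sample_size([45, 55], 8)
for rule, result in report.items():
    print(rule, result)
```

When the sample is split, the same check should be run separately on the estimation sample and the holdout sample.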

Rules of Thumb 5-1 continued…

Assess the equality of covariance matrices with the Box’s M test, but apply a conservative significance level of .01.

Examine the independent variables for univariate normality.

Multicollinearity among the independent variables can markedly reduce the estimated impact of independent variables in the derived discriminant function(s), particularly if a stepwise estimation process is used.

Stage 2: Research design (Example)

Three key issues

Selecting dependent and independent variables

Sample size

Division of the sample


Selecting dependent and independent variables

DV = X4, a categorical (nonmetric) variable with two groups

IVs = X6 to X18, thirteen (13) metric customer perceptions used to discriminate between the geographical areas

Important
• Dependent variable: nonmetric
• Independent variables: metric

Thirteen independent variables (all metric):
• X6 Product Quality
• X7 E-Commerce Activities/Website
• X8 Technical Support
• X9 Complaint Resolution
• X10 Advertising
• X11 Product Line
• X12 Sales Force Image
• X13 Competitive Pricing
• X14 Warranty & Claims
• X15 New Products
• X16 Ordering & Billing
• X17 Price Flexibility
• X18 Delivery Speed

Selecting Sample size

Overall sample: 100 observations

• The sample satisfies the minimum 5:1 ratio of observations to independent variables.
• If the total sample were not split, the ratio would increase to about 8:1 (100 observations for 13 variables), but validating the results is more important.

Important
• Overall sample size
• Sample size for categories
• Analysis sample and holdout sample (validation sample)

Sample for categories

The researcher decides how to divide the sample, provided each part keeps an adequate number of observations.

Analysis sample 60

Holdout sample 40


The analysis sample (60) divides into two groups of sizes 26 and 34, which satisfies the minimum requirement of 20 observations per group.
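A 60/40 division like the one in this example can be produced with a simple random split. The sketch below is an assumption about how such a split might be made; the slides do not state the procedure, and the seed value is arbitrary.

```python
import random

def split_sample(observations, n_analysis, seed=0):
    """Randomly divide observations into an analysis and a holdout sample."""
    shuffled = list(observations)
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    return shuffled[:n_analysis], shuffled[n_analysis:]

case_ids = range(1, 101)  # 100 observations, identified by case number
analysis, holdout = split_sample(case_ids, n_analysis=60)
print(len(analysis), len(holdout))
```

After splitting, the group sizes within the analysis sample should still be checked against the minimum of 20 observations per group.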

Stage 3: Assumptions of Discriminant Analysis

Key Assumptions
• Multivariate normality of the IVs.
• Equal variance and covariance for the groups.

Other Assumptions
• Minimal multicollinearity among IVs.
• Group sample sizes relatively equal.
• Linear relationships.
• Elimination of outliers.

Stage 3: Assumptions of the MDA (Example)

Normality and linearity have been tested for the variables and hold at an acceptable level.

Important
• Normality, multivariate normality
• Linearity
• Multicollinearity
• Equality of covariance matrices

(See Chapter 2, page 80 in the book.)

Multicollinearity has been tested for the variables and is at an acceptable level.

(See Chapter 4, page 211 in the book.)

Equality of covariance matrices

Box’s M test was applied and indicates differences in the covariance matrices between the two groups at the .011 significance level. Because .011 is above the conservative .01 threshold, the difference is treated as nonsignificant.

As all assumptions are met, no remedies such as transforming variables are needed.


Stage 4: Estimation of the Discriminant Model and Assessing Overall Fit

Selecting An Estimation Method . . .

1. Simultaneous Estimation – all independent variables are considered concurrently.

2. Stepwise Estimation – independent variables are entered into the discriminant function one at a time.


Estimating the Discriminant Function

The stepwise procedure begins with all independent variables not in the model, and selects variables for inclusion based on statistically significant differences across the groups (.05 or less required for entry).

Measures of statistical significance:

• For the discriminant functions: Wilks’ lambda, Hotelling’s trace, and Pillai’s criterion.

• For stepwise estimation: Mahalanobis D² and Rao’s V.
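As a rough illustration of screening variables for entry, the sketch below computes a univariate Wilks' lambda (within-group sum of squares divided by total sum of squares) and the matching F value for each variable in the KitchenAid ratings. This is a simplification: a real stepwise procedure would use a multivariate criterion such as Mahalanobis D² and re-evaluate the remaining variables after each entry.

```python
from statistics import mean

def wilks_lambda(groups):
    """Univariate Wilks' lambda: within-group SS divided by total SS."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return ss_within / ss_total

def univariate_f(groups):
    """F value implied by lambda for one variable and k groups."""
    lam = wilks_lambda(groups)
    n = sum(len(g) for g in groups)
    k = len(groups)
    return ((1 - lam) / lam) * (n - k) / (k - 1)

# KitchenAid ratings: (would purchase, would not purchase) for each variable.
data = {
    "X1": ([8, 6, 10, 9, 4], [5, 3, 4, 2, 2]),
    "X2": ([9, 7, 6, 4, 8], [4, 7, 5, 4, 2]),
    "X3": ([6, 5, 3, 4, 2], [7, 2, 5, 3, 2]),
}
lambdas = {name: wilks_lambda(groups) for name, groups in data.items()}
first_candidate = min(lambdas, key=lambdas.get)  # smallest lambda separates best
print(first_candidate, round(univariate_f(data[first_candidate]), 2))
```

A smaller lambda means a larger share of the variable's variance lies between the groups, so that variable is the strongest candidate for first entry.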


Assessing Overall Model Fit

• Calculating discriminant Z scores for each observation,

• Evaluating group differences on the discriminant Z scores, and

• Assessing group membership prediction accuracy.


Assessing Group Membership Prediction Accuracy

Major Considerations:
• The statistical and practical rationale for developing classification matrices,
• The cutting score determination,
• Construction of the classification matrices, and
• Standards for assessing classification accuracy.


Rules of Thumb 5–2

Model Estimation and Model Fit

• Although stepwise estimation may seem “optimal” by selecting the most parsimonious set of maximally discriminating variables, beware of the impact of multicollinearity on the assessment of each variable’s discriminatory power.

• Overall model fit assesses the statistical significance between groups on the discriminant Z score(s), but does not assess predictive accuracy.

• With more than two groups, do not confine your analysis to only the statistically significant discriminant function(s), but consider if nonsignificant functions (with significance levels of up to .3) add explanatory power.


Calculating the Optimum Cutting Score

Issues . . .

• Define the prior probabilities based either on the relative sample sizes of the observed groups or specified by the researcher (either assumed to be equal or with values set by the researcher), and

• Calculate the optimum cutting score value as a weighted average based on the assumed sizes of the groups (derived from the sample sizes).
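The two cutting-score calculations can be sketched as follows. The formulas are the commonly used ones for two groups: the simple midpoint of the centroids for equal group sizes, and (N_A × Z_B + N_B × Z_A) / (N_A + N_B) for unequal sizes. The centroids and group sizes in the example are hypothetical.

```python
def cutting_score(z_a, z_b, n_a=None, n_b=None):
    """Cutting score between the centroids of groups A and B.

    Without group sizes, returns the unweighted midpoint.
    With group sizes, returns the weighted score, which moves the
    cutoff toward the centroid of the smaller group.
    """
    if n_a is None or n_b is None:
        return (z_a + z_b) / 2
    return (n_a * z_b + n_b * z_a) / (n_a + n_b)

# Hypothetical centroids and group sizes.
unweighted = cutting_score(-1.0, 1.5)
weighted = cutting_score(-1.0, 1.5, n_a=60, n_b=40)
print(unweighted, weighted)
```

With equal group sizes the two results coincide, so the weighted form can be used in every case.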


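Optimal Cutting Score with Equal Sample Sizes

[Figure: Z-score distributions for Group A and Group B with centroids ZA and ZB; cases on one side of the cutting score are classified as A (nonpurchaser), cases on the other side as B (purchaser).]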


Optimal Cutting Score with Unequal Sample Sizes

[Figure: Z-score distributions for Group A and Group B with centroids ZA and ZB, comparing the optimal weighted cutting score with the unweighted cutting score.]


Establishing Standards of Comparison for the Hit Ratio

Group sizes determine standards based on:

• Equal Group Sizes

• Unequal Group Sizes – two criteria:

o Maximum Chance Criterion

o Proportional Chance Criterion
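Both chance criteria depend only on the group sizes. In the sketch below, the maximum chance criterion is the share of the largest group, and the proportional chance criterion is the sum of the squared group proportions; the function names are chosen here for illustration.

```python
def maximum_chance(group_sizes):
    """Hit ratio obtained by assigning every case to the largest group."""
    return max(group_sizes) / sum(group_sizes)

def proportional_chance(group_sizes):
    """Sum of squared group proportions (p^2 + (1 - p)^2 for two groups)."""
    total = sum(group_sizes)
    return sum((n / total) ** 2 for n in group_sizes)

sizes = [26, 34]  # group sizes of a 60-case analysis sample
print(round(maximum_chance(sizes), 3), round(proportional_chance(sizes), 3))
```

With equal group sizes both criteria reduce to 1 divided by the number of groups.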


Classification Matrix: HBAT’s New Consumer Product

                                  Predicted Group
Actual Group                Would Purchase   Would Not Purchase   Actual Total   Percent Correct Classification
(1) Would purchase                22                  3                25                   88%
(2) Would not purchase             5                 20                25                   80%
Predicted Total                   27                 23                50

Percent Correctly Classified (hit ratio) = 100 x [(22 + 20)/50] = 84%
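The hit ratio calculation can be checked with a short sketch that takes the classification matrix as a list of rows (actual groups) by columns (predicted groups). The function names are chosen here for illustration.

```python
def hit_ratio(matrix):
    """Share of all cases on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

def group_hit_ratios(matrix):
    """Share of correctly classified cases within each actual group."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]

# Rows: actual (would purchase, would not purchase); columns: predicted.
matrix = [[22, 3],
          [5, 20]]
overall = hit_ratio(matrix)
by_group = group_hit_ratios(matrix)
print(overall, by_group)
```

Reporting the ratio by group as well as overall shows whether one group is classified noticeably worse than the other.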



Rules of Thumb 5–3

Assessing Predictive Accuracy

• The classification matrix and hit ratio replace R² as the measure of model fit:
  – Assess the hit ratio both overall and by group.
  – If the analysis and holdout samples both exceed 100 cases and each group exceeds 20 cases, derive separate standards for each sample. If not, derive a single standard from the overall sample.

• Analyze the misclassified observations both graphically (territorial map) and empirically (Mahalanobis D²).


Rules of Thumb 5–3 Continued . . .

Assessing Predictive Accuracy

• There are multiple criteria for comparison to the hit ratio:

The maximum chance criterion for evaluating the hit ratio is the most conservative, giving the highest baseline value to exceed.

Be cautious in using the maximum chance criterion in situations with overall samples less than 100 and/or group sizes under 20.

The proportional chance criterion considers all groups in establishing the comparison standard and is the most popular.

The actual predictive accuracy (hit ratio) should exceed any criterion value by at least 25%.
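The 25% rule can be read as requiring the hit ratio to be at least 1.25 times the chosen chance criterion. The sketch below applies that reading, which is an interpretation of the rule rather than a quoted formula; the example values are for illustration.

```python
def exceeds_chance(hit_ratio, chance_value, margin=0.25):
    """True if the hit ratio is at least (1 + margin) times the chance value."""
    return hit_ratio >= (1 + margin) * chance_value

# A hit ratio of 84% against a 50% chance criterion (threshold 62.5%).
print(exceeds_chance(0.84, 0.50))
# A hit ratio of 60% against the same criterion falls short.
print(exceeds_chance(0.60, 0.50))
```

The same check can be repeated with the maximum chance criterion and the proportional chance criterion as `chance_value`.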

Stage 4: estimating discriminant model and assessing overall fit (example)

As the objective is to determine the discriminating capabilities of individual variables, the stepwise method is selected.

Identifying the variables with significant differences between groups

Important: There are two methods
• Simultaneous
• Stepwise

page no 377

X6, X11, X12, X13, and X17 have the larger differences in group means; look at Wilks’ lambda, the F value, significance, and minimum D².

X13 has the largest D² between groups and a significance less than .05, so it qualifies for first entry.

page no 378

page no 380

page no 381

.749² = .561 ≈ 56%

Calculate discriminant Z scores used in classification

Discriminant loadings from highest to lowest used for interpretation

Fisher’s linear discriminant function used for classification

page no 382

Assessing the predictive accuracy of discriminant function

Important
• Calculating the cutting score
• Procedure for classification
• Assessing the predictive accuracy

Predictive accuracy

page no 384

Cases with a discriminant score less than -.2997 are classified into group 0; cases with a greater score are classified into group 1.
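The classification rule with the cutting score of -.2997 can be written as a one-line function. The Z scores below are hypothetical, and assigning a score exactly equal to the cutting score to group 1 is an assumption, since the slides do not cover that case.

```python
CUTTING_SCORE = -0.2997

def classify(z, cutting_score=CUTTING_SCORE):
    """Assign group 0 below the cutting score, group 1 otherwise."""
    return 0 if z < cutting_score else 1

scores = [-1.2, -0.3, 0.4, 2.1]  # hypothetical discriminant Z scores
predicted = [classify(z) for z in scores]
print(predicted)
```

Comparing these predicted groups with the actual groups case by case yields the classification matrix.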

page no 385


page no 387


Stage 5: Interpretation of the Results

Three Methods . . .

1. Standardized discriminant weights,

2. Discriminant loadings (structure correlations), and

3. Partial F values.


Standardized discriminant weights

• Examine the sign and magnitude of the standardized discriminant weight (discriminant coefficient) assigned to each variable.

• The weights are used to compute the discriminant function.

• The sign denotes the direction of the relationship (negative or positive).


Discriminant loadings (structure correlations)

• Loadings are used because of the deficiencies of “weights”.

• They reflect the variance that each IV shares with the discriminant function.

• They can be interpreted like factor loadings.

• They can be calculated for all variables, whether or not they were used in estimating the function.


Interpretation of the Results

Two or More Functions . . .

1. Rotation of discriminant functions

2. Potency index


Graphical Display of Discriminant Scores and

Loadings

• Territorial map = most common method.

• Vector plot of discriminant loadings, preferably the rotated loadings = simplest approach.


Plotting Procedure for Vectors

Three Steps . . .

1. Selecting variables,

2. Stretching the vectors, and

3. Plotting the group centroids.



Rules of Thumb 5–4

Interpreting and Validating Discriminant Functions

• Discriminant loadings are the preferred method to assess the contribution of each variable to a discriminant function because they are:
  – a standardized measure of importance (ranging from 0 to 1).
  – available for all independent variables whether used in the estimation process or not.
  – unaffected by multicollinearity.

• Loadings exceeding ±.40 are considered substantive for interpretation purposes.


Rules of Thumb 5–4 continued . . .

Interpreting and Validating Discriminant Functions

• If there is more than one discriminant function, be sure to:
  – use rotated loadings.
  – assess each variable’s contribution across all the functions with the potency index.

• The discriminant function must be validated either with a holdout sample or one of the “leave-one-out” procedures.
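One common way to compute a potency index is to weight each squared loading by the relative eigenvalue of its function and sum across functions. The sketch below follows that definition as an assumption; the loadings and eigenvalues are hypothetical numbers, not results from this chapter.

```python
def potency_index(loadings, eigenvalues):
    """Composite potency of each variable across all discriminant functions.

    loadings: {variable: (loading on function 1, loading on function 2, ...)}
    eigenvalues: one eigenvalue per discriminant function
    """
    total = sum(eigenvalues)
    relative = [e / total for e in eigenvalues]  # relative eigenvalues
    return {
        var: sum(l ** 2 * r for l, r in zip(var_loadings, relative))
        for var, var_loadings in loadings.items()
    }

# Hypothetical rotated loadings on two functions and their eigenvalues.
loadings = {"X1": (0.8, 0.2), "X2": (0.3, 0.7)}
eigenvalues = [3.0, 1.0]
potency = potency_index(loadings, eigenvalues)
print(potency)
```

The index is a relative measure: it ranks variables by overall discriminating effect but has no absolute scale of its own.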

Stage 5: interpretation of the results (example)

Identifying important discriminating variables

1. Analyzing Wilks’ lambda and univariate F

2. Analyzing the discriminant weights

3. Discriminant Loadings

Stage 5: interpretation of the results

Not entered due to multicollinearity

page no 388


Stage 6: Validation of the Results

• Utilizing a Holdout Sample

• Cross-Validation
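Leave-one-out cross-validation can be sketched on a small scale. To stay self-contained, the example below replaces the full discriminant function with a single-variable nearest-centroid rule, which is a deliberate simplification; the data are the X1 (durability) ratings from the KitchenAid survey.

```python
from statistics import mean

# (X1 rating, group): group 1 = would purchase, group 2 = would not purchase.
data = [(8, 1), (6, 1), (10, 1), (9, 1), (4, 1),
        (5, 2), (3, 2), (4, 2), (2, 2), (2, 2)]

def nearest_centroid(x, training):
    """Classify x into the group whose mean (centroid) is closest."""
    groups = {}
    for value, label in training:
        groups.setdefault(label, []).append(value)
    return min(groups, key=lambda g: abs(x - mean(groups[g])))

def leave_one_out_hits(rows):
    """Count cases classified correctly when each case is held out in turn."""
    hits = 0
    for i, (x, label) in enumerate(rows):
        training = rows[:i] + rows[i + 1:]  # every case except the held-out one
        if nearest_centroid(x, training) == label:
            hits += 1
    return hits

hits = leave_one_out_hits(data)
print(hits, "of", len(data), "cases classified correctly")
```

Because each case is classified by a rule estimated without it, the resulting hit ratio is less optimistic than one computed on the same cases used for estimation.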

Stage 6: validation of the results (example)

Important
• Internal validity: Classification accuracy for both the holdout sample and the cross-validated sample is high, which establishes internal validity.
• External validity: External validity should be assessed by the researcher through the use of additional samples.

Thank You
