
Session 7 - Multivariate Data Analysis


Page 1: Session 7 - Multivariate Data Analysis

8/8/2019 Session 7 - Multivariate Data Analysis

http://slidepdf.com/reader/full/session-7-multivariate-data-analysis 1/28

12/18/2010

Session 7

MULTIVARIATE DATA ANALYSIS

Contents

1. Introduction to multivariate analysis
2. Dependence methods
3. Interdependence methods


Dependence methods: One or more variables have been designated as being predicted by a set of independent variables.

Examples: multiple regression, ANOVA, conjoint analysis, discriminant analysis, structural equation modeling, ...

Interdependence methods: No variable is designated as being predicted by others. It is the interrelationship among all the variables taken together that interests the researcher.

Examples: factor analysis, cluster analysis, multidimensional scaling.


II. DEPENDENCE METHODS

Scale requirement

                                            Required scale of variable(s)
Method                                      Dependent   Independent

One dependent variable
  Multiple regression                       Interval    Interval
  ANOVA                                     Interval    Nominal
  Multiple regression with dummy variable   Interval    Nominal
  Discriminant analysis                     Nominal     Interval
  Conjoint analysis                         Ordinal     Nominal

Two or more dependent variables
  Canonical analysis                        Interval    Interval
  MANOVA                                    Interval    Nominal

Network structure including many dependent and independent variables
  SEM                                       Interval    Interval


II.1 Multiple Regression

Y = a1 X1 + a2 X2 + a3 X3 + ... + an Xn + b

- One DV, two or more IDVs.
- All are intervally scaled variables (except dummy variables).
- Three key results to analyze:
  - The fit of the multiple regression equation, represented by r^2 (coefficient of determination), which ranges from 0 to 1: the % of variation of Y explained by the regression.
  - Test of the significance of r^2: use the F-test (sig.).
  - Test of the significance of each regression coefficient (a1, a2, a3, ...): use the t-test (sig.).

(SPSS provides all sig. levels.)
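As a rough illustration of how the overall F statistic relates to r^2, the sketch below uses the standard identity F = (r^2 / k) / ((1 - r^2) / (n - k - 1)) for k IDVs and n observations. The function name and the numbers in the example call are assumptions made for illustration; the slides do not report a sample size.

```python
def f_statistic(r2: float, n: int, k: int) -> float:
    """Overall F statistic of a regression with k IDVs fitted on n observations."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must lie in [0, 1)")
    if n - k - 1 <= 0:
        raise ValueError("need more observations than estimated coefficients")
    explained = r2 / k                       # explained variation per IDV
    unexplained = (1.0 - r2) / (n - k - 1)   # unexplained variation per residual degree of freedom
    return explained / unexplained


# Hypothetical case: r^2 = 0.5 with 2 IDVs and 13 observations.
print(f_statistic(0.5, n=13, k=2))
```

The resulting F value is then compared against the F distribution with (k, n - k - 1) degrees of freedom, which is what the "sig." column in SPSS output summarizes.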


Assumptions in multiple regression

a. Linearity: relationships between the DV and the IDVs are linear. Test by observing the scatter diagram or correlation matrix.

b. Multicollinearity: no linear correlation among the IDVs. Test by investigating "Tolerance" or VIF.

c. Normality of all variables and of all residuals.

d. Constant variance of the error term (homoscedasticity).

e. Independence of the error terms.
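The Tolerance and VIF check in assumption (b) can be sketched for the simplest case of exactly two IDVs, where tolerance = 1 - r^2 (r being the correlation between the two IDVs) and VIF = 1 / tolerance. The function names and the data are hypothetical; with more than two IDVs, each tolerance would instead come from regressing one IDV on all the others.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def tolerance_and_vif(x1, x2):
    """Tolerance and VIF for a model with exactly two IDVs."""
    r = pearson_r(x1, x2)
    tolerance = 1.0 - r * r
    # A tolerance of zero means the two IDVs are perfectly collinear.
    vif = float("inf") if tolerance <= 0.0 else 1.0 / tolerance
    return tolerance, vif


# Hypothetical scores of five respondents on two IDVs.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
print(tolerance_and_vif(x1, x2))
```

A tolerance close to 1 (VIF close to 1) suggests little collinearity; a tolerance close to 0 suggests the IDVs carry largely the same information.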


Notes when using multiple regression:

- Applicable when there exist linear correlations among variables.
- Does not prove a causal relationship.
- Can be used for prediction or explanation.
- There should be more than 10 observations for each IDV (required sample size).
- If an IDV is nominally scaled, dummy variable regression can be employed.
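Dummy coding of a nominally scaled IDV can be sketched as follows: one category is chosen as the reference, and every other category gets its own 0/1 column. The function name, the choice of reference category and the sample values are assumptions for illustration.

```python
def dummy_code(values, reference):
    """Return (levels, rows): one 0/1 column per level except the reference level."""
    levels = sorted(set(values) - {reference})
    rows = [[1 if value == level else 0 for level in levels] for value in values]
    return levels, rows


# Hypothetical nominal IDV with three categories; "Garment" is the reference.
industries = ["Garment", "Cosmetic", "Plastic", "Garment"]
levels, rows = dummy_code(industries, reference="Garment")
print(levels)  # columns of the dummy matrix
print(rows)    # the reference category is coded as all zeros
```

Using k - 1 dummies for k categories avoids perfect collinearity between the dummy columns and the constant term.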


Example: Identifying the determinants of employee satisfaction in XYZ Co.

DV: Employee satisfaction.
IDVs: Rewards, Working condition, Recognition by managers, Peer relationship, Promotion Opport., Development Opport.

               Unstandardized         Standardized                    Collinearity
               coefficients           coefficients                    statistics
IDVs           B       Std. Error     Beta           t       Sig.     Tolerance   VIF

(Constant)     0.540   0.193                         2.793   .007
Rewards        0.526   0.081          0.596          6.491   .000     .793        1.062
Recognition    0.205   0.061          0.310          3.380   .001     .793        1.262

r^2 = 0.619    F sig. = 0.000
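To show how the unstandardized coefficients in the table are used for prediction, the sketch below plugs them into the regression equation. The coefficients are taken from the table; the function name and the input scores are hypothetical, and the measurement scale of the scores is an assumption because the slides do not state it.

```python
# Unstandardized coefficients (column B) from the table above.
COEFFICIENTS = {"constant": 0.540, "rewards": 0.526, "recognition": 0.205}


def predict_satisfaction(rewards: float, recognition: float) -> float:
    """Predicted satisfaction = constant + B_rewards * rewards + B_recognition * recognition."""
    return (
        COEFFICIENTS["constant"]
        + COEFFICIENTS["rewards"] * rewards
        + COEFFICIENTS["recognition"] * recognition
    )


# Hypothetical employee scoring 4 on Rewards and 3 on Recognition.
print(round(predict_satisfaction(rewards=4, recognition=3), 3))
```

Only Rewards and Recognition appear because they are the two IDVs listed in the coefficient table.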


II.2. ANOVA - ANALYSIS OF VARIANCE

- Non-metric IDVs and metric DV.
- Used to compare means of the DV under the impact of one or more IDVs.
- Can be used with more than one IDV (factorial ANOVA).
- Principle: "between-group variance > within-group variance" indicates significant differences in the means of groups.
- Family: ANCOVA / MANOVA / MANCOVA


Example of ANOVA:

A survey of 200 companies in the garment, cosmetic and plastic industries about their average expenses for sales promotion during the last three years.

The researcher wants to explore whether there are significant differences in the average expenses for sales promotion among these three industries.


IDV: Industry (nominal, 3 treatments)
DV: Sales promotion expenses (ratio)

Company No.   Industry   SP expenses (1000 USD)

1             Garment    123
2             Garment    235
3             Cosmetic   1346
4             Plastic    876
..            ..         ..
199           Plastic    68
200           Garment    12


Possible method: compare the mean values of the DV for each pair of industries (using the t-test). However, when the number of treatments increases, the comparisons become arduous.

In such a situation, ANOVA is the better method:

H0: μ1 = μ2 = ... = μk = μ

Ha: at least one μi is significantly different from the others.

where μ = population mean.
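The ANOVA principle can be sketched as a one-way F ratio: the between-group mean square divided by the within-group mean square. The function name and the expense figures below are hypothetical (they are not the survey data); a large F relative to the F distribution with (k - 1, N - k) degrees of freedom would lead to rejecting H0.

```python
def one_way_anova_f(groups):
    """F ratio for a one-way ANOVA; `groups` is a list of lists of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: how far each group mean lies from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: spread of observations around their own group mean.
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical sales promotion expenses (1000 USD) for three industries.
garment = [120, 130, 140]
cosmetic = [200, 210, 220]
plastic = [90, 100, 110]
print(one_way_anova_f([garment, cosmetic, plastic]))
```

With k treatments, the pairwise alternative would need k(k - 1)/2 separate t-tests, which is why a single F test is preferred as k grows.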


II.3. DISCRIMINANT ANALYSIS

Purpose: to identify the linear combination of IDVs that best discriminates among the prespecified groups that are formed on the basis of a DV.

Metric IDVs, nominal DV.

Outcomes: a linear combination Y = v1 X1 + v2 X2 + v3 X3 + ... and a critical score Ycri.

For a particular subject:
- Calculate its Y score.
- Compare Y with Ycri.
- Predict which group the subject belongs to.


Example

An IT trading company wants to know whether family income (X1) and householder's education (X2) are useful to discriminate between PC buyers and non-PC buyers.

Conduct a survey of n households (with / without a PC).

IDVs: X1 - income, X2 - education: metric variables

DV: with a PC, without a PC: categorical variable

Analysis results: discriminant function Y = v1 X1 + v2 X2

v1, v2: discriminant coefficients

Ycri: critical score

Given a household i (X1i and X2i), we can predict whether it is a (potential) buyer.
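A minimal sketch of this example, assuming Fisher's two-group linear discriminant with a pooled covariance matrix and a critical score placed midway between the two group centroids (which presumes equal priors). The slides do not say how v1, v2 and Ycri are estimated, so this method, the function names and the household data are all assumptions.

```python
def mean_vector(rows):
    """Mean of each of the two variables."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter(rows, mean):
    """Sums of squared deviations and cross-products for two variables."""
    sxx = sum((r[0] - mean[0]) ** 2 for r in rows)
    syy = sum((r[1] - mean[1]) ** 2 for r in rows)
    sxy = sum((r[0] - mean[0]) * (r[1] - mean[1]) for r in rows)
    return sxx, syy, sxy


def fit_discriminant(group_a, group_b):
    """Return (v1, v2, y_cri); scores above y_cri point to group_a."""
    ma, mb = mean_vector(group_a), mean_vector(group_b)
    a, b = scatter(group_a, ma), scatter(group_b, mb)
    dof = len(group_a) + len(group_b) - 2
    sxx, syy, sxy = ((a[i] + b[i]) / dof for i in range(3))  # pooled covariance
    det = sxx * syy - sxy * sxy
    dx, dy = ma[0] - mb[0], ma[1] - mb[1]
    # v = S^-1 (mean_a - mean_b), using the closed-form inverse of a 2x2 matrix.
    v1 = (syy * dx - sxy * dy) / det
    v2 = (sxx * dy - sxy * dx) / det
    score_a = v1 * ma[0] + v2 * ma[1]
    score_b = v1 * mb[0] + v2 * mb[1]
    return v1, v2, (score_a + score_b) / 2


def classify(household, v1, v2, y_cri):
    """Compare the household's Y score with the critical score."""
    score = v1 * household[0] + v2 * household[1]
    return "buyer" if score > y_cri else "non-buyer"


# Hypothetical households: (income, years of education).
buyers = [(8, 14), (9, 16), (10, 15)]
non_buyers = [(4, 10), (5, 12), (6, 11)]
v1, v2, y_cri = fit_discriminant(buyers, non_buyers)
print(classify((9, 14), v1, v2, y_cri))
print(classify((5, 10), v1, v2, y_cri))
```

Because v is computed from the buyer centroid minus the non-buyer centroid, the buyer centroid always scores higher, so "above Ycri" maps to buyers in this sketch.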


Example

Test a new product with 3 attributes:
- Price: (high, medium, low)
- Package size: (small, medium, large)
- Features: (simple, complex)

Form 8 test alternatives (instead of 18 combinations). Ask respondents to rank order them.

Results:
- Contribution of each attribute to overall preference.
- Preference of each treatment in an attribute.
- Identify the most preferred combination.
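The three results listed above can be illustrated with a small sketch. It enumerates the full set of 18 combinations, then derives each attribute's relative importance as its part-worth range divided by the sum of all ranges, and picks the best level per attribute under an additive preference model. The part-worth values are hypothetical, not estimates from any survey, and the estimation step itself is not shown.

```python
from itertools import product

ATTRIBUTES = {
    "price": ["high", "medium", "low"],
    "package size": ["small", "medium", "large"],
    "features": ["simple", "complex"],
}

# Full factorial design: 3 * 3 * 2 = 18 product profiles.
profiles = list(product(*ATTRIBUTES.values()))

# Hypothetical part-worth utilities (higher means more preferred).
PART_WORTHS = {
    "price": {"high": -1.0, "medium": 0.0, "low": 1.0},
    "package size": {"small": -0.5, "medium": 0.0, "large": 0.5},
    "features": {"simple": -0.5, "complex": 0.5},
}


def relative_importance(part_worths):
    """Share of each attribute's part-worth range in the total of all ranges."""
    ranges = {
        attr: max(levels.values()) - min(levels.values())
        for attr, levels in part_worths.items()
    }
    total = sum(ranges.values())
    return {attr: r / total for attr, r in ranges.items()}


def most_preferred(part_worths):
    """Best level of each attribute under an additive preference model."""
    return {attr: max(levels, key=levels.get) for attr, levels in part_worths.items()}


print(len(profiles))
print(relative_importance(PART_WORTHS))
print(most_preferred(PART_WORTHS))
```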


II.5. Structural Equation Modeling - SEM

[Figure: SEM path diagram. Constructs shown: Customer Orientation, Competitor Orientation, Interfunctional Coordination, Responsiveness, Profit Orientation, Market Orientation, Management Competencies, Business Performance.]


Example:

Case   X1   X2   X3   ...   ...   Xm

1
2
3
...
n

Factor analysis: grouping m variables into k factors
- Factor 1 includes X1, X6, X9, Xm
- Factor 2 includes X2, X3, X10, Xm-1
- Factor 3 includes X4, X5, X7, X8
- ...

Two types: exploratory factor analysis (EFA) and confirmatory factor analysis (CFA).


III.2. Cluster analysis

Segmenting objects into homogeneous groups, given data for the objects on a variety of characteristics.

Examples: market segmentation, buying behavior typology.

Procedure:
- Identify variables / characteristics for grouping.
- Segment based on similarities / distances.
- Label clusters based on their shared characteristics.
- Validation and profiling.


Example: Segmenting the detergent market

Metric scales, based on consumer buying behaviors.

"Please indicate the importance level (from 1 for very important to 5 for not important at all) of the following factors when you consider buying detergent powder."

X1 - Product quality ____
X2 - Price ____
X3 - Convenience ____
X4 - Known brand ____
X5 - Sales promotion ____
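Distance-based segmentation of such ratings can be sketched with a small k-means routine using Euclidean distance. The slides do not specify a clustering algorithm, so k-means, the fixed starting centroids and the four respondents below are assumptions chosen to keep the example short and deterministic. Requires Python 3.8+ for `math.dist`.

```python
from math import dist  # Euclidean distance between two points


def k_means(points, initial_centroids, max_iter=100):
    """Assign each point to its nearest centroid, update centroids, repeat until stable."""
    centroids = [list(c) for c in initial_centroids]  # copy so the caller's list is untouched
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments no longer change
            break
        labels = new_labels
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical ratings (X1..X5; 1 = very important, 5 = not important at all).
respondents = [
    [1, 5, 3, 1, 5],  # quality and brand matter most
    [1, 4, 3, 2, 5],
    [5, 1, 3, 5, 1],  # price and sales promotion matter most
    [4, 1, 2, 5, 2],
]
labels, centroids = k_means(respondents, [respondents[0], respondents[2]])
print(labels)
print(centroids)
```

The cluster centroids can then be read as segment profiles, which supports the labeling step of the procedure (for instance, a quality-driven segment versus a price-driven segment).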


III.3. Multidimensional scaling (perceptual mapping)

- Inferring the number / nature of dimensions underlying respondent perceptions based on their judgements about objects (brands, products, companies, localities, etc.).
- Metric / nonmetric scale.
- Identifying the relative positions (on a map) of competitive brands based on several dimensions.


Example: MDS result for TV brands in HCMC


PRACTICE PROJECT

A better procedure:
- Assess and refine the scales by using factor analysis and reliability assessment.
- Calculate factor scores using the qualified variables.
- Run multiple regression.
- Interpret the results.
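For the reliability assessment step, one widely used measure is Cronbach's alpha: alpha = k / (k - 1) * (1 - sum of item variances / variance of the total score), where k is the number of items in the scale. The slides do not name a specific reliability measure, so the choice of alpha, the function name and the scores below are assumptions.

```python
from statistics import variance  # sample variance (n - 1 in the denominator)


def cronbach_alpha(items):
    """Cronbach's alpha; `items` is a list of item columns, one score per respondent."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total scale score per respondent
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical scores of four respondents on a three-item scale.
items = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [1, 3, 3, 5],
]
print(round(cronbach_alpha(items), 3))
```

Items that lower alpha noticeably are candidates for removal before factor scores are calculated.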


END SESSION 7