8/8/2019 Session 7 - Multivariate Data Analysis
http://slidepdf.com/reader/full/session-7-multivariate-data-analysis 1/28
12/18/2010
Session 7
MULTIVARIATE DATA ANALYSIS
Contents
1. Introduction to multivariate analysis
2. Dependence methods
3. Interdependence methods
Dependence methods: One or more variables are designated as being predicted by a set of independent variables.
Examples: Multiple regression, ANOVA, Conjoint analysis, Discriminant analysis, Structural Equation Modeling...
Interdependence methods: No variable is designated as being predicted by others. It is the interrelationship among all the variables taken together that interests the researcher.
Examples: Factor analysis, Cluster analysis, Multidimensional scaling.
II. DEPENDENCE METHODS
Scale requirement

                                            Required scale of variable(s)
Method                                      Dependent    Independent
One dependent variable
  Multiple regression                       Interval     Interval
  ANOVA                                     Interval     Nominal
  Multiple regression with dummy variable   Interval     Nominal
  Discriminant analysis                     Nominal      Interval
  Conjoint analysis                         Ordinal      Nominal
Two or more dependent variables
  Canonical analysis                        Interval     Interval
  MANOVA                                    Interval     Nominal
Network structure including many dependent and independent variables
  SEM                                       Interval     Interval
II.1 Multiple Regression
Y = a_1 X_1 + a_2 X_2 + a_3 X_3 + ... + a_n X_n + b
- One DV, two or more IDVs
- All are interval-scaled variables (except dummy variables)
- Three key results to analyze:
  - The fit of the multiple regression equation: represented by r^2 (coefficient of determination), which ranges from 0 to 1 and gives the % of variation of Y explained by the regression.
  - Test of the significance of r^2: use the F-test (sig.)
  - Test of the significance of each regression coefficient (a_1, a_2, a_3, ...): use the t-test (sig.)
(SPSS provides all sig. levels)
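
As a rough illustration of the first two results, the sketch below fits Y = a_1 X_1 + a_2 X_2 + b by ordinary least squares and computes r^2 and the F statistic. The data values and the helper name `solve` are made up for illustration; they do not come from the session, and in practice SPSS or a statistics library would report these figures together with their sig. levels.

```python
# Hypothetical data: y is the DV, x1 and x2 are the IDVs.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [5.1, 5.9, 11.2, 11.8, 17.1, 17.9]


def solve(matrix, rhs):
    """Solve matrix * z = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# Design matrix with a leading column of ones for the intercept b.
X = [[1.0, u, v] for u, v in zip(x1, x2)]
k = 3
xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]

# Normal equations: (X'X) coefficients = X'y
b, a1, a2 = solve(xtx, xty)

# r^2 = 1 - SS_residual / SS_total
y_hat = [b + a1 * u + a2 * v for u, v in zip(x1, x2)]
y_bar = sum(y) / len(y)
ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
ss_tot = sum((yi - y_bar) ** 2 for yi in y)
r2 = 1 - ss_res / ss_tot

# F statistic for the overall regression with p IDVs and n observations.
n, p = len(y), 2
f_stat = (r2 / p) / ((1 - r2) / (n - p - 1))

print(f"b={b:.3f} a1={a1:.3f} a2={a2:.3f} r2={r2:.4f} F={f_stat:.1f}")
```

The sig. level of F would then be read from the F distribution with (p, n - p - 1) degrees of freedom.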
Assumptions in multiple regression
a. Linearity: relationships between the DV and the IDVs are linear. Test by observing the scatter diagram or correlation matrix.
b. Multicollinearity: no linear correlation among IDVs. Test by investigating "Tolerance" or VIF.
c. Normality of all variables and of all residuals
d. Constant variance of the error term (homoscedasticity)
e. Independence of the error terms
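
The Tolerance / VIF check in assumption b can be sketched for the simplest case of two IDVs. The sketch assumes the usual definitions: tolerance = 1 - R^2 of one IDV regressed on the others, and VIF = 1 / tolerance. With only two IDVs that R^2 is the squared Pearson correlation between them. The data and the function name `pearson_r` are hypothetical.

```python
from math import sqrt


def pearson_r(u, v):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / sqrt(var_u * var_v)


# Hypothetical scores on two IDVs.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]

r = pearson_r(x1, x2)
tolerance = 1 - r ** 2   # share of an IDV's variance not explained by the other IDV
vif = 1 / tolerance      # variance inflation factor

print(f"r={r:.3f} tolerance={tolerance:.3f} VIF={vif:.3f}")
```

A tolerance close to 1 (VIF close to 1) suggests little collinearity; a tolerance near 0 signals that the IDVs overlap heavily.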
Notes when using multiple regression:
- Applicable when there are linear correlations among variables.
- Does not prove a causal relationship.
- Can be used for prediction or explanation.
- There should be more than 10 observations for each IDV (required sample size).
- If an IDV is nominally scaled, dummy variable regression can be employed.
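
Dummy variable regression requires recoding a nominal IDV into 0/1 columns first. The sketch below shows one common way to do that, with one category kept as the reference level. The function name `dummy_code`, the category labels and the choice of reference level are assumptions made for illustration only.

```python
def dummy_code(values, reference):
    """Return (levels, rows): one 0/1 column per level other than the reference."""
    levels = sorted(set(values) - {reference})
    rows = [[1 if v == level else 0 for level in levels] for v in values]
    return levels, rows


# Hypothetical nominal IDV with three categories.
industry = ["Garment", "Garment", "Cosmetic", "Plastic"]
levels, rows = dummy_code(industry, reference="Garment")

print(levels)  # ['Cosmetic', 'Plastic']
print(rows)    # [[0, 0], [0, 0], [1, 0], [0, 1]]
```

A nominal variable with k categories yields k - 1 dummy columns, each of which counts as one IDV when applying the sample-size rule above.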
Example: Identifying the determinants of employee satisfaction in XYZ Co.
DV: Employee satisfaction.
IDVs: Rewards, Working condition, Recognition by managers, Peer relationship, Promotion Opport., Development Opport.

              Unstandardized         Standardized                    Collinearity
              Coefficients           Coefficients                    Statistics
IDVs          B        Std. Error    Beta            t       Sig.    Tolerance   VIF
(Constant)    0.540    0.193                         2.793   .007
Rewards       0.526    0.081         0.596           6.491   .000    .793        1.062
Recognition   0.205    0.061         0.310           3.380   .001    .793        1.262

r^2 = 0.619    F sig. = 0.000
II.2. ANOVA - ANALYSIS OF VARIANCE
- Non-metric IDVs and a metric DV.
- Used to compare means of the DV under the impact of one or more IDVs.
- Can be used with more than one IDV (factorial ANOVA).
- Principle: "between-group variance > within-group variance" => significant differences in the means of groups.
- Family: ANCOVA / MANOVA / MANCOVA
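
The principle above can be made concrete with the one-way ANOVA F statistic, which is the ratio of the between-group mean square to the within-group mean square. This is a minimal sketch on made-up numbers; the function name `one_way_anova_f` and the data are illustrative only.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-group variation: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group variation: how far observations are from their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within, k - 1, n - k


# Hypothetical DV values for three treatments.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_between, df_within = one_way_anova_f(groups)

print(f_stat, df_between, df_within)  # 13.0 2 6
```

Whether an F of this size is significant is then judged from the F distribution with (k - 1, n - k) degrees of freedom.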
Example of ANOVA:
A survey of 200 companies in the garment, cosmetic and plastic industries about their average expenses for sales promotion during the last three years.
The researcher wants to explore whether there are significant differences in the average expenses for sales promotion among these three industries.
IDV: Industry (nominal) (3 treatments)
DV: Sales promotion expenses (ratio)

Company No.   Industry   SP expenses (1000 USD)
1             Garment    123
2             Garment    235
3             Cosmetic   1346
4             Plastic    876
...           ...        ...
199           Plastic    68
200           Garment    12
Possible method: compare the mean values of the DV for each pair of industries (using the t-test).
However, when the number of treatments increases, the comparisons become arduous.
In such a situation, ANOVA is the better method:
H0: μ_1 = μ_2 = ... = μ_k = μ
Ha: at least one μ_i is significantly different from the others.
Where μ = population mean
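
To see why pairwise t-tests become arduous, note that k treatments require k(k - 1)/2 pairwise comparisons. The small sketch below simply counts them; the function name `pairwise_tests` is illustrative.

```python
from math import comb


def pairwise_tests(k):
    """Number of distinct pairs among k treatments, i.e. k * (k - 1) / 2."""
    return comb(k, 2)


counts = {k: pairwise_tests(k) for k in (3, 5, 10)}
print(counts)  # {3: 3, 5: 10, 10: 45}
```

With three industries only three t-tests are needed, but ten treatments would already need 45, whereas ANOVA tests H0 in a single step.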
II.3. DISCRIMINANT ANALYSIS
Purpose: to identify the linear combination of IDVs that best discriminates among the prespecified groups that are formed on the basis of a DV.
Metric IDVs, nominal DV.
Outcomes: a linear combination Y = v_1 X_1 + v_2 X_2 + v_3 X_3 + ... and a critical score Y_cri
For a particular subject:
- Calculate its Y score
- Compare Y with Y_cri
- Predict which group the subject belongs to.
Example
An IT trading company wants to know whether family income (X_1) and householder's education (X_2) are useful to discriminate between PC buyers and non-PC buyers.
Conduct a survey of n households (with / without a PC).
IDVs: X_1 - income, X_2 - education: metric variables
DV: with a PC, without a PC: categorical variable
Analysis results: discriminant function Y = v_1 X_1 + v_2 X_2
v_1, v_2: discriminant coefficients
Y_cri: critical score
Given a household i (X_1i and X_2i), we can predict whether it is a (potential) buyer.
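
One way to obtain v_1, v_2 and Y_cri is Fisher's two-group linear discriminant, sketched below. This is an assumption about the method (the slides do not say how the coefficients are estimated), the household data are invented, and taking Y_cri as the midpoint of the two projected group means assumes equal group sizes and equal misclassification costs.

```python
def mean_vec(rows):
    """Mean of each of the two IDVs over a group of households."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter(rows, m):
    """2x2 within-group matrix of sums of squares and cross-products."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - m[0], r[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


# Hypothetical households: (income X1, education X2).
buyers = [(9.0, 16.0), (8.0, 14.0), (10.0, 15.0), (9.0, 17.0)]
non_buyers = [(4.0, 10.0), (5.0, 12.0), (3.0, 11.0), (4.0, 9.0)]

m_b, m_n = mean_vec(buyers), mean_vec(non_buyers)
s_b, s_n = scatter(buyers, m_b), scatter(non_buyers, m_n)
sw = [[s_b[i][j] + s_n[i][j] for j in range(2)] for i in range(2)]

# Discriminant coefficients: v = Sw^-1 (mean_buyers - mean_non_buyers)
det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
inv = [[sw[1][1] / det, -sw[0][1] / det], [-sw[1][0] / det, sw[0][0] / det]]
diff = [m_b[0] - m_n[0], m_b[1] - m_n[1]]
v1 = inv[0][0] * diff[0] + inv[0][1] * diff[1]
v2 = inv[1][0] * diff[0] + inv[1][1] * diff[1]


def score(x1, x2):
    """Discriminant score Y = v1*X1 + v2*X2."""
    return v1 * x1 + v2 * x2


# Critical score: midpoint between the two projected group means.
y_cri = (score(*m_b) + score(*m_n)) / 2


def predict(x1, x2):
    return "buyer" if score(x1, x2) > y_cri else "non-buyer"


print(predict(7.0, 15.0))
```

In a real study the function would be validated on a hold-out sample rather than on the households used to estimate it.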
Example (conjoint analysis)
Test a new product with 3 attributes:
- Price: (high, medium, low)
- Package size: (small, medium, large)
- Features: (simple, complex)
Form 8 test alternatives (instead of 18 combinations). Ask respondents to rank order them.
Results:
- contribution of each attribute to overall preference
- preference for each treatment (level) within an attribute
- identify the most preferred combination
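
The figure of 18 combinations is the full factorial of the attribute levels (3 x 3 x 2). The sketch below only enumerates that full set; how the 8 test alternatives are selected from it is not described in the slides, so it is not reproduced here. The dictionary keys are illustrative names.

```python
from itertools import product

# Attribute levels as listed in the example.
attributes = {
    "price": ["high", "medium", "low"],
    "package_size": ["small", "medium", "large"],
    "features": ["simple", "complex"],
}

# Every possible product profile: one level chosen per attribute.
profiles = [dict(zip(attributes, combo)) for combo in product(*attributes.values())]

print(len(profiles))  # 18
print(profiles[0])    # {'price': 'high', 'package_size': 'small', 'features': 'simple'}
```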
II.5. Structural Equation Modeling - SEM
[Figure: SEM path diagram. Constructs shown: Customer Orientation, Competitor Orientation, Interfunctional Coordination, Profit Orientation, Responsiveness, Market Orientation, Management Competencies, Business Performance.]
Example (factor analysis):

Case   X_1   X_2   X_3   ...   X_m
1
2
3
...
n

Factor analysis: grouping m variables into k factors
- Factor 1 includes X_1, X_6, X_9, X_m
- Factor 2 includes X_2, X_3, X_10, X_(m-1)
- Factor 3 includes X_4, X_5, X_7, X_8
- ...
Two types:
- Exploratory factor analysis (EFA)
- Confirmatory factor analysis (CFA)
III.2. Cluster analysis
Segmenting objects into homogeneous groups, given data for the objects on a variety of characteristics.
Ex: Market segmentation, buying behavior typology
Procedure:
- Identify variables / characteristics for grouping
- Segmenting based on similarities - distances.
- Labeling clusters based on their shared characteristics.
- Validation and profiling
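
The distance-based segmenting step can be sketched with k-means, which is only one of several clustering techniques and is chosen here as an assumption; the slides do not name a specific algorithm. The respondent ratings, the starting centroids and the function name `k_means` are all hypothetical.

```python
from math import dist


def k_means(points, centroids, iterations=10):
    """Assign each point to its nearest centroid, then recompute the centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # New centroid = mean of the members; an empty cluster keeps its old centroid.
        centroids = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical respondents rated on five characteristics (1 to 5 each).
points = [
    (1, 5, 4, 2, 5), (1, 4, 5, 1, 5), (2, 5, 4, 1, 4),
    (5, 1, 3, 4, 1), (4, 1, 2, 5, 2), (5, 2, 3, 5, 1),
]
centroids, clusters = k_means(points, centroids=[points[0], points[3]])

for center, members in zip(centroids, clusters):
    print(center, members)
```

The resulting centroids are the cluster profiles used for labeling: each one shows the average rating of its segment on every characteristic.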
Example: Segmenting the detergent market
Metric scales
Based on consumer buying behaviors.
"Please indicate the importance level (from 1 for very important to 5 for not important at all) of the following factors when you consider buying detergent powder"
X_1 - Product quality ____
X_2 - Price ____
X_3 - Convenience ____
X_4 - Known brand ____
X_5 - Sales promotion ____
III.3. Multidimensional scaling (perceptual mapping)
Inferring the number / nature of dimensions underlying respondent perceptions based on their judgements about objects (brands, products, companies, localities, etc.)
Metric / nonmetric scale
Identifying the relative positions (on a map) of competitive brands based on several dimensions.
Example: MDS result for TV brands in HCMC
PRACTICE PROJECT
A better procedure:
- Assess and refine the scales by using factor analysis and reliability assessment
- Calculate factor scores using the qualified variables
- Multiple regression
- Interpret the results
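
For the reliability assessment step, a commonly used statistic is Cronbach's alpha. The slides do not name the statistic, so treating it as the intended measure is an assumption, and the item scores below are invented. The sketch uses alpha = k/(k - 1) * (1 - sum of item variances / variance of the total score).

```python
from statistics import variance


def cronbach_alpha(items):
    """items: list of item columns, each a list of respondent scores."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var / variance(totals))


# Hypothetical scale with three items answered by five respondents.
items = [
    [4, 5, 3, 2, 4],
    [4, 4, 3, 2, 5],
    [5, 5, 2, 3, 4],
]
alpha = cronbach_alpha(items)

print(round(alpha, 3))
```

Items that lower alpha noticeably are candidates for removal before factor scores are calculated.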
END SESSION 7