BSAD 6314 Multivariate Statistics

Page 1: BSAD 6314  Multivariate Statistics

BSAD 6314

Multivariate Statistics

Page 2: BSAD 6314  Multivariate Statistics

Major Goals:

• Expand your repertoire of analytical options.

• Choose appropriate techniques.

• Conduct analyses using available software.

• Interpret findings.

• Know where to go for additional help.

Page 3: BSAD 6314  Multivariate Statistics

What are Multivariate Stats?

• 1 IV, 1 DV: Correlation; t test; One-way ANOVA

• >1 IV, 1 DV: Multiple Regression; Factorial ANOVA

• >1 IV, >1 DV: Canonical Correlation; Factorial MANOVA

• 1 IV, >1 DV: One-way MANOVA; Canonical Correlation

Page 4: BSAD 6314  Multivariate Statistics

Hair Taxonomy

Type of relationship: Dependence or Structure/Independence?

• Structure/Independence
  – Structure among variables: Factor Analysis
  – Structure among cases/people: Cluster Analysis

• Dependence (# variables?)
  – One DV, Mult. IVs. DV Cont.: Mult. Regr. DV Cat.: Discrim; Log Regr
  – Single Rel.; Mult. DVs. IV Cont.: Canon. Corr. IV Cat.: MANOVA
  – Mult. Rel. IVs & DVs: SEM

Page 5: BSAD 6314  Multivariate Statistics

Matrices:

• The math behind the statistics!

• Two-dimensional math represented in an array

• People x Variables (OB/HR).

• Organization x Variables (Strategy/OT).

Page 6: BSAD 6314  Multivariate Statistics

[Data matrix: each row is a case (P1, P2, ..., PN); each column is a variable (V1, V2, ..., V12).]

The variables (V) can be continuous measures, categories represented by numbers, and transformations, products or combinations of other variables.

Page 7: BSAD 6314  Multivariate Statistics

Nearly all statistical procedures—univariate and multivariate—are based on linear combinations. Understanding that basic fact has far-reaching implications for using statistical procedures to their fullest advantage.

A linear combination (LC) is nothing more than a weighted sum of variables:

LC = W1V1 + W2V2 + . . . + WKVK

Page 8: BSAD 6314  Multivariate Statistics

LC = W1V1 + W2V2 + . . . + WKVK

A very simple example is the total score on a questionnaire. The individual items on the questionnaire are the variables V1, V2, V3, etc. The weights are all set to a value of 1 (i.e., W1 = W2 = . . . = WK = 1).

We call this unit weighting.
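As a minimal sketch of unit weighting, the following computes a questionnaire total score as a linear combination. The response values and array names are made up for illustration; they are not course data.

```python
import numpy as np

# Hypothetical responses: 4 people (rows) x 3 questionnaire items (columns).
items = np.array([
    [4, 5, 3],
    [2, 1, 2],
    [5, 5, 4],
    [3, 2, 3],
])

# Unit weighting: W1 = W2 = W3 = 1.
weights = np.ones(items.shape[1])

# LC = W1*V1 + W2*V2 + W3*V3, computed for every person at once.
total_score = items @ weights

print(total_score)  # one total per person: 12, 5, 14, 8
```

Changing the entries of `weights` is all it takes to move from unit weighting to any other weighting scheme.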

Page 9: BSAD 6314  Multivariate Statistics

The items combined in a linear combination need not be variables. The items combined are often cases (e.g., people, groups, organizations).

LC = W1P1 + W2P2 + . . . + WNPN

A good example is the sample mean. In this case the weights are set to the reciprocal of the sample size (i.e., W1 = W2 = . . . = WN = 1/N).
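A short sketch of the sample mean as a linear combination of cases, using invented scores for five people:

```python
import numpy as np

# Hypothetical scores for N = 5 people on a single variable.
scores = np.array([10.0, 12.0, 9.0, 15.0, 14.0])
n = scores.size

# Every person receives the same weight, 1/N.
weights = np.full(n, 1.0 / n)

# LC = W1*P1 + W2*P2 + ... + WN*PN
lc = weights @ scores

print(lc, scores.mean())  # both are the sample mean (12, up to rounding)
```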

Page 10: BSAD 6314  Multivariate Statistics

Different statistical procedures derive the weights (W) in a linear combination to either maximize some desirable property (e.g., a correlation) or to minimize some undesirable property (e.g., error).

The weights are sometimes assigned rather than derived (e.g., dummy, effect, and contrast codes) to produce linear combinations of particular interest.
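As an illustration of assigned weights, here is a sketch of a contrast that compares the average of two treatment groups with a control group. The group means and the particular contrast are hypothetical.

```python
import numpy as np

# Hypothetical means for three groups: treatment A, treatment B, control.
group_means = np.array([14.0, 16.0, 11.0])

# Assigned (not derived) contrast weights; contrast weights sum to zero.
contrast = np.array([0.5, 0.5, -1.0])

# The linear combination is the average of the two treatments minus the control.
lc = contrast @ group_means  # 0.5*14 + 0.5*16 - 1*11 = 4

print(lc)
```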

Page 11: BSAD 6314  Multivariate Statistics

Choice of Statistical Techniques

• Research Purpose – Prediction, Explanation, Data Reduction

• Number of IVs

• Number of DVs

• Measurement Scale – Continuous, categorical

Page 12: BSAD 6314  Multivariate Statistics

Dependence/Prediction

• Correlation

• Regression

• Multiple Regression – With nominal variables; Polynomials

• Logistic regression/Discriminant analysis

• Canonical Correlation

• MANOVA

Page 13: BSAD 6314  Multivariate Statistics

[Data matrix: rows P1 ... PN, columns V1 ... V12, with two of the variables linked by r.]

The simplest possible inferential statistic, the bivariate correlation, involves just two variables.

Page 14: BSAD 6314  Multivariate Statistics

[Data matrix: rows P1 ... PN, columns V1 ... V12, with two continuous variables linked by r.]

In its usual form, the correlation is calculated on variables that are continuous.
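A minimal sketch of the bivariate correlation, computed as the average product of z-scores. The two variables are invented, and `pearson_r` is a helper name chosen here for illustration.

```python
import numpy as np

# Two hypothetical continuous variables measured on the same five people.
v1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
v2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])

def pearson_r(x, y):
    """Correlation as the mean product of z-scores (population SDs)."""
    zx = (x - x.mean()) / x.std()
    zy = (y - y.mean()) / y.std()
    return float(np.mean(zx * zy))

r = pearson_r(v1, v2)

print(r)                          # about 0.8 for these data
print(np.corrcoef(v1, v2)[0, 1])  # NumPy's built-in agrees
```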

Page 15: BSAD 6314  Multivariate Statistics

[Data matrix: rows P1 ... PN, columns V1 ... V12, with one continuous and one categorical variable linked by r.]

When one of the variables is categorical with two categories (a dichotomy), the calculation produces a point-biserial correlation.
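The sketch below checks, on made-up data, that the ordinary correlation formula applied to a 0/1 coded variable gives the same value as the textbook point-biserial formula. The variable names are illustrative.

```python
import numpy as np

# Hypothetical data: a two-category variable coded 0/1 and a continuous score.
group = np.array([0, 0, 0, 1, 1, 1])
score = np.array([3.0, 4.0, 5.0, 6.0, 7.0, 8.0])

# Ordinary Pearson correlation applied to the 0/1 codes.
r_pearson = np.corrcoef(group, score)[0, 1]

# Textbook point-biserial formula: (M1 - M0) / SD * sqrt(p * q),
# using the population SD of all scores and p = proportion coded 1.
m1 = score[group == 1].mean()
m0 = score[group == 0].mean()
p = group.mean()
r_pb = (m1 - m0) / score.std() * np.sqrt(p * (1 - p))

print(r_pearson, r_pb)  # the two values match
```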

Page 16: BSAD 6314  Multivariate Statistics

[Data matrix: rows P1 ... PN, columns V1 ... V12, with two categorical variables linked by r.]

When both variables are categorical with two categories each, the calculation produces a phi coefficient.
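A small sketch, with invented 0/1 data, showing that the ordinary correlation of two dichotomous variables equals phi computed from the 2 x 2 table of counts:

```python
import numpy as np

# Two hypothetical dichotomous variables, each coded 0/1.
a = np.array([1, 1, 1, 0, 0, 0, 1, 0])
b = np.array([1, 1, 0, 0, 0, 1, 1, 0])

# Ordinary Pearson correlation applied to the 0/1 codes.
phi_pearson = np.corrcoef(a, b)[0, 1]

# Phi from the 2x2 table of cell counts.
n11 = np.sum((a == 1) & (b == 1))
n10 = np.sum((a == 1) & (b == 0))
n01 = np.sum((a == 0) & (b == 1))
n00 = np.sum((a == 0) & (b == 0))
phi_table = (n11 * n00 - n10 * n01) / np.sqrt(
    (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
)

print(phi_pearson, phi_table)  # both are 0.5 for these data
```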

Page 17: BSAD 6314  Multivariate Statistics

Regression

[Data matrix: rows P1 ... PN, columns V1 ... V12, with two of the variables linked by r.]

All forms of these correlations, however, can be recast as a linear combination:

V2 predicted = A + BV1

Page 18: BSAD 6314  Multivariate Statistics

OLS (ordinary least squares) Regression

[Data matrix: rows P1 ... PN, columns V1 ... V12, with two of the variables linked by r.]

B and A can be chosen so that the sum of the squared deviations between V2 predicted and V2 is minimized. Solving for B and A using this rule also produces the maximum possible correlation between V2 predicted and V2, which equals the absolute value of the correlation between V1 and V2.
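A minimal sketch of the least squares rule for one predictor, using the closed-form solution on invented data. It also compares the sum of squared errors with that of a slightly different slope.

```python
import numpy as np

# Hypothetical data for one predictor (V1) and one outcome (V2).
v1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
v2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])

# Closed-form least squares solution for V2 predicted = A + B*V1.
b = np.sum((v1 - v1.mean()) * (v2 - v2.mean())) / np.sum((v1 - v1.mean()) ** 2)
a = v2.mean() - b * v1.mean()
v2_predicted = a + b * v1

# Sum of squared deviations between predicted and observed V2.
sse = np.sum((v2 - v2_predicted) ** 2)

# Any other line does worse; a slightly steeper slope is shown as one example.
sse_other = np.sum((v2 - (a + (b + 0.1) * v1)) ** 2)

# The correlation of the prediction with V2 equals |r| between V1 and V2.
r = np.corrcoef(v1, v2)[0, 1]
r_pred = np.corrcoef(v2_predicted, v2)[0, 1]

print(a, b, sse, sse_other, r, r_pred)
```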

Page 19: BSAD 6314  Multivariate Statistics

Least Squared Error

[Scatterplot: Z (Weight) on the vertical axis against Z (Height) on the horizontal axis, both axes running from -2 to 2.]

Page 20: BSAD 6314  Multivariate Statistics

[Data matrix: rows P1 ... PN, columns V1 ... V12, with continuous predictors and a continuous outcome linked by R.]

The problem can be easily expanded to include more than one “predictor.” This is multiple regression:

V4 predicted = B1V1 + B2V2 + B3V3 + A

Page 21: BSAD 6314  Multivariate Statistics

[Data matrix: rows P1 ... PN, columns V1 ... V12, with continuous predictors and a continuous outcome linked by R.]

The values for B1, B2, B3, and A are found by the least squares rule.
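The least squares rule for three predictors can be sketched as follows. The data are simulated with a fixed seed, and the "true" coefficients used to generate them are arbitrary assumptions made for illustration.

```python
import numpy as np

# Simulated data (assumption): three predictors and one outcome for 50 people.
rng = np.random.default_rng(0)
n = 50
predictors = rng.normal(size=(n, 3))  # V1, V2, V3
v4 = 1.0 + predictors @ np.array([0.5, -0.3, 0.8]) + rng.normal(scale=0.5, size=n)

# A column of ones lets the intercept A be estimated with the B weights.
design = np.column_stack([predictors, np.ones(n)])
coef, *_ = np.linalg.lstsq(design, v4, rcond=None)
b1, b2, b3, a = coef

# The multiple correlation R is the correlation between predicted and observed V4.
v4_predicted = design @ coef
multiple_r = np.corrcoef(v4_predicted, v4)[0, 1]

print(b1, b2, b3, a, multiple_r)
```

The residuals from this solution are uncorrelated with every predictor, which is one way to recognize a least squares fit.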

Page 22: BSAD 6314  Multivariate Statistics

[Data matrix: rows P1 ... PN, columns V1 ... V12, with categorical predictors and a continuous outcome linked by R.]

V1, V2, and V3 could be categorical contrast variables, perhaps coding the two main effects and the interaction from an experimental design. In that case, the multiple regression produces an analysis of variance.
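To illustrate how contrast-coded predictors turn a regression into an analysis of variance, here is a sketch for a hypothetical balanced 2 x 2 design with two observations per cell. The scores and the -1/+1 coding are assumptions chosen for this example.

```python
import numpy as np

# Hypothetical balanced 2x2 design, two observations per cell, coded -1/+1.
a = np.array([-1, -1, -1, -1, 1, 1, 1, 1])   # V1: main effect of factor A
b = np.array([-1, -1, 1, 1, -1, -1, 1, 1])   # V2: main effect of factor B
ab = a * b                                    # V3: A x B interaction
y = np.array([3.0, 5.0, 6.0, 8.0, 4.0, 6.0, 11.0, 13.0])

# Multiple regression of the outcome on the three contrast variables.
design = np.column_stack([a, b, ab, np.ones(y.size)])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
predicted = design @ coef

# The regression predictions are the cell means of the design.
cell_means = np.array([y[(a == i) & (b == j)].mean() for i, j in zip(a, b)])

# In a balanced design, N * weight^2 gives each effect's sum of squares.
ss_effects = y.size * coef[:3] ** 2
ss_model = np.sum((predicted - y.mean()) ** 2)
ss_total = np.sum((y - y.mean()) ** 2)

print(coef)        # weights for A, B, AxB, then the grand mean
print(ss_effects)  # sums of squares for A, B, and AxB
print(ss_model / ss_total)  # R squared
```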

Page 23: BSAD 6314  Multivariate Statistics

[Data matrix: rows P1 ... PN, columns V1 ... V12, with continuous predictors and a continuous outcome linked by R.]

Or V2 might be the square of V1 and V3 might be the cube of V1. Then the multiple regression examines the polynomial trends.
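A brief sketch of polynomial trends through multiple regression. The outcome is constructed (as an assumption for the example) to have a linear and a quadratic trend but no cubic trend, so the recovered weights can be checked by eye.

```python
import numpy as np

# Hypothetical predictor and its powers.
v1 = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
v2 = v1 ** 2   # quadratic term
v3 = v1 ** 3   # cubic term

# Outcome built with known linear and quadratic trends and no cubic trend.
v4 = 1.0 + 2.0 * v1 + 0.5 * v2

# Multiple regression on V1, V2, V3 plus an intercept.
design = np.column_stack([v1, v2, v3, np.ones(v1.size)])
coef, *_ = np.linalg.lstsq(design, v4, rcond=None)

print(np.round(coef, 6))  # linear 2, quadratic 0.5, cubic 0, intercept 1
```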

Page 24: BSAD 6314  Multivariate Statistics

[Data matrix: rows P1 ... PN, columns V1 ... V12, with a categorical outcome linked to its predictors by “R.”]

If the “outcome” variable is categorical, the basic nature of the analysis does not change. We still seek an “optimal” linear combination of V1, V2, and V3.
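One way to see that the analysis keeps its basic nature is sketched below for a two-group outcome. It computes Fisher's linear discriminant weights and compares them with ordinary least squares weights for the 0/1 coded outcome; for two groups the two sets of weights are proportional. The data are simulated, and the choice of discriminant analysis (rather than logistic regression) is made here only for illustration.

```python
import numpy as np

# Simulated data (assumption): two groups of 30 people measured on V1, V2, V3.
rng = np.random.default_rng(1)
n_per_group = 30
group0 = rng.normal(loc=[0.0, 0.0, 0.0], size=(n_per_group, 3))
group1 = rng.normal(loc=[1.0, 0.5, -0.5], size=(n_per_group, 3))
x = np.vstack([group0, group1])
y = np.repeat([0.0, 1.0], n_per_group)  # categorical "outcome" coded 0/1

# Fisher's discriminant weights: pooled within-group covariance^-1 * mean difference.
pooled = (np.cov(group0, rowvar=False) + np.cov(group1, rowvar=False)) / 2
w_discrim = np.linalg.solve(pooled, group1.mean(axis=0) - group0.mean(axis=0))

# Least squares weights from regressing the 0/1 outcome on the same predictors.
design = np.column_stack([x, np.ones(y.size)])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
w_ols = coef[:3]

# The two weight vectors differ only by a constant multiplier.
ratio = w_ols / w_discrim

print(w_discrim, w_ols, ratio)
```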

Page 25: BSAD 6314  Multivariate Statistics

[Data matrix: rows P1 ... PN, columns V1 ... V12, with the variables divided into Set A and Set B, linked by R.]

The basic multiple regression problem can be generalized to situations that involve more than one “outcome” variable.

Page 26: BSAD 6314  Multivariate Statistics

[Data matrix: rows P1 ... PN, columns V1 ... V12, with the variables divided into Set A and Set B, linked by RAB.]

Set A: LCA = W1V1 + W2V2 + W3V3 + W4V4 + W5V5 + W6V6

Set B: LCB = W7V7 + W8V8 + W9V9 + W10V10 + W11V11 + W12V12
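The sketch below finds the weights for the two linear combinations so that their correlation is as large as possible, which is the first canonical correlation. It uses a QR decomposition followed by a singular value decomposition, one of several equivalent computational routes. The data are simulated and the variable names are illustrative.

```python
import numpy as np

# Simulated data (assumption): 200 people, two sets of six variables
# that share one common source of variation.
rng = np.random.default_rng(2)
n = 200
shared = rng.normal(size=(n, 1))
set_a = shared @ rng.normal(size=(1, 6)) + rng.normal(size=(n, 6))  # V1..V6
set_b = shared @ rng.normal(size=(1, 6)) + rng.normal(size=(n, 6))  # V7..V12

# Center each variable.
a_centered = set_a - set_a.mean(axis=0)
b_centered = set_b - set_b.mean(axis=0)

# Orthonormal bases for each set, then the SVD of their cross-product.
q_a, r_a = np.linalg.qr(a_centered)
q_b, r_b = np.linalg.qr(b_centered)
u, s, vt = np.linalg.svd(q_a.T @ q_b)

# Weights for the first pair of linear combinations.
w_a = np.linalg.solve(r_a, u[:, 0])   # W1..W6
w_b = np.linalg.solve(r_b, vt[0])     # W7..W12

lc_a = a_centered @ w_a
lc_b = b_centered @ w_b
r_ab = np.corrcoef(lc_a, lc_b)[0, 1]

print(r_ab, s[0])  # the correlation of LCA with LCB equals the largest singular value
```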

Page 27: BSAD 6314  Multivariate Statistics

Structure of Data

• Goal is typically data reduction

• How do the data “hang together”?

• Techniques include:
  – Factor analysis (the items)
  – Cluster analysis (the cases, people, organizations)
  – MDS (the objects)

Page 28: BSAD 6314  Multivariate Statistics

[Data matrix: rows P1 ... PN, columns V1 ... V12.]

Sometimes we are not interested in relations between sets of variables but instead focus on a single set and seek a linear combination that has desirable properties.

Page 29: BSAD 6314  Multivariate Statistics

[Data matrix: rows P1 ... PN, columns V1 ... V12.]

Or we might wonder how many “dimensions” underlie the 12 variables. These multiple dimensions also would be represented by linear combinations, perhaps constrained to be uncorrelated.

These questions are addressed in principal components analysis and factor analysis.
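A minimal principal components sketch: the weights are the eigenvectors of the correlation matrix, and the resulting linear combinations are uncorrelated, with variances equal to the eigenvalues. The 12 variables are simulated from two underlying dimensions, an assumption made purely for illustration.

```python
import numpy as np

# Simulated data (assumption): 12 variables driven by two underlying dimensions.
rng = np.random.default_rng(3)
n = 300
factors = rng.normal(size=(n, 2))
loadings = rng.normal(size=(2, 12))
data = factors @ loadings + 0.5 * rng.normal(size=(n, 12))  # V1..V12

# Standardize the variables and compute their correlation matrix.
z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
corr = np.corrcoef(data, rowvar=False)

# Eigenvectors supply the weights; eigenvalues give each component's variance.
eigenvalues, eigenvectors = np.linalg.eigh(corr)
order = np.argsort(eigenvalues)[::-1]          # largest first
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Each column of `components` is a linear combination of the 12 variables.
components = z @ eigenvectors

print(np.round(eigenvalues, 3))
```

Inspecting how quickly the eigenvalues fall off is one common way to judge how many dimensions are worth keeping.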

Page 30: BSAD 6314  Multivariate Statistics

[Data matrix: rows P1 ... PN, columns V1 ... V12, with the variables grouped under four labels: A, B, C, and D.]

Page 31: BSAD 6314  Multivariate Statistics

[Transposed data matrix: rows V1 ... VK, columns P1 ... PN.]

Sometimes we might shift the status of “people” and “variables” in our analysis. Our interest might be in whether a smaller number of dimensions or clusters might underlie the larger collection of people.

Page 32: BSAD 6314  Multivariate Statistics

[Transposed data matrix: rows V1 ... VK, columns P1 ... PN.]

Approaches such as multidimensional scaling and cluster analysis can address such questions. These are conceptually similar to principal components analysis, but on a transposed matrix.
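As one concrete instance, the sketch below applies classical (metric) multidimensional scaling to the distances among five hypothetical people. It is an eigen-decomposition, much like principal components, but it works on a people-by-people matrix rather than a variables-by-variables matrix. The coordinates are invented, and classical MDS is only one of several scaling methods.

```python
import numpy as np

# Hypothetical data: 5 people measured on 2 variables.
people = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0], [2.0, 2.0]])

# People-by-people matrix of Euclidean distances.
diff = people[:, None, :] - people[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=2))

# Double-center the squared distances to get a people-by-people cross-product matrix.
n = len(people)
centering = np.eye(n) - np.ones((n, n)) / n
gram = -0.5 * centering @ (dist ** 2) @ centering

# Keep the two largest dimensions.
eigenvalues, eigenvectors = np.linalg.eigh(gram)
order = np.argsort(eigenvalues)[::-1][:2]
coords = eigenvectors[:, order] * np.sqrt(eigenvalues[order])

# Distances among the recovered coordinates reproduce the original distances.
diff_rec = coords[:, None, :] - coords[None, :, :]
dist_recovered = np.sqrt((diff_rec ** 2).sum(axis=2))

print(np.round(coords, 3))
```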

Page 33: BSAD 6314  Multivariate Statistics

Structural Equation Modeling

• Examines both structure of the data and predictive relationships simultaneously
  – Measurement model
  – Structural model

Page 34: BSAD 6314  Multivariate Statistics

[Path diagram: latent variables A, B, C, and D, with indicators A1 and A2; B1 and B2; C1 and C2; D1, D2, and D3.]

Structural equation models examine the relations among latent variables. Two kinds of linear combinations are needed.

Page 35: BSAD 6314  Multivariate Statistics

The key idea is that the original data matrix can be transformed using linear combinations to provide useful ways to summarize the data and to test hypotheses about how the data are structured.

Sometimes the linear combinations are of variables and sometimes they are of people.

Page 36: BSAD 6314  Multivariate Statistics

[Variance-covariance matrix: rows V1 ... V5, columns V1 ... V5.]

Once a linear combination is created, we also need to know something about its variability. Everything we need to know about the variability of a linear combination is contained in the variance-covariance matrix of the original variables (Σ). The weights that are applied to create a linear combination can also be applied to Σ to get the variance of that LC.

r12 = σ12 / (σ1² σ2²)^½
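A short sketch of both ideas on simulated data: the variance of a linear combination obtained by applying the weights to the variance-covariance matrix, and a correlation recovered from one covariance and two variances. The data and the unit weights are assumptions chosen for illustration.

```python
import numpy as np

# Simulated data (assumption): 100 people measured on V1..V5.
rng = np.random.default_rng(4)
data = rng.normal(size=(100, 5))

# Variance-covariance matrix of the original variables.
sigma = np.cov(data, rowvar=False)

# Unit weights, so the linear combination is a simple total score.
weights = np.array([1.0, 1.0, 1.0, 1.0, 1.0])
lc = data @ weights

# Variance computed directly from the LC scores ...
var_direct = lc.var(ddof=1)
# ... and by applying the same weights to the variance-covariance matrix.
var_from_sigma = weights @ sigma @ weights

# Correlation of V1 and V2 from their covariance and variances.
r12 = sigma[0, 1] / np.sqrt(sigma[0, 0] * sigma[1, 1])

print(var_direct, var_from_sigma, r12)
```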