(c) Pongsa Pornchaiwiseskul, Faculty of Economics, Chulalongkorn University

Multi-variate Statistics
Extension of Bi-variate Statistics

(Y, X) ~ random variables, where
  X ~ a vector of K random variables, $X = [X_1, X_2, \ldots, X_K]$
  Y ~ a single random variable

Multi-variate Analyses
  • Pair-wise Covariance or Correlation
  • Multi-way ANOVA
  • Multiple Regression

Multiple Regression Analysis
Focus on the dependency of Y on the X vector, e.g.,

$$\mu_{Y|X} = m(X_1, X_2, \ldots, X_K) = m(X)$$
$$\sigma^2_{Y|X} = v(X_1, X_2, \ldots, X_K) = v(X)$$

$X_k$ - explanatory or independent variable, k = 1,…,K
Y - dependent variable

Multiple Linear Regression
Assumptions
1) linearity: $\mu_{Y|X} = X\beta$, where $\beta = [\beta_1\ \beta_2\ \ldots\ \beta_K]^T$ are unknown parameters
2) variance-independent, or $\sigma^2_{Y|X} = \sigma^2$
3) normality, i.e. $Y|X \sim N(X\beta, \sigma^2)$

CLNRM (1)
The Classical Linear Normal Regression Model is based upon the assumptions

$$Y_i = X_i\beta + \varepsilon_i$$

where i = index of the observation
      $\varepsilon_i$ = identical and independent normal error term, $\varepsilon_i \sim N(0, \sigma^2)$ for all i = 1,…,n

CLNRM (2)
$X_i$ is pre-selected or non-random, but $Y_i$ or $\varepsilon_i$ is randomly sampled.

$X_i\beta$ is the non-random component of $Y_i$; $\varepsilon_i$ is the random component of $Y_i$. Note that the first regressor $X_1$ can be intentionally set to one for all observations so that its coefficient $\beta_1$ becomes the y-intercept.
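The data-generating process described by the CLNRM can be sketched in a few lines of NumPy. This is only an illustration: the sample size, the regressor values, the "true" parameter vector `beta` and the error standard deviation `sigma` are all made-up assumptions, not values from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)      # fixed seed, for reproducibility only

n, K = 200, 3                       # assumed sample size and number of regressors
beta = np.array([1.0, 2.0, -0.5])   # assumed "true" parameters (hypothetical)
sigma = 0.8                         # assumed error standard deviation (hypothetical)

# Pre-selected (non-random) regressors. The first column is all ones,
# so beta[0] plays the role of the y-intercept.
X = np.column_stack([
    np.ones(n),
    np.linspace(0.0, 10.0, n),
    np.linspace(-1.0, 1.0, n) ** 2,
])

eps = rng.normal(loc=0.0, scale=sigma, size=n)  # i.i.d. N(0, sigma^2) error terms
Y = X @ beta + eps                              # non-random component + random component
```

Only `eps`, and therefore `Y`, changes from one simulated sample to the next; `X` stays fixed, which mirrors the "pre-selected" assumption.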

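CLNRM Matrix Representation (1)
Define

$$Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix},\quad
X = \begin{bmatrix} X_{11} & X_{12} & \cdots & X_{1K} \\ X_{21} & X_{22} & \cdots & X_{2K} \\ \vdots & \vdots & \ddots & \vdots \\ X_{n1} & X_{n2} & \cdots & X_{nK} \end{bmatrix},\quad
\varepsilon = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}$$

so that

$$Y = X\beta + \varepsilon$$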

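OLS Estimation for CLNRM (1)

$$\min_{\beta} \sum_{i=1}^{n} [Y_i - (\beta_1 X_{i1} + \beta_2 X_{i2} + \ldots + \beta_K X_{iK})]^2$$

or

$$\min_{\beta}\ (Y - X\beta)^T (Y - X\beta)$$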

OLS Estimation for CLNRM (2)
First-Order Conditions

$$-2X^T(Y - X\beta) = 0$$
$$X^TY - X^TX\beta = 0$$
$$\hat\beta = (X^TX)^{-1}X^TY$$
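As a minimal sketch of the closed-form solution, the normal equations $X^TX\beta = X^TY$ can be solved numerically. The helper name `ols_beta` and the tiny data set are assumptions made for illustration; solving the linear system with `np.linalg.solve` is used here in place of forming the inverse explicitly, which gives the same $\hat\beta$ when $X^TX$ is invertible.

```python
import numpy as np

def ols_beta(X, Y):
    """Return the OLS estimate by solving the normal equations X'X b = X'Y."""
    XtX = X.T @ X
    XtY = X.T @ Y
    return np.linalg.solve(XtX, XtY)

# Made-up data lying exactly on the line Y = 1 + 2x (no error term)
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])   # first column of ones gives the intercept
Y = 1.0 + 2.0 * x

beta_hat = ols_beta(X, Y)

# First-order condition: X'(Y - X beta_hat) should be numerically zero
foc = X.T @ (Y - X @ beta_hat)
```

Because the made-up data contain no noise, `beta_hat` recovers the intercept 1 and slope 2 up to rounding, and `foc` is a vector of (numerical) zeros.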

OLS Estimation for CLNRM (3)
Estimator for $\sigma^2$

$$\hat\sigma^2 = \frac{1}{n-K}(Y - X\hat\beta)^T(Y - X\hat\beta) = \frac{1}{n-K}(Y^TY - \hat Y^TY)$$

where $\hat Y = X\hat\beta$ is called the fitted value of Y.
Why n-K?
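One standard way to answer the "Why n-K?" question is sketched below. It assumes that X has full column rank K and that $E(\varepsilon\varepsilon^T) = \sigma^2 I_n$; the residual vector $e$ and the matrix $M$ are notation introduced here, not taken from the slides.

```latex
\begin{align*}
e &= Y - X\hat\beta = MY = M\varepsilon,
  \qquad M = I_n - X(X^TX)^{-1}X^T, \quad MX = 0 \\
M^T &= M, \qquad MM = M, \qquad
  \operatorname{tr}(M) = n - \operatorname{tr}\!\big((X^TX)^{-1}X^TX\big) = n - K \\
E(e^Te) &= E(\varepsilon^T M \varepsilon)
  = \sigma^2 \operatorname{tr}(M) = \sigma^2 (n-K) \\
E(\hat\sigma^2) &= \frac{E(e^Te)}{n-K} = \sigma^2
\end{align*}
```

Dividing the sum of squared residuals by n-K rather than n therefore makes $\hat\sigma^2$ unbiased: estimating K coefficients uses up K degrees of freedom.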

Properties of OLS estimators (1)
Theorem

$$E(\hat\beta) = \beta$$
$$V(\hat\beta) = \sigma^2 (X^TX)^{-1}$$

Does not require normality assumption.

Note that $\hat\beta$ is an unbiased estimator of $\beta$.
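The theorem can be checked informally with a small Monte Carlo experiment. The sketch below deliberately uses uniform (non-normal) errors with variance $\sigma^2$ to illustrate that normality is not needed; the design matrix, `beta`, `sigma`, the seed and the number of replications are all arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)     # fixed seed, for reproducibility only

n, reps = 50, 2000                  # assumed sample size and number of replications
beta = np.array([1.0, 2.0])         # assumed "true" parameters (hypothetical)
sigma = 1.0                         # assumed error standard deviation (hypothetical)

# Fixed (non-random) design with an intercept column
X = np.column_stack([np.ones(n), np.linspace(0.0, 1.0, n)])
A = np.linalg.solve(X.T @ X, X.T)   # (X'X)^{-1} X', shape (K, n)

# Uniform(-a, a) has variance a^2 / 3, so a = sigma * sqrt(3) gives variance sigma^2
half_width = sigma * np.sqrt(3.0)
eps = rng.uniform(-half_width, half_width, size=(reps, n))

Y_all = X @ beta + eps              # each row is one simulated sample of Y
beta_hats = Y_all @ A.T             # one OLS estimate per row, shape (reps, K)

mean_beta_hat = beta_hats.mean(axis=0)           # should be close to beta
emp_cov = np.cov(beta_hats, rowvar=False)        # empirical covariance of the estimates
theo_cov = sigma**2 * np.linalg.inv(X.T @ X)     # sigma^2 (X'X)^{-1}
```

With enough replications, `mean_beta_hat` should sit near `beta` and `emp_cov` near `theo_cov`; the agreement is approximate because it rests on simulation.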

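Properties of OLS estimators (5)
Variance-Covariance Matrix of $\hat\beta$

$$V(\hat\beta) = \sigma^2(X^TX)^{-1} = \begin{bmatrix} V(\hat\beta_1) & C(\hat\beta_1,\hat\beta_2) & \cdots & C(\hat\beta_1,\hat\beta_K) \\ C(\hat\beta_1,\hat\beta_2) & V(\hat\beta_2) & \cdots & C(\hat\beta_2,\hat\beta_K) \\ \vdots & \vdots & \ddots & \vdots \\ C(\hat\beta_1,\hat\beta_K) & C(\hat\beta_2,\hat\beta_K) & \cdots & V(\hat\beta_K) \end{bmatrix}$$

$\sigma^2$ is generally unknown.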

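Properties of OLS estimators (6)
Estimated Variance-Covariance Matrix of $\hat\beta$

$$\hat V(\hat\beta) = \hat\sigma^2(X^TX)^{-1} = \begin{bmatrix} \hat V(\hat\beta_1) & \hat C(\hat\beta_1,\hat\beta_2) & \cdots & \hat C(\hat\beta_1,\hat\beta_K) \\ \hat C(\hat\beta_1,\hat\beta_2) & \hat V(\hat\beta_2) & \cdots & \hat C(\hat\beta_2,\hat\beta_K) \\ \vdots & \vdots & \ddots & \vdots \\ \hat C(\hat\beta_1,\hat\beta_K) & \hat C(\hat\beta_2,\hat\beta_K) & \cdots & \hat V(\hat\beta_K) \end{bmatrix}$$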

Properties of OLS estimators (7)
Standard Deviation of $\hat\beta_k$

$$sd(\hat\beta_k) = \sqrt{V(\hat\beta_k)}$$

Standard Error of $\hat\beta_k$

$$se(\hat\beta_k) = \sqrt{\hat V(\hat\beta_k)}$$
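A minimal sketch of how the estimated variance-covariance matrix and the standard errors might be computed is given below. The five data points are made up for illustration, and variable names such as `V_hat` and `se` are assumptions rather than notation from the slides.

```python
import numpy as np

# Made-up data for a simple regression with an intercept
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
X = np.column_stack([np.ones_like(x), x])
Y = np.array([1.1, 2.9, 5.2, 6.8, 9.0])
n, K = X.shape

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ Y              # OLS estimate (X'X)^{-1} X'Y

resid = Y - X @ beta_hat                  # residuals Y - Y_hat
sigma2_hat = resid @ resid / (n - K)      # estimator of sigma^2 with n-K in the denominator

V_hat = sigma2_hat * XtX_inv              # estimated variance-covariance matrix of beta_hat
se = np.sqrt(np.diag(V_hat))              # standard errors: square roots of the diagonal
```

The diagonal of `V_hat` holds the estimated variances and the off-diagonal entries the estimated covariances, so `V_hat` is symmetric by construction.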

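Properties of OLS estimators (8)

$$t_{cal} = \frac{(\hat\beta_k - \beta_k)/sd(\hat\beta_k)}{\sqrt{\dfrac{(n-K)\hat\sigma^2/\sigma^2}{n-K}}}$$

$$\frac{\hat\beta_k - \beta_k}{se(\hat\beta_k)} \sim t(n-K)$$

<<Basis for statistical inference>>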
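The sketch below illustrates why the unknown $\sigma^2$ drops out of the t statistic: a standardized coefficient divided by the square root of a scaled variance ratio equals the coefficient divided by its standard error. The data, the hypothesized value `beta_null` and the "true" variance `sigma2` are all made-up assumptions; in practice `sigma2` is unknown, which is exactly why the second form is the one used.

```python
import numpy as np

# Made-up data for a simple regression with an intercept
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
X = np.column_stack([np.ones_like(x), x])
Y = np.array([1.1, 2.9, 5.2, 6.8, 9.0])
n, K = X.shape

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ Y
resid = Y - X @ beta_hat
sigma2_hat = resid @ resid / (n - K)

k = 1              # index of the coefficient under test (the slope)
beta_null = 0.0    # hypothesized value of beta_k (assumption)
sigma2 = 0.25      # hypothetical "true" error variance, used only to show it cancels

sd_k = np.sqrt(sigma2 * XtX_inv[k, k])        # standard deviation (needs sigma2)
se_k = np.sqrt(sigma2_hat * XtX_inv[k, k])    # standard error (uses sigma2_hat)

z = (beta_hat[k] - beta_null) / sd_k                       # standardized coefficient
chi2_over_df = ((n - K) * sigma2_hat / sigma2) / (n - K)   # scaled variance ratio
t_cal = z / np.sqrt(chi2_over_df)

t_direct = (beta_hat[k] - beta_null) / se_k   # same statistic without knowing sigma2
dof = n - K                                   # degrees of freedom of the t distribution
```

`t_cal` and `t_direct` agree up to rounding whatever value is chosen for `sigma2`. The statistic would then be compared with a critical value of the t distribution with `dof` degrees of freedom.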

Central Limit Theorem (1)
Similar to that for the Simple Linear Regression Model. Even if the error terms are not normal, the properties of the OLS estimators hold asymptotically when the sample size is very large.

Gauss-Markov Theorem (1)
Similar to that for the Simple Linear Regression Model. Given that X is non-random, the OLS estimator is the Best Linear Unbiased Estimator (BLUE).

Coefficient of Determination (1)
$R^2$ is a measure of goodness-of-fit. How well does the model fit the observed data? A low $R^2$ implies a "bad" fit.

Definition

$$R^2 = 1 - \frac{SSR}{SST}$$

SSR = Sum of Squared Residuals
SST = Sum of Squared Totals

Coefficient of Determination (2)
where

$$SSR = \sum_{i=1}^{n}(Y_i - \hat Y_i)^2 = [Y - \hat Y]^T[Y - \hat Y]$$
$$SST = \sum_{i=1}^{n}(Y_i - \bar Y)^2$$

Note that, in general, $R^2$ cannot be greater than one but could be negative.
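A small sketch can make the "could be negative" remark concrete. The helper `r_squared` and the four data points are made up for illustration: the data fall on a decreasing line with a large intercept, so a regression forced through the origin fits worse than the sample mean.

```python
import numpy as np

def r_squared(X, Y):
    """R^2 = 1 - SSR/SST for an OLS fit of Y on the columns of X."""
    beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
    resid = Y - X @ beta_hat
    ssr = resid @ resid                     # sum of squared residuals
    sst = np.sum((Y - Y.mean()) ** 2)       # sum of squared totals
    return 1.0 - ssr / sst

x = np.array([1.0, 2.0, 3.0, 4.0])
Y = np.array([10.0, 9.0, 8.0, 7.0])         # made-up data: Y = 11 - x exactly

# Regression through the origin (no constant column)
r2_no_const = r_squared(x.reshape(-1, 1), Y)

# Same data with a constant column added
r2_const = r_squared(np.column_stack([np.ones_like(x), x]), Y)
```

Here `r2_no_const` is negative because SSR exceeds SST, while `r2_const` equals one up to rounding because the line with an intercept fits these made-up points exactly.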

Coefficient of Determination (3)
A low $R^2$, or a bad fit, does not mean a bad model. It simply implies a large uncertainty in nature. $R^2$ is mainly used as a criterion to select among various "candidate" models.

Coefficient of Determination (4)
If an $X_k$ has a constant value, or a linear combination of the $X_k$'s is equivalent to a constant value, then, always,

$$0 \le R^2 \le 1$$

and

$$R^2 = \frac{\sum_{i=1}^{n}(\hat Y_i - \bar Y)^2}{\sum_{i=1}^{n}(Y_i - \bar Y)^2}$$
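When a constant column is included, the total variation splits into an explained part and a residual part, which is why both expressions for $R^2$ coincide. The sketch below checks this numerically on made-up data; the name `sse` (explained sum of squares) is notation introduced here, not taken from the slides.

```python
import numpy as np

# Made-up data; the first column of X is the constant
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
X = np.column_stack([np.ones_like(x), x])
Y = np.array([1.1, 2.9, 5.2, 6.8, 9.0])

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
Y_hat = X @ beta_hat                         # fitted values

ssr = np.sum((Y - Y_hat) ** 2)               # residual variation
sst = np.sum((Y - Y.mean()) ** 2)            # total variation
sse = np.sum((Y_hat - Y.mean()) ** 2)        # explained variation

r2_from_residuals = 1.0 - ssr / sst
r2_from_fitted = sse / sst
```

The two values agree up to rounding because `sst` equals `ssr + sse` when the model contains a constant; without one, that decomposition generally fails.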

Coefficient of Determination (5)
Interpretation if $0 \le R^2 \le 1$

$1 - R^2$, or SSR/SST, can be interpreted as the fraction of total variation of Y due to the random component ($\varepsilon$).

$R^2$ is generally regarded as the fraction of total variation of Y explained by the explanatory variables, or due to the non-random component.

Adjusted-$R^2$ (1)
We can cheat on $R^2$ by adding more irrelevant independent variables to the right-hand side, especially when the sample is small.

Higher K ==> smaller SSR ==> higher $R^2$

Adjusted-$R^2$ (2)
Definition

$$\bar R^2 = 1 - \frac{SSR/(n-K)}{SST/(n-1)} = 1 - \frac{\hat\sigma^2}{s_Y^2}$$

Concept: Penalize $R^2$ by dividing SSR by (n-K), so that adding an irrelevant variable carries a cost.
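The effect of an extra regressor on $R^2$ and on the adjusted version can be sketched as follows. The helper `fit_stats`, the eight data points and the extra regressor `z` are made up for illustration; `z` is simply an arbitrary pattern of plus and minus ones that played no role in how `Y` was chosen.

```python
import numpy as np

def fit_stats(X, Y):
    """Return (R^2, adjusted R^2) for an OLS fit of Y on the columns of X."""
    n, K = X.shape
    beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
    resid = Y - X @ beta_hat
    ssr = resid @ resid
    sst = np.sum((Y - Y.mean()) ** 2)
    r2 = 1.0 - ssr / sst
    adj_r2 = 1.0 - (ssr / (n - K)) / (sst / (n - 1))   # penalizes a larger K
    return r2, adj_r2

x = np.arange(8, dtype=float)
Y = np.array([1.2, 2.8, 5.1, 7.2, 8.9, 11.1, 12.8, 15.2])   # made-up, roughly 1 + 2x
z = np.array([1.0, -1.0, 1.0, -1.0, 1.0, 1.0, -1.0, -1.0])  # arbitrary extra regressor

X_small = np.column_stack([np.ones(8), x])   # K = 2
X_big = np.column_stack([X_small, z])        # K = 3, with the extra regressor

r2_small, adj_small = fit_stats(X_small, Y)
r2_big, adj_big = fit_stats(X_big, Y)
```

`r2_big` can never be smaller than `r2_small`, because adding a column cannot increase SSR. Whether `adj_big` falls below `adj_small` depends on the data, and that is the comparison the adjusted measure is meant to inform.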

Adjusted-$R^2$ (3)
Purpose: For a small sample, it is a better measure of goodness-of-fit than $R^2$. It is also used as a criterion to add or remove an explanatory variable from the model, provided that doing so does not contradict theory.