Mean and Variance

Page 1: Mean and Variance

Mean and Variance

Page 2: Mean and Variance

Distribution ?

Page 3: Mean and Variance
Page 4: Mean and Variance

Statistics

distribution of a sample      population distribution
(sample) statistic            (population) parameter

Page 5: Mean and Variance

        X   %freq
Head    1   0.5
Tail    0   0.5
Total       1.0

        X   freq   %freq
Head    1   20     0.4
Tail    0   30     0.6
Total       50     1.0

distribution of a sample

population distribution

        X   %freq
Head    1   0.35
Tail    0   0.65
Total       1.0

Page 6: Mean and Variance

Y       %freq
1       1/6
2       1/6
3       1/6
4       1/6
5       1/6
6       1/6
Total   1.0

Y       freq   %freq
1       10     0.1
2       20     0.2
3       10     0.1
4       20     0.2
5       20     0.2
6       20     0.2
Total   100    1.0

Page 7: Mean and Variance

A new variable X from mseg of credit card data

mseg                X
Low Spender         1
Med Low Spender     2
Average Spender     3
Med High Spender    4
High Spender        5

Page 8: Mean and Variance

X       freq   %freq
1       26     0.26
2       20     0.20
3       11     0.11
4       25     0.25
5       18     0.18
Total   100    1.00

X       %freq
1       ?
2       ?
3       ?
4       ?
5       ?
Total   1.00

Variable X of credit card data

?

Page 9: Mean and Variance

Measures of location (center)

Mean

Mode

Median

(Truncated, Winsorized) Mean

Page 10: Mean and Variance

Mean

Page 11: Mean and Variance

Median

Page 12: Mean and Variance

50% 50%

Median

Page 13: Mean and Variance

Mode

Page 14: Mean and Variance
Page 15: Mean and Variance

Hit / Stop     Bust

Page 16: Mean and Variance

Dealer's hidden card ?

Page 17: Mean and Variance

2 - 9     1, 11     10

Page 18: Mean and Variance

Outlier

Page 19: Mean and Variance

[Figure: Truncated mean / Winsorized mean]

Page 20: Mean and Variance

[Figure: Truncated mean / Winsorized mean]
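The two robust means named on this slide can be sketched as follows. This is a minimal illustration, not the slide's own example: the data values and the helper names `truncated_mean` and `winsorized_mean` are assumptions, and `k` is the number of values handled at each end.

```python
def truncated_mean(values, k):
    """Mean after dropping the k smallest and the k largest values."""
    ordered = sorted(values)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def winsorized_mean(values, k):
    """Mean after replacing the k smallest/largest values by the nearest kept value."""
    ordered = sorted(values)
    low, high = ordered[k], ordered[-k - 1]
    clipped = [min(max(v, low), high) for v in ordered]
    return sum(clipped) / len(clipped)


data = [1, 4, 5, 6, 99]            # hypothetical data with one extreme value
plain = sum(data) / len(data)      # 115 / 5 = 23.0
trimmed = truncated_mean(data, 1)  # mean of [4, 5, 6] = 5.0
wins = winsorized_mean(data, 1)    # mean of [4, 4, 5, 6, 6] = 5.0
```

Truncation discards the extreme observations, while Winsorizing keeps the sample size and pulls the extremes in to the nearest remaining value.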

Page 21: Mean and Variance

Quartiles

Q1 = 25th percentile (25% below, 75% above)

Q2 = 50th percentile = Median (50% below, 50% above)

Q3 = 75th percentile (75% below, 25% above)

Page 22: Mean and Variance

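Illustration: Yoo Jae-il (유재일), staff reporter

Off-target housing statistics send real estate policy astray

Korea's PIR (price-to-income ratio) is calculated from the mean house price and the mean household income of urban workers. The US PIR, by contrast, is based on the median price and the median income. The median price is found by lining up the homes sold in an area from the cheapest to the most expensive and taking the middle value.

Kim Sun-deok (김선덕), head of the Construction Industry Strategy Institute (건설산업전략연구소), said, "Mean prices and mean incomes can distort the statistics when a few very expensive homes or extremely high earners are included." Moreover, Korean house prices are asking prices (呼價), whereas US house prices are based on actual transaction prices.

Cha Hak-bong (차학봉), staff reporter. Posted: 2007.03.26 23:31

Wrong housing statistics make for wrong real estate policy.

Although the median represents house prices better than the mean, the Korean government publishes house price statistics based on the mean. A mean price can be distorted by just one or two extreme prices.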
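The point about extreme prices can be checked with a small sketch. The prices below are hypothetical values chosen for illustration, not figures from the article.

```python
from statistics import mean, median

# Hypothetical house prices (arbitrary units); the last one is an extreme value.
prices = [2, 3, 3, 4, 5, 100]
without_extreme = prices[:-1]

mean_all = mean(prices)                # 117 / 6 = 19.5
median_all = median(prices)            # (3 + 4) / 2 = 3.5
mean_rest = mean(without_extreme)      # 17 / 5 = 3.4
median_rest = median(without_extreme)  # 3
```

A single extreme price moves the mean from 3.4 to 19.5, while the median only moves from 3 to 3.5.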

Page 23: Mean and Variance

percentile

p-th percentile: p% of the distribution lies below it and (100 - p)% lies above it.

Page 24: Mean and Variance

Measures of variability

Range

InterQuartile Range (IQR)

Variance

Standard Deviation

Page 25: Mean and Variance

11

Range

Page 26: Mean and Variance

Q1   Q2   Q3

IQR = Q3 - Q1
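As a rough sketch, the range and the IQR can be computed with Python's standard library. The data set is hypothetical, and the `"inclusive"` interpolation method is only one of several quartile conventions, so other tools may report slightly different quartiles for the same data.

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical sample

data_range = max(data) - min(data)  # largest value minus smallest value

# n=4 asks for the three cut points Q1, Q2 (median), Q3.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1  # spread of the middle 50% of the data
```

For this data set the quartiles are 3, 5 and 7, so the IQR is 4 while the range is 8.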

Page 27: Mean and Variance

11

variance, standard deviation

Page 28: Mean and Variance

Y       %freq
1       1/6
2       1/6
3       1/6
4       1/6
5       1/6
6       1/6
Total   1.0

Y       freq   %freq
1       10     0.1
2       20     0.2
3       10     0.1
4       20     0.2
5       20     0.2
6       20     0.2
Total   100    1.0

Sample: Mean (Y) = 1*0.1 + 2*0.2 + 3*0.1 + ... + 6*0.2 = 3.8

Population: Mean (Y) = 1*(1/6) + 2*(1/6) + ... + 6*(1/6) = 3.5
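The two means on this page can be reproduced from the frequency tables. The sketch below uses exact fractions; the helper name `mean_from_table` is an assumption made for illustration.

```python
from fractions import Fraction


def mean_from_table(table):
    """Mean of a distribution given as {value: relative frequency}."""
    return sum(value * rel_freq for value, rel_freq in table.items())


# Population distribution of a fair die: each face has relative frequency 1/6.
population = {y: Fraction(1, 6) for y in range(1, 7)}

# Sample of 100 rolls, taken from the frequency table on this page.
sample_freq = {1: 10, 2: 20, 3: 10, 4: 20, 5: 20, 6: 20}
n = sum(sample_freq.values())
sample = {y: Fraction(freq, n) for y, freq in sample_freq.items()}

population_mean = mean_from_table(population)  # 7/2 = 3.5
sample_mean = mean_from_table(sample)          # 19/5 = 3.8
```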

Page 29: Mean and Variance

Mean of X

                    X   freq   %freq
Low Spender         1   26     0.26
Med Low Spender     2   20     0.20
Average Spender     3   11     0.11
Med High Spender    4   25     0.25
High Spender        5   18     0.18
Total                   100    1.00

Mean (X) = 1*0.26 + 2*0.20 + 3*0.11 + 4*0.25 + 5*0.18 = 2.89

Page 30: Mean and Variance

$X \sim f$

X        f
$x_1$    $f(x_1)$
...      ...
$x_n$    $f(x_n)$
Total    1

$E(X) = \sum_i x_i f(x_i)$

$\sum_i f(x_i) = 1$

Page 31: Mean and Variance

$X \sim f$

X        $X^2$      f
$x_1$    $x_1^2$    $f(x_1)$
...      ...        ...
$x_n$    $x_n^2$    $f(x_n)$
Total               1

$E(X^2) = \sum_i x_i^2 f(x_i)$

Page 32: Mean and Variance

A new variable Q = (X - 3)^2

                    X   Q        %freq
Low Spender         1   (-2)^2   0.26
Med Low Spender     2   (-1)^2   0.20
Average Spender     3   0^2      0.11
Med High Spender    4   1^2      0.25
High Spender        5   2^2      0.18
Total                            1.00

Mean (Q) = (-2)^2*0.26 + (-1)^2*0.20 + 0^2*0.11 + 1^2*0.25 + 2^2*0.18

Page 33: Mean and Variance

$X \sim f$

$E[(X - c)^2] = \sum_i (x_i - c)^2 f(x_i)$

Let $c = E(X)$:

$\mathrm{Var}(X) = E[(X - E(X))^2]$
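The mean squared deviation about a constant c can be evaluated directly for the credit card variable X. The sketch below uses the relative frequencies from the credit card table as f and therefore weights by f(x_i) exactly as in the formula; the helper names `expect` and `mean_sq_dev` are assumptions.

```python
from fractions import Fraction

# Relative frequencies of X from the credit card table (1 = Low ... 5 = High Spender).
f = {1: Fraction(26, 100), 2: Fraction(20, 100), 3: Fraction(11, 100),
     4: Fraction(25, 100), 5: Fraction(18, 100)}


def expect(g, dist):
    """E[g(X)] = sum of g(x_i) * f(x_i) over the table."""
    return sum(g(x) * p for x, p in dist.items())


def mean_sq_dev(c, dist):
    """E[(X - c)^2] for a constant c."""
    return expect(lambda x: (x - c) ** 2, dist)


mean_x = expect(lambda x: x, f)  # 289/100 = 2.89
mean_q = mean_sq_dev(3, f)       # Q = (X - 3)^2 gives 221/100 = 2.21
var_x = mean_sq_dev(mean_x, f)   # c = E(X) gives 21979/10000 = 2.1979
```

Choosing c = E(X) gives a smaller value (2.1979) than c = 3 (2.21), and that smallest value is the variance.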

Page 34: Mean and Variance

Distribution of a sample

$X \sim f^*$

X        f*
$x_1$    $f^*(x_1)$
...      ...
$x_n$    $f^*(x_n)$
Total    1

$E^*(X) = \sum_i x_i f^*(x_i) = \bar{X}$

Page 35: Mean and Variance

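Sample mean

$E^*(X) = \sum_i x_i f^*(x_i) = \frac{1}{n} \sum_i x_i = \bar{X}$

Each observation listed separately:

X       f*
1       1/5
1       1/5
2       1/5
2       1/5
3       1/5
Total   1

Grouped by value:

X       freq   f*
1       2      2/5
2       2      2/5
3       1      1/5
Total   5      1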
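A short sketch of the sample mean computed two ways: from the raw observations and from the grouped relative frequencies f*. It assumes the five observations 1, 1, 2, 2, 3, which is what the frequency table on this page appears to describe.

```python
from collections import Counter
from fractions import Fraction

observations = [1, 1, 2, 2, 3]  # assumed sample, n = 5
n = len(observations)

# Raw form: (1/n) * sum of x_i
raw_mean = Fraction(sum(observations), n)

# Grouped form: sum of x * f*(x), where f*(x) = freq / n
f_star = {x: Fraction(count, n) for x, count in Counter(observations).items()}
grouped_mean = sum(x * p for x, p in f_star.items())
```

Both forms give 9/5 = 1.8, because grouping only collects equal observations into one term.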

Page 36: Mean and Variance

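$X \sim f^*$

$\mathrm{Var}^*(X) = E^*[(X - E^*(X))^2]$

$\sum_i (x_i - \bar{x})^2 f^*(x_i) = \frac{1}{n} \sum_i (x_i - \bar{x})^2$

(O)

Sample variance

$s^2 \text{ or } s_X^2 = \frac{1}{n-1} \sum_i (x_i - \bar{x})^2$

$\mathrm{Var}^*(X) = \frac{n}{n-1} E^*[(X - E^*(X))^2]$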

Page 37: Mean and Variance

For large n,

$\frac{1}{n-1} \sum_i (x_i - \bar{x})^2 \approx \frac{1}{n} \sum_i (x_i - \bar{x})^2$

$\frac{n-1}{n} \approx 1$

$n \ge 20$: large enough
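The effect of dividing by n versus n - 1 can be seen numerically. In this sketch the data are hypothetical; `pvariance` and `variance` from Python's `statistics` module divide the sum of squared deviations by n and by n - 1 respectively.

```python
from statistics import pvariance, variance

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, n = 8
n = len(sample)

var_n = pvariance(sample)         # sum of squares divided by n
var_n_minus_1 = variance(sample)  # sum of squares divided by n - 1

# The two versions differ only by the factor (n - 1) / n,
# which gets closer to 1 as n grows.
ratios = {size: (size - 1) / size for size in (5, 20, 100, 1000)}
```

With n = 20 the factor is already 0.95, and with n = 100 it is 0.99, so the choice of divisor matters mainly for small samples.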

Page 38: Mean and Variance

$s^2 = \frac{1}{n-1} \sum_i (x_i - \bar{x})^2$

$n \to N$

$\sigma_X^2 = \frac{1}{N} \sum_i (x_i - \mu)^2$

Page 39: Mean and Variance

Standard deviation

$sd(X) = \sqrt{\mathrm{Var}(X)}$

$sd^*(X) = \sqrt{\mathrm{Var}^*(X)}$

Page 40: Mean and Variance

V = (X - 2.89)^2

                    X   V            freq
Low Spender         1   (1-2.89)^2   26
Med Low Spender     2   (2-2.89)^2   20
Average Spender     3   (3-2.89)^2   11
Med High Spender    4   (4-2.89)^2   25
High Spender        5   (5-2.89)^2   18
Total                                100

Var*(X) = (1/99)[(1-2.89)^2*26 + ... + (5-2.89)^2*18] = 2.22

sd*(X) = 1.49
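The worked example above can be reproduced from the frequency table. This is a small sketch using the frequencies shown on this page and the n - 1 divisor (1/99) that the example uses; the variable names are illustrative.

```python
from math import sqrt

# Frequencies of X (1 = Low Spender ... 5 = High Spender) from the table above.
freq = {1: 26, 2: 20, 3: 11, 4: 25, 5: 18}

n = sum(freq.values())                                # 100 observations
x_bar = sum(x * k for x, k in freq.items()) / n       # sample mean, 2.89
sum_sq = sum((x - x_bar) ** 2 * k for x, k in freq.items())

var_star = sum_sq / (n - 1)  # divide by n - 1 = 99
sd_star = sqrt(var_star)

print(round(x_bar, 2), round(var_star, 2), round(sd_star, 2))
```

Rounded to two decimals, the results agree with the 2.22 and 1.49 quoted above.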

Page 41: Mean and Variance

Statistics

distribution of a sample      population distribution
sample mean                   population mean
sample variance               population variance
sample median                 population median
...                           ...

Page 42: Mean and Variance

Nn

no. of teeth

weight of body

no. of phone calls

Page 43: Mean and Variance

N

no. of teeth:         $f(x_i) = \frac{freq_i}{N}$,   $\sum_i f(x_i) = 1$

weight of body:       $f(x)$,   $\int f(x)\,dx = 1$

n

no. of phone calls:   $f(x_i) = \frac{freq_i}{n}$,   $\sum_i f(x_i) = 1$

Page 44: Mean and Variance
Page 45: Mean and Variance

$\int f(x)\,dx$          $\sum_i f(x_i)$

$\int x^2 f(x)\,dx$      $\sum_i x_i^2 f(x_i)$

Page 46: Mean and Variance

E

$f^*(x_i), \quad f(x_i), \quad f(x)$

Page 47: Mean and Variance

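Continuous:

$E(X) = \int x f(x)\,dx$

$\mathrm{Var}(X) = E[(X - E(X))^2] = \int (x - E(X))^2 f(x)\,dx$

Discrete:

$E(X) = \sum_i x_i f(x_i)$

$\mathrm{Var}(X) = E[(X - E(X))^2] = \sum_i (x_i - E(X))^2 f(x_i)$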

Page 48: Mean and Variance

Expected value

Page 49: Mean and Variance

$E(X) = \int x f(x)\,dx$

$E(X) = \sum_i x_i f(x_i)$

Page 50: Mean and Variance

        X   f(x_i)
Head    1   0.5
Tail    0   0.5

$E(X) = 0.5$

Page 51: Mean and Variance

Y   f(y_i)
1   1/6
2   1/6
3   1/6
4   1/6
5   1/6
6   1/6

$E(Y) = 3.5$

Page 52: Mean and Variance

$E(1) = 1$

$E(1) = \sum_i 1 \cdot f(x_i) = 1$

$E(c) = c$

X   f(x_i)
1   1/2
1   1/4
1   1/8
1   1/8

Page 53: Mean and Variance

$E(3X) = 3E(X)$

$E(3X) = \sum_i 3 x_i f(x_i) = 3 \sum_i x_i f(x_i) = 3E(X)$

X   3X   f(x_i)
1   3    1/2
2   6    1/4
3   9    1/8
4   12   1/8

$E(cX) = cE(X)$
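The scaling rule can be checked against the table on this page. The sketch uses exact fractions; the helper name `expect` is an assumption, and the constant 5 is an arbitrary example value.

```python
from fractions import Fraction

# Distribution of X from the table on this page.
f = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 8), 4: Fraction(1, 8)}


def expect(g, dist):
    """E[g(X)] = sum of g(x_i) * f(x_i)."""
    return sum(g(x) * p for x, p in dist.items())


e_x = expect(lambda x: x, f)       # 15/8
e_3x = expect(lambda x: 3 * x, f)  # 45/8, which equals 3 * E(X)
e_const = expect(lambda x: 5, f)   # expectation of the constant 5 is 5
```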

Page 54: Mean and Variance

$E(E(X)) = E(E(X) \cdot 1) = E(X)\,E(1) = E(X)$

$E(X\,E(X)) = E(E(X)\,X) = E(X)\,E(X) = (E(X))^2$

Page 55: Mean and Variance

E

$f^*(x_i), \quad f(x_i), \quad f(x)$

Page 56: Mean and Variance

100 × X + 10 × Y

$\sum_i (a x_i + b y_i) = a \sum_i x_i + b \sum_i y_i$

$E(aX + bY) = aE(X) + bE(Y)$

Page 57: Mean and Variance

100 × X + 10 × Y

X       Y   100X   10Y   100X+10Y   f
1 (H)   1   100    10    110        1/12
0 (T)   1   0      10    10         1/12
1 (H)   2   100    20    120        1/12
0 (T)   2   0      20    20         1/12
...
1 (H)   6   100    60    160        1/12
0 (T)   6   0      60    60         1/12

$E(100X + 10Y) = (1/12)[110 + 10 + \cdots + 60]$

$= 100\,E(X) + 10\,E(Y) = 85$
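The coin-and-die example can be verified by listing all 12 outcomes. This sketch assumes the coin X and the die Y are independent, which is consistent with every row of the table carrying probability 1/12.

```python
from fractions import Fraction
from itertools import product

coin = {1: Fraction(1, 2), 0: Fraction(1, 2)}       # X: 1 = Head, 0 = Tail
die = {y: Fraction(1, 6) for y in range(1, 7)}      # Y: faces 1 to 6

# Joint distribution under the independence assumption: 12 pairs, each 1/12.
joint = {(x, y): px * py
         for (x, px), (y, py) in product(coin.items(), die.items())}

# Left side: average 100X + 10Y over the joint table.
lhs = sum((100 * x + 10 * y) * p for (x, y), p in joint.items())

# Right side: 100 E(X) + 10 E(Y) from the two marginal tables.
e_x = sum(x * p for x, p in coin.items())
e_y = sum(y * p for y, p in die.items())
rhs = 100 * e_x + 10 * e_y
```

Both sides come out to 85, matching the slide.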

Page 58: Mean and Variance

$\mathrm{Var}(X) = E[(X - E(X))^2]$

$= E[X^2 - 2X\,E(X) + (E(X))^2]$

$= E(X^2) - (E(X))^2$
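As a quick numerical check of this identity, the sketch below computes the variance of a fair die both from the definition and from the shortcut form. The helper name `expect` is an assumption.

```python
from fractions import Fraction

die = {y: Fraction(1, 6) for y in range(1, 7)}  # fair die


def expect(g, dist):
    """E[g(Y)] = sum of g(y_i) * f(y_i)."""
    return sum(g(y) * p for y, p in dist.items())


mu = expect(lambda y: y, die)                             # E(Y) = 7/2
var_definition = expect(lambda y: (y - mu) ** 2, die)     # E[(Y - E(Y))^2]
var_shortcut = expect(lambda y: y ** 2, die) - mu ** 2    # E(Y^2) - (E(Y))^2
```

Both expressions give 35/12 for the die.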

Page 59: Mean and Variance

$\mathrm{Var}(X) = E[(X - E(X))^2] \le E[(X - c)^2]$

for any constant $c$
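Reading the statement on this page as an inequality (the relation sign did not survive extraction, so this is an assumption), one way to see why it holds is to expand around $\mu = E(X)$, a symbol introduced here only for brevity:

```latex
\begin{aligned}
E[(X - c)^2] &= E\big[\big((X - \mu) + (\mu - c)\big)^2\big] \\
             &= E[(X - \mu)^2] + 2(\mu - c)\,E(X - \mu) + (\mu - c)^2 \\
             &= \mathrm{Var}(X) + (\mu - c)^2 \;\ge\; \mathrm{Var}(X)
\end{aligned}
```

The middle term vanishes because $E(X - \mu) = E(X) - \mu = 0$, and equality holds exactly when $c = E(X)$.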

Page 60: Mean and Variance

$\mathrm{Var}(1) = 0$

$\mathrm{Var}(aX) = a^2\,\mathrm{Var}(X)$
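Both rules can be checked on a small distribution. The sketch below assumes the four-value table used earlier for E(3X) (values 1 to 4 with probabilities 1/2, 1/4, 1/8, 1/8) and the scale factor a = 3; the helper names are illustrative.

```python
from fractions import Fraction

f = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 8), 4: Fraction(1, 8)}


def expect(g, dist):
    """E[g(X)] = sum of g(x_i) * f(x_i)."""
    return sum(g(x) * p for x, p in dist.items())


def variance_of(g, dist):
    """Var(g(X)) = E[(g(X) - E[g(X)])^2]."""
    m = expect(g, dist)
    return expect(lambda x: (g(x) - m) ** 2, dist)


a = 3
var_x = variance_of(lambda x: x, f)       # 71/64
var_ax = variance_of(lambda x: a * x, f)  # 639/64 = 9 * 71/64
var_one = variance_of(lambda x: 1, f)     # a constant has variance 0
```

Scaling X by a multiplies every deviation from the mean by a, so the squared deviations, and hence the variance, are multiplied by a^2.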

Page 61: Mean and Variance

Thank you !!