1 Chapter 5 – Statistical Review Chapter 5 is a brief review of statistical concepts. It is NOT a replacement for a statistics course. By the end of the chapter, you will be able to: 1) Identify population and sample data and perform population and sample statistical calculations. 2) Define, interpret and evaluate statistics. 3) Demonstrate the use of statistical tables.


Chapter 5 – Statistical Review

• Chapter 5 is a brief review of statistical concepts. It is NOT a replacement for a statistics course.

• By the end of the chapter, you will be able to:
1) Identify population and sample data and perform population and sample statistical calculations.
2) Define, interpret and evaluate statistics.
3) Demonstrate the use of statistical tables.
4) Construct confidence intervals and hypothesis tests from sample data.
5) Begin to calculate OLS estimations.


5.1 Simple Economic Models and Random Components

• Consider the linear economic model:

Yi = β1 + β2Xi + єi

• The variable Y is related to another variable X
– Utility is related to hours of TV watched

• єi (epsilon) represents error: everything included in Y that is not explained by X
– e.g., quality of TV show, quality of popcorn, other facts of life


5.1 Observed or Random Components

• єi (epsilon) is the RANDOM ERROR TERM; it takes on values according to chance

• Since Yi depends on єi, it is also random

• β1 + β2Xi is assumed to be fixed in most simple models (which simplifies everything)
– Referred to as the deterministic part of the model
– X, β1 and β2 are non-random

• β1 and β2 are unknown, and must be estimated


5.1 Example

• Consider the function:

Utilityi = β1 + β2Sistersi + єi

• Happiness depends on the number of sisters

• єi captures: number of brothers, income, and other factors (e.g., bad data collection and shocks)

• Utility and Sisters are observable

• Utility and єi are random

• β1 and β2 must be estimated (< or > 0?)


5.2 Random Variables and Probabilities

• Random Variable – a variable whose value is determined by the outcome of a chance experiment

• e.g., sum of a dice roll, card taken out of a deck, performance of a stock, oil discovered in a province, gender of a new baby, etc.

• Some outcomes can be more likely than others (e.g., greater chance to discover oil in Alberta; more likely to roll an 8 than a 5 with two dice)


5.2 Random Variables

• Discrete Variable
– Can take on a finite (or countable) number of values
– e.g., dice roll, card picked

• Continuous Variable
– Can take on any value within a range
– e.g., height, weight, time

• Variables are often assumed discrete to aid in calculations and economic assumptions (e.g., money in increments of 1 cent)


5.2 Probability Terminology

• Probabilities are assigned to the various outcomes of random variables

Sample Space – set of all possible outcomes from a random experiment
- e.g., S = {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}
- e.g., E = {Pass exam, Fail exam, Fail horribly}

Event – a subset of the sample space
- e.g., B = {3, 6, 9, 12} ⊂ S
- e.g., F = {Fail exam, Fail horribly} ⊂ E


5.2 Probability Terminology

Mutually Exclusive Events – cannot occur at the same time
- rolling both a 3 and an 11; being both dead and alive; having both a son and a daughter (and only one child)

Exhaustive Events – cover all possible outcomes
- the sum of a two-dice roll must lie within S = {2, ..., 12}
- a person is either married or not married


5.2 Quiz Example

Students do a 4-question quiz with each question worth 2 marks (no part marks). Handing in the quiz is worth 2 marks, and there is a 1 mark bonus question.

Events:
- getting a zero (not handing in the quiz)
- getting at least 40% (at least 1 right)
- getting 100% or more (all right, or all right plus the bonus question)
- getting 110% (all right plus the bonus question)

*Events contain one or more possible outcomes


5.2 Probability

Probability = the likelihood of an event occurring (between 0 and 1)

P(a) = Prob(a) = probability that event a will occur

P(Y=y) = probability that the random variable Y will take on value y

P(ylow < Y < yhigh) = probability that the random variable Y takes on any value between ylow and yhigh


5.2 Probability Examples

P(true love) = probability that you will find true love

P(Sleep=8 hours) = probability that the random variable Sleep will take on the value 8 hours

P($80 < Wedding Gift < $140) = probability that the random variable Wedding Gift takes on any value between $80 and $140


5.2 Probability Extremes

If Prob(a) = 0, the event will never occur
e.g., Canada moves to Europe
e.g., the price of cars drops below zero
e.g., your instructor turns into a giant llama

If Prob(b) = 1, the event will always occur
e.g., you will get a mark on your final exam
e.g., you will either marry your true love or not
e.g., the sun will rise tomorrow


5.2 Probability Rules

1) P(a) must be greater than or equal to 0 and less than or equal to 1 : 0≤ P(a) ≤1

2) If the events in a set (e.g., {A,B,C}) are exhaustive, then P(A or B or C) = 1
ex) Prob. of winning, losing or tying

3) If the events in a set (e.g., {A,B,C}) are mutually exclusive, then P(A or B or C) = P(A)+P(B)+P(C)
ex) Prob. of marrying the person to the right or left


5.2 Probability Examples

1) P(coin flip=heads) = ½
2) P(2 coin flips=2 heads) = ¼
3) Probability of tossing 6 heads in a row = 1/64
4) Probability of rolling less than 4 with 1 six-sided die = 3/6
5) Probability of throwing a 13 with 2 dice = 0
6) Probability of winning rock, paper, scissors = 1/3 (or 3/9)
7) Probability of being in love or not in love = 1
8) Probability of passing the course = ?


5.2.1 Probability Density Functions

• The probability density function (pdf) summarizes probabilities associated with possible outcomes

Discrete Random Variables – pdf

f(y) = Prob(Y=y)

Σf(y) = 1
- (the sum of the probabilities of all possible outcomes is one)


5.2.1 Dice Example

• The probabilities of rolling a number with the sum of two six-sided dice

• Each number has different die combinations:

7={1+6, 2+5, 3+4, 4+3, 5+2, 6+1}

Exercise: Construct a table with one 4-sided and one 8-sided die

y     f(y)      y     f(y)
2     1/36      8     5/36
3     2/36      9     4/36
4     3/36      10    3/36
5     4/36      11    2/36
6     5/36      12    1/36
7     6/36
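The table above can be rebuilt by listing every combination of the two dice and counting how often each sum appears. The sketch below is one way to do it, assuming fair dice; the helper name `sum_pdf` is made up for this illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def sum_pdf(sides_a, sides_b):
    """Return {sum: probability} for the sum of two fair dice (assumed fair)."""
    faces_a = range(1, sides_a + 1)
    faces_b = range(1, sides_b + 1)
    counts = Counter(a + b for a, b in product(faces_a, faces_b))
    total = sides_a * sides_b  # number of equally likely combinations
    return {y: Fraction(n, total) for y, n in sorted(counts.items())}

pdf = sum_pdf(6, 6)
for y, p in pdf.items():
    # Show each probability over 36 so it lines up with the table
    print(y, f"{p * 36}/36")

# The probabilities of all possible outcomes sum to one
print("total:", sum(pdf.values()))
```

Calling `sum_pdf` with other numbers of sides gives the matching table for other pairs of dice.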


5.2.1 Probability Density Functions

Continuous Random Variables – pdf

f(y) = pdf for continuous random variable Y

∫f(y)dy = 1 (sum/integral of all probabilities of all possibilities is one)
- probabilities are measured as areas under the pdf, which must be non-negative
- technically, the probability of any ONE exact value is zero


5.2.1 Continuous Headache

Continuous Random Variables – pdf

f(y) = 0.2 for 2 < y < 7
     = 0 for y < 2 or y > 7

[Figure: plot of the pdf f(y) against Y; axis marks at 0, 3 and 7, height 0.2]

Continuous probabilities are the area under the pdf curve.

$$P(3 < Y < 7) = \int_3^7 0.2\,dy = [0.2y]_3^7 = 0.2(7) - 0.2(3) = 0.8$$
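The area calculation can be checked numerically. This is a minimal sketch that approximates the integral with a midpoint rule; the function names `uniform_pdf` and `area` are illustrative only, and a simple numerical rule stands in for the exact integral.

```python
def uniform_pdf(y, low=2.0, high=7.0):
    """pdf from the slide: 0.2 for 2 < y < 7, zero elsewhere."""
    return 1.0 / (high - low) if low < y < high else 0.0

def area(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

prob = area(uniform_pdf, 3, 7)    # P(3 < Y < 7), approximately 0.8
total = area(uniform_pdf, 0, 10)  # whole area under the pdf, approximately 1

print(round(prob, 4), round(total, 4))
```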


5.3 Expected Values

Expected Value – measure of central tendency; center of the distribution; population mean
- If the variable is collected an infinite number of times, what average/mean would we expect?

Discrete Variable:
μY = E(Y) = Σ y f(y)

Continuous Variable:
μY = E(Y) = ∫ y f(y) dy


5.3 Expected Example

What is the expected value from a roll of two six-sided dice?

E(Y) = Σ y f(y) = 2(1/36) + 3(2/36) + … + 11(2/36) + 12(1/36) = 7

Exercise: What is the expected value of rolling a 4-sided and an 8-sided die? A 6-sided and a 10-sided die?


5.3 Expected Application – Pascal’s Wager

Pascal’s Wager, from philosopher, mathematician, and physicist Blaise Pascal (1623-62), argues that belief in God can be justified through expected value:
- If you live as if God exists, you get huge rewards if you are right, and waste some time and effort if you’re wrong
- If you live as if God does not exist, you save some time and effort if you’re right, and suffer huge penalties if you’re wrong


5.3 Expected Application – Pascal’s Wager

Mathematically:

E(belief) = Σ utility * f(utility)
= (Utility if God exists)*p(God exists) + (Utility if no God)*p(no God)
= 1,000,000(0.01) + (50)(0.99)
= 10,000 + 49.5
= 10,049.5


5.3 Expected Application – Pascal’s Wager

Mathematically:

E(no belief) = Σ utility * f(utility)
= (Utility if God exists)*p(God exists) + (Utility if no God)*p(no God)
= -1,000,000(0.01) + (150)(0.99)
= -10,000 + 148.5
= -9,851.5

Since E(no belief) = -9,851.5 is less than E(belief), Pascal argued that belief in God is rational.
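The two expected values can be reproduced with a few lines of code. The sketch below uses the utilities and probabilities from the slides; the helper name `expected_utility` is made up for this illustration.

```python
def expected_utility(outcomes):
    """outcomes: list of (utility, probability) pairs; probabilities sum to 1."""
    return sum(utility * prob for utility, prob in outcomes)

p_god = 0.01  # probability used in the slides

belief = expected_utility([(1_000_000, p_god), (50, 1 - p_god)])
no_belief = expected_utility([(-1_000_000, p_god), (150, 1 - p_god)])

print(round(belief, 2), round(no_belief, 2))
print("belief has the higher expected value:", belief > no_belief)
```

Changing `p_god` or the utilities shows how sensitive the comparison is to the numbers that are assumed.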


5.3.1 Properties of Expected Values

a) Constant Property

E(a) = a if a is a constant or non-random variable

e.g., E(14) = 14
e.g., E(β1 + β2Xi) = β1 + β2Xi

b) Constants and random variables

E(a+bW) = a + bE(W)
if a and b are non-random and W is random


5.3.1 Properties of Expected Values

Applications:

If E(єi) = 0, then

E(Yi) = E(β1 + β2Xi + єi)
= β1 + β2Xi + E(єi)
= β1 + β2Xi

E(6-sided + 10-sided) = E(6-sided) + E(10-sided) = 3.5 + 5.5 = 9


5.3.1 Properties of Expected Values

c) “Not so Fast” Property

In general:
E(WV) ≠ E(W)E(V)
E(W/V) ≠ E(W)/E(V)

d) Non-Linear Functions

E(W^k) = Σ w^k f(w)

e.g., for the sum Y of two six-sided dice:
E(Y²) = 2²(1/36) + 3²(2/36) + … + 11²(2/36) + 12²(1/36) ≈ 54.83
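As a quick check of the non-linear property, the sketch below lists all 36 rolls of two six-sided dice (assumed fair) and compares E(Y²) with [E(Y)]². The variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Every equally likely sum from two fair six-sided dice
rolls = [a + b for a, b in product(range(1, 7), repeat=2)]
prob = Fraction(1, len(rolls))

e_y = sum(y * prob for y in rolls)        # E(Y)
e_y2 = sum(y**2 * prob for y in rolls)    # E(Y^2)

print("E(Y) =", e_y)
print("E(Y^2) =", round(float(e_y2), 2))
print("[E(Y)]^2 =", e_y**2)
```

The gap between E(Y²) and [E(Y)]² is the variance, which is covered in 5.4.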


5.4 Variance

Consider the following 3 midterm distributions:

1) Average = 70%; everyone in the class received 70%

2) Average = 70%; half the class received 50% and half received 90%

3) Average = 70%; most of the class was in the 70’s, with a few 100’s and a few 40’s who got a Bachelor in Pottery


5.4 Variance

Although these midterm results share the same average, their distributions differ greatly.

While the first results are clustered together, the other two results are quite dispersed

Variance – a measure of dispersion (how far a distribution is spread out) for a random variable


5.4 Variance Formula

σY² = Var(Y) = E[(Y - E(Y))²]
= E(Y²) - [E(Y)]²

Discrete Random Variable:
σY² = Var(Y) = Σ (y - E(Y))² f(y)

Continuous Random Variable:
σY² = Var(Y) = ∫ (y - E(Y))² f(y) dy


5.4 Variances

Example 1:
E(Y) = 70
Yi = 70 for all i

Var(Y) = Σ (y - E(Y))² f(y)
= (70 - 70)²(1)
= 0

If all outcomes are the same, there is no variance.


5.4 Variances

Example 2:
E(Y) = 70
f(50) = 0.5, f(90) = 0.5

Var(Y) = Σ (y - E(Y))² f(y)
= (50 - 70)²(0.5) + (90 - 70)²(0.5)
= 200 + 200
= 400


5.4 Variances

Example 3:
E(Y) = 70
f(40) = 1/5, f(70) = 3/5, f(100) = 1/5

Var(Y) = Σ (y - E(Y))² f(y)
= (40 - 70)²(1/5) + (70 - 70)²(3/5) + (100 - 70)²(1/5)
= 900/5 + 0 + 900/5
= 1800/5
= 360


5.4 Standard Deviation

While Variance is a good tool for measuring dispersion, it is difficult to represent graphically (e.g., Bell Curve)

Standard Deviation is more useful for a visual view of dispersion

Standard Deviation = Variance^(1/2)

sd(W) = [var(W)]^(1/2)

σ = (σ²)^(1/2)


5.4 SD Examples

In our first example, σ = 0^(1/2) = 0
No dispersion exists

In our second example, σ = 400^(1/2) = 20

In our third example, σ = 360^(1/2) ≈ 19.0

Results were most dispersed in the second example.
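The three midterm distributions can be run through the discrete variance formula in code. The sketch below encodes each distribution as a dictionary of mark and probability, taken from Examples 1 to 3; the helper names `mean` and `variance` are illustrative.

```python
from fractions import Fraction as F
from math import sqrt

def mean(pdf):
    """E(Y) = sum of y * f(y)."""
    return sum(y * p for y, p in pdf.items())

def variance(pdf):
    """Var(Y) = sum of (y - E(Y))^2 * f(y)."""
    mu = mean(pdf)
    return sum((y - mu) ** 2 * p for y, p in pdf.items())

midterms = {
    "Example 1": {70: F(1)},
    "Example 2": {50: F(1, 2), 90: F(1, 2)},
    "Example 3": {40: F(1, 5), 70: F(3, 5), 100: F(1, 5)},
}

results = {name: (mean(pdf), variance(pdf)) for name, pdf in midterms.items()}
for name, (mu, var) in results.items():
    print(name, "mean:", mu, "variance:", var, "sd:", round(sqrt(var), 2))
```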


5.4.1 Properties of Variance

a) Constant Property

Var(a) = 0 if a is a constant or non-random variable
e.g., Var(14) = 0
e.g., Var(β1 + β2Xi) = 0

b) Constants and random variables

Var(a+bW) = b² Var(W)
if a and b are non-random and W is random


5.4.1 Properties of Variance

Applications:

If Var(єi) =k, then

Var(Yi) = Var(β1 + β2Xi + єi)

= 0 + Var(єi)

= k

Exercise: Calculate the variance from:
a) A coin flip
b) A 4-sided die roll
c) Both a and b, where the coin flip represents 0 or 1.


5.4.1 Properties of Variance

c) Covariance Property

If W and V are random variables, and a, b, and c are non-random, then

Var(a+bW+cV) = Var(bW+cV)
= b² Var(W) + c² Var(V) + 2bc Cov(W,V)

where Covariance will be examined in 5.6


5.4.1 Properties of Variance

Application:

If Var(Cost of Gas) = 10 cents
And Var(Cost of a Slurpee) = 5 cents
And Cov(Cost of Gas, Cost of Slurpee) = -1 cent

Var(Cost of Gas + Cost of 2 Slurpees)
= b² Var(G) + c² Var(Sl) + 2bc Cov(G,Sl)
= 1²(10) + 2²(5) + 2(1)(2)(-1)
= 10 + 20 - 4
= 26 cents
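The covariance property is easy to wrap in a small function. The sketch below applies it to the gas and Slurpee numbers, with b = 1 and c = 2; the function name `var_linear_combo` is made up for this illustration.

```python
def var_linear_combo(b, c, var_w, var_v, cov_wv):
    """Var(a + bW + cV) = b^2 Var(W) + c^2 Var(V) + 2bc Cov(W,V).

    The constant a does not appear because it adds no variance.
    """
    return b**2 * var_w + c**2 * var_v + 2 * b * c * cov_wv

# One tank of gas (b = 1) plus two Slurpees (c = 2)
result = var_linear_combo(b=1, c=2, var_w=10, var_v=5, cov_wv=-1)
print(result)
```

Setting `cov_wv=0` shows the variance when the two costs do not move together at all.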


5.5 Joint Probability Density Functions

Sometimes we are interested in the isolated occurrence or effects of one variable. In this case, a simple pdf is appropriate.

Often we are interested in more than one variable or effect. In this case it is useful to use:

- Joint Probability Density Functions
- Conditional Probability Density Functions


5.5 Joint Probability Density Functions

Joint Probability Density Function
- summarizes the probabilities associated with the outcomes of pairs of random variables

f(w,z) = Prob(W=w and Z=z)

∑ f(w,z) = 1

Similar statements are valid for continuous random variables.


5.5 Joint PDF and You

Love and Econ Example:

On Valentine’s Day, Jonny both wrote an Econ 299 midterm and sent a dozen roses to his love interest.

He can either pass or fail the midterm, and his beloved can either embrace or spurn him. E = {Pass, Fail}; L = {Embrace, Spurn}


5.5 Joint PDF and You

Love and Econ Example:

Joint pdf’s are expressed as follows:
P(pass and embrace) = 0.32
P(pass and spurn) = 0.08
P(fail and embrace) = 0.48
P(fail and spurn) = 0.12

(Notice that:∑f(E,L) = 0.32+0.08+0.48+0.12 = 1)


5.5 Joint and Marginal Pdf’s

Marginal (individual) pdf’s can be determined from joint pdf’s. Simply add all of the joint probabilities containing the desired outcome of one of the variables.

e.g., f(Y=7) = ∑ f(Y=7, Z=zi)

Probability that Y=7 = sum of ALL joint probabilities where Y=7


5.5 Love and Economics

f(pass) = f(pass and embrace)+f(pass and spurn)

= 0.32 + 0.08 = 0.40

f(fail) = f(fail and embrace)+f(fail and spurn)

= 0.48 + 0.12 = 0.60

Exercise: Find f(embrace) and f(spurn)


5.5 Love and Economics

Notice:

Since passing or failing are exhaustive outcomes, Prob (pass or fail) = 1

Also, since they are mutually exclusive,

Prob (pass or fail) = Prob (pass) + Prob (fail)

= 0.4 + 0.6 = 1


5.5 Conditional Probability Density Functions

Conditional Probability Density Function
- summarizes the probabilities associated with the possible outcomes of one random variable conditional on the occurrence of a specific value of another random variable

Conditional pdf = joint pdf / marginal pdf

Or

Prob(a|b) = Prob(a&b) / Prob(b)
(Probability of “a” GIVEN “b”)


5.5 Conditional Love and Economics

From our previous example:

Prob(pass|embrace) = Prob(pass and embrace)/Prob(embrace)
= 0.32/0.80
= 0.4

Prob(fail|embrace) = Prob(fail and embrace)/Prob(embrace)
= 0.48/0.80
= 0.6


5.5 Conditional Love and Economics

From our previous example:

Prob(spurn|pass) = Prob(pass and spurn)/Prob (pass)

= 0.08/0.40= 0.2

Prob(spurn|fail) = Prob(fail and spurn)/Prob (fail)

= 0.12/0.60= 0.2


5.5 Conditional Love and Economics

Exercise: Calculate the other conditional pdf’s:

Prob(pass|spurn)

Prob(fail|spurn)

Prob(embrace|pass)

Prob(embrace|fail)


5.5 Statistical Independence

If two random variables (W and V) are statistically independent (one’s outcome doesn’t affect the other at all), then

f(w,v) = f(w)f(v)

And:

1) f(w) = f(w|any v)
2) f(v) = f(v|any w)

As seen in the Love and Economics example.
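The independence condition can be tested directly on the Love and Econ joint pdf. The sketch below is one possible check; the helper names `marginal` and `is_independent` are made up for this illustration, and the check assumes every pair of outcomes is listed in the table.

```python
from fractions import Fraction as F

# Joint pdf from the Love and Econ example: (exam, love) -> probability
joint = {
    ("pass", "embrace"): F(32, 100),
    ("pass", "spurn"): F(8, 100),
    ("fail", "embrace"): F(48, 100),
    ("fail", "spurn"): F(12, 100),
}

def marginal(joint, position, value):
    """Add all joint probabilities whose outcome at `position` is `value`."""
    return sum(p for outcome, p in joint.items() if outcome[position] == value)

def is_independent(joint):
    """True if f(w,v) = f(w)f(v) for every listed pair of outcomes."""
    return all(
        p == marginal(joint, 0, w) * marginal(joint, 1, v)
        for (w, v), p in joint.items()
    )

p_pass = marginal(joint, 0, "pass")
p_pass_given_embrace = joint[("pass", "embrace")] / marginal(joint, 1, "embrace")

print(float(p_pass), float(p_pass_given_embrace), is_independent(joint))
```

The marginal and conditional probabilities of passing match, which is property 1) above.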


5.5 Statistically Dependent Example

Bob can either watch Game of Thrones or Yodeling with the Stars: W={T, Y}. He can either be happy or sad V={H,S}. Joint pdf’s are as follows:

Prob(Thrones and Happy) = 0.70
Prob(Thrones and Sad) = 0.05
Prob(Yodeling and Happy) = 0.10
Prob(Yodeling and Sad) = 0.15


5.5 Statistically Dependent Example

Calculate Marginal pdf’s:

Prob(H) = Prob(T and H) + Prob(Y and H)
= 0.7 + 0.10
= 0.8

Prob(S) = Prob(T and S) + Prob(Y and S)
= 0.05 + 0.15
= 0.2

Prob(H)+Prob(S)=1


5.5 Statistically Dependent Example

Calculate Marginal pdf’s:

Prob(T) = Prob(T and H) + Prob(T and S)
= 0.7 + 0.05
= 0.75

Prob(Y) = Prob(Y and H) + Prob(Y and S)
= 0.10 + 0.15
= 0.25

Prob(T)+Prob(Y)=1


5.5 Statistically Dependent Example

Calculate Conditional pdf’s:

Prob(H|T)=Prob(T and H)/Prob(T)=0.7/0.75=0.93

Prob(H|Y)=Prob(Y and H)/Prob(Y)=0.10/0.25=0.4

Exercise: Calculate the other conditional pdf’s


5.5 Statistically Depressant Example

Notice that since these two variables are NOT statistically independent – Game of Thrones is utility enhancing – the independence property f(v) = f(v|any w) does not hold.

P(Happy) ≠ P(Happy given Thrones): 0.8 ≠ 0.93

P(Sad) ≠ P(Sad given Yodeling): 0.2 ≠ 0.6 (= 1 - 0.4)
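The same comparison can be made in code. This sketch computes the marginal and conditional probabilities for Bob's TV example from the joint pdf given earlier; the helper names are illustrative only.

```python
from fractions import Fraction as F

# Joint pdf for Bob: (show, mood) -> probability
joint = {
    ("T", "H"): F(70, 100),
    ("T", "S"): F(5, 100),
    ("Y", "H"): F(10, 100),
    ("Y", "S"): F(15, 100),
}

def prob_show(show):
    """Marginal probability of watching a given show."""
    return sum(p for (s, _), p in joint.items() if s == show)

def prob_mood(mood):
    """Marginal probability of a given mood."""
    return sum(p for (_, m), p in joint.items() if m == mood)

def mood_given_show(mood, show):
    """Conditional pdf = joint pdf / marginal pdf."""
    return joint[(show, mood)] / prob_show(show)

p_happy_given_thrones = mood_given_show("H", "T")
p_sad_given_yodeling = mood_given_show("S", "Y")

print(float(prob_mood("H")), round(float(p_happy_given_thrones), 2))
print(float(prob_mood("S")), float(p_sad_given_yodeling))
```

Because the conditional probabilities differ from the marginal ones, mood and show are statistically dependent.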


5.5 Conditional Expectations and Variance

Assuming that our variables take numerical values (or can be interpreted numerically), conditional expectations and variances can be taken:

E(P|Q=500) = Σ p f(p|Q=500)

Var(P|Q=500) = Σ [p - E(P|Q=500)]² f(p|Q=500)

e.g., money spent on a car and resulting utility (both random variables expressed numerically).


5.5 Conditional Expectations and Variance

Example: A consumer can spend $5000 or $10,000 on a car, yielding utility of 10 or 20. The conditional probabilities are :

f(10|$5,000) = 0.7
f(20|$5,000) = 0.3

E(U|P=$5000) = Σ U f(U|P=$5000)
= 10(0.7) + 20(0.3)
= 13

Var(U|P=$5K) = Σ [U - E(U|P=$5K)]² f(U|P=$5K)
= (10 - 13)²(0.7) + (20 - 13)²(0.3)
= 21
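A conditional expectation and variance are computed exactly like their unconditional versions, only with the conditional pdf. The sketch below uses the $5,000 car numbers; the variable names are illustrative.

```python
from fractions import Fraction as F

# Conditional pdf of utility given a $5,000 car: utility -> f(U | P = $5000)
cond_pdf = {10: F(7, 10), 20: F(3, 10)}

# E(U | P) = sum of U * f(U | P)
cond_mean = sum(u * p for u, p in cond_pdf.items())

# Var(U | P) = sum of (U - E(U | P))^2 * f(U | P)
cond_var = sum((u - cond_mean) ** 2 * p for u, p in cond_pdf.items())

print(cond_mean, cond_var)
```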


5.6 Covariance and Correlation

If two random variables are NOT statistically independent, it is important to measure the amount of their interconnectedness.

Covariance and Correlation are useful for this.

Covariance and Correlation are also useful in model testing, as you will learn in Econ 399.


5.6 Covariance

Covariance – a measure of the degree of linear dependence between two random variables. A positive covariance indicates some degree of positive linear association between the two variables (the opposite likewise applies)

Cov(V,W)=E{[W-E(W)][V-E(V)]}


5.6 Discrete and Continuous Covariance

Discrete Random Variable:

$$\mathrm{Cov}(V,W) = \sum_v \sum_w (v - E(v))(w - E(w))\, f(v,w)$$

Continuous Random Variable:

$$\mathrm{Cov}(V,W) = \int_v \int_w (v - E(v))(w - E(w))\, f(v,w)\, dw\, dv$$


5.6 Covariance Example

Joe can buy either a burger ($2) or ice cream ($1) and experience utility of 1 or zero. C={$1, $2}, U={0,1}

Prob($1 and 0) = 0.2
Prob($1 and 1) = 0.6
Prob($2 and 0) = 0.1
Prob($2 and 1) = 0.1


5.6 Covariance Example

Prob($1 and 0) = 0.2
Prob($1 and 1) = 0.6
Prob($2 and 0) = 0.1
Prob($2 and 1) = 0.1

Prob($1) = 0.2 + 0.6 = 0.8
Prob($2) = 0.1 + 0.1 = 0.2
Prob(0) = 0.2 + 0.1 = 0.3
Prob(1) = 0.6 + 0.1 = 0.7


5.6 Covariance Example

E(C)=∑cf(c)=$1(0.8)+$2(0.2)=$1.20

E(U)= ∑uf(u)= 0(0.3)+1(0.7)=0.7


5.6 Covariance Example

E(C) = $1.20
E(U) = 0.7

Cov(C,U) = ∑∑ (c-E(C))(u-E(U)) f(c,u)
= (1-1.20)(0-0.7)(0.2) + (1-1.20)(1-0.7)(0.6) + (2-1.20)(0-0.7)(0.1) + (2-1.20)(1-0.7)(0.1)
= (-0.2)(-0.7)(0.2) + (-0.2)(0.3)(0.6) + (0.8)(-0.7)(0.1) + (0.8)(0.3)(0.1)
= 0.028 - 0.036 - 0.056 + 0.024
= -0.04 (Negative Relationship)
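The double sum is easy to get wrong by hand, so it helps to let code add up the four terms. The sketch below uses the joint pdf for Joe's purchase and also checks the result against the shortcut Cov(C,U) = E(CU) - E(C)E(U), a standard identity that the slides do not state. Variable names are illustrative.

```python
from fractions import Fraction as F

# Joint pdf: (cost in dollars, utility) -> probability
joint = {
    (1, 0): F(2, 10),
    (1, 1): F(6, 10),
    (2, 0): F(1, 10),
    (2, 1): F(1, 10),
}

e_c = sum(c * p for (c, _), p in joint.items())   # E(C)
e_u = sum(u * p for (_, u), p in joint.items())   # E(U)

# Definition: sum over all pairs of (c - E(C))(u - E(U)) f(c,u)
cov = sum((c - e_c) * (u - e_u) * p for (c, u), p in joint.items())

# Shortcut: E(CU) - E(C)E(U)
cov_shortcut = sum(c * u * p for (c, u), p in joint.items()) - e_c * e_u

print(float(e_c), float(e_u), float(cov), float(cov_shortcut))
```

Summing the four terms gives -0.04, and the shortcut agrees.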


5.6 Correlation

Covariance is an unbounded measure of interdependence between two variables.

Often, it is useful to obtain a BOUNDED measure of interdependence between two variables, as this opens the door for comparison.

Correlation is such a bounded measure, as it lies between -1 and 1.


5.6 Correlation

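Correlation Formulas:

$$\rho_{VW} = \mathrm{Corr}(V,W) = \frac{\mathrm{Cov}(V,W)}{\sigma_V \sigma_W}$$

$$\mathrm{Corr}(V,W) = \frac{\sum_v \sum_w (v - E(v))(w - E(w))\, f(v,w)}{\sqrt{\mathrm{Var}(v)\,\mathrm{Var}(w)}}$$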


5.6 Correlation Example

From the Data above:

Var(C) = ∑ (c-E(C))² f(c)
= (1-1.20)²(0.8) + (2-1.20)²(0.2)
= 0.032 + 0.128
= 0.16

Var(U) = ∑ (u-E(U))² f(u)
= (0-0.7)²(0.3) + (1-0.7)²(0.7)
= 0.147 + 0.063
= 0.21


5.6 Correlation Example

From the Data above:

Corr(C,U) = Cov(C,U)/[sd(C)sd(U)]
= -0.04 / [0.16(0.21)]^(1/2)
≈ -0.218

(Here Cov(C,U) = E(CU) - E(C)E(U) = 0.8 - (1.20)(0.7) = -0.04.)

Still represents a negative relationship.
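The correlation can be computed end to end from the joint pdf of cost and utility. The sketch below recomputes the means, variances and covariance before dividing; the variable names are illustrative only.

```python
from math import sqrt

# Joint pdf for Joe: (cost in dollars, utility) -> probability
joint = {(1, 0): 0.2, (1, 1): 0.6, (2, 0): 0.1, (2, 1): 0.1}

e_c = sum(c * p for (c, _), p in joint.items())
e_u = sum(u * p for (_, u), p in joint.items())

var_c = sum((c - e_c) ** 2 * p for (c, _), p in joint.items())
var_u = sum((u - e_u) ** 2 * p for (_, u), p in joint.items())
cov = sum((c - e_c) * (u - e_u) * p for (c, u), p in joint.items())

# Corr = Cov / (sd(C) * sd(U)), always between -1 and 1
corr = cov / sqrt(var_c * var_u)

print(round(var_c, 2), round(var_u, 2), round(cov, 3), round(corr, 3))
```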


5.6 Graphical Correlation

If Correlation = 1, observations of the two variables lie upon an upward sloping line

If Correlation = -1, observations of the two variables lie on a downward sloping line

If Correlation is between 0 and 1, observations of the two variables will be scattered along an upward sloping line.

If Correlation is between 0 and -1, observations of the two variables will be scattered along a downward sloping line.


5.6 Correlation, Covariance and Independence

Covariance, correlation and independence have the following relationship:

If two random variables are independent, their covariance (correlation) is zero.

INDEPENDENCE => ZERO COVARIANCE

If two variables have zero covariance (correlation), they may or may not be independent.

ZERO COVARIANCE ≠> INDEPENDENCE


5.6 Correlation, Covariance and Independence

INDEPENDENCE => ZERO COVARIANCE

ZERO COVARIANCE ≠> INDEPENDENCE

From these relationships, we know that

Non-zero Covariance => Dependence

but

Dependence ≠> Non-zero Covariance
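A small counterexample makes the last statement concrete. The example below is not from the slides; it is an assumed illustration in which W takes the values -1, 0 and 1 with equal probability and V = W². V is completely determined by W, yet the covariance is zero.

```python
from fractions import Fraction as F

# Assumed example: W is -1, 0 or 1 with equal probability, and V = W**2
joint = {(w, w**2): F(1, 3) for w in (-1, 0, 1)}

def marginal(position, value):
    """Add all joint probabilities whose outcome at `position` is `value`."""
    return sum(p for outcome, p in joint.items() if outcome[position] == value)

e_w = sum(w * p for (w, _), p in joint.items())
e_v = sum(v * p for (_, v), p in joint.items())
cov = sum((w - e_w) * (v - e_v) * p for (w, v), p in joint.items())

# Independence would need f(w,v) = f(w)f(v) for every pair
independent = all(
    p == marginal(0, w) * marginal(1, v) for (w, v), p in joint.items()
)

print("covariance:", cov, "independent:", independent)
```

Covariance only picks up linear association, so a purely non-linear link such as V = W² can leave it at zero.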


5.7 POPULATION VS. SAMPLE DATA

Population Data – Full information on the ENTIRE population.
- Includes population probability (pdf)
- Uses the previous formulas
- ex) data on an ENTIRE class

Sample Data – Partial information from a RANDOM SAMPLE (smaller selection) of the population
- Individual data points (no pdf)
- Uses the following formulas
- ex) Study of 2,000 random students


5.7 Estimators

Population Expected Value:

μ = E(Y) = Σ y f(y)

Sample Mean:

$\bar{Y} = \frac{\sum Y_i}{N}$

Note: From this point on, $\bar{Y}$ may be expressed as Ybar (and likewise for any other variable, ie: Xbar). For example, via email no equation editor is available, so answers may be in this format.


5.7 Estimators

Population Variance:

$\sigma_Y^2 = Var(Y) = \sum [y - E(y)]^2 f(y)$

Sample Variance:

$S_Y^2 = \frac{\sum (Y_i - \bar{Y})^2}{N-1}$


5.7 Estimators

Population Standard Deviation:

$\sigma_Y = (\sigma_Y^2)^{1/2}$

Sample Standard Deviation:

$S_Y = (S_Y^2)^{1/2}$


5.7 Estimators

Population Covariance:

Cov(V,W)=∑∑(v-E(v))(w-E(w))f(v,w)

Sample Covariance:

$Cov(V,W) = \frac{\sum (V_i - \bar{V})(W_i - \bar{W})}{N-1}$


5.7 Estimators

Population Correlation:

ρvw = corr(V,W) = Cov(V,W) / (σv σw)

Sample Correlation:

rvw = corr(V,W) = Cov(V,W) / (Sv Sw)


5.7 Estimators

Population Regression Function:

Yi = β1 + β2Xi + єi

Estimated Regression Function:

$\hat{Y}_i = \hat{\beta}_1 + \hat{\beta}_2 X_i$


5.7 Estimators

OLS Estimation:

$\hat{\beta}_2 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} = \frac{Cov(X,Y)}{S_X^2}$

$\hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X}$

Note: $\hat{\beta}_2$ may be expressed as B2hat (or b2hat)


5.7 Estimators Example

Given the data set:

Price 4 3 3 6

Quantity 10 15 20 15

Find the sample means, variances, covariance, correlation, and OLS estimates


5.7 Estimators Example

Sample Means:

Pbar = (4+3+3+6)/4 = 4

Qbar = (10+15+20+15)/4 = 15

Price 4 3 3 6

Quantity 10 15 20 15


5.7 Estimators Example

Sample Variance:

$S_P^2 = \frac{(4-4)^2+(3-4)^2+(3-4)^2+(6-4)^2}{N-1} = \frac{0+1+1+4}{3} = 2$

$S_Q^2 = \frac{(10-15)^2+(15-15)^2+(20-15)^2+(15-15)^2}{N-1} = \frac{25+0+25+0}{3} = \frac{50}{3}$

Price 4 3 3 6 Pbar = 4

Quantity 10 15 20 15 Qbar=15


5.7 Estimators Example

Sample Covariance:

$Cov(P,Q) = \frac{(4-4)(10-15)+(3-4)(15-15)+(3-4)(20-15)+(6-4)(15-15)}{N-1} = \frac{0+0-5+0}{3} = -\frac{5}{3}$

Price 4 3 3 6 Pbar = 4

Quantity 10 15 20 15 Qbar=15


5.7 Estimators Example

Sample Correlation

$Corr(P,Q) = \frac{Cov(P,Q)}{S_P S_Q}$

$= \frac{-5/3}{[2(50/3)]^{1/2}}$

$= \frac{-5/3}{10/3^{1/2}} = -0.2887$

Price 4 3 3 6 Pbar = 4

Quantity 10 15 20 15 Qbar=15


5.7 Estimators Example

OLS Estimation

$\hat{\beta}_2 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}$

$= \frac{(4-4)(10-15)+(3-4)(15-15)+(3-4)(20-15)+(6-4)(15-15)}{(4-4)^2+(3-4)^2+(3-4)^2+(6-4)^2}$

$= -\frac{5}{6}$

Price 4 3 3 6 Pbar = 4

Quantity 10 15 20 15 Qbar=15


5.7 Estimators Example

OLS Estimation

$\hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X} = 15 - (-\frac{5}{6})(4) = \frac{90}{6} + \frac{20}{6} = \frac{110}{6}$

$\hat{Y}_i = \frac{110}{6} - \frac{5}{6} X_i$

$\hat{Q}_i = \frac{110}{6} - \frac{5}{6} P_i$

Price 4 3 3 6 Pbar = 4

Quantity 10 15 20 15 Qbar=15
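The whole worked example can be reproduced in a few lines of code. This is a minimal sketch using only the sample formulas from this section; the variable names are illustrative choices, with Price playing the role of X and Quantity the role of Y.

```python
import math

price = [4, 3, 3, 6]
quantity = [10, 15, 20, 15]
n = len(price)

# Sample means
p_bar = sum(price) / n
q_bar = sum(quantity) / n

# Sample variances and covariance (divide by N-1)
var_p = sum((p - p_bar) ** 2 for p in price) / (n - 1)
var_q = sum((q - q_bar) ** 2 for q in quantity) / (n - 1)
cov_pq = sum((p - p_bar) * (q - q_bar) for p, q in zip(price, quantity)) / (n - 1)

# Sample correlation
corr_pq = cov_pq / (math.sqrt(var_p) * math.sqrt(var_q))

# OLS estimates for Qhat = b1_hat + b2_hat * P
b2_hat = cov_pq / var_p
b1_hat = q_bar - b2_hat * p_bar

print("Pbar, Qbar:", p_bar, q_bar)
print("Sp^2, Sq^2:", var_p, var_q)
print("Cov, Corr:", cov_pq, corr_pq)
print("B1hat, B2hat:", b1_hat, b2_hat)
```

Here B2hat is computed as Cov(P,Q)/Sp^2, which equals the ratio of sums in the OLS formula because the (N-1) terms cancel.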


5.7.1 Estimators as random variables

Each of these estimators will give us a result based upon the data available.

Two different data sets can therefore yield two different point estimates.

The value of a point estimate can thus be seen as the result of a chance experiment: obtaining a data set.

Each point estimate is therefore a random variable, with a probability distribution that can be analyzed using the expectation and variance operators.


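5.7.1 Estimators Distribution

Since the sample mean is a random variable, we can easily apply expectation and summation rules to find the expected value of the sample mean:

$\bar{Y} = \frac{1}{N} \sum Y_i$

$E(\bar{Y}) = E\left(\frac{1}{N} \sum Y_i\right) = \frac{1}{N} E\left(\sum Y_i\right) = \frac{1}{N} \sum E(Y_i) = \frac{1}{N} \sum \mu_Y = \frac{1}{N} N \mu_Y = \mu_Y$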


5.7.1 Estimators Distribution

If we make the simplifying assumption that there is no covariance between data points (ie: one person’s consumption is unaffected by the next person’s consumption), we can easily calculate variance for the sample mean:

$Var(\bar{Y}) = Var\left(\frac{1}{N} \sum Y_i\right) = \frac{1}{N^2} Var\left(\sum Y_i\right) = \frac{1}{N^2} \sum Var(Y_i) = \frac{1}{N^2} N \sigma_Y^2 = \frac{\sigma_Y^2}{N}$
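Both results for the sample mean (expected value equal to the population mean, variance equal to the population variance divided by N) can be checked exactly on a small case. The population below is hypothetical (the six faces of a fair die) and the sample size of 2 is an arbitrary choice; the sketch lists every equally likely independent sample and computes the mean and variance of the resulting sample means.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4, 5, 6]  # hypothetical population: faces of a fair die
N = 2                            # sample size (arbitrary choice for illustration)

# Population mean and variance (each value has probability 1/6)
mu = Fraction(sum(population), len(population))
sigma2 = sum((y - mu) ** 2 for y in population) / len(population)

# Every equally likely sample of size N, with draws independent of each other
sample_means = [Fraction(sum(s), N) for s in product(population, repeat=N)]

# Expected value and variance of the sample mean across all possible samples
e_ybar = sum(sample_means) / len(sample_means)
var_ybar = sum((m - e_ybar) ** 2 for m in sample_means) / len(sample_means)

print("E(Ybar) =", e_ybar, " population mean =", mu)
print("Var(Ybar) =", var_ybar, " sigma^2 / N =", sigma2 / N)
```

Exact fractions are used so that the two equalities hold without rounding error.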


5.7.1 Estimators Distribution

Although we can’t observe the population variance of Ybar, we can calculate its sample variance. Therefore,

$Var(\bar{Y}) = \frac{\sigma_Y^2}{N}$

$SampleVar(\bar{Y}) = \frac{S_Y^2}{N}$


5.8 Common Economic Distributions

In order to test assumptions and models, economists need to be familiar with the following distributions:

- Normal
- t
- Chi-square
- F

For full examples and explanations of these tables, please refer to a statistics text.


5.8 Normal Distribution

The Normal (Z) Distribution produces a symmetric bell-shaped curve with a mean of zero and a standard deviation of one.

- The probability that z>0 is always 0.5
- The probability that z<0 is always 0.5
- Z-tables generally (but not always) measure area from the centre
- Probabilities decrease as you move away from the centre


5.8 Normal Example

Weekly weight gain can be argued to have a normal distribution:

- On average, no weight is gained or lost
- A few pounds may be gained or lost
- It is very unlikely to lose or gain many pounds

Treating weekly weight change (in pounds) as a Z variable, find Prob(Gain between 0 and 1 pound)

Prob(0<z<1) = 0.3413 = 34.13%


5.8 Normal Example

Find Prob(Lose more than 2 pounds)

Prob(z<-2) = 0.5 - 0.4772 = 0.0228 (2.28%)

Find Prob (Do not gain more than 2 pounds)

Prob(z<2) = 0.5+0.4772 = 0.9772(97.72%)
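The table look-ups in the weight example can be cross-checked in software. The sketch below uses `NormalDist` from Python's standard `statistics` module (Python 3.8 or later); it works with the cumulative area to the left of z, not the area from the centre that the Z-table reports, so the subtraction from 0.5 is not needed.

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)  # standard normal: mean zero, standard deviation one

# Prob(0 < z < 1): area to the left of 1 minus area to the left of 0
p_gain_0_to_1 = z.cdf(1) - z.cdf(0)

# Prob(z < -2): lose more than 2 pounds
p_lose_more_than_2 = z.cdf(-2)

# Prob(z < 2): do not gain more than 2 pounds
p_not_gain_more_than_2 = z.cdf(2)

print(round(p_gain_0_to_1, 4))
print(round(p_lose_more_than_2, 4))
print(round(p_not_gain_more_than_2, 4))
```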


5.8 Converting to a normal distribution

Z distributions assume that the mean is zero and the standard deviation is one.

If this is not the case, the variable needs to be converted to a standard normal (Z) variable using the following formula:

$Z = \frac{x - \mu_x}{\sigma_x}$


5.8 Assignment Example

The average for the Fall 2005 Assignment #2 was 82%. The standard deviation was approximately 6. What is the probability of a random student getting above 90%?

Prob(Y>90) = Prob[{(Y-82)/6}>{(90-82)/6}]
= Prob(Z>1.33)
= 0.5 - Prob(0<Z<1.33)
= 0.5 - 0.4082
= 0.0918 = 9.18%


5.8 Assignment Example

What is the probability of getting a mark in the 80s?

Prob(79<Y<90)
= Prob[{(79-82)/6}<{(Y-82)/6}<{(90-82)/6}]
= Prob(-0.5<Z<1.33)
= Prob(0<Z<0.5) + Prob(0<Z<1.33)
= 0.1915 + 0.4082
= 0.5997 = 59.97%


5.8 Assignment Example

Find the mark (Y*) for which there is a 15% probability that Y<Y* (bottom 15% of the class)

(Since 0.15<0.50, Z*<0)

Prob(Z<Z*) = 0.5-Prob(0<Z<-Z*)
0.15 = 0.5-Prob(0<Z<-Z*)
Prob(0<Z<-Z*) = 0.35

From tables, -Z* = 1.04
Therefore Z* = -1.04


5.8 Example Continued

We know that

Z = (x-μ)/σ

So
X = μ+z(σ)
X = 82+(-1.04)6
X = 82-6.24
X = 75.76

There is a 15% chance that a student scored less than 75.76%
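The three assignment questions can also be answered without converting to Z by hand. The sketch below again relies on `statistics.NormalDist` from the Python standard library and assumes, as the example does, that marks are normally distributed with mean 82 and standard deviation 6. Its answers differ slightly from the table-based ones because the tables round Z to two decimals.

```python
from statistics import NormalDist

marks = NormalDist(mu=82, sigma=6)  # assumed distribution of assignment marks

# Probability of a mark above 90
p_above_90 = 1 - marks.cdf(90)

# Probability of a mark in the 80s, using the same bounds as the example (79 < Y < 90)
p_in_80s = marks.cdf(90) - marks.cdf(79)

# Mark Y* such that Prob(Y < Y*) = 0.15
bottom_15_cutoff = marks.inv_cdf(0.15)

print(round(p_above_90, 4))
print(round(p_in_80s, 4))
print(round(bottom_15_cutoff, 2))
```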


5.8 Other Distributions

All other distributions depend on DEGREES OF FREEDOM

Degrees of Freedom are generally dependent on two things:

- Sample size (as sample size rises, so do degrees of freedom)
- Complication of test (more complicated statistical tests reduce degrees of freedom)

Simple conclusions are easier to make than complicated ones


5.8 t-distribution

t-distributions can involve 1-tail or 2-tail tests

Interpolation is often needed within the table

Example 1:

Find the critical t-values (t*) that cut off 1% of both tails with 27df

(Note: 1% off both tails = 0.5% off each tail)

For p=0.495, df 27 gives t*=2.77, -2.77


5.8 t-distribution

Example 2: Find the critical t-value (t*) that cuts off 1% of the right tail with 35df

For 1T=0.01, df 30 gives t*=2.46

df 40 gives t*=2.42

Since 35 is halfway between 30 and 40, a good approximation of df 35 would be:

t*=(2.46+2.42)/2 = 2.44


5.8 t-distribution

Typically, the following variable (similar to the normal Z variable seen earlier) will have a t-distribution: (we will see examples later)

$t = \frac{\text{Estimator} - E(\text{Estimator})}{\text{Sample sd}(\text{Estimator})}$


5.8 chi-square distribution

- Chi-square distributions are 1-tail tests
- Interpolation is often needed within the table

Example: Find the critical chi-squared value that cuts off 5% of the right tail with 2df

For Right Tail = 0.05, df=2
Critical Chi-Squared Value = 5.99


5.8 F-distribution

- F-distributions are 1-tail tests
- Interpolation is often needed within the table

Example: Find the critical F value (F*) that cuts off 1% of the right tail with 3df in the numerator and 80df in the denominator

For Right Tail = 0.01, df1=3, df2=80:
df2=60 gives F*=4.13
df2=120 gives F*=3.95


5.8 Interpolation

df2=60 gives F*=4.13
df2=120 gives F*=3.95

Since 80 is 1/3 of the way between 60 and 120 (60, 80, 100, 120), our F-value should be 1/3 of the way between 4.13 and 3.95 (4.13, ?, 3.95).

Approximation:

F*=4.13-(4.13-3.95)/3=4.07
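The same interpolation applies to any table, so it can be written once as a small function. This is a sketch only: `interpolate` is a hypothetical helper name, and the table values passed to it are the ones quoted in the F and t examples above.

```python
def interpolate(x, x0, y0, x1, y1):
    """Linear interpolation: value at x on the straight line through (x0, y0) and (x1, y1)."""
    fraction = (x - x0) / (x1 - x0)   # how far x lies between x0 and x1
    return y0 + fraction * (y1 - y0)

# F example: df2 = 80 lies 1/3 of the way between df2 = 60 and df2 = 120
f_star = interpolate(80, 60, 4.13, 120, 3.95)

# t example: df = 35 lies halfway between df = 30 and df = 40
t_star = interpolate(35, 30, 2.46, 40, 2.42)

print(round(f_star, 2))
print(round(t_star, 2))
```

Linear interpolation is only an approximation, since critical values do not change linearly with degrees of freedom, but it is usually close enough for table work.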


5.8 Distribution Usage

Different tests of models will use different tables, as we will see later in the course.

In general:
1) Normal tables do distribution estimations
2) t-tables do simple tests
3) F-tables do simultaneous tests – Prob(a & b)
4) Chi-squared tables do complicated tests devised by mathematicians smarter than you or me (they invented them, we use them)


5.9 Confidence Intervals

Thus far, all our estimates have been POINT estimates; a single number emerges as our estimate for an unknown parameter.

Ie) $\bar{X} = 3.74$

Even if we have good data and have an estimator with a small variance, the chances that our estimate will equal the actual value are very low.

Ie) A coin is expected to turn up heads half the time, but the chance that it does exactly that in an experiment is very low.


5.9.1 Constructing Confidence Intervals

Confidence intervals or interval estimators acknowledge underlying uncertainties and are an alternative to point estimators

Confidence intervals propose a range of values in which the true parameter could lie, with a given probability.

Confidence intervals can be constructed since our point estimates are RANDOM VARIABLES.


5.9.1 Degrees of Freedom

When given actual population data, we converted into a z-score:

Z = (x-μ)/σ

With random samples, we convert into a t-score:

t = (x – E(x)) / sample sd(x) with n-1 degrees of freedom

This is proven by various complicated central limit theorems


5.9.1 CI’s and Alpha

The probability attached to a confidence interval is expressed through α (alpha).

Given α, we construct a 100(1- α)% confidence interval. If α=5%, we construct a 95% confidence interval.

P(Lower limit<true parameter<Upper limit)=1- α


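5.9.1 Formula

Given a repeated sample, we want to construct confidence intervals for the mean such that:

$P\{-t^* < (\bar{X} - \mu_X)/s_{\bar{X}} < t^*\} = 1 - \alpha$

[Figure: t distribution with the central 100(1-α)% area lying between -t* and t*]

Where t has n-1 degrees of freedom, and ±t* cuts α/2 off each tail.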


5.9.1 Formula

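Rearranging we get:

$P\{\bar{X} - t^* s_{\bar{X}} < \mu_X < \bar{X} + t^* s_{\bar{X}}\} = 1 - \alpha$

[Figure: interval from $\bar{X} - t^* s_{\bar{X}}$ to $\bar{X} + t^* s_{\bar{X}}$, centred on $\bar{X}$, covering μX with probability 100(1-α)%]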


5.9.1 Formula

Our final formula becomes:

$CI_{\mu_X} = \bar{X} \pm t^* s_{\bar{X}}$

Or in general:

$CI_{\text{true value}} = \text{estimate} \pm t^* s_{\text{estimate}}$

Which gives us an upper and lower bound for our CI.


5.9.1 Example

Flipping a coin has given us 25 heads with a value of 1, and 15 tails with a value of zero. Find the 95% CI if n=40.

We therefore have:

$\bar{C} = \frac{25(1) + 15(0)}{40} = \frac{25}{40} = 0.625$


5.9.1 IMPORTANT - Estimated Standard Deviation of a Sample Mean

We have already seen that sample standard deviation is found through the formula:

$S_Y = \sqrt{\frac{\sum (Y_i - \bar{Y})^2}{N-1}}$

Standard deviation of a sample mean is found through:

$s_{\bar{Y}} = s_Y / \sqrt{N}$


5.9.1 Example

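$S_C = \sqrt{\frac{\sum (C_i - \bar{C})^2}{N-1}}$

$S_C = \sqrt{\frac{25(1-0.625)^2 + 15(0-0.625)^2}{40-1}}$

$S_C = \sqrt{\frac{9.375}{39}} = 0.49$

$s_{\bar{C}} = s_C / \sqrt{N} = 0.49 / \sqrt{40} = 0.077$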


5.9.1 Example

A 95% CI has 2.5% off each tail. If n=40 (39 degrees of freedom), t* = 2.02

$CI_C = \bar{C} \pm t^* s_{\bar{C}}$

$CI_C = 0.625 \pm 2.02(0.077)$

$CI_C = [0.47, 0.78]$


5.9.1 Interpretation:

In this example, we have a confidence interval of [0.47, 0.78].

In other words, in repeated samples, 95% of intervals constructed this way will include the true probability of getting a “heads” when flipping the coin.


5.9.1 Confidence Requirements

In order to construct a confidence interval, one needs:

a) A point estimate of the parameter
b) Estimated standard deviation of the point estimate
c) A critical value from a probability distribution (or α and the sample size, n)


5.10 Hypothesis Testing

After a model has been derived, it is often useful to test various hypotheses:

- Is a pair of dice weighted towards a particular number (say 11)?
- Does a player get blackjack more often than he should?
- Will raising tuition increase graduation rates?
- Will soaring gas costs decrease car sales?
- Will the recession affect Xbox sales?
- Does fancy wrapping increase the appeal of Christmas presents?
- Does communication between rivals affect price?


5.10 Hypothesis Testing

Question: Is our data CONSISTENT with a particular parameter having a specific value?

We may observe an outcome (ie: a Blackjack player has 150% of his starting chips, when the average outcome should be 80%).

We need to test if this outcome is:
1) Consistent with typical chance, or
2) Inconsistent – perhaps showing cheating


5.10 Hypothesis Testing

Testing Consistency of a Hypothesized Parameter:

1) Form a null and an alternate hypothesis.

H0 = null hypothesis = variable is equal to a number
Ha = alternate hypothesis = variable is not equal to a number

EX)
H0: Outcome=0.8
Ha: Outcome≠0.8


5.10 Hypothesis Testing

Testing Consistency of a Hypothesized Parameter:

2) Collect appropriate sample data
3) Select an acceptable probability (α) of rejecting a null hypothesis when it is true
- This is a Type 1 error
- The lower α is, the less likely we are to find a sample that rejects the null hypothesis
- α is often 10%, 5%, or 1%


5.10 Hypothesis Testing

Testing Consistency of a Hypothesized Parameter:

4) Construct an appropriate test statistic
- ensure the test statistic can be calculated from the sample data
- ensure its distribution is appropriate to that being tested (ie: t-statistic for test for mean)


5.10 Hypothesis Testing

Testing Consistency of a Hypothesized Parameter:

5) Establish reject and do not reject regions
- Construct bell curve
- Tails are Reject H0 regions
- Centre is the Do not Reject H0 region


5.10 Hypothesis Testing

Testing Consistency of a Hypothesized Parameter:

6) Compare the test statistic to the critical statistic
- If the test statistic lies in the tails, reject
- If the test statistic doesn’t lie in the tails, do not reject
- Never Accept

7) Interpret Results


5.10 Hypothetical Example

Johnny is a poker player who wins an average of 8 times out of ten (the standard deviation is 0.5). Test the hypothesis that Johnny never wins.

1) H0: W=0
Ha: W≠0

2) We have estimated W=8. The standard deviation was 0.5
3) We let α=1%; we want a strong result.
4) t = (estimate-hypothesis)/sd = (8-0)/0.5 = 16
5) t* for n-1=119, α=1%: t*=2.62

6) t*<t; Reject H0


5.10 Hypothetical Example

7) Allowing for a 1% chance of a Type 1 error, we reject the null hypothesis that Johnny never wins at Poker.

Our data are consistent with Johnny sometimes winning at Poker.
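Steps 4 to 6 of the example amount to a short calculation, sketched below. The function `two_tailed_t_test` is a hypothetical helper (not part of any library), and the critical value 2.62 is taken from the t-table as in the example.

```python
def two_tailed_t_test(estimate, hypothesized, sd, t_critical):
    """Return the t statistic and whether H0 is rejected in a two-tailed test."""
    t_stat = (estimate - hypothesized) / sd
    # Reject H0 when the statistic falls in either tail, beyond +/- t_critical
    reject = abs(t_stat) > t_critical
    return t_stat, reject

# Johnny's example: estimate W = 8, H0: W = 0, sd = 0.5, t* = 2.62 (119 df, alpha = 1%)
t_stat, reject = two_tailed_t_test(estimate=8, hypothesized=0, sd=0.5, t_critical=2.62)

print("t =", t_stat)
print("Reject H0" if reject else "Do not reject H0")
```

Comparing the absolute value of t to t* covers both tails at once, which matches the reject regions described in step 5.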