Chapter 5 Joint Probability Distributions and Random Samples

5.1 - Jointly Distributed Random Variables

5.2 - Expected Values, Covariance, and Correlation

5.3 - Statistics and Their Distributions

5.4 - The Distribution of the Sample Mean

5.5 - The Distribution of a Linear Combination

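X = Age of women in U.S. at first birth

Population Distribution of X

Suppose X ~ N(μ, σ), with μ = 25.4 and σ = 1.5.

[Figure: density curve of X, centered at μ = E[X] = 25.4 with σ = 1.5; individual observations x1, x2, x3, x4, x5, … etc. are marked along the X axis.]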

Each of these individual ages x is a particular value of the random variable X. Most are in the neighborhood of μ, but there are occasional outliers in the tails of the distribution.

[Figure: the same population density of X (μ = E[X] = 25.4, σ = 1.5); repeated random samples, each of size n = 400, yield sample means \(\bar{x}_1, \bar{x}_2, \bar{x}_3, \bar{x}_4, \bar{x}_5\), … etc.]

Each of these sample mean values \(\bar{x}\) is a “point estimate” of the population mean μ, and thus a particular value of the random variable \(\bar{X}\).

How are these values distributed?

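Suppose X ~ N(μ, σ), then…

Sampling Distribution of \(\bar{X}\)

\[ \bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right) \quad \text{for any sample size } n. \]

[Figure: the population density of X (μ = 25.4, σ = 1.5) above the much narrower density of \(\bar{X}\), both centered at \(\mu = E[\bar{X}] = 25.4\); the sample means \(\bar{x}_1, \bar{x}_2, \bar{x}_3, \bar{x}_4, \bar{x}_5\), … etc. are marked along the axis.]

The vast majority of sample means are extremely close to μ, i.e., the sample mean has extremely small variability.

\[ \text{“standard error”} = \frac{\sigma}{\sqrt{n}} = \frac{1.5 \text{ yrs}}{\sqrt{400}} = 0.075 \text{ yrs} \]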
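The standard error of 0.075 years can be checked by simulation. The sketch below is only an illustration, not part of the original slides: the function name `simulate_sample_means`, the seed, and the choice of 2000 repeated samples are assumptions made here.

```python
import random
import statistics

def simulate_sample_means(mu, sigma, n, num_samples, seed=0):
    """Draw num_samples random samples of size n from N(mu, sigma); return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(num_samples)
    ]

mu, sigma, n = 25.4, 1.5, 400            # values from the slides
means = simulate_sample_means(mu, sigma, n, num_samples=2000)  # 2000 is an arbitrary choice

standard_error = sigma / n ** 0.5        # theoretical value sigma / sqrt(n)
print(f"theoretical standard error: {standard_error:.4f}")
print(f"mean of sample means:       {statistics.fmean(means):.4f}")
print(f"std. dev. of sample means:  {statistics.stdev(means):.4f}")
```

The empirical standard deviation of the simulated sample means should land close to the theoretical standard error, and their average close to μ.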

Suppose X ~ N(μ, σ), then

\[ \bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right) \quad \text{for any sample size } n. \]

Suppose X ~ Anything with finite μ and σ, then

\[ \bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right) \quad \text{approximately, for large sample size } n. \]

[Figure: a population density of X with μ = 25.4 and σ = 2.4, above the density of \(\bar{X}\) for samples of size n = 400, centered at \(\mu = E[\bar{X}] = 25.4\) with standard error \(\sigma/\sqrt{400}\); the sample means \(\bar{x}_1, \bar{x}_2, \bar{x}_3, \bar{x}_4, \bar{x}_5\), … etc. are marked along the axis.]

[Figure: density of X, with \(X \sim N(\mu, \sigma)\), above the narrower density of \(\bar{X}\), with \(\bar{X} \sim N(\mu, \sigma/\sqrt{n})\), where \(\sigma/\sqrt{n}\) is the “standard error”.]


Example: X = Cost of new house ($K), where X ~ N(μ, σ) with μ = $270K and σ = $75K.

• Probability that a single house selected at random costs less than $300K = ?

P(X < $300K) = Cumulative area under density curve for X up to 300.

\[ Z = \frac{X - \mu}{\sigma} \sim N(0, 1), \qquad z = \frac{300 - 270}{75} = 0.4 \quad (\text{Z-score}) \]

\[ P(X < 300) = P(Z < 0.4) = 0.6554 \]

• Probability that the sample mean of n = 36 houses selected at random is less than $300K = ?

P(\(\bar{X}\) < $300K) = Cumulative area under density curve for \(\bar{X}\) up to 300, where

\[ \bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right), \qquad \mu = 270, \qquad \frac{\sigma}{\sqrt{n}} = \frac{75}{\sqrt{36}} = 12.5 \quad (\text{in \$K}) \]

\[ Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1), \qquad z = \frac{300 - 270}{12.5} = 2.4 \quad (\text{Z-score}) \]

\[ P(\bar{X} < 300) = P(Z < 2.4) = 0.9918 \]
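The two probabilities in the house-price example (μ = $270K, σ = $75K, n = 36) can be reproduced without a Z-table. This is a minimal sketch using `NormalDist` from Python's standard library; the variable names are chosen here for illustration.

```python
from statistics import NormalDist

mu, sigma, n = 270.0, 75.0, 36            # values from the example, in $K
standard_error = sigma / n ** 0.5         # 75 / 6 = 12.5

# Distribution of a single house price X, and of the sample mean of n houses.
single_house = NormalDist(mu, sigma)
sample_mean = NormalDist(mu, standard_error)

z_single = (300 - mu) / sigma             # Z-score for one house
z_mean = (300 - mu) / standard_error      # Z-score for the sample mean

p_single = single_house.cdf(300)          # P(X < 300)
p_mean = sample_mean.cdf(300)             # P(X-bar < 300)

print(f"single house: z = {z_single:.1f}, P = {p_single:.4f}")
print(f"sample mean:  z = {z_mean:.1f}, P = {p_mean:.4f}")
```

The same threshold of $300K is only 0.4 standard deviations above μ for a single house, but 2.4 standard errors above μ for the mean of 36 houses, which is why the second probability is so much closer to 1.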

If X is only approximately normal (mild skew), then

\[ \bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right) \quad \text{approximately, for large } n. \]

~ CENTRAL LIMIT THEOREM ~

Suppose X ~ Dist(μ, σ), continuous or discrete, has finite μ and σ. Then

\[ \bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right) \quad \text{approximately, as } n \to \infty \text{ (i.e., for large } n\text{)}. \]
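One way to see the Central Limit Theorem at work is to sample from a clearly non-normal population and check how often the sample mean falls within 1.96 standard errors of μ, which would be about 95% of the time under a normal sampling distribution. The sketch below assumes an Exponential(1) population (μ = σ = 1), n = 100, and 5000 repeated samples; all of these choices, and the function name, are illustrative and not from the slides.

```python
import random
import statistics

def coverage_of_sample_means(n, num_samples, seed=1):
    """Fraction of sample means within 1.96 standard errors of mu, for Exponential(1) data."""
    rng = random.Random(seed)
    mu = sigma = 1.0                      # mean and std. dev. of Exponential(rate=1)
    standard_error = sigma / n ** 0.5
    hits = 0
    for _ in range(num_samples):
        xbar = statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        if abs(xbar - mu) <= 1.96 * standard_error:
            hits += 1
    return hits / num_samples

coverage = coverage_of_sample_means(n=100, num_samples=5000)
print(f"fraction within 1.96 standard errors of mu: {coverage:.3f}")
```

Even though the exponential population is heavily skewed, the fraction should come out near 0.95 for a sample size of this order.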


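Example (continued): suppose only that X = Cost of new house ($K) has finite μ = $270K and σ = $75K. Then, for n = 36, approximately

\[ \bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right), \qquad \mu = 270, \qquad \text{“standard error”} = \frac{\sigma}{\sqrt{n}} = \frac{75}{\sqrt{36}} = 12.5 \quad (\text{in \$K}) \]

P(\(\bar{X}\) < $300K) = Cumulative area under density curve for \(\bar{X}\) up to 300.

\[ Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}, \qquad z = \frac{300 - 270}{12.5} = 2.4 \quad (\text{Z-score}), \qquad P(\bar{X} < 300) \approx P(Z < 2.4) = 0.9918 \]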

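Example: X ($) has the following probability distribution.

x      f(x)
0      0.5
10     0.3
20     0.2

\[ \mu_X = (0)(0.5) + (10)(0.3) + (20)(0.2) = 7.00 \ (\$) \]

\[ \sigma_X^2 = (0)^2(0.5) + (10)^2(0.3) + (20)^2(0.2) - (7)^2 = 61 \ (\$^2) \]

Sampling distribution of \(\bar{X}\) for random samples of size n = 2:

x̄      f(x̄)
0      .25
5      .30 = .15 + .15
10     .29 = .10 + .09 + .10
15     .12 = .06 + .06
20     .04

\[ \mu_{\bar{X}} = (0)(.25) + (5)(.30) + (10)(.29) + (15)(.12) + (20)(.04) = 7.00 \ (\$) \]

\[ \sigma_{\bar{X}}^2 = (0)^2(.25) + (5)^2(.30) + (10)^2(.29) + (15)^2(.12) + (20)^2(.04) - (7)^2 = 30.5 = \frac{61}{2} \ (\$^2) \]

Sampling distribution of \(\bar{X}\) for random samples of size n = 3:

[Table: x̄ and f(x̄) for n = 3]

\[ \mu_{\bar{X}} = 7.00 \ (\$), \qquad \sigma_{\bar{X}}^2 = \frac{61}{3} \ (\$^2) \]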
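The sampling distributions in the discrete example (x = 0, 10, 20 with probabilities 0.5, 0.3, 0.2) can be built exactly by enumerating every possible sample. The sketch below uses exact fractions so no rounding is involved; the helper names `sample_mean_pmf` and `mean_and_variance` are introduced here for illustration.

```python
from fractions import Fraction
from itertools import product

# Population distribution from the example: x -> f(x)
pmf = {0: Fraction(5, 10), 10: Fraction(3, 10), 20: Fraction(2, 10)}

def sample_mean_pmf(pmf, n):
    """Exact distribution of the sample mean of n i.i.d. draws from pmf."""
    dist = {}
    for outcome in product(pmf, repeat=n):
        prob = Fraction(1)
        for x in outcome:
            prob *= pmf[x]                # independence: multiply probabilities
        xbar = Fraction(sum(outcome), n)
        dist[xbar] = dist.get(xbar, 0) + prob
    return dist

def mean_and_variance(dist):
    """Mean and variance of a finite distribution given as {value: probability}."""
    mean = sum(x * p for x, p in dist.items())
    variance = sum((x - mean) ** 2 * p for x, p in dist.items())
    return mean, variance

print("population:", mean_and_variance(pmf))
for n in (2, 3):
    dist = sample_mean_pmf(pmf, n)
    print(f"n = {n}:", mean_and_variance(dist))
```

For every n the mean of the sample mean stays at 7, while its variance is the population variance 61 divided by n.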

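[Figure: simulated sampling distributions of \(\bar{X}\), each based on 1000 samples, for a population with a heavily skewed tail (possibly log-normal…).]

But remember the Cauchy and 1/x² distributions, both of which had a nonexistent mean… CLT may not work!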
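The warning about the Cauchy distribution can be illustrated by comparing how the spread of the sample mean changes with n for normal versus Cauchy data. The sketch below measures spread by the interquartile range (IQR), since the Cauchy has no variance; the sample sizes, seed, number of repetitions, and helper names are all assumptions made for this illustration.

```python
import math
import random
import statistics

def iqr_of_sample_means(draw, n, num_samples, seed=2):
    """Interquartile range of num_samples sample means, each from n draws."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(draw(rng) for _ in range(n))
        for _ in range(num_samples)
    ]
    q1, _, q3 = statistics.quantiles(means, n=4)   # quartile cut points
    return q3 - q1

def draw_normal(rng):
    return rng.gauss(0.0, 1.0)

def draw_cauchy(rng):
    # Standard Cauchy via the inverse-CDF method.
    return math.tan(math.pi * (rng.random() - 0.5))

for n in (1, 100):
    normal_iqr = iqr_of_sample_means(draw_normal, n, 2000)
    cauchy_iqr = iqr_of_sample_means(draw_cauchy, n, 2000)
    print(f"n = {n:3d}: normal IQR = {normal_iqr:.3f}, Cauchy IQR = {cauchy_iqr:.3f}")
```

For normal data the IQR of the sample mean shrinks roughly like 1/√n, whereas for Cauchy data it should stay near 2 regardless of n, because the mean of i.i.d. standard Cauchy variables is again standard Cauchy.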

More on CLT…

Random Variable: X ~ Dist(μ, σ), with \(\mu_X = E[X]\). For example, X = Age of women in U.S. at first birth.

If this first individual has been randomly chosen, and the value of X measured, then the result is a fixed number x1, with no random variability… and likewise for x2, x3, etc. DATA!

BUT… if this is not the case, then this first “value” of X is unknown, and thus can be considered a random variable X1 itself… and likewise for X2, X3, etc. The collection {X1, X2, X3, …, Xn} of “independent, identically distributed” (i.i.d.) random variables is said to be a random sample.


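Random Variable: X ~ Dist(μ, σ), with \(\mu_X = E[X]\). For any random sample \(X_1, X_2, \ldots, X_n\) of size n, the sample mean is

\[ \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i . \]

Claim: \(\mu_{\bar{X}} = \mu_X\), i.e., \(E[\bar{X}] = \mu\), for any n.

Proof:

\[ E[\bar{X}] = E\left[\frac{1}{n} \sum_{i=1}^{n} X_i\right] = \frac{1}{n} \sum_{i=1}^{n} E[X_i] = \frac{1}{n} \sum_{i=1}^{n} \mu_X = \frac{1}{n}(n \mu_X) = \mu_X . \]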

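Claim: \(\sigma_{\bar{X}}^2 = \dfrac{\sigma_X^2}{n}\), i.e., \(\mathrm{Var}(\bar{X}) = \dfrac{\mathrm{Var}(X)}{n}\), for any n.

Proof: For any random sample \(X_1, X_2, \ldots, X_n\), with \(\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i\),

\[ \mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\frac{1}{n} \sum_{i=1}^{n} X_i\right) = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{Var}(X_i) = \frac{1}{n^2}\bigl(n \, \mathrm{Var}(X)\bigr) = \frac{\mathrm{Var}(X)}{n} . \]

The second equality uses the independence of the \(X_i\). Taking square roots gives the standard error \(\sigma_{\bar{X}} = \sigma_X / \sqrt{n}\).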

More on CLT… Recall… Normal Approximation to the Binomial Distribution

Suppose a certain outcome exists in a population, with constant probability π. We will randomly select a sample of n individuals, so that the binary “Success vs. Failure” outcome of any individual is independent of the binary outcome of any other individual, i.e., n Bernoulli trials (e.g., coin tosses).

P(Success) = π, P(Failure) = 1 – π

Discrete random variable X = # Successes (0, 1, 2, …, n) in a random sample of size n

Then X is said to follow a Binomial distribution, written X ~ Bin(n, π), with “probability function”

\[ f(x) = \binom{n}{x} \pi^x (1 - \pi)^{n - x}, \qquad x = 0, 1, 2, \ldots, n. \]

Furthermore, if X ~ Bin(n, π) with nπ ≥ 15 and n(1 − π) ≥ 15, then by the CLT the discrete Binomial distribution is approximated by the continuous Normal distribution:

\[ X \sim N\left(n\pi, \sqrt{n\pi(1 - \pi)}\right) \quad \text{approximately.} \]

See Prob 5.3/7

http://pages.stat.wisc.edu/~ifischer/Intro_Stat/Lecture_Notes/5_-_Central_Limit_Theorem/5.3_-_Problems.pdf

What about the sample proportion X/n? Dividing by n,

\[ \frac{X}{n} \sim N\left(\pi, \sqrt{\frac{\pi(1 - \pi)}{n}}\right) \quad \text{approximately.} \]
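The quality of the normal approximation to the binomial can be checked directly by comparing an exact binomial probability with its normal counterpart. In the sketch below, the values n = 100, π = 0.3, and the cutoff k = 35 are hypothetical, and the continuity correction (evaluating the normal CDF at k + 0.5) is a standard refinement that is an addition here, not something stated in the slides.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_cdf(k, n, pi):
    """Exact P(X <= k) for X ~ Bin(n, pi), summing the probability function f(x)."""
    return sum(comb(n, x) * pi ** x * (1 - pi) ** (n - x) for x in range(k + 1))

n, pi, k = 100, 0.3, 35                   # hypothetical values for illustration

# Rule of thumb from the slides: both expected counts should be at least 15.
rule_of_thumb_ok = n * pi >= 15 and n * (1 - pi) >= 15

exact = binomial_cdf(k, n, pi)
normal = NormalDist(n * pi, sqrt(n * pi * (1 - pi)))
approx = normal.cdf(k + 0.5)              # continuity correction (an addition, see lead-in)

print(f"rule of thumb satisfied: {rule_of_thumb_ok}")
print(f"exact  P(X <= {k}) = {exact:.4f}")
print(f"normal P(X <= {k}) = {approx:.4f}")
```

When the rule of thumb holds, the two probabilities should agree closely; trying a small n or a π near 0 or 1 shows the approximation degrading.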

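POPULATION PARAMETER, ESTIMATOR, and SAMPLING DISTRIBUTION

(An ESTIMATOR is not to be confused with an “estimate”. The SAMPLING DISTRIBUTION may be an approximation.)

• Parameter: μ = “true” population mean of a numerical random variable X, where X ~ N(μ, σ).
  Estimator: \(\hat{\mu} = \bar{X} = \dfrac{X_1 + X_2 + \cdots + X_n}{n}\), where \(\{X_1, X_2, \ldots, X_n\}\) is a random sample.
  Sampling distribution: \(\bar{X} \sim N\left(\mu, \dfrac{\sigma}{\sqrt{n}}\right)\), with \(E[\bar{X}] = \mu\).

• Parameter: π = “true” population probability of Success, where “Success vs. Failure” are the only possible binary outcomes.
  Estimator: \(\hat{\pi} = \dfrac{X}{n}\), where X = # Successes in a random sample of n Bernoulli trials.
  Sampling distribution: \(\hat{\pi} \sim N\left(\pi, \sqrt{\dfrac{\pi(1 - \pi)}{n}}\right)\) approximately, with \(E[\hat{\pi}] = \pi\).

• In general…. Parameter: θ.
  Estimator: \(\hat{\theta}\) = ???
  Sampling distribution: ??? (may require simulation)
  We want the estimator to have “nice” properties.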

Def: The bias of a point estimator \(\hat{\theta}\) of a parameter θ is defined by \(E[\hat{\theta}] - \theta\). \(\hat{\theta}\) is said to be an unbiased estimator of θ if bias = 0, i.e., \(E[\hat{\theta}] = \theta\).

Examples:

• \(\bar{X}\) is an unbiased estimator of μ.

• \(\hat{\pi}\) is an unbiased estimator of π.

• \(S^2 = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}\) is an unbiased estimator of σ² (see page 253).
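The unbiasedness of S² can be verified exactly on a small case. The sketch below reuses the discrete distribution from the earlier example (x = 0, 10, 20 with probabilities 0.5, 0.3, 0.2, so σ² = 61), enumerates all samples of size n = 3, and compares dividing by n − 1 with dividing by n. The helper names and the choice n = 3 are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product

# Population distribution from the earlier example; its variance is 61.
pmf = {0: Fraction(1, 2), 10: Fraction(3, 10), 20: Fraction(1, 5)}

def expected_value(statistic, pmf, n):
    """Exact E[statistic(X1, ..., Xn)] for an i.i.d. sample from pmf."""
    total = Fraction(0)
    for sample in product(pmf, repeat=n):
        prob = Fraction(1)
        for x in sample:
            prob *= pmf[x]
        total += prob * statistic(sample)
    return total

def sum_of_squares_over(sample, divisor):
    """Sum of squared deviations from the sample mean, divided by divisor."""
    xbar = Fraction(sum(sample), len(sample))
    return sum((x - xbar) ** 2 for x in sample) / divisor

n = 3
expected_s2 = expected_value(lambda s: sum_of_squares_over(s, n - 1), pmf, n)
expected_div_n = expected_value(lambda s: sum_of_squares_over(s, n), pmf, n)

print("E[S^2] with divisor n - 1:", expected_s2)
print("expected value with divisor n:", expected_div_n)
```

Dividing by n − 1 recovers σ² = 61 exactly, while dividing by n gives (n − 1)/n times σ², which underestimates the variance on average.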

Bias = \(E[\hat{\theta}] - \theta\)

Recall: \(\mathrm{Var}(Y) = E[Y^2] - (E[Y])^2\).

Let \(Y = \hat{\theta} - \theta\):

\[ \mathrm{Var}(\hat{\theta} - \theta) = E\left[(\hat{\theta} - \theta)^2\right] - \left(E[\hat{\theta} - \theta]\right)^2 \]

Rearrange terms:

\[ E\left[(\hat{\theta} - \theta)^2\right] = \mathrm{Var}(\hat{\theta} - \theta) + \left(E[\hat{\theta} - \theta]\right)^2 = \mathrm{Var}(\hat{\theta}) + \left(E[\hat{\theta}] - \theta\right)^2 , \]

since θ = fixed, nonrandom. Therefore

\[ \underbrace{E\left[(\hat{\theta} - \theta)^2\right]}_{\text{Mean Square Error (MSE)}} = \mathrm{Var}(\hat{\theta}) + \mathrm{Bias}^2 . \]

Ideally, we would like to minimize MSE, but this is often difficult in practice. However, if Bias = 0, then MSE = Variance, so it is desirable to seek Minimum Variance Unbiased Estimators (MVUE)…
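The decomposition MSE = Variance + Bias² can be checked exactly for two competing estimators of σ². The sketch below again uses the discrete distribution from the earlier example (σ² = 61) with samples of size n = 2, and compares the divide-by-(n − 1) and divide-by-n variance estimators. The helper names and the choice of estimators are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product

# Population distribution from the earlier example; its variance is 61.
pmf = {0: Fraction(1, 2), 10: Fraction(3, 10), 20: Fraction(1, 5)}
sigma_squared = Fraction(61)

def expectation(g, pmf, n):
    """Exact E[g(X1, ..., Xn)] for an i.i.d. sample from pmf."""
    total = Fraction(0)
    for sample in product(pmf, repeat=n):
        prob = Fraction(1)
        for x in sample:
            prob *= pmf[x]
        total += prob * g(sample)
    return total

def variance_estimator(sample, divisor):
    """Sum of squared deviations from the sample mean, divided by divisor."""
    xbar = Fraction(sum(sample), len(sample))
    return sum((x - xbar) ** 2 for x in sample) / divisor

def mse_parts(estimator, theta, pmf, n):
    """Return (MSE, variance, bias) of an estimator of theta, computed exactly."""
    mean = expectation(estimator, pmf, n)
    variance = expectation(lambda s: (estimator(s) - mean) ** 2, pmf, n)
    bias = mean - theta
    mse = expectation(lambda s: (estimator(s) - theta) ** 2, pmf, n)
    return mse, variance, bias

n = 2
results = {}
for label, divisor in (("divide by n - 1", n - 1), ("divide by n", n)):
    estimator = lambda s, d=divisor: variance_estimator(s, d)
    results[label] = mse_parts(estimator, sigma_squared, pmf, n)
    mse, variance, bias = results[label]
    print(f"{label}: MSE = {mse}, Var = {variance}, Bias = {bias}")
```

For the unbiased estimator the MSE equals its variance, while for the biased one the squared bias is added on top of a smaller variance, so unbiasedness alone does not guarantee the smaller MSE.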
