21/12/27 www.uic.edu.hk/~xlpeng 1 STAT 4060 Design and Analysis of Surveys Exam: 60% Mid Test: 20% Mini Project: 10% Continuous assessment: 10%


23/4/19 www.uic.edu.hk/~xlpeng 1

STAT 4060 Design and Analysis of Surveys

Exam: 60% Mid Test: 20% Mini Project: 10% Continuous assessment: 10%


What we have learned:

1. Simple random sampling, confidence interval and choice of sample size.

2. Ratio and regression estimators, systematic sampling.

3. Stratified random sampling, allocation of stratum weights.

4. Cluster sampling.


Population Parameter


Sample Statistics


Simple random sampling

We shall consider the use of simple random samples for estimating the three population characteristics:

the population mean, denoted $\bar{Y}$: $\bar{Y} = \frac{1}{N}\sum_{j=1}^{N} Y_j$;

the population total, denoted $Y_T$: $Y_T = \sum_{j=1}^{N} Y_j$;

and the proportion P.

We shall discuss how these estimators behave in terms of their sampling distributions. The variance is often a crucial measure.


Proof of (1.9)

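$$Var(\bar{y}) = E(\bar{y}^2) - [E(\bar{y})]^2 = E\Big[\Big(\sum_{i=1}^{n} y_i / n\Big)^2\Big] - \bar{Y}^2$$

$$= \frac{1}{n^2} E\Big(\sum_{i} y_i^2 + \sum_{i \ne j} y_i y_j\Big) - \bar{Y}^2$$

$$= \frac{1}{n^2}\{n E(y_i^2) + n(n-1) E(y_i y_j)\} - \bar{Y}^2$$

$$= \frac{1}{n^2}\{n (Var(y_i) + \bar{Y}^2) + n(n-1)(cov(y_i, y_j) + \bar{Y}^2)\} - \bar{Y}^2$$

$$= \frac{1}{n^2}(n Var(y_i) + n(n-1) cov(y_i, y_j))$$

$$= \frac{1}{n}\frac{N-1}{N} S^2 - \frac{n-1}{n}\frac{1}{N} S^2 = \frac{(1-f) S^2}{n}$$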


Confidence interval for the population mean
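The interval itself did not survive on this slide, so the following is only a minimal sketch. It assumes the usual normal approximation, with the variance in (1.9) estimated by replacing S^2 with the sample variance s^2. The function name `srs_mean_ci`, the default `z = 1.96` and the data are illustrative assumptions, not taken from the slides.

```python
import math
from statistics import mean, variance

def srs_mean_ci(sample, N, z=1.96):
    """Approximate confidence interval for the population mean from a
    simple random sample drawn without replacement (hypothetical helper)."""
    n = len(sample)
    f = n / N                          # sampling fraction
    y_bar = mean(sample)               # sample mean
    s2 = variance(sample)              # sample variance, divisor n - 1
    se = math.sqrt((1 - f) * s2 / n)   # square root of the estimate of (1.9)
    return y_bar - z * se, y_bar + z * se

sample = [12.0, 15.0, 11.0, 14.0, 13.0]   # made-up data
low, high = srs_mean_ci(sample, N=100)
print(round(low, 3), round(high, 3))
```

For small n, a t quantile would usually replace the normal quantile `z`.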


Ratio Estimation and Regression Estimation (Chapter 4, Textbook, Barnett, V., 1991)

2.1 Estimation of a population ratio: The ratio estimator

In some situations it is useful to estimate a (positive) ratio of two population characteristics: the totals, or means, of two (positive) variables X and Y.

Two obvious estimators of R are:

The sample average of the ratios,

$$\bar{r} = \frac{1}{n}\sum_{i=1}^{n} r_i = \frac{1}{n}\sum_{i=1}^{n} (y_i / x_i),$$

which is unbiased for estimating the population mean

$$\bar{R} = \frac{1}{N}\sum_{j=1}^{N} R_j = \frac{1}{N}\sum_{j=1}^{N} (Y_j / X_j),$$

but biased for estimating R.

The ratio of the sample averages,

$$r = \bar{y} / \bar{x} = y_T / x_T,$$

which is widely used.


The bias in estimating R by r

The bias in estimating R by r is the expectation of the following difference:

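$$r - R = (\bar{y} - R\bar{x}) / \bar{x} = \frac{\bar{y} - R\bar{x}}{\bar{X}}\Big(1 + \frac{\bar{x} - \bar{X}}{\bar{X}}\Big)^{-1}$$

$$= \frac{\bar{y} - R\bar{x}}{\bar{X}}\Big[1 - \frac{\bar{x} - \bar{X}}{\bar{X}} + \Big(\frac{\bar{x} - \bar{X}}{\bar{X}}\Big)^2 - \cdots\Big].$$

$$E(r - R) \approx E\Big(\frac{\bar{y} - R\bar{x}}{\bar{X}}\Big) - \frac{E[(\bar{y} - R\bar{x})(\bar{x} - \bar{X})]}{\bar{X}^2} \qquad (2.3)$$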


Discussion about the bias


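$$Var(r) \approx \frac{1-f}{n\bar{X}^2} \cdot \frac{\sum_{j=1}^{N}(Y_j - RX_j)^2}{N-1} = \frac{1-f}{n\bar{X}^2}(S_Y^2 - 2RS_{YX} + R^2 S_X^2) \qquad (2.5)$$

$$Z_j = Y_j - RX_j = (Y_j - \bar{Y}) - (RX_j - R\bar{X})$$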


2.2 Ratio estimation of a population mean or total

$$\bar{y}_R = r\bar{X} = (\bar{X} / \bar{x})\,\bar{y}$$

$$y_{TR} = rX_T = (N\bar{X} / \bar{x})\,\bar{y} = N\bar{y}_R$$
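As a minimal sketch of the two ratio estimators above, the following computes r, the ratio estimate of the mean and the ratio estimate of the total. The function name `ratio_estimates` and the data are hypothetical. The population mean of X (`X_bar`) is assumed known, as ratio estimation requires.

```python
from statistics import mean

def ratio_estimates(x, y, X_bar, N):
    """Ratio estimates of the population mean and total of Y
    (illustrative helper; X_bar is the known population mean of X)."""
    r = mean(y) / mean(x)      # ratio of the sample averages
    y_bar_R = r * X_bar        # ratio estimate of the population mean
    y_TR = N * y_bar_R         # ratio estimate of the population total
    return r, y_bar_R, y_TR

x = [10.0, 20.0, 30.0]         # made-up auxiliary values
y = [11.0, 19.0, 33.0]         # made-up study values
r, y_bar_R, y_TR = ratio_estimates(x, y, X_bar=25.0, N=200)
print(r, y_bar_R, y_TR)
```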


Variance of ratio estimator


The estimate of the ratio R of the present weight to prestudy weight for the herd is:

Solution:

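$$Var(r) = \frac{1-f}{n} \cdot \frac{1}{\bar{X}^2} \cdot s_r^2 = \Big(1 - \frac{12}{500}\Big) \cdot \frac{1}{880^2} \cdot \frac{8{,}848.646}{12} = 0.000929$$

$$se(r) = \sqrt{0.000929} = 0.030485$$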
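The arithmetic in this solution can be checked with a few lines. The reading of the figures (N = 500, n = 12, a mean prestudy weight of 880 and s_r^2 = 8,848.646) is an assumption reconstructed from the numbers on the slide, and the variable names are illustrative.

```python
import math

N, n = 500, 12          # herd size and sample size, as read from the slide
X_bar = 880.0           # mean prestudy weight used in the solution
s_r2 = 8848.646         # s_r^2 as given in the solution

f = n / N                                   # sampling fraction
var_r = (1 - f) / (n * X_bar ** 2) * s_r2   # estimated variance of r
se_r = math.sqrt(var_r)                     # standard error of r
print(round(var_r, 6), round(se_r, 6))
```

The standard error is computed from the unrounded variance, which matches the slide's 0.030485 to six decimals.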


This examines when the variance in (2.10) is less than or greater than that in (1.9).


2.3 Regression estimation

Condition (2.15.1) demands that X and Y be linearly related. If the linear relationship does not pass through the origin, however, this suggests considering an alternative estimator, known as the regression estimator.


An ideal (perfect) linear relationship is

$$Y_j - \bar{Y} = b(X_j - \bar{X}). \qquad (2.16)$$

A practicable simple linear regression model is (2.17)

$$Y_j - \bar{Y} = b(X_j - \bar{X}) + E_j. \qquad (2.18)$$


Consider the average (mean) of either (2.16) or (2.17),

$$\bar{y}_L = \bar{y} + b(\bar{X} - \bar{x}) \qquad (2.19)$$


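$$Var(\bar{y}_L) = E[(\bar{y}_L - \bar{Y})^2] = E\{[(\bar{y} - \bar{Y}) - b(\bar{x} - \bar{X})]^2\}$$

$$= \frac{1-f}{n}(S_Y^2 - 2bS_{YX} + b^2 S_X^2) \qquad (2.20)$$

$$\ge \frac{1-f}{n} S_Y^2 (1 - \rho_{YX}^2),$$

where $\frac{1-f}{n} S_Y^2 = Var(\bar{y})$.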


From (2.20),

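$$\min_b \{Var(\bar{y}_L)\} = \min_b \frac{1-f}{n}(S_Y^2 - 2bS_{YX} + b^2 S_X^2) = \frac{1-f}{n} S_Y^2 (1 - \rho_{YX}^2).$$

The minimum is obtained with $b = b_{\min} = S_{YX} / S_X^2 = \rho_{YX} S_Y / S_X$.

Thus the most efficient regression estimator of $\bar{Y}$ is

$$\bar{y}_L = \bar{y} + \rho_{YX}(S_Y / S_X)(\bar{X} - \bar{x}). \qquad (2.22)$$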
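The minimisation step can be written out as a short worked derivation. It is a sketch that assumes the usual definition $\rho_{YX} = S_{YX} / (S_Y S_X)$. The function name $g$ is introduced here only for illustration and stands for the bracketed quadratic in (2.20).

```latex
\begin{align*}
g(b) &= S_Y^2 - 2bS_{YX} + b^2 S_X^2, \\
g'(b) &= -2S_{YX} + 2bS_X^2 = 0
  \;\Longrightarrow\; b_{\min} = S_{YX} / S_X^2, \\
g(b_{\min}) &= S_Y^2 - \frac{S_{YX}^2}{S_X^2}
  = S_Y^2\left(1 - \frac{S_{YX}^2}{S_Y^2 S_X^2}\right)
  = S_Y^2 (1 - \rho_{YX}^2).
\end{align*}
```

Since $g''(b) = 2S_X^2 > 0$, the stationary point is a minimum.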


The optimal value of b of (2.22) suggests the obvious estimate:

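$$\hat{b} = \hat{b}_{\min} = \frac{s_{yx}}{s_x^2} = \frac{\sum_{i=1}^{n}(y_i - \bar{y})(x_i - \bar{x})}{\sum_{i=1}^{n}(x_i - \bar{x})^2} \qquad (2.24)$$

$$\bar{y}_L = \bar{y} + \hat{b}(\bar{X} - \bar{x}) \qquad (2.25)$$

which enjoys the following asymptotic properties:

$$E(\bar{y}_L) = \bar{Y} + O(n^{-1})$$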


Asymptotic properties:

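$$Var(\bar{y}_L) = \frac{1-f}{n}(S_Y^2 - S_{YX}^2 / S_X^2) + O(n^{-3/2}) = \frac{1-f}{n} S_Y^2 (1 - \rho_{YX}^2) + O(n^{-3/2}) \qquad (2.26)$$

$$V(\bar{y}_L) = \frac{1-f}{n}(s_y^2 - \hat{b}\,s_{yx}) \qquad (2.27)$$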
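A minimal sketch of how (2.24), (2.25) and (2.27) fit together in practice is given below. The function name `regression_estimate` and the data are hypothetical, and `X_bar` (the population mean of X) is assumed known.

```python
from statistics import mean

def regression_estimate(x, y, X_bar, N):
    """Regression estimate of the population mean of Y with an
    estimated slope (illustrative helper, not from the slides)."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    # Sample covariance and variances, all with divisor n - 1.
    s_yx = sum((yi - y_bar) * (xi - x_bar) for xi, yi in zip(x, y)) / (n - 1)
    s_x2 = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
    s_y2 = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)
    b = s_yx / s_x2                          # estimated slope, as in (2.24)
    y_bar_L = y_bar + b * (X_bar - x_bar)    # regression estimate, as in (2.25)
    f = n / N                                # sampling fraction
    v = (1 - f) / n * (s_y2 - b * s_yx)      # variance estimate, as in (2.27)
    return b, y_bar_L, v

x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]     # made-up data lying exactly on y = 1 + 2x
b, y_bar_L, v = regression_estimate(x, y, X_bar=3.0, N=40)
print(b, y_bar_L, v)
```

Because the made-up data are perfectly linear, the estimated variance is zero up to rounding. Real data would give a positive value.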


2.4 Comparison of ratio and regression estimators


$$V(\bar{y}_R) - Var(\bar{y}_L) = \frac{1-f}{n}(R^2 S_X^2 - 2R\rho_{YX} S_Y S_X + \rho_{YX}^2 S_Y^2)$$

$$= \frac{1-f}{n}(RS_X - \rho_{YX} S_Y)^2 \ge 0$$


Stratified Simple Random Sampling (Chapter 5, Textbook, Barnett, V., 1991)

Consider another sampling method:


Some Notations


To estimate the population mean of a finite population, we assume that the population is stratified, that is to say it has been divided into k non-overlapping groups, or strata, of sizes:

The stratum means and variances are denoted by

and


Estimation of Population Characteristics in Stratified Populations


Estimating


The stratified sample mean is defined as

Here we assume the weights $W_i = N_i / N$ are given (known).
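The defining formula did not survive on this slide. As a sketch, the following assumes the standard form: the stratified sample mean is the W_i-weighted sum of the stratum sample means, and its estimated variance is the sum of W_i^2 (1 - f_i) s_i^2 / n_i over strata. The function name `stratified_mean` and the data are illustrative.

```python
from statistics import mean, variance

def stratified_mean(samples, sizes):
    """Stratified sample mean and its estimated variance.
    samples: one list of observations per stratum; sizes: the stratum
    sizes N_i (illustrative helper, not from the slides)."""
    N = sum(sizes)
    y_st = 0.0
    var_st = 0.0
    for sample, N_i in zip(samples, sizes):
        W_i = N_i / N             # known stratum weight
        n_i = len(sample)
        f_i = n_i / N_i           # within-stratum sampling fraction
        y_st += W_i * mean(sample)
        var_st += W_i ** 2 * (1 - f_i) * variance(sample) / n_i
    return y_st, var_st

samples = [[2.0, 4.0], [10.0, 12.0, 14.0]]   # made-up data, two strata
sizes = [20, 30]                             # made-up stratum sizes
y_st, var_st = stratified_mean(samples, sizes)
print(y_st, var_st)
```

The variance is a plain sum over strata because sampling in different strata is assumed independent.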


The mean and variance of


Note that

Since

Because it is assumed that “sampling in different strata is independent”, that is


Simple random sampling

Stratified sampling with proportional allocation


(a) When stratum size is large enough:

$N_i / N$


(b) When stratum size is not large enough:

The stratified sample mean will be more efficient than the s.r. sample mean if and only if the variation between the stratum means is sufficiently large compared with the within-strata variation!


Optimum Choice of Sample Size


To achieve the required precision of estimation

Some cost limitation

The simplest form assumes that there is some overhead cost, $c_0$, of administering the survey, and that individual observations from the ith stratum each cost an amount $c_i$. Thus the total cost is:

$$C = c_0 + \sum_{i=1}^{k} c_i n_i$$


I. Minimum variance for fixed cost (Cont.)


Then


II. Minimum cost for fixed variance


Consider to satisfy for the minimum possible total cost.


Given $w_i$, $n_i = n w_i$


Comparison of proportional allocation and optimum allocation


Thus the extent of the potential gain from optimum (Neyman) allocation compared with proportional allocation depends on the variability of the stratum variances: the larger this is, the greater the relative advantage of optimum allocation.
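A small numerical sketch can illustrate this comparison. It assumes equal sampling costs in every stratum, so that Neyman allocation takes n_i proportional to W_i S_i, and it ignores finite population corrections. The function names and the two-stratum figures are made up for illustration.

```python
def proportional_allocation(n, W):
    """n_i = n * W_i."""
    return [n * w for w in W]

def neyman_allocation(n, W, S):
    """n_i proportional to W_i * S_i (equal costs assumed)."""
    total = sum(w * s for w, s in zip(W, S))
    return [n * w * s / total for w, s in zip(W, S)]

def var_stratified(alloc, W, S):
    """Variance of the stratified mean, ignoring finite population corrections."""
    return sum(w ** 2 * s ** 2 / n_i for w, s, n_i in zip(W, S, alloc))

W = [0.5, 0.5]   # stratum weights (made up)
S = [1.0, 3.0]   # stratum standard deviations (made up)
n = 100          # total sample size

v_prop = var_stratified(proportional_allocation(n, W), W, S)
v_opt = var_stratified(neyman_allocation(n, W, S), W, S)
print(v_prop, v_opt)
```

With equal stratum standard deviations the two allocations coincide, and the gap widens as the S_i spread out. Allocations are left as non-integers here; in practice they would be rounded.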


Cluster Sampling (Chapter 6, Textbook, Barnett, V., 1991)


Comparison of s.r. sampling with cluster sampling


Systematic Sampling


A systematic sample can be viewed as a cluster sample of size m = 1!

Systematic sample mean
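The cluster view can be made concrete with a short sketch: a 1-in-k systematic sample is one of k possible clusters, chosen by a random start. The example assumes the population size is an exact multiple of k. The function name `systematic_samples` and the toy population are illustrative.

```python
from statistics import mean

def systematic_samples(population, k):
    """All k possible 1-in-k systematic samples, one per starting position
    (assumes len(population) is a multiple of k)."""
    return [population[start::k] for start in range(k)]

population = list(range(1, 13))               # toy population, N = 12
samples = systematic_samples(population, k=4)
sample_means = [mean(s) for s in samples]     # the k possible systematic sample means
print(samples)
print(sample_means, mean(sample_means), mean(population))
```

Each start is equally likely, so the average of the k possible sample means equals the population mean in this setting.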


Comparison of s.r. sampling with systematic sampling


Two ways of estimating $\bar{Y}$
