Introduction to Sampling Theory
Source: home.iitk.ac.in/~shalab/swayamprabha/samp/sp-sampling-lect-37.pdf

Lecture 37: Two Stage Sampling (Subsampling)

Shalabh
Department of Mathematics and Statistics
Indian Institute of Technology Kanpur

Slides can be downloaded from http://home.iitk.ac.in/~shalab/sp

Two Stage Sampling With Unequal First Stage Units:

Consider two stage sampling when the first stage units are of

unequal size and SRSWOR is employed at each stage.

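Let

$y_{ij}$ : value of the $j$th second stage unit of the $i$th first stage unit.

$M_i$ : number of second stage units in the $i$th first stage unit.

$M_0 = \sum_{i=1}^{N} M_i$ : total number of second stage units in the population.

$m_i$ : number of second stage units to be selected from the $i$th first stage unit, if it is in the sample.

$m_0 = \sum_{i=1}^{n} m_i$ : total number of second stage units in the sample.

Further, let

$$\bar{y}_{i(m_i)} = \frac{1}{m_i}\sum_{j=1}^{m_i} y_{ij}$$

$$\bar{Y}_i = \frac{1}{M_i}\sum_{j=1}^{M_i} y_{ij}$$

$$\bar{Y}_N = \frac{1}{N}\sum_{i=1}^{N} \bar{Y}_i$$

$$\bar{Y} = \frac{\sum_{i=1}^{N}\sum_{j=1}^{M_i} y_{ij}}{\sum_{i=1}^{N} M_i} = \frac{\sum_{i=1}^{N} M_i \bar{Y}_i}{N\bar{M}} = \frac{1}{N}\sum_{i=1}^{N} u_i \bar{Y}_i$$

$$u_i = \frac{M_i}{\bar{M}}$$

$$\bar{M} = \frac{1}{N}\sum_{i=1}^{N} M_i$$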
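The notation above can be checked on a small example. The sketch below uses a made-up population of three clusters (the values in `clusters` and the helper name `population_summary` are illustrative assumptions, not part of the lecture) and computes $M_i$, $M_0$, $\bar{M}$, $\bar{Y}_i$, $\bar{Y}_N$, $\bar{Y}$ and $u_i$ with exact fractions.

```python
from fractions import Fraction

# Hypothetical population: N = 3 first stage units (clusters) of unequal size.
clusters = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]


def population_summary(clusters):
    """Return the basic two stage quantities as exact fractions."""
    N = len(clusters)
    M = [len(c) for c in clusters]                       # M_i
    M0 = sum(M)                                          # M_0, total second stage units
    M_bar = Fraction(M0, N)                              # mean cluster size
    Y_i = [Fraction(sum(c), len(c)) for c in clusters]   # cluster means
    Y_N = sum(Y_i) / N                                   # mean of the cluster means
    Y_bar = Fraction(sum(sum(c) for c in clusters), M0)  # mean per second stage unit
    u = [Fraction(Mi) / M_bar for Mi in M]               # u_i = M_i / M_bar
    return {"N": N, "M": M, "M0": M0, "M_bar": M_bar,
            "Y_i": Y_i, "Y_N": Y_N, "Y_bar": Y_bar, "u": u}


summary = population_summary(clusters)
# (1/N) * sum(u_i * Y_i) should reproduce the population mean per unit.
weighted = sum(u * y for u, y in zip(summary["u"], summary["Y_i"])) / summary["N"]
print(summary["Y_N"], summary["Y_bar"], weighted)  # 14/3 5 5
```

For this toy population $\bar{Y}_N = 14/3$ differs from $\bar{Y} = 5$ because the clusters are of unequal size, while the $u_i$-weighted mean of the cluster means recovers $\bar{Y}$.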

[Figure: Two stage sampling with unequal first stage units. Population: N clusters, where cluster i contains M_i units (i = 1, 2, ..., N). First stage sample: n clusters, with M_1, M_2, ..., M_n units. Second stage sample: m_1, m_2, ..., m_n units drawn from the n selected clusters.]

Two Stage Sampling With Unequal First Stage Units:

Now we consider different estimators of the population mean.

1. Estimator Based on the First Stage Unit Means in the Sample: Bias

$$\hat{\bar{Y}} = \bar{y}_{S2} = \frac{1}{n}\sum_{i=1}^{n} \bar{y}_{i(m_i)}$$

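$$
\begin{aligned}
E(\bar{y}_{S2}) &= E\left[\frac{1}{n}\sum_{i=1}^{n} \bar{y}_{i(m_i)}\right] \\
&= E_1\left[\frac{1}{n}\sum_{i=1}^{n} E_2\left(\bar{y}_{i(m_i)}\right)\right] \\
&= E_1\left[\frac{1}{n}\sum_{i=1}^{n} \bar{Y}_i\right] \\
&= \frac{1}{N}\sum_{i=1}^{N} \bar{Y}_i \\
&= \bar{Y}_N \\
&\neq \bar{Y}.
\end{aligned}
$$

The third step holds since a sample of size $m_i$ is selected out of $M_i$ units by SRSWOR.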

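So $\bar{y}_{S2}$ is a biased estimator of $\bar{Y}$ and its bias is given by

$$
\begin{aligned}
Bias(\bar{y}_{S2}) &= E(\bar{y}_{S2}) - \bar{Y} \\
&= \frac{1}{N}\sum_{i=1}^{N} \bar{Y}_i - \frac{1}{N\bar{M}}\sum_{i=1}^{N} M_i\bar{Y}_i \\
&= -\frac{1}{N\bar{M}}\left[\sum_{i=1}^{N} M_i\bar{Y}_i - \frac{1}{N}\left(\sum_{i=1}^{N}\bar{Y}_i\right)\left(\sum_{i=1}^{N} M_i\right)\right] \\
&= -\frac{1}{N\bar{M}}\sum_{i=1}^{N} (M_i - \bar{M})(\bar{Y}_i - \bar{Y}_N).
\end{aligned}
$$

This bias can be estimated by

$$\widehat{Bias}(\bar{y}_{S2}) = -\frac{N-1}{N\bar{M}(n-1)}\sum_{i=1}^{n} (M_i - \bar{m})(\bar{y}_{i(m_i)} - \bar{y}_{S2}),$$

where $\bar{m} = \frac{1}{n}\sum_{i=1}^{n} M_i$ denotes the mean size of the first stage units in the sample.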
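The bias expression can be verified by brute force on a tiny population. The following sketch enumerates every two stage SRSWOR sample exactly; the population values, the subsample sizes `m`, and the helper names `mean`, `expected_value` and `y_s2` are all assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations, product

clusters = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]  # hypothetical y_ij values
m = [2, 1, 2]                                  # hypothetical subsample sizes m_i
n = 2                                          # number of first stage units sampled
N = len(clusters)


def mean(values):
    values = list(values)
    return sum(values) / Fraction(len(values))


def expected_value(estimator):
    """Exact expectation: average over first stage samples of the
    conditional average over all second stage subsamples."""
    first_stage = list(combinations(range(N), n))
    total = Fraction(0)
    for s in first_stage:
        subsamples = list(product(*[list(combinations(clusters[i], m[i])) for i in s]))
        total += sum(estimator(s, sub) for sub in subsamples) / Fraction(len(subsamples))
    return total / len(first_stage)


def y_s2(s, sub):
    """Mean of the sampled first stage unit means."""
    return mean(mean(ys) for ys in sub)


M = [len(c) for c in clusters]
M_bar = Fraction(sum(M), N)
Y_i = [mean(c) for c in clusters]
Y_N = mean(Y_i)
Y_bar = mean(v for c in clusters for v in c)

bias_enumerated = expected_value(y_s2) - Y_bar
bias_formula = -sum((Mi - M_bar) * (Yi - Y_N) for Mi, Yi in zip(M, Y_i)) / (N * M_bar)
print(expected_value(y_s2), bias_enumerated, bias_formula)  # 14/3 -1/3 -1/3
```

Here the expectation equals $\bar{Y}_N = 14/3$ rather than $\bar{Y} = 5$, and the enumerated bias agrees with the closed form.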

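That this estimator is unbiased for the bias can be seen as follows:

$$
\begin{aligned}
E\left[\widehat{Bias}(\bar{y}_{S2})\right] &= -\frac{N-1}{N\bar{M}}\,E_1\left[\frac{1}{n-1}\sum_{i=1}^{n} E_2\left\{(M_i - \bar{m})(\bar{y}_{i(m_i)} - \bar{y}_{S2}) \mid n\right\}\right] \\
&= -\frac{N-1}{N\bar{M}}\,E_1\left[\frac{1}{n-1}\sum_{i=1}^{n} (M_i - \bar{m})(\bar{Y}_i - \bar{y}_n)\right] \\
&= -\frac{1}{N\bar{M}}\sum_{i=1}^{N} (M_i - \bar{M})(\bar{Y}_i - \bar{Y}_N) \\
&= \bar{Y}_N - \bar{Y},
\end{aligned}
$$

where

$$\bar{y}_n = \frac{1}{n}\sum_{i=1}^{n} \bar{Y}_i.$$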

An unbiased estimator of the population mean $\bar{Y}$ is thus obtained as

$$\bar{y}_{S2} + \frac{N-1}{N\bar{M}(n-1)}\sum_{i=1}^{n} (M_i - \bar{m})(\bar{y}_{i(m_i)} - \bar{y}_{S2}).$$

Note that the bias arises due to the inequality of the sizes of the first stage units: the probability of selection of second stage units varies from one first stage unit to another.
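As a sanity check, the bias-corrected estimator can be averaged over all possible samples of a toy population. This is only a sketch: the data, the sizes `m` and `n`, and the function names are hypothetical, and $\bar{m}$ is taken to be the mean of $M_i$ over the sampled first stage units.

```python
from fractions import Fraction
from itertools import combinations, product

clusters = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]  # hypothetical y_ij values
m = [2, 1, 2]                                  # hypothetical subsample sizes m_i
n = 2
N = len(clusters)
M_bar = Fraction(sum(len(c) for c in clusters), N)


def mean(values):
    values = list(values)
    return sum(values) / Fraction(len(values))


def expected_value(estimator):
    """Exact expectation over all two stage SRSWOR samples."""
    first_stage = list(combinations(range(N), n))
    total = Fraction(0)
    for s in first_stage:
        subsamples = list(product(*[list(combinations(clusters[i], m[i])) for i in s]))
        total += sum(estimator(s, sub) for sub in subsamples) / Fraction(len(subsamples))
    return total / len(first_stage)


def corrected_estimator(s, sub):
    """y_S2 minus the estimated bias (assumes m_bar = sample mean of M_i)."""
    y_bar_i = [mean(ys) for ys in sub]
    y_s2 = mean(y_bar_i)
    sizes = [len(clusters[i]) for i in s]
    m_bar = mean(sizes)
    cross = sum((Mi - m_bar) * (yi - y_s2) for Mi, yi in zip(sizes, y_bar_i))
    return y_s2 + Fraction(N - 1, n - 1) * cross / (N * M_bar)


Y_bar = mean(v for c in clusters for v in c)
print(expected_value(corrected_estimator), Y_bar)  # 5 5
```

The correction needs $n \geq 2$ and knowledge of $\bar{M}$, which is why it is written with `M_bar` passed in from the population.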

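1. Estimator Based on the First Stage Unit Means in the Sample: Variance

$$
\begin{aligned}
Var(\bar{y}_{S2}) &= Var\left[E(\bar{y}_{S2} \mid n)\right] + E\left[Var(\bar{y}_{S2} \mid n)\right] \\
&= Var\left[\frac{1}{n}\sum_{i=1}^{n} \bar{Y}_i\right] + E\left[\frac{1}{n^2}\sum_{i=1}^{n} Var(\bar{y}_{i(m_i)} \mid i)\right] \\
&= \left(\frac{1}{n} - \frac{1}{N}\right)S_b^2 + E\left[\frac{1}{n^2}\sum_{i=1}^{n}\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_i^2\right] \\
&= \left(\frac{1}{n} - \frac{1}{N}\right)S_b^2 + \frac{1}{Nn}\sum_{i=1}^{N}\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_i^2,
\end{aligned}
$$

where

$$S_b^2 = \frac{1}{N-1}\sum_{i=1}^{N} (\bar{Y}_i - \bar{Y}_N)^2, \qquad S_i^2 = \frac{1}{M_i-1}\sum_{j=1}^{M_i} (y_{ij} - \bar{Y}_i)^2.$$

The MSE can be obtained as

$$MSE(\bar{y}_{S2}) = Var(\bar{y}_{S2}) + \left[Bias(\bar{y}_{S2})\right]^2.$$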
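The variance formula can likewise be compared with the exact variance obtained by enumeration. The sketch below reuses the same hypothetical toy population; every name and number in it is an illustrative assumption.

```python
from fractions import Fraction
from itertools import combinations, product

clusters = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]  # hypothetical y_ij values
m = [2, 1, 2]                                  # hypothetical subsample sizes m_i
n = 2
N = len(clusters)


def mean(values):
    values = list(values)
    return sum(values) / Fraction(len(values))


def sample_var(values):
    """Mean square with divisor (number of values - 1)."""
    values = list(values)
    c = mean(values)
    return sum((v - c) ** 2 for v in values) / Fraction(len(values) - 1)


def expected_value(estimator):
    first_stage = list(combinations(range(N), n))
    total = Fraction(0)
    for s in first_stage:
        subsamples = list(product(*[list(combinations(clusters[i], m[i])) for i in s]))
        total += sum(estimator(s, sub) for sub in subsamples) / Fraction(len(subsamples))
    return total / len(first_stage)


def y_s2(s, sub):
    return mean(mean(ys) for ys in sub)


# Exact variance: E(y^2) - [E(y)]^2 over all possible samples.
var_enumerated = expected_value(lambda s, sub: y_s2(s, sub) ** 2) - expected_value(y_s2) ** 2

# Closed form: (1/n - 1/N) S_b^2 + (1/(Nn)) sum_i (1/m_i - 1/M_i) S_i^2.
Y_i = [mean(c) for c in clusters]
S_b2 = sample_var(Y_i)
within = sum((Fraction(1, mi) - Fraction(1, len(c))) * sample_var(c)
             for mi, c in zip(m, clusters))
var_formula = (Fraction(1, n) - Fraction(1, N)) * S_b2 + within / (N * n)

bias = mean(Y_i) - mean(v for c in clusters for v in c)
mse = var_formula + bias ** 2
print(var_enumerated, var_formula, mse)  # 101/72 101/72 109/72
```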

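1. Estimator Based on the First Stage Unit Means in the Sample: Estimation of Variance

Consider the mean square between cluster means in the sample,

$$s_b^2 = \frac{1}{n-1}\sum_{i=1}^{n} (\bar{y}_{i(m_i)} - \bar{y}_{S2})^2.$$

It can be shown that

$$E(s_b^2) = S_b^2 + \frac{1}{N}\sum_{i=1}^{N}\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_i^2.$$

Also

$$s_i^2 = \frac{1}{m_i-1}\sum_{j=1}^{m_i} (y_{ij} - \bar{y}_{i(m_i)})^2,$$

$$E(s_i^2) = S_i^2 = \frac{1}{M_i-1}\sum_{j=1}^{M_i} (y_{ij} - \bar{Y}_i)^2.$$

So

$$E\left[\frac{1}{n}\sum_{i=1}^{n}\left(\frac{1}{m_i} - \frac{1}{M_i}\right)s_i^2\right] = \frac{1}{N}\sum_{i=1}^{N}\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_i^2.$$

Thus

$$E(s_b^2) = S_b^2 + E\left[\frac{1}{n}\sum_{i=1}^{n}\left(\frac{1}{m_i} - \frac{1}{M_i}\right)s_i^2\right].$$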

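An unbiased estimator of $S_b^2$ is

$$\hat{S}_b^2 = s_b^2 - \frac{1}{n}\sum_{i=1}^{n}\left(\frac{1}{m_i} - \frac{1}{M_i}\right)s_i^2.$$

So an estimator of the variance can be obtained by replacing $S_b^2$ and $S_i^2$ by their unbiased estimators as

$$\widehat{Var}(\bar{y}_{S2}) = \left(\frac{1}{n} - \frac{1}{N}\right)\hat{S}_b^2 + \frac{1}{Nn}\sum_{i=1}^{N}\left(\frac{1}{m_i} - \frac{1}{M_i}\right)\hat{S}_i^2,$$

where $\hat{S}_i^2 = s_i^2$. Since $s_i^2$ is available only for the $n$ sampled first stage units, the average $\frac{1}{N}\sum_{i=1}^{N}$ in the second term is in practice replaced by its sample analogue $\frac{1}{n}\sum_{i=1}^{n}$.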
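A minimal sketch of the variance estimator computed from a single sample is given below, using the sample analogue $\frac{1}{n^2}\sum_{i=1}^{n}$ for the second term. The toy population, the choice $m_i = 2$ (so that every $s_i^2$ is defined), and the function names are assumptions for illustration; the enumeration at the end only checks that this version averages to the true variance on this example.

```python
from fractions import Fraction
from itertools import combinations, product

clusters = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]  # hypothetical y_ij values
m = [2, 2, 2]                                  # m_i >= 2 so that s_i^2 exists
n = 2
N = len(clusters)
f = Fraction(1, n) - Fraction(1, N)


def mean(values):
    values = list(values)
    return sum(values) / Fraction(len(values))


def sample_var(values):
    values = list(values)
    c = mean(values)
    return sum((v - c) ** 2 for v in values) / Fraction(len(values) - 1)


def expected_value(estimator):
    first_stage = list(combinations(range(N), n))
    total = Fraction(0)
    for s in first_stage:
        subsamples = list(product(*[list(combinations(clusters[i], m[i])) for i in s]))
        total += sum(estimator(s, sub) for sub in subsamples) / Fraction(len(subsamples))
    return total / len(first_stage)


def var_hat(s, sub):
    """(1/n - 1/N) * S_b2_hat + (1/n^2) * sum over the sample of (1/m_i - 1/M_i) s_i^2."""
    s_b2 = sample_var(mean(ys) for ys in sub)
    within = sum((Fraction(1, len(ys)) - Fraction(1, len(clusters[i]))) * sample_var(ys)
                 for i, ys in zip(s, sub)) / n
    S_b2_hat = s_b2 - within          # unbiased for S_b^2
    return f * S_b2_hat + within / n


# True variance from the population formula, for comparison.
S_b2 = sample_var(mean(c) for c in clusters)
true_within = sum((Fraction(1, mi) - Fraction(1, len(c))) * sample_var(c)
                  for mi, c in zip(m, clusters))
var_true = f * S_b2 + true_within / (N * n)
print(expected_value(var_hat), var_true)
```

Note that $\hat{S}_b^2$ can come out negative in an individual sample, since it is a difference of two estimates.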

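2. Estimation Based on First Stage Unit Totals

$$\hat{\bar{Y}} = \bar{y}^*_{S2} = \frac{1}{n}\sum_{i=1}^{n} \frac{M_i\,\bar{y}_{i(m_i)}}{\bar{M}} = \frac{1}{n}\sum_{i=1}^{n} u_i\,\bar{y}_{i(m_i)},$$

where

$$u_i = \frac{M_i}{\bar{M}}.$$

Bias:

$$
\begin{aligned}
E(\bar{y}^*_{S2}) &= \frac{1}{n}E\left[\sum_{i=1}^{n} u_i\,\bar{y}_{i(m_i)}\right] \\
&= \frac{1}{n}E\left[\sum_{i=1}^{n} u_i\,E_2(\bar{y}_{i(m_i)} \mid i)\right] \\
&= \frac{1}{n}E\left[\sum_{i=1}^{n} u_i\bar{Y}_i\right] = \frac{1}{N}\sum_{i=1}^{N} u_i\bar{Y}_i = \bar{Y}.
\end{aligned}
$$

Thus $\bar{y}^*_{S2}$ is an unbiased estimator of $\bar{Y}$.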

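2. Estimation Based on First Stage Unit Totals: Variance

$$
\begin{aligned}
Var(\bar{y}^*_{S2}) &= Var\left[E(\bar{y}^*_{S2} \mid n)\right] + E\left[Var(\bar{y}^*_{S2} \mid n)\right] \\
&= Var\left[\frac{1}{n}\sum_{i=1}^{n} u_i\bar{Y}_i\right] + E\left[\frac{1}{n^2}\sum_{i=1}^{n} u_i^2\,Var(\bar{y}_{i(m_i)} \mid i)\right] \\
&= \left(\frac{1}{n} - \frac{1}{N}\right)S_b^{*2} + \frac{1}{nN}\sum_{i=1}^{N} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_i^2,
\end{aligned}
$$

where

$$S_i^2 = \frac{1}{M_i-1}\sum_{j=1}^{M_i} (y_{ij} - \bar{Y}_i)^2,$$

$$S_b^{*2} = \frac{1}{N-1}\sum_{i=1}^{N} (u_i\bar{Y}_i - \bar{Y})^2.$$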
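The unbiasedness of $\bar{y}^*_{S2}$ and its variance formula can be checked together by exhaustive enumeration. The sketch below is an illustration only; the toy data, `m`, `n`, and all helper names are assumptions.

```python
from fractions import Fraction
from itertools import combinations, product

clusters = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]  # hypothetical y_ij values
m = [2, 1, 2]                                  # hypothetical subsample sizes m_i
n = 2
N = len(clusters)
M_bar = Fraction(sum(len(c) for c in clusters), N)
u = [Fraction(len(c)) / M_bar for c in clusters]   # u_i = M_i / M_bar


def mean(values):
    values = list(values)
    return sum(values) / Fraction(len(values))


def sample_var(values):
    values = list(values)
    c = mean(values)
    return sum((v - c) ** 2 for v in values) / Fraction(len(values) - 1)


def expected_value(estimator):
    first_stage = list(combinations(range(N), n))
    total = Fraction(0)
    for s in first_stage:
        subsamples = list(product(*[list(combinations(clusters[i], m[i])) for i in s]))
        total += sum(estimator(s, sub) for sub in subsamples) / Fraction(len(subsamples))
    return total / len(first_stage)


def y_star(s, sub):
    """(1/n) * sum of u_i * (subsample mean of cluster i)."""
    return mean(u[i] * mean(ys) for i, ys in zip(s, sub))


Y_bar = mean(v for c in clusters for v in c)
mean_enumerated = expected_value(y_star)
var_enumerated = expected_value(lambda s, sub: y_star(s, sub) ** 2) - mean_enumerated ** 2

# Closed form: (1/n - 1/N) S_b*^2 + (1/(nN)) sum_i u_i^2 (1/m_i - 1/M_i) S_i^2.
S_b_star2 = sample_var(ui * mean(c) for ui, c in zip(u, clusters))
within = sum(ui ** 2 * (Fraction(1, mi) - Fraction(1, len(c))) * sample_var(c)
             for ui, mi, c in zip(u, m, clusters))
var_formula = (Fraction(1, n) - Fraction(1, N)) * S_b_star2 + within / (n * N)
print(mean_enumerated, Y_bar, var_enumerated == var_formula)
```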

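3. Estimator Based on Ratio Estimator

$$\hat{\bar{Y}} = \bar{y}^{**}_{S2} = \frac{\sum_{i=1}^{n} M_i\,\bar{y}_{i(m_i)}}{\sum_{i=1}^{n} M_i} = \frac{\sum_{i=1}^{n} u_i\,\bar{y}_{i(m_i)}}{\sum_{i=1}^{n} u_i} = \frac{\bar{y}^*_{S2}}{\bar{u}_n},$$

where

$$u_i = \frac{M_i}{\bar{M}}, \qquad \bar{u}_n = \frac{1}{n}\sum_{i=1}^{n} u_i.$$

This estimator can be seen as arising from the ratio method of estimation as follows: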

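Let

$$y_i^* = u_i\,\bar{y}_{i(m_i)} \quad \text{and} \quad x_i^* = \frac{M_i}{\bar{M}} = u_i, \qquad i = 1, 2, \ldots, N,$$

be the values of the study variable and the auxiliary variable in reference to the ratio method of estimation. Then

$$\bar{y}^* = \frac{1}{n}\sum_{i=1}^{n} y_i^* = \bar{y}^*_{S2},$$

$$\bar{x}^* = \frac{1}{n}\sum_{i=1}^{n} x_i^* = \bar{u}_n,$$

$$\bar{X}^* = \frac{1}{N}\sum_{i=1}^{N} x_i^* = 1.$$

The corresponding ratio estimator of $\bar{Y}$ is

$$\hat{\bar{Y}}_R = \frac{\bar{y}^*}{\bar{x}^*}\,\bar{X}^* = \frac{\bar{y}^*_{S2}}{\bar{u}_n}\cdot 1 = \bar{y}^{**}_{S2}.$$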
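The equivalence of the two forms of the ratio-type estimator can be illustrated numerically. In the sketch below the cluster sizes, the subsampled values, and the function names `ratio_estimator` and `ratio_via_u` are hypothetical.

```python
from fractions import Fraction


def mean(values):
    values = list(values)
    return sum(values) / Fraction(len(values))


def ratio_estimator(sizes, subsamples):
    """sum(M_i * ybar_i) / sum(M_i) over the sampled first stage units."""
    y_bar = [mean(ys) for ys in subsamples]
    return sum(Mi * yi for Mi, yi in zip(sizes, y_bar)) / Fraction(sum(sizes))


def ratio_via_u(sizes, subsamples, M_bar):
    """Same estimator written as ybar*_S2 / ubar_n with u_i = M_i / M_bar."""
    u = [Fraction(Mi) / M_bar for Mi in sizes]
    y_star = mean(ui * mean(ys) for ui, ys in zip(u, subsamples))  # ybar*_S2
    return y_star / mean(u)                                        # divide by ubar_n


# Hypothetical sample: two clusters of sizes 3 and 4, two units observed in each.
sizes = [3, 4]
subsamples = [[1, 3], [6, 9]]

print(ratio_estimator(sizes, subsamples))               # 36/7
print(ratio_via_u(sizes, subsamples, Fraction(3)))      # 36/7
```

Because $\bar{M}$ cancels between numerator and denominator, this estimator can be computed without knowing $\bar{M}$, unlike $\bar{y}^*_{S2}$.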

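So the bias and mean squared error of $\bar{y}^{**}_{S2}$ can be obtained directly from the results on the bias and MSE of the ratio estimator.

Recall that in the ratio method of estimation, the bias of the ratio estimator up to the second order of approximation is

$$Bias(\hat{\bar{Y}}_R) = \frac{N-n}{Nn}\,\bar{Y}\left(C_x^2 - \rho\,C_x C_y\right) = \bar{Y}\left[\frac{Var(\bar{x})}{\bar{X}^2} - \frac{Cov(\bar{x}, \bar{y})}{\bar{X}\bar{Y}}\right],$$

and its mean squared error is

$$MSE(\hat{\bar{Y}}_R) = Var(\bar{y}) + R^2\,Var(\bar{x}) - 2R\,Cov(\bar{x}, \bar{y}),$$

where $R = \dfrac{\bar{Y}}{\bar{X}}$, $C_x$ and $C_y$ are the coefficients of variation of $x$ and $y$, and $\rho$ is the correlation coefficient between them.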

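3. Estimator Based on Ratio Estimator: Bias

The bias of $\bar{y}^{**}_{S2}$ up to the second order of approximation is

$$Bias(\bar{y}^{**}_{S2}) = \bar{Y}\left[\frac{Var(\bar{x}^*_{S2})}{\bar{X}^2} - \frac{Cov(\bar{x}^*_{S2}, \bar{y}^*_{S2})}{\bar{X}\bar{Y}}\right],$$

where $\bar{x}^*_{S2}$ is the mean of the auxiliary variable, defined similarly to $\bar{y}^*_{S2}$ as

$$\bar{x}^*_{S2} = \frac{1}{n}\sum_{i=1}^{n} u_i\,\bar{x}_{i(m_i)}.$$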

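Now we find $Cov(\bar{x}^*_{S2}, \bar{y}^*_{S2})$.

$$
\begin{aligned}
Cov(\bar{x}^*_{S2}, \bar{y}^*_{S2}) &= Cov\left[E\left(\frac{1}{n}\sum_{i=1}^{n} u_i\bar{x}_{i(m_i)} \,\Big|\, n\right),\; E\left(\frac{1}{n}\sum_{i=1}^{n} u_i\bar{y}_{i(m_i)} \,\Big|\, n\right)\right] \\
&\quad + E\left[Cov\left(\frac{1}{n}\sum_{i=1}^{n} u_i\bar{x}_{i(m_i)},\; \frac{1}{n}\sum_{i=1}^{n} u_i\bar{y}_{i(m_i)} \,\Big|\, n\right)\right] \\
&= Cov\left[\frac{1}{n}\sum_{i=1}^{n} u_i E(\bar{x}_{i(m_i)}),\; \frac{1}{n}\sum_{i=1}^{n} u_i E(\bar{y}_{i(m_i)})\right] + E\left[\frac{1}{n^2}\sum_{i=1}^{n} u_i^2\,Cov(\bar{x}_{i(m_i)}, \bar{y}_{i(m_i)} \mid i)\right] \\
&= Cov\left[\frac{1}{n}\sum_{i=1}^{n} u_i\bar{X}_i,\; \frac{1}{n}\sum_{i=1}^{n} u_i\bar{Y}_i\right] + E\left[\frac{1}{n^2}\sum_{i=1}^{n} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_{ixy}\right] \\
&= \left(\frac{1}{n} - \frac{1}{N}\right)S^*_{bxy} + \frac{1}{nN}\sum_{i=1}^{N} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_{ixy},
\end{aligned}
$$

where

$$S^*_{bxy} = \frac{1}{N-1}\sum_{i=1}^{N} (u_i\bar{X}_i - \bar{X})(u_i\bar{Y}_i - \bar{Y}),$$

$$S_{ixy} = \frac{1}{M_i-1}\sum_{j=1}^{M_i} (x_{ij} - \bar{X}_i)(y_{ij} - \bar{Y}_i).$$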

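Similarly, $Var(\bar{x}^*_{S2})$ can be obtained by replacing $y$ by $x$ in $Cov(\bar{x}^*_{S2}, \bar{y}^*_{S2})$ as

$$Var(\bar{x}^*_{S2}) = \left(\frac{1}{n} - \frac{1}{N}\right)S^{*2}_{bx} + \frac{1}{nN}\sum_{i=1}^{N} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S^2_{ix},$$

where

$$S^{*2}_{bx} = \frac{1}{N-1}\sum_{i=1}^{N} (u_i\bar{X}_i - \bar{X})^2,$$

$$S^2_{ix} = \frac{1}{M_i-1}\sum_{j=1}^{M_i} (x_{ij} - \bar{X}_i)^2.$$

Substituting $Cov(\bar{x}^*_{S2}, \bar{y}^*_{S2})$ and $Var(\bar{x}^*_{S2})$ in $Bias(\bar{y}^{**}_{S2})$, we obtain the approximate bias as

$$Bias(\bar{y}^{**}_{S2}) \approx \bar{Y}\left[\left(\frac{1}{n} - \frac{1}{N}\right)\left(\frac{S^{*2}_{bx}}{\bar{X}^2} - \frac{S^*_{bxy}}{\bar{X}\bar{Y}}\right) + \frac{1}{nN}\sum_{i=1}^{N} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)\left(\frac{S^2_{ix}}{\bar{X}^2} - \frac{S_{ixy}}{\bar{X}\bar{Y}}\right)\right].$$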

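3. Estimator Based on Ratio Estimator: MSE

$$MSE(\bar{y}^{**}_{S2}) \approx Var(\bar{y}^*_{S2}) - 2R^*\,Cov(\bar{x}^*_{S2}, \bar{y}^*_{S2}) + R^{*2}\,Var(\bar{x}^*_{S2}),$$

$$Var(\bar{y}^*_{S2}) = \left(\frac{1}{n} - \frac{1}{N}\right)S^{*2}_{by} + \frac{1}{nN}\sum_{i=1}^{N} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S^2_{iy},$$

$$Var(\bar{x}^*_{S2}) = \left(\frac{1}{n} - \frac{1}{N}\right)S^{*2}_{bx} + \frac{1}{nN}\sum_{i=1}^{N} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S^2_{ix},$$

$$Cov(\bar{x}^*_{S2}, \bar{y}^*_{S2}) = \left(\frac{1}{n} - \frac{1}{N}\right)S^*_{bxy} + \frac{1}{nN}\sum_{i=1}^{N} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_{ixy},$$

where

$$S^{*2}_{by} = \frac{1}{N-1}\sum_{i=1}^{N} (u_i\bar{Y}_i - \bar{Y})^2,$$

$$S^2_{iy} = \frac{1}{M_i-1}\sum_{j=1}^{M_i} (y_{ij} - \bar{Y}_i)^2,$$

$$R^* = \frac{\bar{Y}}{\bar{X}} = \bar{Y}.$$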

where

22

3. Estimator Based  on  Ratio Estimator: MSE

Thus

Also

** *2 * * *2 *2 2 2 * *2 22

1

1 1 1 1 1( ) 2 2 .N

S by bxy bx i iy ixy ixi i i

MSE y S R S R S u S R S R Sn N nN m M

2** 2 * 2 2 * *2 22

1 1

1 1 1 1 1 1( ) 2 .1

N N

S i i i i iy ixy ixi i i i

MSE y u Y R X u S R S R Sn N N nN m M

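3. Estimator Based on Ratio Estimator: Estimate of Variance

Consider

$$s^*_{bxy} = \frac{1}{n-1}\sum_{i=1}^{n} \left(u_i\bar{y}_{i(m_i)} - \bar{y}^*_{S2}\right)\left(u_i\bar{x}_{i(m_i)} - \bar{x}^*_{S2}\right),$$

$$s_{ixy} = \frac{1}{m_i-1}\sum_{j=1}^{m_i} \left(x_{ij} - \bar{x}_{i(m_i)}\right)\left(y_{ij} - \bar{y}_{i(m_i)}\right).$$

It can be shown that

$$E(s^*_{bxy}) = S^*_{bxy} + \frac{1}{N}\sum_{i=1}^{N} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_{ixy},$$

$$E(s_{ixy}) = S_{ixy}.$$

So

$$E\left[\frac{1}{n}\sum_{i=1}^{n} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)s_{ixy}\right] = \frac{1}{N}\sum_{i=1}^{N} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S_{ixy}.$$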

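Thus

$$\hat{S}^*_{bxy} = s^*_{bxy} - \frac{1}{n}\sum_{i=1}^{n} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)s_{ixy}.$$

Also

$$E\left[\frac{1}{n}\sum_{i=1}^{n} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)s^2_{ix}\right] = \frac{1}{N}\sum_{i=1}^{N} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S^2_{ix},$$

$$E\left[\frac{1}{n}\sum_{i=1}^{n} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)s^2_{iy}\right] = \frac{1}{N}\sum_{i=1}^{N} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)S^2_{iy},$$

so that

$$\hat{S}^{*2}_{bx} = s^{*2}_{bx} - \frac{1}{n}\sum_{i=1}^{n} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)s^2_{ix},$$

$$\hat{S}^{*2}_{by} = s^{*2}_{by} - \frac{1}{n}\sum_{i=1}^{n} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)s^2_{iy}.$$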

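A consistent estimator of the MSE of $\bar{y}^{**}_{S2}$ can be obtained by substituting the unbiased estimators of the respective statistics in $MSE(\bar{y}^{**}_{S2})$ as

$$
\begin{aligned}
\widehat{MSE}(\bar{y}^{**}_{S2}) &= \left(\frac{1}{n} - \frac{1}{N}\right)\left(s^{*2}_{by} - 2r^* s^*_{bxy} + r^{*2} s^{*2}_{bx}\right) + \frac{1}{nN}\sum_{i=1}^{n} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)\left(s^2_{iy} - 2r^* s_{ixy} + r^{*2} s^2_{ix}\right) \\
&= \left(\frac{1}{n} - \frac{1}{N}\right)\frac{1}{n-1}\sum_{i=1}^{n} u_i^2\left(\bar{y}_{i(m_i)} - r^*\bar{x}_{i(m_i)}\right)^2 + \frac{1}{nN}\sum_{i=1}^{n} u_i^2\left(\frac{1}{m_i} - \frac{1}{M_i}\right)\left(s^2_{iy} - 2r^* s_{ixy} + r^{*2} s^2_{ix}\right),
\end{aligned}
$$

where

$$r^* = \frac{\bar{y}^*_{S2}}{\bar{x}^*_{S2}}.$$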
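The two expressions for the estimated MSE can be computed side by side from one sample. The sketch below assumes hypothetical subsampled values `y_sub` and `x_sub`, cluster sizes `sizes`, a known `N` and `M_bar`, and illustrative function names; it is meant only to show how the pieces fit together.

```python
from fractions import Fraction


def mean(values):
    values = list(values)
    return sum(values) / Fraction(len(values))


def cov(a, b):
    """Sample covariance with divisor (length - 1); cov(a, a) is the sample variance."""
    a, b = list(a), list(b)
    ma, mb = mean(a), mean(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / Fraction(len(a) - 1)


def mse_hat(sizes, y_sub, x_sub, N, M_bar):
    """Return the estimated MSE computed from both expressions."""
    n = len(sizes)
    u = [Fraction(Mi) / M_bar for Mi in sizes]
    yb = [mean(ys) for ys in y_sub]                  # subsample means of y
    xb = [mean(xs) for xs in x_sub]                  # subsample means of x
    uy = [ui * yi for ui, yi in zip(u, yb)]
    ux = [ui * xi for ui, xi in zip(u, xb)]
    r = mean(uy) / mean(ux)                          # r* = ybar*_S2 / xbar*_S2
    between = cov(uy, uy) - 2 * r * cov(ux, uy) + r ** 2 * cov(ux, ux)
    between_alt = sum((ui * (yi - r * xi)) ** 2
                      for ui, yi, xi in zip(u, yb, xb)) / Fraction(n - 1)
    within = sum(
        ui ** 2 * (Fraction(1, len(ys)) - Fraction(1, Mi))
        * (cov(ys, ys) - 2 * r * cov(xs, ys) + r ** 2 * cov(xs, xs))
        for ui, Mi, ys, xs in zip(u, sizes, y_sub, x_sub)
    ) / Fraction(n * N)
    f = Fraction(1, n) - Fraction(1, N)
    return f * between + within, f * between_alt + within


# Hypothetical sample: n = 2 clusters out of N = 3, two units observed in each.
first_form, second_form = mse_hat(
    sizes=[3, 4],
    y_sub=[[1, 3], [6, 9]],
    x_sub=[[2, 3], [5, 8]],
    N=3,
    M_bar=Fraction(3),
)
print(first_form == second_form)  # True
```

The two forms agree because $\bar{y}^*_{S2} = r^*\,\bar{x}^*_{S2}$, which makes the centring terms cancel in the between-cluster component.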
