United Nations Workshop on the 2010 World Programme on Population and Housing Censuses: Census Evaluation and Post Enumeration Surveys, Amman, Jordan, 21-24 November, 2010

Sampling Frames and Sample Design

Pres. 5

Sample Frames & Sample Design

Objectives: Important to define objectives before designing a sample

Items to estimate – coverage error, duplication, omissions, etc.

Geographic level – national, sub-national (province or district, urban/rural, etc.)

Demographic characteristics – sex, age, person, household, etc.

Confidence level

Margin of error

Sample Frames & Sample Design

Frames: Material from which a sample is drawn

Each unit in the universe should be included

There should be no duplicates

Each unit should be well defined and distinguishable from other units (it should be unique)

The frame should be kept up to date

Sampling Strategies

Probability household surveys

It is usual to make inferences in a PES for a number of analytical domains

Relatively large samples are necessary in each domain for reliable estimates

Stratified cluster sample designs are common

First-stage units or Primary Sampling Units (PSUs): many countries use geographically contiguous land areas, usually called area clusters or enumeration areas (EAs)

Systematic sample selection with probability proportional to size (PPS)

At the second stage, it is common to canvass all persons in selected households

Importance of Stratification

Population subdivided into heterogeneous groups that are internally homogeneous

Stratification based on variables correlated with the extent of coverage, e.g. geopolitical subdivisions

Internal homogeneity can be maintained with regard to socio-demographic variables, e.g. an urban stratum

Common strata may include: rural, urban, provinces etc.

Multi-stage Cluster Sampling

Usually used when sampling hierarchical populations

The hierarchical levels are called stages

First stage units are called primary sampling units (PSUs), e.g. EAs

Second stage units are called secondary sampling units (SSUs), e.g. households

Last stage units are called ultimate sampling units (USUs), e.g. persons within households, which can be selected from EAs

Why Area sampling?

At national level only a frame of EAs is required

Data collection is more efficient

Lower costs compared to simple random sampling (SRS)

Supervision is easier

However, estimates are prone to higher variability compared to SRS

Choices of PSUs

Must have clearly identifiable and stable boundaries

Must completely cover the relevant population

Preferably must have measures of size

They should be mapped

Must cover the whole country

The number of PSUs must be relatively large

Common problems with EAs

Incomplete coverage

Inadequate maps

Poor measures of size or lack of them

PES sample design

A single-stage stratified clustered sample design is commonly adopted

Once the PSUs (i.e. EAs) are selected, all households in the selected EAs are canvassed or, more rarely, only a sample of them (e.g. 1 in every 5).

Canvassing every household in a selected EA is beneficial for the matching operation

Sample Size

Sample size depends on estimate requirements

Geographic level (national, province, urban/rural)

Demographic (sex, age)

Reliability

Confidence level

Sample Size

To estimate sample size in the case of proportions you must:

Know the expected occurrence of the event in the population, by domain of estimation

Specify a confidence level (e.g. 95%)

Specify the margin of error or precision (e.g. 1%)

Sample Size (contd.)

To estimate sample size in the case of proportions, the following formula can be used:

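$$P\left(Y \pm t\sqrt{\left(1-\frac{n}{N}\right)\frac{s^{2}(y)}{n}}\right) = 1-\alpha$$

where

$Y$ = estimated value

$n$ = sample size

$t$ = value of $t$ that cuts $\alpha\%$ area in the tails of a normal distribution curve

$N$ = size of the total population

$s^{2}(y)$ = estimated variance of the proportion to estimate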

Sample size (contd.)

From that it is deduced:

$$m = t\sqrt{\left(1-\frac{n}{N}\right)\frac{s^{2}(y)}{n}}$$

is the precision in points of %.

With $s^{2}(y) = pq$ (binomial distribution) and $1-\frac{n}{N} \approx 1$, then

$$n = \frac{t^{2}pq}{m^{2}}$$

Sample Size (contd.)

Example: To estimate the percentage of households omitted in the census (expected to be about 5%), at a 95% confidence level (t = 1.96) and with a margin of error of 2%

The sample size works out to be:

$$n_h = \frac{t^{2}pq}{m^{2}} = \frac{(1.96)^{2} \times 5 \times 95}{2^{2}} \approx 456$$
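The arithmetic of this example can be checked with a short Python sketch. The function names are hypothetical, and p, q and m are taken in percentage points as on the slide. The second function is an addition for comparison only: it solves the precision formula for n without dropping the term 1 - n/N, which gives n = n0 / (1 + n0/N), so one can see how little the approximation matters when the population is large.

```python
def base_sample_size(p_pct, margin_pct, t=1.96):
    """n = t^2 * p * q / m^2, with p, q and m in percentage points."""
    q_pct = 100.0 - p_pct
    return t ** 2 * p_pct * q_pct / margin_pct ** 2


def with_finite_population(n0, population_size):
    """Same precision target, but keeping the (1 - n/N) term.

    Solving m^2 = t^2 * (1 - n/N) * pq / n for n gives n0 / (1 + n0 / N).
    """
    return n0 / (1.0 + n0 / population_size)


# Slide example: expected omission rate 5%, margin of error 2%, t = 1.96.
n0 = base_sample_size(p_pct=5, margin_pct=2)
print(round(n0))  # 456

# Illustrative population sizes (assumed, not from the slides).
for households in (2_000, 1_000_000):
    print(households, round(with_finite_population(n0, households)))
```

For a domain with many households the corrected value is practically the same as 456, which is why the slides set 1 - n/N to 1.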

Sample Size (contd.)

Adjusting for non-response, e.g. 10%:

$$\frac{456}{0.90} \approx 507$$

Adjusting for the design effect of a complex sample design: a design effect of 2 is a default value: 2 x 507 = 1,014

This may apply to each province (analysis domain). If there are five provinces, the sample size will be 5 x 1,014 = 5,070
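The chain of adjustments above can be written as one small Python function. This is only a sketch: the function and parameter names are hypothetical, and rounding up at each step is an assumption chosen to reproduce the slide's figures (456 / 0.90 = 506.7, shown as 507).

```python
import math


def pes_sample_size(base_n, response_rate=0.90, design_effect=2.0, n_domains=1):
    """Inflate a simple-random-sample size for non-response, design effect and domains.

    Returns (size after non-response, size per domain, total size).
    """
    after_nonresponse = math.ceil(base_n / response_rate)      # e.g. 456 / 0.90
    per_domain = math.ceil(after_nonresponse * design_effect)  # e.g. 2 x 507
    total = per_domain * n_domains                             # e.g. 5 provinces
    return after_nonresponse, per_domain, total


print(pes_sample_size(456, response_rate=0.90, design_effect=2.0, n_domains=5))
# (507, 1014, 5070)
```

The default design effect of 2 is the slide's default value; a country with an estimate from an earlier survey would pass its own figure instead.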

Sample selection procedures

For greater convenience and efficiency, the sample of PSUs should be selected using a systematic procedure.

If there are good measures of size, probability proportional to size (PPS) should be used to increase the efficiency of the sample design.

Otherwise, the selection should be made with equal probabilities
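As a rough illustration of the equal-probability case, the Python sketch below draws PSUs systematically from an ordered list: a random start followed by a fixed interval. The function name `systematic_sample` and the EA codes are hypothetical and serve only as an example.

```python
import random


def systematic_sample(units, n, rng=random):
    """Pick n units at a fixed interval from an ordered list, after a random start."""
    if not 0 < n <= len(units):
        raise ValueError("n must be between 1 and the number of units")
    interval = len(units) / n        # may be fractional
    start = rng.random() * interval  # random start in [0, interval)
    last = len(units) - 1
    # The integer part of each selection point is the list index of the chosen unit.
    return [units[min(int(start + k * interval), last)] for k in range(n)]


# Hypothetical frame of 100 EA codes, assumed to be sorted geographically.
frame = [f"EA{code:03d}" for code in range(1, 101)]
print(systematic_sample(frame, 10, rng=random.Random(2010)))
```

Because the list is ordered before selection, the fixed interval spreads the sample across the whole frame, which is the practical benefit of a systematic procedure.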

Sample selection procedures -- PPS

1) Order the EAs geographically (and, if applicable, by other stratification characteristic) to allow implicit stratification

 2) Record for each EA i of the stratum h the measure of size Mhi, typically the number of households or persons from the census mapping operation

 3) Cumulate the size measures down the list of EAs, the last cumulated number will be equal to the total number of households (or persons) in stratum h (Mh)

 4) Determine the number of EAs (nh) to be selected in a stratum according to the allocation

Sample selection procedures -- PPS (contd.)

 5) Determine the sampling interval (Ih) by: Ih = Mh / nh

 6) Obtain a random number (Ah) between 1 and Ih inclusively;

 7) Determine the selected EAs as follows: Shi = Ah + (i-1) x Ih, for i = 1,...,nh, rounded up to the next integer

The i-th EA selected is the first one whose cumulated measure is greater than or equal to Shi, that is, the EA whose cumulated range contains Shi.
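Steps 3 to 7 can be sketched in Python for a single stratum as follows. Several points are assumptions rather than part of the slides: the function name, the example sizes, taking the sampling interval as Ih = Mh / nh, drawing the random start as a real number in (0, Ih] so that rounding up yields an integer from 1 to Ih, and reading the selection rule as "the EA whose cumulated range contains Shi".

```python
import math
import random
from bisect import bisect_left
from itertools import accumulate


def pps_systematic(sizes, n_select, rng=random):
    """Return the list positions of EAs chosen by systematic PPS in one stratum.

    `sizes` holds the measure of size of each EA, in the order fixed in step 1.
    """
    cumulated = list(accumulate(sizes))    # step 3: running totals
    total = cumulated[-1]                  # Mh, total size of the stratum
    interval = total / n_select            # step 5: Ih = Mh / nh
    start = interval * (1 - rng.random())  # step 6: Ah in (0, Ih]
    selected = []
    for i in range(1, n_select + 1):
        # Step 7: selection number, rounded up and capped at Mh.
        s_hi = min(math.ceil(start + (i - 1) * interval), total)
        # First EA whose cumulated size reaches s_hi.
        selected.append(bisect_left(cumulated, s_hi))
    return selected


# Hypothetical household counts for six EAs, selecting three of them.
sizes = [50, 30, 20, 100, 40, 60]
print(pps_systematic(sizes, 3, rng=random.Random(2010)))
```

An EA whose size exceeds the sampling interval can be hit by more than one selection number, so the returned list may contain the same position twice; in practice such EAs are usually split or treated as selected with certainty.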

Illustration: Selection of eight EAs with probability proportional to size

Sample Allocation – 2009 Kenyan PES

Thank You!