Sampling Populations Ideal situation - Perfect knowledge Not possible in many cases - Size & cost Not necessary - appropriate subset adequate estimates Sampling - A representative subset

Page 1

Sampling Populations

• Ideal situation

- Perfect knowledge

• Not possible in many cases

- Size & cost

• Not necessary

- An appropriate subset gives adequate estimates

• Sampling

- A representative subset

Page 2

Sampling Concepts

• Sampling unit

- The smallest sub-division of the population

• Sampling error

- Sampling error decreases as the sample size increases

• Sampling bias

- systematic tendency
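The relationship between sampling error and sample size can be illustrated with a small simulation. This is only a sketch: the population values, the function name `mean_abs_sampling_error`, and the number of trials are all assumptions made for illustration, not part of the slides.

```python
import random
import statistics

def mean_abs_sampling_error(population, sample_size, trials=500, seed=0):
    """Average |sample mean - population mean| over repeated simple random samples."""
    rng = random.Random(seed)
    true_mean = statistics.fmean(population)
    errors = []
    for _ in range(trials):
        sample = rng.sample(population, sample_size)  # draw without replacement
        errors.append(abs(statistics.fmean(sample) - true_mean))
    return statistics.fmean(errors)

# Hypothetical population of 10,000 values (assumed: mean 50, std. dev. 10).
pop_rng = random.Random(42)
population = [pop_rng.gauss(50, 10) for _ in range(10_000)]

for size in (25, 100, 400):
    print(size, round(mean_abs_sampling_error(population, size), 3))
```

The printed error should shrink as the sample size grows. Note that this only shows random sampling error; a biased design would not be corrected by a larger sample.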

Page 3

Steps in Sampling

1. Definition of the population

- Any inferences apply only to that population

2. Construction of a sampling frame

- This involves identifying all the individual sampling units within a population so that the sample can be drawn from them

Page 4

Steps in Sampling Cont.

3. Selection of a sampling design

- Critical decision

4. Specification of information to be collected

- What data we will collect and how

5. Collection of the data

Page 5

Sampling designs

• Non-probability designs

- Not concerned with being representative

• Probability designs

- Aim to be representative of the population

Page 6

Non-probability Sampling Designs

• Volunteer sampling

- Self-selecting

- Convenient

- Rarely representative

• Quota sampling

- Fulfilling counts of sub-groups

• Convenience sampling

- Availability/accessibility

• Judgmental or purposive sampling

- Preconceived notions

Page 7

Probability Sampling Designs

• Random sampling

• Systematic sampling

• Stratified sampling
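The three probability designs listed above can be sketched as follows. The slide only names the designs, so the function names, the sampling frame of integer unit IDs, and the two-stratum split are hypothetical choices for illustration; the definitions follow the usual textbook meaning of each design.

```python
import random

def simple_random_sample(frame, n, rng):
    """Every unit has the same chance of selection; drawn without replacement."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Random start, then every k-th unit of the ordered frame."""
    k = len(frame) // n          # sampling interval
    start = rng.randrange(k)     # random start within the first interval
    return [frame[start + i * k] for i in range(n)]

def stratified_sample(frame, stratum_of, n_per_stratum, rng):
    """Split the frame into strata, then draw a random sample inside each one."""
    strata = {}
    for unit in frame:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for key in sorted(strata):
        sample.extend(rng.sample(strata[key], n_per_stratum))
    return sample

frame = list(range(100))   # hypothetical sampling frame of 100 unit IDs
rng = random.Random(1)     # fixed seed so the sketch is repeatable
srs = simple_random_sample(frame, 10, rng)
sys_s = systematic_sample(frame, 10, rng)
strat = stratified_sample(frame, lambda u: u // 50, 5, rng)  # two assumed strata
print(srs, sys_s, strat, sep="\n")
```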

Page 8

Tobler’s Law and Independence

Everything is related to everything else, but near things are more related than distant things.

• Sampled locations in close proximity are likely to have similar characteristics, thus they are unlikely to be independent

Page 9

Spatial Patterns

• Point Pattern Analysis

- Location information

- Point data

• Geographic Patterns in Areal Data

- Attribute values

- Polygon representations

Page 10

Point Pattern Analysis

[Figure: three example point patterns, labelled Regular, Random, and Clustered]

Page 11

Point Pattern Analysis

1. The Quadrat Method

2. Nearest Neighbor Analysis

Page 12

The Quadrat Method

1. Divide a study region into m cells of equal size

2. Find the mean number of points per cell ($\bar{x}$)

3. Find the variance of the number of points per cell ($s^2$):

$$ s^2 = \frac{\sum_{i=1}^{m} (x_i - \bar{x})^2}{m - 1} $$

where $x_i$ is the number of points in cell $i$

Page 13

The Quadrat Method

4. Calculate the variance to mean ratio (VMR):

$$ \mathrm{VMR} = \frac{s^2}{\bar{x}} $$

5. Interpret VMR

- VMR < 1: Regular (uniform)

- VMR = 1: Random

- VMR > 1: Clustered
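Steps 1 to 5 of the quadrat method can be sketched in a few lines. The function names and the two toy point sets are assumptions for illustration; `statistics.variance` is used because it divides by m - 1, matching the sample variance in step 3.

```python
from statistics import mean, variance

def quadrat_counts(points, width, height, nx, ny):
    """Count points in each cell of an nx-by-ny grid over a width-by-height region."""
    counts = [0] * (nx * ny)
    for x, y in points:
        col = min(int(x / width * nx), nx - 1)   # clamp points on the far edge
        row = min(int(y / height * ny), ny - 1)
        counts[row * nx + col] += 1
    return counts

def variance_mean_ratio(counts):
    """VMR = s^2 / x-bar, with s^2 computed using the (m - 1) denominator."""
    return variance(counts) / mean(counts)

# Two hypothetical 4-point patterns in a 2 x 2 region split into 4 cells.
regular = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5)]
clustered = [(0.1, 0.1), (0.2, 0.2), (0.3, 0.1), (0.1, 0.3)]

vmr_regular = variance_mean_ratio(quadrat_counts(regular, 2, 2, 2, 2))
vmr_clustered = variance_mean_ratio(quadrat_counts(clustered, 2, 2, 2, 2))
print(vmr_regular, vmr_clustered)
```

One point per cell gives a variance of zero (VMR below 1, regular), while all four points in one cell gives a VMR above 1 (clustered).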

Page 14

The Quadrat Method

6. Interpret the variance to mean ratio (VMR) with a $\chi^2$ test:

$$ \chi^2 = \frac{(m - 1)\, s^2}{\bar{x}} = (m - 1) \times \mathrm{VMR} $$

Compare the test statistic to critical values from the $\chi^2$ distribution with df = (m - 1)
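The test statistic for the quadrat method is the VMR scaled by (m - 1). A minimal sketch, assuming a hypothetical list of cell counts and a made-up function name `quadrat_chi_square`:

```python
from statistics import mean, variance

def quadrat_chi_square(counts):
    """Return (chi-square statistic, degrees of freedom) for quadrat counts."""
    m = len(counts)                          # number of cells
    vmr = variance(counts) / mean(counts)    # s^2 / x-bar, s^2 uses m - 1
    return (m - 1) * vmr, m - 1

# Hypothetical counts for 4 cells: all 4 points fall in the first cell.
chi_sq, df = quadrat_chi_square([4, 0, 0, 0])
print(chi_sq, df)
```

The statistic is then compared with a critical value looked up for df = m - 1. For example, the upper 5% critical value for df = 3 is 7.815, so the toy counts above would be judged more clustered than random.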

Page 15

Quadrat Method Example

Page 16

The Effect of Quadrat Size

• Quadrat size

- Too small: empty cells

- Too large: may miss patterns that occur within a single cell

• Suggested optimal sizes

- either 2 points per cell (McIntosh, 1950)

- or 1.6 points/cell (Bailey and Gatrell, 1995)
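A target of 2 (or 1.6) points per cell can be turned into a quadrat size by simple arithmetic: if n points lie in a region of area A, a cell holding k points on average has area k × A / n. The sketch below assumes square quadrats; the function name and the example values (6 points in an area of 42) are illustrative only.

```python
import math

def suggested_quadrat_side(area, n_points, points_per_cell=2.0):
    """Side length of a square quadrat that holds `points_per_cell` points on average."""
    cell_area = points_per_cell * area / n_points
    return math.sqrt(cell_area)

side_2 = suggested_quadrat_side(42, 6)                         # 2 points per cell
side_16 = suggested_quadrat_side(42, 6, points_per_cell=1.6)   # 1.6 points per cell
print(side_2, side_16)
```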

Page 17

2. Nearest Neighbor Analysis

• An alternative approach

- Uses the distance between any given point and its nearest neighbor

• The average distance between neighboring points ($R_O$):

$$ R_O = \frac{\sum_{i=1}^{n} d_i}{n} $$

where $d_i$ is the distance from point $i$ to its nearest neighbor and $n$ is the number of points
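The observed mean nearest neighbor distance can be computed by brute force for small point sets. This is a sketch with hypothetical function names and toy coordinates; it compares every pair of points, which is fine for a handful of points but slow for large data sets.

```python
import math

def nearest_neighbor_distances(points):
    """For each point, the Euclidean distance to its closest other point."""
    dists = []
    for i, p in enumerate(points):
        dists.append(min(math.dist(p, q) for j, q in enumerate(points) if j != i))
    return dists

def observed_mean_distance(points):
    """R_O: the mean of the nearest neighbor distances d_i."""
    d = nearest_neighbor_distances(points)
    return sum(d) / len(d)

pts = [(0, 0), (1, 0), (3, 0)]   # hypothetical points on a line
d_i = nearest_neighbor_distances(pts)
r_obs = observed_mean_distance(pts)
print(d_i, r_obs)
```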

Page 18

The Nearest Neighbor Statistic

• Expected distance:

$$ R_E = \frac{1}{2\sqrt{\lambda}} $$

where $\lambda$ is the number of points per unit area

• Nearest neighbor statistic (R):

$$ R = \frac{R_O}{R_E} = \frac{\bar{x}}{1 / (2\sqrt{\lambda})} $$

where $\bar{x}$ is the average observed distance $d_i$
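The expected distance and the nearest neighbor statistic follow directly from the point density. A minimal sketch, assuming the expected distance under randomness is 1 / (2√λ) with λ = n / area; the function names and the input values (6 points, area 42, observed mean 13/6) are chosen only for illustration.

```python
import math

def expected_mean_distance(n, area):
    """R_E = 1 / (2 * sqrt(lambda)), where lambda = n / area is the point density."""
    lam = n / area
    return 1 / (2 * math.sqrt(lam))

def nearest_neighbor_statistic(r_obs, n, area):
    """R = R_O / R_E."""
    return r_obs / expected_mean_distance(n, area)

r_exp = expected_mean_distance(6, 42)
r_stat = nearest_neighbor_statistic(13 / 6, 6, 42)
print(round(r_exp, 3), round(r_stat, 3))
```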

Page 19

Interpreting the Nearest Neighbor Statistic

• Values of R:

- 0: all points are coincident

- 1: a random pattern

- 2.1491: a perfectly uniform pattern

• Through the examination of many random point patterns, the variance of the mean distances between neighbors has been found to be:

$$ V[R_E] = \frac{4 - \pi}{4 \pi \lambda n} $$

where $n$ is the number of points

Page 20

Interpreting the Nearest Neighbor Statistic

• Test statistic:

$$ Z_{test} = \frac{R_O - R_E}{\sqrt{V[R_E]}} = \frac{R_O - R_E}{\sqrt{(4 - \pi) / (4 \pi \lambda n)}} = 3.826\, (R_O - R_E) \sqrt{\lambda n} $$

• Compared against the standard normal distribution
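The constant 3.826 in the test statistic is consistent with a variance of (4 - π) / (4πλn), since √(4π / (4 - π)) ≈ 3.826. The sketch below assumes that variance and checks that the full form and the shortcut agree; the function names and input values are hypothetical.

```python
import math

def nn_variance(n, area):
    """Assumed variance of the mean nearest neighbor distance: (4 - pi) / (4 pi lambda n)."""
    lam = n / area
    return (4 - math.pi) / (4 * math.pi * lam * n)

def nn_z_full(r_obs, n, area):
    """Z = (R_O - R_E) / sqrt(V[R_E])."""
    r_exp = 1 / (2 * math.sqrt(n / area))
    return (r_obs - r_exp) / math.sqrt(nn_variance(n, area))

def nn_z_shortcut(r_obs, n, area):
    """Z = 3.826 * (R_O - R_E) * sqrt(lambda * n)."""
    lam = n / area
    r_exp = 1 / (2 * math.sqrt(lam))
    return 3.826 * (r_obs - r_exp) * math.sqrt(lam * n)

z_full = nn_z_full(13 / 6, 6, 42)
z_short = nn_z_shortcut(13 / 6, 6, 42)
print(z_full, z_short)
```

The two values differ only by the rounding of the constant to 3.826.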

Page 21

Nearest Neighbor Analysis Example

Page 22

Nearest Neighbor Analysis Example

• Observed mean distance ($R_O$):

$R_O = (1 + 1 + 2 + 3 + 3 + 3) / 6 = 13 / 6 = 2.167$

• Expected mean distance ($R_E$):

$R_E = 1 / (2\sqrt{\lambda}) = 1 / (2\sqrt{6/42}) = 1.323$

• Use these values to calculate the nearest neighbor statistic (R):

$R = R_O / R_E = 2.167 / 1.323 = 1.638$

• Because R is greater than 1, this suggests the points are somewhat uniformly spaced

Page 23

Z-test for the Nearest Neighbor Statistic Example

• Research question: Is the point pattern random?

1. $H_0$: $R_O \approx R_E$ (Point pattern is approximately random)

2. $H_A$: $R_O \neq R_E$ (Pattern is uniform or clustered)

3. Select $\alpha$ = 0.05, two-tailed because $H_A$ is non-directional

4. We have already calculated $R_O$ and $R_E$; together with the sample size (n = 6) and the number of points per unit area ($\lambda$ = 6/42), we can calculate the test statistic:

$$ Z_{test} = 3.826\, (R_O - R_E) \sqrt{\lambda n} = 3.826\, (2.167 - 1.323) \sqrt{(6/42)(6)} \approx 2.99 $$

Page 24

Z-test for the Nearest Neighbor Statistic Example

5. For $\alpha$ = 0.05 and a two-tailed test, $Z_{crit}$ = 1.96

6. $Z_{test} > Z_{crit}$, therefore we reject $H_0$ and accept $H_A$: the point pattern is significantly different from a random point pattern. More specifically, it tends towards a uniform pattern because $Z_{test}$ exceeds the positive $Z_{crit}$ value.
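The whole worked example can be re-run as a check. This sketch takes the six nearest neighbor distances from the example and assumes a study area of 42 units, which reproduces the expected distance of 1.323; the variable names and the three-way verdict are illustrative.

```python
import math

distances = [1, 1, 2, 3, 3, 3]   # nearest neighbor distances from the example
area = 42.0                       # assumed study area (gives R_E = 1.323)
n = len(distances)
lam = n / area                    # points per unit area

r_obs = sum(distances) / n                 # observed mean distance, R_O
r_exp = 1 / (2 * math.sqrt(lam))           # expected mean distance, R_E
z_test = 3.826 * (r_obs - r_exp) * math.sqrt(lam * n)

z_crit = 1.96                     # two-tailed critical value for alpha = 0.05
if abs(z_test) <= z_crit:
    verdict = "random"            # fail to reject H0
elif z_test > 0:
    verdict = "uniform"           # observed distances larger than expected
else:
    verdict = "clustered"         # observed distances smaller than expected

print(round(r_obs, 3), round(r_exp, 3), round(z_test, 2), verdict)
```

With these inputs the statistic is about 2.99, above 1.96, so the sketch reaches the same decision as the example. With only six points the normal approximation is rough, so the result is best read as indicative.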