Sampling Populations

• Ideal situation

- Perfect knowledge

• Not possible in many cases

- Size & cost

• Not necessary

- An appropriate subset can give adequate estimates

• Sampling

- A representative subset

Sampling Concepts

• Sampling unit

- The smallest sub-division of the population

• Sampling error

- Sampling error decreases as the sample size increases

• Sampling bias

- systematic tendency
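The link between sampling error and sample size can be illustrated with a small simulation. This is a minimal sketch, not part of the original slides: the population (10,000 normally distributed values), the function name `mean_sampling_error`, and the number of trials are all assumptions chosen for illustration.

```python
import random
import statistics


def mean_sampling_error(population, sample_size, trials=500, seed=0):
    """Average absolute gap between the sample mean and the population mean."""
    rng = random.Random(seed)
    true_mean = statistics.fmean(population)
    errors = []
    for _ in range(trials):
        # Simple random sample without replacement
        sample = rng.sample(population, sample_size)
        errors.append(abs(statistics.fmean(sample) - true_mean))
    return statistics.fmean(errors)


# Hypothetical population used only for this illustration
_rng = random.Random(42)
population = [_rng.gauss(50, 10) for _ in range(10_000)]

for n in (10, 100, 1000):
    print(n, round(mean_sampling_error(population, n), 3))
```

The printed error should shrink as n grows. Note that this only illustrates sampling error; sampling bias would not shrink with a larger sample.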

Steps in Sampling

1. Definition of the population

- Any inferences are limited to that population

2. Construction of a sampling frame

This involves identifying all the individual sampling units within a population in order that the sample can be drawn from them

Steps in Sampling Cont.

3. Selection of a sampling design

- Critical decision

4. Specification of information to be collected

- What data we will collect and how

5. Collection of the data

Sampling designs

• Non-probability designs

- Not concerned with being representative

• Probability designs

- Aim to be representative of the population

Non-probability Sampling Designs

• Volunteer sampling

- Self-selecting

- Convenient

- Rarely representative

• Quota sampling

- Fulfilling counts of sub-groups

• Convenience sampling

- Availability/accessibility

• Judgmental or purposive sampling

- Preconceived notions

Probability Sampling Designs

• Random sampling

• Systematic sampling

• Stratified sampling
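The three probability designs can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not a definitive implementation: the sampling frame is a plain list, the function names are hypothetical, systematic sampling uses a fixed interval after a random start, and stratified sampling uses proportional allocation (other allocation rules exist).

```python
import random


def simple_random_sample(frame, n, rng):
    """Every unit in the frame has the same chance of selection."""
    return rng.sample(frame, n)


def systematic_sample(frame, n, rng):
    """Take every k-th unit after a random start, with k = len(frame) // n."""
    k = len(frame) // n
    start = rng.randrange(k)
    return frame[start::k][:n]


def stratified_sample(frame, stratum_of, n, rng):
    """Proportional allocation: each stratum is sampled in proportion to its size.

    Rounding can make the total differ slightly from n.
    """
    strata = {}
    for unit in frame:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for units in strata.values():
        share = round(n * len(units) / len(frame))
        sample.extend(rng.sample(units, min(share, len(units))))
    return sample


# Hypothetical frame of 100 units; units 0-59 form stratum "A", the rest "B"
frame = list(range(100))
rng = random.Random(1)
print(simple_random_sample(frame, 10, rng))
print(systematic_sample(frame, 10, rng))
print(stratified_sample(frame, lambda u: "A" if u < 60 else "B", 10, rng))
```

With this frame, the stratified sample draws 6 units from stratum "A" and 4 from stratum "B".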

Tobler’s Law and Independence

Everything is related to everything else, but near things are more related than distant things.

• Sampled locations in close proximity are likely to have similar characteristics, thus they are unlikely to be independent

Spatial Patterns

• Point Pattern Analysis

- Location information

- Point data

• Geographic Patterns in Areal Data

- Attribute values

- Polygon representations

Point Pattern Analysis

[Figure: three example point patterns: Regular, Random, Clustered]

1. The Quadrat Method

2. Nearest Neighbor Analysis

The Quadrat Method

1. Divide a study region into m cells of equal size

2. Find the mean number of points per cell ($\bar{x}$)

3. Find the variance of the number of points per cell ($s^2$):

$$s^2 = \frac{\sum_{i=1}^{m} (x_i - \bar{x})^2}{m - 1}$$

where $x_i$ is the number of points in cell i

4. Calculate the variance to mean ratio (VMR):

$$\mathrm{VMR} = \frac{s^2}{\bar{x}}$$

5. Interpret VMR

- VMR < 1: Regular (uniform)

- VMR = 1: Random

- VMR > 1: Clustered

6. Test the significance of the variance to mean ratio (VMR)

$$\chi^2 = \frac{(m - 1)\, s^2}{\bar{x}} = (m - 1) \times \mathrm{VMR}$$

comparing the test statistic to critical values from the $\chi^2$ distribution with df = (m - 1)
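The quadrat steps above can be sketched as a short function. This is a minimal illustration, assuming the per-cell point counts have already been tallied; the function name `quadrat_vmr` and both count lists are hypothetical. The critical value for the chi-square comparison is not computed here and would come from a table or a statistics library.

```python
def quadrat_vmr(counts):
    """Return (mean, variance, VMR, chi-square statistic, degrees of freedom)."""
    m = len(counts)
    mean = sum(counts) / m
    if mean == 0:
        raise ValueError("no points in any cell; VMR is undefined")
    # Sample variance with m - 1 in the denominator, as in step 3
    variance = sum((x - mean) ** 2 for x in counts) / (m - 1)
    vmr = variance / mean
    chi_square = (m - 1) * vmr
    return mean, variance, vmr, chi_square, m - 1


# Hypothetical counts: all ten points fall in a single cell (clustered)
clustered = [0, 0, 0, 0, 10, 0, 0, 0, 0, 0]
# Hypothetical counts: exactly one point in every cell (regular)
regular = [1] * 10

print(quadrat_vmr(clustered))
print(quadrat_vmr(regular))
```

For the clustered counts the VMR is well above 1, and for the regular counts it is 0, matching the interpretation in step 5.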

Quadrat Method Example

The Effect of Quadrat Size

• Quadrat size

- Too small: empty cells

- Too large: miss patterns that occur within a single cell

• Suggested optimal sizes

- either 2 points per cell (McIntosh, 1950)

- or 1.6 points/cell (Bailey and Gatrell, 1995)
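The suggested sizes can be turned into a cell dimension once the study area and point count are known. The sketch below assumes square quadrats, which the slides do not specify; the function name `suggested_quadrat_side` and the example numbers are hypothetical.

```python
import math


def suggested_quadrat_side(study_area, n_points, points_per_cell=2.0):
    """Side length of a square quadrat expected to hold `points_per_cell` points."""
    # Cell area that would contain the target number of points on average
    cell_area = points_per_cell * study_area / n_points
    return math.sqrt(cell_area)


# Hypothetical study region: area of 100 square units containing 50 points
print(suggested_quadrat_side(100, 50))        # 2 points per cell -> 2.0
print(suggested_quadrat_side(100, 50, 1.6))   # 1.6 points per cell
```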

2. Nearest Neighbor Analysis

• An alternative approach

- the distance between any given point and its nearest neighbor

• The average distance between neighboring points ($R_O$):

$$R_O = \frac{\sum_{i=1}^{n} d_i}{n}$$

The Nearest Neighbor Statistic

• Expected distance:

$$R_E = \frac{1}{2\sqrt{\lambda}}$$

where $\lambda$ is the number of points per unit area

• Nearest neighbor statistic (R):

$$R = \frac{R_O}{R_E} = \frac{\bar{x}}{1 / (2\sqrt{\lambda})}$$

where $\bar{x}$ is the average observed distance $d_i$

• Values of R:

- 0: all points are coincident

- 1: a random pattern

- 2.1491: a perfectly uniform pattern
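The nearest neighbor statistic can be computed directly from point coordinates. This is a minimal sketch that assumes planar (x, y) coordinates, Euclidean distance, and a known study area; the function names and the four example points are hypothetical, and the brute-force distance search is only suitable for small point sets.

```python
import math


def nearest_neighbor_distances(points):
    """Distance from each point to its nearest neighbor (brute force)."""
    return [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]


def nearest_neighbor_statistic(points, area):
    """Return (observed mean distance, expected mean distance, R)."""
    n = len(points)
    r_obs = sum(nearest_neighbor_distances(points)) / n
    density = n / area                      # points per unit area
    r_exp = 1 / (2 * math.sqrt(density))    # expected distance for a random pattern
    return r_obs, r_exp, r_obs / r_exp


# Hypothetical example: four evenly spaced points in a 2 x 2 region
grid = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5)]
print(nearest_neighbor_statistic(grid, area=4))
```

For this evenly spaced layout R comes out at 2, toward the uniform end of the scale.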

Interpreting the Nearest Neighbor Statistic

• Through the examination of many random point patterns, the variance of the mean distances between neighbors has been found to be:

$$V[R_E] = \frac{4 - \pi}{4 \pi \lambda n}$$

where n is the number of points and $\lambda$ is the number of points per unit area

• Test statistic:

$$Z_{test} = \frac{R_O - R_E}{\sqrt{V[R_E]}} = \frac{R_O - R_E}{\sqrt{(4 - \pi) / (4 \pi \lambda n)}} = 3.826\,(R_O - R_E)\sqrt{\lambda n}$$

• Standard normal distribution

Nearest Neighbor Analysis Example

• Observed mean distance ($R_O$):

$$R_O = (1 + 1 + 2 + 3 + 3 + 3) / 6 = 13 / 6 = 2.167$$

• Expected mean distance ($R_E$):

$$R_E = \frac{1}{2\sqrt{\lambda}} = \frac{1}{2\sqrt{6/42}} = 1.323$$

and use these values to calculate the nearest neighbor statistic (R):

$$R = R_O / R_E = 2.167 / 1.323 = 1.638$$

• Because R is greater than 1, this suggests the points are somewhat uniformly spaced

Z-test for the Nearest Neighbor Statistic Example

• Research question: Is the point pattern random?

1. $H_0$: $R_O \approx R_E$ (Point pattern is approximately random)

2. $H_A$: $R_O \neq R_E$ (Pattern is uniform or clustered)

3. Select $\alpha = 0.05$, two-tailed because $H_A$ is non-directional

4. We have already calculated $R_O$ and $R_E$, and together with the sample size (n = 6) and the number of points per unit area ($\lambda = 6/42$), we can calculate the test statistic:

$$Z_{test} = 3.826\,(R_O - R_E)\sqrt{\lambda n} = 3.826\,(2.167 - 1.323) \times \sqrt{(6/42) \times 6} \approx 2.99$$

5. For $\alpha = 0.05$ and a two-tailed test, $Z_{crit} = 1.96$

6. $Z_{test} > Z_{crit}$, therefore we reject $H_0$ and accept $H_A$, finding that the point pattern is significantly different from a random point pattern; more specifically, it tends towards a uniform pattern because it exceeds the positive $Z_{crit}$ value
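The test above can be checked numerically. This is a minimal sketch, assuming the standard variance for the expected nearest neighbor distance, (4 - pi) / (4 * pi * lambda * n), and taking the density of 6 points over 42 area units from the expected-distance calculation in the example. The function name `nearest_neighbor_z` is hypothetical.

```python
import math


def nearest_neighbor_z(r_obs, n, area):
    """Z statistic comparing the observed mean nearest neighbor distance
    with the value expected under a random pattern."""
    density = n / area                                   # points per unit area
    r_exp = 1 / (2 * math.sqrt(density))
    variance = (4 - math.pi) / (4 * math.pi * density * n)
    return (r_obs - r_exp) / math.sqrt(variance)


# Values from the worked example: six distances summing to 13, 42 area units
z = nearest_neighbor_z(r_obs=13 / 6, n=6, area=42)
print(round(z, 2))   # about 2.99
print(z > 1.96)      # exceeds the two-tailed critical value at alpha = 0.05
```

Working from unrounded inputs gives a value very close to the hand calculation, and the conclusion (reject the null hypothesis) is the same.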
