Sampling and Monitoring of Environmental Data
Md. Abdus Salam, Professor
Department of Statistics, Jahangirnagar University
Savar, Dhaka
Components of the environment
• Water, air, soil, biota
1. Water
• Industrial wastage
• Marine pollution
• Urban runoff
• Water crisis
• Waste water
2. Air
• Climate change
• Global warming
• Sea level rise
• Greenhouse gas
• Indoor air quality
• Volatile organic compound
• Particulate matter
Components of the environment
3. Soil
– Soil conservation
– Soil erosion
– Soil contamination
– Urban sprawl
– Habitat destruction
4. Biota
– Conservation
– Species extinction
– Endangered species
– Poaching
Sampling
– Sampling consists of the selection, acquisition, and quantification of a part of the population.
– Selection and acquisition apply to the physical sampling units of the population.
– Quantification pertains only to the variable of interest, which is a particular characteristic of the sampling units.
– A sampling procedure is expected to provide a sample that is representative with respect to some specified criteria.
Characteristics of Environmental Sampling
– Selection and acquisition of sampling units is cheap.
– The environmental variables of interest are precise chemical or biological characteristics of materials.
– Quantification of these chemical and biological characteristics is highly expensive and time-consuming.
Inaccessible and Sensitive Data
Composite Sampling
Full retesting
• We will need either one test (if the composite is negative) or n + 1 tests (if it is positive).
• When p, the probability that an individual item is positive, is small, this can be a highly economical approach.
• We need, on average, $(n + 1) - n(1 - p)^n$ tests.
• If p = 0.0005 and n = 20, just 1.2 tests are required on average.
Test all n (one composite test)
  IF negative: End
  IF positive: Test all items separately
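The expected number of tests under full retesting can be checked numerically. The sketch below assumes that items are positive independently of one another with the same probability p; the function name is illustrative only.

```python
def expected_tests_full_retesting(n: int, p: float) -> float:
    """Expected number of tests when a composite of n items is tested once
    and, only if it is positive, every item is retested individually.

    Assumes each item is positive independently with probability p.
    """
    prob_composite_negative = (1.0 - p) ** n
    # One test with probability (1 - p)^n, otherwise n + 1 tests:
    # 1 * q + (n + 1) * (1 - q) = (n + 1) - n * q, with q = (1 - p)^n.
    return (n + 1) - n * prob_composite_negative


if __name__ == "__main__":
    # Values quoted on the slide: p = 0.0005, n = 20.
    print(round(expected_tests_full_retesting(20, 0.0005), 2))  # 1.2
```

Compared with the 20 tests needed to measure every item individually, the saving is large whenever p is small.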
Composite sampling
• Group retesting
Test all n
  IF negative: End
  IF positive: Test groups n1, n2 and n3 separately
    Test group n1
      IF negative: End
      IF positive: Test all n1
    Test group n2
      IF negative: End
      IF positive: Test all n2
    Test group n3
      IF negative: End
      IF positive: Test all n3
Composite Sampling
Cascading
Test all n
  IF negative: End
  IF positive: Test each half (n/2)
    IF negative: End
    IF positive: Test each quarter (n/4)
      IF negative: End
      IF positive: Test in groups of n/8, etc.
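The cascading scheme can be sketched as a recursive count of the composite tests used. This is a minimal illustration, assuming that a composite test is positive exactly when at least one of its items is positive and that groups are halved until single items remain; the function name and the example data are hypothetical.

```python
from typing import Sequence


def cascade_tests(items: Sequence[bool]) -> int:
    """Count the tests used by a halving cascade.

    items[i] is True when item i has the characteristic being tested for.
    A composite test is assumed positive if any of its items is positive.
    """
    tests = 1  # one test on the composite of the current group
    if not any(items) or len(items) == 1:
        # Negative composite, or a single item that is now fully resolved.
        return tests
    half = len(items) // 2
    # Positive composite: split in two and test each half in turn.
    return tests + cascade_tests(items[:half]) + cascade_tests(items[half:])


if __name__ == "__main__":
    one_positive = [False, False, False, False, False, True, False, False]
    print(cascade_tests(one_positive))   # 7
    print(cascade_tests([False] * 8))    # 1
```

With one positive among eight items the cascade uses 7 tests, against 9 for full retesting of the same composite.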
Composite sampling for continuous Variables
• X is a continuous variable; for example, X may measure pollution levels in a river.
• We want to know whether any observed $x_i$ in a sample of size n is an illegally high value, above a standard $x_H$.
• Thus characteristic A is now defined by, say, $x_i > x_H$.
• If we measure X for a composite sample, its value will be the sum of the constituent $x_i$.
• Suppose the value of X for the composite sample is x, and put $\bar{x} = x/n$, which is the equivalent of the sample mean of the distinct values making up the composite sample.
• If any $x_i > x_H$, then for the whole sample we would be bound to have $\bar{x} > x_H/n$.
• This reflects the fact that even in the minimal case of violation, where all but one of the $x_i$ are zero and the remaining one is just a little larger than $x_H$, we would still have $\bar{x} > x_H/n$.
• The condition $\bar{x} > x_H/n$ has been proposed as a basis for declaring that the composite sample indicates a violation. It is known as the “rule of n”.
• Under the “rule of n”, composite sampling proceeds as follows:
• If $\bar{x} \le x_H/n$, we declare all observations to be satisfactory.
• If $\bar{x} > x_H/n$, we need to retest all sample members.
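The “rule of n” decision can be written as a short check. The sketch below assumes non-negative measurements and a composite value equal to the sum of its constituents; the function name and the numbers are made up for illustration.

```python
from typing import Sequence


def rule_of_n_requires_retest(composite_value: float, n: int, x_h: float) -> bool:
    """Return True when the composite of n units must be retested.

    composite_value is the sum of the n constituent values, x_h the standard.
    The rule flags a possible violation when the mean exceeds x_h / n.
    """
    mean = composite_value / n
    return mean > x_h / n


if __name__ == "__main__":
    x_h = 10.0  # hypothetical legal limit
    samples: Sequence[Sequence[float]] = [
        [1.0, 2.0, 0.5, 1.5],   # no violation, composite passes
        [1.0, 2.0, 12.0, 1.0],  # one value above the limit, composite fails
        [3.0, 3.0, 3.0, 3.0],   # no violation, but composite still fails
    ]
    for values in samples:
        print(values, rule_of_n_requires_retest(sum(values), len(values), x_h))
```

The third case shows that the rule is conservative: it never misses a violation, but it can trigger retesting when every individual value is below the standard.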
RANKED-SET SAMPLING
• One of the major activities in environmental statistics is that of obtaining relevant data for statistical investigation.
• We have to face the problem that circumstances often do not allow data to be obtained by means of a census, classical sample surveys or designed experiments.
• Quite often we have to take what limited data are to hand (‘encountered data’), which may be difficult to analyze using formal methods.
• If we are to collect even limited data for our purposes, we may need to abandon such hallowed principles as strict randomization, not only in view of access constraints but also to contain costs and to improve efficiency.
• In many areas of environmental risk, such as radiation or pollution, we commonly find that the taking of measurements can involve substantial scientific processing of materials and correspondingly high attendant costs.
• We therefore need to look for highly efficient procedures.
• One way of doing this is to use what is known as ranked-set sampling.
– Example: we wish to estimate as basic a quantity as a population mean.
– Suppose we are interested in the mean pollution level of the bathing water around an inland lake used for recreational purposes.
– We might decide to take a random sample of modest size at regular intervals of time and to use the sample mean to estimate the mean pollution level of the lake on each occasion.
– But with the attendant costs of even such a simple monitoring process, it is desirable to keep the sample as small (and cheap) as possible while achieving the desired level of assurance.
Ranked-set Sampling
• Ranked-set sampling could operate in the following way.
• If we want a sample of size 5, we would choose five sites at random,
• but rather than measuring pollution at each of them we would ask a local expert which one would be likely to give the largest value, and measure only that site.
• Alternatively, we could choose the candidate for the highest value by means of a cheaply observed concomitant variable, such as the opacity of the water.
• We then repeat the process by selecting a second random set of five sites,
• again with an expert to guide us,
• and this time seeking to measure the second-largest pollution level amongst these,
• and so on, until we seek the lowest pollution level in the final random set of five sites.
• The resulting ranked-set sample of size 5 is then used for the estimation of the mean.
• Such an approach can also be used to estimate a measure of dispersion or a quantile, or even to carry out a test of significance or to fit a regression model.
• The gain can be dramatic: the sample mean is unbiased and efficiencies relative to simple random sampling may reach 300%.
Ranked-set Sample Mean
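The n random sets, each of size n, are

$$
\begin{array}{cccc}
x_{11}, & x_{12}, & \ldots, & x_{1n} \\
x_{21}, & x_{22}, & \ldots, & x_{2n} \\
\vdots & & & \\
x_{n1}, & x_{n2}, & \ldots, & x_{nn}
\end{array}
$$

The ranked-set sample is then defined as

$$x_{1(1)},\; x_{2(2)},\; \ldots,\; x_{n(n)}$$

where $x_{i(i)}$ is the value ranked i-th in the i-th set.

The mean should be estimated as

$$\bar{x}_{RSS} = \frac{1}{n} \sum_{i=1}^{n} x_{i(i)}$$

It can be shown that $\bar{X}_{RSS}$ is unbiased and that

$$\mathrm{Var}(\bar{X}_{RSS}) \le \mathrm{Var}(\bar{X})$$

where $\bar{X}$ is the mean of a simple random sample of the same size n.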
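The variance comparison for the ranked-set sample mean can be explored with a small simulation. The sketch below makes several assumptions that are not in the slides: measurements are drawn from a normal distribution with made-up parameters, and ranking within each set is perfect (it uses the true values in place of an expert or a concomitant variable). Sets are ranked from smallest to largest, which yields the same kind of sample as working down from the largest.

```python
import random
import statistics
from typing import Callable, List, Tuple


def ranked_set_sample(draw: Callable[[random.Random], float],
                      n: int, rng: random.Random) -> List[float]:
    """Draw n random sets of size n and keep the i-th smallest of the i-th set.

    Ranking uses the true values, i.e. perfect judgement ranking is assumed.
    """
    sample = []
    for i in range(n):
        candidate_set = sorted(draw(rng) for _ in range(n))
        sample.append(candidate_set[i])  # only this unit would be measured
    return sample


def compare_variances(n: int = 5, reps: int = 2000,
                      seed: int = 1) -> Tuple[float, float]:
    """Return (variance of RSS means, variance of SRS means) over reps runs."""
    rng = random.Random(seed)

    def draw(r: random.Random) -> float:
        return r.gauss(10.0, 2.0)  # hypothetical pollution levels

    rss_means = [statistics.fmean(ranked_set_sample(draw, n, rng))
                 for _ in range(reps)]
    srs_means = [statistics.fmean([draw(rng) for _ in range(n)])
                 for _ in range(reps)]
    return statistics.variance(rss_means), statistics.variance(srs_means)


if __name__ == "__main__":
    var_rss, var_srs = compare_variances()
    print(f"Var(RSS mean) = {var_rss:.3f}, Var(SRS mean) = {var_srs:.3f}")
```

Both estimators use five measured values per run; the ranked-set version additionally relies on the cheap ranking of 25 candidate sites.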
Sampling in the Wild
• These are sampling methods which are particularly suitable for examining living things.
• The sampling techniques are: 1. quadrat sampling, 2. capture-recapture or mark-recapture, 3. transect sampling and 4. adaptive sampling.
• Quadrat Sampling
• Mainly used for ecological studies.
• Suppose we wish to count the numbers of one, or of several, species of plant in a meadow (to estimate population size or assess biodiversity).
• We might throw a quadrat at random and do our count, or counts, within its boundary.
• For aquatic wildlife, we might cast a net of given size into a pond, river, or sea and count what it trawls.
• A quadrat is usually a square (or round) metal frame of a meter or several meters side (or diameter).
• Where it lands defines the search area in which we take appropriate measures of numbers of individual plants, biomass or extent of ground cover.
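One simple way to turn quadrat counts into a population estimate is to scale the mean count per quadrat up to the area of the whole region. The slides do not give a formula, so the sketch below is an assumption: it presumes quadrats of equal area placed at random, and the function name and figures are hypothetical.

```python
from typing import Sequence


def quadrat_population_estimate(counts: Sequence[int],
                                quadrat_area: float,
                                region_area: float) -> float:
    """Estimate total population by scaling the mean quadrat count.

    counts       -- number of individuals found in each quadrat throw
    quadrat_area -- area of one quadrat (same units as region_area)
    region_area  -- area of the whole study region
    """
    if not counts:
        raise ValueError("at least one quadrat count is required")
    # (mean count per quadrat / quadrat area) * region area
    return sum(counts) * region_area / (len(counts) * quadrat_area)


if __name__ == "__main__":
    # Five throws of a 1 m^2 quadrat in a 2000 m^2 meadow (made-up data).
    print(quadrat_population_estimate([3, 0, 5, 2, 4], 1.0, 2000.0))
```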
Recapture Sampling
• A wide range of sampling methods are based on the principle of initially ‘capturing’ and ‘marking’ a sample of the members of a closed (finite) population and subsequently observing, in a later (or separate) independent random sample drawn from the population, how many marked individuals are obtained.
• The term capture-recapture is usually used for animals or insects, while mark-recapture is often reserved for plants.
• The sample information is then used to infer characteristics of the overall population, principally its total size.
• In the simplest approach to capture-recapture we assume randomness of the samples, with constant capture probabilities in a fixed population (no births, deaths, etc.) and no capture-related effects (such as individuals being ‘trap shy’ or ‘trap happy’, or marks being lost).
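• The Petersen estimator of the population size N is

$$\hat{N} = \frac{nm}{m'}$$

where n is the initial sample size, m is the size of the second random sample, and m' is the number of originally marked individuals contained in the second sample.

• The variance of $\hat{N}$ is approximately

$$\mathrm{Var}(\hat{N}) \approx \frac{nm(n - m')(m - m')}{(m')^3}$$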
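The Petersen estimate and its approximate variance are easy to compute directly. The following sketch uses made-up survey numbers; the function and argument names are illustrative.

```python
from typing import Tuple


def petersen_estimate(n: int, m: int, m_marked: int) -> Tuple[float, float]:
    """Return (N_hat, approximate variance of N_hat).

    n        -- size of the initial sample, all of which are marked
    m        -- size of the second random sample
    m_marked -- number of marked individuals found in the second sample
    """
    if m_marked <= 0:
        raise ValueError("the estimator is undefined when no marked "
                         "individuals are recaptured")
    n_hat = n * m / m_marked
    var_hat = n * m * (n - m_marked) * (m - m_marked) / m_marked ** 3
    return n_hat, var_hat


if __name__ == "__main__":
    # Hypothetical survey: 100 marked, 80 caught later, 20 of them marked.
    n_hat, var_hat = petersen_estimate(100, 80, 20)
    print(n_hat, var_hat)  # 400.0 4800.0
```

In this example the standard error is about 69 animals on an estimate of 400, which illustrates how imprecise the estimate is when few marked individuals are recaptured.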
TRANSECT SAMPLING
• Transect sampling methods were also developed principally for biological applications, with the aim of estimating the density, or the number, of a species of animal, fish, or plant distributed over a geographic region.
• In its simplest form, known as line-transect sampling, a line is drawn at random across a search region and the objects (be they tigers or anything else) are sought by moving along the line and noting how many of the target objects are observed as one goes from one end of the line to the other.
• Several lines may be drawn and the sample data accumulated from traversing all the lines.
• Estimation of the abundance of animals or plants is known to be difficult and time-demanding, and line-transect sampling is as efficient and effective an approach as is likely to be found for most types of problem.
• Assumptions: the objects are assumed
• to be stationary (i.e. not moving);
• to be similarly and independently able to be observed;
• to be seen at right angles to the transect line;
• to be seen on one occasion only;
• to be unaffected (e.g. neither repelled nor attracted) by the observer.
• Other possible modifications include the prospects of:
• observing objects other than at right angles to the transect line (we may see them ahead of us);
• taking into account the fact that larger objects may be more visible than smaller ones at any specific distance;
• deciding to take observations in all directions from a single point rather than by traversing a transect line.
• The last approach is known as point-transect sampling.
Point Transect Sampling
• Several points may be chosen, and a period of time is then spent at each point recording all observed objects of the type sought.
• This can be a particularly useful approach for birds and elusive animals.
• The Simplest Case: Strip Transects
• Suppose a line transect of length l is chosen at random, extending over an observation region which contains individual specimens of some object of interest distributed at random with density $D$.
• Thus if we observe n objects, our estimate of the density of the objects over the region is

$$\hat{D} = \frac{n}{2wl}$$

where w is the distance searched on either side of the line, so that the strip has area 2wl.
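The strip-transect estimate is the observed count divided by the area of the strip searched. The sketch below assumes that w is the distance searched on each side of the line, so the strip area is 2wl, and that every object inside the strip is detected; the names and numbers are hypothetical.

```python
def strip_transect_density(n_observed: int, half_width: float,
                           length: float) -> float:
    """Estimated density: objects seen divided by the strip area 2 * w * l.

    half_width -- w, distance searched on each side of the line (assumed)
    length     -- l, length of the transect line (same length units as w)
    """
    strip_area = 2.0 * half_width * length
    if strip_area <= 0:
        raise ValueError("half_width and length must be positive")
    return n_observed / strip_area


if __name__ == "__main__":
    # 12 objects seen along a 500 m line, searching 10 m to either side.
    density = strip_transect_density(12, 10.0, 500.0)
    print(f"{density:.4f} objects per square meter")
```

Multiplying the estimated density by the area of the whole region gives an estimate of the total number of objects.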
ADAPTIVE SAMPLING
• In a number of sampling situations, field researchers carrying out the survey may feel an inclination to adaptively increase sampling effort in the vicinity of observed values that are high or otherwise interesting.
• Adaptive cluster sampling refers to designs in which an initial set of units is selected by some probability sampling procedure (with or without replacement) and, whenever the variable of interest of a selected unit satisfies a given criterion, additional units in the neighborhood of that unit are added to the sample.
• For the sorts of situations in which field researchers feel the inclination to depart from the preselected sample plan and add nearby or associated units to the sample, adaptive cluster sampling accommodates that inclination almost completely.
• Consider a survey of a rare and endangered bird species in which observers record the number of individuals of the species seen or heard at sites or units within the study area.
• At many of the sites selected for observation, zero abundance may be observed.
• But whenever substantial abundance is encountered, observation of neighboring sites is likely to reveal additional concentrations of individuals of the species.
• Such patterns of clustering are encountered with many types of animals, from whales to insects, with vegetation types from trees to lichens, and with mineral and fossil fuel resources.
• A related pattern is found in epidemiological studies of rare, contagious diseases.
• Whenever an infected individual is encountered, the addition to the sample of closely associated individuals reveals a higher-than-expected rate of infection.
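The unit-selection step of adaptive cluster sampling can be sketched on a grid of sites. The code below is only an illustration under stated assumptions: the study area is a rectangular grid of counts, the neighborhood of a unit is its four adjacent cells, and the criterion is "count at least `threshold`". All names and data are hypothetical, and the sketch covers selection only, not estimation.

```python
import random
from typing import Iterable, List, Sequence, Set, Tuple

Unit = Tuple[int, int]  # (row, column) of a site in the grid


def random_initial_units(rows: int, cols: int, size: int,
                         seed: int = 0) -> List[Unit]:
    """Simple random sample of grid units, without replacement."""
    rng = random.Random(seed)
    units = [(r, c) for r in range(rows) for c in range(cols)]
    return rng.sample(units, size)


def adaptive_cluster_sample(grid: Sequence[Sequence[int]],
                            initial_units: Iterable[Unit],
                            threshold: int) -> Set[Unit]:
    """Return every unit observed, starting from initial_units.

    Whenever an observed unit has a count >= threshold, its four
    neighbors are added to the sample and examined in turn.
    """
    rows, cols = len(grid), len(grid[0])
    to_visit = list(initial_units)
    sampled: Set[Unit] = set()
    while to_visit:
        r, c = to_visit.pop()
        if (r, c) in sampled:
            continue
        sampled.add((r, c))
        if grid[r][c] >= threshold:
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    to_visit.append((nr, nc))
    return sampled


if __name__ == "__main__":
    bird_counts = [
        [0, 0, 0, 0],
        [0, 5, 3, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
    ]
    start = random_initial_units(4, 4, size=3)
    print(sorted(adaptive_cluster_sample(bird_counts, start, threshold=1)))
```

Because units with high counts pull their neighbors into the sample, the plain sample mean of the observed units is generally biased, so estimation needs an estimator designed for the adaptive design.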