Excursions in Modern Mathematics
Sixth Edition
Peter Tannenbaum
Chapter 13: Collecting Statistical Data
Censuses, Surveys, and Clinical Studies
Outline/Learning Objectives
To identify whether a given survey or poll is
biased.
To list and discuss the quality of several
sampling methods.
To identify components of a well-constructed
clinical study.
To define key terminology in the data collection
process.
To estimate the size of a population using the
capture-recapture method.
13.1 The Population
Population
Every statistical statement refers, directly or indirectly, to some group of individuals or objects. That group is the population.
N-value
The number of individuals or objects in a given population, that is, the answer to the question “How many individuals or objects are there in that population?”
Census
The process of collecting data by going
through every member of the population.
Over the last 45 years, the United States Fish and Wildlife
Service has been able to keep a remarkably accurate tally of the
number of bald eagle breeding pairs in the lower 48 states.
A tremendous amount of effort has gone into collecting and
verifying these N-values, which, for a wildlife population, are
of remarkable accuracy. The above figure summarizes the
population numbers over the period 1963-2000.
Census
Mandated by the U.S. Constitution
Must be carried out every 10 years
Adjusted by the 14th Amendment and courts
Plagued by undercounts
1999 Supreme Court ruling: statistical sampling may not be used to apportion seats in the House of Representatives
13.2 Sampling
Survey
The practical alternative to a census: collect data only from some members of the population and use that data to draw conclusions and make inferences about the entire population.
Poll
A survey in which the data collection is done by asking questions.
Sample
The subgroup chosen to provide the data.
Sampling
The act of selecting a sample.
Target population
The population to which the conclusions of the
survey apply. Identifying it clearly is the most
important step in a survey.
Sampling frame
The actual subset of the population from which
the sample will be drawn.
Are Polls Always Right?
The 1936 U.S. presidential election pitted
Alfred Landon against incumbent
Franklin D. Roosevelt.
Public Opinion Polls
Selection bias
When the choice of the sample has a built-in tendency
to exclude a particular group or characteristic within
the population.
Response rate
The percentage of respondents out of the total sample.
Nonresponse bias
The bias that results when the response rate to a survey is low,
so that those who respond may not represent the whole sample.
Convenience Sampling
In convenience sampling the selection of which
individuals are in the sample is dictated by what
is easiest for the data collector.
A classic example is when interviewers set up
at a fixed location such as a mall or outside a
supermarket and ask passersby to be a part of
a public opinion poll.
Quota Sampling
Quota sampling is a systematic effort to force the sample to be representative of a given population through the use of quotas: the sample should have so many women, so many men, so many blacks, so many whites, so many people living in urban areas, so many people living in rural areas, and so on.
13.3 Random Sampling
Random sampling
Sampling methods that use randomness as
part of their design.
Random sample
Any sample obtained through random
sampling.
Simple Random Sampling
Based on the same principle as a lottery.
Any set of numbers of a given size has the
same chance of being chosen as any other set
of numbers of that size.
Implementing it on a national level is very costly
and would take a great deal of time.
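The lottery principle can be sketched in a few lines of Python. This is only an illustration: the sampling frame of ID numbers 1 to 1000, the sample size of 25, and the helper name `simple_random_sample` are all assumptions, not part of the original slides.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct members of the frame.

    random.Random.sample chooses without replacement, so every
    subset of size n is as likely to be drawn as any other.
    """
    rng = random.Random(seed)
    return rng.sample(list(frame), n)

# Hypothetical sampling frame: ID numbers 1 through 1000.
frame = range(1, 1001)
sample = simple_random_sample(frame, 25, seed=13)
print(sorted(sample))
```

The `seed` argument is there only to make a run repeatable; a real drawing would leave it unset.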
Stratified Sampling
An alternative to simple random sampling.
Break the sampling frame into categories,
called strata, and then randomly choose a
sample from these strata.
Gallup polls are based on this method.
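One simple reading of this idea is to draw a separate random sample inside each stratum. The sketch below assumes exactly that; the "urban"/"rural" strata, their sizes, and the function name `stratified_sample` are hypothetical and chosen only for illustration. Real polling organizations use more elaborate multi-stage designs.

```python
import random

def stratified_sample(strata, sizes, seed=None):
    """Draw a random sample of the requested size from each stratum.

    strata: dict mapping stratum name -> list of members
    sizes:  dict mapping stratum name -> how many members to draw
    """
    rng = random.Random(seed)
    chosen = {}
    for name, members in strata.items():
        chosen[name] = rng.sample(members, sizes[name])
    return chosen

# Hypothetical sampling frame split into two strata.
strata = {
    "urban": [f"U{i}" for i in range(600)],
    "rural": [f"R{i}" for i in range(400)],
}
# Sample sizes proportional to stratum size (5% of each).
sizes = {"urban": 30, "rural": 20}

chosen = stratified_sample(strata, sizes, seed=13)
for name, members in chosen.items():
    print(name, len(members), members[:3])
```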
13.4 Sampling: Terminology and Key Concepts
Statistic
Any kind of numerical information drawn from a sample. A statistic is always an estimate for some unknown measure of the population.
Parameter
The unknown measure of the population that a statistic estimates.
Sampling error
The difference between a parameter and a statistic used to estimate that parameter.
Chance error
The result of the basic fact that a sample,
being just a sample, can only give us
approximate information about the population.
Sampling variability
Different samples are likely to produce different
statistics for the same population, even when
the samples are chosen in exactly the same
way.
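A small simulation can make sampling variability concrete. Everything here is assumed for illustration: a made-up population of 1000 individuals in which 30% have some trait, samples of size 50, and five repetitions. The parameter is known only because the population is invented; in practice it is the unknown being estimated.

```python
import random

def proportion(values):
    """Fraction of 1s in a list of 0/1 values."""
    return sum(values) / len(values)

rng = random.Random(2024)

# Hypothetical population: 300 individuals with the trait (1), 700 without (0).
population = [1] * 300 + [0] * 700
parameter = proportion(population)

# Draw five samples in exactly the same way and record each statistic.
statistics = []
for _ in range(5):
    sample = rng.sample(population, 50)
    statistics.append(proportion(sample))

for s in statistics:
    # Sampling error: difference between the statistic and the parameter.
    print(f"statistic = {s:.2f}, sampling error = {s - parameter:+.2f}")
```

Because every sample is drawn the same random way, the differences between the printed statistics are due to chance error alone, not to sample bias.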
Sample bias
The result of choosing a bad sample; a much
more serious problem than chance error.
Sample proportion
The size of the sample is denoted by n (to
contrast with N, the size of the population).
The ratio n/N is the sample proportion.
Example 1
As part of a sixth-grade statistics project, the
teacher brings to class a candy jar full of
gumballs of two different colors: red and green.
The assignment is to estimate the proportion of
red gumballs in the jar. To do this, the jar is
shaken well, and one of the students draws 25
gumballs from the jar. Of these, 8 are red and
17 are green.
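The statistic this sample yields is the fraction of red gumballs among the 25 drawn. A minimal sketch of that arithmetic, using only the counts given above (the variable names are illustrative):

```python
from fractions import Fraction

red, green = 8, 17
n = red + green                  # sample size: 25 gumballs drawn

red_share = Fraction(red, n)     # statistic: proportion of red in the sample
print(red_share, float(red_share))   # 8/25 0.32
```

So the sample suggests that roughly 32% of the gumballs in the jar are red; the true proportion in the jar (the parameter) remains unknown.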
In 1988, “Dear Abby” concluded that the
amount of cheating among married couples
is much less than people believe.

Status       Women      Men
Faithful     127,318    44,807
Unfaithful    22,468    15,747
Total        149,786    60,550
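The counts in the table can be turned into percentages of respondents. The sketch below uses the "Unfaithful" and "Total" rows exactly as they appear in the table; the variable names are illustrative. Note that these figures describe only the people who responded.

```python
# Counts taken from the table above.
unfaithful = {"women": 22_468, "men": 15_747}
total = {"women": 149_786, "men": 60_550}

pct_unfaithful = {
    group: 100 * unfaithful[group] / total[group]
    for group in unfaithful
}

for group, pct in pct_unfaithful.items():
    print(f"{group}: {pct:.1f}% unfaithful")
# women: 15.0% unfaithful
# men: 26.0% unfaithful
```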
13.5 The Capture-Recapture Method
The Capture-Recapture Method
Step 1. Capture (sample): Capture (choose) a sample of size n1, tag (mark, identify) the animals (objects, people), and release them back into the general population.
Step 2. Recapture (resample): After a certain period of time, capture a new sample of size n2, and take an exact head count of the tagged individuals. Let’s call this number k.
Step 3. Estimate: The N-value of the population can be estimated as approximately (n1 × n2)/k.
Small Fish in a Big Pond
A large pond is stocked with catfish. As part of a research project we need to estimate the number of catfish in the pond.
Step 1. For our first sample we capture a predetermined number n1 of catfish, say n1 = 200. The fish are tagged and released unharmed back in the pond.
Step 2. After giving enough time for the released fish to
mingle and disperse throughout the pond, we capture a
second sample of n2 catfish. While n2 does not have to
equal n1, it is a good idea for the two samples to be of
approximately the same order of magnitude. Let’s say
that n2 = 250.
Of the 250 catfish in the second sample, 35 have tags
(were part of the original sample).
The ratio of tagged fish in the second sample should be
approximately the same as the ratio of tagged fish in the pond:
35/250 ≈ 200/N
which in turn gives
N ≈ 200 × 250/35 ≈ 1428.57
A sensible conclusion is that there are
approximately N = 1400 catfish in the pond.
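The catfish calculation can be checked with a short sketch. The function name `capture_recapture_estimate` is hypothetical; the inputs are the values from the example (n1 = 200, n2 = 250, k = 35).

```python
def capture_recapture_estimate(n1, n2, k):
    """Estimate population size N from k/n2 ≈ n1/N, i.e. N ≈ n1 * n2 / k.

    n1: size of the first (tagged) sample
    n2: size of the second sample
    k:  number of tagged individuals found in the second sample
    """
    if k <= 0:
        raise ValueError("no tagged individuals recaptured; cannot estimate N")
    return n1 * n2 / k

estimate = capture_recapture_estimate(n1=200, n2=250, k=35)
print(round(estimate, 2))   # 1428.57
```

The estimate is only as good as the assumption behind it: that the tagged fish have mixed evenly with the rest of the pond by the time of the second capture.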
13.6 Clinical Studies
Clinical Studies Terminology
Clinical study (trial). Studies concerned with
determining whether a single variable or
treatment can cause a certain effect.
Confounding variables. All other possible
contributing causes that could produce the
same effect in a clinical study.
Controlled study. The subjects are divided
into two different groups.
– Treatment group. Subjects receiving the actual
treatment.
– Control group. Subjects that are not receiving any
treatment.
Randomized controlled study. The subjects
are assigned to the treatment group or the
control group randomly.
Placebo effect. A critical confounding
variable that follows from the generally accepted
principle that just the idea of getting a
treatment can produce positive results.
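Random assignment to the treatment and control groups, as in a randomized controlled study, can be sketched as follows. The subject labels, the even split into two groups, and the function name `randomize_subjects` are assumptions made for illustration.

```python
import random

def randomize_subjects(subjects, seed=None):
    """Randomly split subjects into a treatment group and a control group."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)          # random order decides group membership
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical study with 20 enrolled subjects.
subjects = [f"subject-{i:02d}" for i in range(1, 21)]
treatment, control = randomize_subjects(subjects, seed=13)

print("treatment:", sorted(treatment))
print("control:  ", sorted(control))
```

Letting chance decide who lands in which group is what keeps confounding variables from lining up systematically with the treatment.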
Placebo. A make-believe form of treatment:
a harmless pill, an injection of saline solution,
or any other fake type of treatment intended to
look like the real treatment.
Controlled placebo study. A controlled
study in which the subjects in the control group
are given a placebo.
Blind study. A study in which neither the members of the treatment group nor the members of the control group know to which of the two groups they belong.
Double-blind study. A controlled placebo study in which neither the subjects nor the scientists conducting the experiment know which subjects are in the treatment group and which are in the control group.
Conclusion
Census
Sample/ Survey/ Sample Bias
Simple Random/Stratified Sampling
Confounding Variables
Controlled Study