Chapter 4

4.1 Sampling and Surveys

Sampling

• When we want to know more information about an entire group of individuals called a population, we collect information from a smaller group called the sample.

• We can draw conclusions from the sample about the entire population.

Which is which?

• The student government at a high school surveys 100 students at the school to get their opinions about a change to the bell schedule.

• Population: ALL students at the school

• Sample: 100 students

Sample Survey

• Ice Cream: Sampling a tiny bit lets you know if you want a whole cone because the sample represents the whole very well!

• Exactly what population do you want to find out about?

• Exactly what do you want to measure?

Sample Survey

• Only call it a “sample survey” if it’s an organized plan to choose a sample that is representative of the entire population.

• Population can consist of people, animals, or things.

BAD sampling

• A convenience sample is a bad way to sample.

• Involves asking people who are “close by.”

• Provides unrepresentative data.

• Creates BIAS: using a method that will usually overestimate or underestimate the value we want to know.
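A small simulation can make the idea of bias concrete. The sketch below is only an illustration: the population, the sleep numbers, and the function names are all made up. It compares a convenience sample (only people "close by") with a simple random sample over many repetitions.

```python
import random
from statistics import mean

rng = random.Random(4)  # fixed seed so the run is repeatable

# Hypothetical population: nightly hours of sleep for 1,000 students.
# Assumption: the 200 students who are "close by" (say, in the library
# late at night) tend to sleep less than everyone else.
nearby = [rng.gauss(6.0, 0.5) for _ in range(200)]
others = [rng.gauss(7.5, 0.5) for _ in range(800)]
population = nearby + others
true_mean = mean(population)

def convenience_sample(n):
    """Ask only people who are close by."""
    return rng.sample(nearby, n)

def simple_random_sample(n):
    """Draw from the whole population by chance."""
    return rng.sample(population, n)

# Repeat each method 500 times and average the estimates.
conv_avg = mean(mean(convenience_sample(50)) for _ in range(500))
srs_avg = mean(mean(simple_random_sample(50)) for _ in range(500))

print(f"true mean:               {true_mean:.2f}")
print(f"convenience, on average: {conv_avg:.2f}")
print(f"SRS, on average:         {srs_avg:.2f}")
```

The convenience estimates miss in the same direction every time (that is bias), while the random-sample estimates center on the true value.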

BAD sampling

• Voluntary response sample is another bad method.

• People with STRONG opinions are more likely to respond on a voluntary basis.

• Calling/writing in to a talk show, etc.

CYU on page 211: 1. Convenience sample 2. Voluntary response sample (VRS)

GOOD sampling

• Simple Random Sample (SRS)

• Every individual, and every possible group of the chosen size, has an equal chance of being chosen.

• Ex: put everyone’s name on a slip of paper and draw names.

• Table of Random Digits (Table D in the back of the book) is a good way to sample this way. Read p.212 to find out how to use it.
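One common way to use a line of random digits is to give each individual a two-digit label, read the digits two at a time, and skip any label that is out of range or already chosen. The sketch below assumes that procedure; the digit string is made up for illustration (it is not copied from Table D), and the function name is hypothetical.

```python
import random

def srs_from_digits(digits, population_size, n):
    """Read two-digit groups left to right; keep labels 01..population_size,
    skipping out-of-range labels and repeats, until n labels are chosen."""
    chosen = []
    for i in range(0, len(digits) - 1, 2):
        label = int(digits[i:i + 2])
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == n:
            break
    return chosen

# Made-up line of random digits, for illustration only.
line = "07319945072288150360"
picked = srs_from_digits(line, population_size=30, n=4)
print(picked)  # [7, 22, 15, 3]

# Software can do the same job as names on slips of paper:
rng = random.Random(212)  # arbitrary fixed seed
by_software = rng.sample(range(1, 31), 4)
print(by_software)
```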

Stratified Random Sample

1. Break the population into smaller groups of similar make-up. The groups are called strata.

2. Choose a separate SRS in each stratum and combine them to form the full sample.
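The two steps above can be sketched in code. This is a minimal illustration, assuming a made-up school where grade level is the stratum; the names and sizes are hypothetical.

```python
import random

rng = random.Random(63)  # fixed seed so the run is repeatable

# Hypothetical school: the strata are grade levels.
strata = {
    "freshmen":   [f"F{i}" for i in range(1, 301)],
    "sophomores": [f"S{i}" for i in range(1, 251)],
    "juniors":    [f"J{i}" for i in range(1, 251)],
    "seniors":    [f"R{i}" for i in range(1, 201)],
}

def stratified_sample(strata, per_stratum):
    """Take a separate SRS from each stratum, then combine them."""
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, per_stratum))
    return sample

sample = stratified_sample(strata, 25)
print(len(sample))  # 100: 25 from each of the 4 strata
```

Every stratum is guaranteed to appear in the sample, which is the point of stratifying.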

Cluster Sample

1. Divide the population into smaller groups that each mirror the population. The groups are called clusters.

2. Randomly choose one or more entire clusters; everyone in a chosen cluster is in the sample.

• Don’t get stratified and cluster confused!

• CYU on page 219: answers on next slide.
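To see how a cluster sample differs from a stratified one, here is a minimal sketch assuming a made-up arena with numbered sections as clusters. The section and seat labels are hypothetical.

```python
import random

rng = random.Random(71)  # fixed seed so the run is repeatable

# Hypothetical arena: 10 numbered sections (clusters) of 20 seats each.
clusters = {
    section: [f"sec{section}-seat{seat}" for seat in range(1, 21)]
    for section in range(1, 11)
}

def cluster_sample(clusters, n_clusters):
    """Randomly choose whole clusters; everyone in a chosen cluster is sampled."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = [unit for c in chosen for unit in clusters[c]]
    return chosen, sample

chosen, sample = cluster_sample(clusters, 2)
print(chosen)
print(len(sample))  # 40: all 20 seats in each of the 2 chosen sections
```

A stratified sample takes a few individuals from every group; a cluster sample takes every individual from a few groups.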

CYU p.219

1. We would have to choose 200 different seats and go to each one. This would take a long time. Also, people sometimes get up and get concessions, etc. and might not be there.

2. Use lettered rows as the strata. Each row is the same distance from the court and should be the same ticket price.

3. Use numbered sections as clusters. Each section contains seats with many different ticket prices so those people should mirror the entire population.

Inference for Sampling

• We infer info about the population from what we know about the sample.

• Rely on random sampling to eliminate bias.

• It is unlikely the results are exactly the same as for the entire population.

• However, the laws of probability allow trustworthy inference about the population.

• Results come with a margin of error.

• Larger random samples give better info about the population than smaller ones!
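The last point can be checked with a simulation. The sketch below assumes a made-up population in which 60% would answer "yes," and measures how much the sample proportion bounces around for small and large random samples. All names and numbers are hypothetical.

```python
import random
from statistics import pstdev

rng = random.Random(89)  # fixed seed so the run is repeatable

# Hypothetical population of 10,000 people; 60% would answer "yes" (coded 1).
population = [1] * 6000 + [0] * 4000

def spread_of_estimates(n, repetitions=1000):
    """Standard deviation of the sample proportion over many SRSs of size n."""
    estimates = [sum(rng.sample(population, n)) / n for _ in range(repetitions)]
    return pstdev(estimates)

small = spread_of_estimates(25)
large = spread_of_estimates(400)

print(f"spread with n = 25:  {small:.3f}")
print(f"spread with n = 400: {large:.3f}")
```

The estimates from the larger samples cluster much more tightly around the true 60%, which is why larger random samples come with a smaller margin of error.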

What can go wrong?

• Sampling Errors

– Undercoverage (some group gets left out)

• Nonsampling Errors

– Nonresponse (an individual chosen can’t be contacted or refuses to participate)

– Response Bias (a systematic pattern of incorrect responses, sometimes people lie)

– Wording of questions (confusing or leading questions)

– Order the questions are asked in (see page 224)

CYU answers page 224

1. (a) Sampling error (b) Nonsampling error (c) Sampling error

2. The question makes it sound like diapers are NOT a problem in the landfill. Fewer people will probably suggest that we should ban them.

4.2 Experiments

Observational Studies

• Observes individuals

• Does not attempt to influence the responses

• No interference with participants

• GOAL: to describe a group/situation, compare groups, examine relationship between variables.

Experiments

• Deliberately interferes with/imposes some treatment on individuals

• Measures the responses to such treatments

• GOAL: determine whether a specific treatment causes a change in the response (think about medical studies)

• When we need to understand cause and effect, experiments are the way to go

Potential Problems

• Lurking Variable: influences the response variable and makes it hard to see the relationship between the explanatory and response variables.

• Confounding: two variables are associated in a way that makes their effects on the response hard to distinguish from one another.


1. Experiment: a treatment (brightness of screen) was imposed on the laptops.

2. Observational study

3. Explanatory: # of meals eaten per week with family. Response: GPA

4. Observational study: there might be lurking variables.

Experiments

• Treatment- a specific condition applied to the individuals

• Experimental units- the individuals to which treatments are applied

• Subjects- name for experimental units when they are humans

• Factors- another name for explanatory variables

Random Assignment

• Experimental Units/Subjects are assigned to different treatments AT RANDOM (using a chance process)

Control Group

• Sometimes a control group is used. They do not receive a treatment. They provide a BASELINE for comparing effects of the other treatments. In other words, what would happen if we did nothing?


2. Use an alphabetical list of students- assign each one a # 1-29. Use Table D and choose 15 numbers between 1 and 29- these students will meet in small groups. The others will view the videos alone.

3. A control group would allow us to have a group to compare to the treatment group. We can evaluate whether the group work is actually better.
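The random assignment described in answer 2 can also be done with software instead of Table D. This is a minimal sketch; the variable names are hypothetical, and the seed is fixed only so the result is repeatable.

```python
import random

rng = random.Random(29)  # arbitrary fixed seed

# Labels 1-29 stand in for the alphabetical list of students.
students = list(range(1, 30))

# Choose 15 labels by chance for the small-group treatment.
small_group = sorted(rng.sample(students, 15))

# Everyone not chosen views the videos alone.
alone = [s for s in students if s not in small_group]

print("small groups:", small_group)
print("alone:       ", alone)
```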

Principles for Designing Experiments

1. Control

2. Random Assignment

3. Replication

Replication means you should use enough subjects so that the effects of a treatment can be distinguished from “chance” or a “fluke.”

Placebo Effect

Medical treatments:

If some subjects take a pill and others don’t, the ones who get nothing KNOW they are not taking anything.

Usually a placebo will be given instead (a sugar pill that does nothing) so subjects don’t know what they are taking.

The placebo effect is the response to simply taking pills, even though they contain no medicine.

Double-Blind

• Subjects don’t know which treatment they receive.

• People who interact with them and measure results don’t know, either.

• Sometimes this won’t work.


1. No. Women who thought they were getting an ultrasound may have had different reactions to pregnancy.

2. No. Mothers knew if they had an ultrasound.

3. All mothers could have been treated as if they were receiving an ultrasound, but for some the machine wouldn’t have been turned on. Women would have had to not see the screen for that to work.

Statistically Significant

• An observed effect so large that it would rarely occur by chance.
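One way to judge whether an effect "would rarely occur by chance" is to redo the random assignment many times on the same responses and see how often chance alone produces a difference as large as the one observed. The sketch below assumes that simulation approach; the data are made up, and the names are hypothetical.

```python
import random
from statistics import mean

rng = random.Random(190)  # fixed seed so the run is repeatable

# Made-up responses, for illustration only.
treatment = [12, 15, 14, 16, 18, 17, 15, 19]
control = [10, 11, 13, 12, 14, 11, 12, 13]
observed = mean(treatment) - mean(control)

def rerandomize(values, n_treatment):
    """Shuffle all responses, split them into two groups by chance,
    and return the difference in group means."""
    shuffled = values[:]
    rng.shuffle(shuffled)
    return mean(shuffled[:n_treatment]) - mean(shuffled[n_treatment:])

combined = treatment + control
diffs = [rerandomize(combined, len(treatment)) for _ in range(2000)]

# Fraction of chance-only differences at least as large as the observed one.
proportion = sum(d >= observed for d in diffs) / len(diffs)

print(f"observed difference: {observed}")
print(f"proportion as large by chance: {proportion:.4f}")
```

A very small proportion suggests the observed difference is statistically significant.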

Blocking

• Block: a group of subjects that are known to be similar in a way that would probably affect the response to the treatment

• Randomized block design- random assignment to treatments is carried out separately within each block
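A randomized block design can be sketched as follows. This is an illustration under assumed conditions: two made-up blocks, two treatments labeled "A" and "B", and hypothetical names throughout.

```python
import random

rng = random.Random(196)  # fixed seed so the run is repeatable

# Hypothetical subjects, blocked by a trait expected to affect the response.
blocks = {
    "men": [f"M{i}" for i in range(1, 11)],
    "women": [f"W{i}" for i in range(1, 13)],
}

def randomized_block_assignment(blocks, treatments=("A", "B")):
    """Randomly assign treatments separately within each block."""
    assignment = {}
    for members in blocks.values():
        shuffled = members[:]
        rng.shuffle(shuffled)
        # Deal the shuffled subjects out to treatments in turn.
        for position, subject in enumerate(shuffled):
            assignment[subject] = treatments[position % len(treatments)]
    return assignment

assignment = randomized_block_assignment(blocks)
for name, members in blocks.items():
    counts = {t: sum(assignment[m] == t for m in members) for t in ("A", "B")}
    print(name, counts)
```

Because the shuffle happens inside each block, every block ends up split evenly between the treatments.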

Matched Pairs Design

• Read page 249-251

• Special case of a randomized block design that uses blocks of size 2.
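With blocks of size 2, the random assignment reduces to a coin flip within each pair. A minimal sketch, assuming made-up pairs of similar subjects and hypothetical treatment labels:

```python
import random

rng = random.Random(249)  # fixed seed so the run is repeatable

# Hypothetical pairs of subjects matched on a relevant trait.
pairs = [(f"pair{i}-a", f"pair{i}-b") for i in range(1, 7)]

def assign_within_pairs(pairs):
    """Flip a coin in each pair to decide who gets the treatment."""
    assignment = {}
    for first, second in pairs:
        if rng.random() < 0.5:
            assignment[first], assignment[second] = "treatment", "control"
        else:
            assignment[first], assignment[second] = "control", "treatment"
    return assignment

assignment = assign_within_pairs(pairs)
for first, second in pairs:
    print(first, assignment[first], "|", second, assignment[second])
```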

4.3 Using Studies Wisely

Scope of Inference

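• Most experiments don’t select subjects at random from the population.

• This limits inference about the population, but random assignment still allows inference about cause and effect for the subjects in the study.

• Observational studies don’t randomly assign subjects to groups, so they can’t establish cause and effect.

• If the individuals were randomly selected, we can only make inferences about the population.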

Why can’t we use an experiment?

• Doctors have noticed that people who frequently use tanning beds are at a greater risk for skin cancer. Could this be due to some other lurking variable like sun exposure?

• An experiment could help settle this but forcing people to use tanning beds would be unethical.

Establishing Causation

• Sometimes we can’t do an experiment because it’s unethical, so we must do an observational study. This is what we look for:

– Strong association

– Consistent association

– Larger values of x (explanatory) are associated with stronger responses

– The alleged cause precedes the effect in time (see page 264)

– Alleged cause is plausible

Data Ethics

• Read examples on page 265 and decide whether you think each one is ethical.

• It can be tricky to stay ethical when collecting data from people, especially when we impose a treatment.

• 3 basic standards to keep in mind

– Planned studies must be reviewed by an institutional review board (IRB)

– Individuals must give informed consent

– Data must be kept confidential.

Data Ethics

• Read more about each of the standards for data ethics on page 266.

• When I was preparing my thesis for my master’s degree, I did a research study. I had to:

– Get my study approved through the IRB at my school

– Have students/parents sign a consent form

– Keep all student information/names confidential as I prepared my reports/presentation
