Statistics: Unlocking the Power of Data Lock 5 Hypothesis Testing: Intervals and Tests STAT 101 Dr. Kari Lock Morgan 10/2/12 SECTION 4.3, 4.4, 4.5 Type

Statistics: Unlocking the Power of Data Lock5

Hypothesis Testing:Intervals and Tests

STAT 101

Dr. Kari Lock Morgan

10/2/12

SECTION 4.3, 4.4, 4.5• Type I and II errors (4.3)• More randomization distributions (4.4)• Connecting intervals and tests (4.5)


ProposalsProject 1 proposal comments

Give spreadsheet with data in correct format

Cases and variables


RemindersHighest scorer on correlation guessing game

gets an extra point on Exam 1! Deadline: noon on Thursday, 10/11.

First student to get a red card gets an extra point on Exam 1!

http://istics.net/gett/gcstart.php?group_id=duke


There are four possibilities:Errors

Reject H0 Do not reject H0

H0 true

H0 false TYPE I ERROR

TYPE II ERRORTrut

h

Decision

• A Type I Error is rejecting a true null

• A Type II Error is not rejecting a false null


• In the test to see if resveratrol is associated with food intake, the p-value is 0.035.

o If resveratrol is not associated with food intake, a Type I Error would have been made

• In the test to see if resveratrol is associated with locomotor activity, the p-value is 0.980.

o If resveratrol is associated with locomotor activity, a Type II Error would have been made

Red Wine and Weight Loss


A person is innocent until proven guilty.

Evidence must be beyond the shadow of a doubt.

Types of mistakes in a verdict?

Convict an innocent

Release a guilty

Ho Ha

Type I error

Type II error

Analogy to Law

p-value from data


• The probability of making a Type I error (rejecting a true null) is the significance level, α

• α should be chosen depending how bad it is to make a Type I error

Probability of Type I Error


If the null hypothesis is true:• 5% of statistics will be in the most extreme 5% • 5% of statistics will give p-values less than 0.05• 5% of statistics will lead to rejecting H0 at α = 0.05• If α = 0.05, there is a 5% chance of a Type I error

Distribution of statistics, assuming H0 true:



If the null hypothesis is true:• 1% of statistics will be in the most extreme 1% • 1% of statistics will give p-values less than 0.01• 1% of statistics will lead to rejecting H0 at α = 0.01• If α = 0.01, there is a 1% chance of a Type I error

Distribution of statistics, assuming H0 true:



Probability of Type II ErrorThe probability of making a Type II Error (not

rejecting a false null) depends on

Effect size (how far the truth is from the null)

Sample size

Variability

Significance level


Choosing αBy default, usually α = 0.05

If a Type I error (rejecting a true null) is much worse than a Type II error, we may choose a smaller α, like α = 0.01

If a Type II error (not rejecting a false null) is much worse than a Type I error, we may choose a larger α, like α = 0.10


Come up with a hypothesis testing situation in which you may want to…

• Use a smaller significance level, like = 0.01

• Use a larger significance level, like = 0.10

Significance Level


Randomization Distributionsp-values can be calculated by randomization

distributions:

simulate samples, assuming H0 is true calculate the statistic of interest for each sample find the p-value as the proportion of simulated

statistics as extreme as the observed statistic

Today we’ll see ways to simulate randomization samples for more situations


Randomization Distribution

In a hypothesis test for H0: = 12 vs Ha: < 12, we have a sample with n = 45 and

What do we require about the method to produce randomization samples?

a) = 12b) < 12c)

We need to generate randomization samples assuming the null hypothesis is true.



In a hypothesis test for H0: = 12 vs Ha: < 12, we have a sample with n = 45 and .

Where will the randomization distribution be centered?

a) 10.2b) 12c) 45d) 1.8

Randomization distributions are always centered around the null hypothesized value.


Randomization Distribution Center

A randomization distribution is centered at the value of the parameter

given in the null hypothesis.

A randomization distribution simulates samples assuming the null hypothesis is true, so



In a hypothesis test for H0: = 12 vs Ha: < 12, we have a sample with n = 45 and

What will we look for on the randomization distribution?

a) How extreme 10.2 is b) How extreme 12 isc) How extreme 45 isd) What the standard error ise) How many randomization samples we collected

We want to see how extreme the observed statistic is.



In a hypothesis test for H0: 1 = 2 vs Ha: 1 > 2 , we have a sample with and

What do we require about the method to produce randomization samples?

a) 1 = 2

b) 1 > 2

c) 26, 21d)

We need to generate randomization samples assuming the null hypothesis is true.




Where will the randomization distribution be centered?

a) 0b) 1c) 21d) 26e) 5

The randomization distribution is centered around the null hypothesized value, 1 - 2 = 0




What do we look for on the randomization distribution?

a) The standard errorb) The center pointc) How extreme 26 isd) How extreme 21 ise) How extreme 5 is

We want to see how extreme the observed difference in means is.



For a randomization distribution, each simulated sample should…

•be consistent with the null hypothesis•use the data in the observed sample•reflect the way the data were collected


• In randomized experiments the “randomness” is the random allocation to treatment groups

• If the null hypothesis is true, the response values would be the same, regardless of treatment group assignment

• To simulate what would happen just by random chance, if H0 were true:

o reallocate cases to treatment groups, keeping the response values the same

Randomized Experiments


Observational StudiesIn observational studies, the “randomness” is random

sampling from the population

To simulate what would happen, just by random chance, if H0 were true: Simulate resampling from a population in which H0 is true

How do we simulate resampling from a population when we only have sample data? Bootstrap!

How can we generate randomization samples for observational studies? Make H0 true, then bootstrap!


• = average human body temperate98.6H0 : = 98.6Ha : ≠ 98.6

• We can make the null true just by adding 98.6 – 98.26 = 0.34 to each value, to make the mean be 98.6

• Bootstrapping from this revised sample lets us simulate samples, assuming H0 is true!

Body Temperatures


• In StatKey, when we enter the null hypothesis, this shifting is automatically done for us

StatKey

Body Temperatures

p-value = 0.002

http://www.lock5stat.com/statkey


Creating Randomization Samples

State null and alternative hypothesesDevise a way to generate a randomization sample that

Uses the observed sample data Makes the null hypothesis true Reflects the way the data were collected

1. Do males exercise more hours per week than females?

2. Is blood pressure negatively correlated with heart rate?

𝑥𝑚−𝑥 𝑓=3

𝑟=−0.057


Exercise and Gender• H0: m = f , Ha: m > f

• To make H0 true, we must make the means equal. One way to do this is to add 3 to every female value (there are other ways)

• Bootstrap from this modified sample

• In StatKey, the default randomization method is “reallocate groups”, but “Shift Groups” is also an option, and will do this


Exercise and Gender

p-value = 0.095


Exercise and Gender

The p-value is 0.095. Using α = 0.05, we conclude….

a) Males exercise more than females, on averageb) Males do not exercise more than females, on averagec) Nothing

Do not reject the null… we can’t conclude anything.


Blood Pressure and Heart Rate• H0: = 0 , Ha: < 0

• Two variables have correlation 0 if they are not associated. We can “break the association” by randomly permuting/scrambling/shuffling one of the variables

• Each time we do this, we get a sample we might observe just by random chance, if there really is no correlation


Blood Pressure and Heart Rate

p-value = 0.219

Even if blood pressure and heart rate are not correlated, we would see correlations this extreme about 22% of the time, just by random chance.


Randomization DistributionPaul the Octopus (single proportion):

Flip a coin 8 times

Cocaine Addiction (randomized experiment): Rerandomize cases to treatment groups, keeping response

values fixed

Body Temperature (single mean): Shift to make H0 true, then bootstrap

Exercise and Gender (observational study): Shift to make H0 true, then bootstrap

Blood Pressure and Heart Rate (correlation): Randomly permute/scramble/shuffle one variable


Randomization DistributionsRandomization samples should be generated

Consistent with the null hypothesisUsing the observed dataReflecting the way the data were collected

The specific method varies with the situation, but the general idea is always the same


• As long as the original data is used and the null hypothesis is true for the randomization samples, most methods usually give similar answers in terms of a p-value

• StatKey generates the randomizations for you, so most important is not understanding how to generate randomization samples, but understanding why

Generating Randomization Samples


Bootstrap and Randomization Distributions

Bootstrap Distribution Randomization Distribution

Our best guess at the distribution of sample statistics

Our best guess at the distribution of sample statistics, if H0 were true

Centered around the observed sample statistic

Centered around the null hypothesized value

Simulate sampling from the population by resampling from the original sample

Simulate samples assuming H0 were true

Big difference: a randomization distribution assumes H0 is true, while a bootstrap distribution does not


Which Distribution? Let be the average amount of sleep college students get

per night. Data was collected on a sample of students, and for this sample hours.

A bootstrap distribution is generated to create a confidence interval for , and a randomization distribution is generated to see if the data provide evidence that > 7.

Which distribution below is the bootstrap distribution?(a) is centered around the sample statistic, 6.7


Which Distribution? Intro stat students are surveyed, and we find that 152

out of 218 are female. Let p be the proportion of intro stat students at that university who are female.

A bootstrap distribution is generated for a confidence interval for p, and a randomization distribution is generated to see if the data provide evidence that p > 1/2.

Which distribution is the randomization distribution?(a) is centered around the null value, 1/2


• There are two types of errors: rejecting a true null (Type I) and not rejecting a false null (Type II)

Randomization samples should be generated Consistent with the null hypothesis Using the observed data Reflecting the way the data were collected

Summary


To DoRead Sections 4.4, 4.5

Do Homework 4 (due Thursday, 10/4)

http://stat.duke.edu/courses/Fall12/sta101.002/HW4.pdf

http://stat.duke.edu/courses/Fall12/sta101.002/HW4.pdf

Documents

Statistics: Unlocking the Power of Data Lock 5 Hypothesis Testing: Intervals and Tests STAT 101 Dr. Kari Lock Morgan 10/2/12 SECTION 4.3, 4.4, 4.5 Type