Hypothesis Testing Introduction to Statistics Chapter 8 Mar 2-4, 2010 Classes #13-14

Hypothesis Testing

Introduction to Statistics, Chapter 8

Mar 2-4, 2010, Classes #13-14

Hypothesis Test

A statistical method that uses the sample data to evaluate a hypothesis about a population parameter

Hypothesis-Testing Procedure

• State the hypothesis
  - Use the hypothesis to predict characteristics of the population
  - Null (H0) vs. Alternative (HA)

• Set the criteria for a decision
  - Must be clearly set before testing
  - Set the alpha level (also before testing)

• Obtain a random sample
  - Larger samples are preferred

• Collect data and compute sample statistics
  - Calculate z-scores

• Make a decision
  - Compare the obtained sample with the hypothesis
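The procedure above can be sketched as a single function. This is a minimal illustration, assuming a two-tailed z-test with a known population standard deviation; the function name `z_test` and the numbers passed to it are hypothetical and do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, hypothesized_mean, population_sd, n, alpha=0.05):
    """Two-tailed z-test for one sample mean (population SD assumed known)."""
    # Set criteria: the critical value depends only on alpha, chosen in advance.
    critical_z = NormalDist().inv_cdf(1 - alpha / 2)
    # Compute sample statistics: place the sample mean on the sampling
    # distribution of the mean by converting it to a z-score.
    standard_error = population_sd / sqrt(n)
    z = (sample_mean - hypothesized_mean) / standard_error
    # Make a decision: reject H0 only if z falls in a rejection region.
    reject_h0 = abs(z) > critical_z
    return z, critical_z, reject_h0


# Hypothetical numbers, chosen only to exercise the function.
z, critical_z, reject_h0 = z_test(sample_mean=52, hypothesized_mean=50,
                                  population_sd=10, n=25)
print(f"z = {z:.2f}, critical z = ±{critical_z:.2f}, reject H0: {reject_h0}")
```

With these made-up inputs the sample mean sits one standard error above the hypothesized mean, so H0 is retained.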

Example 1

Suppose that we want to compare the crime rate in San Diego with the crime rate in the rest of the country…

Is there more or less crime in San Diego than the national average?

Example 1

First, we start with the hypothesis that “the crime rate on average in San Diego is the same as the national average”

To test our hypothesis, we ask what sample means would occur if many samples of the same size were drawn at random from our population if our hypothesis is true

Example 1

We can now refer to the sampling distribution of the mean, for an infinite series of samples of size n, drawn from a population whose mean is the same as the national average, and we compare our sample mean with those in this sampling distribution

If our hypothesis is true, then the distribution of sample means will be centered about the national average
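The idea of drawing many samples of the same size can be imitated with a short simulation. The sketch below is illustrative only: the population mean (50), standard deviation (10), sample size (25), and number of samples (2,000) are assumed values, not figures from the crime-rate example, and a finite number of samples stands in for the infinite series.

```python
import random
from statistics import fmean, stdev

# Assumed (hypothetical) population and sampling settings.
POPULATION_MEAN = 50.0
POPULATION_SD = 10.0
SAMPLE_SIZE = 25
NUM_SAMPLES = 2000

rng = random.Random(0)  # fixed seed so the run is repeatable

# Draw many random samples and record the mean of each one.
sample_means = [
    fmean(rng.gauss(POPULATION_MEAN, POPULATION_SD) for _ in range(SAMPLE_SIZE))
    for _ in range(NUM_SAMPLES)
]

center = fmean(sample_means)  # expected to sit close to the population mean
spread = stdev(sample_means)  # expected to sit close to SD / sqrt(n) = 2

print(f"mean of sample means: {center:.2f}")
print(f"SD of sample means:   {spread:.2f}")
```

The sample means cluster around the population mean, which is what "centered about the national average" describes when the hypothesis is true.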

Example 1

Suppose that the relationship between our sample mean and those of the sampling distribution of the mean looks like this…

[Figure: sampling distribution of the mean, with labels "Our obtained value" and "Our hypothesized value"]

Example 1

If so, our sample mean is one that could reasonably occur if the hypothesis is true, and we will retain our hypothesis as one that could be true

That is, we retain the hypothesis that the crime rate of San Diego is the same as the national average

Example 1

On the other hand, if the relationship between our sample mean and those of the sampling distribution of the mean looks like this…

[Figure: sampling distribution of the mean, with labels "Our hypothesized value" and "Our obtained value"]

Example 1

Our sample mean is so deviant that it would be quite unusual to obtain such a value when our hypothesis is true

In this case, we would reject our hypothesis and conclude that it is more likely that the crime rate of San Diego is not the same as the national average

The population represented by the sample differs significantly from the comparison population

Null Hypothesis

The hypothesis that we put to the test is called the null hypothesis, symbolized H0

The null hypothesis usually states the situation in which there is no difference (the difference is “null”) between populations

Alternative Hypothesis

The alternative hypothesis, symbolized HA, is the opposite of the null hypothesis

The alternative hypothesis is also identified as the research hypothesis, or the “hunch” that the investigator wants to test

Null and Alternative Hypotheses

Both H0 and HA are statements about population parameters, not sample statistics

A decision to retain the null hypothesis implies a lack of support for the alternative hypothesis

A decision to reject the null hypothesis implies support for the alternative hypothesis

When do we retain and when do we reject the null hypothesis?

When we draw a random sample from a population, our obtained value of the sample mean will almost never exactly equal the mean of our population

The decision to reject or retain the null hypothesis depends on the selected criterion for distinguishing between those sample means that would be common and those that would be rare if H0 were true

When do we retain and when do we reject the null hypothesis?

If the sample mean is so different from what is expected when H0 is true that its appearance would be unlikely, H0 should be rejected

But what degree of rarity of occurrence is so great that it seems better to reject the null hypothesis than to retain it?

When do we retain and when do we reject the null hypothesis?

This decision is somewhat arbitrary, but common research practice is to reject H0 if the sample mean is so deviant that its probability of occurrence in random sampling is .05 or less

Such a criterion is called the level of significance, symbolized α

Rejection Regions

For our purposes, we will adopt the .05 level of significance.

Therefore, we will reject H0 only if our obtained sample mean is so deviant that it falls in the upper 2.5% or lower 2.5% of all the possible sample means that would occur when H0 is true.

The portions of the sampling distribution that include the values of the mean that lead to rejection of the null hypothesis are called rejection regions.

If our sample mean falls in the middle 95% of the distribution of all possible values of the mean that could occur when H0 is true, we will retain the null hypothesis.

Critical Values

We can use the normal curve table to calculate the Z values, called critical values, that separate the upper 2.5% and lower 2.5% of sample means from the remainder.
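As an alternative to the printed table, the same critical values can be looked up in software. The sketch below uses Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

ALPHA = 0.05
standard_normal = NormalDist()  # mean 0, standard deviation 1

# Z values that cut off the lower 2.5% and upper 2.5% of the distribution.
lower_critical = standard_normal.inv_cdf(ALPHA / 2)
upper_critical = standard_normal.inv_cdf(1 - ALPHA / 2)

print(f"lower critical value: {lower_critical:.2f}")
print(f"upper critical value: {upper_critical:.2f}")
```

Rounded to two decimals these are the familiar -1.96 and +1.96.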

Example 2

Suppose our obtained sample mean (n = 100) of the crime rate in Boston is a score of 90

Suppose that the national average is known to be 85, with a standard deviation of 20

Even if the population mean really is a score of 85, because of random sampling variation we do not expect the mean of a sample randomly drawn from a population to be exactly 85 (although it could be)

Using the Sampling Distribution of the Mean to Determine Probability

The important question is what is the relative position of the obtained sample mean among all those that could have been obtained if the hypothesis is true?

To determine the position of the obtained sample mean, it must be expressed as a Z score.

Z score

Before, you were finding the Z score of a single individual on a distribution of a population of individuals

In hypothesis testing, you are finding a Z score of your sample’s mean on a distribution of means

Example 2

In the current study,

$$Z = \frac{\text{obtained sample mean} - \text{hypothesized population mean (when } H_0 \text{ is true)}}{\text{standard error of the mean}}$$

$$Z = \frac{90 - 85}{20 / \sqrt{100}} = \frac{5}{2} = 2.5$$
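The same arithmetic can be checked in a few lines of Python. The inputs are the Example 2 figures (sample mean 90, n = 100, hypothesized mean 85, standard deviation 20); the two-tailed probability at the end is an extra step that the slide does not compute.

```python
from math import sqrt
from statistics import NormalDist

sample_mean = 90
hypothesized_mean = 85
population_sd = 20
n = 100

standard_error = population_sd / sqrt(n)              # 20 / 10 = 2
z = (sample_mean - hypothesized_mean) / standard_error

# Probability of a sample mean at least this far from 85, in either
# direction, if H0 is true (not part of the original slide).
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"standard error = {standard_error}, z = {z}, two-tailed p = {p_value:.4f}")
```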

Example 2

Our sample mean is 2.5 standard errors of the mean greater than expected if the null hypothesis were true.

The value of 2.5 exceeds the critical value of +1.96, so it falls in the rejection region; we reject H0 in favor of HA.

We can conclude that the mean of the population from which the sample came is not 85.

Example 2

The crime rate of Boston is, on average, different from (greater than) the national average.

Notice that the conclusion is about the population represented by the sample under study and not simply the particular sample itself.

What if we had used α = .01?
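One way to explore this question is to recompute the critical value for the stricter level and compare it with the Example 2 result. The sketch assumes the same two-tailed test and the same figures (sample mean 90, hypothesized mean 85, standard deviation 20, n = 100).

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.01
z = (90 - 85) / (20 / sqrt(100))  # Example 2 z-score, 2.5

# Two-tailed test: alpha is split evenly between the two tails.
critical_z = NormalDist().inv_cdf(1 - alpha / 2)
reject_h0 = abs(z) > critical_z

print(f"critical z = ±{critical_z:.3f}, obtained z = {z}, reject H0: {reject_h0}")
```

Under these assumptions the critical value rises to about ±2.576, so the same z of 2.5 no longer falls in the rejection region and H0 would be retained.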

If we retain H0, what can we conclude?

The decision to retain H0 does not mean that it is likely that H0 is true.

Rather, this decision reflects the fact that we do not have sufficient evidence to reject the null hypothesis.

Certain other hypotheses would also have been retained if tested in the same way.

If we retain H0, what can we conclude?

Consider our example where the hypothesized population mean is 85.

If we had obtained a sample mean of 86, the null hypothesis would have been retained.

But suppose the hypothesized population mean was 87. If we had obtained a sample mean of 86, the null hypothesis would also have been retained.
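A short calculation shows why more than one hypothesis survives the same sample. The sketch assumes the Example 2 settings carry over (standard deviation 20, n = 100, two-tailed test at the .05 level); it computes the whole range of hypothesized means that a sample mean of 86 would fail to reject.

```python
from math import sqrt
from statistics import NormalDist

sample_mean = 86
population_sd = 20  # assumed to carry over from Example 2
n = 100             # assumed to carry over from Example 2
alpha = 0.05

standard_error = population_sd / sqrt(n)
critical_z = NormalDist().inv_cdf(1 - alpha / 2)

# Any hypothesized mean inside this interval gives |z| below the critical value.
lowest_retained = sample_mean - critical_z * standard_error
highest_retained = sample_mean + critical_z * standard_error

for hypothesized_mean in (85, 87):
    z = (sample_mean - hypothesized_mean) / standard_error
    print(f"H0 mean {hypothesized_mean}: z = {z:+.2f}, reject H0: {abs(z) > critical_z}")

print(f"retained range: {lowest_retained:.2f} to {highest_retained:.2f}")
```

Both 85 and 87 lie inside the retained range, which is why retaining H0 cannot be read as evidence that H0 is true.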

What if we obtain a mean of 80, and what if we had used α = .01? (Hypothesized population mean was 87)
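A possible way to work through this question, assuming the standard deviation (20) and sample size (100) from Example 2 still apply and the test is two-tailed:

```python
from math import sqrt
from statistics import NormalDist

sample_mean = 80
hypothesized_mean = 87
population_sd = 20  # assumption: same as Example 2
n = 100             # assumption: same as Example 2
alpha = 0.01

z = (sample_mean - hypothesized_mean) / (population_sd / sqrt(n))
critical_z = NormalDist().inv_cdf(1 - alpha / 2)
reject_h0 = abs(z) > critical_z

print(f"z = {z}, critical z = ±{critical_z:.3f}, reject H0: {reject_h0}")
```

Under these assumptions the sample mean lies 3.5 standard errors below the hypothesized mean, beyond the .01 critical value, so H0 would be rejected.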

Example 3

A teacher believes that by taking her summer course students will achieve a higher score on a biology final exam taken at the end of the semester. The exam has a maximum score of 275 points. The teacher has been at the college for 30 years and has kept the data on these exam scores. The known population mean is 200 with a standard deviation of 15.

40 students decide to spend the summer taking this prep course. In reviewing their scores on the final exam, the teacher is pleased and reports the success of her summer course to the dean. The 40 students achieved a mean score of 205.

The dean hires someone to do a statistical analysis to determine the effectiveness of the summer course.

State the Null (H0) and Alternative (HA) hypotheses.
Use an alpha level of .01.
What is your decision?
Interpret this decision.
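One possible analysis is sketched below. It assumes a one-tailed (upper-tail) test, because the teacher's belief is directional (H0: the course does not raise the mean above 200; HA: the mean is greater than 200). The slides do not show the worked solution, so treat this as an illustration rather than the official answer; the two-tailed critical value is printed for comparison.

```python
from math import sqrt
from statistics import NormalDist

sample_mean = 205
hypothesized_mean = 200
population_sd = 15
n = 40
alpha = 0.01

standard_error = population_sd / sqrt(n)
z = (sample_mean - hypothesized_mean) / standard_error

standard_normal = NormalDist()
one_tailed_critical = standard_normal.inv_cdf(1 - alpha)      # assumed test
two_tailed_critical = standard_normal.inv_cdf(1 - alpha / 2)  # for comparison

reject_h0 = z > one_tailed_critical

print(f"standard error = {standard_error:.3f}, z = {z:.3f}")
print(f"one-tailed critical z = {one_tailed_critical:.3f}")
print(f"two-tailed critical z = ±{two_tailed_critical:.3f}")
print(f"reject H0: {reject_h0}")
```

With z near 2.11 and a one-tailed critical value near 2.33, H0 is retained at the .01 level under either form of the test: the 5-point gain is not sufficient evidence that the course raises scores.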

Strength of Decision

Rejecting the null hypothesis means that H0 is probably false, a strong decision.

Retaining the null hypothesis is a weak decision.

Two-tailed Test

The alternative hypothesis states that the population parameter may be either less than or greater than the value stated in H0.

The critical region is divided between both tails of the sampling distribution.

Two-tailed Test

This type of test is desirable in certain research situations

For example, in cases in which the performance of a group is compared to a known standard, it would be of interest to discover that the group is superior or inferior

One-tailed Test

The alternative hypothesis states that the population parameter differs from the value stated in H0 in one particular direction.

The critical region is located only in one tail of the sampling distribution.

One-tailed Test

[Figure: two panels, "Upper-tail Critical" and "Lower-tail Critical"]

One-tailed Test

The advantage of a one-tailed test is that it is more sensitive to detecting a false hypothesis in the direction of concern than a two-tailed test.

The major disadvantage of a one-tailed test is that it precludes any chance of discovering that reality is just the opposite of what the alternative hypothesis says.
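Both points can be seen by comparing critical values at the same alpha. The sketch below assumes α = .05 and an upper-tail alternative; the three z-scores are hypothetical values picked to show each case.

```python
from statistics import NormalDist

ALPHA = 0.05
standard_normal = NormalDist()

# Two-tailed: alpha is split across both tails. One-tailed: all of it is in one tail.
two_tailed_critical = standard_normal.inv_cdf(1 - ALPHA / 2)  # about 1.96
upper_tail_critical = standard_normal.inv_cdf(1 - ALPHA)      # about 1.645


def reject_two_tailed(z):
    """Reject H0 if z is extreme in either direction."""
    return abs(z) > two_tailed_critical


def reject_upper_tail(z):
    """Reject H0 only if z is extreme in the predicted (upper) direction."""
    return z > upper_tail_critical


for z in (1.8, -1.8, -3.0):  # hypothetical obtained z-scores
    print(f"z = {z:+.1f}: two-tailed reject = {reject_two_tailed(z)}, "
          f"upper-tail reject = {reject_upper_tail(z)}")
```

A z of 1.8 is detected only by the one-tailed test, while a z of -3.0 in the unexpected direction is detected only by the two-tailed test.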

Hypothesis Testing

                 Test Result
True State       H0 True             H0 False
H0 True          Correct Decision    Type I Error
H0 False         Type II Error       Correct Decision

α = P(Type I Error)    β = P(Type II Error)

• Goal: Keep α, β reasonably small

Errors in Hypothesis Testing

Type I Error:

Occurs when a researcher rejects a null hypothesis that is actually true

Concluding there IS an effect when there is NOT

Type II Error:

Occurs when a researcher fails to reject a null hypothesis that is false

Basically, here the hypothesis test has failed to detect a real treatment effect
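The two error probabilities can be made concrete with a small calculation. The sketch below reuses the Example 2 design (hypothesized mean 85, standard deviation 20, n = 100, two-tailed test at the .05 level) and assumes, purely for illustration, that the true population mean is 90; the function name `type_ii_error_rate` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

standard_normal = NormalDist()


def type_ii_error_rate(hypothesized_mean, true_mean, population_sd, n, alpha=0.05):
    """P(retain H0) in a two-tailed z-test when the true mean differs from H0."""
    standard_error = population_sd / sqrt(n)
    critical_z = standard_normal.inv_cdf(1 - alpha / 2)
    # Shift of the true sampling distribution, measured in standard errors.
    shift = (true_mean - hypothesized_mean) / standard_error
    # Probability that the obtained z lands between the two critical values.
    return (standard_normal.cdf(critical_z - shift)
            - standard_normal.cdf(-critical_z - shift))


alpha = 0.05  # P(Type I error), chosen by the researcher in advance
beta = type_ii_error_rate(hypothesized_mean=85, true_mean=90,
                          population_sd=20, n=100, alpha=alpha)

print(f"P(Type I error)  = {alpha}")
print(f"P(Type II error) = {beta:.3f} (if the true mean were 90)")
```

Unlike the Type I error rate, which is fixed by the chosen alpha, the Type II error rate depends on how far the true mean is from the hypothesized one, so it can only be computed for an assumed true value.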

Example - Efficacy Test for New drug

Drug company has new drug, wishes to compare it with current standard treatment

Federal regulators tell company that they must demonstrate that new drug is better than current treatment to receive approval

Firm runs clinical trial where some patients receive new drug, and others receive standard treatment

Numeric response of therapeutic effect is obtained (higher scores are better).

Parameter of interest: $\mu_{New} - \mu_{Std}$

Example - Efficacy Test for New drug

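• Null hypothesis - New drug is no better than standard treatment

$$H_0: \mu_{New} - \mu_{Std} \le 0 \quad (\mu_{New} - \mu_{Std} = 0)$$

• Alternative hypothesis - New drug is better than standard treatment

$$H_A: \mu_{New} - \mu_{Std} > 0$$

• Experimental (Sample) data:

$$\bar{y}_{New},\ \bar{y}_{Std} \qquad s_{New},\ s_{Std} \qquad n_{New},\ n_{Std}$$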

Example - Efficacy Test for New drug

Type I error - Concluding that the new drug is better than the standard (HA) when in fact it is no better (H0). Ineffective drug is deemed better.

Type II error - Failing to conclude that the new drug is better (HA) when in fact it is. Effective drug is deemed to be no better.

Effect Size

Effect size is a measure of the strength of the relationship between two variables

In scientific experiments, it is often useful to know not only whether an experiment has a statistically significant effect, but also the size of any observed effects

In practical situations, effect sizes are helpful for making decisions

Effect Size

The concept of effect size appears in everyday language.

For example, a weight loss program may boast that it leads to an average weight loss of 30 pounds. In this case, 30 pounds is an indicator of the claimed effect size. Another example is that a tutoring program may claim that it raises school performance by one letter grade. This grade increase is the claimed effect size of the program.

Effect Size

Cohen’s d

An effect size measure representing the standardized difference between two means

Effect Size

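$$\text{Cohen's } d = \frac{\text{mean difference}}{\text{population standard deviation}} = \frac{M - \mu}{\sigma}$$

Example 2

$$d = \frac{M - \mu}{\sigma} = \frac{90 - 85}{20} = \frac{5}{20} = 0.25$$

Small effect (small to medium). See Table 8.2 (page 233).

Example 3

$$d = \frac{M - \mu}{\sigma} = \frac{205 - 200}{15} = \frac{5}{15} = 0.33$$

Small effect (small to medium). See Table 8.2 (page 233).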
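The two effect sizes can be reproduced with a small helper. The function name `cohens_d` is illustrative; the inputs are the sample means, hypothesized population means, and population standard deviations from Examples 2 and 3.

```python
def cohens_d(sample_mean, population_mean, population_sd):
    """Standardized mean difference: (M - mu) / sigma."""
    return (sample_mean - population_mean) / population_sd


d_example_2 = cohens_d(sample_mean=90, population_mean=85, population_sd=20)
d_example_3 = cohens_d(sample_mean=205, population_mean=200, population_sd=15)

print(f"Example 2: d = {d_example_2:.2f}")
print(f"Example 3: d = {d_example_3:.2f}")
```

Note that sample size does not enter the calculation: unlike the z-score, d measures how large the difference is, not how unlikely it would be under H0.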

Credits

http://psy.ucsd.edu/~sky/Psyc%2060%20Hypothesis%20Testing.ppt#3
http://www.stat.ufl.edu/~winner/sta6934/hyptest.ppt#2