Hypothesis Testing: One Sample Cases. Outline: – The logic of hypothesis testing – The Five-Step...

Preview:

Citation preview

Hypothesis Testing: One Sample Cases

Outline:

– The logic of hypothesis testing– The Five-Step Model– Hypothesis testing for single sample means

(z test)– One- vs. Two- tailed tests

Significant Differences• Hypothesis testing is designed to detect significant

differences: differences that did not occur by random chance.

• In the “one sample” case: we compare a random sample (from a large group) to a population.

• We compare a sample statistic to a population parameter to see if there is a significant difference.

Our Problem:

• The education department at a university has been accused of “grade inflation” so education graduates have much higher scores than students in general.

• Scores of all education graduates should be compared with the scores of all students.– There are 1000s of education graduates, far too many to

interview. – How can this be investigated without interviewing all education

graduates?

What we know:• The mean scores for all

students is 2.70 and the s.d = 0.7. These values are parameters.

The population value

• To the right is the statistical information for a random sample of education graduates:

= 2.70 = 0.70

Sample mean = 3.00

Sample size n = 117

X

Questions to ask:

• Is there a difference between the parameter (2.70) and the statistic (3.00) sample mean?

• Could the observed difference have been caused by random chance?

• Is the difference real (significant)?

Two Possibilities:

1. The sample mean (3.00) is the same as the pop. mean (2.70).

– The difference is trivial and caused by random chance.

2. The difference is real (significant).– Education graduates are different from all

students.

The Null and Alternative Hypotheses:1. Null Hypothesis (H0)

– The difference is caused by random chance.

– The H0 always states there is “no significant difference.” In this

case, we mean that there is no significant difference between the population mean and the sample mean.

2. Alternative hypothesis (H1)

– “The difference is real”.

– (H1) always contradicts the H0.

• One (and only one) of these explanations must be true. Which one?

Test the Explanations• We always test the Null Hypothesis.

• Assuming that the H0 is true:

– What is the probability of getting the sample mean (3.00) if the H0 is true and all education

graduates really have a mean of 2.70?

– In other words, the difference between the means is due to random chance.

– If the probability associated with this difference is less than 5%, reject the null hypothesis.

Test the Hypotheses• Use the 5% value as a guideline to identify differences that

would be rare or extremely unlikely if H0 is true. This

“alpha” value defines the “region of rejection.”

• Use the Z score formula for single samples and

• If the probability is less than 5%, the calculated or “observed” Z score will be beyond ±1.96 (the “critical” Z score).

Two-tailed Hypothesis Test:

When α = 5%, then 2.5% of the area is distributed on either side of the curve in area (C )

The 95% in the middle section represents no significant difference between the population and the sample mean.

The cut-off between the middle section and +/- 2.5% is represented by a Z-value of +/- 1.96.

Z= -1.96

c

Z = +1.96

c

Testing Hypotheses:Using The Five Step Model…

1. Make Assumptions and meet test requirements.

2. State the null hypothesis.3. Select the sampling distribution and

establish the critical region.4. Compute the test statistic.5. Make a decision and interpret results.

Step 1: Make Assumptions and Meet Test Requirements

• Random sampling– Hypothesis testing assumes samples were selected using random

sampling. – In this case, the sample of 117 cases was randomly selected from all

education graduates.

• Sampling Distribution is normal in shape– This is a “large” sample (n≥100).

Step 2 State the Null Hypothesis

• H0: μ = 2.7

• You can also state Ho: No difference between the sample mean and

the population parameter

– (In other words, the sample mean of 3.0 is really the same as the population mean of 2.7 – the difference is not real but is due to chance.)

– The sample of 117 comes from a population that has a score of 2.7. – The difference between 2.7 and 3.0 is trivial and caused by random

chance.

Step 2 (cont.) State the Alternate Hypothesis• H1: μ≠2.7

• Or H1: There is a difference between the sample mean and the population parameter – The sample of 117 comes from a population that does not have

a score of 2.7. In reality, it comes from a different population.– The difference between 2.7 and 3.0 reflects an actual difference

between education graduates and other students.– Note that we are testing whether the population the sample

comes from is from a different population or is the same as the general student population.

Step 3 Select Sampling Distribution and Establish the Critical Region

• Sampling Distribution= Z

– Alpha (α) = 5%

– α is the indicator of “rare” events.

– Any difference with a probability less than α is rare and will cause us to reject the H0.

Step 3 (cont.) Select Sampling Distribution and Establish the Critical Region

• Critical Region begins at Z= ± 1.96

– This is the critical Z score associated with α = 5%, two-tailed test.

– If the obtained Z score falls in the Critical Region, or “the region of rejection,” then we would reject the H0.

Step 4: Use Formula to Compute the Test Statistic Z for large samples (≥ 100)

Z

N

Test the Hypotheses

• Substituting the values into the formula, we calculate a Z score of 4.6357.

3.0 2.74.6357

0.7

117

Z

Step 5 Make a Decision and Interpret Results

• The obtained Z score falls in the Critical Region, so we reject the H0.

The Z value >1.96

– If the H0 were true, a sample outcome of 3.00 would be unlikely.

– Therefore, the H0 is false and must be rejected.

• Education graduates have a score that is significantly different from the general student body (Z = 4.62, α = 5%).*

• *Note: Always report significant statistics.

Looking at the curve:(Area C = Critical Region when α=.05)

Z= -1.96

c

Z = +1.96

c z= +4.62 I

Summary:

• The score of education graduates is significantly different from the score of the general student body.

• In hypothesis testing, we try to identify statistically significant differences that did not occur by random chance.

• In this example, the difference between the parameter 2.70 and the statistic 3.00 was large and unlikely

(p < 5%) to have occurred by random chance.

Summary (cont.)

• We rejected the H0 and concluded that the

difference was significant.

• It is very likely that Education graduates have scores higher than the general student body

Rule of Thumb:• If the test statistic is in the Critical Region

( α = 5%, beyond ±1.96):

– Reject the H0. The difference is significant.

• If the test statistic is not in the Critical Region (at α = 5%, between +1.96 and -1.96):

– Fail to reject the H0. The difference is not

significant.

Two-tailed vs. One-tailed Tests• In a two-tailed test, the direction of the

difference is not predicted.• A two-tailed test splits the critical region

equally on both sides of the curve.• In a one-tailed test, the researcher predicts

the direction (i.e. greater or less than) of the difference.

• All of the critical region is placed on the side of the curve in the direction of the prediction.

• Assumptions– Population Is Normally Distributed– If Not Normal, use large samples– Null Hypothesis Has =, , or Sign Only

• Z Test Statistic:

One-Tail Z Test for Mean (Known)

xz

n

Z

Reject H 0

Z

Reject H

0

H0: H1: <

H0: 0 H1: >

Must Be Significantly Below

=

Small values don’t contradict H0

Don’t Reject H0!

Rejection Region

•Does an average box of cereal contain 368 grams of cereal?

•A random sample of 25 boxes showed X = 372.5.

•The company has specified to be 15 grams.

•Test at the 5% level.

368 gm.

Example: One Tail Test

H0: 368 H1: > 368

_

Z0 1.645

What Is Z given = 5%?

= 5%

Finding Critical Values: One Tail

Critical Value = 1.645

InvNormal(.95) = 1.645

Use % points table below normal table in formula book

•= 5%

•n = 25

•Critical Value: 1.645

Test Statistic:

Decision:

Conclusion:

Do Not Reject H0 at = 5%

No Evidence True Mean Is More than 368

Z0 1.645

.05

Reject

Example Solution: One Tail

H0: 368 H1: > 368 372.5 368

1.5015

25

XZ

n

1.5<critical value of 1.645

1.5

Z0 1.50

p Value6.68%

From Z Table: Lookup InvNorm (1.50)

93.32%

Use the alternative hypothesis to find the direction of the test.

P(Z 1.50) = 6.68%

p Value Solution

0 1.50 Z

Reject

(p Value = 6.68%) ( = 5%). Do Not Reject.

p Value = 6.68%

= 5%

Test Statistic Is In the Do Not Reject Region

p Value Solution

•Does an average box of cereal contains 368 grams of cereal?

•A random sample of 25 boxes showed X = 372.5.

•The company has specified to be 15 grams.

•Test at the 5% level.

368 gm.

Example: Two Tail Test

H0: 368

H1:

368

•= 5%

•n = 25

•Critical Value: ±1.96

Test Statistic:

Decision:

Conclusion:

Do Not Reject Ho at = 5%

No Evidence that True Mean Is Not 368

Example Solution: Two Tail

Z0 1.96

2.5%

Reject

-1.96

2.5%

H0: 386

H1:

386372.5 368

1.5015

25

XZ

n

The Curve for Two- vs. One-tailed Tests at α = .05:

Two-tailed test:“is there a significant

difference?”

One-tailed tests:“is the sample meangreater than µ?”

“is the sample meanless than µ?”

11.36

Interpreting the p-value…• The smaller the p-value, the more statistical evidence exists

to support the alternative hypothesis.

• If the p-value is less than 1%, there is overwhelming evidence that supports the alternative hypothesis.

• If the p-value is between 1% and 5%, there is a strong evidence that supports the alternative hypothesis.

• If the p-value is between 5% and 10% there is a weak evidence that supports the alternative hypothesis.

• If the p-value exceeds 10%, there is no evidence that supports the alternative hypothesis.

Type I error• Occurs when an experimenter thinks she/he has a significant

result, but it is really due to chance• Analogous to a “false positive” on a drug test.• Risk of a Type I error is the same as the significance level, e.g.,

p < 5%• Solutions: use a more stringent significance level, use

replication

Type II error• Occurs when a researcher fails to find a significant result

when, in fact, there was something significant going on.• Analogous to a “false negative” on a drug test.• Must be calculated with a test of statistical “power,” e.g.,

given the sample size, how big would an effect have to be in order to detect it?

• Solutions: increase sample size, use more sensitive precise measures, use replication

Type I and Type II errors

In the larger population, H0 is correct

In the larger population H1 is correct

Based on sample, H0 is supported

Correct Decision

Incorrect Decision: Type II Error

Base on sample, H1 is accepted

Incorrect Decision: Type I Error

Correct Decision

Implications• To some extant, Type I and Type II errors trade off with one

another– Decreasing the chance of a Type I error may increase the chance

of a Type II error.• A Type I error is the more shocking of the two

– Type I entails shouting “Eureka” when you haven’t really found it.

– Scientific skepticism makes Type II errors more palatable

Recommended