Hypothesis testing

Prepared by Roderico Y. Dumaug, Jr.

For Intro to Statistics

Objectives

1) Formulate a statistical hypothesis
2) Discuss the two types of errors in hypothesis testing
3) Establish a decision rule for accepting or rejecting a statistical hypothesis at a specified level of significance
4) Distinguish between the one-sample case and two-sample case tests of hypothesis concerning means
5) Choose the appropriate test statistic for a particular set of data

Symbols Applicable

1) Ho – Null Hypothesis
2) H1 – Alternative Hypothesis
3) β – Greek letter beta, the probability of committing a Type 2 Error
4) α – Greek letter alpha, the probability of committing a Type 1 Error, known as the Level of Significance
5) z – z score (standard normal test statistic)
6) σ – Greek letter sigma, the population standard deviation (σ² is the population variance)
7) σx̄ – the standard deviation of the sampling distribution of the mean
8) µ – Greek letter mu, the mean of the normal population
9) n – Sample size
10) x̄ – Sample mean
11) t – t distribution; used when the population standard deviation is unknown
12) s – sample standard deviation

Hypothesis Testing: Introduction

• Theory of Statistical Inference: Consists of methods by which one makes inferences or generalizations about a population. Tests of hypothesis are one example.

• Population vs. Random Sample

Statistical Hypothesis

• Definition: A statistical hypothesis is an assertion or conjecture concerning one or more populations.

That is, an assumption or statement, which may or may not be true, concerning one or more populations.

• Two types of Statistical Hypothesis:

a) The NULL HYPOTHESIS, Ho

b) The ALTERNATIVE HYPOTHESIS, H1

The alternative hypothesis takes one of two forms:

a) Nondirectional Hypothesis – Asserts that one value is different from another (or others). Also called the 2-sided Hypothesis. "Not equal to" or ≠.

b) Directional Hypothesis – Asserts that one measure is less than (or greater than) another measure of similar nature. Also called the 1-sided Hypothesis. "<" or ">".

Examples of Statistical Hypothesis

1) Ho: The average annual income of all the families in the City is Php36,000 (µ = Php36,000).
   H1: The average annual income of all the families in the City is not Php36,000 (µ ≠ Php36,000).

2) Ho: There is no significant difference between the average life of brand A light bulbs and that of brand B light bulbs (µA = µB).
   H1: There is a significant difference between the average life of brand A light bulbs and that of brand B light bulbs (µA ≠ µB).

3) Ho: The proportion of Metro Manila college students who prefer the taste of Papsi Cola is ²/₃ (p = ²/₃).
   H1: The proportion of Metro Manila college students who prefer the taste of Papsi Cola is less than ²/₃ (p < ²/₃).

4) Ho: The proportion of TV viewers who watch talk shows from 9:00 to 10:00 in the evening is the same on Wednesdays and Fridays (p1 = p2).
   H1: The proportion of TV viewers who watch talk shows from 9:00 to 10:00 in the evening is greater on Wednesdays than on Fridays (p1 > p2).

Two types of Errors

• Four possibilities on the Acceptance and Rejection of Ho:

Consequences of Decisions in Testing Hypothesis

DECISION / FACT    Ho is TRUE                     Ho is FALSE
ACCEPT Ho          CORRECT DECISION               TYPE 2 ERROR (denoted by β)
REJECT Ho          TYPE 1 ERROR (denoted by α)    CORRECT DECISION

α = P(Type 1 Error) = P(Rejecting Ho when Ho is TRUE)

β = P(Type 2 Error) = P(Not Rejecting Ho when Ho is FALSE)
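The two error probabilities can be made concrete with a small calculation. The sketch below assumes a right-tailed z test with known σ; the function name `type_error_rates` and all the numbers passed to it are hypothetical illustrations, not values from these slides.

```python
from math import sqrt
from statistics import NormalDist


def type_error_rates(mu_0, mu_true, sigma, n, alpha):
    """Return (P(Type 1), P(Type 2)) for a right-tailed z test of Ho: mu = mu_0."""
    standard_error = sigma / sqrt(n)
    # Reject Ho when the sample mean exceeds this critical sample mean.
    critical_mean = mu_0 + NormalDist().inv_cdf(1 - alpha) * standard_error
    # Type 1: Ho is TRUE (mean is mu_0) but the sample mean falls in the rejection region.
    p_type_1 = 1 - NormalDist(mu_0, standard_error).cdf(critical_mean)
    # Type 2: Ho is FALSE (mean is mu_true) but the sample mean stays in the acceptance region.
    p_type_2 = NormalDist(mu_true, standard_error).cdf(critical_mean)
    return p_type_1, p_type_2


# Hypothetical inputs chosen only for illustration.
a, b = type_error_rates(mu_0=50, mu_true=52, sigma=6, n=36, alpha=0.05)
print(f"alpha = {a:.3f}, beta = {b:.3f}")
```

Note that α is fixed by the analyst, while β depends on how far the true mean actually is from the hypothesized one.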

Elements of a Test of a Hypothesis

• Null Hypothesis (Ho)
• Alternative Hypothesis (H1)
• Test Statistic: A sample statistic used to decide whether to reject the null hypothesis
• Rejection Region
• Calculation of Test Statistic
• Conclusion: Numerical value falls in the Rejection Region or not

Level of Significance

• Specifying the probability of committing a Type 1 Error, α, which is popularly known as the Level of Significance, lets us determine the Critical Values, which define the:
  – Region of Rejection (or Critical Region) and
  – Region of Acceptance

• The Critical Value serves as the basis for either Accepting or Rejecting a Hypothesis.

• When α = 0.05, the area of the Region of Rejection is 0.05 and the area of the Region of Acceptance is 0.95.

One-Tailed and Two-Tailed Tests

• Where H1 is Directional: One-Tailed Test
• Where H1 is Non-Directional: Two-Tailed Test

TYPE OF TEST       DIFFERENCE
One-Tailed Test    The Region of Rejection lies entirely in one end of the distribution. A range of values is being hypothesized.
Two-Tailed Test    Involves a Critical Region which is split into two equal parts placed in each tail of the distribution. A value of the parameter is being hypothesized.

Mathematical Formulation of H1    Region of Rejection
Greater Than (>)                  The Region of Rejection is placed entirely in the Right Tail of the distribution
Less Than (<)                     The Region of Rejection is in the Left Tail
Not Equal To (≠)                  Both Tails contain equal areas serving as Critical Regions

Example: What form of Hypothesis Should be Used

• A civic organization is conducting a study to determine whether the proportion of women who smoke has increased since the last study.

• A garment manufacturer suspects that the average order size for units of men's underwear has decreased from last year's.

• A doctor claims that the average age of heart attack patients is 45.

Let θ be the current proportion of women who smoke and θo the proportion found in the last study.

Therefore, Ho: θ = θo
           H1: θ > θo

Let θ be the current average order size for units of men's underwear and θo the average order size last year.

Therefore, Ho: θ = θo
           H1: θ < θo

Let θ be the average age of heart attack patients.

Therefore, Ho: θ = 45
           H1: θ ≠ 45

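Example:

• Given: z = 1.645, α = 0.05

[Figure: normal curve with the Region of Acceptance (area = 0.95) to the left of z = 1.645 and the Region of Rejection (area = 0.05) in the right tail]

Example:

• Given: z = -2.33, α = 0.01

[Figure: normal curve with the Region of Rejection (area = 0.01) in the left tail, below z = -2.33]

Example: Two Tailed

• Given: critical z values are ±1.96, α = 0.05

[Figure: normal curve with Regions of Rejection in both tails, below z = -1.96 and above z = 1.96]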
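The critical values in the three examples above can be reproduced from α alone. This is a minimal sketch using the standard library's `statistics.NormalDist`; the helper name `critical_z` and its `tail` argument are assumptions made for illustration.

```python
from statistics import NormalDist


def critical_z(alpha, tail):
    """Critical z value(s) for a 'right', 'left', or 'two' tailed test."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "right":
        # All of alpha sits in the right tail.
        return (std_normal.inv_cdf(1 - alpha),)
    if tail == "left":
        # All of alpha sits in the left tail.
        return (std_normal.inv_cdf(alpha),)
    if tail == "two":
        # alpha is split into two equal parts, one per tail.
        upper = std_normal.inv_cdf(1 - alpha / 2)
        return (-upper, upper)
    raise ValueError("tail must be 'right', 'left', or 'two'")


print(critical_z(0.05, "right"))  # about 1.645
print(critical_z(0.01, "left"))   # about -2.33
print(critical_z(0.05, "two"))    # about -1.96 and 1.96
```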


Critical Regions In Testing Hypothesis

• Rejecting Ho

Reject Ho when                                                    One-Tailed    Two-Tailed
Computed value of z is GREATER than the Critical Value            z > zo        z > zo
Computed value of z is LESS than the Negative Critical Value      z < -zo       z < -zo

Steps in Hypothesis Testing

1) Formulate the Ho and the H1
2) Specify the level of significance α
3) Choose the appropriate test statistic
4) Establish the critical region
5) Compute the value of the test statistic
6) Make a decision and, if possible, draw a conclusion

Test Concerning Means (from normally distributed data)

OUTLINE

I. One Sample Test (One Population)
   A. σ² is known (assume that the population variance is known)
   B. σ² is unknown (the population variance is unknown)

II. Two Sample Test (Two Populations)
   A. σ1² and σ2² are known
   B. σ1² = σ2² = σ² are unknown
   C. k sample test

Test Concerning Means (from normally distributed data)

I. One Sample Test (One Population)

A. σ² is known (assume that the population variance is known)

Conditions: We hypothesize that the MEAN of a Normal Population with a variance of σ² is µo. We take a random sample of size n from this population and obtain a sample mean x̄ which is somewhat different from µo.

To determine whether or not the observed difference between the computed value x̄ and the hypothesized µo is significant, we formulate one of the following pairs of hypotheses.

1) Ho: µ = µo      2) Ho: µ = µo      3) Ho: µ = µo
   H1: µ < µo         H1: µ ≠ µo         H1: µ > µo


Since the parameter σ is known, the z statistic is employed as the test statistic. Consequently, the z score corresponding to x̄ is:

$$z = \frac{\bar{x} - \mu_o}{\sigma_{\bar{x}}}$$

where the denominator σx̄ represents the standard error of the mean (or the standard deviation of the sampling distribution of the mean) and is computed by the formula:

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

Suppose α = 0.05 and the critical values are 1.96 and -1.96; then the following decision rules apply:

1. Reject Ho and accept H1 if z > 1.96 or z < -1.96.
2. Do not reject Ho (do not accept H1) if z is within the interval between -1.96 and 1.96.

Rejection Region:

1) For H1: µ < µo, reject Ho if $Z < -z_{\alpha}$
2) For H1: µ ≠ µo, reject Ho if $Z < -z_{\alpha/2}$ or $Z > z_{\alpha/2}$
3) For H1: µ > µo, reject Ho if $Z > z_{\alpha}$

Example: One community college hypothesized that the mean starting monthly salary of its graduates is Php9,000, with a standard deviation of Php1,000. A sample of 100 graduates was questioned, and it was found that the average starting salary is Php8,500. Test this hypothesis at the 5% level of significance.

Given: µo = 9,000   σ = 1,000   n = 100   x̄ = 8,500

Steps:

a.) Ho: µ = 9,000 vs. H1: µ ≠ 9,000

b.) α = 0.05

c.) $z_{\alpha/2} = z_{0.05/2} = z_{0.025} = 1.96$

d.) $Z = \frac{\bar{x} - \mu_o}{\sigma/\sqrt{n}} = \frac{8{,}500 - 9{,}000}{1{,}000/\sqrt{100}} = -5$

e.) Compare: -5 < -1.96

TWO-TAILED TEST

[Figure: normal curve with the Region of Acceptance (area: 0.95) between -1.96 and 1.96, a Region of Rejection (area: 0.025) in each tail, and the computed z = -5 in the left rejection region]

Conclusion: REJECT HO

The data provide sufficient evidence to contradict the hypothesized mean of Php9,000; the sample indicates that the mean is actually LESS THAN Php9,000.



Example: The average height of males in the freshmen class of a certain college has been 68.5 inches, with a standard deviation of 2.7 inches. Is there reason to believe that there has been an increase in the average height if a random sample of 50 males in the present freshmen class has an average height of 69.7 inches? Test at the 0.025 level of significance.

Given: µo = 68.5   σ = 2.7   n = 50   x̄ = 69.7

Steps:

a.) Ho: µ = 68.5 vs. H1: µ > 68.5

b.) α = 0.025

c.) $z_{\alpha} = z_{0.025} = 1.96$

d.) $Z = \frac{\bar{x} - \mu_o}{\sigma/\sqrt{n}} = \frac{69.7 - 68.5}{2.7/\sqrt{50}} = \frac{1.2}{2.7/7.071068} = \frac{1.2}{0.3818376516} = 3.143$

e.) Compare: 3.143 > 1.96

One-Tailed Test

[Figure: normal curve with the Region of Rejection in the right tail above 1.96 and the computed z = 3.143 inside it]

Conclusion: REJECT HO

The data provide sufficient evidence to indicate that the mean height is GREATER THAN 68.5 inches.
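The two worked examples above (starting salary and freshman height) follow the same steps, so they can be checked with one small routine. This is only a sketch: the function name `one_sample_z_test` and the `alternative` labels are assumptions, and the critical values are computed from α with the standard library rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu_0, sigma, n, alpha, alternative):
    """Return (z, reject_ho) for Ho: mu = mu_0 when sigma is known."""
    # Test statistic: distance of the sample mean from mu_0 in standard errors.
    z = (sample_mean - mu_0) / (sigma / sqrt(n))
    std_normal = NormalDist()
    if alternative == "less":          # H1: mu < mu_0, left tail
        reject = z < std_normal.inv_cdf(alpha)
    elif alternative == "greater":     # H1: mu > mu_0, right tail
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif alternative == "two-sided":   # H1: mu != mu_0, both tails
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    return z, reject


# Salary example: two-tailed at alpha = 0.05.
salary = one_sample_z_test(8500, 9000, 1000, 100, 0.05, "two-sided")
# Height example: right-tailed at alpha = 0.025.
height = one_sample_z_test(69.7, 68.5, 2.7, 50, 0.025, "greater")
print(salary)  # z = -5.0, Ho rejected
print(height)  # z is about 3.143, Ho rejected
```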


B. σ2 is unknown (the population variance is unknown)

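When the population standard deviation σ is unknown and the sample size n is less than 30, the t statistic is appropriate. The t value corresponding to the mean x̄ of a sample taken from a normal population is

$$t = \frac{\bar{x} - \mu}{s_{\bar{x}}}$$

with df = n - 1, where $s_{\bar{x}} = s/\sqrt{n}$ is the estimated standard error of the sampling distribution of the mean. Thus, to test the hypothesis µ = µo against any suitable alternative when σ is unknown and n < 30, the test statistic is

$$t = \frac{\bar{x} - \mu_o}{s/\sqrt{n}}$$

with df = n - 1.

Rejection Region:

1) For H1: µ < µo, reject Ho if $T < -t_{\alpha,(n-1)}$
2) For H1: µ ≠ µo, reject Ho if $T < -t_{\alpha/2,(n-1)}$ or $T > t_{\alpha/2,(n-1)}$
3) For H1: µ > µo, reject Ho if $T > t_{\alpha,(n-1)}$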



Example: A major car manufacturer wants to test a new engine to see whether it meets new air pollution standards. The mean µ of all engines of this type must be less than 20 parts per million of carbon. Ten engines are manufactured for testing purposes, and the mean and standard deviation of the emission for this sample of engines were determined to be:

x̄ = 17.1 parts/million     s = 3.0 parts/million

Do the data supply evidence to allow the manufacturer to conclude that this type of engine meets the pollution standard? Assume that the manufacturer is willing to risk a Type 1 error with probability α = 0.01.

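Given: µo = 20   n = 10   s = 3.0   x̄ = 17.1 parts/million

Steps:

a.) Ho: µ = 20 vs. H1: µ < 20

b.) α = 0.01

c.) $t_{\alpha,(n-1)} = t_{0.01,(10-1)} = t_{0.01,9} = 2.821$

d.) $T = \frac{\bar{x} - \mu_o}{s/\sqrt{n}} = \frac{17.1 - 20}{3.0/\sqrt{10}} = -3.06$

e.) Compare: -3.06 < -2.821

One-Tailed Test

[Figure: t curve with the Region of Rejection in the left tail below -2.821 and the computed t = -3.06 inside it]

Conclusion: REJECT HO

The data provide sufficient evidence that the engine type meets the pollution control standard.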
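The arithmetic in step d.) of the engine example can be checked from the summary statistics alone. In this sketch the helper name `one_sample_t` is hypothetical, and the critical value 2.821 is simply the table value quoted in the example, since the Python standard library has no t distribution.

```python
from math import sqrt


def one_sample_t(sample_mean, mu_0, s, n):
    """Return (t, df) for Ho: mu = mu_0 when sigma is unknown."""
    estimated_standard_error = s / sqrt(n)
    return (sample_mean - mu_0) / estimated_standard_error, n - 1


t_value, df = one_sample_t(sample_mean=17.1, mu_0=20, s=3.0, n=10)
t_critical = 2.821  # t(0.01, 9), taken from the example's table lookup
# Left-tailed test: reject Ho when t falls below the negative critical value.
reject_ho = t_value < -t_critical
print(round(t_value, 2), df, reject_ho)
```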



Example: Suppose a pharmaceutical company must demonstrate that a prescribed dose of a certain new drug will result in an average increase in blood pressure of less than 3 points. Assume that only six patients can be used in the initial phase of human testing. Result: the six patients have blood pressure increases of 1.7, 3.0, 0.8, 3.4, 2.7, and 2.1 points. Use the results to determine if there is evidence that the new drug satisfies the requirement that the resulting increase in blood pressure averages less than 3 points.

Given: n = 6   x̄ = 2.28   s = 0.95, where

$$s = \sqrt{\frac{n\sum x^2 - (\sum x)^2}{n(n-1)}} = \sqrt{\frac{(6)(35.79) - 187.69}{30}} = \sqrt{\frac{214.74 - 187.69}{30}} = \sqrt{\frac{27.05}{30}} = \sqrt{0.901666} = 0.95$$

Steps:

a.) Ho: µ = 3 vs. H1: µ < 3

b.) α = 0.01

c.) $t_{\alpha,(n-1)} = t_{0.01,(6-1)} = t_{0.01,5} = 3.365$

d.) $T = \frac{\bar{x} - \mu_o}{s/\sqrt{n}} = \frac{2.28 - 3}{0.95/\sqrt{6}} = -1.86$

e.) Compare: -1.86 > -3.365

One-Tailed Test

[Figure: t curve with the Region of Rejection in the left tail below -3.365 and the computed t = -1.86 in the Region of Acceptance]

Conclusion: DO NOT REJECT HO

The data do not provide sufficient evidence to conclude that the mean increase in blood pressure resulting from taking the drug is less than 3 points.
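Because this example starts from raw observations, the sample mean and standard deviation can be computed directly instead of by hand. The sketch below uses the standard library's `statistics.mean` and `statistics.stdev`; the variable names are illustrative, and 3.365 is the table value quoted in the example. Working with unrounded x̄ and s gives a t value of about -1.85 rather than the -1.86 obtained from the rounded inputs, which does not change the decision.

```python
from math import sqrt
from statistics import mean, stdev

increases = [1.7, 3.0, 0.8, 3.4, 2.7, 2.1]  # blood pressure increases, in points
n = len(increases)
x_bar = mean(increases)   # sample mean
s = stdev(increases)      # sample standard deviation (divides by n - 1)

mu_0 = 3
t_value = (x_bar - mu_0) / (s / sqrt(n))
t_critical = 3.365        # t(0.01, 5), taken from the example's table lookup
# Left-tailed test: reject Ho only if t falls below the negative critical value.
reject_ho = t_value < -t_critical
print(round(x_bar, 2), round(s, 2), reject_ho)
```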

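II. Two Sample Test (Two Populations): Test on the Difference in Means

A. σ1² and σ2² are known

1) Ho: µ1-µ2 = µo      2) Ho: µ1-µ2 = µo      3) Ho: µ1-µ2 = µo
   H1: µ1-µ2 < µo         H1: µ1-µ2 ≠ µo         H1: µ1-µ2 > µo

Test Statistic:

$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - \mu_o}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

Rejection Region:

1) For H1: µ1-µ2 < µo, reject Ho if $Z < -z_{\alpha}$
2) For H1: µ1-µ2 ≠ µo, reject Ho if $Z < -z_{\alpha/2}$ or $Z > z_{\alpha/2}$
3) For H1: µ1-µ2 > µo, reject Ho if $Z > z_{\alpha}$

Note:

i. µo = 0
ii. With µo = 0, µ1 - µ2 < µo means µ1 < µ2 (equivalently, µ2 > µ1)
iii. With µo = 0, µ1 - µ2 > µo means µ1 > µ2 (equivalently, µ2 < µ1)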

Example: A university investigation, conducted to determine whether car ownership by students affects their academic achievement, was based on two random samples of 100 students, each drawn from the student body. The average and standard deviation of each group's GPA (grade point average) are as shown.

                      Non-Car Owners (n1 = 100)    Car Owners (n2 = 100)
Mean GPA              x̄1 = 2.70                    x̄2 = 2.54
Standard deviation    s1 = 0.60                    s2 = 0.63

Do the data present sufficient evidence to indicate a difference in the mean achievement between car owners and non-car owners? Test using α = 0.10.

Define: µ1 = mean GPA for non-car owners; µ2 = mean GPA for car owners; µo = 0

Steps:

a.) Ho: µ1 - µ2 = 0 vs. H1: µ1 - µ2 ≠ 0

b.) α = 0.10

c.) $z_{\alpha/2} = z_{0.10/2} = z_{0.05} = 1.645$

d.) With samples of 100 each, s1 and s2 are used in place of σ1 and σ2:

$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - \mu_o}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{(2.70 - 2.54) - 0}{\sqrt{\dfrac{(0.60)^2}{100} + \dfrac{(0.63)^2}{100}}} = 1.84$$

e.) Compare: 1.84 > 1.645

Two-Tailed Test

[Figure: normal curve with Regions of Rejection below -1.645 and above 1.645, and the computed z = 1.84 in the right rejection region]

Conclusion: REJECT HO

The data provide sufficient evidence to indicate a difference in the mean achievement between car owners and non-car owners; in fact, non-car owners have better academic performance than car owners.
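The GPA comparison can be verified numerically. This is a minimal sketch, assuming (as the example does) that the sample standard deviations stand in for the population values; the function name `two_sample_z` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean_1, mean_2, sd_1, sd_2, n_1, n_2, mu_0=0.0):
    """z statistic for Ho: mu1 - mu2 = mu_0 with known (or large-sample) variances."""
    # Standard error of the difference between two independent sample means.
    standard_error = sqrt(sd_1**2 / n_1 + sd_2**2 / n_2)
    return ((mean_1 - mean_2) - mu_0) / standard_error


alpha = 0.10
z = two_sample_z(2.70, 2.54, 0.60, 0.63, 100, 100)
z_critical = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed: alpha/2 in each tail
reject_ho = abs(z) > z_critical
print(round(z, 2), round(z_critical, 3), reject_ho)
```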



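B. σ1² = σ2² = σ² are unknown

1) Ho: µ1-µ2 = µo      2) Ho: µ1-µ2 = µo      3) Ho: µ1-µ2 = µo
   H1: µ1-µ2 < µo         H1: µ1-µ2 ≠ µo         H1: µ1-µ2 > µo

Test Statistic:

$$T = \frac{(\bar{x}_1 - \bar{x}_2) - \mu_o}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

where

$$S_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$

Rejection Region:

1) For H1: µ1-µ2 < µo, reject Ho if $T < -t_{\alpha,(n_1+n_2-2)}$
2) For H1: µ1-µ2 ≠ µo, reject Ho if $T < -t_{\alpha/2,(n_1+n_2-2)}$ or $T > t_{\alpha/2,(n_1+n_2-2)}$
3) For H1: µ1-µ2 > µo, reject Ho if $T > t_{\alpha,(n_1+n_2-2)}$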

Example: A television network wanted to determine whether sports events or first-run movies attract more viewers in the prime-time hours. It selected 28 prime-time evenings; of these, 13 had programs devoted to major sports events and the remaining 15 had first-run movies. The number of viewers (estimated by a television viewer rating firm) was reported for each program. If µ1 is the mean number of sports viewers per evening and µ2 is the mean number of movie viewers per evening, is there a difference in the mean number of viewers at the 0.05 level of significance?

The TV network’s samples produce the results below:

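Sports: n1 = 13   x̄1 = 6.8 million   s1 = 1.8 million

Movies: n2 = 15   x̄2 = 5.3 million   s2 = 1.6 million

Steps:

a.) Ho: µ1 - µ2 = 0 vs. H1: µ1 - µ2 ≠ 0

b.) α = 0.05

c.) $t_{\alpha/2,(n_1+n_2-2)} = t_{0.025,26} = 2.056$

d.) $$S_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{(13 - 1)(1.8)^2 + (15 - 1)(1.6)^2}{13 + 15 - 2}} \approx 1.695$$

$$T = \frac{(\bar{x}_1 - \bar{x}_2) - \mu_o}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{(6.8 - 5.3) - 0}{1.695\sqrt{\dfrac{1}{13} + \dfrac{1}{15}}} = 2.34$$

e.) Compare: 2.34 > 2.056

Two-Tailed Test

[Figure: t curve with the Region of Acceptance (area: 0.95) between -2.056 and 2.056, a Region of Rejection (area: 0.025) in each tail, and the computed t = 2.34 in the right rejection region]

Conclusion: REJECT HO

The data provide sufficient evidence to indicate a difference in the mean number of viewers between sports events and first-run movies; in fact, sports events attract more viewers than first-run movies.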
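The pooled-variance calculation for the television example can be sketched as follows. The function name `pooled_t` is an assumption for illustration, and the critical value 2.056 is the table value quoted in the example, since the standard library has no t distribution.

```python
from math import sqrt


def pooled_t(mean_1, mean_2, s_1, s_2, n_1, n_2, mu_0=0.0):
    """Return (t, s_p, df) for Ho: mu1 - mu2 = mu_0 with equal unknown variances."""
    df = n_1 + n_2 - 2
    # Pooled standard deviation: weighted combination of the two sample variances.
    s_p = sqrt(((n_1 - 1) * s_1**2 + (n_2 - 1) * s_2**2) / df)
    t = ((mean_1 - mean_2) - mu_0) / (s_p * sqrt(1 / n_1 + 1 / n_2))
    return t, s_p, df


# Sports vs. movies, viewers in millions.
t_value, s_p, df = pooled_t(6.8, 5.3, 1.8, 1.6, 13, 15)
t_critical = 2.056  # t(0.025, 26), taken from the example's table lookup
reject_ho = abs(t_value) > t_critical  # two-tailed test
print(round(t_value, 2), round(s_p, 3), df, reject_ho)
```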
