Hypothesis Testing
[Diagram: from raw data to information. Data collection; descriptive statistics (graphs; measures of location and spread); statistical inference (estimation; hypothesis testing); decision making]
• Hypothesis testing is a procedure for
making inferences about a population
• A hypothesis test gives the opportunity
to test whether a change has occurred
or a real difference exists
• A hypothesis is a statement or claim that
something is true
– Null hypothesis H0
• No effect or no difference
• Must be declared true/false
– Alternative hypothesis H1
• True if H0 found to be false
– If H0 is false – reject H0 and accept H1
– If H0 is true – accept H0 and reject H1
H0 always contains = sign
• State the null and alternative hypotheses that would be used to test each
of the following statements:-
• A manufacturer claims that the average life of a transistor is at least 1000
hours (h)
H0: µ = 1000 (the claim µ ≥ 1000 contains the = sign, so it belongs in H0)
H1: µ < 1000
• A pharmaceutical firm maintains that the average time for a certain drug
to take effect is 15 mins
H0: µ = 15
H1: µ ≠ 15
• The mean starting salary of graduates is higher than R50000 per annum
H0: µ = 50000
H1: µ > 50000
• Hypothesis testing consists of 6 steps:
1. State the hypothesis
2. State the value of α
3. Calculate the test statistic
4. Determine the critical value
5. Make a decision
6. Draw conclusion
Hypothesis Testing Step 1
State the hypothesis
H0 – Null hypothesis
H1 – Alternative Hypothesis
One/Two populations
Two tailed
H0: parameter = ?
H1: parameter ≠ ?
Right tailed
H0: parameter = ?
H1: parameter > ?
Left tailed
H0: parameter = ?
H1: parameter < ?
Hypothesis Testing Step 2
State the value of α
α - Probability of Type I error
Level of significance
1%, 5%, 10%
Determine critical value/s
Two types of errors:
Decision            Actual situation
                    H0 is true          H0 is false
Do not reject H0    Correct decision    Type II error = β
Reject H0           Type I error = α    Correct decision
LEVEL OF SIGNIFICANCE
• Probability (α) of committing a type I error is called the LEVEL OF SIGNIFICANCE
• α is specified before the test is performed
• You can control the Type I error by deciding, before the test is performed, what risk level you are willing to take in rejecting H0 when it is in fact true
• Researchers usually select α levels of 0.05 or smaller
Hypothesis Testing Step 3
Calculate value of test statistic
There are different test statistics for testing:
• Single population – Mean, proportion, variance
• Difference between two populations – Means, proportions, variances
REJECTION AND NON-REJECTION REGIONS
• To decide whether H0 will be rejected or not, a
value, called the TEST STATISTIC has to be
calculated by using certain sample results
• Distribution of test statistic often follows a normal
or t distribution.
• Distribution can be divided into 2 regions:-
– A region of rejection
– A region of non-rejection
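Hypothesis Testing Step 4
Determine the critical value/values
- Critical value – from tables
[Figure: distribution of the test statistic under H0, with the rejection region shaded]
Right tailed (H1: parameter > ?): area α in the right tail
Two tailed (H1: parameter ≠ ?): area α/2 in each tail
Left tailed (H1: parameter < ?): area α in the left tail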
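Hypothesis Testing Step 5
Make decision
[Figure: acceptance and rejection regions of the distribution under H0]
Right tailed (H1: parameter > ?): reject H0 in the right tail (area α), accept H0 elsewhere
Two tailed (H1: parameter ≠ ?): reject H0 in either tail (area α/2 each), accept H0 in between
Left tailed (H1: parameter < ?): reject H0 in the left tail (area α), accept H0 elsewhere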
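Hypothesis Testing Step 6
Draw conclusion
[Figure: the test statistic is located on the distribution under H0. If it falls in the acceptance region, accept H0; if it falls in the rejection region, reject H0]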
STEPS OF A HYPOTHESIS TEST
Step 1 • State the null and alternative hypotheses
Step 2 • State the value of α
Step 3 • Calculate the value of the test statistic
Step 4 • Determine the critical value
Step 5 • Make a decision using decision rule or graph
Step 6 • Draw a conclusion
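Steps 4 and 5 can be sketched in code for a test statistic that follows the standard normal distribution. This is a minimal illustration, not part of the original slides: the function names `z_critical` and `reject_h0` and the tail labels are assumptions made for the example, and the critical value is computed with Python's `statistics.NormalDist` instead of being read from tables.

```python
from statistics import NormalDist


def z_critical(alpha: float, tail: str) -> float:
    """Return the positive critical z value for a given alpha and tail type."""
    # A two-tailed test splits alpha equally between the two tails.
    p = 1 - alpha / 2 if tail == "two" else 1 - alpha
    return NormalDist().inv_cdf(p)


def reject_h0(z: float, alpha: float, tail: str) -> bool:
    """Step 5: apply the decision rule to a calculated z statistic."""
    crit = z_critical(alpha, tail)
    if tail == "two":
        return abs(z) >= crit
    if tail == "right":
        return z >= crit
    if tail == "left":
        return z <= -crit
    raise ValueError("tail must be 'two', 'right' or 'left'")


# Critical values for the usual significance levels (right-tailed test).
for a in (0.01, 0.05, 0.10):
    print(a, round(z_critical(a, "right"), 3))
```

Tables usually round these values, for example 1,645 is often listed as 1,65.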
Concept Questions
• Questions 1 – 12, page 261, textbook
Hypothesis test for Population Mean, n ≥ 30
- population need not be normally distributed
- sample mean will be approximately normally distributed
Testing H0: μ = μ0 for n ≥ 30
Test statistic: $z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$ (use σ if known)
Alternative hypothesis    Decision rule: Reject H0 if
H1: μ ≠ μ0                |z| ≥ Z1-α/2
H1: μ > μ0                z ≥ Z1-α
H1: μ < μ0                z ≤ -Z1-α
• Example
– It will be cost effective to employ an additional staff
member at a well-known takeaway restaurant if the
average daily sales are more than R11 000.
– A sample of 60 days was selected and the average sales
for the 60 days were R11 841 with a standard deviation of
R1 630.
– Test if it will be cost effective to employ an additional staff
member.
– Assume a normally distributed population. Use α = 0,05.
• Solution
– The population of interest is the daily sales
– We want to show that the average sales is more than
R11 000 per day.
– H1 : μ > 11 000
– The null hypothesis must specify a single value of the
parameter
– H0 : μ = 11 000
– Need to test if R11 841 ($\bar{x}$) is significantly more than
R11 000 (μ)
• Solution
– H0 : μ = 11 000
– H1 : μ > 11 000
– α = 0,05
– $z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{11841 - 11000}{1630 / \sqrt{60}} \approx 4,00$
– Critical value Z1-α = Z0,95 = 1,65
– Reject H0, since 4,00 ≥ 1,65
[Figure: right-tailed test with α = 0,05; accept H0 to the left of 1,65, reject H0 to the right]
At α = 0,05 it will be cost effective to
employ an additional staff member:
the average daily sales are more
than R11 000
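The restaurant example can be checked numerically. The sketch below uses only the figures given in the example; the variable names are chosen for illustration, and the critical value comes from `statistics.NormalDist` instead of the printed table, so it is 1,645 and not the rounded 1,65.

```python
from math import sqrt
from statistics import NormalDist

# Figures from the example: sample mean, hypothesised mean, sample std dev, sample size.
x_bar, mu_0, s, n = 11841, 11000, 1630, 60
alpha = 0.05

# Step 3: test statistic for a large-sample test of one mean.
z = (x_bar - mu_0) / (s / sqrt(n))

# Step 4: right-tailed critical value Z(1 - alpha).
z_crit = NormalDist().inv_cdf(1 - alpha)

# Step 5: decision rule for H1: mu > mu_0.
reject = z >= z_crit
print(f"z = {z:.4f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```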
• Hypothesis test for Population Mean, n < 30
– If σ is unknown we use s to estimate σ
– We need to replace the normal distribution with the
t-distribution with (n - 1) degrees of freedom
Testing H0: μ = μ0 for n < 30
Test statistic: $t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$
Alternative hypothesis    Decision rule: Reject H0 if
H1: μ ≠ μ0                |t| ≥ tn-1;1-α/2
H1: μ > μ0                t ≥ tn-1;1-α
H1: μ < μ0                t ≤ -tn-1;1-α
Concept Questions
• Questions 13 – 19, page 267, textbook
• Example
– Health care is a major issue worldwide. One of the
concerns is the waiting time for patients at clinics.
– Government claims that patients will wait less than 30
minutes on average to see a doctor.
– A random sample of 25 patients revealed that their average
waiting time was 28 minutes with a standard deviation of 8
minutes.
– At a 1% level of significance can we say that the claim
from government is correct?
• Solution
– The population of interest is the waiting time at clinics
– Want to test the claim that the waiting time is less than 30
minutes
– H1 : μ < 30
– The null hypothesis must specify a single value of the
parameter
– H0 : μ = 30
– Need to test if 28 ($\bar{x}$) is significantly less than 30 (μ)
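• Solution
– H0 : μ = 30
– H1 : μ < 30
– α = 0,01
– $t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{28 - 30}{8 / \sqrt{25}} = -1,25$
– Critical value -tn-1;1-α = -t24;0,99 = -2,492
– Accept H0, since -1,25 > -2,492
[Figure: left-tailed test with α = 0,01; reject H0 to the left of -2,492, accept H0 to the right]
At α = 0,01 we cannot say that
the average waiting time at
clinics is less than 30 minutes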
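The waiting-time example can be reproduced in a few lines. This is a sketch under one stated assumption: the Python standard library has no t-distribution, so the critical value 2,492 is typed in from the t-table (24 degrees of freedom, 0,99) as quoted in the solution.

```python
from math import sqrt

# Figures from the example: sample mean, hypothesised mean, sample std dev, sample size.
x_bar, mu_0, s, n = 28, 30, 8, 25

# Table value t(24; 0.99), copied from the worked solution (not computed here).
T_CRIT = 2.492

# Test statistic for a small-sample test of one mean.
t = (x_bar - mu_0) / (s / sqrt(n))

# Left-tailed decision rule for H1: mu < mu_0.
reject = t <= -T_CRIT
print(f"t = {t:.2f}, critical value = {-T_CRIT}, reject H0: {reject}")
```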
• Hypothesis testing for Population proportion
– Sample proportion: $\hat{p} = \frac{x}{n} = \frac{\text{number of successes}}{\text{sample size}}$
– Proportion always between 0 and 1
Testing H0: p = p0 for n ≥ 30
Test statistic: $z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}}$
Alternative hypothesis    Decision rule: Reject H0 if
H1: p ≠ p0                |z| ≥ Z1-α/2
H1: p > p0                z ≥ Z1-α
H1: p < p0                z ≤ -Z1-α
• Example
– A market research company investigates the claim of a
supplier that 35% of potential buyers prefer their
brand of milk
– A survey was done in several supermarkets and it was
found that 61 of the 145 shoppers indicated that they will
buy the specific brand of milk
– Assist the research company in testing the claim of the supplier
at a 10% level of significance.
• Solution
– The population of interest is the proportion of buyers
– Want to test the claim that the proportion is 35% = 0,35
– H0 : p = 0,35
– The alternative hypothesis must specify that the proportion is not 35%
– H1 : p ≠ 0,35
– Sample proportion: $\hat{p} = \frac{x}{n} = \frac{61}{145} = 0,42$
– Need to test if 0,42 ($\hat{p}$) is significantly different from 0,35 (p)
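• Solution
– H0 : p = 0,35
– H1 : p ≠ 0,35
– α = 0,10, so α/2 = 0,05
– $z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}} = \frac{0,42 - 0,35}{\sqrt{0,35 (1 - 0,35) / 145}} \approx 1,77$
– Critical values ±Z1-α/2 = ±Z0,95 = ±1,65
– Reject H0, since 1,77 ≥ 1,65
[Figure: two-tailed test with α/2 = 0,05 in each tail; reject H0 below -1,65 or above +1,65, accept H0 in between]
At α = 0,10 we cannot say
that 35% of the clients will
prefer the brand of milk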
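The milk-brand example can be recomputed as follows. This is an illustrative sketch with variable names chosen for the example. It keeps the sample proportion unrounded (61/145), so the statistic comes out slightly larger than a hand calculation that first rounds the proportion to 0,42; the decision is the same.

```python
from math import sqrt
from statistics import NormalDist

# Figures from the example: successes, sample size, claimed proportion.
x, n, p_0 = 61, 145, 0.35
alpha = 0.10

# Sample proportion, kept unrounded.
p_hat = x / n

# Test statistic for one proportion; the standard error uses p_0 because H0 is assumed true.
z = (p_hat - p_0) / sqrt(p_0 * (1 - p_0) / n)

# Two-tailed critical value Z(1 - alpha/2).
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

reject = abs(z) >= z_crit
print(f"p_hat = {p_hat:.4f}, z = {z:.4f}, critical = ±{z_crit:.3f}, reject H0: {reject}")
```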
• Hypothesis testing for Population Variance
– Draw conclusions about variability in population
– Χ2 –distribution with (n - 1) degrees of freedom
Testing H0: σ² = σ0²
Test statistic: $\chi^2 = \frac{(n - 1) s^2}{\sigma_0^2}$
Alternative hypothesis    Decision rule: Reject H0 if
H1: σ² ≠ σ0²              Χ² ≤ Χ²n-1;α/2 or Χ² ≥ Χ²n-1;1-α/2
H1: σ² > σ0²              Χ² ≥ Χ²n-1;1-α
H1: σ² < σ0²              Χ² ≤ Χ²n-1;α
Concept Questions
• Questions 20 – 24, page 272, textbook
• Example
– The variance in the content of a 340 ml can of beer should
not be more than 10 ml².
– To test the validity of this, a sample of 25 cans of beer revealed a
variance of 12 ml².
– At a 5% level of significance can we say the variation in
the content of the cans is too large?
• Solution
– The population of interest is the variation in content
– Want to test the claim that the variance is more than 10 ml²
– H1 : σ² > 10
– The null hypothesis must specify that the variance is 10 ml²
– H0 : σ² = 10
– Need to test if 12 (s²) is significantly more than 10 (σ²)
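• Solution
– H0 : σ² = 10
– H1 : σ² > 10
– α = 0,05
– $\chi^2 = \frac{(n - 1) s^2}{\sigma_0^2} = \frac{(25 - 1) \cdot 12}{10} = 28,8$
– Critical value Χ²n-1;1-α = Χ²24;0,95 = 36,42
– Accept H0, since 28,8 < 36,42
[Figure: right-tailed Χ² test with α = 0,05; accept H0 to the left of 36,42, reject H0 to the right]
At α = 0,05 we cannot say
that the variation in the
content of the cans is more
than 10 ml²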
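The beer-can example reduces to one line of arithmetic plus a comparison. In this sketch the critical value 36,42 is copied from the Χ² table as quoted in the solution, because the standard library has no chi-square distribution; the variable names are illustrative.

```python
# Figures from the example: sample size, sample variance, hypothesised variance.
n, s2, sigma2_0 = 25, 12, 10

# Table value chi-square(24; 0.95), copied from the worked solution (not computed here).
CHI2_CRIT = 36.42

# Test statistic for one population variance.
chi2 = (n - 1) * s2 / sigma2_0

# Right-tailed decision rule for H1: sigma^2 > sigma_0^2.
reject = chi2 >= CHI2_CRIT
print(f"chi2 = {chi2:.1f}, critical value = {CHI2_CRIT}, reject H0: {reject}")
```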
• Hypothesis tests for comparing two Populations
– Difference between two means
• Independent samples (drawn from different samples; samples have no relation)
– Large samples
– Small samples
• Dependent samples (samples are related)
– Difference between two proportions
– Difference between two variances
• Hypothesis tests for comparing two Populations
– H0: Population 1 parameter = Population 2 parameter
– H1: Population 1 parameter ≠ Population 2 parameter
– H1: Population 1 parameter > Population 2 parameter
– H1: Population 1 parameter < Population 2 parameter
Population 1 parameter: μ1 / σ1² / p1
Population 2 parameter: μ2 / σ2² / p2
Difference between two Population Means,
independent samples, n1 ≥ 30 and n2 ≥ 30
Testing H0: μ1 = μ2 for n1 ≥ 30 and n2 ≥ 30
Test statistic: $z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$ (use σ1² and σ2² if known)
Alternative hypothesis    Decision rule: Reject H0 if
H1: μ1 ≠ μ2               |z| ≥ Z1-α/2
H1: μ1 > μ2               z ≥ Z1-α
H1: μ1 < μ2               z ≤ -Z1-α
Example
A leading television manufacturer purchases
cathode ray tubes from 2 businesses (A and B). A
random sample of 36 cathode ray tubes from
business A showed a mean lifetime of 7.2 years
and a std dev of 0.8 years, while a random sample
of 40 cathode ray tubes from business B showed a
mean lifetime of 6.7 years and a std dev of 0.7 yrs.
Test at a 1% significance level whether the mean
lifetime of the cathode ray tubes from business A is
longer than the mean lifetime of business B
Answer
Company A: n1 = 36, $\bar{x}_1$ = 7,2 and s1 = 0,8.
Company B: n2 = 40, $\bar{x}_2$ = 6,7 and s2 = 0,7.
H0: μ1 = μ2
H1: μ1 > μ2
α = 0,01
$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{7,2 - 6,7}{\sqrt{\frac{0,8^2}{36} + \frac{0,7^2}{40}}} = \frac{0,5}{0,1733} = 2,8852$
Z1-α = Z0,99 = 2,33
Therefore, reject H0, since 2,8852 ≥ 2,33.
There is enough evidence to say that the mean lifetime of the tubes from Company A is longer
than that of Company B.
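The cathode ray tube answer can be verified with a small helper. The function name `two_sample_z` is an assumption made for this sketch; the formula is the large-sample statistic for two independent means, and the critical value is computed with `statistics.NormalDist` (2,326, which tables round to 2,33).

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(x1: float, s1: float, n1: int, x2: float, s2: float, n2: int) -> float:
    """z statistic for the difference between two means, large independent samples."""
    return (x1 - x2) / sqrt(s1**2 / n1 + s2**2 / n2)


# Company A: n1 = 36, mean 7.2, std dev 0.8. Company B: n2 = 40, mean 6.7, std dev 0.7.
z = two_sample_z(7.2, 0.8, 36, 6.7, 0.7, 40)

# Right-tailed test at alpha = 0.01 for H1: mu1 > mu2.
z_crit = NormalDist().inv_cdf(1 - 0.01)
reject = z >= z_crit
print(f"z = {z:.4f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```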
Difference between two Population Means, independent
samples, n1 < 30 and n2 < 30
Testing H0: μ1 = μ2 for n1 < 30 and n2 < 30
Test statistic: $t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$ with $s_p = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}}$
Alternative hypothesis    Decision rule: Reject H0 if
H1: μ1 ≠ μ2               |t| ≥ tn1+n2-2;1-α/2
H1: μ1 > μ2               t ≥ tn1+n2-2;1-α
H1: μ1 < μ2               t ≤ -tn1+n2-2;1-α
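The pooled standard deviation and the small-sample t statistic can be sketched as two short functions. The slides give no worked example for this case, so the numbers in the call below are hypothetical and serve only to show the calculation; the function names are also assumptions made for this illustration.

```python
from math import sqrt


def pooled_sd(s1: float, n1: int, s2: float, n2: int) -> float:
    """Pooled standard deviation s_p with n1 + n2 - 2 degrees of freedom."""
    return sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))


def pooled_t(x1: float, s1: float, n1: int, x2: float, s2: float, n2: int) -> float:
    """t statistic for two means from small independent samples."""
    sp = pooled_sd(s1, n1, s2, n2)
    return (x1 - x2) / (sp * sqrt(1 / n1 + 1 / n2))


# Hypothetical data, not from the slides: two samples of sizes 10 and 12.
sp = pooled_sd(1.0, 10, 1.0, 12)
t = pooled_t(5.0, 1.0, 10, 4.0, 1.0, 12)
print(f"s_p = {sp:.3f}, t = {t:.4f}, degrees of freedom = {10 + 12 - 2}")
```

The resulting t is compared with the table value for n1 + n2 - 2 degrees of freedom, following the decision rules above.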
Difference between two Population Means, dependent
samples – pairs of observations
Observation       1      2      3      ...    n
Sample 1          X11    X12    X13    ...    X1n
Sample 2          X21    X22    X23    ...    X2n
Difference (d)    d1     d2     d3     ...    dn
where d1 = X11 - X21, d2 = X12 - X22, d3 = X13 - X23, ..., dn = X1n - X2n
$\bar{d} = \frac{\sum d}{n}$ and $s_d = \sqrt{\frac{\sum (d - \bar{d})^2}{n - 1}}$
Difference between two Population Means, dependent
samples
Testing H0: μ1 = μ2
Test statistic: $t = \frac{\bar{d}}{s_d / \sqrt{n}}$
Alternative hypothesis    Decision rule: Reject H0 if
H1: μ1 ≠ μ2               |t| ≥ tn-1;1-α/2
H1: μ1 > μ2               t ≥ tn-1;1-α
H1: μ1 < μ2               t ≤ -tn-1;1-α
Concept Questions
• Questions 25 – 29, page 283, textbook
Difference between two Population Proportions, large
independent samples
Testing H0: p1 = p2 for n1 ≥ 30 and n2 ≥ 30
Test statistic: $z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p} (1 - \hat{p}) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$ where $\hat{p} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}$
Alternative hypothesis    Decision rule: Reject H0 if
H1: p1 ≠ p2               |z| ≥ Z1-α/2
H1: p1 > p2               z ≥ Z1-α
H1: p1 < p2               z ≤ -Z1-α
Difference between two Population Variances, large
independent samples
Testing H0: σ1² = σ2²
Test statistic: $F = \frac{s_1^2}{s_2^2}$
Assume population 1 has the larger variance. Thus always: s1² > s2²
Alternative hypothesis    Decision rule: Reject H0 if
H1: σ1² ≠ σ2²             F ≥ Fn1-1;n2-1;α/2
H1: σ1² > σ2²             F ≥ Fn1-1;n2-1;α
• Example
– There is a belief that people staying in Cape Town
travel less than people staying in Johannesburg.
– Random samples of 43 people in Cape Town and 39
in Johannesburg were drawn.
– For each person the distance travelled during
October was recorded.
– Test the belief at a 5% level of significance
• Solution
– The population of interest is the km travelled
– Samples are large and independent
– We want to show that people in Cape Town travel less than
people in Johannesburg
– H1 : μ1 < μ2
– The null hypothesis must specify a single value of the
parameter
– H0 : μ1 = μ2
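• From the data:
Cape Town: $\bar{x}_1$ = 604, n1 = 43, s1 = 64
Johannesburg: $\bar{x}_2$ = 633, n2 = 39, s2 = 103
– H0: μ1 = μ2
– H1: μ1 < μ2
– $z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{604 - 633}{\sqrt{\frac{64^2}{43} + \frac{103^2}{39}}} = -1,51$
– Critical value -Z1-α = -Z0,95 = -1,65
– Accept H0, since -1,51 > -1,65
[Figure: left-tailed test with α = 0,05; reject H0 to the left of -1,65, accept H0 to the right]
The belief that people staying in
Cape Town travel less than people
staying in Johannesburg is not supported
at a 5% level of significance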
Example
• Pathology laboratories have a problem with the time it takes a blood sample to be analyzed. They hope that by introducing some new equipment, the time taken will be reduced.
– Blood samples for 10 different types of test were analyzed by the traditional laboratories and by the newly equipped laboratories.
– The time, in minutes, was captured for each test.
– Did the time reduce? α = 0,10
• Solution
– The population of interest is the test times
– The samples are dependent
– We want to show that the new times are less than the old times
– H1 : μ1 > μ2
– The null hypothesis must specify a single value of the
parameter
– H0 : μ1 = μ2
- Calculate the difference (existing lab minus new lab) for each blood sample
- Calculate the average of the differences and the standard deviation of the differences
Blood sample    Existing lab    New lab    Difference
1               47              70         -23
2               65              83         -18
3               59              78         -19
4               61              46         15
5               75              74         1
6               65              56         9
7               73              74         -1
8               85              52         33
9               97              99         -2
10              84              57         27
$\bar{d} = 2,2$ and $s_d = 19,14$
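- The hypothesis test for this problem is
H0: μ1 = μ2
H1: μ1 > μ2
- The statistic is $t = \frac{\bar{d}}{s_d / \sqrt{n}} = \frac{2,2}{19,14 / \sqrt{10}} = 0,36$
- Critical value tn-1;1-α = t9;0,90 = 1,383 (α = 0,10)
- Accept H0, since 0,36 < 1,383
[Figure: right-tailed test with α = 0,10; accept H0 to the left of 1,383, reject H0 to the right]
Using α = 0,10, we cannot say that introducing the new equipment reduced the time taken.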
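The paired-sample calculation can be reproduced from the ten blood-sample times in the table. This sketch uses the standard library's `mean` and `stdev` (sample standard deviation, dividing by n - 1); the critical value 1,383 is the t-table value for 9 degrees of freedom at 0,90, typed in as a constant because the standard library has no t-distribution.

```python
from math import sqrt
from statistics import mean, stdev

# Times in minutes from the table: existing lab and new lab, samples 1 to 10.
existing = [47, 65, 59, 61, 75, 65, 73, 85, 97, 84]
new = [70, 83, 78, 46, 74, 56, 74, 52, 99, 57]

# Paired differences: existing minus new.
d = [e - w for e, w in zip(existing, new)]

d_bar = mean(d)   # average difference
s_d = stdev(d)    # sample standard deviation of the differences

# Test statistic with n - 1 = 9 degrees of freedom.
t = d_bar / (s_d / sqrt(len(d)))

T_CRIT = 1.383    # table value t(9; 0.90), not computed here
reject = t >= T_CRIT
print(f"d_bar = {d_bar:.1f}, s_d = {s_d:.2f}, t = {t:.2f}, reject H0: {reject}")
```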
• Example
– A clothing manufacturer introduced two new swimsuit ranges
on the market.
– Of the 266 clients asked if they will wear range A, 85 indicated
they will.
– Of the 192 clients asked if they will wear range B, 50 indicated
they will.
– Can we say there is a difference in the preferences for the two
ranges? Use α = 0,05.
• Solution
– The population of interest is the proportion of clients who
will wear the clothing
– We want to determine if the proportion for range A differs from
the proportion for range B
– H1: p1 ≠ p2
– The null hypothesis must specify a single value of the
parameter
– H0: p1 = p2
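• From the data:
– Range A: $\hat{p}_1 = \frac{85}{266} = 0,32$; Range B: $\hat{p}_2 = \frac{50}{192} = 0,26$
– $\hat{p} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2} = \frac{266(0,32) + 192(0,26)}{266 + 192} = 0,29$
– H0: p1 = p2
– H1: p1 ≠ p2
– $z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p} (1 - \hat{p}) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} = \frac{0,32 - 0,26}{\sqrt{0,29 (1 - 0,29) \left( \frac{1}{266} + \frac{1}{192} \right)}} \approx 1,40$
– Critical values ±Z1-α/2 = ±Z0,975 = ±1,96
– Accept H0, since 1,40 < 1,96
[Figure: two-tailed test with α = 0,05; reject H0 below -1,96 or above +1,96, accept H0 in between]
At α = 0,05 we cannot say that there is a difference
in the preferences for the two ranges.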
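The swimsuit example can be recomputed from the raw counts. This is an illustrative sketch with variable names chosen for the example. It keeps the proportions unrounded, so z is slightly smaller than a hand calculation with inputs rounded to two decimals; either way |z| stays below the critical value.

```python
from math import sqrt
from statistics import NormalDist

# Counts from the example: clients who would wear each range, and sample sizes.
x1, n1 = 85, 266   # range A
x2, n2 = 50, 192   # range B
alpha = 0.05

p1_hat = x1 / n1
p2_hat = x2 / n2

# Pooled proportion; (n1*p1_hat + n2*p2_hat) / (n1 + n2) equals (x1 + x2) / (n1 + n2).
p_pool = (x1 + x2) / (n1 + n2)

# Test statistic for the difference between two proportions.
z = (p1_hat - p2_hat) / sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

# Two-tailed critical value Z(1 - alpha/2).
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject = abs(z) >= z_crit
print(f"z = {z:.4f}, critical = ±{z_crit:.3f}, reject H0: {reject}")
```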
• Example
– An important measure to determine service delivery in the banking sector is the variability in the service times.
– An experiment was conducted to compare the service times of two bank tellers.
– The results from the experiment:
• Teller A: nA = 18 and s²A = 4,03
• Teller B: nB = 26 and s²B = 9,49
– Can we say that the variance in service time of teller A is less than the variance of teller B at a 5% level of significance?
– We will then test if the variance in service time of teller B is more than the variance of teller A: s²B > s²A
Remember:
Population 1 has the larger variance.
Thus always: s1² > s2²
• Solution
– The population of interest is the variation in the service time
of the two bank tellers.
– We want to determine if the variation of teller B (population 1) is more than
the variation of teller A (population 2).
– H1: σ1² > σ2²
– The null hypothesis must specify a single value of the
parameter
– H0: σ1² = σ2²
• From the data:
– The results from the experiment:
• Teller A: nA = 18 and s²A = 4,03
• Teller B: nB = 26 and s²B = 9,49
– H0: σ1² = σ2²
– H1: σ1² > σ2²
– $F = \frac{s_1^2}{s_2^2} = \frac{9,49}{4,03} = 2,35$
– Critical value F26-1;18-1;0,05 = 2,18
– Reject H0, since 2,35 ≥ 2,18
[Figure: right-tailed F test; accept H0 to the left of 2,18, reject H0 to the right]
The variation of teller B is more
than the variation of teller A
at a 5% level of
significance.
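The bank-teller comparison can be checked with a short calculation. In this sketch the larger sample variance is placed in the numerator, following the convention that population 1 has the larger variance. The critical value 2,18 is copied from the F-table as quoted in the solution, since the standard library has no F-distribution; the variable names are illustrative.

```python
# Sample variances and sizes from the example.
s2_a, n_a = 4.03, 18   # teller A
s2_b, n_b = 9.49, 26   # teller B

# Table value F(25; 17; 0.05), copied from the worked solution (not computed here).
F_CRIT = 2.18

# Population 1 is the one with the larger sample variance (teller B here).
s2_1, s2_2 = max(s2_a, s2_b), min(s2_a, s2_b)
F = s2_1 / s2_2

# Right-tailed decision rule for H1: sigma1^2 > sigma2^2.
reject = F >= F_CRIT
print(f"F = {F:.2f}, critical value = {F_CRIT}, reject H0: {reject}")
```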
Concept Questions
• Questions 30 – 31, page 289, textbook
HOMEWORK
• Self review test, p289
• Izmivo
• Revision test
• Supplementary Questions p292 – 298