1
Schaum’s Outline Probability and Statistics
Chapter 7
HYPOTHESIS TESTING
presented by Professor Carol Dahl
Examples by Alfred Aird, Kira Jeffery, Catherine Keske, Hermann Logsend, Yris Olaya
2
Outline of Topics
Topics Covered
Statistical Decisions
Statistical Hypotheses
Null Hypotheses
Tests of Hypotheses
Type I and Type II Errors
Level of Significance
Tests Involving the Normal Distribution
One and Two – Tailed Tests
P – Value
3
Outline of Topics (Continued)
Special Tests of Significance
Large Samples
Small Samples
Estimation Theory/Hypotheses Testing Relationship
Operating Characteristic Curves and Power of a Test
Fitting Theoretical Distributions to Sample Frequency Distributions
Chi-Square Test for Goodness of Fit
4
“The Truth Is Out There”
The Importance of Hypothesis Testing
Hypothesis testing
helps evaluate models based upon real data
enables one to build a statistical model
enhances your credibility as
analyst
economist
5
Statistical Decisions
Innocent until proven guilty principle
Want to prove someone is guilty
Assume the opposite or status quo - innocent
Ho: Innocent
H1: Guilty
Take subsample of possible information
If evidence not consistent with innocence - reject Ho
If Ho not rejected, person is pronounced not guilty rather than innocent
6
Statistical Decisions
Status quo innocence = null hypothesis
Evidence = sample result
Reasonable doubt = confidence level
7
Statistical Decisions
E.g. Tantalum ore deposit
feasible if quality > 0.0600 g/kg with 99% confidence
100 samples collected from large deposit at random.
Sample distribution
mean of 0.071 g/kg
variance of 0.0025, i.e. standard deviation of 0.05 g/kg
8
Statistical Decisions
Should the deposit be developed?
Evidence = 0.071 (sample mean)
Reasonable doubt = 99%
Status quo = do not develop the deposit
Ho: µ ≤ 0.0600
H1: µ > 0.0600
9
Statistical Hypothesis
General Principles
Inferences about population using sample statistic
Prove A is true by assuming it isn’t true
Results of experiment (sample) compared with model
If sample results are unlikely under the model, reject model
If sample results are explained by the model, do not reject
10
Statistical Hypothesis
Event A fairly likely, model would be retained
Event B unlikely, model would be rejected
[Figure: distribution of z centered at 0, with areas A and B marked]
11
Statistical Decisions
Should the deposit be developed?
Evidence = 0.071 (sample mean)
Reasonable doubt = 99%
Status quo = do not develop the deposit
Ho: µ = 0.0600
H1: µ > 0.0600
How likely is Ho given X̄ = 0.071?
12
Need Sampling Statistic
Need statistic with
population parameter
estimate for population parameter
its distribution
13
Need Sampling Statistic
Population Normal - Two Choices
Small Sample < 30
Known Variance: $\frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \sim N(0,1)$
Unknown Variance: $\frac{\bar{X}-\mu}{\hat{s}/\sqrt{n}} \sim t_{n-1}$
14
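Need Sampling Statistic
Population Not-Normal
Large Sample
Known Variance: $\frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \sim N(0,1)$
Unknown Variance: $\frac{\bar{X}-\mu}{\hat{s}/\sqrt{n}} \sim N(0,1)$
Doesn’t matter whether the variance is known or not
If population is finite and sampling is without replacement, an adjustment is needed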
15
Normal Distribution
X ~ N(0,1)
µ = 0
Within 1 SD: 68%
Within 2 SD: 95%
Within 3 SD: 99.7%
16
Statistical Decisions
Should the deposit be developed?
Evidence: 0.071 g/kg (sample mean)
0.0025 (sample variance)
0.05 g/kg (sample standard deviation)
Reasonable doubt = 99%
Status quo = do not develop the deposit
Ho: µ = 0.0600
H1: µ > 0.0600
One tailed test
How likely is Ho given X̄ = 0.071?
17
Hypothesis test
Evidence: 0.071 g/kg (sample mean)
0.05 g/kg (sample standard deviation)
Reasonable doubt = 99%
Status quo = do not develop the deposit
Ho: µ = 0.0600
H1: µ > 0.0600
$P\left(\frac{\bar{X}-\mu}{\hat{s}/\sqrt{n}} \le Z_c\right) = 0.99 = 1-\alpha$
18
Statistical Hypothesis
E.g. $Z = \frac{0.071 - 0.0600}{0.05/\sqrt{100}} = 2.2$
Conclusion: Don’t reject Ho, don’t develop deposit
2.2 < Zc = 2.33
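The tantalum decision can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the function name `one_tailed_z_test` is hypothetical, and the inputs are the sample mean, standard deviation, sample size and 99% confidence level quoted on the slides.

```python
from math import sqrt
from statistics import NormalDist

def one_tailed_z_test(x_bar, mu0, s, n, alpha):
    """Upper-tail z test: reject Ho when z exceeds the critical value."""
    z = (x_bar - mu0) / (s / sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha)  # standard normal quantile
    return z, z_crit, z > z_crit

# Values from the tantalum example: mean 0.071, sd 0.05, n = 100, alpha = 0.01
z, z_crit, reject = one_tailed_z_test(x_bar=0.071, mu0=0.0600, s=0.05, n=100, alpha=0.01)
print(f"z = {z:.2f}, Zc = {z_crit:.2f}, reject Ho: {reject}")
```

Because z stays below the critical value, the sketch reaches the same decision as the slide: do not reject Ho.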
19
Null Hypothesis
Hypotheses cannot be proven
reject or fail to reject
based on likelihood of event occurring
null hypothesis is not accepted
20
Test of Hypotheses Maple Creek Mine and
Potaro Diamond field in Guyana
Mine potential for producing large diamonds
Experts want to know true mean carat size produced
True mean said to be 4 carats
Experts want to know if true with 95% confidence
Random sample taken
Sample mean found to be 3.6 carats
Based on sample, is 4 carats true mean for mine?
21
Tests of Hypotheses
Tests referred to as:
“Tests of Hypotheses”
“Tests of Significance”
“Rules of Decision”
22
Types of Errors
Ho: µ = 4 (Suppose this is true)
H1: µ ≠ 4
Two tailed test
Choose α = 0.05
Sample n = 100 (assume X is normal), σ = 1
$P\left(-1.96 \le \frac{\bar{X}-4}{\sigma/\sqrt{n}} \le 1.96\right) = 0.95 = 1-\alpha$
23
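Type I error (α) – reject true
Ho: µ = 4 suppose true
$P\left(-1.96 \le \frac{\bar{X}-4}{\sigma/\sqrt{n}} \le 1.96\right) = 0.95 = 1-\alpha$
[Figure: rejection region of area α/2 in each tail]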
24
Type II Error (ß) - Fail to Reject False
Ho: µ = 4 not true
µ = 6 true
X̄ − 4 has mean 2, not mean 0
[Figure: sampling distributions centered at µ = 4 (0) and µ = 6 (2), with area ß marked]
25
Lower Type I
What happens to Type II?
Ho: µ = 4 not true
µ = 6 true
[Figure: sampling distributions centered at µ = 4 (0) and µ = 6 (2), with area ß marked]
26
Higher µ
What happens to Type II?
Ho: µ = 4 not true
µ = 7 true
X̄ − 4 has mean 3, not mean 0
[Figure: sampling distributions centered at µ = 4 (0) and µ = 7 (3), with area ß marked]
27
                    Ho True             Ho False
Reject Ho           Type I Error        Correct Decision
Do Not Reject Ho    Correct Decision    Type II Error
α = P(Type I Error)
ß = P(Type II Error)
Type I and Type II Errors
Two types of errors can occur in hypothesis testing
To reduce errors, increase sample size when possible
28
To Reduce Errors
Increase sample size when possible
[Figure: sampling distributions of the mean for different sample sizes (population, n = 5, 10, 20)]
29
Error Examples
Type I Error – rejecting a true null hypothesis
Convicting an innocent person
Rejecting that the true mean carat size is 4 when it is
Type II Error – not rejecting a false null hypothesis
Setting a guilty person free
Not rejecting that the mean carat size is 4 when it’s not
30
Level of Significance (α)
α = max probability we’re willing to risk Type I Error
= tail area of probability density function
If Type I Error’s “cost” high, choose α low
α defined before hypothesis test conducted
α typically defined as 0.10, 0.05 or 0.01
α = 0.10 for 90% confidence of correct test decision
α = 0.05 for 95% confidence of correct test decision
α = 0.01 for 99% confidence of correct test decision
31
Diamond Hypothesis Test Example
Ho: µ = 4
H1: µ ≠ 4
Choose α = 0.01 for 99% confidence
Sample n = 100, σ = 1
X̄ = 3.6, -Zc = -2.575, Zc = 2.575
[Figure: two-tailed rejection regions beyond -2.575 and 2.575, area .005 in each tail]
32
Example Continued
$z = \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} = -2$
z = -2 is not below -Zc = -2.575: observed not “significantly” different from expected
Fail to reject null hypothesis
At the 1% level of significance, cannot reject that the true mean is 4 carats
33
Tests Involving the t Distribution
Billy Ray has inherited large, 25,000 acre homestead
Located on outskirts of Murfreesboro, Arkansas, near:
Crater of Diamonds State Park
Prairie Creek Volcanic Pipe
Land now used for
agricultural
recreational
No official mining has taken place
34
Case Study in Statistical Analysis Billy Ray’s Inheritance
Billy Ray must now decide upon land usage
Options:
Exploration for diamonds
Conservation
Land biodiversity and recreation
Agriculture and recreation
Land development
35
Consider Costs and Benefits of Mining
Cost and Benefits of Mining
Opportunity cost
Excessive diamond exploration damages land’s value
Exploration and Mining Costs
Benefit
Value of mineral produced
36
Consider Costs and Benefits of Mining
Cost and Benefits of Mining
Sample for geologic indicators for diamonds
kimberlite or lamproite
larger sample more likely to represent “true population”
larger sample will cost more
37
How to decide: one tailed or two tailed?
One tailed test
Do we change status quo only if it’s bigger than null?
Do we change status quo only if it’s smaller than null?
Two tailed test
Change status quo if it’s bigger or if it’s smaller
38
Tests of Mean
Normal or t?
population normal, known variance, small sample: Normal
population normal, unknown variance, small sample: t
large sample: Normal
39
Difference Normal and t
[Figure: normal and t densities plotted from -5 to 5]
t “fatter” tail than normal bell-curve
40
Hypothesis and Sample
Need at least 30 g/m³ to mine
Null hypothesis Ho: µ = 30
Alternative hypothesis H1: ?
Sample data: n = 16 (holes drilled)
X close to normal
X̄ = 31 g/m³
variance of the sample mean (ŝ²/n) = 0.268
41
Normal or t?
One tailed
Null hypothesis Ho: µ = 30
Alternative hypothesis H1: µ > 30
Sample data: n = 16 (holes drilled)
X̄ = 31 g/m³
variance ŝ² = 4.29
standard deviation ŝ = 2.07 g/m³
small sample, estimated variance, X close to normal
not exactly t but close if X close to normal
42
Tests Involving the t Distribution
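$t_{n-1} = \frac{\bar{X}-\mu}{\hat{s}/\sqrt{n}}$
[Figure: t distribution with 16 − 1 = 15 degrees of freedom centered at 0; reject in the upper 5% tail beyond tc = 1.75]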
43
Tests Involving the t Distribution
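$t_{n-1} = \frac{\bar{X}-\mu}{\hat{s}/\sqrt{n}} = \frac{31-30}{2.07/\sqrt{16}} = 1.93$
[Figure: t distribution with 16 − 1 = 15 degrees of freedom centered at 0; reject in the upper 5% tail beyond tc = 1.75]
1.93 > tc = 1.75: reject Ho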
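The drill-hole test statistic is easy to check numerically. A minimal sketch, assuming the critical value tc = 1.75 is simply read from the slide (the Python standard library has no t quantile function); variable names are illustrative only.

```python
from math import sqrt

# Sample values from the drill-hole example
x_bar, mu0, s_hat, n = 31.0, 30.0, 2.07, 16
t_crit = 1.75  # one-tailed 5% critical value for 15 df, as quoted on the slide

t_stat = (x_bar - mu0) / (s_hat / sqrt(n))
reject = t_stat > t_crit
print(f"t = {t_stat:.2f}, tc = {t_crit}, reject Ho: {reject}")
```

The statistic lands in the upper 5% tail, so Ho: µ = 30 is rejected in favor of µ > 30.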
44
Wells produce oil
X= API Gravity
approximate normal with mean 37
periodically test to see if the mean has changed
too heavy or too light revise contract
Ho:
H1:
Sample of 9 wells, X̄ = 38, ŝ = 2
What is test statistic?
Normal or t?
45
Two tailed t test on mean
$t_{n-1} = \frac{\bar{X}-\mu}{\hat{s}/\sqrt{n}}$
[Figure: t distribution centered at 0; reject in each tail of area α/2, beyond -tc and tc]
46
Two tailed t test on mean
Ho: µ = 37
H1: µ ≠ 37
Sample of 9 wells, X̄ = 38, ŝ = 2, α = 10%
$t_{n-1} = \frac{\bar{X}-\mu}{\hat{s}/\sqrt{n}} = \frac{38-37}{2/\sqrt{9}} = 1.5$
47
P-values - one tailed test
Level of significance for a sample statistic under null
Smallest α for which the statistic would reject null
$t_{16-1} = \frac{\bar{X}-\mu}{\hat{s}/\sqrt{n}} = \frac{31-30}{2.07/\sqrt{16}} = 1.93$
TDIST(1.93,15,1)
P ≈ 0.04
48
P-value two tailed test
Ho: µ = 37
H1: µ ≠ 37
Sample of 9 wells, X̄ = 38, ŝ = 2, α = 10%
$t_{n-1} = \frac{\bar{X}-\mu}{\hat{s}/\sqrt{n}} = \frac{38-37}{2/\sqrt{9}} = 1.5$
TDIST(1.5,8,2) = 0.172
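The spreadsheet p-values can be approximated without a statistics package. The sketch below is one possible approach, not the method used in the slides: it integrates the Student t density with Simpson's rule. The helper names `t_pdf` and `t_upper_tail` are hypothetical.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3  # half the mass lies above 0 by symmetry

p_one_tailed = t_upper_tail(1.93, 15)    # one-tailed example: t = 1.93, 15 df
p_two_tailed = 2 * t_upper_tail(1.5, 8)  # two-tailed example: t = 1.5, 8 df
print(f"one-tailed p = {p_one_tailed:.3f}, two-tailed p = {p_two_tailed:.3f}")
```

With α = 10%, the two-tailed p-value exceeds α, so the API gravity test fails to reject Ho; the one-tailed p-value falls below 5%, matching the rejection in the drill-hole example.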
49
Formal Representation of p-Values
p-Value < α: Reject Ho
p-Value > α: Fail to reject Ho
50
More tests
Survey: - Ranking refinery managers
Daily refinery production
Sample two refineries: 40 and 35 observations, production in 1000 b/cd
First refinery: mean = 74, stand. dev. = 8
Second refinery: mean = 78, stand. dev. = 7
Questions: difference of means?
variances?
differences of variances
Again Statistics Can Help!!!!
51
Differences of Means
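Ho: µ1 - µ2 = 0
H1: µ1 - µ2 ≠ 0
X1 and X2 normal, known variance
or large sample known variance
α = 10%
$Z = \frac{\bar{X}_1-\bar{X}_2}{\sqrt{\sigma_1^2/n_1+\sigma_2^2/n_2}}$
[Figure: standard normal; reject in each 5% tail, beyond -Zc and Zc]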
52
Differences of Means
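Ho: µ1 - µ2 = 0
H1: µ1 - µ2 ≠ 0
n1 = 40, n2 = 35
X̄1 = 74, σ1 = 8
X̄2 = 78, σ2 = 7
$Z = \frac{\bar{X}_1-\bar{X}_2}{\sqrt{\sigma_1^2/n_1+\sigma_2^2/n_2}} = \frac{74-78}{\sqrt{8^2/40+7^2/35}} = -2.31$
-Zc = -1.645, Zc = 1.645 (5% in each tail)
-2.31 < -1.645: reject Ho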
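The refinery comparison can be recomputed directly from the summary statistics. A minimal sketch with the standard library, treating the two standard deviations as known, as the slide assumes; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Refinery summary statistics from the slides
n1, x1, sigma1 = 40, 74.0, 8.0
n2, x2, sigma2 = 35, 78.0, 7.0
alpha = 0.10

z = (x1 - x2) / sqrt(sigma1**2 / n1 + sigma2**2 / n2)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
reject = abs(z) > z_crit
print(f"z = {z:.3f}, Zc = {z_crit:.3f}, reject Ho: {reject}")
```

The two variance terms are 1.6 and 1.4, so the standard error is √3 and |z| exceeds 1.645.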
53
Difference of Means
X normal
Unknown but equal variances
Do above test with
$t_{n_1+n_2-2} = \frac{\bar{X}_1-\bar{X}_2}{\sqrt{\frac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2}\cdot\frac{n_1+n_2}{n_1 n_2}}}$
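As an illustration, the pooled-variance statistic can be evaluated on the refinery figures. This sketch assumes, for the sake of the example only, that the two quoted standard deviations are sample estimates with a common population variance; the slides do not carry out this calculation.

```python
from math import sqrt

# Refinery summary statistics, here treated as sample estimates (assumption)
n1, x1, s1 = 40, 74.0, 8.0
n2, x2, s2 = 35, 78.0, 7.0

df = n1 + n2 - 2
pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
# (1/n1 + 1/n2) equals (n1 + n2) / (n1 * n2), the factor in the formula above
t_stat = (x1 - x2) / sqrt(pooled_var * (1 / n1 + 1 / n2))
print(f"pooled variance = {pooled_var:.2f}, t = {t_stat:.3f} with {df} df")
```

The statistic would then be compared with a t critical value for 73 degrees of freedom from a table or spreadsheet.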
54
Variance test (χ² distribution)
$\chi^2_{n-1} = \frac{(n-1)S^2}{\sigma^2}$
Two tailed
[Figure: χ² density with rejection regions of area α/2 in each tail]
55
Variance test (χ² distribution)
$\chi^2_{n-1} = \frac{(n-1)S^2}{\sigma^2}$
One tailed
56
Hypothesis Test on Variance
Suppose best practice in refinery σ = 6
Does refinery 2 have different variability than best practice?
Ho: σ² = 6²
H1: σ² ≠ 6²
Example: 2nd refinery, n – 1 = 34, Standard deviation = 7
$P\left(\chi^2_{c1} \le \frac{(n-1)S^2}{\sigma^2} \le \chi^2_{c2}\right) = 1-\alpha$
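57
Hypothesis Test on Variance
Ho: σ² = 6²
H1: σ² ≠ 6²
Example: 2nd refinery, n – 1 = 34, Standard deviation = 7
α = 10% (α/2 in each tail)
$\chi^2 = \frac{(n-1)S^2}{\sigma^2} = \frac{(35-1)\,7^2}{6^2} = 46.278$
$P\left(\chi^2_{c1} \le \frac{(n-1)S^2}{\sigma^2} \le \chi^2_{c2}\right) = 1-\alpha$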
58
Hypothesis Test on Variance
Suppose best practice in refinery σ = 6
Ho: σ² = 6²
H1: σ² ≠ 6²
Example: 2nd refinery, n – 1 = 34, Standard deviation = 7
Critical values (α/2 = 0.05 in each tail): chiinv(0.95,34) = 21.664, chiinv(0.05,34) = 48.602
59
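Variance test (χ² distribution)
$\chi^2 = \frac{(n-1)S^2}{\sigma^2} = 46.278$
Two tailed
[Figure: χ² density with rejection area 0.05 in each tail, below 21.664 and above 48.602]
21.664 < 46.278 < 48.602: fail to reject Ho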
60
Variance test (χ² distribution)
More variance than best practice
One tailed
[Figure: χ² density with rejection area 0.10 in the upper tail]
Ho: σ² = 6²
H1: σ² > 6²
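61
Variance test (χ² distribution)
More variance than best practice
One tailed
[Figure: χ² density with rejection area 0.10 in the upper tail]
Ho: σ² = 6²
H1: σ² > 6²
$\chi^2 = \frac{(n-1)S^2}{\sigma^2} = 46.278$
chiinv(0.10,34) = 44.903
46.278 > 44.903: reject Ho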
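The variance test reduces to one line of arithmetic plus a comparison. In this sketch the hypothesized standard deviation is taken as 6, the value that reproduces the statistic 46.278 on the slides, and the critical values are the spreadsheet results quoted there rather than computed here.

```python
# Second refinery: n = 35 observations, sample standard deviation 7
n, s, sigma0 = 35, 7.0, 6.0  # sigma0 = 6 reproduces the slides' 46.278

chi2 = (n - 1) * s**2 / sigma0**2

# Critical values quoted on the slides for 34 degrees of freedom
lower, upper = 21.664, 48.602  # two tailed, 0.05 in each tail
upper_one_tailed = 44.903      # one tailed, 0.10 in the upper tail

reject_two_tailed = chi2 < lower or chi2 > upper
reject_one_tailed = chi2 > upper_one_tailed
print(f"chi2 = {chi2:.3f}, two tailed reject: {reject_two_tailed}, "
      f"one tailed reject: {reject_one_tailed}")
```

The same statistic gives different decisions: the one-tailed test puts all 10% in the upper tail, so its cutoff is lower.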
62
Testing if Variances the Same F Distribution
2 samples of size n1 and n2
sample variances: ŝ1², ŝ2²
Ho: σ1² = σ2² => Ho: σ2²/σ1² = 1
H1: σ1² ≠ σ2² => H1: σ2²/σ1² ≠ 1
$\frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} = \frac{S_1^2}{S_2^2}\cdot\frac{\sigma_2^2}{\sigma_1^2}$ is $F_{n_1-1,\,n_2-1}$
63
Testing if Variances the Same F Distribution
Ho: σ1²/σ2² = 1
H1: σ1²/σ2² ≠ 1
$F = \frac{S_1^2}{S_2^2}$
Two tailed
[Figure: F density with rejection regions of area α/2 in each tail]
64
Testing if Variances the Same F Distribution
Ho: σ1²/σ2² = 1
H1: σ1²/σ2² > 1
$F = \frac{S_1^2}{S_2^2}$
One tailed
α = 10%
65
Example Testing if Variances the Same
2 samples of size n1 = 40 and n2 = 35
sample variances: ŝ1² = 8², ŝ2² = 7²
Ho: σ1²/σ2² = 1
H1: σ1²/σ2² ≠ 1
$P\left(\mathrm{Finv}(0.95,39,34) \le \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \le \mathrm{Finv}(0.05,39,34)\right) = 1 - 0.10$
Critical values: [0.579, 1.749]
8²/7² = 1.306
66
Testing if Variances the Same F Distribution
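Ho: σ1²/σ2² = 1
H1: σ1²/σ2² ≠ 1
$F = \frac{S_1^2}{S_2^2} = 1.306$
Two tailed
[Figure: F density with rejection area 0.05 in each tail]
Finv(0.95,39,34) = 0.579, Finv(0.05,39,34) = 1.749
0.579 < 1.306 < 1.749: fail to reject Ho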
67
Testing if Variances the Same F Distribution
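Ho: σ1²/σ2² = 1
H1: σ1²/σ2² > 1
$F = \frac{S_1^2}{S_2^2} = 1.306$
One tailed
[Figure: F density with rejection area 0.10 in the upper tail]
Finv(0.10,39,34) = 1.544
1.306 < 1.544: fail to reject Ho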
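The F comparison for the two refineries can be sketched the same way. The critical values below are the Finv results quoted on the slides (39 and 34 degrees of freedom), not values computed by this code; variable names are illustrative.

```python
# Sample variances of the two refineries (n1 = 40, n2 = 35)
s1_sq, s2_sq = 8.0**2, 7.0**2

f_stat = s1_sq / s2_sq  # under Ho the population variance ratio is 1

# Critical values quoted on the slides
lower, upper = 0.579, 1.749  # two tailed, 0.05 in each tail
upper_one_tailed = 1.544     # one tailed, 0.10 in the upper tail

reject_two_tailed = f_stat < lower or f_stat > upper
reject_one_tailed = f_stat > upper_one_tailed
print(f"F = {f_stat:.3f}, two tailed reject: {reject_two_tailed}, "
      f"one tailed reject: {reject_one_tailed}")
```

Neither version of the test finds evidence that the refinery variances differ.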
68
Power of a test
Type II error:
ß = P(Fail to reject Ho | H1 is true)
Power = 1 - ß
[Figure: sampling distributions centered at µ = 4 (0) and µ = 6 (2)]
70
Power of a test
Researcher controls level of significance, α
Increase α: what happens to ß?
71
Raise Type I (α)
What happens to Type II (ß)?
Ho: µ = 4 not true
µ = 6 true
X̄ − 4 has mean 2, not mean 0
[Figure: sampling distributions centered at µ = 4 (0) and µ = 6 (2), with area ß marked]
72
Higher α
What happens to Type II?
[Figure: sampling distributions centered at µ = 4 (0) and µ = 6 (2), with area ß marked]
Increase α, reduce ß
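The trade-off between α and ß can be tabulated for a two-tailed z test. This is a sketch under a stated assumption: the standardized statistic has mean 2 under the alternative (the 0 versus 2 in the figures) and unit standard error, which the slides do not specify. The function name `beta_two_tailed` is hypothetical.

```python
from statistics import NormalDist

def beta_two_tailed(shift, alpha):
    """Type II error when the standardized statistic has mean `shift` instead of 0."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Probability that the statistic still falls inside [-z_crit, z_crit]
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)

shift = 2.0  # assumed: mean of the statistic under the alternative
results = {alpha: beta_two_tailed(shift, alpha) for alpha in (0.01, 0.05, 0.10)}
for alpha, beta in results.items():
    print(f"alpha = {alpha:.2f}: beta = {beta:.3f}, power = {1 - beta:.3f}")
```

Each increase in α narrows the non-rejection region, so ß falls and power rises.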
73
Operating Characteristic Curve
[Figure: distributions under H0 (µ = µ0) and H1 (µ = µ1), plotted from -10 to 10, with ß and Zβ marked]
Can graph ß against µ
called operating characteristic curve
useful in experimental design
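Points on an operating characteristic curve can be generated by evaluating ß over a range of true means. A minimal sketch for an upper-tailed z test; µ0 = 4 echoes the diamond example, while the standard error of 1 and the grid of true means are illustrative assumptions. The function name `oc_curve` is hypothetical.

```python
from statistics import NormalDist

def oc_curve(mu0, std_error, alpha, true_means):
    """Return (mu, beta) pairs for an upper-tailed z test of Ho: mu = mu0."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    # Under a true mean mu the statistic is centered at (mu - mu0) / std_error
    return [(mu, std_normal.cdf(z_crit - (mu - mu0) / std_error)) for mu in true_means]

points = oc_curve(mu0=4.0, std_error=1.0, alpha=0.05, true_means=[4.5, 5.0, 6.0, 7.0])
for mu, beta in points:
    print(f"true mean = {mu}: beta = {beta:.3f}")
```

ß shrinks as the true mean moves away from µ0, which is why the curve helps choose a sample size that detects a difference of practical importance.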
74
Operating Characteristic Curve
[Figure: two panels plotted from -10 to 10, each showing distributions under H0 (µ = µ0) and H1 with ß and Zβ marked; one panel for µ = µ2, the other for µ = µ1]
75
Fitting a probability distribution
Is electricity demand a log-normal distribution?
Observed Mean: 18.42
Observed Variance: 43
Observations: 20
9.8261 13.2253 30.2449 9.2554
20.8787 20.2954 14.182 23.3099
35.6834 18.1785 20.275 17.2652
13.1139 24.3539 17.243 21.9764
15.9879 16.4685 12.8461 13.9045
76
Fitting a probability distribution
Does electricity demand follow a normal distribution?
9.8261 13.2253 30.2449 9.2554
20.8787 20.2954 14.182 23.3099
35.6834 18.1785 20.275 17.2652
13.1139 24.3539 17.243 21.9764
15.9879 16.4685 12.8461 13.9045
Observed Mean: 18.42
Observed Variance: 43
Observations : 20
77
You can test your model graphically:
1. Order observations from smallest Y1 to largest Yn
2. Compute cumulative frequency distribution
3. Plot ordered observations versus Pi on special probability sheet
4. If straight line within critical range, can’t reject normal
78
You can test your model graphically:
Yi     Pi     Yi     Pi
9.26   0.05   17.27  0.55
9.83   0.10   18.18  0.60
12.85  0.15   20.28  0.65
13.11  0.20   20.30  0.70
13.23  0.25   20.88  0.75
13.90  0.30   21.98  0.80
14.18  0.35   23.31  0.85
15.99  0.40   24.35  0.90
16.47  0.45   30.24  0.95
17.24  0.50   35.68  1.00
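The ordered observations and their cumulative frequencies follow directly from the raw data. A minimal sketch, assuming the plotting position is Pi = i/n, which matches the 0.05 steps in the table for n = 20; variable names are illustrative.

```python
# Electricity demand observations from the slides
demand = [
    9.8261, 13.2253, 30.2449, 9.2554,
    20.8787, 20.2954, 14.182, 23.3099,
    35.6834, 18.1785, 20.275, 17.2652,
    13.1139, 24.3539, 17.243, 21.9764,
    15.9879, 16.4685, 12.8461, 13.9045,
]

ordered = sorted(demand)  # step 1: smallest Y1 to largest Yn
n = len(ordered)
# step 2: cumulative frequency Pi = i / n for the i-th ordered observation
plot_points = [(round(y, 2), i / n) for i, y in enumerate(ordered, start=1)]

for y, p in plot_points:
    print(f"{y:6.2f}  {p:.2f}")
```

These pairs are what would be plotted on the probability sheet in step 3.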
79
[Figure: Lognormal base e Probability Plot for C1, ML Estimates - 95% CI; Percent (1 to 99) plotted against Data (10 to 100)]
ML Estimates: Location 2.85735, Scale 0.334029
Goodness of Fit: AD* 0.695
Or use the Graph/Probability Plot … option in Minitab
80
Statistical test of distribution
Ho: X ~ N(µ, σ²)
H1: X does not follow N(µ, σ²)
Order data
Estimate sample mean & variance
Observed Mean: 18.42
Observed Variance: 43
Observations: 20
χ² statistic: goodness of fit of model
81
Statistical test of distribution
9.26 17.27
9.83 18.18
12.85 20.28
13.11 20.30
13.23 20.88
13.90 21.98
14.18 23.31
15.99 24.35
16.47 30.24
17.24 35.68
Again order sample
Create m = 5 categories
<10
10-15
15-20
20-25
>25
82
Statistical test of distribution
Actual frequencies
<10 2
10-15 5
15-20 5
20-25 6
>25 2
83
Statistical test of distribution
Frequencies
        actual  expected
<10     2       Normdist(10,18.42,6.56,1)*20
10-15   5       (Normdist(15,18.42,6.56,1) - Normdist(10,18.42,6.56,1))*20
15-20   5       (Normdist(20,18.42,6.56,1) - Normdist(15,18.42,6.56,1))*20
20-25   6
>25     2
84
Statistical test of distribution
Frequencies
Observed Expected
<10 2 1.99
10-15 5 4.03
15-20 5 5.88
20-25 6 4.94
>25 2 3.16
85
χ² Goodness of Fit Test
Is based on:
$\chi^2 = \sum_{i=1}^{m} \frac{(o_i-e_i)^2}{e_i}$
df = m – k – 1
m = number of categories
k = number of parameters replaced by estimates
oi: observed frequency, ei: expected frequency
86
Statistical test of distribution
Frequencies
        oi  ei
<10     2   1.99
10-15   5   4.03
15-20   5   5.88
20-25   6   4.94
>25     2   3.16
$\chi^2 = \sum (o_i-e_i)^2/e_i$
= (2-1.99)²/1.99
+ (5-4.03)²/4.03
+ (5-5.88)²/5.88
+ (6-4.94)²/4.94
+ (2-3.16)²/3.16
= 1.02
87
Statistical test of distribution
Ho: X ~ N(µ, σ²)
H1: X does not follow N(µ, σ²)
df = m – k – 1 = 5 – 2 – 1 = 2
$\chi^2 = \sum (o_i-e_i)^2/e_i = 1.02$
CHIINV(0.05,2) = 5.99
1.02 < 5.99: fail to reject Ho
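The whole goodness-of-fit calculation can be sketched with the standard library. Expected frequencies come from the fitted normal (mean 18.42, standard deviation 6.56, as used in the slides). For 2 degrees of freedom the χ² upper-tail probability is exp(−x/2), so the 5% critical value has a closed form; that shortcut is specific to df = 2.

```python
from math import log
from statistics import NormalDist

observed = [2, 5, 5, 6, 2]     # counts in <10, 10-15, 15-20, 20-25, >25
edges = [10, 15, 20, 25]       # category boundaries
n, mean, sd = 20, 18.42, 6.56  # sample size and fitted parameters

fitted = NormalDist(mean, sd)
cdf = [0.0] + [fitted.cdf(e) for e in edges] + [1.0]
expected = [n * (hi - lo) for lo, hi in zip(cdf, cdf[1:])]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 2 - 1     # m categories, k = 2 estimated parameters

# For df = 2, P(chi2 > x) = exp(-x / 2), so the 5% critical value is -2 ln(0.05)
chi2_crit = -2 * log(0.05)
reject = chi2 > chi2_crit
print(f"chi2 = {chi2:.2f}, critical = {chi2_crit:.2f}, reject Ho: {reject}")
```

The statistic is far below the critical value, so normality of electricity demand is not rejected.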
89
Sum Up Chapter 7
Hypothesis testing
null vs alternative
null with equal sign
null often status quo
alternative often what want to prove
type I error vs type II error
probability of type I error called level of significance
P – values
1-ß = power of test
= probability of rejecting false null
one tailed vs two tailed
90
Sum Up Chapter 7
Hypothesis tests
mean – Normal test
population normal, known variance
large sample
$\frac{\bar{X}-\mu}{\sigma/\sqrt{n}}$
mean – t test
population normal, unknown variance,
small sample
$\frac{\bar{X}-\mu}{\hat{s}/\sqrt{n}}$
91
Sum Up Chapter 7
Normal and t
92
Sum Up Chapter 7
Hypothesis tests
difference of means – Normal test
population normal, known variance
$\frac{\bar{X}_1-\bar{X}_2}{\sqrt{\sigma_1^2/n_1+\sigma_2^2/n_2}}$
93
Sum Up Chapter 7
Hypothesis tests
variance
$\chi^2_{n-1} = \frac{(n-1)S^2}{\sigma^2}$
Are variances equal
$\frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}$ is $F_{n_1-1,\,n_2-1}$
94
Sum Up Chapter 7
χ² and F
95
Sum Up Chapter 7
How is random variable distributed
normal – graph cumulative frequency distribution
special paper
straight line
Statistical
$\chi^2_{m-k-1} = \sum (o_i-e_i)^2/e_i$
m = categories
k = estimated parameters
always 1 tailed
96
End of Chapter 7!