1/13/2014
Hypothesis Testing in Statistics:
An Introduction
Lecture 7
Lecturer : Dr. Dwayne Devonish
MGMT 2012: Introduction to Quantitative Methods
Learning Objectives
• Students should be able to:
  - List the steps of hypothesis testing,
  - Distinguish between null and alternative hypotheses,
  - Distinguish and choose between different types of
    one-sample hypothesis tests (z-test versus t-test),
  - Work out the critical values of these tests to
    distinguish between rejection and nonrejection decisions.
Hypothesis Testing
• Oftentimes, quantitative analysts want to make decisions or
conclusions about populations on the basis of the sample
data collected.
• Hypothesis testing allows these analysts to make claims
(statements) about certain characteristics of the population
(known as parameters such as population means or
proportions), and apply statistical tests on sample data to
assess how plausible (or likely to be true) these statements
are.
• These statistical tests provide sample evidence that leads
one to either accept or reject a particular hypothesis.
• Essentially, the researcher makes generalisations about the
population (parameters) on the basis of sample data
(statistics).
Steps in Hypothesis Testing
• Step 1: State two opposing hypotheses (null vs
alternative hypotheses)
• Step 2: Determine the appropriate statistical test (and
associated distribution) to assess the hypotheses
• Step 3: Determine the ‘critical values’ that identify
nonrejection (acceptance) and rejection regions for the
null hypothesis
• Step 4: Apply the chosen statistical test to
obtain/compute a test-statistic (a.k.a. sample evidence)
• Step 5: Check to see if sample evidence or test statistic
falls within the acceptance or rejection regions
• Step 6: Make conclusion about the hypothesis to be
accepted or rejected.
Step 1: Null vs Alternative Hypothesis
• Null Hypothesis (Ho): This hypothesis is always stated in
statistical terms, using population parameters.
• Ho: µ = 500 (a claim regarding the population mean, e.g.
that the mean income of all public sector workers is $500
per week).
• The null hypothesis is assumed to be true unless the
statistical test provides evidence that contradicts it.
• The null hypothesis normally represents the status quo,
and always specifies that a population parameter
equals a specific value.
• The null hypothesis is always the hypothesis that is
under investigation (hence, one seeks to determine if it
can be rejected or not).
Examples of Null Hypotheses
A. “Years ago, the average monthly salary of a senior
public sector worker was BDS$1,200”.
• Ho: µ = 1200
B. “Research has reported that approximately 60% of
men who smoke are likely to develop health
problems”.
• Ho: p = .60
(where ‘p’ is the population proportion of male smokers
who develop health problems)
The null hypothesis always contains an ‘equal sign’ (=)
Alternative Hypothesis
• Alternative Hypothesis (Ha): This is always the opposite
of Ho. If Ho is rejected, then there is evidence to
support Ha. This hypothesis is the researcher’s actual or
primary hypothesis that he or she wants to ‘prove to be
true’ based on sample (statistical) evidence.
• It never contains an equal sign; instead it states that a
population parameter is not equal to (≠), greater than (>), or
less than (<) a specific value.
• E.g. Ha: µ ≠ 500 (population mean is not equal to 500)
• Ha: µ > 500 (population mean is greater than 500)
• Ha: µ < 500 (population mean is less than 500)
• The type (or sign) of alternative hypothesis chosen
depends on what the researcher wants to prove within a
given scenario.
Example 1
• It has often been claimed in the past that the average
waiting time to conduct business at Rogel Bank of
Barbados is 10 minutes. If we were to collect
information today, can we say that this claim is still
true? (or can we say this claim is no longer true).
• Ho: µ = 10
• Ha: µ ≠ 10 (this is known as a two-sided hypothesis)
• Two-sided/tailed alternative hypothesis indicates two
opposite possibilities to the null hypothesis: i.e. the
mean waiting time today could be either higher or
lower than 10 minutes.
Example 2
• It has often been claimed in the past that the average
waiting time to conduct business at Rogel Bank of
Barbados is 10 minutes. If we were to collect
information today, can we say that the mean waiting time,
due to improved technology, is now less than 10 minutes?
• Ho: µ = 10
• Ha: µ < 10 (this is known as one-sided hypothesis)
• One-sided/tailed alternative hypothesis indicates only
one possibility to the null hypothesis; in this case, that
the mean waiting time is less than 10 minutes.
• Note: Ha: µ > 10 is also a one-sided hypothesis which
suggests that mean is greater than 10 minutes (one
possibility).
Step 2: Statistical Test and Distribution
• In order to test the hypotheses in the prior scenario,
the analyst must determine the appropriate statistical
test to use.
• There are many statistical tests that we will cover in
hypothesis testing but we will use examples from the
most basic types of statistical tests which are used to
test basic hypothesis testing scenarios:
• ONE-SAMPLE TESTS
One-Sample Tests
• One-sample tests can be used to examine
whether a population parameter such as a
population mean (µ) or population
proportion/percentage (p) is a specific value (or
not).
• These tests are called one-sample tests because
they examine whether a claim about a population
mean or proportion is plausible or not on the
basis of statistical data or evidence derived from
a single sample of observations collected.
Types of One-Sample Tests
A. One-sample tests can be used for testing a population
mean: one-sample tests for a population mean, µ
B. One-sample tests can be used for testing a population
proportion: one-sample tests for a population
proportion, p.
• Very soon, we will briefly look at one-sample tests for
the population mean to explain how the first three
steps of hypothesis testing (in an earlier slide) are
implemented.
• However, the next lecture will deal more deeply with
their calculation and full use in various hypothesis
testing scenarios (based on the last three steps).
Remember Steps in Hypothesis Testing
• Step 1: State two opposing hypotheses (null vs
alternative hypotheses)
• Step 2: Determine the appropriate statistical test (and
associated distribution) to assess the hypotheses
• Step 3: Determine the ‘critical values’ that identify
nonrejection (acceptance) and rejection regions for the
null hypothesis
• Step 4: Apply the chosen statistical test to
obtain/compute a test-statistic (a.k.a. sample evidence)
• Step 5: Check to see if sample evidence or test statistic
falls within the acceptance or rejection regions
• Step 6: Make conclusion about the hypothesis to be
accepted or rejected.
One-Sample Tests for the Population Mean
• There are two (2) popular one-sample tests for the
population mean.
• The Z-test (based on z-distribution):
  - This test is used under the following conditions:
1. If the sample size is large (n ≥ 30)
OR
2. If the population standard deviation (σ) is known
If either 1 or 2 holds (or both), the z-test is applied.
NB: If the population standard deviation is unknown but the
sample size is large (n ≥ 30), the z-test is still used, with the
sample standard deviation used as an estimate/substitute for σ.
One-Sample Tests for the Population Mean
• The T-test (based on Student t-distribution):
  - This test is used under the following conditions:
1. The sample size is small (n < 30)
AND
2. The population standard deviation is unknown
(σ = ??)
Both 1 and 2 must hold for the t-test to be applied.
One-Sample Z-test and T-test
• Both tests assume that the population that one is generalising
to is a ‘normal’ population.
• The choice between these two tests rests on information
available in specific hypothesis-testing scenarios.
• Ask yourself two questions:
a) Is the sample size large (30 or greater)?
b) Is the population standard deviation known or unknown?
• Then, choose the appropriate test.
• These tests are used to determine whether a population mean
(µ) is a specific value (or not) on the basis of a sample mean
(x̄) derived from a set of sample observations (data).
• NB: The z-test alone is used to test for population
proportions.
Remember Steps in Hypothesis Testing
LET’S SEE HOW THESE TESTS WORK IN THE FIRST THREE
STEPS OF HYPOTHESIS TESTING
• Step 1: State two opposing hypotheses (null vs
alternative hypotheses)
• Step 2: Determine the appropriate statistical test (and
associated distribution) to assess the hypotheses
• Step 3: Determine the ‘critical values’ that identify
nonrejection (acceptance) and rejection regions for the
null hypothesis
Example A (steps 1 and 2)
• It has been said that the average monthly salary of all public
sector workers is BDS $5,000. A random sample of 40 public
sector workers was examined and a mean salary of BDS
$4800 and standard deviation of BDS $1500 were found.
Test whether the population mean has changed.
• Step 1: Hypotheses:
• Ho: µ = 5000
• Ha: µ ≠ 5000 (2-tailed)
• Step 2: Test??
a) Is the sample size large (30 or greater): YES
b) Is the population standard deviation known or unknown:
UNKNOWN
ANSWER: Z-TEST
Example B (steps 1 and 2)
• It has been said that the average monthly salary of all public
sector workers is BDS $5,000, with a standard deviation of BDS
$2200. A random sample of 20 public sector
workers was examined and a mean salary of BDS $4800 was
found. Test whether the population mean salary has
decreased.
• Step 1: Hypotheses:
• Ho: µ = 5000
• Ha: µ < 5000 (1-tailed)
• Step 2: Test??
a) Is the sample size large (30 or greater): NO
b) Is the population standard deviation known or unknown:
KNOWN (= $2200)
ANSWER: Z-TEST
Example C (steps 1 and 2)
• It has been said that the average monthly salary of all public
sector workers is BDS $5,000. A random sample of 20 public
sector workers was examined and a mean salary of BDS
$6800 and a standard deviation of $1200 were found. Test
whether the population mean has increased.
• Step 1: Hypotheses:
• Ho: µ = 5000
• Ha: µ > 5000 (1-tailed)
• Step 2: Test??
a) Is the sample size large (30 or greater): NO
b) Is the population standard deviation known or unknown:
UNKNOWN
ANSWER: T-TEST
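
The two-question rule used in Examples A to C can be sketched in a few lines of Python. This is only an illustration of the rule of thumb stated in these slides; the function name `choose_one_sample_test` and its arguments are made up for this sketch.

```python
def choose_one_sample_test(n: int, sigma_known: bool) -> str:
    """Pick a one-sample test for the mean using the slides' rule of thumb."""
    # z-test: the sample is large (n >= 30) OR the population sd is known
    if n >= 30 or sigma_known:
        return "z-test"
    # t-test: the sample is small (n < 30) AND the population sd is unknown
    return "t-test"


# (sample size, is the population standard deviation known?)
examples = {
    "A": (40, False),  # large sample, sigma unknown
    "B": (20, True),   # small sample, sigma known (= $2200)
    "C": (20, False),  # small sample, sigma unknown
}

for label, (n, sigma_known) in examples.items():
    print(f"Example {label}: {choose_one_sample_test(n, sigma_known)}")
```

Running it gives z-test for Examples A and B and t-test for Example C, matching the answers above.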
Step 3: Determine Critical Values for Rejection/Acceptance Regions
• Critical values are associated with the distribution of the
statistical tests selected in step 2 (z-tests vs t-tests) and separate
acceptance and rejection regions for the Ho.
• They are based on which statistical tests you are using and the
level of significance or alpha level (α).
• For the Z-test: The critical values are determined on the basis
of the alpha level or level of significance.
• The alpha level represents the probability of rejecting the null
hypothesis when it is in fact true (known as Type 1 error) –
although we are testing hypotheses, we can still make errors
because we are using sample (i.e. incomplete) data.
• So we always have to set an alpha level based on the level of
error we are willing to tolerate (usually 5% or 0.05 but can be
1% or .01; or 10% or .10).
Z-test: Critical Values
• Critical values for the z-test distribution are also linked
to whether a hypothesis testing scenario is one-tailed
or two-tailed.
• Critical values are set on the distribution to separate
two key regions based on the alpha level: a rejection
region or nonrejection (acceptance) region for Ho.
• The alpha region can be known as the rejection region
where Ho is rejected (recall that alpha is probability of
rejecting Ho when it is true). The other part of
distribution region is the nonrejection region.
• The alpha region is usually on the tail(s) of the
distribution; split between both tails in a two-sided
scenario; or in one tail alone in a one-tailed scenario.
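
To see that the alpha region really is the area in the tail(s), the sketch below adds up the tail areas of the standard normal curve beyond a given pair of critical values. It assumes the usual table values for α = .05 (-1.96 and 1.96 for two tails, -1.645 or 1.645 for one tail); the helper name `tail_area` is invented for this illustration, and `statistics.NormalDist` is part of the Python standard library (3.8+).

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def tail_area(lower_cv=None, upper_cv=None):
    """Total probability in the rejection region(s) beyond the critical values."""
    area = 0.0
    if lower_cv is not None:
        area += Z.cdf(lower_cv)        # area to the left of the lower CV
    if upper_cv is not None:
        area += 1.0 - Z.cdf(upper_cv)  # area to the right of the upper CV
    return area


two_tailed = tail_area(lower_cv=-1.96, upper_cv=1.96)  # alpha split across both tails
left_tailed = tail_area(lower_cv=-1.645)               # alpha in the left tail only
right_tailed = tail_area(upper_cv=1.645)               # alpha in the right tail only

print(round(two_tailed, 3), round(left_tailed, 3), round(right_tailed, 3))
```

Each of the three areas comes out at about 0.05, which is the alpha level in every case; only the placement of that area differs.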
[Figure: Z-distribution curve (sampling distribution of z-test) centred at 0, with a lower tail below the lower CV and an upper tail above the upper CV marked "Reject H0", and the region between the CVs marked "Do Not Reject H0".]
• Critical values (CV) can be lower (negative) values or upper
(positive) values depending on 1-tailed or 2-tailed hypotheses.
• This is an example of a two-tailed scenario: you see
lower and upper CVs within which nonrejection of Ho lies.
Prior Z-test Example: One-tailed
• It has been said that the average monthly salary of all public
sector workers is BDS $5,000. A random sample of 40 public
sector workers was collected and a mean salary of BDS $4800
and standard deviation of BDS $1500 were found. Test at a 5%
level whether the population mean has decreased.
• Ho: µ = 5000
• Ha: µ < 5000 (1-tailed)
• The alpha level is 5% or 0.05. Given the 1-tailed nature of the
scenario, the 5% in the left tail of the z-distribution is shaded as the
rejection region (where Ho should be rejected): α = .05 corresponds
to a critical value of -1.645 (see distribution table).
• α is on the left tail because we are testing whether the
population mean is less than the claimed value (to the left of zero
on the distribution). If we were testing ‘greater than’, the 5% in the
right tail would be shaded as the rejection region (critical value = 1.645).
Z-DISTRIBUTION (RIGHT OR LEFT TAIL)
Critical values will be
negative if lower limit
of left tail was used
( -1.645, -1.960, etc)
‘GUIDESHEET FOR Z-TEST’
Critical values of the z-distribution

α      Left-Tailed Test    Right-Tailed Test    Two-Tailed Test
       (one-tailed)        (one-tailed)
10%    -1.28               +1.28                +/- 1.645
5%     -1.645              +1.645               +/- 1.96
2%     -2.05               +2.05                +/- 2.33
1%     -2.33               +2.33                +/- 2.575
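
The guidesheet values can be reproduced from the alpha level itself rather than read from a table. The sketch below is one way to do this with Python's standard library; `z_critical` is a hypothetical helper name, and the tail labels "left", "right" and "two" are conventions chosen for this example.

```python
from statistics import NormalDist


def z_critical(alpha: float, tail: str):
    """Return the z critical value(s) for a given alpha level and tail type."""
    z = NormalDist()  # standard normal distribution
    if tail == "left":
        return (z.inv_cdf(alpha),)       # all of alpha in the left tail
    if tail == "right":
        return (z.inv_cdf(1 - alpha),)   # all of alpha in the right tail
    if tail == "two":
        upper = z.inv_cdf(1 - alpha / 2) # alpha/2 in each tail
        return (-upper, upper)
    raise ValueError("tail must be 'left', 'right' or 'two'")


for alpha in (0.10, 0.05, 0.02, 0.01):
    (left,) = z_critical(alpha, "left")
    (right,) = z_critical(alpha, "right")
    _, upper = z_critical(alpha, "two")
    print(f"{alpha:.0%}  {left:+.3f}  {right:+.3f}  +/- {upper:.3f}")
```

The computed values agree with the guidesheet up to rounding; printed tables round some entries slightly differently (for example 2.575 versus a computed 2.576).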
[Figure: Lower One-Tailed Z Test. Sampling distribution of z-test centred at 0, with α = .05 in the left tail below −zα = −1.645 marked "Reject H0" and the rest of the curve marked "Do Not Reject H0".]
• Critical value of -1.645 is always for .05 one-tailed.
• Lower (left) one-tail tests always have negative critical
values, higher (right) one-tail tests have positive ones.
Prior Z-test Example: Two-tailed
• It has been said that the average monthly salary of all public
sector workers is BDS $5,000. A random sample of 40 public
sector workers was collected and a mean salary of BDS
$4800 and standard deviation of BDS $1500 were found.
Test at a 5% level whether the population mean has
changed.
• Ho: µ = 5000
• Ha: µ ≠ 5000 (2-tailed)
• The alpha level is 5% or 0.05. Given 2-tailed nature of
scenario, the 5% must be shared between left and right tails
of z-distribution. Hence, α/2 = 2.5% or .025 in each tail.
Hence, there are lower and upper critical values for .025 in
each tail which are -1.96 to 1.96 (see distribution table).
Within this range lies the nonrejection region of Ho.
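
Once the critical values are fixed, checking a test statistic against them (Step 5) is a simple comparison. The sketch below uses the critical values from the two examples above; the test statistics passed in are purely hypothetical numbers chosen for illustration, since computing the actual statistic is left to the next lecture. The function name `decide` is also invented here.

```python
def decide(z_stat: float, lower_cv=None, upper_cv=None) -> str:
    """Say whether a test statistic falls in the rejection region for Ho."""
    if lower_cv is not None and z_stat < lower_cv:
        return "reject Ho"         # statistic lies in the lower (left) tail
    if upper_cv is not None and z_stat > upper_cv:
        return "reject Ho"         # statistic lies in the upper (right) tail
    return "do not reject Ho"      # statistic lies in the nonrejection region


# Two-tailed test at the 5% level: nonrejection region is -1.96 to 1.96
for z in (-2.40, 0.85, 2.10):      # hypothetical test statistics
    print(z, decide(z, lower_cv=-1.96, upper_cv=1.96))

# Left-tailed test at the 5% level: reject only below -1.645
for z in (-2.40, -1.20, 2.10):     # hypothetical test statistics
    print(z, decide(z, lower_cv=-1.645))
```

Note that in the left-tailed case a large positive statistic such as 2.10 does not lead to rejection, because the whole rejection region sits in the left tail.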
[Figure: Two-Tailed Z-Test. Sampling distribution of z-test centred at 0, with α/2 = .025 in each tail; the regions below -1.96 and above 1.96 are marked "Reject H0" and the region between them "Do Not Reject H0".]
• Two-tailed tests provide a range of lower and upper
critical values (-1.96 to 1.96).
• The alpha for two-tailed tests is always divided by
2 to be shared between the two tails equally.
T-Test: Critical Values
• The t-test is based on Student’s t-distribution and
the determination of its critical values is different.
• Critical values are based on two factors:
  - Alpha level (again, if 2-tailed, divide α by 2; if
    1-tailed, determine whether the right or left tail
    contains the alpha)
  - Degrees of freedom (sample size minus 1)
• Once you determine these factors, the critical values
can be found within the t-distribution tables.
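
For readers without a t-table to hand, the lookup can be approximated numerically. The sketch below is one possible approach using only the Python standard library: it integrates the t density with Simpson's rule and then searches for the critical value by bisection. All function names are invented for this illustration, and the search range of 0 to 100 is an assumption that covers the usual alpha levels but not extremely small ones.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x: float, df: int, steps: int = 2000) -> float:
    """P(T <= x), via Simpson's rule on [0, |x|] and symmetry about zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half = total * h / 3  # area under the curve between 0 and |x|
    return 0.5 + half if x >= 0 else 0.5 - half


def t_critical(alpha: float, n: int, tail: str):
    """Critical value(s) for sample size n (df = n - 1) and a given tail type."""
    df = n - 1
    upper_tail = alpha / 2 if tail == "two" else alpha  # area beyond the upper CV
    lo, hi = 0.0, 100.0
    for _ in range(60):  # bisection on the upper-tail area
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, df) > upper_tail:
            lo = mid  # tail area still too large: move right
        else:
            hi = mid
    cv = (lo + hi) / 2
    if tail == "two":
        return (-cv, cv)
    if tail == "left":
        return (-cv,)
    if tail == "right":
        return (cv,)
    raise ValueError("tail must be 'left', 'right' or 'two'")


print([round(v, 3) for v in t_critical(0.05, 20, "two")])
print([round(v, 3) for v in t_critical(0.05, 20, "left")])
```

For a sample of 20 (19 degrees of freedom) at the 5% level, this gives approximately -2.093 and 2.093 for a two-tailed test and approximately -1.729 for a left-tailed test, in line with standard t-tables.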
Prior T-test example (2-tailed)
• It has been said that the average monthly salary of all public
sector workers is BDS $5,000. A random sample of 20 public
sector workers was collected and a mean salary of BDS
$4800 and a standard deviation of $1200 were found. Test
at the 5% level whether the population mean has changed.
• Ho: µ = 5000
• Ha: µ ≠ 5000 (2-tailed)
• The alpha level is .05, two-tailed, so each tail has 2.5% or
.025. T-test distribution tables are easy to read. Look for the
two-tailed section under 5% (where α/2 = .025), and then go down
to the relevant degrees of freedom (remember: sample size
minus one, or 20 - 1 = 19). The critical values must be written
as lower and upper values: -2.093 to 2.093. Within this
range lies the nonrejection region of Ho.
T-TEST TABLES
[Figure: Two-Tailed T-Test. Sampling distribution of t-test centred at 0, with α/2 = .025 in each tail; the regions below -2.093 and above 2.093 are marked "Reject H0" and the region between them "Do Not Reject H0".]
• Two-tailed tests provide a range of lower and upper
critical values (-2.093 to 2.093).
• The alpha for two-tailed tests is always divided by
2 to be shared between the two tails equally.
2nd T-test example (1-tailed)
• It has been said that the average monthly salary of all public
sector workers is BDS $5,000. A random sample of 20 public
sector workers was collected and a mean salary of BDS
$4800 and a standard deviation of $1200 were found. Test
at the 5% level whether the population mean has decreased.
• Ho: µ = 5000
• Ha: µ < 5000 (1-tailed)
• The alpha level is .05 for one tail, and it lies entirely in the left tail.
Look for the one-tailed section under 5%, and then go down to
the relevant degrees of freedom (remember: sample size minus
one, or 20 - 1 = 19). The critical value is a lower or negative
value, -1.729, given that it is on the left side of the distribution.
T-TEST TABLES
[Figure: Lower One-Tailed T-Test. Sampling distribution of t-test centred at 0, with α = .05 in the left tail below −tα = −1.729 marked "Reject H0" and the rest of the curve marked "Do Not Reject H0".]
• Critical value of -1.729 is always for .05 one-tailed.
• Lower (left) one-tail tests always have negative critical
values, higher (right) one-tail tests have positive ones.
© 2002 Prentice-Hall, Inc. Chap 9-37
Summary of Alpha Level and the Critical Values for Z- and T-Tests
[Figure: rejection regions and critical value(s) for three cases, each curve centred at 0]
• H0: µ = 30 (or µ ≥ 30), Ha: µ < 30: rejection region of area α in the lower tail.
• H0: µ = 30 (or µ ≤ 30), Ha: µ > 30: rejection region of area α in the upper tail.
• H0: µ = 30, Ha: µ ≠ 30: rejection regions of area α/2 in each tail.
Next Lecture
• This was only an introduction to hypothesis testing; there
are more steps until we are complete:
• Step 4: Apply/compute the chosen statistical test to
obtain a test-statistic (a.k.a. sample evidence)
• Step 5: Check to see if sample evidence falls within the
acceptance or rejection regions
• Step 6: Make conclusion about the hypothesis to be
accepted or rejected.
• These final three steps allow us to actually compute the
sample/statistical evidence to see which hypothesis is
supported. We will go further into one-sample and
two-sample tests next session.