Victor Katch Kinesiology Significance Testing Chapter 13

Critical Region

The critical region (or rejection region) is the set of all values of the test statistic that cause us to reject the null hypothesis. For example, see the red-shaded region in the previous figure.

Significance Level

The significance level (denoted by α) is the probability that the test statistic will fall in the critical region when the null hypothesis is actually true. Common choices for α are 0.05, 0.01, and 0.10.

Critical Value

A critical value is any value that separates the critical region (where we reject H0) from the values of the test statistic that do not lead to rejection of the null hypothesis. Critical values depend on the nature of the null hypothesis, the sampling distribution that applies, and the significance level α. For example, in a right-tailed test using the standard normal distribution, the critical value z = 1.645 corresponds to a significance level of α = 0.05.

Two-tailed, Right-tailed, Left-tailed Tests

The tails in a distribution are the extreme regions bounded by critical values.

Two-tailed Test

H0: =

H1: ≠

α is divided equally between the two tails of the critical region.

≠ means less than or greater than.

Right-tailed Test

H0: =

H1: > (points right)

Left-tailed Test

H0: =

H1: < (points left)
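The link between the significance level, the tail of the test, and the critical value can be sketched in a few lines. This is only an illustration that assumes the test statistic follows a standard normal distribution; the function name `critical_values` is hypothetical and not part of the slides.

```python
from statistics import NormalDist


def critical_values(alpha, tail):
    """Return the critical z value(s) for a standard normal test statistic.

    tail is "right", "left", or "two" (illustrative labels, not from the slides).
    """
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "right":
        # All of alpha sits in the right tail.
        return (z.inv_cdf(1 - alpha),)
    if tail == "left":
        # All of alpha sits in the left tail.
        return (z.inv_cdf(alpha),)
    if tail == "two":
        # alpha is divided equally between the two tails.
        upper = z.inv_cdf(1 - alpha / 2)
        return (-upper, upper)
    raise ValueError("tail must be 'right', 'left', or 'two'")


right = critical_values(0.05, "right")
left = critical_values(0.05, "left")
two = critical_values(0.05, "two")
print("right-tailed:", round(right[0], 3))
print("left-tailed:", round(left[0], 3))
print("two-tailed:", tuple(round(v, 3) for v in two))
```

With α = 0.05 the right-tailed value reproduces the z = 1.645 quoted earlier, while the two-tailed test needs a larger cutoff because only α/2 lies in each tail.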

P-Value

The P-value (or p-value or probability value) is the probability of getting a value of the test statistic that is at least as extreme as the one representing the sample data, assuming that the null hypothesis is true. The null hypothesis is rejected if the P-value is very small, such as 0.05 or less.

Conclusions in Hypothesis Testing

We always test the null hypothesis.

1. Reject the H0

2. Fail to reject the H0

Accept versus Fail to Reject

Some texts use “accept the null hypothesis.”

We are not proving the null hypothesis.

The sample evidence is not strong enough to warrant rejection (such as not enough evidence to convict a suspect).

Decision Criterion

Traditional method: Reject H0 if the test statistic falls within the critical region. Fail to reject H0 if the test statistic does not fall within the critical region.

Decision Criterion

P-value method: Reject H0 if P-value ≤ α (where α is the significance level, such as 0.05).

Fail to reject H0 if P-value > α.
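The traditional method and the P-value method lead to the same decision. The sketch below shows this for a right-tailed test, assuming a standard normal test statistic; the function name and the two z values are made up for illustration.

```python
from statistics import NormalDist


def right_tailed_z_decisions(z_stat, alpha=0.05):
    """Apply both decision criteria to a right-tailed z test."""
    std_normal = NormalDist()
    # Traditional method: compare the test statistic with the critical value.
    critical_value = std_normal.inv_cdf(1 - alpha)
    traditional = "reject H0" if z_stat >= critical_value else "fail to reject H0"
    # P-value method: compare the area to the right of z_stat with alpha.
    p_value = 1 - std_normal.cdf(z_stat)
    p_value_method = "reject H0" if p_value <= alpha else "fail to reject H0"
    return traditional, p_value_method, p_value


strong = right_tailed_z_decisions(2.0)  # hypothetical test statistic
weak = right_tailed_z_decisions(1.0)    # hypothetical test statistic
print(strong)
print(weak)
```

A statistic inside the critical region always has a P-value of at most α, which is why the two columns of output agree.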

Decision Criterion

Another option: Instead of using a significance level such as 0.05, simply identify the P-value and leave the decision to the reader.

Example: Finding P-values

Wording of Final Conclusion

Hypothesis testing about:

• a population mean or mean difference (paired data)

• the difference between means of two populations

• the difference between two population proportions

Three Cautions:

1. Inference is only valid if the sample is representative of the population for the question of interest.

2. Hypotheses and conclusions apply to the larger population(s) represented by the sample(s).

3. If the distribution of a quantitative variable is highly skewed, consider analyzing the median rather than the mean, using what are called nonparametric methods.

Significance Testing: Steps in Any Hypothesis Test

1. Determine the null and alternative hypotheses.

2. Verify necessary data conditions, and if met, summarize the data into an appropriate test statistic.

3. Assuming the null hypothesis is true, find the p-value.

4. Decide whether or not the result is statistically significant based on the p-value.

5. Report the conclusion in the context of the situation.

Testing Hypotheses About One Mean or Paired Data

Step 1: Determine null and alternative hypotheses

1. H0: μ = μ0 versus Ha: μ ≠ μ0 (two-sided)

2. H0: μ ≥ μ0 versus Ha: μ < μ0 (one-sided)

3. H0: μ ≤ μ0 versus Ha: μ > μ0 (one-sided)

Often H0 for a one-sided test is written as H0: μ = μ0. Remember that a p-value is computed assuming H0 is true, and μ0 is the value used for that computation.

Step 2: Verify Necessary Data Conditions

Situation 1: The population of measurements of interest is approximately normal, and a random sample of any size is measured. In practice, use this method if the shape is not notably skewed and there are no extreme outliers.

Situation 2: The population of measurements of interest is not approximately normal, but a large random sample (n ≥ 30) is measured. If there are extreme outliers or extreme skewness, it is better to have a larger sample.

Continuing Step 2: The Test Statistic

The t-statistic is a standardized score for measuring the difference between the sample mean and the null hypothesis value of the population mean:

$$t = \frac{\text{sample mean} - \text{null value}}{\text{standard error}} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

This t-statistic has (approximately) a t-distribution with df = n - 1.

Step 3: Assuming H0 true, Find the p-value

• For H1 less than, the p-value is the area below t, even if t is positive.

• For H1 greater than, the p-value is the area above t, even if t is negative.

• For H1 two-sided, the p-value is 2 × area above |t|.
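The three p-value rules can be written out directly. The sketch below uses only the Python standard library, so the tail area of the t-distribution is obtained by numerically integrating its density with Simpson's rule; that integration approach, the function names, and the `steps` parameter are illustrative choices, not something taken from the slides. A statistics package would normally supply this area.

```python
import math


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Area above t, using Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_a = total * h / 3
    area_above_abs_t = 0.5 - area_zero_to_a  # the density is symmetric about 0
    return area_above_abs_t if t >= 0 else 1 - area_above_abs_t


def p_value(t, df, alternative):
    """p-value for a t statistic; alternative is 'less', 'greater', or 'two-sided'."""
    if alternative == "less":
        return 1 - t_upper_tail(t, df)        # area below t
    if alternative == "greater":
        return t_upper_tail(t, df)            # area above t
    if alternative == "two-sided":
        return 2 * t_upper_tail(abs(t), df)   # 2 x area above |t|
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")


# Illustrative values: t = -2.38 with df = 17.
for alt in ("less", "greater", "two-sided"):
    print(alt, round(p_value(-2.38, 17, alt), 3))
```

Note how the "greater" alternative gives a large p-value for a negative t: the area above t is used even though t falls on the opposite side from Ha.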

Steps 4 and 5: Decide Whether or Not the Result is Statistically Significant Based on the p-value and Report the Conclusion in the Context of the Situation

These two steps remain the same for all of the hypothesis tests.

Choose a level of significance α, and reject H0 if the p-value is less than (or equal to) α.

Otherwise, conclude that there is not enough evidence to support the alternative hypothesis.

Example: Normal Body Temperature

What is normal body temperature? Is it actually less than 98.6 degrees Fahrenheit (on average)?

Step 1: State the null and alternative hypotheses

H0: μ = 98.6

Ha: μ < 98.6

where μ = mean body temperature in the human population.

Example: Normal Body Temperature (cont)

Data: random sample of n = 18 normal body temperatures

98.2 97.8 99.0 98.6 98.2 97.8 98.4 99.7 98.2
97.4 97.6 98.4 98.0 99.2 98.6 97.1 97.2 98.5

Step 2: Verify data conditions …

There are no outliers and no strong skewness.

The sample mean of 98.217 is close to the sample median of 98.2.

Example: Normal Body Temperature (cont)

Step 2: … Summarizing data with a test statistic

Test of mu = 98.600 vs mu < 98.600

Variable      N    Mean   StDev  SE Mean      T      P
Temperature  18  98.217   0.684    0.161  -2.38  0.015

Key elements:

Sample statistic: x̄ = 98.217 (under “Mean”)

Standard error (under “SE Mean”):

$$\text{s.e.}(\bar{x}) = \frac{s}{\sqrt{n}} = \frac{0.684}{\sqrt{18}} = 0.161$$

Test statistic (under “T”):

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{98.217 - 98.6}{0.161} = -2.38$$
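The key elements of the output can be recomputed from the 18 temperatures listed in the data. This is a minimal sketch using Python's standard library; the variable names are illustrative.

```python
import math
import statistics

# The n = 18 body temperatures from the example (degrees Fahrenheit).
temps = [98.2, 97.8, 99.0, 98.6, 98.2, 97.8, 98.4, 99.7, 98.2,
         97.4, 97.6, 98.4, 98.0, 99.2, 98.6, 97.1, 97.2, 98.5]
mu0 = 98.6  # null value from H0

n = len(temps)
x_bar = statistics.mean(temps)   # sample mean
s = statistics.stdev(temps)      # sample standard deviation (n - 1 divisor)
se = s / math.sqrt(n)            # standard error of the mean
t = (x_bar - mu0) / se           # t-statistic
df = n - 1                       # degrees of freedom

print(f"n={n}  mean={x_bar:.3f}  s={s:.3f}  se={se:.3f}  t={t:.2f}  df={df}")
```

The values agree with the "Mean", "StDev", "SE Mean", and "T" columns of the output after rounding.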

Example: Normal Body Temperature (cont)

Step 3: Find the p-value

From output: p-value = 0.015

From Table A.3: p-value is between 0.016 and 0.010.

The area to the left of t = -2.38 equals the area to the right of t = +2.38. The value t = 2.38 is between column headings 2.33 and 2.58 in the table, and for df = 17, the one-sided p-values are 0.016 and 0.010.

Example: Normal Body Temperature (cont)

Step 4: Decide whether or not the result is statistically significant based on the p-value

Using α = 0.05 as the level of significance criterion, the results are statistically significant because 0.015, the p-value of the test, is less than 0.05. In other words, we can reject the null hypothesis.

Step 5: Report the Conclusion

We can conclude, based on these data, that the mean temperature in the human population is actually less than 98.6 degrees.

Paired Data and the Paired t-Test

Data: two variables for n individuals or pairs; use the difference d = x1 – x2.

Parameter: μd = population mean of differences

Sample estimate: d̄ = sample mean of the differences

Standard deviation and standard error: sd = standard deviation of the sample of differences;

$$\text{s.e.}(\bar{d}) = \frac{s_d}{\sqrt{n}}$$

Often of interest: Is the mean difference in the population different from 0?

Steps for a Paired t-Test

Step 1: Determine null and alternative hypotheses

H0: μd = 0 versus Ha: μd ≠ 0 or Ha: μd < 0 or Ha: μd > 0

Watch how differences are defined for selecting the Ha.

Step 2: Verify data conditions and compute test statistic

Conditions apply to the differences.

The t-test statistic is:

$$t = \frac{\text{sample mean} - \text{null value}}{\text{standard error}} = \frac{\bar{d} - 0}{s_d/\sqrt{n}}$$

Steps 3, 4 and 5: Similar to t-test for a single mean. The df = n – 1, where n is the number of differences.

Example: Effect of Alcohol

Study: n = 10 pilots perform a simulation first under sober conditions and then after drinking alcohol.

Response: Amount of useful performance time (longer time is better).

Question: Does useful performance time decrease with alcohol use?

Step 1: State the null and alternative hypotheses

H0: μd = 0 versus Ha: μd > 0

where μd = population mean difference (no-alcohol time minus alcohol time) if all pilots took these tests.

Example: Effect of Alcohol (cont)

Data: random sample of n = 10 time differences

Step 2: Verify data conditions …

Boxplot shows no outliers nor extreme skewness.

Example: Effect of Alcohol (cont)

Step 2: … Summarizing data with a test statistic

Test of mu = 0.0 vs mu > 0.0

Variable   N   Mean  StDev  SE Mean     T      P
Diff      10  195.6  230.5     72.9  2.68  0.013

Key elements:

Sample statistic: d̄ = 195.6 (under “Mean”)

Standard error (under “SE Mean”):

$$\text{s.e.}(\bar{d}) = \frac{s_d}{\sqrt{n}} = \frac{230.5}{\sqrt{10}} = 72.9$$

Test statistic (under “T”):

$$t = \frac{\bar{d} - 0}{s_d/\sqrt{n}} = \frac{195.6 - 0}{72.9} = 2.68$$
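Because the ten individual differences are not listed, the paired t-statistic can only be rebuilt from the summary values in the output. The sketch below does that; the function name `paired_t_from_summary` is hypothetical.

```python
import math


def paired_t_from_summary(d_bar, s_d, n, null_value=0.0):
    """Return (standard error, t, df) for a paired t-test from summary statistics."""
    se = s_d / math.sqrt(n)          # s.e. of the mean difference
    t = (d_bar - null_value) / se    # standardized distance from the null value
    df = n - 1                       # n is the number of differences
    return se, t, df


# Summary values from the output: mean 195.6, standard deviation 230.5, n = 10.
se_d, t_d, df_d = paired_t_from_summary(195.6, 230.5, 10)
print(f"se={se_d:.1f}  t={t_d:.2f}  df={df_d}")
```

The same function covers a one-sample test on any set of differences, since the paired t-test is a one-sample t-test applied to the differences.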

Example Effect of Alcohol (cont)

Step 3: Find the p-value

From output: p-value = 0.013

From Table A.3: p-value is between 0.007 and 0.015.

The value t = 2.68 is between column headings 2.58 and 3.00 in the table, and for df =9, the one-sided p-values are 0.015 and 0.007.

Example: Effect of Alcohol (cont)

Steps 4 and 5: Decide whether or not the result is statistically significant based on the p-value and Report the Conclusion

Using α = 0.05 as the level of significance criterion, we can reject the null hypothesis since the p-value of 0.013 is less than 0.05. Even with a small experiment, it appears that alcohol has a statistically significant effect and decreases performance time.

Testing the Difference Between Two Means (Independent Samples)

Step 1: Determine null and alternative hypotheses

H0: μ1 – μ2 = 0 versus Ha: μ1 – μ2 ≠ 0 or Ha: μ1 – μ2 < 0 or Ha: μ1 – μ2 > 0

Watch how Population 1 and 2 are defined.

Step 2: Verify data conditions and compute test statistic

Both n’s are large, or there are no extreme outliers or skewness in either sample. Samples are independent. The t-test statistic is:

$$t = \frac{\text{sample mean} - \text{null value}}{\text{standard error}} = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

Steps 3, 4 and 5: Similar to t-test for one mean.

Example: Effect of Stare on Driving

Randomized experiment: Researchers either stared or did not stare at drivers stopped at a campus stop sign, then timed how long (in seconds) it took each driver to proceed from the sign to a mark on the other side of the intersection.

Question: Does stare speed up crossing times?

Step 1: State the null and alternative hypotheses

H0: μ1 – μ2 = 0 versus Ha: μ1 – μ2 > 0

where 1 = no-stare population and 2 = stare population.

Example: Effect of Stare (cont)

Data: n1 = 14 no stare and n2 = 13 stare responses

Step 2: Verify data conditions …

No outliers nor extreme skewness for either group.

Example: Effect of Stare (cont)

Step 2: … Summarizing data with a test statistic

Sample statistic: x̄1 – x̄2 = 6.63 – 5.59 = 1.04 seconds

Standard error:

$$\text{s.e.}(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{1.36^2}{14} + \frac{0.822^2}{13}} = 0.43$$

Test statistic:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{1.04 - 0}{0.43} = 2.41$$
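The unpooled standard error, the t-statistic, and the Welch approximate degrees of freedom can all be computed from the summary values above. This is a sketch only: the function name `welch_t_from_summary` is hypothetical, and the df expression is the standard Welch-Satterthwaite formula, which the slides mention by name but do not write out.

```python
import math


def welch_t_from_summary(mean1, s1, n1, mean2, s2, n2, null_diff=0.0):
    """Return (standard error, t, Welch df) for two independent samples."""
    v1 = s1 ** 2 / n1   # estimated variance of the first sample mean
    v2 = s2 ** 2 / n2   # estimated variance of the second sample mean
    se = math.sqrt(v1 + v2)
    t = ((mean1 - mean2) - null_diff) / se
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return se, t, df


# Group 1 = no stare (n = 14), group 2 = stare (n = 13).
se_w, t_w, df_w = welch_t_from_summary(6.63, 1.36, 14, 5.59, 0.822, 13)
print(f"se={se_w:.2f}  t={t_w:.2f}  df={math.floor(df_w)}")
```

Working from these rounded summary values gives a t-statistic within about 0.02 of the 2.41 shown above; the small gap presumably reflects rounding in the reported means and standard deviations. Rounding the Welch df down to a whole number gives the df = 21 used for the p-value.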

Example: Effect of Stare (cont)

Steps 3, 4 and 5: Determine the p-value and make a conclusion in context.

The p-value = 0.013, so we reject the null hypothesis; the results are “statistically significant”.

The p-value is determined using a t-distribution with df = 21 (df using the Welch approximation formula) and finding the area to the right of t = 2.41. From Table A.3, the p-value is between 0.009 and 0.015.

We can conclude that if all drivers were stared at, the mean crossing time at an intersection would be shorter than under normal conditions.

The Two Types of Errors and Their Probabilities

When the null hypothesis is true, the probability of a type 1 error, the level of significance, and the α-level are all equivalent.

When the null hypothesis is not true, a type 1 error cannot be made.

Type I Error

A Type I error is the mistake of rejecting the null hypothesis when it is true.

The symbol α (alpha) is used to represent the probability of a type I error.

Type II Error

A Type II error is the mistake of failing to reject the null hypothesis when it is false.

The symbol β (beta) is used to represent the probability of a type II error.

Example: Assume that we are conducting a hypothesis test of the claim p > 0.5. Here are the null and alternative hypotheses: H0: p = 0.5, and H1: p > 0.5.

a) Identify a type I error.

b) Identify a type II error.

a) Identify a type I error.

A type I error is the mistake of rejecting a true null hypothesis, so this is a type I error: Conclude that there is sufficient evidence to support p > 0.5, when in reality p = 0.5.

b) Identify a type II error.

A type II error is the mistake of failing to reject the null hypothesis when it is false, so this is a type II error: Fail to reject p = 0.5 (and therefore fail to support p > 0.5) when in reality p > 0.5.
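The two error probabilities for the test of H0: p = 0.5 against H1: p > 0.5 can be made concrete with an exact binomial calculation. Everything numeric here is an assumption added for illustration and does not come from the example: a sample of n = 100, a rule that rejects H0 when the number of successes X is at least some cutoff c, a target α of 0.05, and a true value of p = 0.6 for the type II calculation.

```python
from math import comb


def upper_tail(n, p, c):
    """P(X >= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(c, n + 1))


n = 100              # assumed sample size
alpha_target = 0.05  # assumed significance level
p_null = 0.5         # value of p under H0
p_true = 0.6         # assumed true value of p when H0 is false

# Smallest cutoff c whose rejection rule "X >= c" has type I error rate <= alpha_target.
c = next(k for k in range(n + 1) if upper_tail(n, p_null, k) <= alpha_target)

alpha = upper_tail(n, p_null, c)     # P(reject H0 | p = 0.5): type I error
beta = 1 - upper_tail(n, p_true, c)  # P(fail to reject H0 | p = 0.6): type II error

print(f"reject H0 when X >= {c}")
print(f"alpha = {alpha:.4f}, beta = {beta:.4f}, power = {1 - beta:.4f}")
```

The type I error rate is computed with p = 0.5 because that error can only occur when H0 is true, while the type II error rate requires choosing a specific alternative value of p.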

Type I and Type II Errors

Controlling Type I and Type II Errors

For any fixed α, an increase in the sample size n will cause a decrease in β.

For any fixed sample size n, a decrease in α will cause an increase in β. Conversely, an increase in α will cause a decrease in β.

To decrease both α and β, increase the sample size.

Definition: Power of a Hypothesis Test

The power of a hypothesis test is the probability (1 - β) of rejecting a false null hypothesis, which is computed by using a particular significance level α and a particular value of the population parameter that is an alternative to the value assumed true in the null hypothesis.

Trade-Off in Probability for Two Errors

For a fixed sample size, there is an inverse relationship between the probabilities of the two types of errors.

Increase in probability of a type 1 error => decrease in probability of a type 2 error

Type 2 Errors and Power

Three factors that affect the probability of a type 2 error:

1. Sample size; larger n reduces the probability of a type 2 error without affecting the probability of a type 1 error.

2. Level of significance; larger α reduces the probability of a type 2 error by increasing the probability of a type 1 error.

3. Actual value of the population parameter (not in the researcher’s control); the farther the truth falls from the null value (in the Ha direction), the lower the probability of a type 2 error.

When the alternative hypothesis is true, the probability of making the correct decision is called the power of a test.
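The three factors can be checked numerically with a simplified power calculation. The sketch assumes a right-tailed z test of a mean with a known population standard deviation, which is a simplification of the t-tests used earlier; the function name and all numeric inputs (null value 100, standard deviation 10, and so on) are hypothetical.

```python
import math
from statistics import NormalDist


def power_right_tailed_z(mu0, mu_true, sigma, n, alpha):
    """Power of a right-tailed z test of H0: mu = mu0 when the true mean is mu_true."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)                    # critical value under H0
    shift = (mu_true - mu0) / (sigma / math.sqrt(n)) # true mean in standard-error units
    return 1 - z.cdf(z_crit - shift)                 # P(test statistic >= z_crit)


base = power_right_tailed_z(100, 103, 10, n=25, alpha=0.05)
larger_n = power_right_tailed_z(100, 103, 10, n=100, alpha=0.05)
smaller_alpha = power_right_tailed_z(100, 103, 10, n=25, alpha=0.01)
farther_truth = power_right_tailed_z(100, 106, 10, n=25, alpha=0.05)

for label, pw in [("baseline", base), ("larger n", larger_n),
                  ("smaller alpha", smaller_alpha), ("farther truth", farther_truth)]:
    print(f"{label:14s} power = {pw:.3f}  beta = {1 - pw:.3f}")
```

Relative to the baseline, a larger sample and a true mean farther from the null value both raise power (lower β), while a smaller α lowers power (raises β). When the true mean equals the null value, the function returns α itself, which is the type 1 error rate.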