
This article was downloaded by: [University of California Santa Cruz]
On: 09 October 2014, At: 12:08
Publisher: Routledge
Informa Ltd Registered in England and Wales Registered Number: 1072954
Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Measurement in Physical Education and Exercise Science
Publication details, including instructions for authors and subscription information:
http://www.tandfonline.com/loi/hmpe20

Instructional Analogies for Two Hypothesis-Testing Concepts
Marilyn A. Looney
Published online: 18 Nov 2009.

To cite this article: Marilyn A. Looney (2002) Instructional Analogies for Two Hypothesis-Testing Concepts, Measurement in Physical Education and Exercise Science, 6:1, 61-64, DOI: 10.1207/S15327841MPEE0601_4

To link to this article: http://dx.doi.org/10.1207/S15327841MPEE0601_4

PLEASE SCROLL DOWN FOR ARTICLE

Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content.

This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http://www.tandfonline.com/page/terms-and-conditions


TEACHER’S TOOLBOX

Instructional Analogies for Two Hypothesis-Testing Concepts

Marilyn A. Looney
Department of Kinesiology & Physical Education
Northern Illinois University

Statistical concepts related to hypothesis testing are difficult for some students to grasp. Explaining the concepts by way of analogies that relate to real-world experiences may help students master the material. Examples are provided that may help students understand two hypothesis-testing concepts: the distinction between the alpha level and p values for test statistics, and the distinction between failing to reject the null hypothesis (Ho) and stating that Ho is true.

DISTINCTION BETWEEN P VALUES AND THE ALPHA LEVEL

Instructional units on hypothesis testing often introduce the concept of making a Type I error, that is, rejecting the null hypothesis when it is really true. The alpha level (α) is defined as the probability of making such an error. Other terms that are often used interchangeably with alpha level are level of significance and probability level (Vogt, 1993). If researchers were to make a judgment about rejecting Ho, they should select the Type I error rate they will accept before analyzing their data (Moore, 1985). Then a p, or probability value, is determined that indicates the probability of getting the test statistic for these data by chance alone when the null hypothesis is true. If this p value is less than or equal to the alpha level (p ≤ α), then the null hypothesis is rejected (Moore, 1985). Thus, the researcher is willing to say that the result was not due to chance for a fixed probability (α) of making the wrong decision. The usual convention is to report “p < α” (e.g., p < .01) to denote that the results were statistically significant for a given alpha level. This notation is often confusing to students because the value on each side of the inequality symbol represents a probability. A reasonable analogy to use with students deals with test scores that can range from 0 to 100, just as the probability of a test statistic occurring when Ho is true (p value) ranges from 0 to 1 (Figure 1). In both the exam score and p value contexts, an evidence continuum exists that moves from weak to strong. For example, there is stronger evidence that Student A has mastered the test content by scoring 80 out of 100 points than Student B, who scored only 50 out of 100 points. Likewise, p = .01 is stronger evidence against the null hypothesis than p = .35. Thirty-five times out of 100, a test statistic of the size exhibited by the data will occur by chance when Ho is true (weak evidence against Ho), as opposed to 1 time out of 100 (strong evidence against Ho).

MEASUREMENT IN PHYSICAL EDUCATION AND EXERCISE SCIENCE, 6(1), 61–64
Copyright © 2002, Lawrence Erlbaum Associates, Inc.

Requests for reprints should be sent to Marilyn A. Looney, Department of Kinesiology & Physical Education, Northern Illinois University, DeKalb, IL 60115. E-mail: [email protected]
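The p ≤ α decision rule can be sketched in a few lines of code. This is a minimal illustration of the rule as stated above, not code from the article; the function name and example p values are hypothetical.

```python
# Sketch of the "reject Ho when p <= alpha" decision rule.
# The function name and example values are illustrative, not from the article.

def decide(p_value: float, alpha: float) -> str:
    """Reject Ho when the p value is at or below the pre-selected alpha level.

    Note the asymmetric wording: the outcomes are "reject Ho" and
    "fail to reject Ho"; we never conclude that Ho is true.
    """
    return "reject Ho" if p_value <= alpha else "fail to reject Ho"

print(decide(0.01, 0.05))  # strong evidence against Ho -> reject Ho
print(decide(0.35, 0.05))  # weak evidence against Ho -> fail to reject Ho
```

The key point the sketch preserves is that alpha is fixed before the data are analyzed, whereas the p value is computed from the data at hand.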


FIGURE 1 Two examples of evidence continua with specified cutoff standards. (Top) A pass–fail decision for examinee performance is made by comparing a student’s exam score to 70, the declared cutoff score. (Bottom) A reject–fail to reject decision for a null hypothesis is made by comparing the p value for the test statistic to .05, the declared α level or cutoff probability.


Most students are not interested in knowing whether there is strong or weak evidence to say they have mastered the test content. They usually focus on whether they passed or failed, which requires knowing the fixed cutoff or passing score for the exam. The instructor can set this cutoff anywhere along the evidence continuum (e.g., 70), keeping in mind that another instructor might set a different passing score for the same exam. The students must now compare their scores to the cutoff value to determine if they failed the exam (e.g., student scores ≤ 70). If their scores are less than or equal to the cutoff, they failed the test for a standard set at 70. Likewise, if the cutoff or alpha level for a statistical test in this example is .05, the probability of the test statistic occurring by chance when Ho is true (p value) is compared with .05. Keep in mind that the alpha level is the special name given to the p value that serves as the cutoff for deciding whether to reject Ho. If the p value for the test statistic is less than or equal to .05, Ho is rejected at an alpha level of .05. Although the value on each side of the inequality symbol represents a probability in the hypothesis-testing context, each plays a different role, just as the two exam scores play different roles: student score versus cutoff or standard exam score. The probability on the left-hand side of the expression represents the probability associated with the data at hand, whereas the probability on the right-hand side represents the Type I error risk the researcher is willing to take. Although different researchers may select different alpha levels or levels of risk, the p values on the left-hand side of the inequality will not vary for a particular test statistic computed from a specific set of data.
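The parallel between the two cutoff decisions can be made explicit in code. This is a hypothetical sketch of the analogy only; the cutoff of 70 and alpha of .05 follow the examples above, and both function names are my own.

```python
# Sketch of the exam-score analogy: in both settings a data-dependent
# value is compared to a fixed, pre-declared cutoff. The cutoffs (70, .05)
# follow the article's examples; the function names are hypothetical.

def exam_decision(score: float, cutoff: float = 70) -> str:
    """Fail when score <= cutoff, paralleling 'reject Ho when p <= alpha'."""
    return "fail" if score <= cutoff else "pass"

def hypothesis_decision(p_value: float, alpha: float = 0.05) -> str:
    """The alpha level is just the cutoff p value for rejecting Ho."""
    return "reject Ho" if p_value <= alpha else "fail to reject Ho"

# A different instructor may set a different passing score, just as a
# different researcher may choose a different alpha level; the student's
# score (and the p value for a given data set) does not change.
print(exam_decision(80))          # pass
print(hypothesis_decision(0.01))  # reject Ho
```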

After being presented with the exam score analogy, students should be able to determine if enough evidence exists against Ho to reject it at the preselected risk of making a Type I error. With a better understanding of the distinction between the p value and alpha level, students also should be less prone to replicate mistakes that have occurred in issues of Research Quarterly for Exercise and Sport and Medicine and Science in Sports and Exercise published in 2001. I hope students will readily see the error in the following statement, “Alpha was set at the p < .05 level,” and suggest a correct revision, “The alpha level or level of significance was set at .05.”

DISTINCTION BETWEEN “FAILING TO REJECT Ho” AND “Ho IS TRUE”

Once students see that Ho is never rejected with certainty because of the Type I error risk that exists, they are better able to accept the potential for making a Type II error, that is, failing to reject Ho when it is false. Unfortunately, a common mistake made by students is saying that Ho is true when they failed to reject Ho at a given alpha level. They incorrectly conclude that failure to reject Ho is always equivalent to saying that Ho is true. These two statements are equivalent only when no Type II error was committed. Unfortunately, we never know for the study at hand whether a Type II error has been committed.



The use of a courtroom analogy helps students see the error in their thinking. A jury is charged with rendering a verdict of either guilty or not guilty based on the evidence presented by the prosecutor. Just because a jury returns a verdict of not guilty does not mean that the jury believes the defendant is innocent. A not guilty verdict only indicates that there was not enough evidence presented against the defendant for the jury to say the defendant is guilty. Likewise, if the data do not provide strong evidence against Ho, we will fail to reject Ho. Just as a guilty defendant may have gone free due to insufficient evidence, so may a Type II error have been committed due to insufficient evidence (i.e., poor statistical power). Thus, when we fail to reject Ho it is not appropriate to say that Ho is true.
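The courtroom point, that failing to reject Ho can simply reflect insufficient evidence (poor statistical power), can be made concrete with a small numerical sketch. The scenario below, a coin-bias test with a hypothetical sample size and rejection cutoff, is my own illustration and does not appear in the article.

```python
from math import comb

# Hypothetical illustration of a Type II error: test Ho: p = 0.5 for a coin
# with n = 20 flips, rejecting Ho when 15 or more heads are observed.
# (The sample size and cutoff are invented for this sketch.)

def binom_tail(n: int, k: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Type I error rate of this rule: chance of rejecting when Ho is true.
alpha = binom_tail(20, 15, 0.5)

# Suppose the coin is actually biased (p = 0.6), so Ho is false.
# The Type II error rate is the chance we still fail to reject Ho.
type_2 = 1 - binom_tail(20, 15, 0.6)

print(f"alpha   = {alpha:.3f}")   # about 0.021
print(f"type II = {type_2:.3f}")  # about 0.874
```

With so few flips the test has little power against a modest bias: Ho is false, yet the verdict is "fail to reject" almost nine times out of ten. That failure says nothing about Ho being true.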

Two analogies have been presented to help students better understand two statistical concepts. I do not claim to be the first to discover or use these examples in a statistical context. Possibly other researchers have discovered these examples on their own. Over time it is hard to separate the collective knowledge that has developed over one’s career from one’s own original ideas or examples to share with colleagues. Thus, the safest conclusion to draw is that the source of the examples is “unknown.”

REFERENCES

Moore, D. S. (1985). Statistics: Concepts and controversies (2nd ed.). New York: Freeman.

Vogt, W. P. (1993). Dictionary of statistics and methodology: A nontechnical guide for the social sciences. Newbury, CA: Sage.
