
Chapter 14

Nonparametric Statistics


Introduction: Distribution-Free Tests

Distribution-free tests – statistical tests that don’t rely on assumptions about the probability distribution of the sampled population

Nonparametrics – the branch of inferential statistics devoted to distribution-free tests

Rank statistics (rank tests) – nonparametric statistics based on the ranks of measurements


Single Population Inferences

The sign test is used to make inferences about the central tendency of a single population

The test is based on the median η

The test involves hypothesizing a value for the population median, then checking whether the numbers of sample values above and below the hypothesized median differ by more than chance alone would explain



Sign Test for a Population Median η

One-Tailed Test
H0: η = η0
Ha: η > η0 [or Ha: η < η0]
Test statistic: S = number of sample measurements greater than η0 [or S = number of measurements less than η0]
Observed significance level: p-value = P(x ≥ S)

Two-Tailed Test
H0: η = η0
Ha: η ≠ η0
Test statistic: S = larger of S1 and S2, where S1 is the number of measurements less than η0 and S2 is the number of measurements greater than η0
Observed significance level: p-value = 2P(x ≥ S)

In both cases x has a binomial distribution with parameters n and p = .5

Rejection region: Reject H0 if p-value ≤ α (here α = .05)

Conditions required for sign test – the sample must be randomly selected from a continuous probability distribution
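A minimal Python sketch of the small-sample sign test may make the binomial calculation concrete. The function names, the `alternative` argument, the choice to drop values equal to η0, and the data are all assumptions made for illustration; none of them come from the slides.

```python
from math import comb

def binomial_upper_tail(s, n, p=0.5):
    """Return P(x >= s) for x ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(s, n + 1))

def sign_test(sample, eta0, alternative="two-sided"):
    """Return (S, p_value) for the sign test of H0: eta = eta0.

    alternative is "greater" (Ha: eta > eta0), "less" (Ha: eta < eta0),
    or "two-sided". Values equal to eta0 are dropped (an assumption here).
    """
    above = sum(1 for v in sample if v > eta0)
    below = sum(1 for v in sample if v < eta0)
    n = above + below
    if alternative == "greater":
        s = above
        p_value = binomial_upper_tail(s, n)
    elif alternative == "less":
        s = below
        p_value = binomial_upper_tail(s, n)
    else:
        s = max(above, below)
        # Doubling the tail can exceed 1 when S is near n/2, so cap it.
        p_value = min(1.0, 2 * binomial_upper_tail(s, n))
    return s, p_value

# Hypothetical measurements, invented for this example only.
readings = [12.1, 9.8, 10.4, 11.7, 13.2, 10.9, 8.6, 12.5, 11.3, 10.6]
s, p = sign_test(readings, eta0=10.0, alternative="greater")
print(f"S = {s}, p-value = {p:.4f}")
```

With these made-up readings, 8 of the 10 values lie above 10, so the one-tailed p-value is P(x ≥ 8) = 56/1024 ≈ .055, which falls just short of rejection at α = .05.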



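Large-Sample Sign Test for a Population Median η

Conditions required for sign test – the sample must be randomly selected from a continuous probability distribution

One-Tailed Test
H0: η = η0
Ha: η > η0 [or Ha: η < η0]
Rejection region: z > z_α

Two-Tailed Test
H0: η = η0
Ha: η ≠ η0
Rejection region: z > z_{α/2}

Test statistic (both cases):

$$ z = \frac{(S - .5) - .5n}{.5\sqrt{n}} $$

S is computed as in the small-sample sign test. Its exact observed significance level is p-value = P(x ≥ S) (one-tailed) or p-value = 2P(x ≥ S) (two-tailed), where x has a binomial distribution with parameters n and p = .5. The z statistic is the normal approximation to that binomial, which has mean .5n and standard deviation .5√n; the .5 subtracted from S is a continuity correction.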
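The large-sample statistic is z = ((S − .5) − .5n) / (.5√n), the normal approximation to the binomial count S with a continuity correction. As a rough sketch, it can be evaluated as follows. The counts S = 26 and n = 40 are hypothetical, and the function name is an assumption; the critical value comes from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist

def sign_test_z(s, n):
    """Large-sample sign test statistic with the .5 continuity correction."""
    return ((s - 0.5) - 0.5 * n) / (0.5 * sqrt(n))

# Hypothetical counts: 26 of 40 measurements exceed the hypothesized median.
s_count, n_obs = 26, 40
alpha = 0.05

z = sign_test_z(s_count, n_obs)
z_crit = NormalDist().inv_cdf(1 - alpha)  # one-tailed critical value z_alpha
reject = z > z_crit

print(f"z = {z:.3f}, z_alpha = {z_crit:.3f}, reject H0: {reject}")
```

For these illustrative counts z is about 1.74, which exceeds z_.05 ≈ 1.645, so the one-tailed test would reject H0.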


Comparing Two Populations: Independent Samples

The Wilcoxon Rank Sum Test is used when two independent random samples are being used to compare two populations, and the t-test is not appropriate

It tests the hypothesis that the probability distributions associated with the two populations are equivalent



Rank the data from both samples together, from smallest to largest

If populations are the same, ranks should be randomly mixed between the samples

Test statistic is based on the rank sums – the totals of the ranks for each of the samples. T1 is the sum for sample 1, T2 is the sum for sample 2

Percentage Cost of Living Change, as Predicted by Government and University Economists

Government Economist (1)      University Economist (2)
Prediction    Rank            Prediction    Rank
    3.1         4                 4.4         6
    4.8         7                 5.8         9
    2.3         2                 3.9         5
    5.6         8                 8.7        11
    0.0         1                 6.3        10
    2.9         3                10.5        12
                                 10.8        13
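The ranking step can be sketched in Python using the predictions from the table. The helper `midranks` is a name introduced here for illustration; it averages the ranks of tied values, although this data set has no ties.

```python
def midranks(values):
    """Rank values from smallest (1) to largest, averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

government = [3.1, 4.8, 2.3, 5.6, 0.0, 2.9]
university = [4.4, 5.8, 3.9, 8.7, 6.3, 10.5, 10.8]

# Rank the pooled data, then split the ranks back into the two samples.
ranks = midranks(government + university)
t1 = sum(ranks[:len(government)])
t2 = sum(ranks[len(government):])

print(f"T1 = {t1}, T2 = {t2}")
```

The two rank sums come to T1 = 25 and T2 = 66. Together they must equal n(n + 1)/2 = 91 for n = 13 pooled observations, which is a quick check on the ranking.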



Wilcoxon Rank Sum Test: Independent Samples

Required Conditions:
Random, independent samples
The probability distributions from which the samples are drawn are continuous

One-Tailed Test
H0: D1 and D2 are identical
Ha: D1 is shifted to the right of D2 [or Ha: D1 is shifted to the left of D2]
Test statistic: T1, if n1 < n2; T2, if n2 < n1 (either rank sum can be used if n1 = n2)
Rejection region: T1: T1 ≥ TU [or T1 ≤ TL]; T2: T2 ≤ TL [or T2 ≥ TU]

Two-Tailed Test
H0: D1 and D2 are identical
Ha: D1 is shifted either to the left or to the right of D2
Test statistic: T1, if n1 < n2; T2, if n2 < n1 (either rank sum can be used if n1 = n2). We will denote this rank sum as T
Rejection region: T ≤ TL or T ≥ TU

TL and TU are obtained from a table of critical values



Wilcoxon Rank Sum Test for Large Samples (n1 ≥ 10 and n2 ≥ 10)

One-Tailed Test
H0: D1 and D2 are identical
Ha: D1 is shifted to the right of D2 [or Ha: D1 is shifted to the left of D2]
Rejection region: z > z_α [or z < −z_α]

Two-Tailed Test
H0: D1 and D2 are identical
Ha: D1 is shifted either to the left or to the right of D2
Rejection region: |z| > z_{α/2}

Test statistic (both cases):

$$ z = \frac{T_1 - \dfrac{n_1(n_1 + n_2 + 1)}{2}}{\sqrt{\dfrac{n_1 n_2 (n_1 + n_2 + 1)}{12}}} $$
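The large-sample statistic standardizes the rank sum T1 by its mean n1(n1 + n2 + 1)/2 and standard deviation √(n1 n2 (n1 + n2 + 1)/12) under H0. A small sketch of a two-tailed test follows. The inputs (T1 = 145 with n1 = 10 and n2 = 12) are hypothetical values chosen for illustration, and the function name is an assumption.

```python
from math import sqrt
from statistics import NormalDist

def rank_sum_z(t1, n1, n2):
    """Large-sample Wilcoxon rank sum statistic based on T1."""
    mean_t1 = n1 * (n1 + n2 + 1) / 2           # expected T1 when H0 is true
    sd_t1 = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # standard deviation of T1
    return (t1 - mean_t1) / sd_t1

# Hypothetical rank sum and sample sizes (both samples have at least 10).
z = rank_sum_z(t1=145, n1=10, n2=12)

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
reject = abs(z) > z_crit

print(f"z = {z:.3f}, z_(alpha/2) = {z_crit:.3f}, reject H0: {reject}")
```

Here the mean of T1 is 115 and its standard deviation is √230 ≈ 15.17, so z ≈ 1.98, slightly beyond the two-tailed critical value of about 1.96.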


Comparing Two Populations: Paired Differences Experiment

Wilcoxon Signed Rank Test: An alternative test to the paired difference of means procedure

The analysis is based on the ranks of the absolute values of the paired differences

Any differences of 0 are eliminated, and n is reduced accordingly

Softness Ratings of Paper

          Product     Difference    Absolute Value    Rank of
Judge     A     B       (A-B)       of Difference     Absolute Value
  1       6     4         2              2                 5
  2       8     5         3              3                 7.5
  3       4     5        -1              1                 2
  4       9     8         1              1                 2
  5       4     1         3              3                 7.5
  6       7     9        -2              2                 5
  7       6     2         4              4                 9
  8       5     3         2              2                 5
  9       6     7        -1              1                 2
 10       8     2         6              6                10

T+ = Sum of positive ranks = 46
T- = Sum of negative ranks = 9
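The rank sums T+ and T- can be reproduced from the softness ratings with a short Python sketch. The helper `midranks` is a name introduced here for illustration; it assigns tied absolute differences the average of the ranks they span, which is how the 2, 5, and 7.5 entries in the table arise.

```python
def midranks(values):
    """Rank values from smallest (1) to largest, averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

product_a = [6, 8, 4, 9, 4, 7, 6, 5, 6, 8]
product_b = [4, 5, 5, 8, 1, 9, 2, 3, 7, 2]

# Differences of 0 are eliminated, which reduces n accordingly.
diffs = [a - b for a, b in zip(product_a, product_b) if a != b]
abs_ranks = midranks([abs(d) for d in diffs])

t_plus = sum(r for d, r in zip(diffs, abs_ranks) if d > 0)
t_minus = sum(r for d, r in zip(diffs, abs_ranks) if d < 0)

print(f"T+ = {t_plus}, T- = {t_minus}")
```

This gives T+ = 46 and T- = 9, matching the table; the two sums add to n(n + 1)/2 = 55 for the n = 10 nonzero differences.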


Comparing Two Populations: Paired Differences Experiment

Wilcoxon Signed Rank Test for a Paired Difference Experiment

Let D1 and D2 represent the probability distributions for populations 1 and 2, respectively

One-Tailed Test
H0: D1 and D2 are identical
Ha: D1 is shifted to the right of D2 [or Ha: D1 is shifted to the left of D2]
Test statistic: T-, the rank sum of the negative differences [or T+, the rank sum of the positive differences]
Rejection region: T- ≤ T0 [or T+ ≤ T0]

Two-Tailed Test
H0: D1 and D2 are identical
Ha: D1 is shifted either to the left or to the right of D2
Test statistic: T, the smaller of T+ and T-
Rejection region: T ≤ T0

T0 is obtained from a table of critical values

Required Conditions

The sample of differences is randomly selected

The probability distribution from which the sample of differences is drawn is continuous


Comparing Three or More Populations: Completely Randomized Design

Kruskal-Wallis H-Test

An alternative to the completely randomized ANOVA

Based on comparison of rank sums

Number of Available Beds

Hospital 1         Hospital 2         Hospital 3
Beds    Rank       Beds    Rank       Beds    Rank
  6      5          34     25          13      9.5
 38     27          28     19          35     26
  3      2          42     30          19     15
 17     13          13      9.5         4      3
 11      8          40     29          29     20
 30     21          31     22           0      1
 15     11           9      7           7      6
 16     12          32     23          33     24
 25     17          39     28          18     14
  5      4          27     18          24     16

R1 = 120           R2 = 210.5         R3 = 134.5
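The ranks in this table come from pooling all 30 measurements and ranking them together; the two hospitals that report 13 beds share the average of ranks 9 and 10. A Python sketch of that step follows, where `midranks` and the dictionary layout are illustrative choices rather than anything specified on the slide.

```python
def midranks(values):
    """Rank values from smallest (1) to largest, averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

beds = {
    "Hospital 1": [6, 38, 3, 17, 11, 30, 15, 16, 25, 5],
    "Hospital 2": [34, 28, 42, 13, 40, 31, 9, 32, 39, 27],
    "Hospital 3": [13, 35, 19, 4, 29, 0, 7, 33, 18, 24],
}

# Rank the pooled data, then sum the ranks that belong to each hospital.
pooled = [value for sample in beds.values() for value in sample]
ranks = midranks(pooled)

rank_sums = {}
start = 0
for name, sample in beds.items():
    rank_sums[name] = sum(ranks[start:start + len(sample)])
    start += len(sample)

print(rank_sums)
```

The resulting rank sums are 120, 210.5, and 134.5, in agreement with R1, R2, and R3 above.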


Comparing Three or More Populations: Completely Randomized Design

Kruskal-Wallis H-Test for Comparing k Probability Distributions

Required Conditions:
• The k samples are random and independent
• 5 or more measurements per sample
• The probability distributions from which the samples are drawn are continuous

H0: The k probability distributions are identical
Ha: At least two of the k probability distributions differ in location

Test statistic:

$$ H = \frac{12}{n(n+1)} \sum_{j=1}^{k} \frac{R_j^2}{n_j} - 3(n+1) $$

Where
n_j = number of measurements in sample j
R_j = rank sum for sample j, where the rank of each measurement is computed according to its relative magnitude in the totality of data for the k samples
n = total sample size = n_1 + n_2 + … + n_k

Rejection region: H > χ²_α with (k − 1) degrees of freedom
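As a worked check, the statistic H = 12/(n(n + 1)) · Σ R_j²/n_j − 3(n + 1) can be evaluated for the hospital-beds rank sums given earlier (R1 = 120, R2 = 210.5, R3 = 134.5, with 10 measurements per hospital). This is only a sketch: the function name is an assumption, no tie correction is applied, and the critical value is hard-coded from a chi-square table rather than computed.

```python
def kruskal_wallis_h(rank_sums, sample_sizes):
    """Kruskal-Wallis H from per-sample rank sums (no tie correction)."""
    n = sum(sample_sizes)
    weighted = sum(r**2 / n_j for r, n_j in zip(rank_sums, sample_sizes))
    return 12 / (n * (n + 1)) * weighted - 3 * (n + 1)

rank_sums = [120, 210.5, 134.5]   # R1, R2, R3 from the hospital-beds table
sample_sizes = [10, 10, 10]

h = kruskal_wallis_h(rank_sums, sample_sizes)

# Upper-tail chi-square value for alpha = .05 with k - 1 = 2 degrees of
# freedom, taken from a standard chi-square table.
CHI2_05_DF2 = 5.99147
reject = h > CHI2_05_DF2

print(f"H = {h:.3f}, reject H0 at alpha = .05: {reject}")
```

For these rank sums H ≈ 6.10, which is slightly larger than 5.99, so H0 would be rejected at α = .05.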


Comparing Three or More Populations: Randomized Block Design

The Friedman Fr Test

A nonparametric method for the randomized block design

Based on comparison of rank sums

Reaction Time for Three Drugs

Subject    Drug A    Rank    Drug B    Rank    Drug C    Rank
   1        1.21       1      1.48       2      1.56       3
   2        1.63       1      1.85       2      2.01       3
   3        1.42       1      2.06       3      1.70       2
   4        2.43       2      1.98       1      2.64       3
   5        1.16       1      1.27       2      1.48       3
   6        1.94       1      2.44       2      2.81       3

R1 = 7    R2 = 12    R3 = 17


Comparing Three or More Populations: Randomized Block Design

The Friedman Fr-Test

Required Conditions:
• Random assignment of treatments to units within blocks
• Measurements can be ranked within blocks
• The probability distributions from which the samples within each block are drawn are continuous

H0: The probability distributions for the p treatments are identical
Ha: At least two of the p probability distributions differ in location

Test statistic:

$$ F_r = \frac{12}{bp(p+1)} \sum_{j=1}^{p} R_j^2 - 3b(p+1) $$

Where
b = number of blocks
p = number of treatments
R_j = rank sum of the jth treatment, where the rank of each measurement is computed relative to its position within its own block

Rejection region: F_r > χ²_α with (p − 1) degrees of freedom
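The statistic F_r = 12/(bp(p + 1)) · Σ R_j² − 3b(p + 1) can be illustrated with the reaction-time data for the three drugs shown earlier. The sketch below ranks within each subject (block), sums the ranks per drug, and evaluates F_r. The helper names are assumptions, and the simple ranking helper presumes no ties within a block, which holds for this data.

```python
def rank_within_block(row):
    """Ranks 1..p for one block; assumes no tied values within the block."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0] * len(row)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks

def friedman_fr(rank_sums, b):
    """Friedman F_r from the treatment rank sums and the number of blocks b."""
    p = len(rank_sums)
    return 12 / (b * p * (p + 1)) * sum(r**2 for r in rank_sums) - 3 * b * (p + 1)

# Reaction times: one row per subject, columns are Drug A, Drug B, Drug C.
times = [
    [1.21, 1.48, 1.56],
    [1.63, 1.85, 2.01],
    [1.42, 2.06, 1.70],
    [2.43, 1.98, 2.64],
    [1.16, 1.27, 1.48],
    [1.94, 2.44, 2.81],
]

block_ranks = [rank_within_block(row) for row in times]
rank_sums = [sum(column) for column in zip(*block_ranks)]
fr = friedman_fr(rank_sums, b=len(times))

print(f"rank sums = {rank_sums}, Fr = {fr:.3f}")
```

The rank sums come out as 7, 12, and 17, and F_r ≈ 8.33. A chi-square table gives about 5.99 for α = .05 with p − 1 = 2 degrees of freedom, so this value falls in the rejection region.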


Rank Correlation

Spearman’s rank correlation coefficient r_s provides a measure of correlation between ranks

Brake Rankings of New Car Models: Less than Perfect Agreement

               Magazine       Difference between Rank 1 and Rank 2
Car Model      1      2           d         d²
    1          4      5          -1         1
    2          1      2          -1         1
    3          9     10          -1         1
    4          5      6          -1         1
    5          2      1           1         1
    6         10      9           1         1
    7          7      7           0         0
    8          3      3           0         0
    9          6      4           2         4
   10          8      8           0         0

Σd² = 10


Rank Correlation

Conditions Required:
The sample of experimental units is randomly selected
The probability distributions of the two variables are continuous

One-Tailed Test
H0: ρ = 0
Ha: ρ < 0 [or Ha: ρ > 0]
Rejection region: r_s < −r_{s,α} [or r_s > r_{s,α} when Ha: ρ > 0]

Two-Tailed Test
H0: ρ = 0
Ha: ρ ≠ 0
Rejection region: |r_s| > r_{s,α/2}

Test statistic:

$$ r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)} $$

Where d_i = u_i − v_i (the difference in the ranks of the ith observations for samples 1 and 2)
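The coefficient r_s = 1 − 6Σd_i² / (n(n² − 1)) can be checked against the brake rankings of the ten car models given earlier. The sketch below is illustrative only: the variable names are assumptions, and this shortcut formula is exact only when there are no tied ranks, which is the case for this data.

```python
def spearman_rs(ranks_u, ranks_v):
    """Spearman's r_s from two lists of ranks (shortcut formula, no ties)."""
    n = len(ranks_u)
    sum_d2 = sum((u - v) ** 2 for u, v in zip(ranks_u, ranks_v))
    return 1 - 6 * sum_d2 / (n * (n**2 - 1))

# Brake rankings of the ten car models by the two magazines.
magazine_1 = [4, 1, 9, 5, 2, 10, 7, 3, 6, 8]
magazine_2 = [5, 2, 10, 6, 1, 9, 7, 3, 4, 8]

sum_d2 = sum((u - v) ** 2 for u, v in zip(magazine_1, magazine_2))
rs = spearman_rs(magazine_1, magazine_2)

print(f"sum of d^2 = {sum_d2}, r_s = {rs:.3f}")
```

With Σd² = 10 and n = 10, r_s = 1 − 60/990 ≈ .94, reflecting the close but imperfect agreement between the two magazines.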
