Chapter 14 Nonparametric Statistics

Chapter 14 Nonparametric Statistics. 2 Introduction: Distribution-Free Tests Distribution-free tests – statistical tests that don’t rely on assumptions




Introduction: Distribution-Free Tests

Distribution-free tests – statistical tests that don’t rely on assumptions about the probability distribution of the sampled population

Nonparametrics – branch of inferential statistics devoted to distribution-free tests

Rank statistics (rank tests) – nonparametric statistics based on the ranks of measurements


Single Population Inferences

The sign test is used to make inferences about the central tendency of a single population

The test is based on the population median η

The test involves hypothesizing a value η0 for the population median, then checking whether the numbers of sample values above and below η0 differ by more than chance alone would explain


Single Population Inferences

Sign Test for a Population Median η

One-Tailed Test
  H0: η = η0
  Ha: η < η0 [or Ha: η > η0]
  Test statistic: S = number of sample measurements less than η0 [or S = number of sample measurements greater than η0]
  Observed significance level: p-value = P(x ≥ S)

Two-Tailed Test
  H0: η = η0
  Ha: η ≠ η0
  Test statistic: S = larger of S1 and S2, where S1 is the number of measurements less than η0 and S2 is the number of measurements greater than η0
  Observed significance level: p-value = 2P(x ≥ S)

In both cases x has a binomial distribution with parameters n and p = .5

Rejection region: Reject H0 if p-value ≤ α (here α = .05)

Conditions required for sign test – sample must be randomly selected from a continuous probability distribution
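The p-value in the box above is a plain binomial tail sum, so it can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `sign_test_p_value`, its `alternative` argument, and the `readings` data are all invented here, and dropping observations equal to η0 is an assumption (the slide's continuity condition means ties should not occur).

```python
from math import comb

def sign_test_p_value(sample, eta0, alternative="two-sided"):
    """Exact sign test for H0: median = eta0 (hypothetical helper)."""
    # Assumption: observations equal to eta0 carry no sign and are dropped.
    above = sum(1 for x in sample if x > eta0)
    below = sum(1 for x in sample if x < eta0)
    n = above + below
    if alternative == "greater":      # Ha: eta > eta0
        s = above
    elif alternative == "less":       # Ha: eta < eta0
        s = below
    else:                             # Ha: eta != eta0, S = larger count
        s = max(above, below)
    # Upper tail P(x >= S) for x ~ Binomial(n, 0.5).
    tail = sum(comb(n, k) for k in range(s, n + 1)) / 2 ** n
    p_value = tail if alternative != "two-sided" else min(1.0, 2 * tail)
    return s, p_value

# Hypothetical data, invented for illustration only.
readings = [1.2, 3.4, 2.2, 4.1, 3.9, 2.8, 3.6, 4.4]
s, p_value = sign_test_p_value(readings, eta0=2.0, alternative="greater")
print(s, round(p_value, 4))  # 7 0.0352
```

With 7 of 8 readings above the hypothesized median, the tail probability is 9/256, which falls below .05.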


Single Population Inferences

Large-Sample Sign Test for a Population Median η

Conditions required for sign test – sample must be randomly selected from a continuous probability distribution

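One-Tailed Test
  H0: η = η0
  Ha: η < η0 [or Ha: η > η0]
  Rejection region: $z > z_{\alpha}$

Two-Tailed Test
  H0: η = η0
  Ha: η ≠ η0
  Rejection region: $z > z_{\alpha/2}$

Test statistic (S is calculated as in the small-sample sign test):

$z = \dfrac{(S - .5) - .5n}{.5\sqrt{n}}$

Observed significance level: p-value = P(x ≥ S) for the one-tailed test and p-value = 2P(x ≥ S) for the two-tailed test, where x has a binomial distribution with parameters n and p = .5. For large n the z statistic gives the normal approximation to these tail probabilities.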
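As a rough sketch of the large-sample statistic, the snippet below evaluates z with the continuity correction (S − .5) and converts it to an approximate one-tailed p-value. The function name and the counts (27 of 40 measurements beyond η0) are hypothetical, chosen only to show the arithmetic.

```python
from math import sqrt
from statistics import NormalDist

def large_sample_sign_z(s, n):
    """z = ((S - .5) - .5n) / (.5 * sqrt(n)); hypothetical helper name."""
    return ((s - 0.5) - 0.5 * n) / (0.5 * sqrt(n))

# Hypothetical counts: S = 27 of n = 40 measurements fall on the tested side.
z = large_sample_sign_z(s=27, n=40)

# Normal approximation to the binomial upper tail P(x >= S).
p_one_tailed = 1 - NormalDist().cdf(z)
print(round(z, 3), round(p_one_tailed, 3))
```

Here z is about 2.06, which exceeds the one-tailed critical value 1.645 for α = .05.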


Comparing Two Populations: Independent Samples

The Wilcoxon Rank Sum Test is used when two independent random samples are being used to compare two populations, and the t-test is not appropriate

It tests the hypothesis that the probability distributions associated with the two populations are identical


Comparing Two Populations: Independent Samples

Pool the data from both samples and rank all measurements together, from smallest to largest

If populations are the same, ranks should be randomly mixed between the samples

Test statistic is based on the rank sums – the totals of the ranks for each of the samples. T1 is the sum for sample 1, T2 is the sum for sample 2

Percentage Cost of Living Change, as Predicted by Government and University Economists

Government Economist (1)      University Economist (2)
Prediction    Rank            Prediction    Rank
   3.1          4                4.4          6
   4.8          7                5.8          9
   2.3          2                3.9          5
   5.6          8                8.7         11
   0.0          1                6.3         10
   2.9          3               10.5         12
                                10.8         13
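The ranking step and the two rank sums can be checked with a short Python sketch. The data are the predictions from the table; the helper `midrank` is a hypothetical name, and averaging tied ranks is an assumption that does not matter here because no two predictions are equal.

```python
def midrank(x, values):
    """Rank of x within values (1 = smallest); ties share the average rank."""
    less = sum(v < x for v in values)
    equal = sum(v == x for v in values)
    return less + (equal + 1) / 2

government = [3.1, 4.8, 2.3, 5.6, 0.0, 2.9]          # sample 1
university = [4.4, 5.8, 3.9, 8.7, 6.3, 10.5, 10.8]   # sample 2

# Ranks are assigned within the pooled data, not within each sample.
pooled = government + university
t1 = sum(midrank(x, pooled) for x in government)
t2 = sum(midrank(x, pooled) for x in university)
print(t1, t2)  # 25.0 66.0
```

The two sums always add to n(n + 1)/2 for the pooled sample size n, here 13(14)/2 = 91, which is a quick arithmetic check on the ranking.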


Comparing Two Populations: Independent Samples

Wilcoxon Rank Sum Test: Independent Samples

Required Conditions:
•Random, independent samples
•The probability distributions from which the samples are drawn are continuous

One-Tailed Test
  H0: D1 and D2 are identical
  Ha: D1 is shifted to the right of D2 [or Ha: D1 is shifted to the left of D2]
  Test statistic: T1, if n1 < n2; T2, if n2 < n1 (either rank sum can be used if n1 = n2)
  Rejection region:
    T1: T1 ≥ TU [or T1 ≤ TL]
    T2: T2 ≤ TL [or T2 ≥ TU]

Two-Tailed Test
  H0: D1 and D2 are identical
  Ha: D1 is shifted either to the left or to the right of D2
  Test statistic: T1, if n1 < n2; T2, if n2 < n1 (either rank sum can be used if n1 = n2). We will denote this rank sum as T
  Rejection region: T ≤ TL or T ≥ TU

Where TL and TU are obtained from a table of critical values


Comparing Two Populations: Independent Samples

Wilcoxon Rank Sum Test for Large Samples (n1 ≥ 10 and n2 ≥ 10)

One-Tailed Test
  H0: D1 and D2 are identical
  Ha: D1 is shifted to the right of D2 [or Ha: D1 is shifted to the left of D2]
  Rejection region: $z > z_{\alpha}$ (or $z < -z_{\alpha}$)

Two-Tailed Test
  H0: D1 and D2 are identical
  Ha: D1 is shifted either to the left or to the right of D2
  Rejection region: $|z| > z_{\alpha/2}$

Test statistic:

$z = \dfrac{T_1 - \dfrac{n_1(n_1 + n_2 + 1)}{2}}{\sqrt{\dfrac{n_1 n_2 (n_1 + n_2 + 1)}{12}}}$
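The large-sample statistic standardizes T1 by its mean and standard deviation under H0. A minimal sketch follows; the function name `rank_sum_z` and the inputs (n1 = n2 = 10, T1 = 130) are hypothetical and serve only to show the calculation.

```python
from math import sqrt
from statistics import NormalDist

def rank_sum_z(t1, n1, n2):
    """Large-sample Wilcoxon rank sum z statistic (hypothetical helper)."""
    mean_t1 = n1 * (n1 + n2 + 1) / 2           # expected T1 under H0
    sd_t1 = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # standard deviation of T1 under H0
    return (t1 - mean_t1) / sd_t1

# Hypothetical inputs: two samples of 10, rank sum of sample 1 equal to 130.
z = rank_sum_z(t1=130, n1=10, n2=10)

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha)  # one-tailed critical value z_alpha
print(round(z, 3), round(z_crit, 3), z > z_crit)
```

With these inputs z is about 1.89, so the one-tailed test with Ha: D1 shifted to the right of D2 would reject H0 at α = .05, while the two-tailed test (critical value about 1.96) would not.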


Comparing Two Populations: Paired Differences Experiment

Wilcoxon Signed Rank Test: An alternative test to the paired difference of means procedure

The analysis is based on the ranks of the absolute values of the paired differences

Any differences of 0 are eliminated, and n is reduced accordingly

Softness Ratings of Paper

         Product      Difference   Absolute Value   Rank of
Judge    A      B     (A-B)        of Difference    Absolute Value
  1      6      4        2              2               5
  2      8      5        3              3               7.5
  3      4      5       -1              1               2
  4      9      8        1              1               2
  5      4      1        3              3               7.5
  6      7      9       -2              2               5
  7      6      2        4              4               9
  8      5      3        2              2               5
  9      6      7       -1              1               2
 10      8      2        6              6              10

T+ = Sum of positive ranks = 46
T- = Sum of negative ranks = 9
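The signed rank sums for the softness ratings can be reproduced with a short sketch. The ratings are taken from the table; `midrank` is a hypothetical helper that gives tied absolute differences the average of the ranks they occupy, which is how the 2, 5, and 7.5 entries arise.

```python
def midrank(x, values):
    """Rank of x within values (1 = smallest); ties share the average rank."""
    less = sum(v < x for v in values)
    equal = sum(v == x for v in values)
    return less + (equal + 1) / 2

product_a = [6, 8, 4, 9, 4, 7, 6, 5, 6, 8]
product_b = [4, 5, 5, 8, 1, 9, 2, 3, 7, 2]

# Differences of 0 are eliminated, which reduces n accordingly.
diffs = [a - b for a, b in zip(product_a, product_b) if a != b]
abs_diffs = [abs(d) for d in diffs]

t_plus = sum(midrank(abs(d), abs_diffs) for d in diffs if d > 0)
t_minus = sum(midrank(abs(d), abs_diffs) for d in diffs if d < 0)
print(t_plus, t_minus)  # 46.0 9.0
```

The two sums add to n(n + 1)/2 = 55 for the n = 10 nonzero differences.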


Comparing Two Populations: Paired Differences Experiment

Wilcoxon Signed Rank Test for a Paired Difference Experiment

Let D1 and D2 represent the probability distributions for populations 1 and 2, respectively

One-Tailed Test
  H0: D1 and D2 are identical
  Ha: D1 is shifted to the right of D2 [or Ha: D1 is shifted to the left of D2]
  Test statistic: T-, the rank sum of the negative differences [or T+, the rank sum of the positive differences]
  Rejection region: T- ≤ T0 [or T+ ≤ T0]

Two-Tailed Test
  H0: D1 and D2 are identical
  Ha: D1 is shifted either to the left or to the right of D2
  Test statistic: T, the smaller of T+ and T-
  Rejection region: T ≤ T0

Where T0 is obtained from a table of critical values

Required Conditions:
•The sample of differences is randomly selected
•The probability distribution from which the sample of differences is drawn is continuous


Comparing Three or More Populations: Completely Randomized Design

Kruskal-Wallis H-Test

An alternative to the completely randomized ANOVA

Based on comparison of rank sums

Number of Available Beds

Hospital 1        Hospital 2        Hospital 3
Beds   Rank       Beds   Rank       Beds   Rank
  6      5         34     25         13      9.5
 38     27         28     19         35     26
  3      2         42     30         19     15
 17     13         13      9.5        4      3
 11      8         40     29         29     20
 30     21         31     22          0      1
 15     11          9      7          7      6
 16     12         32     23         33     24
 25     17         39     28         18     14
  5      4         27     18         24     16

R1 = 120          R2 = 210.5        R3 = 134.5


Comparing Three or More Populations: Completely Randomized Design

Kruskal-Wallis H-Test for Comparing k Probability Distributions

Required Conditions:
•The k samples are random and independent
•5 or more measurements per sample
•The probability distributions from which the samples are drawn are continuous

H0: The k probability distributions are identical
Ha: At least two of the k probability distributions differ in location

Test statistic:

$H = \dfrac{12}{n(n+1)} \sum_{j=1}^{k} \dfrac{R_j^2}{n_j} - 3(n+1)$

Where
  nj = number of measurements in sample j
  Rj = rank sum for sample j, where the rank of each measurement is computed according to its relative magnitude in the totality of data for the k samples
  n = total sample size = n1 + n2 + … + nk

Rejection region: $H > \chi^2_{\alpha}$ with (k-1) degrees of freedom
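Applying the H formula to the hospital bed counts from the earlier table gives a concrete check. The sketch below is illustrative only: `midrank` is a hypothetical helper, no tie correction is applied (the formula is used exactly as stated), and the p-value line relies on the fact that a chi-square distribution with 2 degrees of freedom has upper tail exp(-x/2), so it only works for k = 3.

```python
from math import exp

def midrank(x, values):
    """Rank of x within values (1 = smallest); ties share the average rank."""
    less = sum(v < x for v in values)
    equal = sum(v == x for v in values)
    return less + (equal + 1) / 2

samples = [
    [6, 38, 3, 17, 11, 30, 15, 16, 25, 5],     # Hospital 1
    [34, 28, 42, 13, 40, 31, 9, 32, 39, 27],   # Hospital 2
    [13, 35, 19, 4, 29, 0, 7, 33, 18, 24],     # Hospital 3
]

pooled = [x for sample in samples for x in sample]
n = len(pooled)

# Rank sum R_j of each sample, with ranks taken over the pooled data.
rank_sums = [sum(midrank(x, pooled) for x in sample) for sample in samples]

h = 12 / (n * (n + 1)) * sum(
    r ** 2 / len(sample) for r, sample in zip(rank_sums, samples)
) - 3 * (n + 1)

# Valid only because k - 1 = 2 degrees of freedom here.
p_value = exp(-h / 2)
print(rank_sums, round(h, 3), round(p_value, 3))
```

The rank sums match the table (120, 210.5, 134.5) and H comes out near 6.10, just above the 2-degree-of-freedom critical value of about 5.99 at α = .05.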


Comparing Three or More Populations: Randomized Block Design

The Friedman Fr Test

A nonparametric method for the randomized block design

Based on comparison of rank sums

Reaction Time for Three Drugs

Subject   Drug A   Rank   Drug B   Rank   Drug C   Rank
   1       1.21     1      1.48     2      1.56     3
   2       1.63     1      1.85     2      2.01     3
   3       1.42     1      2.06     3      1.70     2
   4       2.43     2      1.98     1      2.64     3
   5       1.16     1      1.27     2      1.48     3
   6       1.94     1      2.44     2      2.81     3

R1 = 7   R2 = 12   R3 = 17


Comparing Three or More Populations: Randomized Block Design

The Friedman Fr-test

Required Conditions:
•Random assignment of treatments to units within blocks
•Measurements can be ranked within blocks
•The probability distributions from which the samples within each block are drawn are continuous

H0: The probability distributions for the p treatments are identical
Ha: At least two of the p probability distributions differ in location

Test statistic:

$F_r = \dfrac{12}{bp(p+1)} \sum_{j=1}^{p} R_j^2 - 3b(p+1)$

Where
  b = number of blocks
  p = number of treatments
  Rj = rank sum of the jth treatment, where the rank of each measurement is computed relative to its position within its own block

Rejection region: $F_r > \chi^2_{\alpha}$ with (p-1) degrees of freedom
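The Friedman statistic for the reaction-time data shown earlier can be worked through in the same way. This is a sketch under stated assumptions: `midrank` is a hypothetical helper, ranks are assigned within each subject (block), and the p-value uses the closed-form chi-square tail exp(-x/2) that holds only for 2 degrees of freedom (p = 3 treatments).

```python
from math import exp

def midrank(x, values):
    """Rank of x within values (1 = smallest); ties share the average rank."""
    less = sum(v < x for v in values)
    equal = sum(v == x for v in values)
    return less + (equal + 1) / 2

# One row per subject (block): reaction times for drugs A, B, C.
times = [
    (1.21, 1.48, 1.56),
    (1.63, 1.85, 2.01),
    (1.42, 2.06, 1.70),
    (2.43, 1.98, 2.64),
    (1.16, 1.27, 1.48),
    (1.94, 2.44, 2.81),
]

b = len(times)      # number of blocks
p = len(times[0])   # number of treatments

# Rank within each block, then total the ranks per treatment.
rank_sums = [sum(midrank(row[j], row) for row in times) for j in range(p)]

f_r = 12 / (b * p * (p + 1)) * sum(r ** 2 for r in rank_sums) - 3 * b * (p + 1)

# Valid only because p - 1 = 2 degrees of freedom here.
p_value = exp(-f_r / 2)
print(rank_sums, round(f_r, 2), round(p_value, 4))
```

The rank sums reproduce R1 = 7, R2 = 12, R3 = 17, and F_r is about 8.33, which exceeds the critical value of about 5.99 at α = .05.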


Rank Correlation

Spearman’s rank correlation coefficient $r_s$ provides a measure of correlation between ranks

Brake Rankings of New Car Models: Less than Perfect Agreement

             Magazine      Difference between Rank 1 and Rank 2
Car Model    1      2         d        d²
    1        4      5        -1        1
    2        1      2        -1        1
    3        9     10        -1        1
    4        5      6        -1        1
    5        2      1         1        1
    6       10      9         1        1
    7        7      7         0        0
    8        3      3         0        0
    9        6      4         2        4
   10        8      8         0        0

$\sum d^2 = 10$


Rank Correlation

Conditions Required:
•The sample of experimental units is randomly selected
•The probability distributions of the two variables are continuous

One-Tailed Test
  H0: ρ = 0
  Ha: ρ < 0 [or Ha: ρ > 0]
  Rejection region: $r_s < -r_{s,\alpha}$ (or $r_s > r_{s,\alpha}$ when Ha: ρ > 0)

Two-Tailed Test
  H0: ρ = 0
  Ha: ρ ≠ 0
  Rejection region: $|r_s| > r_{s,\alpha/2}$

Test statistic:

$r_s = 1 - \dfrac{6 \sum d_i^2}{n(n^2 - 1)}$

Where di = ui – vi (the difference in the ranks of the ith observations for samples 1 and 2)
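For the brake rankings tabulated earlier, the coefficient follows directly from the rank differences. The sketch below uses the two magazines' rankings as given; the variable names are invented, and the shortcut formula is assumed to be appropriate because neither ranking contains ties.

```python
# Rankings of the ten car models by each magazine, in car-model order.
magazine_1 = [4, 1, 9, 5, 2, 10, 7, 3, 6, 8]
magazine_2 = [5, 2, 10, 6, 1, 9, 7, 3, 4, 8]

n = len(magazine_1)

# d_i = u_i - v_i, the difference in ranks for car model i.
sum_d2 = sum((u - v) ** 2 for u, v in zip(magazine_1, magazine_2))

r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
print(sum_d2, round(r_s, 3))  # 10 0.939
```

A value this close to 1 reflects strong but not perfect agreement between the two rankings; whether it is significant depends on the tabled critical value for n = 10.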