Nonparametric Tests IPS Chapter 15 © 2009 W.H. Freeman and Company


Objectives (IPS Chapter 15)

Nonparametric Tests

The Wilcoxon rank sum test

The Normal approximation for W

What hypotheses does Wilcoxon test?

The Wilcoxon signed rank test

The Normal approximation for W+

Dealing with ties

The Kruskal-Wallis test

Assumptions for inference

For the inference methods for means we have already studied, we

assumed that the variables have Normal distributions in the

population(s) from which we draw our data.

Robustness: some skewness was acceptable, especially if the sample

size was large.

What happens if plots suggest the data are clearly not Normal,

especially if the sample size is small?

Options for non-Normal data and small n

1. Is lack of Normality due to outliers? If an outlier appears to be “real data,” you have to leave it in, but if you have reason to think there is an error in that data, you may be able to remove it.

2. Try transforming the data. For example use a logarithm for right-skewed

data.

3. Try another standard distribution. Other procedures can replace the t

procedures if data (especially right-skewed data) fits another distribution.

4. Use modern bootstrap methods and permutation tests. Heavy computing

avoids requiring Normality or any other specific form of sampling

distribution.

5. Use other nonparametric methods. Discussed in this chapter.

Ranks

Hypotheses for rank tests just replace the mean with the median.

For strongly skewed data, we prefer the median to the mean for describing the center of the data.

To rank observations, first arrange them in order from smallest to

largest. The rank of each observation is its position in this ordered list,

starting with rank 1 for the smallest observation.

Example: Weeds among the corn

Does the presence of small numbers of weeds reduce the yield of corn?

Lamb’s-quarter is a common weed in corn fields. A researcher planted corn at

the same rate in 8 small plots of ground, then weeded the corn rows by hand to

allow no weeds in 4 randomly selected plots and exactly 3 lamb’s–quarter plants

per meter of row in the other 4 plots.

Here are the yields of corn (bushels per acre) in each of the plots.

A back-to-back stemplot shows non-Normality and possible outliers, and the sample sizes are small.


First rank all 8 observations together.

Arrange them in order from smallest to largest.

The shaded numbers are those with no weeds.

Note that 4 of the 5 highest yields are from the no weeds group.

The idea of rank tests is to look just at position in this list.

Working with ranks allows us to dispense with the numerical values of the data and the specific conditions on the shape of the distribution such as Normality.


If the presence of weeds reduces corn yields, we expect the ranks of the yields

from plots without weeds to be larger as a group than the ranks from plots with

weeds.

Compare the sums of the ranks from the two treatments.

If the weeds have no effect, we would expect the sum of the ranks in either group to be 18. Why?
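One way to answer the question, as a short worked calculation: the 8 ranks always add up to the same total, and if weeds have no effect, a group holding 4 of the 8 plots is expected to hold half of that total.

```latex
1 + 2 + \cdots + 8 = \frac{8 \cdot 9}{2} = 36,
\qquad
\frac{4}{8} \times 36 = 18
```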

Wilcoxon Rank Sum Test

Draw an SRS of size n1 from one population and draw an independent SRS of size n2 from a second population. There are N observations in all, where N = n1 + n2. Rank all N observations. The sum W of the ranks for the first sample is the Wilcoxon rank sum statistic. If the two populations have the same continuous distribution, then W has mean

$$\mu_W = \frac{n_1(N+1)}{2}$$

and standard deviation

$$\sigma_W = \sqrt{\frac{n_1 n_2 (N+1)}{12}}$$

The Wilcoxon rank sum test rejects the hypothesis that the two populations have identical distributions when the rank sum W is far from its mean.
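The mean n1(N + 1)/2 and standard deviation sqrt(n1 n2 (N + 1)/12) of W under the null hypothesis can be computed directly. The following is a minimal sketch; the function name `rank_sum_null_moments` is chosen here for illustration and does not come from the text.

```python
import math

def rank_sum_null_moments(n1, n2):
    """Mean and standard deviation of the rank sum W for the first sample,
    assuming both populations have the same continuous distribution."""
    N = n1 + n2
    mean_w = n1 * (N + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (N + 1) / 12)
    return mean_w, sd_w

# Sample sizes from the corn example: 4 weed-free plots, 4 plots with weeds.
mean_w, sd_w = rank_sum_null_moments(4, 4)
print(mean_w, round(sd_w, 3))  # 18.0 3.464
```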


In this study, we want to test the hypotheses

H0: No difference in distribution of yields.

Ha: Yields are systematically higher in weed-free plots.

The test statistic is the rank sum W = 23 for the weed-free plots.

Conditions for Wilcoxon test are met:

data come from a randomized comparative experiment.

yield of corn in bushels per acre has a continuous distribution.


N = 8, n1 (no weeds) = 4, and n2 (3 weeds per meter) = 4.

The sum of ranks for the weed-free plots has mean and standard deviation:

$$\mu_W = \frac{n_1(N+1)}{2} = \frac{4(9)}{2} = 18$$

$$\sigma_W = \sqrt{\frac{n_1 n_2 (N+1)}{12}} = \sqrt{\frac{(4)(4)(9)}{12}} = 3.464$$

The observed rank sum W = 23 is only 1.4 standard deviations above the mean.

Software tells us that the P-value, P(W ≥ 23), is 0.1.

We cannot reject the null hypothesis.

We do not have enough evidence to say that yields are systematically higher in weed-free plots.

A larger sample size might clarify the effect of weeds on corn yield.
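For samples this small, the exact P-value can be checked by brute force. Under the null hypothesis, every choice of 4 ranks out of 1 to 8 is equally likely for the weed-free group. The sketch below enumerates all such choices and counts those with a rank sum of at least 23. It assumes there are no ties, and the function name `exact_upper_p` is an illustrative choice, not something from the text or from JMP.

```python
from fractions import Fraction
from itertools import combinations

def exact_upper_p(n1, n2, w_obs):
    """Exact P(W >= w_obs) under the null hypothesis, assuming no ties.

    Every subset of n1 ranks out of 1..N is equally likely for sample 1.
    """
    N = n1 + n2
    sums = [sum(c) for c in combinations(range(1, N + 1), n1)]
    hits = sum(1 for s in sums if s >= w_obs)
    return Fraction(hits, len(sums))

p = exact_upper_p(4, 4, 23)
print(p, float(p))  # 1/10 0.1
```

Seven of the 70 equally likely rank assignments give W ≥ 23, which agrees with the P-value of 0.1 quoted from software.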

The Normal Approximation for W

To calculate the P-value for the Wilcoxon rank sum test, we need to know the sampling distribution of W when the null hypothesis is true.

With or without software, P-values for the Wilcoxon test are often

based on the fact that the rank sum statistic W becomes

approximately Normal as the two sample sizes increase.

Test statistic:

$$z = \frac{W - \mu_W}{\sigma_W} = \frac{W - n_1(N+1)/2}{\sqrt{n_1 n_2 (N+1)/12}}$$


We can improve this approximation by using the continuity correction. You

use this for a variable that takes only whole-number values, like W. Act as if

each whole number occupies the entire interval from 0.5 below the number to

0.5 above it.

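Software tells us that the exact P-value, P(W ≥ 23), is 0.1.

Without the continuity correction:

$$z = \frac{W - \mu_W}{\sigma_W} = \frac{23 - 18}{3.464} = 1.44, \qquad P\text{-value} = P(Z \ge 1.44) = 0.0749$$

With the continuity correction:

$$z = \frac{W - \mu_W}{\sigma_W} = \frac{22.5 - 18}{3.464} = 1.30, \qquad P\text{-value} = P(Z \ge 1.30) = 0.0968$$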
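The two Normal-approximation calculations, with and without the continuity correction, can be reproduced with Python's standard library. This is a sketch for the one-sided alternative "values in sample 1 are systematically larger"; the function name `normal_approx_upper_p` and its `continuity` parameter are illustrative assumptions.

```python
import math
from statistics import NormalDist

def normal_approx_upper_p(w_obs, n1, n2, continuity=True):
    """Normal approximation to P(W >= w_obs) under the null hypothesis.

    With continuity=True, W >= w_obs is treated as W >= w_obs - 0.5.
    Returns the z statistic and the one-sided P-value.
    """
    N = n1 + n2
    mean_w = n1 * (N + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (N + 1) / 12)
    w = w_obs - 0.5 if continuity else w_obs
    z = (w - mean_w) / sd_w
    return z, 1 - NormalDist().cdf(z)

z_plain, p_plain = normal_approx_upper_p(23, 4, 4, continuity=False)
z_cc, p_cc = normal_approx_upper_p(23, 4, 4, continuity=True)
print(round(z_plain, 2), round(p_plain, 3))
print(round(z_cc, 2), round(p_cc, 3))
```

The results can differ slightly in the last digits from the slide values, which round z to two decimals before looking up the tail area. The corrected value is closer to the exact P-value of 0.1.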

Use software that gives the exact P-value for the Wilcoxon test rather than the Normal approximation.

Here is the output from JMP; notice that it gives only the Normal approximation for the Wilcoxon rank sum test.

Using technology

Oneway Analysis of yield By group

Quantiles
Level      Minimum  10%    25%      Median  75%      90%    Maximum
no weeds   165      165    165.425  169.45  175.725  176.9  176.9
weeds      153.1    153.1  153.825  157.3   171.95   176.4  176.4

Wilcoxon / Kruskal-Wallis Tests (Rank Sums)
Level      Count  Score Sum  Score Mean  (Mean-Mean0)/Std0
no weeds   4      23.000     5.75000      1.299
weeds      4      13.000     3.25000     -1.299

2-Sample Test, Normal Approximation
S    Z          Prob>|Z|
13   -1.29904   0.1939

Small sample sizes. Refer to statistical tables for tests, rather than large-sample approximations.

Here's a place to find a small sample table:

http://www.socr.ucla.edu/Applets.dir/WilcoxonRankSumTable.htm

What hypotheses does Wilcoxon test?

If we assume that our populations are Normally distributed, we can use the two-sample t test for means:

H0: μ1 = μ2

Ha: μ1 > μ2

When the distributions may not be Normal, we might restate the hypotheses in terms of population medians rather than means:

H0: median1 = median2

Ha: median1 > median2

The Wilcoxon rank sum test tests the hypotheses about medians only if an additional condition is met: both populations must have distributions of the same shape.

What hypotheses does Wilcoxon test?

The same-shape condition is too strict to be reasonable in practice.

A more useful statement of the hypotheses compares two continuous distributions, whether or not they have the same shape.

These hypotheses are considered “nonparametric” because they do

not include a parameter. They are just stated in words.

H0: the two distributions are the same

Ha: one has values that are systematically larger

Dealing with ties in rank tests

Up until now, no two values in our data have been exactly the same.

However, we often find observations tied at the same value.

The usual practice is to assign all tied values the average of the ranks they occupy; these are sometimes called midranks.

In practice, software is required to use rank tests when the data contain tied values.
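The midrank rule can be sketched in a few lines. The function name `midranks` and the input values below are made up for illustration and are not data from the corn study.

```python
def midranks(values):
    """Rank values from smallest (rank 1) to largest, giving tied values
    the average of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend `end` to cover the whole run of values tied with order[start].
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Positions start..end (0-based) occupy ranks start+1..end+1.
        average_rank = (start + 1 + end + 1) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1
    return ranks

# Hypothetical values: the two 20s occupy ranks 2 and 3, so each gets 2.5.
print(midranks([10, 20, 20, 30]))  # [1.0, 2.5, 2.5, 4.0]
```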

ANOVA hypotheses:

Data should come from independent random samples, all Normally distributed with the

same standard deviation

Kruskal-Wallis hypotheses:

1. Data should come from independent random samples, the response has a continuous

(but not necessarily Normal) distribution.

2. Data should come from independent random samples, the response has a continuous

(but not necessarily Normal) distribution, and the samples come from population

distributions of the same shape (not necessarily Normal).

Comparing several samples: the Kruskal-Wallis test

H0: M0 = M1 = M3 = M9

Ha: not all four medians are equal


Lamb’s-quarter is a common weed in corn fields. A researcher planted corn at

the same rate in 16 small plots of ground, then randomly assigned the plots to 4

groups. He weeded the corn rows by hand to allow a fixed number of lamb’s-

quarter plants to grow in each meter of corn row. These numbers were 0, 1, 3,

and 9 in the four groups of plots. No other weeds were allowed to grow, and all

plots received identical treatment except for the weeds.

Here are the yields of corn (bushels per acre) in each of the plots.


Here are the summary statistics for the corn yield.

Can we safely use ANOVA? The standard deviations don’t pass the largest s < 2 (smallest s) test, and there were outliers in the original data that cannot be removed.

Can we use the Kruskal-Wallis test for medians? The different standard deviations suggest that the distributions do not all have the same shape.

Example: Weeds among the corn

Rank all 16 observations in order from smallest to largest.

Note the tied observations.

Kruskal-Wallis test statistic

Using Table E with df = 3, the P-value satisfies 0.10 < P < 0.15.

We do not reject the null hypothesis.
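As a rough check on this conclusion, the Kruskal-Wallis statistic can be computed from the group rank sums using the standard formula H = 12/(N(N + 1)) Σ Ri²/ni − 3(N + 1). The sketch below uses the rank sums reported in the JMP output for this example (52.5, 33.5, 25, 25, with 4 plots per group). It applies no adjustment for ties, and the function names are illustrative assumptions. The chi-square tail area uses a closed form that holds only for df = 3.

```python
import math

def kruskal_wallis_h(rank_sums, sizes):
    """Kruskal-Wallis H from group rank sums, with no adjustment for ties."""
    N = sum(sizes)
    total = sum(r * r / n for r, n in zip(rank_sums, sizes))
    return 12 / (N * (N + 1)) * total - 3 * (N + 1)

def chi2_upper_tail_df3(x):
    """P(X >= x) for a chi-square variable with exactly 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

# Rank sums for the 0, 1, 3, and 9 weeds-per-meter groups (from the JMP output).
h = kruskal_wallis_h([52.5, 33.5, 25.0, 25.0], [4, 4, 4, 4])
p = chi2_upper_tail_df3(h)
print(round(h, 2), round(p, 3))
```

This gives H of about 5.56 and a P-value between 0.10 and 0.15, in line with the Table E result. JMP reports a slightly larger chi-square value of 5.5725, a difference that would be consistent with an adjustment for the tied observations.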

JMP output for the

Kruskal-Wallis test

and ANOVA.

Using technology

Oneway Analysis of yield By group

Quantiles
Level     Minimum  10%    25%      Median  75%      90%    Maximum
0 weeds   165      165    165.425  169.45  175.725  176.9  176.9
1 weeds   157.3    157.3  158.25   163.65  166.575  166.7  166.7
3 weeds   153.1    153.1  153.825  157.3   171.95   176.4  176.4
9 weeds   142.4    142.4  147.4    162.55  162.775  162.8  162.8

Wilcoxon / Kruskal-Wallis Tests (Rank Sums)
Level     Count  Score Sum  Score Mean  (Mean-Mean0)/Std0
0 weeds   4      52.500     13.1250      2.184
1 weeds   4      33.500      8.3750      0.000
3 weeds   4      25.000      6.2500     -1.032
9 weeds   4      25.000      6.2500     -1.032

1-way Test, ChiSquare Approximation
ChiSquare  DF  Prob>ChiSq
5.5725     3   0.1344

Small sample sizes. Refer to statistical tables for tests, rather than large-sample approximations.
