Non-Parametric Methods

Peter T. Donnan

Professor of Epidemiology and Biostatistics

Statistics for Health Research

Objectives of Presentation

• Introduction
• Ranks & Median
• Wilcoxon Signed Rank Test
• Paired Wilcoxon Signed Rank
• Mann-Whitney test
• Spearman’s Rank Correlation Coefficient
• Others…

What are non-parametric tests?

• ‘Parametric’ tests involve estimating parameters such as the mean, and assume that the sample means are Normally distributed

• Often data do not follow a Normal distribution, e.g. number of cigarettes smoked, cost to NHS, etc.

• Such data typically have positively skewed distributions

A positively skewed distribution

[Figure: histogram of units of alcohol per week (x-axis, 0 to 50) against frequency (y-axis, 0 to 20). Mean = 8.03, Std. Dev. = 12.952, N = 30]

What are non-parametric tests?

• ‘Non-parametric’ (NP) tests were developed for these situations, where fewer assumptions have to be made

• NP tests STILL have assumptions, but these are less stringent

• NP tests can be applied to Normal data, but parametric tests have greater power IF their assumptions are met

Ranks

• The practical difference between parametric and NP methods is that NP methods use the ranks of values rather than the actual values

• E.g.

1, 2, 3, 4, 5, 7, 13, 22, 38, 45 - actual
1, 2, 3, 4, 5, 6,  7,  8,  9, 10 - rank
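The ranking step above can be sketched in a few lines of Python. This is an illustrative sketch only: the helper name `midranks` is made up for this example, and it assumes the common convention that tied values share the average of the ranks they occupy.

```python
def midranks(values):
    """Rank each value (1 = smallest); tied values share the average rank."""
    ordered = sorted(values)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1            # lowest rank occupied by v
        count = ordered.count(v)                # how many copies of v are tied
        ranks.append(first + (count - 1) / 2)   # average of the tied ranks
    return ranks

actual = [1, 2, 3, 4, 5, 7, 13, 22, 38, 45]
ranks = midranks(actual)
print(ranks)
```

Note that the extreme values 38 and 45 receive ranks 9 and 10, so their size no longer dominates the analysis.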

Median

• The median is the value above and below which 50% of the data lie

• If the data are ranked in order, it is the middle value (or the average of the two middle values when the number of observations is even)

• In symmetric distributions the mean and median are the same

• In skewed distributions, the median is the more appropriate measure

Median

• BPs:

135, 138, 140, 140, 141, 142, 143

Median = 140

• No. of cigarettes smoked:

0, 1, 2, 2, 2, 3, 5, 5, 8, 10

Median = 2.5
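The two medians can be checked with Python's standard `statistics` module. This is only a quick sketch; the variable names are chosen for illustration.

```python
import statistics

bps = [135, 138, 140, 140, 141, 142, 143]
cigarettes = [0, 1, 2, 2, 2, 3, 5, 5, 8, 10]

# Odd number of values: the single middle value
median_bp = statistics.median(bps)

# Even number of values: the average of the two middle values (2 and 3)
median_cigs = statistics.median(cigarettes)

print(median_bp, median_cigs)  # 140 2.5
```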

T-test

• The t-test is used to test whether the mean of a sample is significantly different from a hypothesised population mean

• The t-test relies on the sample being drawn from a Normally distributed population

• If the sample is not Normal, use the Wilcoxon Signed Rank Test as an alternative

Wilcoxon Signed Rank Test

• NP test relating to the median as the measure of central tendency

• The absolute differences between the data and the hypothesised median are calculated and ranked

• The ranks for the negative and the positive differences are then summed separately (W- and W+ respectively)

• The minimum of these is the test statistic, W

Wilcoxon Signed Rank Test: Example

The median heart rate for an 18 year old girl is supposed to be 82 bpm. A student takes the pulse rates of 8 female students (all aged 18):

83, 90, 96, 82, 85, 80, 81, 87

Do these results suggest that the median might not be 82?


H0: median = 82

H1: median ≠ 82

Two-tailed test

Because one result equals 82, it cannot be used in the analysis

Result   Above or below median   Absolute difference from median = 82   Rank of difference
83       +                       1                                      1.5
90       +                       8                                      6
96       +                       14                                     7
85       +                       3                                      4
80       -                       2                                      3
81       -                       1                                      1.5
87       +                       5                                      5

W+ = 1.5 + 6 + 7 + 4 + 5 = 23.5
W- = 3 + 1.5 = 4.5
So, W = 4.5

n = 7, and the value of W is greater than the tabulated value of 2, so p > 0.05


Therefore, the student should conclude that these results could have come from a population with a median of 82, as the result is not significantly different from the null hypothesis value.
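The hand calculation for the pulse-rate example can be reproduced with a short Python sketch. The helper names `midranks` and `signed_rank_sums` are hypothetical, written only for this illustration; the sketch assumes tied absolute differences share their average rank and that values equal to the hypothesised median are dropped, as on the slides.

```python
def midranks(values):
    """Rank each value (1 = smallest); tied values share the average rank."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]

def signed_rank_sums(differences):
    """Return (W+, W-, n) for a list of differences, ignoring zeros."""
    nonzero = [d for d in differences if d != 0]
    ranks = midranks([abs(d) for d in nonzero])
    w_plus = sum(r for d, r in zip(nonzero, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(nonzero, ranks) if d < 0)
    return w_plus, w_minus, len(nonzero)

pulse = [83, 90, 96, 82, 85, 80, 81, 87]
hypothesised_median = 82

w_plus, w_minus, n = signed_rank_sums([x - hypothesised_median for x in pulse])
w = min(w_plus, w_minus)  # the test statistic W
print(w_plus, w_minus, w, n)  # 23.5 4.5 4.5 7
```

The statistic W is then compared with a tabulated critical value for n = 7, as in the worked example.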

Wilcoxon Signed Rank Test: Normal Approximation

• As the number of ranks (n) becomes larger, the distribution of W becomes approximately Normal

• Generally, if n > 20

• Mean W = n(n+1)/4

• Variance W = n(n+1)(2n+1)/24

• Z = (W - mean W)/SD(W)
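The formulas above translate directly into code. The sketch below uses a hypothetical function name, `signed_rank_z`, and plugs in W = 4.5 and n = 7 from the pulse-rate example purely to show the arithmetic; with n = 7 the guideline of n > 20 is not met, so the resulting Z should not be used for inference there.

```python
import math

def signed_rank_z(w, n):
    """Normal approximation for the Wilcoxon signed rank statistic W."""
    mean_w = n * (n + 1) / 4
    var_w = n * (n + 1) * (2 * n + 1) / 24
    return (w - mean_w) / math.sqrt(var_w)

# Arithmetic illustration only: mean W = 14, variance W = 35
z = signed_rank_z(w=4.5, n=7)
print(round(z, 3))
```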

Wilcoxon Signed Rank Test: Assumptions

• Population should be approximately symmetrical but need not be Normal

• Results must be classified as either greater than or less than the median, i.e. exclude results equal to the median

• Can be used for small or large samples

Paired samples t-test

• Disadvantage: Assumes data are a random sample from a population which is Normally distributed

• Advantage: Uses all the detail of the available data, and if the data are Normally distributed it is the most powerful test

The Wilcoxon Signed Rank Test for Paired Comparisons

• Disadvantage: Only the sign (+ or -) and the rank of each change are analysed, not its actual size

• Advantage: Easy to carry out, and the data need not come from a Normally distributed population

Paired And Not Paired Comparisons

• If you have the same sample measured on two separate occasions then this is a paired comparison

• Two independent samples are not a paired comparison

• Different samples which are ‘matched’ by age and gender are paired

The Wilcoxon Signed Rank Test for Paired Comparisons

• Similar calculation to the Wilcoxon Signed Rank test, except that the differences between the paired results are ranked

• Example using SPSS:

A group of 10 patients with chronic anxiety receive sessions of cognitive therapy. Quality of Life scores are measured before and after therapy.

QoL Score

Before   After
6        9
5        12
3        9
4        9
2        3
1        1
3        2
8        12
6        9
12       10

Wilcoxon Signed Rank Test example: SPSS Output

[SPSS output not reproduced]

p < 0.05
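The rank sums behind the paired test can be sketched without SPSS. The code below is an illustration under stated assumptions: the helper `midranks` is hypothetical, differences are taken as after minus before, tied absolute differences share their average rank, and the patient with no change is dropped. It computes only W+ and W-, not the p-value that SPSS reports.

```python
def midranks(values):
    """Rank each value (1 = smallest); tied values share the average rank."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]

before = [6, 5, 3, 4, 2, 1, 3, 8, 6, 12]
after = [9, 12, 9, 9, 3, 1, 2, 12, 9, 10]

# Paired differences, excluding the patient whose score did not change
differences = [a - b for a, b in zip(after, before) if a != b]
ranks = midranks([abs(d) for d in differences])

w_plus = sum(r for d, r in zip(differences, ranks) if d > 0)
w_minus = sum(r for d, r in zip(differences, ranks) if d < 0)
n = len(differences)
print(w_plus, w_minus, n)  # 40.5 4.5 9
```

Most of the rank total sits with the positive differences, which is in line with the significant improvement in QoL reported on the slide.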

Mann-Whitney test

• Used when we want to compare two unrelated or INDEPENDENT groups

• For parametric data you would use the unpaired (independent) samples t-test

• The assumptions of the t-test were:

1. The distribution of the measure in each group is approximately Normal

2. The variances are similar

Example (1)

The following data show the number of alcohol units per week collected in a survey:

Men (n=13): 0, 0, 1, 5, 10, 30, 45, 5, 5, 1, 0, 0, 0

Women (n=14): 0, 0, 0, 0, 1, 5, 4, 1, 0, 0, 3, 20, 0, 0

Is the amount greater in men compared to women?


How would you test whether the distributions in both groups are approximately Normally distributed?

• Plot histograms
• Stem and leaf plot
• Box-plot
• Q-Q or P-P plot

[Figure: Boxplots of alcohol units per week by gender. x-axis: Gender (Male, Female); y-axis: Units of alcohol per week (0 to 50)]


Are those distributions symmetrical?

Definitely not!

They are both highly skewed so not Normal. If the data are still not Normal after transformation, then use a non-parametric test: Mann-Whitney

The boxplots suggest, perhaps, that males tend to have a higher intake than women.

Mann-Whitney on SPSS

Normal approx (NS)

Mann-Whitney (NS)
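For readers without SPSS, the Mann-Whitney U statistic for the alcohol data can be sketched in plain Python. This is an illustrative sketch: the helper `midranks` is hypothetical, tied values share their average rank, and the Normal approximation shown ignores the tie correction that statistical packages usually apply, so its Z will not match package output exactly.

```python
import math

def midranks(values):
    """Rank each value (1 = smallest); tied values share the average rank."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]

men = [0, 0, 1, 5, 10, 30, 45, 5, 5, 1, 0, 0, 0]
women = [0, 0, 0, 0, 1, 5, 4, 1, 0, 0, 3, 20, 0, 0]
n1, n2 = len(men), len(women)

# Rank all 27 observations together, then sum the ranks belonging to the men
ranks = midranks(men + women)
rank_sum_men = sum(ranks[:n1])

u_men = rank_sum_men - n1 * (n1 + 1) / 2
u_women = n1 * n2 - u_men
u = min(u_men, u_women)

# Normal approximation without tie correction (a simplifying assumption)
mean_u = n1 * n2 / 2
sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z = (u - mean_u) / sd_u
print(u_men, u_women, round(z, 2))
```

The absolute value of Z is well below 1.96, which agrees with the non-significant (NS) results noted on the slide.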

Spearman Rank Correlation

• Method for investigating the relationship between 2 measured variables

• Non-parametric equivalent to Pearson correlation

• Used when variables are either non-Normal or measured on an ordinal scale

Spearman Rank Correlation Example

A researcher wishes to assess whether the distance to general practice influences the time to diagnosis of colorectal cancer.

The null hypothesis would be that distance is not associated with time to diagnosis. Data were collected for 7 patients.

Distance (km)   Time to diagnosis (weeks)
5               6
2               4
4               3
8               4
20              5
45              5
10              4

Distance from GP and time to diagnosis

[Figure: scatterplot of time to diagnosis (weeks) against distance from GP (km)]

Distance from GP and time to diagnosis

Distance (km)   Time (weeks)   Rank for distance   Rank for time   Difference in ranks   d²
2               4              1                   3               -2                    4
4               3              2                   1               1                     1
5               6              3                   7               -4                    16
8               4              4                   3               1                     1
10              4              5                   3               2                     4
20              5              6                   5.5             0.5                   0.25
45              5              7                   5.5             1.5                   2.25

Total of differences = 0; Σd² = 28.5


The formula for Spearman’s rank correlation is:

$$ r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} $$

where d is the difference in ranks for each pair and n is the number of pairs

Spearman’s in SPSS

[SPSS output not reproduced]




In our example, r_s = 0.468

In SPSS we can see that this value is not significant, i.e. p = 0.29

Therefore there is no significant relationship between the distance to a GP and the time to diagnosis, but note that the correlation is quite high!
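The rank table and the coefficient can be reproduced with a short Python sketch. The helper names `midranks` and `pearson` are hypothetical. One point worth seeing in code: because the time-to-diagnosis values contain ties, the shortcut formula based on Σd² gives about 0.491, whereas computing a Pearson correlation on the ranks (the general definition, which allows for ties) gives the 0.468 quoted above.

```python
import math

def midranks(values):
    """Rank each value (1 = smallest); tied values share the average rank."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

distance_km = [5, 2, 4, 8, 20, 45, 10]
time_weeks = [6, 4, 3, 4, 5, 5, 4]

rank_distance = midranks(distance_km)
rank_time = midranks(time_weeks)

n = len(distance_km)
sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_distance, rank_time))

rs_shortcut = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))  # exact only when there are no ties
rs_with_ties = pearson(rank_distance, rank_time)    # Pearson correlation of the ranks

print(sum_d2, round(rs_shortcut, 3), round(rs_with_ties, 3))  # 28.5 0.491 0.468
```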

Spearman Rank Correlation

• Correlations lie between -1 and +1

• A correlation coefficient close to zero indicates weak or no correlation

• Whether an r_s value is significant depends on sample size; a significant value tells you that it is unlikely these results have arisen by chance

• Correlation does NOT measure causality, only association

Chi-squared test

• Used when comparing 2 or more groups of categorical or nominal data (as opposed to measured data)

• Already covered!

• In SPSS the Chi-squared test is a test of observed vs. expected in a single categorical variable

More than 2 groups

• So far we have been comparing 2 groups

• If we have 3 or more groups and the data are not Normal we need an NP equivalent to ANOVA

• If independent samples use Kruskal-Wallis

• If related samples use Friedman

• Same assumptions as before

Parametric / Non-parametric

Parametric Tests               Non-parametric Tests
Single sample t-test           Wilcoxon signed rank test
Paired sample t-test           Paired Wilcoxon signed rank
2 independent samples t-test   Mann-Whitney test (Note: sometimes called Wilcoxon Rank Sums test!)
One-way Analysis of Variance   Kruskal-Wallis
Pearson’s correlation          Spearman Rank

Summary: Non-parametric

• Non-parametric methods have fewer assumptions than parametric tests

• So useful when these assumptions are not met

• Often used when the sample size is small and it is difficult to tell if the data are Normally distributed

• Non-parametric methods are a ragbag of tests developed over time with no consistent framework

• Read in datasets LDL, etc. and carry out appropriate Non-Parametric tests

