Non-Parametric Methods
Peter T. Donnan
Professor of Epidemiology and Biostatistics
Statistics for Health Research
Objectives of Presentation
• Introduction
• Ranks & Median
• Wilcoxon Signed Rank Test
• Paired Wilcoxon Signed Rank
• Mann-Whitney test
• Spearman’s Rank Correlation Coefficient
• Others…
What are non-parametric tests?
• ‘Parametric’ tests involve estimating parameters such as the mean, and assume that the sample means are Normally distributed
• Often data do not follow a Normal distribution, e.g. number of cigarettes smoked, cost to the NHS
• Such data typically have positively skewed distributions
A positively skewed distribution
[Figure: histogram of units of alcohol per week (x-axis, 0 to 50) against frequency (y-axis, 0 to 20). Mean = 8.03, Std. Dev. = 12.952, N = 30]
What are non-parametric tests?
• ‘Non-parametric’ (NP) tests were developed for these situations, where fewer assumptions have to be made
• NP tests STILL have assumptions, but they are less stringent
• NP tests can be applied to Normal data, but parametric tests have greater power IF their assumptions are met
Ranks
• The practical difference between parametric and NP methods is that NP methods use the ranks of values rather than the actual values
• E.g.
1, 2, 3, 4, 5, 7, 13, 22, 38, 45 - actual
1, 2, 3, 4, 5, 6, 7, 8, 9, 10 - rank
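As a minimal sketch in plain Python (not part of the original slides), the ranking step can be reproduced as follows. The helper name `average_ranks` is hypothetical; it gives tied values the mean of the ranks they occupy, which is the convention the worked examples later in this presentation use (e.g. two tied values sharing rank 1.5).

```python
def average_ranks(values):
    """Return the rank of each value (1 = smallest); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + 1 + end + 1) / 2  # mean of ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


actual = [1, 2, 3, 4, 5, 7, 13, 22, 38, 45]
ranks = average_ranks(actual)
print(ranks)
```

Note that the extreme values 38 and 45 become ranks 9 and 10, so they no longer dominate the analysis the way they would dominate a mean.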
Median
• The median is the value above and below which 50% of the data lie
• If the data are ranked in order, it is the middle value
• In symmetric distributions the mean and median are the same
• In skewed distributions the median is the more appropriate measure
Median
• BPs:
135, 138, 140, 140, 141, 142, 143
Median = 140
• No. of cigarettes smoked:
0, 1, 2, 2, 2, 3, 5, 5, 8, 10
Median = 2.5
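The two medians above can be checked with Python's standard `statistics` module. This is only an illustrative sketch; the variable names are assumptions, and the mean of the cigarette data is added here to show how a skewed sample pulls the mean above the median.

```python
from statistics import mean, median

bps = [135, 138, 140, 140, 141, 142, 143]
cigarettes = [0, 1, 2, 2, 2, 3, 5, 5, 8, 10]

bp_median = median(bps)          # odd n: the single middle value
cig_median = median(cigarettes)  # even n: mean of the two middle values (2 and 3)
cig_mean = mean(cigarettes)      # pulled upwards by the larger values 8 and 10

print(bp_median, cig_median, cig_mean)
```

For the cigarette data the mean (3.8) is noticeably larger than the median (2.5), which is the usual pattern in a positively skewed sample.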
T-test
• The t-test is used to test whether the mean of a sample is significantly different from a hypothesised population mean
• The t-test relies on the sample being drawn from a Normally distributed population
• If the sample is not Normal, use the Wilcoxon Signed Rank Test as an alternative
Wilcoxon Signed Rank Test
• NP test relating to the median as the measure of central tendency
• The absolute differences between the data and the hypothesised median are calculated and ranked
• The ranks for the negative and the positive differences are then summed separately (W- and W+ respectively)
• The minimum of these is the test statistic, W
Wilcoxon Signed Rank Test: Example
The median heart rate for an 18 year old girl is supposed to be 82 bpm. A student takes the pulse rates of 8 female students (all aged 18):
83, 90, 96, 82, 85, 80, 81, 87
Do these results suggest that the median might not be 82?
Wilcoxon Signed Rank Test: Example
H0: median = 82
H1: median ≠ 82
Two-tailed test
Because one result equals 82, it cannot be used in the analysis
Wilcoxon Signed Rank Test: Example
Result   Above or below median   Absolute difference from median = 82   Rank of difference
83       +                       1                                      1.5
90       +                       8                                      6
96       +                       14                                     7
85       +                       3                                      4
80       -                       2                                      3
81       -                       1                                      1.5
87       +                       5                                      5
W+ = 1.5 + 6 + 7 + 4 + 5 = 23.5
W- = 3 + 1.5 = 4.5
So, W = 4.5
n = 7, and W = 4.5 is greater than the tabulated value of 2, so p > 0.05
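The hand calculation above can be sketched in plain Python. The function names `average_ranks` and `signed_rank_sums` are hypothetical helpers written for this illustration, not a library API; the code drops values equal to the hypothesised median and gives tied absolute differences their mean rank, as the table does.

```python
def average_ranks(values):
    """Return the rank of each value (1 = smallest); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def signed_rank_sums(data, hypothesised_median):
    """Return (W+, W-, n) after excluding values equal to the hypothesised median."""
    diffs = [x - hypothesised_median for x in data if x != hypothesised_median]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus, len(diffs)


pulse = [83, 90, 96, 82, 85, 80, 81, 87]
w_plus, w_minus, n = signed_rank_sums(pulse, 82)
w = min(w_plus, w_minus)  # test statistic W
print(w_plus, w_minus, w, n)
```

The sketch stops at the test statistic; the comparison with the tabulated critical value (2 for n = 7, as quoted on the slide) is still done by hand.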
Wilcoxon Signed Rank Test: Example
Therefore, the student should conclude that these results could have come from a population with a median of 82, as the result is not significantly different from the null hypothesis value.
Wilcoxon Signed Rank Test: Normal Approximation
• As the number of ranks (n) becomes larger, the distribution of W becomes approximately Normal
• Generally, this holds if n > 20
• Mean W = n(n+1)/4
• Variance W = n(n+1)(2n+1)/24
• Z = (W - mean W)/SD(W)
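The three formulas above translate directly into a few lines of Python. This is a minimal sketch: the function name `wilcoxon_z` and the input values (W = 80, n = 24) are made up purely for illustration, and no continuity or tie correction is applied because the slide does not include one.

```python
import math


def wilcoxon_z(w, n):
    """Normal approximation for the Wilcoxon signed rank statistic W with n ranks."""
    mean_w = n * (n + 1) / 4
    var_w = n * (n + 1) * (2 * n + 1) / 24
    return (w - mean_w) / math.sqrt(var_w)


# Hypothetical values, chosen only so the arithmetic is easy to follow:
# mean W = 24*25/4 = 150, variance W = 24*25*49/24 = 1225, SD = 35
z = wilcoxon_z(w=80, n=24)
print(z)
```

The resulting Z is then referred to the standard Normal distribution in the usual way.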
Wilcoxon Signed Rank Test: Assumptions
• Population should be approximately symmetrical but need not be Normal
• Results must be classified as either greater than or less than the median, i.e. exclude results equal to the median
• Can be used for small or large samples
Paired samples t-test
• Disadvantage: Assumes the data are a random sample from a population which is Normally distributed
• Advantage: Uses all the detail of the available data, and if the data are Normally distributed it is the most powerful test
The Wilcoxon Signed Rank Test for Paired Comparisons
• Disadvantage: Only the sign (+ or -) and the rank of each change are analysed, not its actual size
• Advantage: Easy to carry out, and the data need not be Normally distributed
Paired And Not Paired Comparisons
• If you have the same sample measured on two separate occasions, this is a paired comparison
• Two independent samples are not a paired comparison
• Different samples which are ‘matched’ by age and gender are paired
The Wilcoxon Signed Rank Test for Paired Comparisons
• Similar calculation to the Wilcoxon Signed Rank test, except that the differences between the paired results are ranked
• Example using SPSS:
A group of 10 patients with chronic anxiety receive sessions of cognitive therapy. Quality of Life scores are measured before and after therapy.
Wilcoxon Signed Rank Test example
QoL Score
Before   After
6        9
5        12
3        9
4        9
2        3
1        1
3        2
8        12
6        9
12       10
SPSS Output: p < 0.05
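As a rough cross-check of the SPSS result, the paired calculation can be sketched in plain Python using the Before and After QoL scores as read from the table above. The helper names are hypothetical, and the sketch only computes the rank sums; it does not reproduce the SPSS p-value.

```python
def average_ranks(values):
    """Return the rank of each value (1 = smallest); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


before = [6, 5, 3, 4, 2, 1, 3, 8, 6, 12]
after = [9, 12, 9, 9, 3, 1, 2, 12, 9, 10]

# Rank the non-zero paired differences (after - before) by absolute size.
diffs = [a - b for a, b in zip(after, before) if a != b]
ranks = average_ranks([abs(d) for d in diffs])
w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
n = len(diffs)
print(w_plus, w_minus, n)
```

One patient shows no change and is excluded, leaving n = 9. Almost all of the rank total (45) falls on the positive side, which is in line with the significant result reported by SPSS.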
Mann-Whitney test
• Used when we want to compare two unrelated or INDEPENDENT groups
• For parametric data you would use the unpaired (independent) samples t-test
• The assumptions of the t-test were:
1. The distribution of the measure in each group is approximately Normal
2. The variances are similar
Example (1)
The following data show the number of alcohol units per week collected in a survey:
Men (n=13): 0, 0, 1, 5, 10, 30, 45, 5, 5, 1, 0, 0, 0
Women (n=14): 0, 0, 0, 0, 1, 5, 4, 1, 0, 0, 3, 20, 0, 0
Is the amount greater in men compared to women?
Example (2)
How would you check whether the distributions in both groups are approximately Normal?
• Plot histograms
• Stem and leaf plot
• Box-plot
• Q-Q or P-P plot
[Figure: Boxplots of alcohol units per week by gender (Male, Female); y-axis: units of alcohol per week, 0 to 50]
Example (3)
Are those distributions symmetrical?
Definitely not!
They are both highly skewed, so not Normal. If the data are still not Normal after transformation, use a non-parametric test: Mann-Whitney
The boxplots suggest perhaps that males tend to have a higher intake than women.
Mann-Whitney on SPSS
[SPSS output: Normal approx (NS); Mann-Whitney (NS)]
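The slides rely on SPSS for this test, so the following is only a sketch of what the Mann-Whitney U statistic counts, written in plain Python for the alcohol data from Example (1). The function name `mann_whitney_u` is a hypothetical helper: for every man-woman pair it scores 1 if the man's value is higher and 0.5 if the two are tied. No p-value is computed here.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b): pairs with a > b score 1, tied pairs score 0.5."""
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b


men = [0, 0, 1, 5, 10, 30, 45, 5, 5, 1, 0, 0, 0]
women = [0, 0, 0, 0, 1, 5, 4, 1, 0, 0, 3, 20, 0, 0]

u_men, u_women = mann_whitney_u(men, women)
u = min(u_men, u_women)  # the smaller U is the value usually compared with tables
print(u_men, u_women, u)
```

Of the 13 × 14 = 182 possible pairs, men score higher in somewhat more than half, which matches the impression from the boxplots; the many tied zeros in both groups are one reason the difference does not reach significance.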
Spearman Rank Correlation
• Method for investigating the relationship between 2 measured variables
• Non-parametric equivalent to Pearson correlation
• Variables are either non-Normal or measured on an ordinal scale
Spearman Rank Correlation Example
A researcher wishes to assess whether the distance to general practice influences the time to diagnosis of colorectal cancer.
The null hypothesis would be that distance is not associated with time to diagnosis. Data were collected for 7 patients.
Distance (km)   Time to diagnosis (weeks)
5               6
2               4
4               3
8               4
20              5
45              5
10              4
Distance from GP and time to diagnosis
[Figure: scatterplot of distance from GP and time to diagnosis]
Distance from GP and time to diagnosis
Distance (km)   Time (weeks)   Rank for distance   Rank for time   Difference in ranks (d)   d²
2               4              1                   3               -2                        4
4               3              2                   1               1                         1
5               6              3                   7               -4                        16
8               4              4                   3               1                         1
10              4              5                   3               2                         4
20              5              6                   5.5             0.5                       0.25
45              5              7                   5.5             1.5                       2.25
Total of d = 0, Σd² = 28.5
Spearman Rank Correlation Example
The formula for Spearman’s rank correlation is:
$$ r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} $$
where d is the difference in ranks for each pair and n is the number of pairs
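The formula can be applied to the seven patients with a short plain-Python sketch (helper and variable names are assumptions made for this illustration). Because time to diagnosis contains tied values, the sketch also computes Pearson's correlation on the ranks, which is the usual way of allowing for ties: the shortcut formula gives about 0.491 here, while the rank-based Pearson calculation gives about 0.468, the value the slides report from SPSS.

```python
import math


def average_ranks(values):
    """Return the rank of each value (1 = smallest); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


distance = [2, 4, 5, 8, 10, 20, 45]  # km
weeks = [4, 3, 6, 4, 4, 5, 5]        # time to diagnosis

rx = average_ranks(distance)
ry = average_ranks(weeks)
n = len(distance)

# Shortcut formula from the slide: r_s = 1 - 6*sum(d^2) / (n*(n^2 - 1))
sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
rs_formula = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Pearson correlation computed on the ranks (allows for the tied ranks)
mx, my = sum(rx) / n, sum(ry) / n
sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
sxx = sum((a - mx) ** 2 for a in rx)
syy = sum((b - my) ** 2 for b in ry)
rs_ranks = sxy / math.sqrt(sxx * syy)

print(sum_d2, round(rs_formula, 3), round(rs_ranks, 3))
```

The two versions agree exactly when there are no ties; with ties the shortcut formula is only an approximation.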
Spearman’s on SPSS
[SPSS screenshots]
Spearman Rank Correlation Example
In our example, rs = 0.468
In SPSS we can see that this value is not significant, i.e. p = 0.29
Therefore there is no significant relationship between the distance to a GP and the time to diagnosis, but note that the correlation is quite high!
Spearman Rank Correlation
• Correlations lie between -1 and +1
• A correlation coefficient close to zero indicates weak or no correlation
• Whether an rs value is significant depends on the sample size; a significant value tells you that it is unlikely these results have arisen by chance
• Correlation does NOT measure causality, only association
Chi-squared test
• Used when comparing 2 or more groups of categorical or nominal data (as opposed to measured data)
• Already covered!
• In SPSS, the Chi-squared test is a test of observed vs. expected in a single categorical variable
More than 2 groups
• So far we have been comparing 2 groups
• If we have 3 or more groups and the data are not Normal, we need an NP equivalent to ANOVA
• If independent samples, use Kruskal-Wallis
• If related samples, use Friedman
• Same assumptions as before
Parametric / Non-parametric
Parametric Tests                Non-parametric Tests
Single sample t-test            Wilcoxon signed rank test
Paired sample t-test            Paired Wilcoxon signed rank
2 independent samples t-test    Mann-Whitney test (Note: sometimes called Wilcoxon Rank Sums test!)
One-way Analysis of Variance    Kruskal-Wallis
Pearson’s correlation           Spearman Rank
Summary: Non-parametric
• Non-parametric methods have fewer assumptions than parametric tests
• So they are useful when these assumptions are not met
• Often used when the sample size is small and it is difficult to tell whether the data are Normally distributed
• Non-parametric methods are a ragbag of tests developed over time with no consistent framework
• Read in datasets LDL, etc. and carry out appropriate Non-Parametric tests