STAT162 / AC YR 2014 1

Wilcoxon Rank-Sum, Mann Whitney U, Kolmogorov-Smirnov 1-2 Sample Test


Text of Wilcoxon Rank-Sum, Mann Whitney U, Kolmogorov-Smirnov 1-2 Sample Test

1. STAT162 / AC YR 2014

2. Wilcoxon Rank-Sum Test

3. The Wilcoxon rank-sum test is used to test for a difference between two independent samples. It is the nonparametric counterpart to the two-sample z or t test. Instead of comparing two population means, we compare two population medians.

4. The problem characteristics of this test are:
  • the two groups being tested are independent of each other
  • the two groups should have approximately similar distributions
  • the data are numeric or ordinal

5. Two cases:
  • Wilcoxon with n_i < 10 (small sample)
  • Wilcoxon with n_i ≥ 10 (large sample; use the normal approximation)

6. Wilcoxon Rank-Sum Test
  • Step 1: List the data values from both samples in a single list, arranged from smallest to largest.
  • Step 2: In the next column, assign the numbers 1 to N (where N = n1 + n2). These are the ranks of the observations. Because N is the total sample size, the smallest observation receives a rank of 1 and the largest receives a rank of N. If there are ties, assign each tied value the average of the ranks those values would otherwise receive.
  • Step 3: The sum of the ranks of the first sample is W, the Wilcoxon rank-sum test statistic. If one sample is truly larger than the other, we'd expect its ranks to be higher than the other's. So after ranking all of the observations, we sum the ranks for each of the two samples and compare the two rank sums.

7. (no text on slide)

8. The hypothesis statements function the same way as in the two-sample t test, but we focus on the medians rather than the means:
  H0: M1 − M2 = 0
  H1: M1 − M2 ≠ 0
where M1 and M2 are the two population medians. (These could also be expressed as one-tailed tests.)

9. Small sample
The following data measure the reaction times of two samples of people: one set drank alcohol, one set drank a placebo.

  Alcohol   Placebo
  1.45      0.90
  1.46      0.37
  1.76      1.63
  1.44      0.83
  1.11      0.95
  0.98      0.78
  1.27      0.86
  2.56      0.61
  1.32      0.38

10.
  Data   Rank   Alcohol or Placebo
  0.37    1     Placebo
  0.38    2     Placebo
  0.61    3     Placebo
  0.78    4     Placebo
  0.83    5     Placebo
  0.86    6     Placebo
  0.90    7     Placebo
  0.95    8     Placebo
  0.98    9     Alcohol
  1.11   10     Alcohol
  1.27   11     Alcohol
  1.32   12     Alcohol
  1.44   13     Alcohol
  1.45   14     Alcohol
  1.46   15     Alcohol
  1.63   16     Placebo
  1.76   17     Alcohol
  2.56   18     Alcohol

11. Large Sample using Normal Approximation

12. Large Sample using Normal Approximation

  Army      15  18  16  17  13  22  24  17  19  21  26  28    Mean = 19.67
  Marines   14   9  16  19  10  12  11   8  15  18  25        Mean = 14.27

13. (no text on slide)

14. Combine the data from the two samples, arrange the combined data in order, and rank each value. Be sure to indicate the group.

15. Step 3: Compute the test value
Sum the ranks of the group with the smaller sample size. In this case, the sample size for the Marines is smaller.
  R = 1 + 2 + 3 + … + 14.5 + 16.5 + 21 = 93
Substitute in the formulas to find the test value.

16. Step 5: Make the decision
The decision is to reject the null hypothesis, since −2.41 < −1.96.
Step 6: Interpretation
There is enough evidence to support the claim that there is a difference in the times it takes the recruits to complete the course.

17. Mann-Whitney U

18. The Mann-Whitney U test is commonly portrayed as the nonparametric substitute for Student's t test when samples are not normally distributed.

19. To compute the Mann-Whitney U:
  • Rank the scores in both groups (together) from highest to lowest.
  • Sum the ranks of the scores for each group.
  • The sums of ranks for the two groups are used to make the statistical comparison.

20. (no text on slide)

21. Null hypothesis: There is no difference in the scores of the two groups (i.e., the sum of ranks for group 1 is no different from the sum of ranks for group 2).
Alternative hypothesis: There is a difference between the scores of the two groups (i.e., the sum of ranks for group 1 is significantly different from the sum of ranks for group 2).
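The ranking and rank-sum steps described in the slides can be sketched in Python as follows. This is a minimal illustration rather than the course's own procedure. The function names are hypothetical. The formulas for the mean and standard deviation of R, and the relation U = R − n(n + 1)/2, are the standard textbook ones and are assumed here because the formula slides did not survive extraction. No tie correction is applied to the standard deviation.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from smallest (rank 1) to largest; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = ((i + 1) + (j + 1)) / 2  # average of the ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum(sample, other):
    """Sum of the ranks of `sample` within the combined, ranked data."""
    ranks = average_ranks(list(sample) + list(other))
    return sum(ranks[:len(sample)])

def rank_sum_z(sample, other):
    """Normal-approximation test value for the rank sum of `sample` (assumed formulas)."""
    n1, n2 = len(sample), len(other)
    mu = n1 * (n1 + n2 + 1) / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (rank_sum(sample, other) - mu) / sigma

def mann_whitney_u(sample, other):
    """Mann-Whitney U for `sample`, derived from its rank sum."""
    n1 = len(sample)
    return rank_sum(sample, other) - n1 * (n1 + 1) / 2

# Data from the slides.
alcohol = [1.45, 1.46, 1.76, 1.44, 1.11, 0.98, 1.27, 2.56, 1.32]
placebo = [0.90, 0.37, 1.63, 0.83, 0.95, 0.78, 0.86, 0.61, 0.38]
army = [15, 18, 16, 17, 13, 22, 24, 17, 19, 21, 26, 28]
marines = [14, 9, 16, 19, 10, 12, 11, 8, 15, 18, 25]

print("W (alcohol):", rank_sum(alcohol, placebo))
print("W (placebo):", rank_sum(placebo, alcohol))
print("R (Marines):", rank_sum(marines, army))
print("z (Marines):", round(rank_sum_z(marines, army), 2))
```

On the slide data this gives rank sums of 119 for alcohol and 52 for placebo, and R = 93 for the Marines. The z value comes out at about −2.40, close to the −2.41 reported on the slides; the small gap is presumably rounding in the hand calculation.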
[Figure label: Sum Ranks]

22-25. (no text on slides)

26. Kolmogorov-Smirnov One-Sample Test

27.
  • Concerned with the degree of agreement between the distribution of a set of sample values (observed scores) and some specified theoretical distribution.
  • Determines whether the scores in a sample can reasonably be thought to have come from a population having the theoretical distribution.

28. (no text on slide)

29. Kolmogorov-Smirnov One-Sample Test

30. Example
Grundman et al. reported the weights of the kidneys of 36 mongrel dogs before they were used in an experiment. We wish to test the null hypothesis that these data are from a normally distributed population with a mean of 85 grams and a standard deviation of 15 grams.

31-35. (no text on slides)

36. Decision: Entering Table A.18 with N = 36, and keeping in mind that the test is two-sided, we find that the probability of obtaining a value of D as extreme as or more extreme than 0.15 is greater than 0.23. Hence these data do not provide sufficient evidence to warrant the conclusion that the weights of mongrel dog kidneys are not normally distributed.
Conclusion: We do not reject the hypothesis that the weights of the kidneys of mongrel dogs are normally distributed; the data are consistent with that distribution.

37. Kolmogorov-Smirnov Two-Sample Test

38. Applications:
  • Concerned with the agreement between two sets of sample values.
  • Determines whether two independent samples have been drawn from the same population (or from populations with the same distribution).

39. Kolmogorov-Smirnov Two-Sample Test

40. (no text on slide)

41.
Example
Lepley compared the serial learning of 10 seventh-grade students with the serial learning of 9 eleventh-grade students. His hypothesis was that the primacy effect should be less prominent in the learning of the younger subjects. The primacy effect is the tendency for the material learned early in a series to be remembered more efficiently than the material learned later in the series. He tested this hypothesis by comparing the percentage of errors made by the two groups in the first half of the series of learned material, predicting that the older group would make relatively fewer errors in repeating the first half of the series than would the younger group.

42. Percentage of total errors in first half of series

  Eleventh-grade subjects   Seventh-grade subjects
  35.2                      39.1
  39.2                      41.2
  40.9                      45.2
  38.1                      46.2
  34.4                      48.4
  29.1                      48.7
  41.8                      55.0
  24.3                      40.6
  32.4                      52.1
  ---                       47.2

43-45. (no text on slides)
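The two-sample Kolmogorov-Smirnov statistic for the data above can be sketched in Python as the largest gap between the two empirical cumulative distribution functions. This is an illustrative sketch only: the function names are hypothetical, it works on the raw percentages rather than any class intervals the original slides may have used, and it stops at the statistic D without looking up a critical value or p-value.

```python
from bisect import bisect_right

def ecdf(sorted_sample, x):
    """Fraction of observations in `sorted_sample` that are <= x."""
    return bisect_right(sorted_sample, x) / len(sorted_sample)

def ks_two_sample(a, b):
    """Return (two-sided D, one-sided D with the ECDF of `a` above that of `b`)."""
    sa, sb = sorted(a), sorted(b)
    # The ECDF gap can only change at observed values, so check each one.
    diffs = [ecdf(sa, x) - ecdf(sb, x) for x in sa + sb]
    return max(abs(d) for d in diffs), max(diffs)

# Percentage of total errors in the first half of the series (from the slides).
eleventh = [35.2, 39.2, 40.9, 38.1, 34.4, 29.1, 41.8, 24.3, 32.4]
seventh = [39.1, 41.2, 45.2, 46.2, 48.4, 48.7, 55.0, 40.6, 52.1, 47.2]

d_two_sided, d_one_sided = ks_two_sample(eleventh, seventh)
print("D (two-sided):", round(d_two_sided, 3))
print("D (eleventh above seventh):", round(d_one_sided, 3))
```

On the raw data both values are 0.7, reached at 41.8, where all nine eleventh-grade scores but only three of the ten seventh-grade scores have been passed. The one-sided value is the relevant one for the directional prediction that the older group makes relatively fewer errors.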