COURSE: JUST 3900 INTRODUCTORY STATISTICS FOR CRIMINAL JUSTICE Instructor: Dr. John J. Kerbs, Associate Professor Joint Ph.D. in Social Work and Sociology Chapter 10: t Test for Two Independent Samples © 2013 - - DO NOT CITE, QUOTE, REPRODUCE, OR DISSEMINATE WITHOUT WRITTEN PERMISSION FROM THE AUTHOR: Dr. John J. Kerbs can be emailed for permission at [email protected]




  • Slide 1

COURSE: JUST 3900 INTRODUCTORY STATISTICS FOR CRIMINAL JUSTICE
Instructor: Dr. John J. Kerbs, Associate Professor, Joint Ph.D. in Social Work and Sociology
Chapter 10: t Test for Two Independent Samples
© 2013. DO NOT CITE, QUOTE, REPRODUCE, OR DISSEMINATE WITHOUT WRITTEN PERMISSION FROM THE AUTHOR: Dr. John J. Kerbs can be emailed for permission at [email protected]

Slide 2 Independent-Measures Designs
Allows researchers to evaluate the mean difference between two populations using data from two separate samples.
The identifying characteristic of the independent-measures or between-subjects design is the existence of two separate or independent samples.
Thus, an independent-measures design can be used to test for mean differences between two distinct populations (such as men versus women) or between two different treatment conditions (such as drug versus no-drug).

Slide 3 Teaching Method A vs. B

Slide 4 Independent-Measures Designs (cont'd.)
The independent-measures design is used in situations where a researcher has no prior knowledge about either of the two populations (or treatments) being compared.
In particular, the population means and standard deviations are all unknown.
Because the population variances are not known, these values must be estimated from the sample data.

Slide 5 The t Statistic for an Independent-Measures Research Design
As with all hypothesis tests, the general purpose of the independent-measures t test is to determine whether the sample mean difference obtained in a research study indicates a real mean difference between the two populations (or treatments) or whether the obtained difference is simply the result of sampling error.
Remember, if two samples are taken from the same population and are given exactly the same treatment, there will still be some difference between the sample means (this difference is called sampling error).
The hypothesis test provides a standardized, formal procedure for determining whether the mean difference obtained in a research study is significantly greater than can be explained by sampling error.

Slide 6 Two Population Distributions

Slide 7 Hypothesis Tests and Effect Size with the Independent-Measures t Statistic
To prepare the data for analysis, the first step is to compute the sample mean and SS (or s, or s²) for each of the two samples.
The hypothesis test follows the same four-step procedure outlined in Chapters 8 and 9.

Slide 8 Elements of a t Statistic: Single-Sample & Independent Measures
NOTE: The alternative formula on the bottom left for pooled variance works when you have the sample variances (s²) for the first and second samples rather than the sums of squares (SS) used here.

Slide 9 Hypothesis Testing with the Independent-Measures t Statistic
Book Example: For 10 students who watched Sesame Street (group 1) and 10 students who did not watch the show (group 2), was there a difference in their average high-school grades?
Key information to run t tests for independent measures:
n₁ = 10     n₂ = 10
M₁ = 93     M₂ = 85
SS₁ = 200   SS₂ = 160

Slide 10 Hypothesis Testing with the Independent-Measures t Statistic
Step 1: State the hypotheses and select the α level. For the independent-measures test, H₀ states that there is no difference between the two population means.
For a two-tailed test, the hypotheses are as follows:
Example: H₀: μ₁ − μ₂ = 0
         H₁: μ₁ − μ₂ ≠ 0
Select α: α = 0.01 (two-tailed)
For a one-tailed test:
         H₀: μ₁ − μ₂ ≤ 0
         H₁: μ₁ − μ₂ > 0

Slide 11 Hypothesis Testing with the Independent-Measures t Statistic
Step 2: Locate the critical region. The critical values for the t statistic are obtained using degrees of freedom that are determined by adding together the df value for the first sample and the df value for the second sample. For two samples with n = 10 in each sample, df is calculated as follows:
Example: df = df₁ + df₂
         df = (n₁ − 1) + (n₂ − 1)
         df = 9 + 9 = 18
To find the critical t value for df = 18 at α = 0.01, we look at the t-distribution table: t = ±2.878

Slide 12 Hypothesis Testing with the Independent-Measures t Statistic
Step 3: Compute the test statistic. The t statistic for the independent-measures design has the same structure as the single-sample t introduced in Chapter 9. However, in the independent-measures situation, all components of the t formula are doubled: there are two sample means, two population means, and two sources of error contributing to the standard error in the denominator.
Three key parts to the t-statistic calculation:
Part A) Calculate the pooled variance
Part B) Use the pooled variance to calculate the estimated standard error
Part C) Compute the t statistic

Slide 13 Hypothesis Testing with the Independent-Measures t Statistic

Slide 14 Step 4: Make a decision. If the t statistic ratio indicates that the obtained difference between sample means (numerator) is substantially greater than the difference expected by chance (denominator), we reject H₀ and conclude that there is a real mean difference between the two populations or treatments.
t = 4.00 > +2.878, thus reject H₀ and conclude that there is a significant difference in high school grades for those who watched Sesame Street as compared to those who did not watch this show.

Slide 15 Measuring Effect Size for the Independent-Measures t
Effect size for the independent-measures t is measured in the same way that we measured effect size for the single-sample t in Chapter 9.
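As a check on Steps 3 and 4 and on the effect-size measures this slide introduces, the sketch below reworks the Sesame Street example in Python from the summary values given earlier (n = 10 per group, M₁ = 93, M₂ = 85, SS₁ = 200, SS₂ = 160). The function names are illustrative and do not come from the course materials. The critical value 2.878 is copied from the t table in Step 2 rather than computed. The effect-size lines assume the usual formulas, estimated d = (M₁ − M₂) / √(pooled variance) and r² = t² / (t² + df), which the slides do not spell out.

```python
import math

def pooled_variance(ss1, n1, ss2, n2):
    """Part A: pool the two sums of squares over the combined df."""
    return (ss1 + ss2) / ((n1 - 1) + (n2 - 1))

def estimated_standard_error(sp2, n1, n2):
    """Part B: estimated standard error of the difference between two sample means."""
    return math.sqrt(sp2 / n1 + sp2 / n2)

def independent_t(m1, m2, se, null_diff=0.0):
    """Part C: (observed mean difference - hypothesized difference) / standard error."""
    return ((m1 - m2) - null_diff) / se

# Summary values from the Sesame Street example.
n1, n2 = 10, 10
m1, m2 = 93, 85
ss1, ss2 = 200, 160

df = (n1 - 1) + (n2 - 1)
sp2 = pooled_variance(ss1, n1, ss2, n2)
se = estimated_standard_error(sp2, n1, n2)
t = independent_t(m1, m2, se)

# Step 4: two-tailed decision; 2.878 is the table value for df = 18, alpha = 0.01.
t_critical = 2.878
reject_h0 = abs(t) > t_critical

# Effect size (assumed standard formulas, see the lead-in).
cohens_d = (m1 - m2) / math.sqrt(sp2)
r_squared = t ** 2 / (t ** 2 + df)

print(f"df = {df}, pooled variance = {sp2:.2f}, SE = {se:.2f}, t = {t:.2f}")
print(f"reject H0: {reject_h0}, d = {cohens_d:.2f}, r^2 = {r_squared:.2f}")
```

With these inputs the pooled variance is 360/18 = 20 and the standard error is √(2 + 2) = 2, which gives the t = 4.00 reported in Step 4.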
Specifically, you can compute an estimate of Cohen's d or you can compute r² to obtain a measure of the percentage of variance accounted for by the treatment effect.

Slide 16 Measuring Effect Size for the Independent-Measures t
Magnitude of d     Evaluation of Effect Size
d = 0.2            Small effect (mean difference around 0.2 standard deviations)
d = 0.5            Medium effect (mean difference around 0.5 standard deviations)
d = 0.8            Large effect (mean difference around 0.8 standard deviations)

Percent of Variance Explained as Measured by r²     Evaluation of Effect Size
r² = 0.01 (0.01 × 100 = 1%)                         Small effect
r² = 0.09 (0.09 × 100 = 9%)                         Medium effect
r² = 0.25 (0.25 × 100 = 25%)                        Large effect

Slide 17 Removing Treatment Effects
NOTE: We added 4 points to the scores of the students who did not watch Sesame Street.
NOTE: We subtracted 4 points from the scores of the students who watched Sesame Street.

Slide 18 Confidence Intervals and Hypothesis Tests
NOTE: Because 0 is not in the 95% confidence interval, the value of 0 is rejected with 95% confidence. This is the same as rejecting H₀ with α = 0.05.

Slide 19 Sample Variance and Sample Size
Standard error is positively related to sample variance (larger variance leads to larger standard error).
Standard error is inversely related to sample size (larger sample sizes lead to smaller standard error).
Larger variance produces a smaller t statistic and reduces the likelihood of a significant finding.
Larger variance also produces smaller measures of effect size.
Larger samples produce larger values for the t statistic and increase the likelihood of rejecting H₀.
Sample size has no effect on Cohen's d and only a small influence on r².

Slide 20 Homogeneity of Variance Assumption
Most hypothesis tests usually work reasonably well even if the set of underlying assumptions is violated.
The one notable exception is the assumption of homogeneity of variance for the independent-measures t test.
This assumption requires that the two populations from which the samples are obtained have equal variances.
It is necessary in order to justify pooling the two sample variances and using the pooled variance in the calculation of the t statistic.

Slide 21 Homogeneity of Variance Assumption
If the assumption is violated, then the t statistic contains two questionable values: (1) the value for the population mean difference, which comes from the null hypothesis, and (2) the value for the pooled variance.
The problem is that you cannot determine which of these two values is responsible for a t statistic that falls in the critical region.
In particular, you cannot be certain that rejecting the null hypothesis is correct when you obtain an extreme value for t.

Slide 22 Homogeneity of Variance Assumption
If the two sample variances appear to be substantially different, you should use Hartley's F-max test to determine whether or not the homogeneity assumption is satisfied.
If homogeneity of variance is violated, Box 10.2 presents an alternative procedure for computing the t statistic that does not involve pooling the two sample variances.

Slide 23
NOTE: If the F-max test rejects the hypothesis of equal variances, or if you suspect that the homogeneity of variance assumption is not justified, you should not compute an independent-measures t statistic using pooled variance. In such cases, use the alternative formula for the t statistic on the next slide.
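The slides name Hartley's F-max test without showing its statistic. The Python sketch below assumes the standard definition, the largest sample variance divided by the smallest. The function names are illustrative, and the Sesame Street sums of squares are reused only as sample input. The critical value must still come from an F-max table, which is not reproduced here, so the sketch stops at the statistic.

```python
def sample_variance(ss, n):
    """Sample variance from a sum of squares: s^2 = SS / (n - 1)."""
    return ss / (n - 1)

def f_max(variances):
    """Hartley's F-max statistic: largest sample variance over smallest."""
    return max(variances) / min(variances)

# Illustration with the Sesame Street summary values (SS1 = 200, SS2 = 160, n = 10 each).
s1_sq = sample_variance(200, 10)
s2_sq = sample_variance(160, 10)
f_max_value = f_max([s1_sq, s2_sq])

print(f"s1^2 = {s1_sq:.2f}, s2^2 = {s2_sq:.2f}, F-max = {f_max_value:.2f}")
```

A value near 1.00 means the two sample variances are similar. A value larger than the tabled critical value suggests that the homogeneity assumption is not satisfied.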
Slide 24 Homogeneity of Variance Assumption
Alternative formula for the t statistic
Step 1: Calculate the standard error using the two separate sample variances, as in equation 10.1.
Step 2: The value of degrees of freedom for the t statistic is adjusted using the following equation:
[Equation: adjusted degrees of freedom]
NOTE: This alternative formula for the t statistic does not pool the sample variances and does not require the homogeneity of variance assumption. Decimal values for df should be rounded down to the next lower integer. By lowering the df, you push the boundaries of the critical region farther out. This makes the test more demanding.
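The adjusted-df equation did not survive in this transcript. The Python sketch below therefore assumes the commonly used Welch-Satterthwaite form, df = (V₁ + V₂)² / (V₁²/(n₁ − 1) + V₂²/(n₂ − 1)) with V = s²/n for each sample, which may differ in notation from the slide. The function name is illustrative, and the Sesame Street values are reused only as sample input.

```python
import math

def unpooled_t(m1, s1_sq, n1, m2, s2_sq, n2):
    """t statistic, standard error, and adjusted df without pooling the variances."""
    v1 = s1_sq / n1
    v2 = s2_sq / n2
    # Step 1: standard error from the two separate sample variances.
    se = math.sqrt(v1 + v2)
    t = (m1 - m2) / se
    # Step 2: adjusted degrees of freedom (assumed Welch-Satterthwaite form).
    df_raw = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    # Decimal df values are rounded down to the next lower integer.
    return t, se, df_raw, math.floor(df_raw)

# Illustration: s^2 = SS / (n - 1) for each Sesame Street group.
t, se, df_raw, df_adj = unpooled_t(93, 200 / 9, 10, 85, 160 / 9, 10)

print(f"t = {t:.2f}, SE = {se:.2f}, raw df = {df_raw:.2f}, adjusted df = {df_adj}")
```

With these inputs the standard error and t match the pooled result, because the two samples are the same size. The adjusted df falls from 18 to 17, which makes the critical value slightly larger.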