Practice Exercises for QA252 Intermediate Statistics Professor K. Leppel


Hypothesis Testing: Type I and Type II Errors

1. A researcher has hypothesized that the mean number of traffic violations per day in Saskatchewan, Canada is 25. So the null and alternative hypotheses are H0: μ = 25, H1: μ ≠ 25. The level of significance to be used is α = .05. A random sample was taken and based on the results of the sample, decisions were made. Complete the table below.

true (but unknown) mean | Is H0 true? | Decision (based on sample) | Is the decision correct? (Y/N) | Type of Error (I or II) if any
25 | | Accept H0 | |
25 | | Reject H0 | |
27 | | Accept H0 | |
27 | | Reject H0 | |

2. A consumer advocate believes that more than 5% of a particular manufacturer’s tires are defective. The advocate wants to be 99% sure that he is correct before going public with his claim. In other words, he wants to limit the probability of claiming that a lot of the tires are bad, if they’re not, to at most 1%.

In one-sided tests, the null hypothesis can usually be set up as the devil’s advocate’s approach to the claim. That is, the null hypothesis is "the percent defective is not more than 5% and I'm sticking with that until you can show me reason to believe otherwise." So the claim becomes the alternative hypothesis that will only be accepted if the null can be rejected. So, the null and alternative hypotheses are H0: π ≤ .05, H1: π > .05. The level of significance, α, is .01.

A random sample was taken and based on the results of the sample, decisions were made. Complete the table.

true (but unknown) proportion | Is H0 true? | Decision (based on sample) | Is the decision correct? (Y/N) | Type of Error (I or II) if any
.05 | | Accept H0 | |
.05 | | Reject H0 | |
.07 | | Accept H0 | |
.07 | | Reject H0 | |
.03 | | Accept H0 | |
.03 | | Reject H0 | |
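The logic behind both tables can be sketched in a few lines of Python. This is only an illustration: the function name `classify_decision` and its arguments are made up for this sketch and are not part of the course materials. It relies on the standard definitions: a Type I error is rejecting a null hypothesis that is true, and a Type II error is accepting a null hypothesis that is false.

```python
def classify_decision(h0_is_true: bool, reject_h0: bool) -> str:
    """Label a test decision as correct, a Type I error, or a Type II error."""
    if h0_is_true and reject_h0:
        return "Type I error"    # rejected a true null hypothesis
    if not h0_is_true and not reject_h0:
        return "Type II error"   # accepted a false null hypothesis
    return "correct"


# Hypothetical check for H0: pi <= .05 when the true proportion is .03.
# H0 is true here, so rejecting it would be a Type I error.
true_pi = 0.03
print(classify_decision(h0_is_true=(true_pi <= 0.05), reject_h0=True))
```

Note that whether H0 is true depends only on the true parameter value, not on the sample, which is why the same decision can be correct in one row and an error in another.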


Valid Null and Alternative Hypotheses

For each pair of null and alternative hypotheses, determine whether the set is a valid set of hypotheses, and if not, explain why not.

Question number | Null Hypothesis | Alternative Hypothesis | Valid? (Y/N) If not valid, why not?
1 | H0: μ = 25 | H1: μ ≠ 25 |
2 | H0: π = .25 | H1: π ≠ .25 |
3 | H0: X̄ = 63 | H1: X̄ ≠ 63 |
4 | H0: π ≥ 18 | H1: π ≤ 18 |
5 | H0: μ > 43 | H1: μ ≤ 43 |
6 | H0: p = .72 | H1: p ≠ .72 |
7 | H0: μ ≥ 96 | H1: μ < 96 |
8 | H0: p ≤ .42 | H1: p > .42 |
9 | H0: X̄ < 57 | H1: X̄ < 57 |
10 | H0: π ≤ .81 | H1: π > .81 |

Issues:
Hypotheses must involve population parameters and NOT sample statistics.
Null and alternative hypotheses must describe different and non-overlapping situations.
Equality must be in the null hypothesis and NOT in the alternative.


Hypothesis Testing – One Sample

Use the following GPA data from a statistics class of 18 males and 14 females to answer questions 1 to 5.

(1) Suppose that the standard deviation of GPAs for the population of male Statistics students was known to be 0.45. Test at the 5% level whether the mean GPA for all male Statistics students is equal to 2.9.

(2) Suppose that the standard deviation of GPAs for the population of male Statistics students was known to be 0.45. Calculate the p-value to test at the 5% level whether the mean GPA for all male Statistics students is equal to 2.9.

(3) Suppose that the standard deviation of GPAs for the population of male Statistics students was known to be 0.45. Test at the 5% level whether the mean GPA for all male Statistics students is less than 2.9. (Use the devil's advocate approach to set up the null hypothesis in this problem. The null is "the mean GPA for all male Statistics students is not less than 2.9, and I'm sticking with that until you can show me reason to believe otherwise." Note: Saying that the male mean GPA is not less than 2.9 is equivalent to saying that the male mean GPA is greater than or equal to 2.9. The alternative is that the mean GPA is less than 2.9.)

(4) Suppose that the standard deviation of GPAs for the population of male Statistics students was known to be 0.45. Calculate the p-value to test at the 5% level whether the mean GPA for all male Statistics students is less than 2.9. (Again, set up the null hypothesis using the devil’s advocate approach.)

(5) Test at the 5% level whether the mean GPA for all male Statistics students is equal to 2.9. (You have no knowledge of the population standard deviation of GPAs.)

Male GPAs   Female GPAs
3.52        3.8
3.5         3.78
3.4         3.7
3.2         3.5
3.2         3.3
3.06        2.98
3.05        2.9
2.8         2.8
2.8         2.7
2.7         2.7
2.7         2.7
2.65        2.3
2.6         2.3
2.5         1.9
2.5
2.4
2.4
2.3
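As a rough sketch of how questions (1) to (4) could be set up, the following Python computes the z statistic for the male GPAs with the population standard deviation taken as known, together with a two-sided and a lower-tail p-value. The variable names are illustrative choices, and the standard library's `statistics.NormalDist` stands in for a normal table.

```python
from math import sqrt
from statistics import NormalDist, mean

male_gpas = [3.52, 3.5, 3.4, 3.2, 3.2, 3.06, 3.05, 2.8, 2.8,
             2.7, 2.7, 2.65, 2.6, 2.5, 2.5, 2.4, 2.4, 2.3]

mu_0 = 2.9     # hypothesized population mean
sigma = 0.45   # population standard deviation, assumed known
alpha = 0.05   # level of significance

n = len(male_gpas)
x_bar = mean(male_gpas)
z = (x_bar - mu_0) / (sigma / sqrt(n))

std_normal = NormalDist()
p_two_sided = 2 * (1 - std_normal.cdf(abs(z)))   # for H1: mu != 2.9
p_lower_tail = std_normal.cdf(z)                 # for H1: mu < 2.9

print(f"n = {n}, sample mean = {x_bar:.4f}, z = {z:.4f}")
print(f"two-sided p-value  = {p_two_sided:.4f}, reject H0: {p_two_sided < alpha}")
print(f"lower-tail p-value = {p_lower_tail:.4f}, reject H0: {p_lower_tail < alpha}")
```

For question (5), where the population standard deviation is unknown, the same ratio would use the sample standard deviation (`statistics.stdev`) in place of `sigma`, and the result would be compared with a t table value for n - 1 = 17 degrees of freedom rather than with the normal distribution.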


(6) Suppose one of the possible majors in Business Administration is International Business. In a sample of 340 college students majoring in Business Administration, 60 students are majors in International Business. Test at the 5% level whether the proportion of Business Administration students majoring in International Business is 20%.
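A minimal sketch of the one-sample proportion test in question (6), assuming the usual large-sample normal approximation with the standard error computed from the hypothesized proportion. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 340          # Business Administration students sampled
successes = 60   # International Business majors in the sample
pi_0 = 0.20      # hypothesized population proportion
alpha = 0.05     # level of significance (two-sided test)

p_hat = successes / n
std_error = sqrt(pi_0 * (1 - pi_0) / n)   # uses the hypothesized proportion
z = (p_hat - pi_0) / std_error

z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
reject_h0 = abs(z) > z_crit

print(f"p_hat = {p_hat:.4f}, z = {z:.4f}, critical value = {z_crit:.3f}")
print("reject H0" if reject_h0 else "do not reject H0")
```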


Hypothesis Testing – Two Sample

Use the following GPA data from a statistics class of 18 males and 14 females to answer questions 1 to 5. (This is the same data set as was used in the one-sample practice problems.)

(1) Suppose that the standard deviation of GPAs for the population of male Statistics students was known to be 0.45, and the standard deviation of GPAs for the population of female Statistics students was known to be 0.55. Test at the 5% level whether the mean GPA for all male Statistics students is equal to the mean GPA for all female Statistics students.

(2) Suppose that the standard deviation of GPAs for the population of male Statistics students was known to be 0.45, and the standard deviation of GPAs for the population of female Statistics students was known to be 0.55. Test at the 5% level whether the mean GPA is greater for females than for males. (Use the devil's advocate approach to set up the null hypothesis in this problem. The null is "the mean GPA for all female Statistics students is not greater than the mean GPA for all male Statistics students, and I'm sticking with that until you can show me reason to believe otherwise." Note: Saying that the female mean is not greater than the male mean is equivalent to saying that the female mean is less than or equal to the male mean. The alternative is that the mean GPA is greater for females than for males.)

(3) Test at the 5% level the null hypothesis that the mean GPA for all male Statistics students is equal to the mean GPA for all female Statistics students. You have no knowledge of the population standard deviations.

(4) Test at the 5% level the null hypothesis that the mean GPA for all male Statistics students is equal to the mean GPA for all female Statistics students. The population standard deviations are unknown but you believe that they are equal.

Male GPAs   Female GPAs
3.52        3.8
3.5         3.78
3.4         3.7
3.2         3.5
3.2         3.3
3.06        2.98
3.05        2.9
2.8         2.8
2.8         2.7
2.7         2.7
2.7         2.7
2.65        2.3
2.6         2.3
2.5         1.9
2.5
2.4
2.4
2.3
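The two-sample statistics for questions (1) to (4) can be sketched as follows. The first statistic treats the population standard deviations as known (0.45 and 0.55); the second is the pooled-variance t statistic for the case where the population variances are unknown but believed equal. The variable names are illustrative, and critical values are left to the normal and t tables.

```python
from math import sqrt
from statistics import mean, variance

male_gpas = [3.52, 3.5, 3.4, 3.2, 3.2, 3.06, 3.05, 2.8, 2.8,
             2.7, 2.7, 2.65, 2.6, 2.5, 2.5, 2.4, 2.4, 2.3]
female_gpas = [3.8, 3.78, 3.7, 3.5, 3.3, 2.98, 2.9,
               2.8, 2.7, 2.7, 2.7, 2.3, 2.3, 1.9]

n_m, n_f = len(male_gpas), len(female_gpas)
diff = mean(male_gpas) - mean(female_gpas)   # male mean minus female mean

# Known population standard deviations: z statistic
sigma_m, sigma_f = 0.45, 0.55
z = diff / sqrt(sigma_m**2 / n_m + sigma_f**2 / n_f)

# Unknown but equal population variances: pooled t statistic
pooled_var = ((n_m - 1) * variance(male_gpas)
              + (n_f - 1) * variance(female_gpas)) / (n_m + n_f - 2)
t_pooled = diff / sqrt(pooled_var * (1 / n_m + 1 / n_f))
df_pooled = n_m + n_f - 2

print(f"difference in means = {diff:.4f}")
print(f"z (known sigmas)    = {z:.4f}")
print(f"t (pooled variance) = {t_pooled:.4f} with {df_pooled} degrees of freedom")
```

For the one-sided test in question (2), the sign matters: with the difference defined as male minus female, the alternative "female mean is greater" corresponds to the lower tail.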


(5) Test at the 5% level whether the variance of the GPAs of the female students
a. is equal to the variance of the GPAs of the male students.
b. is greater than the variance of the GPAs of the male students.

(6) Suppose one of the possible majors in Business Administration is International Business. A sample of 340 college students in Business Administration consists of 190 men and 150 women. There are 60 students majoring in International Business, 20 male and 40 female. Test at the 1% level whether the proportion of women in Business Administration who major in International Business is greater than the proportion of men in Business Administration who major in International Business. (Reminder: Use the devil's advocate approach to set up the null hypothesis for this problem.)

(7) Suppose that the number of girls and the number of boys in the families of Statistics students are as given below. Test at the 10% level whether on average the number of boys in the family is equal to the number of girls in the family.

Family   # of girls   # of boys
1        0            1
2        1            1
3        0            1
4        1            2
5        4            1
6        1            1
7        0            3
8        1            1
9        3            0
10       0            2
11       0            3
12       5            2
13       3            1
14       1            0
15       0            3
16       2            0
17       1            1
18       0            2
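Because each family contributes both a girls count and a boys count, question (7) is naturally treated as a paired (matched) sample. The sketch below, with illustrative variable names, computes the paired t statistic from the within-family differences; the critical value for a two-sided test at the 10% level would come from a t table with n - 1 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

girls = [0, 1, 0, 1, 4, 1, 0, 1, 3, 0, 0, 5, 3, 1, 0, 2, 1, 0]
boys = [1, 1, 1, 2, 1, 1, 3, 1, 0, 2, 3, 2, 1, 0, 3, 0, 1, 2]

# Work with the difference within each family (girls minus boys)
diffs = [g - b for g, b in zip(girls, boys)]

n = len(diffs)
d_bar = mean(diffs)              # mean difference
s_d = stdev(diffs)               # sample standard deviation of the differences
t_stat = d_bar / (s_d / sqrt(n))
df = n - 1

print(f"mean difference = {d_bar:.4f}, s_d = {s_d:.4f}")
print(f"t = {t_stat:.4f} with {df} degrees of freedom")
```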


Chi-squared Tests

(1) Suppose a manager believes that the company’s customers have the following preferences for three models: 20% prefer model A, 30% prefer model B, and 50% prefer model C. Survey results from a sample of 300 customers indicate that 50 prefer model A, 80 prefer B, and 170 prefer C. Test at the 10% level whether the manager is correct.
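A short sketch of the goodness-of-fit statistic for question (1): expected counts come from the manager's claimed percentages applied to the sample size, and the statistic sums (observed - expected)² / expected over the three models. Variable names are illustrative; the critical value would be read from a chi-squared table.

```python
observed = [50, 80, 170]              # customers preferring models A, B, C
claimed_shares = [0.20, 0.30, 0.50]   # manager's hypothesized preferences

n = sum(observed)
expected = [share * n for share in claimed_shares]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1   # number of categories minus one

print("expected counts:", [round(e, 1) for e in expected])
print(f"chi-squared = {chi_sq:.4f} with {df} degrees of freedom")
```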

(2) Suppose that a class of 33 students is divided into commuters and residents. The students are also divided into 3 activities categories: those with no extracurricular activities, those with exactly 1, and those with 2 or more. The result is the following table. Test at the 10% level whether student commuter versus resident status is independent of the number of extracurricular activities.

                   # of extracurricular activities
residence status   0    1    2 or more
commuter           6    7    1
resident           8    5    6
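The test of independence in question (2) can be sketched the same way: each expected count is (row total × column total) / grand total, and the degrees of freedom are (rows - 1) × (columns - 1). Variable names are illustrative.

```python
# Rows: commuter, resident. Columns: 0, 1, 2 or more activities.
table = [
    [6, 7, 1],
    [8, 5, 6],
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

chi_sq = 0.0
for i, row in enumerate(table):
    for j, observed in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_sq += (observed - expected) ** 2 / expected

df = (len(table) - 1) * (len(table[0]) - 1)

print(f"chi-squared = {chi_sq:.4f} with {df} degrees of freedom")
```

One caveat: the expected count for commuters with 2 or more activities is 14 × 7 / 33, roughly 2.97, which is below the common rule-of-thumb minimum of 5. Whether to combine categories in that case depends on the convention your course follows.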

(3) You want to test at the 5% level whether the performance on a standardized exam by the students at a particular university has a standard deviation of 10. Based on the data from your sample of 20 students, you have found the standard deviation to be 14. Perform the test.


Analysis of Variance (ANOVA): Null hypothesis versus alternative hypothesis

In the table below, write H0 next to the null hypothesis and H1 next to the alternative hypothesis.

# | H0 or H1 | Hypothesis | H0 or H1 | Hypothesis
1 | | There is no difference in average salaries of Asians, Caucasians, and African Americans. | | There is a difference in average salaries of Asians, Caucasians, and African Americans.
2 | | Average number of years of education varies with income class. | | Average number of years of education is the same for all income classes.
3 | | The average number of items produced per minute by four different machines is not the same. | | The average number of items produced per minute by four different machines is the same.
4 | | The average student performance is the same for all sections of a course. | | The average student performance is not the same for all sections of a course.
5 | | The average level of productivity of employees is the same for all training programs. | | The average level of productivity of employees depends on the training program.
6 | | The average life span of a microwave oven varies with brand. | | The average life span of a microwave oven does not vary with brand.
7 | | Average price of housing does not vary by city. | | Average price of housing varies by city.
8 | | The average student performance is the same regardless of the software used. | | The average student performance depends on the software used.
9 | | The average employer rating of Co-op students depends on the university attended by the student. | | The average employer rating of Co-op students does not depend on the university attended by the student.
10 | | The average starting salary does not vary with the university from which the employee graduated. | | The average starting salary does vary with the university from which the employee graduated.
11 | | The average fee charged for household electrical repairs differs by county. | | The average fee charged for household electrical repairs is the same for all counties.
12 | | The average applicant score is the same for all interviewers. | | The average applicant score is not the same for all interviewers.
13 | | Average gasoline mileage for compact cars varies with manufacturer. | | Average gasoline mileage for compact cars does not vary with manufacturer.
14 | | Average number of hours studied per week is the same for all college class years. | | Average number of hours studied per week depends on college class year.
15 | | Average family size varies by ethnic group. | | Average family size is the same for all ethnic groups.


ANOVA Table Completion

Complete the following tables:

Source of Variation | SS | DF | MS | F-statistic
Among or Between Treatments | | 2 | 1000 |
Error or Within Treatments | | | 400 | ----------------
Total | 6800 | 14 | | ----------------

Source of Variation | SS | DF | MS | F-statistic
Among or Between Treatments | 800 | | |
Error or Within Treatments | | 11 | 300 | ----------------
Total | 4100 | 13 | | ----------------

Source of Variation | SS | DF | MS | F-statistic
Among or Between Treatments | | 7 | 500 |
Error or Within Treatments | | | 336.364 | ----------------
Total | 7200 | 18 | | ----------------
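Every blank in these tables follows from three identities: SS(total) = SS(between) + SS(error), DF(total) = DF(between) + DF(error), and MS = SS / DF, with F = MS(between) / MS(error). The sketch below applies them to the first table, reading its given entries as DF(between) = 2, MS(between) = 1000, SS(total) = 6800, and DF(total) = 14; that reading is an assumption, but it reproduces the error mean square of 400 that the table supplies. The function name `complete_anova` is made up for this illustration.

```python
def complete_anova(ss_total, df_total, df_between, ms_between):
    """Fill in a one-way ANOVA table from the total row and the between row."""
    ss_between = ms_between * df_between   # MS = SS / DF, rearranged
    ss_error = ss_total - ss_between       # sums of squares add up
    df_error = df_total - df_between       # degrees of freedom add up
    ms_error = ss_error / df_error
    f_stat = ms_between / ms_error
    return {
        "ss_between": ss_between,
        "ss_error": ss_error,
        "df_error": df_error,
        "ms_error": ms_error,
        "f_stat": f_stat,
    }


table1 = complete_anova(ss_total=6800, df_total=14, df_between=2, ms_between=1000)
for name, value in table1.items():
    print(f"{name}: {value}")
```

The third table has the same pattern of given cells, so the same call works with its numbers. The second table supplies different cells (SS between, plus DF and MS for error), so the identities have to be applied in a different order there.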


Analysis of Variance Testing

(1) Suppose that 34 students are divided into four categories: those with sports extracurricular activities only, those with non-sport activities only, those with both sports and non-sports activities, and those with neither type of activity. Each student was asked how many hours he/she worked per week at paid employment, if any. Based on the results, the sums of squares between and within were computed. Complete the analysis of variance table presented below. Then test at the 5% level whether the average number of hours worked per week varies with activity category.

Source of variation | Sum of squares | Degrees of freedom | Mean square
Between | 450.00 | |
Within | 1500.00 | |
Total | | |

(2) Suppose there are 4 years (freshman, sophomore, junior, and senior), and 2 housing statuses (commuter and resident), for a total of 8 cells. Each cell has 4 observations. The number of credits each student is carrying in the current semester is examined, and the various sums of squares are computed. Complete the analysis of variance table presented below. Then test at the 5% level whether the average number of credits carried is influenced by (a) class year, (b) housing status, and (c) the interaction of class year and housing status.

Source of variation | Sum of squares | Degrees of freedom | Mean square
Class year | 30 | |
Housing status | 4 | |
Interaction | 24 | |
Error | 48 | |
Total | | |
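For the two-way layout in question (2), the degrees of freedom follow from the design: a - 1 and b - 1 for the two factors, (a - 1)(b - 1) for the interaction, and ab(n - 1) for error, where n is the number of observations per cell. A sketch under the assumption that the numbers given in the table are the sums of squares (variable names are illustrative):

```python
levels_year = 4      # freshman, sophomore, junior, senior
levels_housing = 2   # commuter, resident
n_per_cell = 4       # observations in each of the 8 cells

ss = {"Class year": 30, "Housing status": 4, "Interaction": 24, "Error": 48}
df = {
    "Class year": levels_year - 1,
    "Housing status": levels_housing - 1,
    "Interaction": (levels_year - 1) * (levels_housing - 1),
    "Error": levels_year * levels_housing * (n_per_cell - 1),
}

ms = {source: ss[source] / df[source] for source in ss}
f_stats = {source: ms[source] / ms["Error"] for source in ss if source != "Error"}

ss_total = sum(ss.values())
df_total = sum(df.values())   # should equal total observations minus one

for source in ss:
    print(f"{source}: SS = {ss[source]}, DF = {df[source]}, MS = {ms[source]}")
print(f"Total: SS = {ss_total}, DF = {df_total}")
print("F statistics:", f_stats)
```

Each F statistic is compared with the F table value for its own numerator degrees of freedom and the error degrees of freedom in the denominator.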


Simple Regression

Consider the following data on the heights and weights of 30 students. Use these data to answer the questions below.

height   weight
69       160
68       225
74       175
67       125
64       109
65       132
74       185
73       185
62       112
71       165
73       205
66       140
66       120
72       200
63       104
67       175
71       172
69       160
63       135
69       175
67       143
67       150
65       120
64       115
68       160
70       185
80       200
64       115
75       215
66       140

(1) Estimate the regression line of weight on height, WGT = a + b HGT.

(2) Calculate the standard error of the regression (or standard error of the estimate).

(3) Calculate and interpret the coefficient of determination.

(4) Calculate the standard error of the estimated coefficient b. Use this information to test at the 5% level whether the true slope of the relation between height and weight is actually zero.

(5) Calculate the 95% confidence interval for the true slope of the relation between height and weight.

(6) Calculate the sample correlation coefficient r. Test at the five percent level whether the population correlation coefficient is actually zero.

(7) Calculate the 90% forecasting interval for the weight of an individual student whose height is 5 feet 9 inches.

(8) Calculate the 90% forecasting interval for the average weight of a large group of students whose heights are all 5 feet 9 inches.
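The quantities asked for in questions (1) to (8) all derive from a handful of sums. The sketch below computes them from scratch using the standard least-squares formulas; the variable names are illustrative, and t table values (n - 2 = 28 degrees of freedom) are deliberately left out, so the tests and intervals still need to be finished by hand.

```python
from math import sqrt

heights = [69, 68, 74, 67, 64, 65, 74, 73, 62, 71, 73, 66, 66, 72, 63,
           67, 71, 69, 63, 69, 67, 67, 65, 64, 68, 70, 80, 64, 75, 66]
weights = [160, 225, 175, 125, 109, 132, 185, 185, 112, 165, 205, 140, 120, 200, 104,
           175, 172, 160, 135, 175, 143, 150, 120, 115, 160, 185, 200, 115, 215, 140]

n = len(heights)
x_bar = sum(heights) / n
y_bar = sum(weights) / n
s_xx = sum((x - x_bar) ** 2 for x in heights)
s_yy = sum((y - y_bar) ** 2 for y in weights)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(heights, weights))

b = s_xy / s_xx            # slope
a = y_bar - b * x_bar      # intercept
residuals = [y - (a + b * x) for x, y in zip(heights, weights)]
sse = sum(e ** 2 for e in residuals)

s_e = sqrt(sse / (n - 2))          # standard error of the regression
r_squared = 1 - sse / s_yy         # coefficient of determination
se_b = s_e / sqrt(s_xx)            # standard error of the slope
t_b = b / se_b                     # t statistic for H0: slope = 0
r = s_xy / sqrt(s_xx * s_yy)       # sample correlation coefficient

# Standard errors for forecasts at a height of 5 feet 9 inches (69 inches)
x0 = 69
y_hat = a + b * x0
se_individual = s_e * sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / s_xx)
se_mean = s_e * sqrt(1 / n + (x0 - x_bar) ** 2 / s_xx)

print(f"WGT = {a:.3f} + {b:.3f} HGT")
print(f"s_e = {s_e:.3f}, R^2 = {r_squared:.4f}, r = {r:.4f}")
print(f"se(b) = {se_b:.4f}, t = {t_b:.3f}")
print(f"forecast at 69 in = {y_hat:.2f}, se individual = {se_individual:.3f}, se mean = {se_mean:.3f}")
```

Each interval is then the point estimate plus or minus the appropriate t value times the matching standard error. The interval for an individual student is always wider than the interval for the group average.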


Multiple Regression

Suppose that a regression is run using the number of hours of study time per week (STUDY) as the dependent variable. There are 35 observations. The independent variables are

WKHRS: the number of hours worked per week at a job,
COMMUTER: dummy variable equal to one for commuting students, and 0 for resident students,
MALE: dummy variable equal to one if the student is male, and 0 if the student is female,
SENIOR: dummy variable equal to one if the student is a senior, and 0 otherwise.

The results are as follows.

Variable   Coefficient   Standard error
Constant   20.0          10.0
WKHRS      -0.5          0.125
COMMUTER   2.0           2.0
MALE       -3.0          6.0
SENIOR     6.0           2.0

Analysis of Variance
Source of variation | Sum of squares | Degrees of freedom | Mean square
Regression | 160.0 | |
Error | 40.0 | |
Total | 200.0 | |

(1) Complete the ANOVA table.

(2) Compute the standard error of the estimate (or standard error of the regression).

(3) Compute and interpret the unadjusted coefficient of determination.

(4) Compute the coefficient of determination adjusted for degrees of freedom.

(5) Test at the 5% level whether the coefficient on the variable MALE is equal to zero.

(6) Test at the 5% level whether the coefficient on the variable SENIOR is equal to zero.

(7) Test at the 5% level the null hypothesis that the coefficient on the variable COMMUTER is equal to zero, versus the alternative that it is more than zero.

(8) Test at the 5% level the null hypothesis that the coefficient on the variable WKHRS is equal to zero, versus the alternative that it is less than zero.

(9) Test at the 5% level the hypothesis that all the slope coefficients are zero.

(10) How much does expected study time change if a student works an additional hour at a job? Specify whether this change is an increase or a decrease in study time.

(11) According to the regression results, do seniors study more or less than non-senior students? By how much more or less than non-seniors do seniors study?
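Most of questions (1) to (9) reduce to arithmetic on the reported output. A sketch, with illustrative variable names: with n = 35 observations and k = 4 independent variables, the regression has k degrees of freedom and the error has n - k - 1. Critical values from the t and F tables are not included.

```python
from math import sqrt

n = 35   # observations
k = 4    # independent variables: WKHRS, COMMUTER, MALE, SENIOR

ss_regression, ss_error, ss_total = 160.0, 40.0, 200.0

df_regression = k
df_error = n - k - 1
df_total = n - 1

ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error
f_stat = ms_regression / ms_error   # tests whether all slope coefficients are zero

see = sqrt(ms_error)                # standard error of the estimate
r_squared = ss_regression / ss_total
adj_r_squared = 1 - (ss_error / df_error) / (ss_total / df_total)

# (coefficient, standard error) for each slope
estimates = {
    "WKHRS": (-0.5, 0.125),
    "COMMUTER": (2.0, 2.0),
    "MALE": (-3.0, 6.0),
    "SENIOR": (6.0, 2.0),
}
t_ratios = {name: coef / se for name, (coef, se) in estimates.items()}

print(f"DF: regression {df_regression}, error {df_error}, total {df_total}")
print(f"MS regression = {ms_regression}, MS error = {ms_error:.4f}, F = {f_stat:.2f}")
print(f"SEE = {see:.4f}, R^2 = {r_squared:.3f}, adjusted R^2 = {adj_r_squared:.4f}")
print("t ratios:", t_ratios)
```

Each t ratio is compared with a t table value for the error degrees of freedom, using two tails for questions (5) and (6) and one tail for questions (7) and (8).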


Time Series

Suppose a student takes an intensive summer school course. The course meets all day Monday, Tuesday, Wednesday, and Thursday for four weeks. A short quiz is given each day. Suppose the student's quiz grades are as follows. Answer questions 1 to 3 based on these data.

week   day   grade
I      M     4
       Tu    5
       W     8
       Th    6
II     M     6
       Tu    8
       W     5
       Th    7
III    M     6
       Tu    7
       W     9
       Th    8
IV     M     8
       Tu    8
       W     9
       Th    7

(1) Using four-day moving averages, compute the "seasonal" or "daily" index for each day of the week (instead of each season).

(2) Based on your daily indices, on which day does the student usually perform the best? On which day does the student usually perform the worst?

(3) Use your daily indices to adjust the time series of quiz grades.

(4) Consider the following sequence of quiz grades. Calculate the grade forecasts for quizzes 2 to 10, using the exponential smoothing method. Let the forecast F1 for the first quiz grade be the actual value A1 for the first quiz. Use a weight on the actual grade of w = 0.5.

actual grade | forecasted grade for next quiz
(A1) 5 | (F2)
(A2) 8 | (F3)
(A3) 6 | (F4)
(A4) 8 | (F5)
(A5) 5 | (F6)
(A6) 7 | (F7)
(A7) 6 | (F8)
(A8) 7 | (F9)
(A9) 9 | (F10)
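The exponential smoothing recursion in question (4) is F(t+1) = w·A(t) + (1 - w)·F(t), started with F1 = A1. A minimal sketch follows; the function name `exponential_smoothing` is made up for this illustration.

```python
def exponential_smoothing(actuals, w):
    """Return [F1, F2, ..., F(n+1)] given actual values [A1, ..., An] and weight w."""
    forecasts = [actuals[0]]   # F1 is set equal to A1
    for actual in actuals:
        # Next forecast: weight w on the latest actual, (1 - w) on the latest forecast
        forecasts.append(w * actual + (1 - w) * forecasts[-1])
    return forecasts


grades = [5, 8, 6, 8, 5, 7, 6, 7, 9]   # A1 through A9
forecasts = exponential_smoothing(grades, w=0.5)

for quiz_number, forecast in enumerate(forecasts, start=1):
    print(f"F{quiz_number} = {forecast}")
```

With w = 0.5 each forecast is simply the average of the previous actual grade and the previous forecast, which makes the results easy to check by hand.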


Nonparametric Tests

(1) Suppose the grades on an exam for the male and female students in a class were as indicated below. Use the Wilcoxon rank sum test to test at the 5% level whether males and females did equally well. (Note: If two students tie for ranks 1 and 2, they both get the "middle" rank of 1.5, and the student following them is ranked 3. If three students tie for ranks 1, 2, and 3, they all get the "middle" rank of 2, and the student following them is ranked 4. If four students tie for ranks 1, 2, 3, and 4, they all get the "middle" rank of 2.5, and the student following them is ranked 5.)

males   females
97      97
94      92
86      89
86      86
83      86
81      85
78      77
76      76
76      70
69      56
63      49
56      41
55      39
51      29
47      15
38
32
21
20
3
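The "middle rank" rule for ties described in the note is the fiddly part of this problem, so here is a sketch of it together with the rank sum. Several choices below are assumptions rather than the course's prescribed method: ranking runs from the lowest grade (rank 1) upward, the rank sum is taken for the smaller group (females), and the large-sample normal approximation is used without a correction for ties. The function name `midranks` is illustrative.

```python
from math import sqrt


def midranks(values):
    """Rank values from smallest (rank 1) up, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the block of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1   # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


males = [97, 94, 86, 86, 83, 81, 78, 76, 76, 69,
         63, 56, 55, 51, 47, 38, 32, 21, 20, 3]
females = [97, 92, 89, 86, 86, 85, 77, 76, 70, 56, 49, 41, 39, 29, 15]

ranks = midranks(males + females)
n1, n2 = len(females), len(males)        # n1 is the smaller sample
w_females = sum(ranks[len(males):])      # rank sum for the females

mean_w = n1 * (n1 + n2 + 1) / 2
sd_w = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z = (w_females - mean_w) / sd_w

print(f"female rank sum = {w_females}, expected = {mean_w}, z = {z:.4f}")
```

A quick sanity check on any hand ranking: the ranks of all 35 students must add up to 35 × 36 / 2 = 630.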

(2) Two drivers are testing the mileage of different models of cars.

a. The gas mileage of nine different cars is as indicated. Use the Wilcoxon signed rank test to test at the 5% level whether there is a difference in the gas mileage for the two drivers based on the data below. (Use the table for the signed rank critical values for this part.)

driver 1   driver 2
14.3       16.8
15.0       17.8
27.8       26.2
27.9       33.2
48.8       47.6
16.8       18.3
23.7       28.5
32.8       33.1
37.3       44.0

b. Thirty-three cars were tested. Differences in mileage were calculated and ranked. There were three differences that were equal to zero. The sum of the positive ranks was 200 and the sum of the negative ranks was 265. Use the Wilcoxon signed rank test to test at the 5% level whether there is a difference in the gas mileage for the two drivers.
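Both parts of question (2) can be sketched together. For part a, the differences are ranked by absolute value and the ranks are summed separately for positive and negative differences; the smaller sum is compared with the signed rank table. For part b, the sketch assumes the usual large-sample normal approximation, with zero differences dropped from the sample size. Variable names are illustrative.

```python
from math import sqrt

# Part a: nine cars, each driven by both drivers
driver1 = [14.3, 15.0, 27.8, 27.9, 48.8, 16.8, 23.7, 32.8, 37.3]
driver2 = [16.8, 17.8, 26.2, 33.2, 47.6, 18.3, 28.5, 33.1, 44.0]

# Round to one decimal to remove floating-point noise from the subtraction
diffs = [round(a - b, 1) for a, b in zip(driver1, driver2)]
nonzero = [d for d in diffs if d != 0]

# No two absolute differences are tied in these data, so ranks are simply 1..n
ordered = sorted(nonzero, key=abs)
t_plus = sum(rank for rank, d in enumerate(ordered, start=1) if d > 0)
t_minus = sum(rank for rank, d in enumerate(ordered, start=1) if d < 0)
t_a = min(t_plus, t_minus)   # compare with the signed rank table for n = 9

print(f"part a: T+ = {t_plus}, T- = {t_minus}, test statistic = {t_a}")

# Part b: 33 cars, 3 zero differences, rank sums 200 (positive) and 265 (negative)
n_b = 33 - 3
t_b = min(200, 265)
mean_t = n_b * (n_b + 1) / 4
sd_t = sqrt(n_b * (n_b + 1) * (2 * n_b + 1) / 24)
z_b = (t_b - mean_t) / sd_t

print(f"part b: n = {n_b}, mean = {mean_t}, sd = {sd_t:.4f}, z = {z_b:.4f}")
```

As a check on part b, the two rank sums should add up to n(n + 1) / 2 = 30 × 31 / 2 = 465, and 200 + 265 does.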


(3) Suppose scores for 17 students from 3 schools in intermural competitions are as given below. Use the Kruskal-Wallis Test to test at the 5% level whether average scores for students from the three schools are the same.

School A: 27, 21, 30, 23, 18, 19
School B: 22, 29, 17, 26, 14, 16
School C: 24, 12, 11, 13, 28
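A sketch of the Kruskal-Wallis statistic for question (3): all 17 scores are ranked together, the ranks are summed within each school, and H = 12 / (n(n + 1)) × Σ(R² / nᵢ) - 3(n + 1). The simple ranking below relies on the fact that no two scores in these data are equal; tied data would need average ranks. Variable names are illustrative.

```python
schools = {
    "A": [27, 21, 30, 23, 18, 19],
    "B": [22, 29, 17, 26, 14, 16],
    "C": [24, 12, 11, 13, 28],
}

# Rank all scores together, lowest score = rank 1 (valid here because no scores tie)
all_scores = sorted(score for scores in schools.values() for score in scores)
rank_of = {score: rank for rank, score in enumerate(all_scores, start=1)}

n = len(all_scores)
rank_sums = {name: sum(rank_of[s] for s in scores) for name, scores in schools.items()}

h = (12 / (n * (n + 1))
     * sum(rank_sums[name] ** 2 / len(scores) for name, scores in schools.items())
     - 3 * (n + 1))
df = len(schools) - 1   # compare H with a chi-squared table value

print("rank sums:", rank_sums)
print(f"H = {h:.4f} with {df} degrees of freedom")
```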

(4) Suppose that in a class of 15 males and 10 females, course averages are such that the males (m) and females (f) rank from high to low as given below. Test whether the arrangement is random, at the 10% level.

f f m m m f m f f f m m m m m f f m m m f m m m f
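Counting runs by eye is error-prone, so the sketch below counts them from the sequence and then applies the large-sample normal approximation for the runs test. Using the approximation here is an assumption; with groups of 15 and 10 your course may instead expect a table of critical values for the number of runs. Variable names are illustrative.

```python
from math import sqrt

sequence = "f f m m m f m f f f m m m m m f f m m m f m m m f".split()

n_f = sequence.count("f")
n_m = sequence.count("m")
n = n_f + n_m

# A new run starts wherever a symbol differs from the one before it
runs = 1 + sum(1 for prev, cur in zip(sequence, sequence[1:]) if prev != cur)

mean_runs = 2 * n_f * n_m / n + 1
var_runs = 2 * n_f * n_m * (2 * n_f * n_m - n) / (n ** 2 * (n - 1))
z = (runs - mean_runs) / sqrt(var_runs)

print(f"{n_f} females, {n_m} males, {runs} runs")
print(f"expected runs = {mean_runs}, variance = {var_runs}, z = {z:.4f}")
```

Too few runs suggests clustering and too many suggests alternation, so the test of randomness is two-sided.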