Review for the Final Exam STAT E-150 Statistical Methods



The final exam will be on Wednesday, May 15 during our regular class time. You will have two hours to complete the exam.

Please arrive on time and take alternating seats so that you have room for your materials.

The exam is open book: you may use your own notes, handouts, homework, textbook, etc. Basically, you may use any of your own materials from this course, from this semester.


Topics will include: simple linear regression, multiple regression, one-way ANOVA, two-way ANOVA, repeated measures, logistic regression, multiple logistic regression, experimental design, and nonparametric tests. No specific topics from the first half of the course will be tested, except as they relate to the topics listed here.


What to expect

• Multiple choice, matching questions, and short answer questions as there were on the midterm

• Using SPSS output instead of calculations

• Interpreting graphs

You’ll be tested on what you’ve seen (assuming you’ve done your homework …)


If you would like to have your exam returned to you, please bring a large self-addressed envelope to the exam, with sufficient postage attached (not affixed).

If you forget to bring an envelope to the exam, you can send one to me at Pine Manor College, 400 Heath St, Chestnut Hill, MA 02467.


WELL BEFORE THE EXAM

• Organize the course handouts, your homework, homework solutions, and any section materials from the class website in a 3-ring binder, tabbed for quick reference. Materials from before the Midterm could be helpful as well.

• Make a few pages of your own key notes, such as particular formulas, definitions or concepts.

• Consider making your own guides to hypothesis tests, such as the example in this document.


WELL BEFORE THE EXAM

• Review the class notes and homework solutions, especially on concepts that you need to review more carefully.

• Be sure that you have taken the Practice Final using the materials (calculator, notes, section handouts...) that you will use for the exam.

• Attend some or all of these sections:

Fri. 5/10 53 Church St., Rm 201 5:30 - 6:30 Kela

Mon. 5/13 53 Church St., Rm 202 5:30 - 6:30 Kela

Wed. 5/15 Harvard Hall, Rm 201 4:30 - 5:20 Stephanie


BEFORE THE EXAM

• Organize all materials, notebooks, textbook, section handouts, etc. Only your own materials from this course, this semester, are permitted.

• Be sure you have your calculator, and possibly an extra one. You may not use a graphing calculator for statistics functions, and you may not use a cell phone or PDA calculator.

• When you are ready to begin, RELAX! Continue to think positive thoughts about the outcome of the exam, as research has shown this technique to contribute to better scores (Meichenbaum, 1996).


DURING THE EXAM

• Read questions carefully and thoroughly.

• Pace yourself, and keep track of your progress and the clock. Don’t get bogged down. Consider noting any difficult questions and coming back later.

• Think carefully about the appropriate analysis: Hypothesis test or confidence interval? Is the response variable quantitative or categorical? How many treatments? Which is the coefficient of interest? Are you concerned with means or relationships between variables?

• When you are finished, go back to check that you haven’t skipped any questions.


AFTER THE EXAM

CONGRATULATIONS!!  IT’S TIME TO CELEBRATE!!!


SAMPLE HYPOTHESIS TEST GUIDE - create your own for each type of test

Multiple Regression

Test 1: Overall significance of multiple regression model; use an F-test for the model (ANOVA).

H0: β1 = β2 = … = βk = 0
Ha: The slopes are not all zero

Test 2: Specific significance of single coefficient; use a t-test for each coefficient.

H0: βj = 0
Ha: βj ≠ 0
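As a rough illustration of the two tests in this guide, the sketch below computes the overall F statistic from the model and error sums of squares, and the t statistic for a single coefficient from its estimate and standard error. The function names are hypothetical. The sums of squares are taken from the Corrected Model and Error rows of the SPSS table later in this review, where the model has 5 degrees of freedom (playing the role of k) and there are 30 cases; the coefficient and standard error in the t example are made-up numbers used only for illustration. On the exam you would read F, t, and their p-values directly from the SPSS output.

```python
def overall_f(ss_model, ss_error, k, n):
    """F statistic for H0: beta_1 = ... = beta_k = 0 (k model df, n cases)."""
    ms_model = ss_model / k            # mean square for the model, df = k
    ms_error = ss_error / (n - k - 1)  # mean square error, df = n - k - 1
    return ms_model / ms_error


def coefficient_t(b_j, se_b_j):
    """t statistic for H0: beta_j = 0, referred to a t distribution with n - k - 1 df."""
    return b_j / se_b_j


# Sums of squares from the SPSS table in this review (5 model df, 30 cases).
f_stat = overall_f(ss_model=935.5, ss_error=282.0, k=5, n=30)
print(round(f_stat, 3))

# Hypothetical coefficient and standard error, for illustration only.
t_stat = coefficient_t(b_j=2.4, se_b_j=0.8)
print(round(t_stat, 2))
```

A large F (small p-value) suggests that at least one slope is nonzero; a large |t| for a given coefficient suggests that predictor is useful in a model that already contains the others.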


Review Questions For each question choose the best method of analysis and write the appropriate hypotheses. Choose from the statistical methods we have discussed this semester: 

Simple linear regression
Multiple regression
Logistic regression
Multiple logistic regression
One-way ANOVA
Two-way ANOVA
Repeated measures ANOVA


1. A survey asked subjects to report their political ideology, measured with seven categories in which 1 = extremely liberal, 4 = moderate, and 7 = extremely conservative. The subjects also reported their gender (male, female) and their level of education (no college, some college, college graduate). The data was used to investigate any differences in the political ideologies of these groups.

What is the best method of analysis? Two-way ANOVA

What would be your hypotheses?

H0: μmn = μfn = μms = μfs = μmg = μfg

Ha: the means are not all equal


2. A survey asked subjects to report their political ideology, measured with seven categories in which 1 = extremely liberal, 4 = moderate, and 7 = extremely conservative. The subjects also reported their gender (male, female) and their level of education (no college, some college, college graduate). The data was used to see if there were differences in the political ideologies of people with different levels of education. What is the best method of analysis? One-way ANOVA

What would be your hypotheses?

H0: μnone = μsome = μgrad

Ha: the means are not all equal


3. In a survey related to Jeb Bush’s possible candidacy for President, subjects were asked for their annual income and whether they would vote for Bush, to see if there is any relationship between income and interest in voting for Bush.  

What is the best method of analysis? Logistic regression

What would be your hypotheses?

H0: β1 = 0

Ha: β1 ≠ 0
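To see what these hypotheses say about the model, here is a minimal sketch of a logistic model in which the log odds of voting for Bush are β0 + β1 × income. The function name and the coefficient values are hypothetical and chosen only for illustration; no coefficients are given for this survey.

```python
import math


def vote_probability(income, b0, b1):
    """P(would vote for Bush) under the logistic model log(odds) = b0 + b1 * income."""
    log_odds = b0 + b1 * income
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical coefficients (income in thousands of dollars), for illustration only.
b0, b1 = -1.0, 0.02

p_low = vote_probability(40, b0, b1)
p_high = vote_probability(100, b0, b1)
print(round(p_low, 3), round(p_high, 3))

# Under H0 (b1 = 0) the predicted probability is the same at every income.
p_null_low = vote_probability(40, b0, 0.0)
p_null_high = vote_probability(100, b0, 0.0)
print(p_null_low == p_null_high)
```

With β1 = 0 the income term drops out of the model, which is why the test of H0: β1 = 0 is a test of whether income is related to voting interest at all.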


4. Many women give birth to more than one child. In research on the birthweights of children, data was gathered on the birthweights of children born to six different mothers. The data looked like this:

 

What is the best method of analysis? Repeated measures ANOVA

What would be your hypotheses?

H0: μ1 = μ2 = μ3 = μ4

Ha: the means are not all equal

Birthweights

Mother   Child 1   Child 2   Child 3   Child 4
1        6.4       6.9       6.7       7.1
2        8.5       7.8       7.8       8.3
:        :         :         :         :
6        7.0       7.8       8.6       6.6
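The method choices in questions 1 through 4 can be summarized as a rough decision rule. The sketch below is a simplification under stated assumptions, not a complete guide: the function name and its arguments are hypothetical, it expects at least one predictor, it treats the 1 to 7 ideology scale as quantitative (as the answers above do), and it covers only the methods in the list for these review questions.

```python
def choose_method(response, predictors, repeated=False):
    """Suggest an analysis from the course list.

    response   -- "quantitative" or "binary"
    predictors -- list of predictor types, each "categorical" or "quantitative"
    repeated   -- True when the same subjects are measured more than once
    """
    if response == "binary":
        return "Logistic regression" if len(predictors) == 1 else "Multiple logistic regression"
    if repeated:
        return "Repeated measures ANOVA"
    if all(p == "categorical" for p in predictors):
        return "One-way ANOVA" if len(predictors) == 1 else "Two-way ANOVA"
    return "Simple linear regression" if len(predictors) == 1 else "Multiple regression"


# Question 1: ideology by gender and education.
print(choose_method("quantitative", ["categorical", "categorical"]))
# Question 2: ideology by education only.
print(choose_method("quantitative", ["categorical"]))
# Question 3: would vote for Bush (yes/no) by income.
print(choose_method("binary", ["quantitative"]))
# Question 4: birthweight by birth order, same mothers measured repeatedly.
print(choose_method("quantitative", ["categorical"], repeated=True))
```

The questions the rule asks mirror the ones to ask during the exam: is the response quantitative or categorical, how many explanatory variables are there, and are the same subjects measured more than once?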


5. An investigator for the state police wants to compare the effectiveness of three different defensive driving programs and to see if there are gender differences. Five subjects of each gender who recently received speeding tickets are assigned to each program. At the end of the program each is given a written test on his or her knowledge of defensive driving.

The scores (out of 100) are given here:

  Scores

Gender   One 8-hour session   Two 4-hour sessions   Two 2-hour sessions
Female   88 92 98 99 91       87 92 91 94 93        80 82 79 86 88
Male     89 96 95 90 96       95 87 90 91 92        77 78 83 78 78


Use the SPSS results shown to answer these questions:

1. Is there interaction between program and gender?

a. No, because the p-value is close to zero.
b. Yes, because .196 is greater than .05
c. No, because .196 is greater than .05
d. Yes, because .374 is greater than .05
e. No, because .374 is greater than .05

Tests of Between-Subjects Effects

Dependent Variable: score

Source              Type III Sum of Squares   df   Mean Square   F           Sig.
Corrected Model     935.500a                  5    187.100       15.923      .000
Intercept           234967.500                1    234967.500    19997.234   .000
gender              20.833                    1    20.833        1.773       .196
sessions            890.600                   2    445.300       37.898      .000
gender * sessions   24.067                    2    12.033        1.024       .374
Error               282.000                   24   11.750
Total               236185.000                30
Corrected Total     1217.500                  29

a. R Squared = .768 (Adjusted R Squared = .720)
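The exam uses SPSS output rather than hand calculation, but it can help to see where the numbers in this table come from. The sketch below recomputes the sums of squares and F statistics from the 30 scores in the defensive driving table, using only the Python standard library. The variable names are hypothetical, and the simple formulas used here agree with SPSS's Type III sums of squares because the design is balanced (five subjects in every cell); that would not hold in general for unbalanced data. The p-values in the Sig. column are not computed here.

```python
# Scores from the defensive driving table: scores[gender][program] -> 5 scores.
scores = {
    "Female": {
        "One 8-hour": [88, 92, 98, 99, 91],
        "Two 4-hour": [87, 92, 91, 94, 93],
        "Two 2-hour": [80, 82, 79, 86, 88],
    },
    "Male": {
        "One 8-hour": [89, 96, 95, 90, 96],
        "Two 4-hour": [95, 87, 90, 91, 92],
        "Two 2-hour": [77, 78, 83, 78, 78],
    },
}


def mean(values):
    return sum(values) / len(values)


genders = list(scores)
programs = list(scores["Female"])
all_scores = [x for g in genders for p in programs for x in scores[g][p]]
grand = mean(all_scores)
n_cell = 5

# Main-effect sums of squares: (cases per level) * squared deviation of the level mean.
ss_gender = sum(
    n_cell * len(programs) * (mean([x for p in programs for x in scores[g][p]]) - grand) ** 2
    for g in genders
)
ss_sessions = sum(
    n_cell * len(genders) * (mean([x for g in genders for x in scores[g][p]]) - grand) ** 2
    for p in programs
)
# Model SS from the six cell means; interaction is what the main effects leave over.
ss_model = sum(n_cell * (mean(scores[g][p]) - grand) ** 2 for g in genders for p in programs)
ss_interaction = ss_model - ss_gender - ss_sessions
# Error SS: variation of the scores around their own cell mean.
ss_error = sum(
    (x - mean(scores[g][p])) ** 2 for g in genders for p in programs for x in scores[g][p]
)

df_error = len(all_scores) - len(genders) * len(programs)  # 30 - 6 = 24
ms_error = ss_error / df_error

f_gender = (ss_gender / (len(genders) - 1)) / ms_error
f_sessions = (ss_sessions / (len(programs) - 1)) / ms_error
f_interaction = (ss_interaction / ((len(genders) - 1) * (len(programs) - 1))) / ms_error

print(round(ss_gender, 3), round(ss_sessions, 3), round(ss_interaction, 3), round(ss_error, 3))
print(round(f_gender, 3), round(f_sessions, 3), round(f_interaction, 3))
```

Each F statistic is the mean square for that source divided by the mean square for Error, which is the arithmetic behind the F column of the table.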



2. How does the interaction plot support your results in the previous question?

a. There is evidence of interaction because for part of the plot the lines appear to be parallel.

b. There is no evidence of interaction because for part of the plot the lines are not parallel.

c. There is evidence of interaction because the lines do not intersect.

d. There is no evidence of interaction because the lines do not intersect.

e. None of the above.
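The plot itself is not reproduced in this text, but an interaction plot of this kind draws one line per gender through the mean score for each program. As a sketch, the six cell means it would plot can be computed from the scores in the defensive driving table; the variable names are hypothetical. Lines that stay roughly parallel, meaning the gap between the two genders is about the same for every program, are what you would expect to see when there is little or no interaction.

```python
# Scores from the defensive driving table: scores[gender][program] -> 5 scores.
scores = {
    "Female": {
        "One 8-hour": [88, 92, 98, 99, 91],
        "Two 4-hour": [87, 92, 91, 94, 93],
        "Two 2-hour": [80, 82, 79, 86, 88],
    },
    "Male": {
        "One 8-hour": [89, 96, 95, 90, 96],
        "Two 4-hour": [95, 87, 90, 91, 92],
        "Two 2-hour": [77, 78, 83, 78, 78],
    },
}

# Mean score in each gender-by-program cell: the points on the interaction plot.
cell_means = {
    gender: {program: sum(vals) / len(vals) for program, vals in by_program.items()}
    for gender, by_program in scores.items()
}

# Gap between the two lines at each program; a constant gap means parallel lines.
gaps = {
    program: cell_means["Female"][program] - cell_means["Male"][program]
    for program in cell_means["Female"]
}

for program, gap in gaps.items():
    female = cell_means["Female"][program]
    male = cell_means["Male"][program]
    print(f"{program}: Female {female:.1f}, Male {male:.1f}, gap {gap:.1f}")
```

Whether a change in the gap is large enough to count as interaction is what the gender * sessions row of the SPSS table tests.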



3. Is there a significant difference in the mean scores for the three types of sessions? 

a. Yes, because F = 1.773 and p is large.
b. No, because F = 1.773 and p is large.
c. Yes, because F = 37.898 and p is close to 0.
d. No, because F = 37.898 and p is close to 0.
e. None of the above.



4. Is there a significant difference in the scores by gender?

  Yes/No, because F = 1.773 and the p-value is large/small.
