
Weebly - Quant 1 - Exam #2 (Dataset “Earth.sav”)mariehoffmanmha.weebly.com/.../hoffman_marie_exam_2.docx · Web viewQuant 1 - Exam #2 (Dataset “Earth.sav”) Please provide


Quant 1 - Exam #2 (Dataset “Earth.sav”)

Please provide all responses in this document and name the file using the convention “Last name, First name – exam #2.docx”. Please highlight the text of your answers. There are six (6) pages in this document. Ensure you scroll down. I put page breaks in between each of the numbered problems.

For this assignment, we will be using the file, “Earth.sav,” and the following variables. Use an α = .05 significance level for the entire exam. For all applicable analyses, exclude cases listwise.

- Average female life expectancy (lifeexpf) – dependent variable (DV)
- Gross domestic product (gdp_cap) – independent variable (IV)
- Fertility: average number of children (fertility) – independent variable (IV)
- Number of people / sq. kilometer (density) – independent variable (IV)
- Daily calorie intake (calories) – independent variable (IV)

1. Using the steps from class and in SPSS, perform a correlation analysis between Average female life expectancy (lifeexpf) and Daily calorie intake (calories).

a. Paste a copy of the pertinent parts of your output from the correlation analysis below (only the pertinent parts and please make it neat and reasonably sized). I recommend you paste as JPEG images. (4 points)

b. Are the two variables significantly correlated? If so, positively or negatively? Support your answer. (4 points)

Yes. The scatterplot indicates a positive (direct) correlation because the points slope from the lower left to the upper right. The correlation is significant because the p-value is less than .001, which is well below the α = .05 significance level. The Correlations table shows a correlation coefficient of .775, which also indicates a strong correlation.
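As a rough illustration of what the Correlations table reports, the sketch below computes a Pearson correlation coefficient and the t statistic used to test it. The data values are made up for illustration and are not taken from Earth.sav; the function names are hypothetical, and the sample size behind the reported r = .775 is not given in this document, so the test is not reproduced here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


# Made-up values (NOT from Earth.sav), chosen only to show a positive trend.
calories = [2000, 2500, 3000, 3500]
lifeexpf = [55, 62, 70, 76]

r = pearson_r(calories, lifeexpf)
direction = "positive" if r > 0 else "negative"
print(f"r = {r:.3f} ({direction})")
```

A positive r matches a scatterplot that rises from lower left to upper right; the p-value SPSS prints comes from comparing the t statistic against a t distribution with n - 2 degrees of freedom.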


2. Using the steps from class and in SPSS, perform a simple linear regression analysis to determine the impact of Fertility: average number of children (fertility) on Average female life expectancy (lifeexpf).

a. Paste a copy of the pertinent parts (and only the pertinent parts) of your output from the simple linear regression analysis below. Proceed as though you have met all assumptions required for simple linear regression (in other words, you do NOT have to do the diagnostics for this problem). (4 points)


b. Is the overall simple linear regression model significant? Support your answer. (4 points)

Yes. The overall model is significant (it has predictive value). The p-value in the ANOVA table is less than .001, which is below α = .05, so we reject the null hypothesis that the model has no predictive value.

c. What is the value of the coefficient of determination and what does it mean as it relates to this problem? (4 points)

R² = .702 (70.2%), found in the Model Summary table. In terms of this problem, 70.2% of the variance in average female life expectancy is explained by the variation in fertility (average number of children).

d. Write out the regression equation model using the information from your analysis. (4 points)

Y = 86.661 - 4.674X

e. What is the expected average female life expectancy when the average number of children (fertility) is 2.5? (4 points)

Y = 86.661 - 4.674(2.5)
Y = 74.976
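The arithmetic in 2.e. can be checked with a few lines of Python. This is only a sketch: the intercept and slope are the values reported in 2.d., and the function name is a hypothetical choice for illustration.

```python
# Coefficients as reported in the answer to 2.d.
INTERCEPT = 86.661
SLOPE_FERTILITY = -4.674


def predict_life_expectancy(fertility):
    """Predicted average female life expectancy from the simple regression."""
    return INTERCEPT + SLOPE_FERTILITY * fertility


prediction = predict_life_expectancy(2.5)
print(round(prediction, 3))  # 74.976
```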

f. In the terms of our problem, interpret the unstandardized Beta coefficient for b1. (4 points)

As fertility increases by one unit (one child), the expected average female life expectancy decreases by 4.674 years.

g. In the terms of our problem, interpret the standardized Beta coefficient for b1. (4 points)

The standardized Beta coefficient of -.838 indicates that as the average number of children increases by one standard deviation, the average female life expectancy decreases by .838 standard deviations.
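In a simple linear regression with one predictor, the standardized Beta equals the Pearson correlation between the predictor and the DV, so squaring it should reproduce the coefficient of determination. The short check below uses the -.838 and .702 values reported in the answers above; the variable names are illustrative only.

```python
# Standardized Beta for fertility, as reported in 2.g.
standardized_beta = -0.838

# With a single predictor, Beta equals r, so Beta squared equals R squared.
r_squared = standardized_beta ** 2

print(round(r_squared, 3))  # 0.702, matching the 70.2% reported in 2.c.
```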


3. Use the steps from class and in SPSS. We want to analyze the effect of several variables on Average female life expectancy (dependent variable). Use the following variables and perform the appropriate steps for a multiple regression analysis:

- Average female life expectancy (lifeexpf) – dependent variable (DV)
- Gross domestic product (gdp_cap) – independent variable (IV)
- Fertility: average number of children (fertility) – independent variable (IV)
- Number of people / sq. kilometer (density) – independent variable (IV)
- Daily calorie intake (calories) – independent variable (IV)

a. Paste a copy of the pertinent parts of your output from the multiple linear regression analysis below. (4 points)


b. Does each of the IVs appear to be linearly related to the DV? Which part(s) of the SPSS output provides that information? If there are any IVs that do not appear to be linearly related to the DV, which are they? (4 points)

No. The matrix scatterplot shows that all of the independent variables except density appear to have reasonably linear relationships with the DV, average female life expectancy. The scatterplot for density shows little or no linear relationship, based on its flat trend line. There also appears to be an obvious outlier in the density data.

c. Does the model meet the assumption of independent errors (residuals) and homoscedasticity? Which part(s) of the SPSS output provides that information? (4 points)

Yes, the model appears to meet the assumptions of independent errors and homoscedasticity. The scatterplot of the standardized predicted values against the standardized residuals shows that the points are scattered with no obvious trend. We therefore accept that the model meets these assumptions.

d. Is there any multicollinearity? Support your answer. Which part(s) of the SPSS output provides the best information to answer the multicollinearity question? (4 points)

No. The VIF values for all of the IVs are under three, which is an acceptably low level and tells us that there is no problematic multicollinearity in our model. The VIF values are found in the Coefficients table.
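For context on what a VIF value means, the sketch below uses the standard definition VIF = 1 / (1 - R²), where R² comes from regressing one IV on the other IVs. The function names are hypothetical, and the inputs are round illustrative numbers rather than values from the SPSS output.

```python
def vif(r_squared_j):
    """Variance inflation factor for an IV whose regression on the
    other IVs has coefficient of determination r_squared_j."""
    if not 0.0 <= r_squared_j < 1.0:
        raise ValueError("r_squared_j must be in [0, 1)")
    return 1.0 / (1.0 - r_squared_j)


def r_squared_from_vif(vif_value):
    """Invert the definition: share of an IV's variance explained
    by the other IVs for a given VIF."""
    if vif_value < 1.0:
        raise ValueError("VIF cannot be below 1")
    return 1.0 - 1.0 / vif_value


print(vif(0.0))  # 1.0: the IV is unrelated to the other IVs
print(vif(0.5))  # 2.0
print(round(r_squared_from_vif(3.0), 3))  # 0.667
```

Read this way, a VIF under three means that less than about two thirds of that IV's variance is shared with the other predictors. Cutoffs for "too high" vary by textbook, so the threshold used in class should take precedence.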

e. Are the errors normally distributed? Support your answer. Which part(s) of the SPSS output provides that information? (4 points)

Yes, the model reflects normally distributed errors. The histogram of the residuals follows a normal bell curve, and the points in the P-P plot fall generally along the diagonal line; both indicate normality. The residual scatterplot also shows an even, "shotgun blast" spread of points with no pattern, which is consistent with well-behaved errors.


f. Is the overall multiple linear regression model significant? Support your answer. (4 points)

Yes. The global F-test measures the overall significance of the model at the α = .05 level. The result is found in the ANOVA table, where the p-value is less than .001. Because the p-value is low, the null must go: we reject the null hypothesis that the model holds no predictive value.

g. Which of the independent variables significantly contribute to the predictive/explanatory value to the model? Which part(s) of the SPSS output provides that information? (4 points)

The Coefficients table gives the significance of each independent variable. With α = .05 as the cutoff for significance, fertility (p < .001) and daily calorie intake (p = .001) significantly contribute to the predictive value of the model.

h. Of the independent variables that DO NOT significantly contribute to the predictive/explanatory value to the model, which has the highest p value (i.e., furthest from being significant) and what is the p value of that IV? (4 points)

The remaining variables are not significant, with p-values well above α = .05. The p-value for GDP is .718 and the p-value for density is .908, so neither variable contributes significantly to the model. Density has the highest p-value (.908) and is the furthest from being significant.
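The decisions in 3.g. and 3.h. amount to comparing each p-value in the Coefficients table with α. A minimal sketch of that comparison follows, using the p-values quoted in the answers above. One assumption is flagged in the code: the p-value for fertility is reported only as below .001, so .001 is used as an upper bound.

```python
ALPHA = 0.05

# p-values as quoted in the answers. Fertility is reported only as "< .001",
# so 0.001 is an upper-bound placeholder (an assumption, not SPSS output).
p_values = {
    "fertility": 0.001,
    "calories": 0.001,
    "gdp_cap": 0.718,
    "density": 0.908,
}

# Predictors whose p-value falls below the significance level.
significant = [name for name, p in p_values.items() if p < ALPHA]

# Predictors that do not, and the one furthest from significance.
not_significant = {name: p for name, p in p_values.items() if p >= ALPHA}
least_significant = max(not_significant, key=not_significant.get)

print(significant)        # ['fertility', 'calories']
print(least_significant)  # density
```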


4. Use the steps from class and in SPSS. Remove the non-significant predictor variables from the multiple linear regression model in number 3. above. Use the remaining predictor variables and perform the appropriate steps for a multiple regression analysis (assume that you have met all your assumptions – just run the multiple regression analysis).

a. Is the overall multiple linear regression model significant? Support your answer. (4 points)

Yes. The global F-test measures the overall significance of the model at the α = .05 level. The result is found in the ANOVA table: F = 127.96, p < .001, adjusted R² = .774. Because the p-value is low, the null must go: we reject the null hypothesis that the model holds no predictive value.

b. What is the value of the coefficient of multiple determination and what does it mean as it relates to this problem? (4 points)

The coefficient of multiple determination, found in the Model Summary table, is reported here as the adjusted R² of .774. This means that 77.4% of the variance in average female life expectancy can be explained by the variation in fertility and daily calorie intake.

c. Write out the regression equation model using the information from your analysis. (4 points)

Y (female life expectancy) = 61.631 - 3.462(average number of children) + .007(daily calorie intake)
Y = 61.631 - 3.462(X1) + .007(X2)

d. In the terms of our problem, interpret the unstandardized Beta coefficient for the predictor variable with the Beta coefficient with the largest/larger standard error. (4 points)

The predictor variable with the larger standard error is fertility, with an unstandardized Beta coefficient of -3.462. This means that as fertility increases by one unit (one child), with daily calorie intake held constant, the average female life expectancy decreases by 3.462 years.

e. In the terms of our problem, interpret the standardized Beta coefficient for the same predictor variable as in d. immediately above. (4 points)

The standardized Beta coefficient for fertility is -.589. As fertility increases by one standard deviation, with daily calorie intake held constant, the average female life expectancy decreases by .589 standard deviations.

f. Use 1 (one) in place of the X values in the new model equation to predict the expected average female life expectancy (note: this is like 2.e. above, but I can’t provide more information without giving you the answer to earlier questions). (4 points)

Y=61.631 - 3.462 (1) + .007(1)=58.176
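The same substitution can be checked in Python. This is a sketch using the coefficients reported in 4.c.; the function and parameter names are hypothetical, and the inputs of 1 are the placeholder values the exam asks for, not realistic data.

```python
# Coefficients as reported in the answer to 4.c.
INTERCEPT = 61.631
B_FERTILITY = -3.462
B_CALORIES = 0.007


def predict_life_expectancy(fertility, calories):
    """Predicted average female life expectancy from the two-predictor model."""
    return INTERCEPT + B_FERTILITY * fertility + B_CALORIES * calories


prediction = predict_life_expectancy(fertility=1, calories=1)
print(round(prediction, 3))  # 58.176
```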


5. A student playing an interactive marketing game received the following output when studying the relation between advertising expenditures (IV) and sales (DV) for one of his products:

Yi = 357.4 - .23X

p-value of the estimated slope = .92

The student stated: “This output indicates that the more we spend on advertising this product, the fewer units we sell!!” Comment. (4 points)

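This statement is not supported by the output. The estimated slope is negative, so plugging values into the linear equation does give lower predicted sales as spending rises: X = 500 gives 357.4 - .23(500) = 242.4, and X = 600 gives 357.4 - .23(600) = 219.4. However, the p-value of the estimated slope is .92, which is far above α = .05, so we fail to reject the null hypothesis that the slope is zero. The output therefore gives no evidence of a linear relationship between advertising expenditures and sales, and the student cannot conclude that more advertising leads to fewer sales.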
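Two separate checks are involved in problem 5: what the fitted line Yi = 357.4 - .23X predicts, and whether the slope's p-value of .92 allows any conclusion at α = .05. The sketch below runs both, using only the numbers given in the problem; the function name and the spending levels of 500 and 600 are illustrative choices.

```python
ALPHA = 0.05

# Values given in the problem statement.
INTERCEPT = 357.4
SLOPE = -0.23
SLOPE_P_VALUE = 0.92


def predicted_sales(advertising):
    """Predicted sales from the fitted line for a given advertising spend."""
    return INTERCEPT + SLOPE * advertising


sales_500 = predicted_sales(500)
sales_600 = predicted_sales(600)
slope_is_significant = SLOPE_P_VALUE < ALPHA

print(round(sales_500, 1), round(sales_600, 1))  # 242.4 219.4
print(slope_is_significant)                      # False
```

The point predictions fall as spending rises, but the slope is not statistically distinguishable from zero, so the sign of the slope should not be interpreted.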

6. Complete the missing values in the ANOVA table below: (4 points)


245.93454.793