Notebook Tab 6, Pages 183 to 196
© 2014 ConteSolutions
When the assumed relationship is best fit by a straight-line model (Pearson’s correlation coefficient r has |r| close to 1), the approach is known as Linear Regression Analysis.
Excel and Minitab give the r² value.
r is the square root of r², with a + or - sign (the sign of the slope).
When the model best fits a straight line and includes only one independent variable, it is known as Simple Linear Regression Analysis.
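As a small illustration of the square-root relationship, the sketch below recovers r from an r² value. It assumes the sign is taken from the fitted slope. The function name `r_from_r_squared` is hypothetical, and the inputs are the R-Sq and slope reported for the height and weight example in these notes.

```python
import math


def r_from_r_squared(r_squared, slope):
    """Return r: the square root of r-squared, carrying the sign of the slope."""
    return math.copysign(math.sqrt(r_squared), slope)


# R-Sq = 75.92% and slope = 3.2553 are the values reported in these notes.
r = r_from_r_squared(0.7592, 3.2553)
print(round(r, 2))
```

A negative slope would give a negative r of the same magnitude.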
$y = b_0 + b_1 x + e$
Test of hypothesis (TOH) in the linear regression model (ANOVA)
The null hypothesis carries the = sign and refers to a parameter.
Null hypothesis: all beta coefficients are equal to zero.
Criterion: if the P-value from Minitab is less than 0.05, we reject the null hypothesis and accept the statistically significant alternative, based on α = 0.05.
Excel users note: 5.45E-05 is exponential notation and is equal to 0.0000545 (see Wikipedia).
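The decision rule and the exponential notation can be checked with a few lines of Python. This is a minimal sketch: the name `reject_null` is hypothetical, and α = 0.05 is the threshold used in these notes.

```python
ALPHA = 0.05  # significance level used in these notes


def reject_null(p_value, alpha=ALPHA):
    """Return True when the P-value is below alpha (reject the null hypothesis)."""
    return p_value < alpha


# Excel-style exponential notation parses directly as a float.
p_value = float("5.45E-05")
print(f"{p_value:.7f}")      # 0.0000545
print(reject_null(p_value))  # True
```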
Response variable and input variable
Objective: to predict the dependent variable based on the independent variable
Example: predict customer satisfaction based on wait time
Strength of the relationship is the “Correlation Coefficient”
The regression equation contains two values of interest
$y = ax + b$
Slope and intercept and error term
Least squares method
Least squares regression line
Example: x is wait time, y is customer satisfaction
$y = b_0 + b_1 x + e$
The following data was collected to see if weight can be predicted from a person’s height:

HEIGHT   WEIGHT
70       155
63       150
72       180
60       135
66       156
70       168
74       178
65       160
62       132
67       145
65       139
Correlations: HEIGHT, WEIGHT
R-Sq = 75.92%
Square root of 0.7592 = 0.87
Pearson correlation of HEIGHT and WEIGHT = +0.87
Pearson correlation coefficients range between -1.0 and +1.0.
They can be positive or negative.
In the customer satisfaction model, it could be + or -.
P-Value = 0.000477
H0: Slope (beta1) = 0 (no linear correlation)
Ha: Slope (beta1) not = 0 (there appears to be a linear correlation)
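The reported correlation can be reproduced from the 11 height and weight pairs in these notes. The sketch below computes Pearson's r directly from its definition. The function name `pearson_r` is hypothetical, and no statistics package is assumed.

```python
import math

# The 11 (HEIGHT, WEIGHT) observations from these notes.
height = [70, 63, 72, 60, 66, 70, 74, 65, 62, 67, 65]
weight = [155, 150, 180, 135, 156, 168, 178, 160, 132, 145, 139]


def pearson_r(x, y):
    """Pearson correlation: Sxy / sqrt(Sxx * Syy)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(height, weight)
print(round(r, 2), round(r * r, 4))  # 0.87 0.7592
```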
Regression Analysis: WEIGHT versus HEIGHT
The regression equation is:
WEIGHT = - 62.8509 + 3.2553 x HEIGHT
Predictor   Coef      SE Coef   T       P
Constant    -62.851   40.85     -1.54   0.158
HEIGHT      3.2553    0.6455    6.28    0.000
Based on this analysis, the independent variable, HEIGHT, does appear to be significant in predicting a person’s weight (its P-value of 0.000 is below α = 0.05).
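The regression equation above can be reproduced with the least squares formulas, slope = Sxy / Sxx and intercept = mean(y) - slope × mean(x). This is a minimal sketch on the data from these notes. The function name `least_squares` and the 68-inch input used for the prediction are illustrative assumptions.

```python
# The 11 (HEIGHT, WEIGHT) observations from these notes.
height = [70, 63, 72, 60, 66, 70, 74, 65, 62, 67, 65]
weight = [155, 150, 180, 135, 156, 168, 178, 160, 132, 145, 139]


def least_squares(x, y):
    """Return (intercept, slope) of the least squares regression line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


b0, b1 = least_squares(height, weight)
print(f"WEIGHT = {b0:.4f} + {b1:.4f} x HEIGHT")

# Predicted weight for a hypothetical height of 68 (not a row in the data).
predicted = b0 + b1 * 68
print(round(predicted, 1))
```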
Is the Model Statistically Useful? Model Diagnostics
For this model,
S = 8.42: the standard deviation of the distances of the actual values from the fitted line (smaller is better)
R-Sq = 75.92%: this model explains 75.92% of the variation in weights
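Both diagnostics quoted for this model can be recomputed from the residuals. The sketch below assumes the usual definitions for simple linear regression: S = sqrt(SSE / (n - 2)) and R-Sq = 1 - SSE / SST. The variable names are illustrative.

```python
import math

# The 11 (HEIGHT, WEIGHT) observations from these notes.
height = [70, 63, 72, 60, 66, 70, 74, 65, 62, 67, 65]
weight = [155, 150, 180, 135, 156, 168, 178, 160, 132, 145, 139]

n = len(height)
mean_x = sum(height) / n
mean_y = sum(weight) / n

# Least squares fit.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(height, weight))
sxx = sum((x - mean_x) ** 2 for x in height)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual (error) and total sums of squares.
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(height, weight))
sst = sum((y - mean_y) ** 2 for y in weight)

s = math.sqrt(sse / (n - 2))  # n - 2 degrees of freedom: two fitted coefficients
r_sq = 1 - sse / sst

print(f"S = {s:.2f}, R-Sq = {100 * r_sq:.2f}%")
```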
H0: The model’s beta coefficients are all zero (none are useful in predicting the response variable)
Ha: At least one of the model’s beta coefficients is not zero (at least one independent variable is useful in predicting the response variable)
What is your conclusion now?
A weak correlation coefficient does not mean that the model is not useful.
Remember that the correlation coefficient only tests for linear relationships and the relationship among the variables may be curvilinear.
A low R-Sq / R-Sq (adj) does not mean that the model is not useful.
It just means that it is not complete and that other terms need to be added to make it more effective at predicting the variation in the response variable.
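To see why a weak correlation coefficient can hide a strong curvilinear relationship, consider the sketch below. The data are hypothetical (not from these notes): y is exactly x squared over a symmetric range of x, so y is fully determined by x, yet the linear correlation is zero.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: Sxy / sqrt(Sxx * Syy)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical curvilinear data: y = x^2 on a symmetric range.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]

r = pearson_r(x, y)
print(r)  # no linear correlation, although y depends entirely on x
```

A scatter plot of such data would show the curve immediately, which is why plotting before fitting is worthwhile.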
Desktop: open Minitab
File/Open Worksheet
◦ Villanova/Correlation Exercise.XLS
Graph/Scatter Plot (with regression)
Stat/Regression/Regression
◦ Response=WEIGHT, Predictor=HEIGHT