Chapter 12a Simple Linear Regression
• Simple Linear Regression Model
• Least Squares Method
• Coefficient of Determination
• Model Assumptions
Regression may be the most widely used statistical technique in the social and natural sciences, as well as in business.
Simple Linear Regression Model
The equation that describes how $y$ is related to $x$ and an error term is called the regression model.
The simple linear regression model is:
$y = \beta_0 + \beta_1 x + \varepsilon$
where:
$\beta_0$ and $\beta_1$ are called parameters of the model,
$\varepsilon$ is a random variable called the error term.
Simple Linear Regression Equation
The simple linear regression equation is:
$E(y) = \beta_0 + \beta_1 x$
• The graph of the regression equation is a straight line.
• $\beta_0$ is the $y$ intercept of the regression line.
• $\beta_1$ is the slope of the regression line.
• $E(y)$ is the expected value of $y$ for a given $x$ value.
Positive Linear Relationship
[Figure: regression line of $E(y)$ against $x$ with intercept $\beta_0$; the slope $\beta_1$ is positive.]
Negative Linear Relationship
[Figure: regression line of $E(y)$ against $x$ with intercept $\beta_0$; the slope $\beta_1$ is negative.]
No Relationship
[Figure: horizontal regression line of $E(y)$ against $x$ with intercept $\beta_0$; the slope $\beta_1$ is 0.]
Estimated Simple Linear Regression Equation
The estimated simple linear regression equation is:
$\hat{y} = b_0 + b_1 x$
• The graph is called the estimated regression line.
• $b_0$ is the $y$ intercept of the line.
• $b_1$ is the slope of the line.
• $\hat{y}$ is the estimated value of $y$ for a given $x$ value.
Estimation Process
• Regression model: $y = \beta_0 + \beta_1 x + \varepsilon$
• Regression equation: $E(y) = \beta_0 + \beta_1 x$
• Unknown parameters: $\beta_0$, $\beta_1$
• Sample data: $n$ pairs $(x_1, y_1), \ldots, (x_n, y_n)$
• Estimated regression equation: $\hat{y} = b_0 + b_1 x$
• Sample statistics: $b_0$, $b_1$
• $b_0$ and $b_1$ provide estimates of $\beta_0$ and $\beta_1$.
Least Squares Method
• Least Squares Criterion
$\min \sum (y_i - \hat{y}_i)^2$
where:
$y_i$ = observed value of the dependent variable for the $i$th observation
$\hat{y}_i$ = estimated value of the dependent variable for the $i$th observation
• Slope for the Estimated Regression Equation
$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$
• y-Intercept for the Estimated Regression Equation
$b_0 = \bar{y} - b_1 \bar{x}$
where:
$x_i$ = value of independent variable for $i$th observation
$y_i$ = value of dependent variable for $i$th observation
$\bar{x}$ = mean value for independent variable
$\bar{y}$ = mean value for dependent variable
$n$ = total number of observations
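As a minimal sketch of the slope and intercept formulas, the Python function below computes $b_1$ and $b_0$ from paired observations. The function name `least_squares` and the three-point data set are illustrative assumptions, not part of the original slides.

```python
def least_squares(xs, ys):
    """Return (b0, b1) for the estimated regression equation y-hat = b0 + b1*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    x_bar = sum(xs) / n  # mean of the independent variable
    y_bar = sum(ys) / n  # mean of the dependent variable
    # Numerator and denominator of the slope formula.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical data lying exactly on the line y = 1 + 2x.
b0, b1 = least_squares([0, 1, 2], [1, 3, 5])
print(f"b0 = {b0}, b1 = {b1}")
```

Because the hypothetical points lie exactly on a line, the fitted intercept and slope recover that line; with real data the line only minimizes the sum of squared deviations.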
Example: Reed Auto Sales
• Simple Linear Regression
Reed Auto periodically has a special week-long sale. As part of the advertising campaign, Reed runs one or more television commercials during the weekend preceding the sale. Data from a sample of 5 previous sales are shown on the next slide.
Number of TV Ads    Number of Cars Sold
       1                    14
       3                    24
       2                    18
       1                    17
       3                    27
• Slope for the Estimated Regression Equation
$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{20}{4} = 5$
• y-Intercept for the Estimated Regression Equation
$b_0 = \bar{y} - b_1 \bar{x} = 20 - 5(2) = 10$
• Estimated Regression Equation
$\hat{y} = 10 + 5x$
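The arithmetic behind these values can be retraced with a short Python sketch that computes each intermediate quantity from the Reed Auto sample. The variable names and the helper `predict` are illustrative assumptions.

```python
# Reed Auto sample: TV ads run and cars sold for 5 previous sales.
tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]

n = len(tv_ads)
x_bar = sum(tv_ads) / n     # mean number of TV ads
y_bar = sum(cars_sold) / n  # mean number of cars sold

# Sum of cross-products and sum of squared x deviations.
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(tv_ads, cars_sold))
sxx = sum((x - x_bar) ** 2 for x in tv_ads)

b1 = sxy / sxx           # slope
b0 = y_bar - b1 * x_bar  # y intercept


def predict(x):
    """Estimated number of cars sold for x TV ads."""
    return b0 + b1 * x


print(f"x_bar = {x_bar}, y_bar = {y_bar}")
print(f"sum of cross-products = {sxy}, sum of squared x deviations = {sxx}")
print(f"b1 = {b1}, b0 = {b0}")
```

The helper `predict` evaluates the estimated regression equation; it is best used for values of $x$ within the range of the sample (1 to 3 ads).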
Using Excel to Develop a Scatter Diagram and Compute the Estimated Regression Equation
Formula Worksheet (showing data)
     A       B        C
1    Week    TV Ads   Cars Sold
2    1       1        14
3    2       3        24
4    3       2        18
5    4       1        17
6    5       3        27
Producing a Scatter Diagram
Step 1 Select cells B1:C6
Step 2 Select the Chart Wizard
Step 3 When the Chart Type dialog box appears:
    Choose XY (Scatter) in the Chart type list
    Choose Scatter from the Chart sub-type display
    Click Next >
Step 4 When the Chart Source Data dialog box appears:
    Click Next >
Step 5 When the Chart Options dialog box appears:
    Select the Titles tab and then
        Delete Cars Sold in the Chart title box
        Enter TV Ads in the Value (X) axis box
        Enter Cars Sold in the Value (Y) axis box
    Select the Legend tab and then
        Remove the check in the Show Legend box
    Click Next >
Step 6 When the Chart Location dialog box appears:
    Specify the location for the new chart
    Select Finish to display the scatter diagram
Adding the Trendline
Step 1 Position the mouse pointer over any data point and right click to display the Chart menu
Step 2 Choose the Add Trendline option
Step 3 When the Add Trendline dialog box appears:
    On the Type tab select Linear
    On the Options tab select the Display equation on chart box
    Click OK
Scatter Diagram and Trend Line
[Figure: scatter diagram of Cars Sold (vertical axis, 0 to 30) against TV Ads (horizontal axis, 0 to 4) with the fitted trend line y = 5x + 10.]
Coefficient of Determination
• Relationship Among SST, SSR, SSE
SST = SSR + SSE
$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$
where:
SST = total sum of squares
SSR = sum of squares due to regression
SSE = sum of squares due to error
The coefficient of determination is:
$r^2 = \text{SSR}/\text{SST}$
where:
SSR = sum of squares due to regression
SST = total sum of squares
$r^2 = \text{SSR}/\text{SST} = 100/114 = .8772$
The regression relationship is very strong; 88% of the variability in the number of cars sold can be explained by the linear relationship between the number of TV ads and the number of cars sold.
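The sums of squares behind this result can be checked with a short Python sketch. It takes $b_0 = 10$ and $b_1 = 5$ from the estimated regression equation for the Reed Auto sample; the variable names are illustrative assumptions.

```python
# Reed Auto sample and the estimated regression equation y-hat = 10 + 5x.
tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
b0, b1 = 10, 5

y_bar = sum(cars_sold) / len(cars_sold)
y_hat = [b0 + b1 * x for x in tv_ads]  # estimated values for each observation

sst = sum((y - y_bar) ** 2 for y in cars_sold)               # total sum of squares
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)                 # due to regression
sse = sum((y - yh) ** 2 for y, yh in zip(cars_sold, y_hat))  # due to error

r_squared = ssr / sst

print(f"SST = {sst}, SSR = {ssr}, SSE = {sse}")
print(f"r^2 = {r_squared:.4f}")
```

Computing SSE separately also confirms the identity SST = SSR + SSE for this sample.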
Using Excel to Compute the Coefficient of Determination
Producing $r^2$
Step 1 Position the mouse pointer over any data point in the scatter diagram and right click
Step 2 When the Chart menu appears:
    Choose the Add Trendline option
Step 3 When the Add Trendline dialog box appears:
    On the Options tab, select the Display R-squared value on chart box
    Click OK
Value Worksheet (showing $r^2$)
[Figure: scatter diagram of Cars Sold (vertical axis, 0 to 30) against TV Ads (horizontal axis, 0 to 4) with the trend line y = 5x + 10 and R2 = 0.8772 displayed on the chart.]
Sample Correlation Coefficient
$r_{xy} = (\text{sign of } b_1)\sqrt{\text{Coefficient of Determination}}$
$r_{xy} = (\text{sign of } b_1)\sqrt{r^2}$
where:
$b_1$ = the slope of the estimated regression equation $\hat{y} = b_0 + b_1 x$
The sign of $b_1$ in the equation $\hat{y} = 10 + 5x$ is "+".
$r_{xy} = +\sqrt{.8772}$
$r_{xy} = +.9366$
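As a rough check, the sketch below computes $r_{xy}$ two ways for the Reed Auto sample: from the sign of $b_1$ and $r^2$, as on the slide, and directly from the deviations about the means. The direct formula is the usual sample correlation formula and is an addition for comparison, not something shown in the slides; the variable names are illustrative.

```python
import math

tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
b1 = 5               # slope of the estimated regression equation
r_squared = 100 / 114  # SSR / SST

# Slide formula: attach the sign of b1 to the square root of r^2.
r_xy = math.copysign(math.sqrt(r_squared), b1)

# Direct computation from deviations about the means, for comparison.
n = len(tv_ads)
x_bar = sum(tv_ads) / n
y_bar = sum(cars_sold) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(tv_ads, cars_sold))
sxx = sum((x - x_bar) ** 2 for x in tv_ads)
syy = sum((y - y_bar) ** 2 for y in cars_sold)
r_direct = sxy / math.sqrt(sxx * syy)

print(f"r_xy from r^2: {r_xy:.4f}")
print(f"r_xy directly: {r_direct:.4f}")
```

The two results agree for simple linear regression because the slope and the correlation coefficient always share the same sign.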
Assumptions About the Error Term $\varepsilon$
1. The error $\varepsilon$ is a random variable with mean of zero.
2. The variance of $\varepsilon$, denoted by $\sigma^2$, is the same for all values of the independent variable.
3. The values of $\varepsilon$ are independent.
4. The error $\varepsilon$ is a normally distributed random variable.
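To make the four assumptions concrete, the sketch below simulates data from the model $y = \beta_0 + \beta_1 x + \varepsilon$ with independent, normally distributed errors that have mean zero and the same standard deviation for every $x$, and then fits a line by least squares. The parameter values, seed, and sample size are hypothetical choices for illustration only.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical "true" parameters and error standard deviation.
beta0, beta1, sigma = 10.0, 5.0, 2.0

xs = [1, 2, 3] * 1000
# Independent normal errors: mean 0, same sigma regardless of x.
errors = [rng.gauss(0.0, sigma) for _ in xs]
ys = [beta0 + beta1 * x + e for x, e in zip(xs, errors)]

mean_error = sum(errors) / len(errors)

# Least squares estimates from the simulated sample.
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
b0 = y_bar - b1 * x_bar

print(f"mean of simulated errors: {mean_error:.3f}")
print(f"estimates: b0 = {b0:.3f}, b1 = {b1:.3f}")
```

With a sample this large, the estimates should land close to the chosen parameter values, and the average simulated error should be near zero.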