SOA Exam SRM
Flashcards
Runhuan Feng, PhD, FSA, CERA
Daniël Linders, PhD
Ambrose Lo, PhD, FSA, CERA

Learning & Memorizing Key Topics and Formulas
Spring 2019 Edition
ACTEX Learning
Copyright © 2019 ACTEX Learning, a division of SRBooks, Inc.

No portion may be reproduced or transmitted in any part, or by any means, without the permission of the publisher.
ISBN 978-1-63588-708-2
Printed in the United States of America
ACTEX is committed to making continuous improvements to our study material. We thus invite you to provide us with a critique of these flashcards.

Publication: ACTEX SOA SRM Flashcards, Spring 2019 Edition

In preparing for my exam I found this material: (Check one)
____ Very Good  ____ Good  ____ Satisfactory  ____ Unsatisfactory

I found the following helpful: ________________________________________

I found the following problems (Please be specific as to section, specific item, and/or page number): ________________________________________
Please continue on the other side of this card
To improve these flashcards I would: __________________________________

Name: ____________________________________________________
Address: __________________________________________________
Phone: ____________________________________________________
E-mail: ___________________________________________________
(Please provide this information in case clarification is needed)
Send to Stephen Camilli ACTEX Learning
PO Box 715 New Hartford CT 06057
Or visit our website at www.ActexMadRiver.com to complete the survey on-line. Click on the "Send Us Feedback" link to access the online version.

You can also e-mail your comments to Support@ActexMadRiver.com.
Preface
This set of flashcards is meant to complement the ACTEX Study Manual for SOA Exam SRM (Statistics for Risk Modeling). Fully revised in response to the May 2019 edition of the SRM study manual, these flashcards provide a concise summary of the SRM exam material in a readable and presentation-oriented format, with a view to maximizing retention. Important formulas are displayed to facilitate identification and memorization. Suggestions are given as to which formulas, in our opinion, must be memorized, which formulas are important but can be easily deduced from other results, and which formulas are of secondary importance. The flashcards are particularly suitable for last-minute review, so don't forget to take them with you on your way to the CBT exam center!
It should be noted, however, that these flashcards add value to, but are no substitute for, reading the SRM study manual. Examples and problems, which are key to exam success, are not included or discussed in these flashcards. We suggest that you first read the manual carefully, go over the in-text examples and (most of the) end-of-chapter problems, then use the flashcards as a means to review what you have learned and to ensure that you have mastered all of the key concepts.
As with the SRM study manual, we would be extremely grateful if you could share your comments and suggestions on these flashcards with us, and bring to our attention any potential errors. Please direct your comments and questions to ambrose-lo@uiowa.edu. The authors will try their best to respond to any inquiries as soon as possible, and an ongoing list of updates will be maintained online at https://sites.google.com/site/ambroseloyp/publications/SRM.

We wish you the best of luck with your SRM exam!

Runhuan Feng
Daniel Linders
Ambrose Lo

February 2019
Part I
Regression Models
Chapter 1
Simple Linear Regression
1.1 Basics
• Simple linear regression (SLR) model equation: An approximately linear relationship between y and x,

$$\underbrace{y}_{\text{response}} = \underbrace{\beta_0 + \beta_1 x}_{\text{regression function}} + \underbrace{\varepsilon}_{\text{error}},$$
where

  – $y$ is the response variable (aka dependent variable),
  – $x$ is the explanatory variable (aka predictor, feature),
  – $\beta_0$ (intercept) and $\beta_1$ (slope) are the regression coefficients,
  – $\varepsilon$ is the random error term.

In the above model, we say that y is regressed on x (denoted $y \sim x$).
• Defining property of SLR: There is only one explanatory variable, namely x.
ACTEX Learning © 2019 | Chapter 1: Simple Linear Regression
• Model assumptions:

  A1. The $y_i$'s are realizations of random variables, while the $x_i$'s are nonrandom.

  A2. $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$ are independent with

  $$E[\varepsilon_i] = 0 \quad \text{and} \quad Var(\varepsilon_i) = \sigma^2$$

  for all $i = 1, 2, \ldots, n$. Almost always, we further assume that the $\varepsilon_i$'s are normally distributed, i.e.,

  $$\varepsilon_i \overset{iid}{\sim} N(0, \sigma^2).$$
1.2 Model Fitting by Least Squares Method
• Idea of least squares method: Choose $\beta_0$ and $\beta_1$ to make the sum of squares

$$SS(\beta_0, \beta_1) = \sum_{i=1}^{n} \Big[\underbrace{y_i}_{\text{obs. value}} - \big(\underbrace{\beta_0 + \beta_1 x_i}_{\text{candidate fitted value}}\big)\Big]^2$$

the "least".
• Least squares estimates (LSEs):

$$\hat\beta_1 = \frac{S_{xy}}{S_{xx}} = \frac{\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^n (x_i - \bar{x})^2} \quad \text{and} \quad \hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x},$$

where

  – $S_{xy} = \sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^n x_i y_i - n\bar{x}\bar{y}$
  – $S_{xx} = \sum_{i=1}^n (x_i - \bar{x})^2 = \sum_{i=1}^n x_i^2 - n\bar{x}^2$

(Suggestion: Remember the formulas for $\hat\beta_0$ and $\hat\beta_1$.)
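These LSE formulas are easy to sanity-check numerically. A minimal Python sketch (the data and the helper name `slr_fit` are made up for illustration, not from the manual):

```python
# Least squares estimates via the S_xy / S_xx formulas above.
def slr_fit(x, y):
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx            # slope estimate
    b0 = y_bar - b1 * x_bar     # intercept estimate
    return b0, b1

# Illustrative data (n = 5)
b0, b1 = slr_fit([1, 2, 3, 4, 5], [2, 3, 5, 4, 6])
print(round(b0, 6), round(b1, 6))  # 1.3 0.9
```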
• How can the calculation of LSEs be tested?

  Case 1. Given the raw data $\{(x_i, y_i)\}_{i=1}^n$ with a relatively small $n$ (e.g., $n \le 10$):

    Enter the data into your financial calculator and read the output from its statistics functions.

  Case 2. Given summarized data in the form of various sums, e.g.,

  $$\sum_{i=1}^n x_i, \quad \sum_{i=1}^n y_i, \quad \sum_{i=1}^n x_i^2, \quad \sum_{i=1}^n y_i^2, \quad \sum_{i=1}^n x_i y_i:$$

    Expand the products in the two sums that appear in $\hat\beta_1$ and use the alternative form

  $$\hat\beta_1 = \frac{\sum_{i=1}^n x_i y_i - n\bar{x}\bar{y}}{\sum_{i=1}^n x_i^2 - n\bar{x}^2}.$$
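The Case 2 route can be sketched directly from the sums. A hedged Python example (the sums below are hypothetical, chosen only so the arithmetic is easy to follow):

```python
# Slope estimate from summarized sums (Case 2 above):
# b1 = (sum x_i y_i - n*xbar*ybar) / (sum x_i^2 - n*xbar^2)
def slope_from_sums(n, sum_x, sum_y, sum_xx, sum_xy):
    x_bar, y_bar = sum_x / n, sum_y / n
    return (sum_xy - n * x_bar * y_bar) / (sum_xx - n * x_bar ** 2)

# Hypothetical sums with n = 5: sum x = 15, sum y = 20,
# sum x^2 = 55, sum xy = 69
print(slope_from_sums(5, 15, 20, 55, 69))  # 0.9
```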
• An alternative formula for $\hat\beta_1$ in terms of the sample correlation:

$$\hat\beta_1 = r \times \frac{s_y}{s_x} \quad \left(\text{Warning: Not } r \times \frac{s_x}{s_y}!\right),$$

where

  – $s_x$ and $s_y$ are the sample standard deviations of x and y,
  – $r$ is the sample correlation coefficient between x and y.

• Application of this formula: Slope estimates when regressing y on x and regressing x on y are related via

$$\hat\beta_1^{\,y \sim x} \times \hat\beta_1^{\,x \sim y} = r^2 = \underbrace{R^2}_{\text{see Sect. 1.3}}.$$
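The slope-product identity can be verified numerically. A short Python sketch (illustrative data; everything is computed from the definitions above):

```python
import math

x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

r = s_xy / math.sqrt(s_xx * s_yy)   # sample correlation
s_x = math.sqrt(s_xx / (n - 1))     # sample standard deviations
s_y = math.sqrt(s_yy / (n - 1))

b1_y_on_x = r * s_y / s_x           # slope when regressing y on x
b1_x_on_y = r * s_x / s_y           # slope when regressing x on y

# Product of the two slopes recovers r^2 (= R^2)
print(round(b1_y_on_x * b1_x_on_y, 6), round(r ** 2, 6))
```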
• Fitted values and residuals: Given $\hat\beta_0$ and $\hat\beta_1$, we can compute:

  1. The fitted value (aka predicted value) $\hat{y}_i = \hat\beta_0 + \hat\beta_1 x_i$.

     Mnemonic: Obtained from the model equation by $\beta_0 \to \hat\beta_0$, $\beta_1 \to \hat\beta_1$, $\varepsilon_i \to 0$. Ideally, $\hat{y}_i$ should be close to $y_i$.

  2. The residual $e_i = y_i - \hat{y}_i$. (Memory alert: Not $\hat{y}_i - y_i$!) Completely different from $\varepsilon_i$, which is unobservable and which $e_i$ serves to approximate.
• Graphical illustration of fitted regression line and definitions of fitted value and residual:

[Figure: scatter plot of the data with the fitted regression line $\hat{y} = \hat\beta_0 + \hat\beta_1 x$, which has slope $\hat\beta_1$ and intercept $\hat\beta_0$; the vertical distance between an observed point $(x_i, y_i)$ and the point on the line $(x_i, \hat{y}_i)$ is the residual $e_i = y_i - \hat{y}_i$.]
• Sum-to-zero constraints on residuals:

  1. $\sum_{i=1}^n e_i = 0$, provided that $\beta_0$ is included in the model.

     Meaning: The residuals offset one another to produce a zero sum; they are negatively correlated.

  2. $\sum_{i=1}^n x_i e_i = 0$.

     Meaning: The residuals and the explanatory variable values are uncorrelated.

  Mnemonic: $\hat\beta_0$ and $\hat\beta_1$ satisfy

  $$\frac{\partial}{\partial \beta_0} SS(\beta_0, \beta_1) = -2 \sum_{i=1}^n \overbrace{[y_i - (\beta_0 + \beta_1 x_i)]}^{e_i} = 0,$$

  $$\frac{\partial}{\partial \beta_1} SS(\beta_0, \beta_1) = -2 \sum_{i=1}^n x_i \underbrace{[y_i - (\beta_0 + \beta_1 x_i)]}_{e_i} = 0.$$
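Both constraints can be checked on any fitted SLR. A minimal Python sketch (illustrative data; the checks hold up to floating-point noise):

```python
# Verify the two sum-to-zero residual constraints numerically.
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
b0 = y_bar - b1 * x_bar

e = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]   # residuals

sum_e = sum(e)                                      # constraint 1
sum_xe = sum(xi * ei for xi, ei in zip(x, e))       # constraint 2
print(abs(sum_e) < 1e-10, abs(sum_xe) < 1e-10)      # True True
```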
1.3 Assessing Goodness of Fit of the Model
• Three kinds of sums of squares:

| Sum of Squares | Abbrev. | Definition | What Does It Measure? |
|---|---|---|---|
| Total SS | TSS | Variation of response values about $\bar{y}$ | Amount of variability inherent in the response prior to performing regression |
| Residual SS (or Error SS) | RSS | Variation of response values about the fitted regression line | Goodness of fit of the SLR model (the lower, the better); amount of variability of the response left unexplained even after the introduction of x |
| Regression SS | Reg SS | Variation explained by SLR (or the knowledge of x) | How effective the SLR model is in explaining the variation in y |
• ANOVA identity:

$$\underbrace{\sum_{i=1}^n (y_i - \bar{y})^2}_{\text{TSS}} = \underbrace{\sum_{i=1}^n (y_i - \hat{y}_i)^2}_{\text{RSS}} + \underbrace{\sum_{i=1}^n (\hat{y}_i - \bar{y})^2}_{\text{Reg SS}}$$
• Coefficient of determination:

  – Definition: $R^2 = \dfrac{\text{Reg SS}}{\text{TSS}} = 1 - \dfrac{\text{RSS}}{\text{TSS}}$

  – Measures the proportion of variation of the response (about its mean) explained by the SLR model.

  – The higher, the better.

• Specialized formulas for Reg SS and $R^2$ under SLR:

  – Reg SS $= \hat\beta_1^2 S_{xx}$
  – $R^2 = r^2$ (square of the sample correlation between x and y)
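The ANOVA identity and the specialized Reg SS formula can be confirmed on a small fit. A Python sketch with illustrative data:

```python
# Check TSS = RSS + Reg SS and compute R^2 for a small fitted SLR.
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

tss = sum((yi - y_bar) ** 2 for yi in y)
rss = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
reg_ss = b1 ** 2 * s_xx                        # specialized SLR formula

print(round(tss, 6), round(rss + reg_ss, 6))   # ANOVA identity holds
print(round(1 - rss / tss, 6))                 # R^2
```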
• ANOVA table:

| Source | Sum of Squares | df | Mean Square | F-value |
|---|---|---|---|---|
| Regression | Reg SS | 1 | Reg SS/1 | F |
| Error | RSS | n − 2 | s² = RSS/(n − 2) | |
| Total | TSS | n − 1 | | |

  Structure:

  – Different sources of variation in y.
  – Some "informal" rules for counting df:
    – Reg SS has 1 df because of one explanatory variable.
    – RSS has 2 df subtracted from n because of two parameters, $\beta_0$ and $\beta_1$.
  – Dividing an SS by its df results in a mean square (MS).
• Mean square error:

$$s^2 = \frac{\text{RSS}}{\text{df of RSS}} = \frac{\sum_{i=1}^n e_i^2}{n-2}$$

  – $s = \sqrt{s^2}$ is the residual standard deviation, or residual standard error.
• F-test:

  – Hypotheses: $H_0: \beta_1 = 0$ (i.i.d. model) vs. $H_a: \beta_1 \ne 0$ (SLR model); a test of the significance/usefulness of x in explaining y.

  – F-statistic:

  $$F = \frac{\text{Reg SS}/(\text{df of Reg SS})}{\text{RSS}/(\text{df of RSS})} = \frac{\text{Reg SS}/1}{\text{RSS}/(n-2)}$$

  – Behavior of the F-statistic:
    – Under $H_0$: Expected value close to one.
    – Under $H_a$: Tends to be large.
• F-test (cont.): Going between the F-statistic and $R^2$:

$$F = (n-2)\left(\frac{\text{Reg SS}/\text{TSS}}{\text{RSS}/\text{TSS}}\right) = (n-2)\left(\frac{R^2}{1-R^2}\right)$$

(Mnemonic: Divide both the numerator and denominator of the F-statistic by TSS to get $R^2$.)
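The two routes to F agree, as a quick numerical check shows. A Python sketch using illustrative values (n = 5, TSS = 10, RSS = 1.9 are hypothetical numbers, not from the manual):

```python
# F-statistic computed two ways: from the ANOVA table, and from R^2.
n = 5
tss, rss = 10.0, 1.9
reg_ss = tss - rss                          # ANOVA identity

f_anova = (reg_ss / 1) / (rss / (n - 2))    # Reg SS/1 over RSS/(n-2)
r2 = reg_ss / tss
f_r2 = (n - 2) * r2 / (1 - r2)              # (n-2) * R^2 / (1 - R^2)
print(round(f_anova, 4), round(f_r2, 4))    # both ~ 12.79
```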
1.4 Statistical Inference about $\beta_0$ and $\beta_1$
• Sampling distributions of $\hat\beta_0$ and $\hat\beta_1$:

  – Linear combination formulas:

  $$\hat\beta_1 = \sum_{i=1}^n w_i y_i, \quad \text{where } w_i = \frac{x_i - \bar{x}}{S_{xx}},$$

  $$\hat\beta_0 = \sum_{i=1}^n w_{i0} y_i, \quad \text{where } w_{i0} = \frac{1}{n} - \bar{x} w_i.$$

  (Suggestion: Remembering these weights is recommended, but not absolutely essential.)
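The weighted-sum representations reproduce the direct LSEs, which can be confirmed numerically. A Python sketch with illustrative data:

```python
# Verify the linear-combination formulas for b1 and b0.
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]
n = len(x)
x_bar = sum(x) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)

w = [(xi - x_bar) / s_xx for xi in x]          # weights w_i for b1
b1 = sum(wi * yi for wi, yi in zip(w, y))

w0 = [1 / n - x_bar * wi for wi in w]          # weights w_i0 for b0
b0 = sum(wi0 * yi for wi0, yi in zip(w0, y))

# Matches the direct least squares estimates for these data
print(round(b0, 6), round(b1, 6))  # 1.3 0.9
```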
• Sampling distributions of $\hat\beta_0$ and $\hat\beta_1$ (cont.):

  – Unbiasedness: $E[\hat\beta_j] = \beta_j$ for $j = 0, 1$.

  – Variances:

  $$Var(\hat\beta_0) = \sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right) \quad \text{and} \quad Var(\hat\beta_1) = \frac{\sigma^2}{S_{xx}}$$

  (Suggestion: Remember these two formulas.)

  – Estimated variances: With $\sigma^2 \to s^2$ (the MSE),

  $$\widehat{Var}(\hat\beta_0) = s^2\left(\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right) \quad \text{and} \quad \widehat{Var}(\hat\beta_1) = \frac{s^2}{S_{xx}}.$$
• t-test:

  – Hypotheses: $H_0: \beta_j = d$ vs. $H_a: \beta_j \ne d$ (or $\beta_j > d$, or $\beta_j < d$). Important special case: $d = 0$, i.e., testing whether x is useful.

  – t-statistic:

  $$t(\hat\beta_j) = \frac{\text{LSE} - \text{hypothesized value}}{\text{standard error of LSE}} = \frac{\hat\beta_j - d}{SE(\hat\beta_j)}$$

  – Null distribution: $t(\hat\beta_j) \overset{H_0}{\sim} t_{n-2}$.

  – Decision rules and p-values (with $t$ the observed value of $t(\hat\beta_j)$):
    – $H_a: \beta_j \ne d$: reject if $|t(\hat\beta_j)| > t_{n-2,\alpha/2}$; p-value $= P(|t_{n-2}| > |t|) = 2P(t_{n-2} > |t|)$.
    – $H_a: \beta_j > d$: reject if $t(\hat\beta_j) > t_{n-2,\alpha}$; p-value $= P(t_{n-2} > t)$.
    – $H_a: \beta_j < d$: reject if $t(\hat\beta_j) < -t_{n-2,\alpha}$; p-value $= P(t_{n-2} < t)$.
• Confidence intervals (CIs) for $\beta_0$ and $\beta_1$: The general structure is

$$\text{LSE} \pm t\text{-quantile} \times \text{standard error} = \hat\beta_j \pm t_{n-2,\alpha/2} \times SE(\hat\beta_j).$$

  – E.g., $\hat\beta_1 \pm t_{n-2,\alpha/2} \times SE(\hat\beta_1)$ is the CI for $\beta_1$.
  – Construction requires the formulas for $SE(\hat\beta_0)$ and $SE(\hat\beta_1)$.

• Relationship between the F-test and the t-test for $H_0: \beta_1 = 0$: There is a direct connection between the test statistics,

$$F = t(\hat\beta_1)^2.$$

  – Importance: Connects information about $\beta_1$ (captured by $t(\hat\beta_1)$) with information about the whole model (captured by $F$).
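The identity $F = t(\hat\beta_1)^2$ can be verified on a small fit. A Python sketch with illustrative data (all quantities computed from the formulas in this chapter):

```python
import math

# Compute the t-statistic for H0: beta_1 = 0 and the F-statistic,
# then confirm F = t^2.
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
b0 = y_bar - b1 * x_bar

rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s2 = rss / (n - 2)                 # mean square error
se_b1 = math.sqrt(s2 / s_xx)       # SE(b1)

t_stat = b1 / se_b1                # t-statistic for H0: beta_1 = 0
f_stat = (b1 ** 2 * s_xx) / s2     # F = (Reg SS / 1) / MSE
print(round(t_stat ** 2, 6), round(f_stat, 6))  # equal: F = t^2
```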
1.5 Prediction
• Target (random variable):

$$y_* = \beta_0 + \beta_1 x_* + \varepsilon_*,$$

where $x_*$ is the explanatory variable value of interest.

• Generic setting:

  – Observed (past) data: $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, i.e., the responses together with the known values of the explanatory variable.
  – Unobserved (future) data: the target $y_*$, to be predicted from $x_*$.
• Point predictor: $\hat{y}_* = \hat\beta_0 + \hat\beta_1 x_*$.

  (Mnemonic: Set $\beta_0 \to \hat\beta_0$, $\beta_1 \to \hat\beta_1$, and $\varepsilon_* \to 0$; same trick as fitted values.)

• 100(1 − α)% prediction interval:

$$\text{point predictor} \pm t\text{-quantile} \times \text{std. error of prediction error}$$
$$= \hat{y}_* \pm t_{n-2,\alpha/2} \times SE(y_* - \hat{y}_*)$$
$$= (\hat\beta_0 + \hat\beta_1 x_*) \pm t_{n-2,\alpha/2} \sqrt{s^2\left[1 + \frac{1}{n} + \frac{(x_* - \bar{x})^2}{S_{xx}}\right]}$$

(Suggestion: Remember this formula.)
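Putting the pieces together, the prediction interval can be sketched end to end. A hedged Python example (illustrative data; the quantile 3.182 is taken as an assumed constant for $t_{3,0.025}$, not computed here):

```python
import math

# 95% prediction interval at x_star, from the formula above.
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
b0 = y_bar - b1 * x_bar
s2 = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)

x_star = 4.0
y_star_hat = b0 + b1 * x_star          # point predictor
t_q = 3.182                            # assumed t_{n-2, alpha/2}, alpha = 0.05
half = t_q * math.sqrt(s2 * (1 + 1 / n + (x_star - x_bar) ** 2 / s_xx))
print(round(y_star_hat - half, 3), round(y_star_hat + half, 3))
```

Note how the half-width grows with $(x_* - \bar{x})^2$: predictions far from $\bar{x}$ come with wider intervals.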
• Remarks on the structure of the prediction interval: Two sources of uncertainty are associated with prediction:

$$\widehat{Var}(y_* - \hat{y}_*) = \underbrace{s^2}_{(1)} + \underbrace{s^2\left[\frac{1}{n} + \frac{(x_* - \bar{x})^2}{S_{xx}}\right]}_{(2)}$$

  (1) Variability of the random error $\varepsilon_*$: reflected in the extra $s^2$.

  (2) Estimation of the true regression line at $x_*$:
    – $\hat\beta_0$ and $\hat\beta_1$ are only estimates of $\beta_0$ and $\beta_1$, and are subject to sampling fluctuations.
    – The variance of the prediction error is minimized when $x_* = \bar{x}$ and increases quadratically as $x_*$ moves away from $\bar{x}$.
The higher the better
bull Specialized formulas for Reg SS and R2 under SLR
Reg SS = β21Sxx
R2 = r2 = Corr(x y)2 (square of correlation between x and y)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash14
bull ANOVA table
Source Sum of Squares df Mean Square F -value
Regression Reg SS 1 Reg SS1
Error RSS nminus 2 s2 = RSS(nminus 2)
Total TSS nminus 1
Structure
Different sources of variation in y
Some ldquoinformalrdquo rules for counting df
ndash Reg SS has 1 df because of one explanatory variable
ndash RSS has 2 df subtracted from n because of two parameters β0
and β1
Dividing an SS by its df results in a mean square (MS)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash15
bull Mean square error
s2 =RSS
df of RSS=
sumni=1 e
2i
nminus 2
s =radics2 is the residual standard deviation or residual standard error
bull F -test
HypothesesH0 β1 = 0︸ ︷︷ ︸
iid model
vs Ha β1 6= 0︸ ︷︷ ︸SLR model
a test of the significanceusefulness of x in explaining y
F -statistic
F =Reg SS(df of Reg SS)
RSS(df of RSS)=
Reg SS1
RSS(nminus 2)
Behavior of F -statistic
ndash H0 Expected value close to one
ndash Ha Tends to be large
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash16
bull F -test (Cont)
Going between F -statistic and R2
F = (nminus 2)
(Reg SSTSS
RSSTSS
)= (nminus 2)
(R2
1minusR2
)(Mnemonic Divide both the numerator and denominator of the F -statistic by TSS to get R2)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash17
14 Statistical Inference about β0 and β1
bull Sampling distributions of β0 and β1
Linear combination formulas
β1 =
nsumi=1
wiyi where wi =xi minus xSxx
β0 =
nsumi=1
wi0yi where wi0 =1
nminus xwi
(Suggestion Remembering these weights is recommended but notabsolutely essential)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash18
bull Sampling distributions of β0 and β1 (Cont)
Unbiased E[βj ] = βj for j = 0 1
Variances
Var(β0) = σ2
(1
n+
x2
Sxx
)and Var(β1) =
σ2
Sxx
(Suggestion Remember these two formulas)
Estimated variances With σ2 rarr s2 (MSE)
Var(β0) = s2
(1
n+
x2
Sxx
)and Var(β1) =
s2
Sxx
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash19
bull t-test
Hypotheses H0 βj = d vs Ha
βj 6= d
βj gt d
βj lt d
Important special case d = 0 (ie to test if x is useful)
t-statistic
t(βj) =LSEminus hypothesized value
standard error of LSE=
βj minus dSE(βj)
Null distribution t(βj)H0sim tnminus2
Decision rules and p-value
Ha Decision Rule p-value (t is the observed value of t(βj))
βj 6= d |t(βj)| gt tnminus2α2 P(|tnminus2| gt |t|) = 2P(tnminus2 gt |t|)βj gt d t(βj) gt tnminus2α P(tnminus2 gt t)
βj lt d t(βj) lt minustnminus2α P(tnminus2 lt t)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash20
bull Confidence intervals (CIs) for β0 and β1 General structure is
LSEplusmn t-quantiletimes Standard error = βj plusmn tnminus2α2 times SE(βj)
Eg β1 plusmn tnminus2α2 times SE(β1) is the CI for β1
Construction requires formulas of SE(β0) and SE(β1)
bull Relationship between F -test and t-test for H0 β1 = 0
Direct connection between test statistics
F = t(β1)2
Importance Connect
information about β1 (captured by t(β1))
with
information about the whole model (captured by F )
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash21
15 Prediction
bull Target (random variable)
ylowast = β0 + β1xlowast + εlowast
where xlowast is explanatory variable value of interest
bull Generic setting
response known values of explanatory variables
y x
y1 x1
observed
(past) data
y2 x2
yn xn
Unobserved
(future) data
ylowast (target) larr xlowast
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash22
bull Point predictorylowast = β0 + β1xlowast
(Mnemonic Set β0 rarr β0 β1 rarr β1 and εlowast rarr 0 same trick as fittedvalues)
bull 100(1minusα) prediction interval
point predictorplusmn t-quantiletimes st error of prediction error
= ylowast plusmn tnminus2α2 times SE(ylowast minus ylowast)
= (β0 + β1xlowast)plusmn tnminus2α2
radics2
[1 +
1
n+
(xlowast minus x)2
Sxx
](Suggestion Remember this formula)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash23
bull Remarks on the structure of the prediction interval Two sourcesof uncertainty associated with prediction
Var(ylowast minus ylowast) = s2︸ ︷︷ ︸1copy
+ s2
[1
n+
(xlowast minus x)2
Sxx
]︸ ︷︷ ︸
2copy
1copy Variability of the random error εlowast Reflected in the extra s2
2copy Estimation of the true regression line at xlowast
β0 and β1 are only estimates of β0 and β1 and are subject tosampling fluctuations
Variance of prediction error minimized when xlowast = x and in-creases quadratically as xlowast moves away from x
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
It should be noted, however, that these flashcards add value to, but are no substitute for, reading the SRM study manual. Examples and problems, which are key to exam success, are not included or discussed in these flashcards. We suggest that you first read the manual carefully, go over the in-text examples and (most of the) end-of-chapter problems, then use the flashcards as a means to review what you have learned and to ensure that you have mastered all of the key concepts.
As with the SRM study manual, we would be extremely grateful if you could share your comments and suggestions on these flashcards with us and bring to our attention any potential errors. Please direct your comments and questions to ambrose-lo@uiowa.edu. The authors will try their best to respond to any inquiries as soon as possible, and an ongoing list of updates will be maintained online at https://sites.google.com/site/ambroseloyp/publications/SRM.

We wish you the best of luck with your SRM exam.

Runhuan Feng
Daniel Linders
Ambrose Lo

February 2019
Part I
Regression Models
Chapter 1
Simple Linear Regression
1.1 Basics

• Simple linear regression (SLR) model equation: an approximately linear relationship between y and x,
  \[ \underbrace{y}_{\text{response}} = \underbrace{\beta_0 + \beta_1 x}_{\text{regression function}} + \underbrace{\varepsilon}_{\text{error}}, \]
  where
  ◦ y is the response variable (a.k.a. dependent variable),
  ◦ x is the explanatory variable (a.k.a. predictor, feature),
  ◦ β0 (intercept) and β1 (slope) are the regression coefficients,
  ◦ ε is the random error term.
  In the above model, we say that y is regressed on x (denoted y ~ x).

• Defining property of SLR: There is only one explanatory variable, namely x.
ACTEX Learning © 2019, CHAPTER 1. SIMPLE LINEAR REGRESSION
• Model assumptions:

  A1. The yi's are realizations of random variables, while the xi's are nonrandom.

  A2. ε1, ε2, ..., εn are independent, with
      \[ \mathrm{E}[\varepsilon_i] = 0 \quad \text{and} \quad \mathrm{Var}(\varepsilon_i) = \sigma^2 \quad \text{for all } i = 1, 2, \ldots, n. \]
      Almost always, we further assume that the εi's are normally distributed, i.e.,
      \[ \varepsilon_i \overset{\text{iid}}{\sim} N(0, \sigma^2). \]
1.2 Model Fitting by Least Squares Method

• Idea of the least squares method: choose β̂0 and β̂1 to make the sum of squares
  \[ SS(\beta_0, \beta_1) = \sum_{i=1}^n \big[\underbrace{y_i}_{\text{obs. value}} - (\underbrace{\beta_0 + \beta_1 x_i}_{\text{candidate fitted value}})\big]^2 \]
  the "least."

• Least squares estimates (LSEs):
  \[ \hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} = \frac{\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^n (x_i - \bar{x})^2} \quad \text{and} \quad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}, \]
  where
  \[ S_{xy} = \sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^n x_i y_i - n\bar{x}\bar{y}, \qquad S_{xx} = \sum_{i=1}^n (x_i - \bar{x})^2 = \sum_{i=1}^n x_i^2 - n\bar{x}^2. \]

  (Suggestion: Remember the formulas for β̂0 and β̂1.)
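The LSE formulas can be sanity-checked numerically. A minimal sketch in Python, using a made-up five-point data set:

```python
# Toy data (hypothetical values, for illustration only)
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# S_xy and S_xx as defined on this card
S_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
S_xx = sum((xi - x_bar) ** 2 for xi in x)

b1 = S_xy / S_xx          # slope LSE:     beta1_hat = S_xy / S_xx
b0 = y_bar - b1 * x_bar   # intercept LSE: beta0_hat = y_bar - beta1_hat * x_bar

print(b1, b0)             # approximately 0.9 and 1.3 for this data
```

For this toy data, β̂1 = 0.9 and β̂0 = 1.3, exactly what a financial calculator's statistics functions would return.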
• How can the calculation of LSEs be tested?

  Case 1. Given the raw data {(xi, yi)}, i = 1, ..., n, with a relatively small n (e.g., n ≤ 10):
    Enter the data into your financial calculator and read the output from its statistics functions.

  Case 2. Given summarized data in the form of various sums, e.g.,
    \[ \sum_{i=1}^n x_i, \quad \sum_{i=1}^n y_i, \quad \sum_{i=1}^n x_i^2, \quad \sum_{i=1}^n y_i^2, \quad \sum_{i=1}^n x_i y_i: \]
    Expand the products in the two sums that appear in β̂1 and use the alternative form
    \[ \hat{\beta}_1 = \frac{\sum_{i=1}^n x_i y_i - n\bar{x}\bar{y}}{\sum_{i=1}^n x_i^2 - n\bar{x}^2}. \]
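Case 2 can be rehearsed in code. Assuming only the summarized sums below (hypothetical values consistent with n = 5 observations), the alternative form recovers the slope without ever seeing the raw data:

```python
# Hypothetical summarized data for n = 5 observations
n = 5
sum_x, sum_y = 15, 20
sum_x2, sum_xy = 55, 69          # sum of x_i^2 and sum of x_i * y_i

x_bar, y_bar = sum_x / n, sum_y / n

# Alternative form of the slope estimate, built from the sums only
b1 = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)
```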
• An alternative formula for β̂1 in terms of the sample correlation:
  \[ \hat{\beta}_1 = r \times \frac{s_y}{s_x} \quad \left(\text{Warning: Not } r \times \frac{s_x}{s_y}\right), \]
  where
  ◦ sx and sy are the sample standard deviations of x and y,
  ◦ r is the sample correlation coefficient between x and y.

• Application of this formula: the slope estimates when regressing y on x and when regressing x on y are related via
  \[ \hat{\beta}_1^{\,y \sim x} \times \hat{\beta}_1^{\,x \sim y} = r^2 = \underbrace{R^2}_{\text{see Sect. 1.3}}. \]
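Both identities on this card are easy to confirm numerically; a sketch on toy data:

```python
import math

# Toy data (hypothetical values)
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

S_xx = sum((xi - x_bar) ** 2 for xi in x)
S_yy = sum((yi - y_bar) ** 2 for yi in y)
S_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

s_x = math.sqrt(S_xx / (n - 1))       # sample standard deviations
s_y = math.sqrt(S_yy / (n - 1))
r = S_xy / math.sqrt(S_xx * S_yy)     # sample correlation coefficient

b1_yx = S_xy / S_xx                   # slope from regressing y on x
b1_xy = S_xy / S_yy                   # slope from regressing x on y

assert abs(b1_yx - r * s_y / s_x) < 1e-12     # beta1_hat = r * s_y / s_x
assert abs(b1_yx * b1_xy - r ** 2) < 1e-12    # product of slopes = r^2
```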
• Fitted values and residuals: given β̂0 and β̂1, we can compute:

  1. The fitted value (a.k.a. predicted value) ŷi = β̂0 + β̂1 xi.
     Mnemonic: obtained from the model equation by β0 → β̂0, β1 → β̂1, εi → 0.
     Ideally, ŷi should be close to yi.

  2. The residual ei = yi − ŷi. (Memory alert: Not ŷi − yi!)
     Completely different from εi, which is unobservable and which ei serves to approximate.
• Graphical illustration of the fitted regression line and the definitions of fitted value and residual:
  [Figure: scatter plot with the fitted regression line ŷ = β̂0 + β̂1 x, intercept β̂0 and slope β̂1; a data point (xi, yi) and its fitted value (xi, ŷi) on the line, with the residual ei = yi − ŷi shown as the vertical gap between them.]
• Sum-to-zero constraints on residuals:

  1. Σᵢ eᵢ = 0, provided that β0 is included in the model.
     Meaning: the residuals offset one another to produce a zero sum; they are negatively correlated.

  2. Σᵢ xᵢeᵢ = 0.
     Meaning: the residuals and the explanatory variable values are uncorrelated.

  Mnemonic: β̂0 and β̂1 satisfy the first-order conditions
  \[ \frac{\partial}{\partial \beta_0} SS(\hat{\beta}_0, \hat{\beta}_1) = -2 \sum_{i=1}^n \overbrace{[y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)]}^{e_i} = 0, \]
  \[ \frac{\partial}{\partial \beta_1} SS(\hat{\beta}_0, \hat{\beta}_1) = -2 \sum_{i=1}^n x_i \underbrace{[y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)]}_{e_i} = 0. \]
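A quick numerical check of the two constraints, again on a made-up data set:

```python
# Toy data (hypothetical values)
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / \
     sum((xi - x_bar) ** 2 for xi in x)
b0 = y_bar - b1 * x_bar

# Residuals e_i = y_i - (b0 + b1 * x_i)
e = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

sum_e = sum(e)                                   # constraint 1: should be 0
sum_xe = sum(xi * ei for xi, ei in zip(x, e))    # constraint 2: should be 0
```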
1.3 Assessing Goodness of Fit of the Model

• Three kinds of sums of squares:

  ◦ Total SS (TSS)
    Def.: variation of the response values about ȳ.
    Measures: the amount of variability inherent in the response prior to performing the regression.

  ◦ Residual SS, or Error SS (RSS)
    Def.: variation of the response values about the fitted regression line.
    Measures: the goodness of fit of the SLR model (the lower, the better); equivalently, the amount of variability of the response left unexplained even after the introduction of x.

  ◦ Regression SS (Reg SS)
    Def.: variation explained by the SLR model (or by the knowledge of x).
    Measures: how effective the SLR model is in explaining the variation in y.
• ANOVA identity:
  \[ \underbrace{\sum_{i=1}^n (y_i - \bar{y})^2}_{\text{TSS}} = \underbrace{\sum_{i=1}^n (y_i - \hat{y}_i)^2}_{\text{RSS}} + \underbrace{\sum_{i=1}^n (\hat{y}_i - \bar{y})^2}_{\text{Reg SS}}. \]

• Coefficient of determination:
  ◦ Definition:
    \[ R^2 = \frac{\text{Reg SS}}{\text{TSS}} = 1 - \frac{\text{RSS}}{\text{TSS}}. \]
  ◦ Measures the proportion of the variation of the response (about its mean) explained by the SLR model.
  ◦ The higher, the better.

• Specialized formulas for Reg SS and R² under SLR:
  \[ \text{Reg SS} = \hat{\beta}_1^2 S_{xx}, \qquad R^2 = r^2 = \mathrm{Corr}(x, y)^2 \ \text{(square of the sample correlation between x and y)}. \]
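The ANOVA identity and the specialized formulas can be verified on toy data:

```python
# Toy data (hypothetical values)
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

S_xx = sum((xi - x_bar) ** 2 for xi in x)
S_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = S_xy / S_xx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

tss = sum((yi - y_bar) ** 2 for yi in y)
rss = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
reg_ss = sum((yh - y_bar) ** 2 for yh in y_hat)

assert abs(tss - (rss + reg_ss)) < 1e-9       # TSS = RSS + Reg SS
assert abs(reg_ss - b1 ** 2 * S_xx) < 1e-9    # Reg SS = beta1_hat^2 * S_xx
r2 = reg_ss / tss                             # R^2 (here equal to r^2)
```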
• ANOVA table:

  Source       Sum of Squares   df      Mean Square         F-value
  Regression   Reg SS           1       Reg SS / 1          F
  Error        RSS              n − 2   s² = RSS/(n − 2)
  Total        TSS              n − 1

• Structure:
  ◦ Different sources of variation in y.
  ◦ Some "informal" rules for counting df:
    – Reg SS has 1 df because there is one explanatory variable.
    – RSS has 2 df subtracted from n because of the two parameters, β0 and β1.
  ◦ Dividing an SS by its df results in a mean square (MS).
• Mean square error:
  \[ s^2 = \frac{\text{RSS}}{\text{df of RSS}} = \frac{\sum_{i=1}^n e_i^2}{n-2}. \]
  s = √s² is the residual standard deviation, or residual standard error.

• F-test:
  ◦ Hypotheses:
    \[ \underbrace{H_0: \beta_1 = 0}_{\text{iid model}} \quad \text{vs.} \quad \underbrace{H_a: \beta_1 \neq 0}_{\text{SLR model}}, \]
    a test of the significance/usefulness of x in explaining y.
  ◦ F-statistic:
    \[ F = \frac{\text{Reg SS}/(\text{df of Reg SS})}{\text{RSS}/(\text{df of RSS})} = \frac{\text{Reg SS}/1}{\text{RSS}/(n-2)}. \]
  ◦ Behavior of the F-statistic:
    – Under H0: expected value close to one.
    – Under Ha: tends to be large.
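A short computational sketch of s² and the F-statistic on made-up data:

```python
# Toy data (hypothetical values)
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / \
     sum((xi - x_bar) ** 2 for xi in x)
b0 = y_bar - b1 * x_bar
e = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

rss = sum(ei ** 2 for ei in e)
tss = sum((yi - y_bar) ** 2 for yi in y)
reg_ss = tss - rss               # via the ANOVA identity

s2 = rss / (n - 2)               # mean square error, df = n - 2
F = (reg_ss / 1) / s2            # F-statistic of the model
```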
• F-test (cont.):
  ◦ Going between the F-statistic and R²:
    \[ F = (n-2)\left(\frac{\text{Reg SS}/\text{TSS}}{\text{RSS}/\text{TSS}}\right) = (n-2)\left(\frac{R^2}{1-R^2}\right). \]
    (Mnemonic: Divide both the numerator and the denominator of the F-statistic by TSS to get R².)
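The algebraic equivalence of the two routes to F holds for any consistent set of sums of squares; the numbers below are hypothetical:

```python
# Hypothetical sums of squares for a fit with n = 5 observations
n, reg_ss, rss = 5, 8.1, 1.9
tss = reg_ss + rss                    # ANOVA identity

F = (reg_ss / 1) / (rss / (n - 2))    # F from its definition
r2 = reg_ss / tss                     # R^2 = Reg SS / TSS
F_alt = (n - 2) * r2 / (1 - r2)       # F recovered from R^2 alone
```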
1.4 Statistical Inference about β0 and β1

• Sampling distributions of β̂0 and β̂1:
  ◦ Linear combination formulas:
    \[ \hat{\beta}_1 = \sum_{i=1}^n w_i y_i, \quad \text{where } w_i = \frac{x_i - \bar{x}}{S_{xx}}, \]
    \[ \hat{\beta}_0 = \sum_{i=1}^n w_{i0} y_i, \quad \text{where } w_{i0} = \frac{1}{n} - \bar{x} w_i. \]
    (Suggestion: Remembering these weights is recommended, but not absolutely essential.)
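That β̂1 and β̂0 really are these linear combinations of the yi's can be checked directly on toy data:

```python
# Toy data (hypothetical values)
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
S_xx = sum((xi - x_bar) ** 2 for xi in x)

# Weights expressing the LSEs as linear combinations of the y_i's
w1 = [(xi - x_bar) / S_xx for xi in x]      # w_i  for beta1_hat
w0 = [1 / n - x_bar * wi for wi in w1]      # w_i0 for beta0_hat

b1 = sum(wi * yi for wi, yi in zip(w1, y))
b0 = sum(wi * yi for wi, yi in zip(w0, y))
```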
• Sampling distributions of β̂0 and β̂1 (cont.):
  ◦ Unbiasedness: E[β̂j] = βj for j = 0, 1.
  ◦ Variances:
    \[ \mathrm{Var}(\hat{\beta}_0) = \sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right) \quad \text{and} \quad \mathrm{Var}(\hat{\beta}_1) = \frac{\sigma^2}{S_{xx}}. \]
    (Suggestion: Remember these two formulas.)
  ◦ Estimated variances: with σ² → s² (the MSE),
    \[ \widehat{\mathrm{Var}}(\hat{\beta}_0) = s^2\left(\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right) \quad \text{and} \quad \widehat{\mathrm{Var}}(\hat{\beta}_1) = \frac{s^2}{S_{xx}}. \]
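A sketch of the estimated standard errors on made-up data:

```python
import math

# Toy data (hypothetical values)
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
S_xx = sum((xi - x_bar) ** 2 for xi in x)

b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / S_xx
b0 = y_bar - b1 * x_bar

rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s2 = rss / (n - 2)                                   # MSE

se_b1 = math.sqrt(s2 / S_xx)                         # SE(beta1_hat)
se_b0 = math.sqrt(s2 * (1 / n + x_bar ** 2 / S_xx))  # SE(beta0_hat)
```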
• t-test:
  ◦ Hypotheses: H0: βj = d vs. Ha: βj ≠ d, βj > d, or βj < d.
    Important special case: d = 0 (i.e., testing whether x is useful).
  ◦ t-statistic:
    \[ t(\hat{\beta}_j) = \frac{\text{LSE} - \text{hypothesized value}}{\text{standard error of LSE}} = \frac{\hat{\beta}_j - d}{\mathrm{SE}(\hat{\beta}_j)}. \]
  ◦ Null distribution: under H0, t(β̂j) ~ t_{n−2}.
  ◦ Decision rules and p-values (t denotes the observed value of t(β̂j)):

    Ha          Decision rule               p-value
    βj ≠ d      |t(β̂j)| > t_{n−2,α/2}      P(|t_{n−2}| > |t|) = 2 P(t_{n−2} > |t|)
    βj > d      t(β̂j) > t_{n−2,α}          P(t_{n−2} > t)
    βj < d      t(β̂j) < −t_{n−2,α}         P(t_{n−2} < t)
• Confidence intervals (CIs) for β0 and β1: the general structure is
  \[ \text{LSE} \pm t\text{-quantile} \times \text{standard error} = \hat{\beta}_j \pm t_{n-2,\alpha/2} \times \mathrm{SE}(\hat{\beta}_j). \]
  ◦ E.g., β̂1 ± t_{n−2,α/2} × SE(β̂1) is the CI for β1.
  ◦ Construction requires the formulas for SE(β̂0) and SE(β̂1).

• Relationship between the F-test and the t-test for H0: β1 = 0:
  ◦ Direct connection between the test statistics:
    \[ F = t(\hat{\beta}_1)^2. \]
  ◦ Importance: connects information about β1 alone (captured by t(β̂1)) with information about the whole model (captured by F).
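The relation F = t(β̂1)² can be confirmed numerically on toy data:

```python
import math

# Toy data (hypothetical values)
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
S_xx = sum((xi - x_bar) ** 2 for xi in x)

b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / S_xx
b0 = y_bar - b1 * x_bar

rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s2 = rss / (n - 2)
reg_ss = sum((yi - y_bar) ** 2 for yi in y) - rss

t_b1 = b1 / math.sqrt(s2 / S_xx)    # t-statistic for H0: beta1 = 0 (d = 0)
F = (reg_ss / 1) / s2               # F-statistic of the same model
```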
1.5 Prediction

• Target (a random variable):
  \[ y_* = \beta_0 + \beta_1 x_* + \varepsilon_*, \]
  where x* is the explanatory variable value of interest.

• Generic setting:

                        response (y)   known values of explanatory variable (x)
  Observed              y1             x1
  (past) data           y2             x2
                        ...            ...
                        yn             xn
  Unobserved
  (future) data         y* (target)    ← x*
• Point predictor:
  \[ \hat{y}_* = \hat{\beta}_0 + \hat{\beta}_1 x_*. \]
  (Mnemonic: set β0 → β̂0, β1 → β̂1, and ε* → 0; the same trick as for fitted values.)

• 100(1 − α)% prediction interval:
  \[ \text{point predictor} \pm t\text{-quantile} \times \text{standard error of the prediction error} \]
  \[ = \hat{y}_* \pm t_{n-2,\alpha/2} \times \mathrm{SE}(y_* - \hat{y}_*) = (\hat{\beta}_0 + \hat{\beta}_1 x_*) \pm t_{n-2,\alpha/2}\sqrt{s^2\left[1 + \frac{1}{n} + \frac{(x_* - \bar{x})^2}{S_{xx}}\right]}. \]
  (Suggestion: Remember this formula.)
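Putting the pieces together, a sketch of the prediction interval on made-up data; the quantile 3.182 is the table value of t_{3,0.025} for n − 2 = 3 degrees of freedom, and the value x* = 6 is hypothetical:

```python
import math

# Toy data (hypothetical values)
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
S_xx = sum((xi - x_bar) ** 2 for xi in x)

b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / S_xx
b0 = y_bar - b1 * x_bar
s2 = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)

x_star = 6                          # explanatory value of interest (hypothetical)
y_star_hat = b0 + b1 * x_star       # point predictor

# Standard error of the prediction error y_star - y_star_hat
se_pred = math.sqrt(s2 * (1 + 1 / n + (x_star - x_bar) ** 2 / S_xx))

t_quantile = 3.182                  # t_{3, 0.025} from a t-table (alpha = 0.05)
pi = (y_star_hat - t_quantile * se_pred, y_star_hat + t_quantile * se_pred)
```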
• Remarks on the structure of the prediction interval: there are two sources of uncertainty associated with prediction:
  \[ \widehat{\mathrm{Var}}(y_* - \hat{y}_*) = \underbrace{s^2}_{(1)} + \underbrace{s^2\left[\frac{1}{n} + \frac{(x_* - \bar{x})^2}{S_{xx}}\right]}_{(2)}. \]

  (1) Variability of the random error ε*: reflected in the extra s².

  (2) Estimation of the true regression line at x*:
    ◦ β̂0 and β̂1 are only estimates of β0 and β1, and are subject to sampling fluctuations.
    ◦ The variance of the prediction error is minimized when x* = x̄ and increases quadratically as x* moves away from x̄.
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
Preface
This set of flashcards is meant to complement the ACTEX Study Manual forSOA Exam SRM (Statistics for Risk Modeling) Fully revised in response tothe May 2019 edition of the SRM study manual these flashcards provide aconcise summary of the SRM exam material in a readable and presentation-oriented format with a view to maximizing retention Important formulasare displayed to facilitate identification and memorization Suggestions aregiven as to which formulas in our opinion must be memorized whichformulas are important but can be easily deduced from other results andwhich formulas are of secondary importance The flashcards are particularlysuitable for last-minute reviewmdashdonrsquot forget to take them with you on yourway to the CBT exam center
It should be noted however that these flashcards add value to but areno substitute for reading the SRM study manual Examples and problemswhich are key to exam success are not included or discussed in these flash-cards We suggest that you first read the manual carefully go over the
i
in-text examples and (most of the) end-of-chapter problems then use theflashcards as a means to review what you have learned and to ensure thatyou have mastered all of the key concepts
As with the SRM study manual we would be extremely grateful if youcould share your comments and suggestions on these flashcards with usand bring to our attention any potential errors Please direct your com-ments and questions to ambrose-louiowaedu The authors will try theirbest to respond to any inquiries as soon as possible and an ongoing list ofupdates will be maintained online at httpssitesgooglecomsite
ambroseloyppublicationsSRMWe wish you the best of luck with your SRM exam
Runhuan FengDaniel Linders
Ambrose LoFebruary 2019
Part I
Regression Models
1
Chapter 1
Simple Linear Regression
3
1ndash4
11 Basics
bull Simple linear regression (SLR) model equation An approxi-mately linear relationship between y and x
y︸ ︷︷ ︸response
= β0 + β1x︸ ︷︷ ︸regression function
+ ε︸ ︷︷ ︸error
where
y is the response variable (aka dependent variable)
x is the explanatory variable (aka predictors features)
β0 (intercept) and β1 (slope) are regression coefficients
ε is the random error term
In the above model we say that y is regressed on x (denoted y sim x)
bull Defining property of SLR There is only one explanatory variablenamely x
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash5
bull Model assumptions
A1 The yirsquos are realizations of random variables while the xirsquosare nonrandom
A2 ε1 ε2 εn are independent with
E[εi] = 0 and Var(εi) = σ2
for all i = 1 2 n
Almost always further assume that εirsquos are normally dis-tributed ie
εiiidsim N(0 σ2)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash6
12 Model Fitting by Least Squares Method
bull Idea of least squares method Choose β0 and β1 to make the sum ofsquares
SS(β0 β1) =
nsumi=1
[ yi︸ ︷︷ ︸obs value
minus ( β0 + β1xi︸ ︷︷ ︸candidate fitted value
)]2
the ldquoleastrdquo
bull Least squares estimates (LSEs)
β1 =SxySxx
=
sumni=1(xi minus x)(yi minus y)sumn
i=1(xi minus x)2and β0 = y minus β1x
where
Sxy =sumni=1(xi minus x)(yi minus y) =
sumni=1 xiyi minus nxy
Sxx =sumni=1(xi minus x)2 =
sumni=1 x
2i minus nx2
(Suggestion Remember the formulas for β0 and β1)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash7
bull How can the calculation of LSEs be tested
Case 1 Given the raw data (xi yi)ni=1 with a relatively small n(eg n le 10)
Enter the data into your financial calculator and read the out-put from its statistics functions
Case 2 Given summarized data in the form of various sums eg
nsumi=1
xi
nsumi=1
yi
nsumi=1
x2i
nsumi=1
y2i
nsumi=1
xiyi
Expand the products in the two sums that appear in β1 anduse the alternative form
β1 =
sumni=1 xiyi minus nxysumni=1 x
2i minus nx2
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash8
bull An alternative formula for β1 in terms of sample correlation
β1 = r times sysx
(Warning Not r times sx
sy
)where
sx and sy are the sample standard deviations of x and y
r is the sample correlation coefficient between x and y
bull Application of this formula Slope estimates when regressing y on xand regressing x on y are related via
βysimx1 times βxsimy1 = r2 = R2︸ ︷︷ ︸see Sect 13
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash9
bull Fitted values and residuals Given β0 and β1 we can compute
1 The fitted value (aka predicted value) yi = β0 + β1xi
Mnemonic Obtained from the model equation by
β0 rarr β0 β1 rarr β1 εi rarr 0
Ideally yi should be close to yi
2 The residual ei = yi minus yi Memory alert Not yi minus yi Completely different from εi which is unobservable and whichei serves to approximate
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash10
bull Graphical illustration of fitted regression line and definitions offitted value and residual
0
y
x
(xi yi)
(xi yi)
Slope = β1
fitted regression line
y = β0 + β1x
yi minus yi = ei
β0
xi
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash11
bull Sum-to-zero constraints on residuals
1sumni=1 ei = 0 provided that β0 is included in the model
Meaning The residuals offset one another to produce a zero sumthey are negatively correlated
2sumni=1 xiei = 0
Meaning The residuals and the explanatory variable values areuncorrelated
Mnemonic β0 and β1 satisfy
part
partβ0SS(β0 β1) = minus2
nsumi=1
[
ei︷ ︸︸ ︷yi minus (β0 + β1xi)] = 0
part
partβ1SS(β0 β1) = minus2
nsumi=1
xi[yi minus (β0 + β1xi)︸ ︷︷ ︸ei
] = 0
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash12
13 Assessing Goodness of Fit of the Model
bull Three kinds of sums of squares
Sum of
SquaresAbbrev Def What Does It Measure
Total SS TSS
Variation of
response values
about y
Amount of variability inher-
ent in the response prior to
performing regression
Residual SS
or
Error SS
RSS
Variation of
response values
about fitted
regression line
bull Goodness of fit of the SLRmodel (the lower the better)
bull Amount of variability of
response left unexplained
even after introduction of x
Regression SS Reg SS
Variation
explained by SLR
(or the knowledge
of x)
How effective SLR model is
in explaining the variation in
y
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash13
bull ANOVA identity
nsumi=1
(yi minus y)2
︸ ︷︷ ︸TSS
=
nsumi=1
(yi minus yi)2
︸ ︷︷ ︸RSS
+
nsumi=1
(yi minus y)2
︸ ︷︷ ︸Reg SS
bull Coefficient of determination
Definition R2 =Reg SS
TSS= 1minus RSS
TSS Measures the proportion of variation of response (about its mean)
explained by the SLR model
The higher the better
bull Specialized formulas for Reg SS and R2 under SLR
Reg SS = β21Sxx
R2 = r2 = Corr(x y)2 (square of correlation between x and y)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash14
bull ANOVA table
Source Sum of Squares df Mean Square F -value
Regression Reg SS 1 Reg SS1
Error RSS nminus 2 s2 = RSS(nminus 2)
Total TSS nminus 1
Structure
Different sources of variation in y
Some ldquoinformalrdquo rules for counting df
ndash Reg SS has 1 df because of one explanatory variable
ndash RSS has 2 df subtracted from n because of two parameters β0
and β1
Dividing an SS by its df results in a mean square (MS)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash15
bull Mean square error
s2 =RSS
df of RSS=
sumni=1 e
2i
nminus 2
s =radics2 is the residual standard deviation or residual standard error
bull F -test
HypothesesH0 β1 = 0︸ ︷︷ ︸
iid model
vs Ha β1 6= 0︸ ︷︷ ︸SLR model
a test of the significanceusefulness of x in explaining y
F -statistic
F =Reg SS(df of Reg SS)
RSS(df of RSS)=
Reg SS1
RSS(nminus 2)
Behavior of F -statistic
ndash H0 Expected value close to one
ndash Ha Tends to be large
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash16
bull F -test (Cont)
Going between F -statistic and R2
F = (nminus 2)
(Reg SSTSS
RSSTSS
)= (nminus 2)
(R2
1minusR2
)(Mnemonic Divide both the numerator and denominator of the F -statistic by TSS to get R2)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash17
14 Statistical Inference about β0 and β1
bull Sampling distributions of β0 and β1
Linear combination formulas
β1 =
nsumi=1
wiyi where wi =xi minus xSxx
β0 =
nsumi=1
wi0yi where wi0 =1
nminus xwi
(Suggestion Remembering these weights is recommended but notabsolutely essential)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash18
bull Sampling distributions of β0 and β1 (Cont)
Unbiased E[βj ] = βj for j = 0 1
Variances
Var(β0) = σ2
(1
n+
x2
Sxx
)and Var(β1) =
σ2
Sxx
(Suggestion Remember these two formulas)
Estimated variances With σ2 rarr s2 (MSE)
Var(β0) = s2
(1
n+
x2
Sxx
)and Var(β1) =
s2
Sxx
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash19
bull t-test
Hypotheses H0 βj = d vs Ha
βj 6= d
βj gt d
βj lt d
Important special case d = 0 (ie to test if x is useful)
t-statistic
t(βj) =LSEminus hypothesized value
standard error of LSE=
βj minus dSE(βj)
Null distribution t(βj)H0sim tnminus2
Decision rules and p-value
Ha Decision Rule p-value (t is the observed value of t(βj))
βj 6= d |t(βj)| gt tnminus2α2 P(|tnminus2| gt |t|) = 2P(tnminus2 gt |t|)βj gt d t(βj) gt tnminus2α P(tnminus2 gt t)
βj lt d t(βj) lt minustnminus2α P(tnminus2 lt t)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash20
bull Confidence intervals (CIs) for β0 and β1 General structure is
LSEplusmn t-quantiletimes Standard error = βj plusmn tnminus2α2 times SE(βj)
Eg β1 plusmn tnminus2α2 times SE(β1) is the CI for β1
Construction requires formulas of SE(β0) and SE(β1)
bull Relationship between F -test and t-test for H0 β1 = 0
Direct connection between test statistics
F = t(β1)2
Importance Connect
information about β1 (captured by t(β1))
with
information about the whole model (captured by F )
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash21
15 Prediction
bull Target (random variable)
ylowast = β0 + β1xlowast + εlowast
where xlowast is explanatory variable value of interest
bull Generic setting
response known values of explanatory variables
y x
y1 x1
observed
(past) data
y2 x2
yn xn
Unobserved
(future) data
ylowast (target) larr xlowast
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash22
bull Point predictorylowast = β0 + β1xlowast
(Mnemonic Set β0 rarr β0 β1 rarr β1 and εlowast rarr 0 same trick as fittedvalues)
bull 100(1minusα) prediction interval
point predictorplusmn t-quantiletimes st error of prediction error
= ylowast plusmn tnminus2α2 times SE(ylowast minus ylowast)
= (β0 + β1xlowast)plusmn tnminus2α2
radics2
[1 +
1
n+
(xlowast minus x)2
Sxx
](Suggestion Remember this formula)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash23
bull Remarks on the structure of the prediction interval Two sourcesof uncertainty associated with prediction
Var(ylowast minus ylowast) = s2︸ ︷︷ ︸1copy
+ s2
[1
n+
(xlowast minus x)2
Sxx
]︸ ︷︷ ︸
2copy
1copy Variability of the random error εlowast Reflected in the extra s2
2copy Estimation of the true regression line at xlowast
β0 and β1 are only estimates of β0 and β1 and are subject tosampling fluctuations
Variance of prediction error minimized when xlowast = x and in-creases quadratically as xlowast moves away from x
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
in-text examples and (most of the) end-of-chapter problems then use theflashcards as a means to review what you have learned and to ensure thatyou have mastered all of the key concepts
As with the SRM study manual we would be extremely grateful if youcould share your comments and suggestions on these flashcards with usand bring to our attention any potential errors Please direct your com-ments and questions to ambrose-louiowaedu The authors will try theirbest to respond to any inquiries as soon as possible and an ongoing list ofupdates will be maintained online at httpssitesgooglecomsite
ambroseloyppublicationsSRMWe wish you the best of luck with your SRM exam
Runhuan FengDaniel Linders
Ambrose LoFebruary 2019
Part I
Regression Models
1
Chapter 1
Simple Linear Regression
3
1ndash4
11 Basics
bull Simple linear regression (SLR) model equation An approxi-mately linear relationship between y and x
y︸ ︷︷ ︸response
= β0 + β1x︸ ︷︷ ︸regression function
+ ε︸ ︷︷ ︸error
where
y is the response variable (aka dependent variable)
x is the explanatory variable (aka predictors features)
β0 (intercept) and β1 (slope) are regression coefficients
ε is the random error term
In the above model we say that y is regressed on x (denoted y sim x)
bull Defining property of SLR There is only one explanatory variablenamely x
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash5
bull Model assumptions
A1 The yirsquos are realizations of random variables while the xirsquosare nonrandom
A2 ε1 ε2 εn are independent with
E[εi] = 0 and Var(εi) = σ2
for all i = 1 2 n
Almost always further assume that εirsquos are normally dis-tributed ie
εiiidsim N(0 σ2)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash6
1.2 Model Fitting by Least Squares Method

• Idea of the least squares method: choose $\beta_0$ and $\beta_1$ to make the sum of squares
$$SS(\beta_0, \beta_1) = \sum_{i=1}^{n} \big[\underbrace{y_i}_{\text{obs. value}} - (\underbrace{\beta_0 + \beta_1 x_i}_{\text{candidate fitted value}})\big]^2$$
the "least."

• Least squares estimates (LSEs):
$$\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2} \quad\text{and}\quad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x},$$
where
  – $S_{xy} = \sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}$,
  – $S_{xx} = \sum_{i=1}^{n}(x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2$.
  (Suggestion: remember the formulas for $\hat{\beta}_0$ and $\hat{\beta}_1$.)
• How can the calculation of LSEs be tested?
  Case 1: Given the raw data $\{(x_i, y_i)\}_{i=1}^{n}$ with a relatively small $n$ (e.g., $n \le 10$):
    Enter the data into your financial calculator and read the output from its statistics functions.
  Case 2: Given summarized data in the form of various sums, e.g.,
  $$\sum_{i=1}^{n} x_i, \quad \sum_{i=1}^{n} y_i, \quad \sum_{i=1}^{n} x_i^2, \quad \sum_{i=1}^{n} y_i^2, \quad \sum_{i=1}^{n} x_i y_i:$$
    Expand the products in the two sums that appear in $\hat{\beta}_1$ and use the alternative form
  $$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}}{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}.$$
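Both routes to the LSEs are easy to verify numerically. Below is a minimal Python sketch (the data set is made up purely for illustration, not taken from the manual) computing $\hat{\beta}_1$ from the definition-based $S_{xy}/S_{xx}$ form and from the summarized-sums form:

```python
# Least squares estimates for SLR, computed two equivalent ways.
# The data below are made up purely for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# Definition-based form: S_xy / S_xx
S_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
S_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = S_xy / S_xx                 # slope estimate, beta_1-hat (approx. 1.96 here)
b0 = y_bar - b1 * x_bar          # intercept estimate, beta_0-hat (approx. 0.14)

# Summarized-sums (Case 2) form: needs only sum(x_i*y_i), sum(x_i^2), means
b1_alt = (sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar) / (
    sum(xi ** 2 for xi in x) - n * x_bar ** 2
)
```

The two forms agree to floating-point precision, since the second is just the expanded version of the first.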
• An alternative formula for $\hat{\beta}_1$ in terms of the sample correlation:
$$\hat{\beta}_1 = r \times \frac{s_y}{s_x} \qquad \Big(\text{Warning: not } r \times \frac{s_x}{s_y}\Big),$$
where
  – $s_x$ and $s_y$ are the sample standard deviations of $x$ and $y$,
  – $r$ is the sample correlation coefficient between $x$ and $y$.

• Application of this formula: the slope estimates when regressing $y$ on $x$ and when regressing $x$ on $y$ are related via
$$\hat{\beta}_1^{y \sim x} \times \hat{\beta}_1^{x \sim y} = r^2 = \underbrace{R^2}_{\text{see Sect. 1.3}}.$$
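Both identities on this card can be spot-checked numerically; a quick sketch with made-up data:

```python
# Check beta_1-hat = r * s_y / s_x and the product-of-slopes identity.
# Made-up illustrative data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

S_xx = sum((xi - x_bar) ** 2 for xi in x)
S_yy = sum((yi - y_bar) ** 2 for yi in y)
S_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

r = S_xy / (S_xx * S_yy) ** 0.5          # sample correlation coefficient
s_x = (S_xx / (n - 1)) ** 0.5            # sample standard deviation of x
s_y = (S_yy / (n - 1)) ** 0.5            # sample standard deviation of y

b1_y_on_x = S_xy / S_xx                  # slope when regressing y on x
b1_x_on_y = S_xy / S_yy                  # slope when regressing x on y
```

The $(n-1)$ factors in $s_x$ and $s_y$ cancel in the ratio $s_y/s_x$, which is why $r \, s_y/s_x$ collapses to $S_{xy}/S_{xx}$.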
• Fitted values and residuals: given $\hat{\beta}_0$ and $\hat{\beta}_1$, we can compute:
  1. The fitted value (a.k.a. predicted value) $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$.
     Mnemonic: obtained from the model equation by $\beta_0 \to \hat{\beta}_0$, $\beta_1 \to \hat{\beta}_1$, $\varepsilon_i \to 0$.
     Ideally, $\hat{y}_i$ should be close to $y_i$.
  2. The residual $e_i = y_i - \hat{y}_i$.
     Memory alert: not $\hat{y}_i - y_i$! Completely different from $\varepsilon_i$, which is unobservable and which $e_i$ serves to approximate.
• Graphical illustration of the fitted regression line and the definitions of fitted value and residual:
  [Figure: scatter plot with the fitted regression line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$, intercept $\hat{\beta}_0$ and slope $\hat{\beta}_1$; an observed point $(x_i, y_i)$ and its fitted point $(x_i, \hat{y}_i)$ lie on the same vertical line $x = x_i$, and the vertical gap $y_i - \hat{y}_i = e_i$ is the residual.]
• Sum-to-zero constraints on residuals:
  1. $\sum_{i=1}^{n} e_i = 0$, provided that $\beta_0$ is included in the model.
     Meaning: the residuals offset one another to produce a zero sum; they are negatively correlated.
  2. $\sum_{i=1}^{n} x_i e_i = 0$.
     Meaning: the residuals and the explanatory variable values are uncorrelated.
  Mnemonic: $\hat{\beta}_0$ and $\hat{\beta}_1$ satisfy
  $$\frac{\partial}{\partial \beta_0} SS(\hat{\beta}_0, \hat{\beta}_1) = -2 \sum_{i=1}^{n} \overbrace{[y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)]}^{e_i} = 0,$$
  $$\frac{\partial}{\partial \beta_1} SS(\hat{\beta}_0, \hat{\beta}_1) = -2 \sum_{i=1}^{n} x_i \underbrace{[y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)]}_{e_i} = 0.$$
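Once a line has been fit by least squares, both constraints can be confirmed numerically. A minimal sketch with made-up data:

```python
# Verify sum(e_i) = 0 and sum(x_i * e_i) = 0 for least-squares residuals.
x = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up illustrative data
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar

# Residuals e_i = y_i - (b0 + b1 * x_i)
e = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

assert abs(sum(e)) < 1e-9                                # constraint 1
assert abs(sum(xi * ei for xi, ei in zip(x, e))) < 1e-9  # constraint 2
```

Note that the constraints hold only up to floating-point round-off, hence the small tolerance instead of an exact comparison with zero.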
1.3 Assessing Goodness of Fit of the Model

• Three kinds of sums of squares:

  – Total SS (TSS). Definition: variation of the response values about $\bar{y}$. What it measures: the amount of variability inherent in the response prior to performing the regression.

  – Residual SS (RSS), a.k.a. Error SS. Definition: variation of the response values about the fitted regression line. What it measures: the goodness of fit of the SLR model (the lower, the better), i.e., the amount of variability of the response left unexplained even after the introduction of $x$.

  – Regression SS (Reg SS). Definition: variation explained by the SLR model (or by the knowledge of $x$). What it measures: how effective the SLR model is in explaining the variation in $y$.
• ANOVA identity:
$$\underbrace{\sum_{i=1}^{n}(y_i - \bar{y})^2}_{\text{TSS}} = \underbrace{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}_{\text{RSS}} + \underbrace{\sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2}_{\text{Reg SS}}.$$

• Coefficient of determination:
  – Definition: $R^2 = \dfrac{\text{Reg SS}}{\text{TSS}} = 1 - \dfrac{\text{RSS}}{\text{TSS}}$.
  – Measures the proportion of variation of the response (about its mean) explained by the SLR model.
  – The higher, the better.

• Specialized formulas for Reg SS and $R^2$ under SLR:
  – $\text{Reg SS} = \hat{\beta}_1^2 S_{xx}$,
  – $R^2 = r^2 = \mathrm{Corr}(x, y)^2$ (the square of the sample correlation between $x$ and $y$).
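The identity and both specialized formulas can be checked numerically; a sketch with made-up data:

```python
# Verify TSS = RSS + Reg SS, Reg SS = b1^2 * S_xx, and R^2 = r^2.
x = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up illustrative data
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

S_xx = sum((xi - x_bar) ** 2 for xi in x)
S_yy = sum((yi - y_bar) ** 2 for yi in y)
S_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = S_xy / S_xx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]         # fitted values

TSS = sum((yi - y_bar) ** 2 for yi in y)
RSS = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
RegSS = sum((yh - y_bar) ** 2 for yh in y_hat)
R2 = RegSS / TSS
r = S_xy / (S_xx * S_yy) ** 0.5            # sample correlation

assert abs(TSS - (RSS + RegSS)) < 1e-9     # ANOVA identity
assert abs(RegSS - b1 ** 2 * S_xx) < 1e-9  # specialized Reg SS formula
assert abs(R2 - r ** 2) < 1e-12            # R^2 = r^2 under SLR
```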
• ANOVA table:

  Source     | Sum of Squares | df    | Mean Square         | F-value
  Regression | Reg SS         | 1     | Reg SS/1            | (Reg SS/1) / (RSS/(n-2))
  Error      | RSS            | n - 2 | s^2 = RSS/(n - 2)   |
  Total      | TSS            | n - 1 |                     |

  Structure:
  – Different sources of variation in $y$.
  – Some "informal" rules for counting df:
    · Reg SS has 1 df because of the one explanatory variable.
    · RSS has 2 df subtracted from $n$ because of the two parameters, $\beta_0$ and $\beta_1$.
  – Dividing an SS by its df results in a mean square (MS).
• Mean square error (MSE):
$$s^2 = \frac{\text{RSS}}{\text{df of RSS}} = \frac{\sum_{i=1}^{n} e_i^2}{n - 2};$$
  $s = \sqrt{s^2}$ is the residual standard deviation (or residual standard error).

• F-test:
  – Hypotheses: $\underbrace{H_0: \beta_1 = 0}_{\text{i.i.d. model}}$ vs. $\underbrace{H_a: \beta_1 \ne 0}_{\text{SLR model}}$, a test of the significance/usefulness of $x$ in explaining $y$.
  – F-statistic:
$$F = \frac{\text{Reg SS}/(\text{df of Reg SS})}{\text{RSS}/(\text{df of RSS})} = \frac{\text{Reg SS}/1}{\text{RSS}/(n - 2)}.$$
  – Behavior of the F-statistic:
    · Under $H_0$: its expected value is close to one.
    · Under $H_a$: it tends to be large.
• F-test (cont.):
  Going between the F-statistic and $R^2$:
$$F = (n - 2)\left(\frac{\text{Reg SS}/\text{TSS}}{\text{RSS}/\text{TSS}}\right) = (n - 2)\left(\frac{R^2}{1 - R^2}\right).$$
  (Mnemonic: divide both the numerator and the denominator of the F-statistic by TSS to get $R^2$.)
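The two routes to the F-statistic give the same number, which is easy to confirm numerically (made-up data again):

```python
# Verify F = (Reg SS / 1) / (RSS / (n - 2)) equals (n - 2) * R^2 / (1 - R^2).
x = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up illustrative data
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

TSS = sum((yi - y_bar) ** 2 for yi in y)
RSS = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
RegSS = TSS - RSS               # via the ANOVA identity
R2 = RegSS / TSS

F = (RegSS / 1) / (RSS / (n - 2))           # ANOVA-table route
assert abs(F - (n - 2) * R2 / (1 - R2)) < 1e-6 * F   # R^2 route, same number
```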
1.4 Statistical Inference about $\beta_0$ and $\beta_1$

• Sampling distributions of $\hat{\beta}_0$ and $\hat{\beta}_1$:
  Linear combination formulas:
$$\hat{\beta}_1 = \sum_{i=1}^{n} w_i y_i, \quad\text{where } w_i = \frac{x_i - \bar{x}}{S_{xx}},$$
$$\hat{\beta}_0 = \sum_{i=1}^{n} w_{i0} y_i, \quad\text{where } w_{i0} = \frac{1}{n} - \bar{x} w_i.$$
  (Suggestion: remembering these weights is recommended but not absolutely essential.)
• Sampling distributions of $\hat{\beta}_0$ and $\hat{\beta}_1$ (cont.):
  – Unbiasedness: $E[\hat{\beta}_j] = \beta_j$ for $j = 0, 1$.
  – Variances:
$$\mathrm{Var}(\hat{\beta}_0) = \sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right) \quad\text{and}\quad \mathrm{Var}(\hat{\beta}_1) = \frac{\sigma^2}{S_{xx}}.$$
  (Suggestion: remember these two formulas.)
  – Estimated variances: with $\sigma^2 \to s^2$ (the MSE),
$$\widehat{\mathrm{Var}}(\hat{\beta}_0) = s^2\left(\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right) \quad\text{and}\quad \widehat{\mathrm{Var}}(\hat{\beta}_1) = \frac{s^2}{S_{xx}}.$$
• t-test:
  – Hypotheses: $H_0: \beta_j = d$ vs. $H_a: \beta_j \ne d$, $H_a: \beta_j > d$, or $H_a: \beta_j < d$.
    Important special case: $d = 0$ (i.e., to test whether $x$ is useful).
  – t-statistic:
$$t(\hat{\beta}_j) = \frac{\text{LSE} - \text{hypothesized value}}{\text{standard error of LSE}} = \frac{\hat{\beta}_j - d}{SE(\hat{\beta}_j)}.$$
  – Null distribution: $t(\hat{\beta}_j) \overset{H_0}{\sim} t_{n-2}$.
  – Decision rules and p-values ($t$ is the observed value of $t(\hat{\beta}_j)$):
    · $H_a: \beta_j \ne d$: reject when $|t(\hat{\beta}_j)| > t_{n-2,\,\alpha/2}$; p-value $= P(|t_{n-2}| > |t|) = 2P(t_{n-2} > |t|)$.
    · $H_a: \beta_j > d$: reject when $t(\hat{\beta}_j) > t_{n-2,\,\alpha}$; p-value $= P(t_{n-2} > t)$.
    · $H_a: \beta_j < d$: reject when $t(\hat{\beta}_j) < -t_{n-2,\,\alpha}$; p-value $= P(t_{n-2} < t)$.
• Confidence intervals (CIs) for $\beta_0$ and $\beta_1$: the general structure is
$$\text{LSE} \pm t\text{-quantile} \times \text{standard error} = \hat{\beta}_j \pm t_{n-2,\,\alpha/2} \times SE(\hat{\beta}_j).$$
  E.g., $\hat{\beta}_1 \pm t_{n-2,\,\alpha/2} \times SE(\hat{\beta}_1)$ is the CI for $\beta_1$.
  Construction requires the formulas for $SE(\hat{\beta}_0)$ and $SE(\hat{\beta}_1)$.

• Relationship between the F-test and the t-test for $H_0: \beta_1 = 0$:
  – Direct connection between the test statistics: $F = t(\hat{\beta}_1)^2$.
  – Importance: connects information about $\beta_1$ (captured by $t(\hat{\beta}_1)$) with information about the whole model (captured by $F$).
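These pieces fit together numerically as well. A sketch with made-up data, where the critical value $t_{3,\,0.025} \approx 3.182$ is hard-coded rather than looked up from a library:

```python
import math

# SE(b1), the t-statistic for H0: beta_1 = 0, a 95% CI for beta_1,
# and the F = t(b1)^2 connection.  Made-up illustrative data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

S_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / S_xx
b0 = y_bar - b1 * x_bar
RSS = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

s2 = RSS / (n - 2)                        # MSE
se_b1 = math.sqrt(s2 / S_xx)              # estimated SE of b1
t_b1 = b1 / se_b1                         # t-statistic for d = 0

TSS = sum((yi - y_bar) ** 2 for yi in y)
F = ((TSS - RSS) / 1) / (RSS / (n - 2))
assert abs(t_b1 ** 2 - F) < 1e-6 * F      # F = t(b1)^2

t_q = 3.182                               # approx. t_{n-2, 0.025} for n - 2 = 3
ci = (b1 - t_q * se_b1, b1 + t_q * se_b1)  # 95% CI for beta_1
```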
1.5 Prediction

• Target (a random variable):
$$y_* = \beta_0 + \beta_1 x_* + \varepsilon_*,$$
where $x_*$ is the explanatory variable value of interest.

• Generic setting:

  response $y$     | known value of explanatory variable $x$
  $y_1$            | $x_1$    \
  $y_2$            | $x_2$     |  observed (past) data
  ...              | ...       |
  $y_n$            | $x_n$    /
  $y_*$ (target)   | ← $x_*$      unobserved (future) data
• Point predictor:
$$\hat{y}_* = \hat{\beta}_0 + \hat{\beta}_1 x_*.$$
  (Mnemonic: set $\beta_0 \to \hat{\beta}_0$, $\beta_1 \to \hat{\beta}_1$, and $\varepsilon_* \to 0$; the same trick as for fitted values.)

• $100(1 - \alpha)\%$ prediction interval:
$$\text{point predictor} \pm t\text{-quantile} \times \text{standard error of the prediction error}$$
$$= \hat{y}_* \pm t_{n-2,\,\alpha/2} \times SE(y_* - \hat{y}_*)$$
$$= (\hat{\beta}_0 + \hat{\beta}_1 x_*) \pm t_{n-2,\,\alpha/2} \sqrt{s^2\left[1 + \frac{1}{n} + \frac{(x_* - \bar{x})^2}{S_{xx}}\right]}.$$
  (Suggestion: remember this formula.)
• Remarks on the structure of the prediction interval: there are two sources of uncertainty associated with prediction,
$$\widehat{\mathrm{Var}}(y_* - \hat{y}_*) = \underbrace{s^2}_{(1)} + \underbrace{s^2\left[\frac{1}{n} + \frac{(x_* - \bar{x})^2}{S_{xx}}\right]}_{(2)}.$$
  (1) Variability of the random error $\varepsilon_*$, reflected in the extra $s^2$.
  (2) Estimation of the true regression line at $x_*$: $\hat{\beta}_0$ and $\hat{\beta}_1$ are only estimates of $\beta_0$ and $\beta_1$ and are subject to sampling fluctuations.
  The variance of the prediction error is minimized when $x_* = \bar{x}$ and increases quadratically as $x_*$ moves away from $\bar{x}$.
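The quadratic growth of the interval's width away from $\bar{x}$ is visible in a short numerical sketch (made-up data; the $t$-quantile for df $= 3$ is hard-coded at roughly 3.182):

```python
import math

# Prediction-interval half-width as a function of x_*: narrowest at
# x_* = x_bar, and wider the farther x_* is from x_bar.
x = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up illustrative data
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

S_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / S_xx
b0 = y_bar - b1 * x_bar
s2 = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)

def pred_interval(x_star, t_q=3.182):
    """Approximate 95% prediction interval for y_* at x_star."""
    y_hat_star = b0 + b1 * x_star
    half = t_q * math.sqrt(s2 * (1 + 1 / n + (x_star - x_bar) ** 2 / S_xx))
    return (y_hat_star - half, y_hat_star + half)

lo, hi = pred_interval(x_bar)         # narrowest interval, at x_* = x_bar
lo_far, hi_far = pred_interval(10.0)  # much wider far from x_bar
assert (hi - lo) < (hi_far - lo_far)
```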
Estimated variances With σ2 rarr s2 (MSE)
Var(β0) = s2
(1
n+
x2
Sxx
)and Var(β1) =
s2
Sxx
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash19
bull t-test
Hypotheses H0 βj = d vs Ha
βj 6= d
βj gt d
βj lt d
Important special case d = 0 (ie to test if x is useful)
t-statistic
t(βj) =LSEminus hypothesized value
standard error of LSE=
βj minus dSE(βj)
Null distribution t(βj)H0sim tnminus2
Decision rules and p-value
Ha Decision Rule p-value (t is the observed value of t(βj))
βj 6= d |t(βj)| gt tnminus2α2 P(|tnminus2| gt |t|) = 2P(tnminus2 gt |t|)βj gt d t(βj) gt tnminus2α P(tnminus2 gt t)
βj lt d t(βj) lt minustnminus2α P(tnminus2 lt t)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash20
bull Confidence intervals (CIs) for β0 and β1 General structure is
LSEplusmn t-quantiletimes Standard error = βj plusmn tnminus2α2 times SE(βj)
Eg β1 plusmn tnminus2α2 times SE(β1) is the CI for β1
Construction requires formulas of SE(β0) and SE(β1)
bull Relationship between F -test and t-test for H0 β1 = 0
Direct connection between test statistics
F = t(β1)2
Importance Connect
information about β1 (captured by t(β1))
with
information about the whole model (captured by F )
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash21
15 Prediction
bull Target (random variable)
ylowast = β0 + β1xlowast + εlowast
where xlowast is explanatory variable value of interest
bull Generic setting
response known values of explanatory variables
y x
y1 x1
observed
(past) data
y2 x2
yn xn
Unobserved
(future) data
ylowast (target) larr xlowast
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash22
bull Point predictorylowast = β0 + β1xlowast
(Mnemonic Set β0 rarr β0 β1 rarr β1 and εlowast rarr 0 same trick as fittedvalues)
bull 100(1minusα) prediction interval
point predictorplusmn t-quantiletimes st error of prediction error
= ylowast plusmn tnminus2α2 times SE(ylowast minus ylowast)
= (β0 + β1xlowast)plusmn tnminus2α2
radics2
[1 +
1
n+
(xlowast minus x)2
Sxx
](Suggestion Remember this formula)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash23
bull Remarks on the structure of the prediction interval Two sourcesof uncertainty associated with prediction
Var(ylowast minus ylowast) = s2︸ ︷︷ ︸1copy
+ s2
[1
n+
(xlowast minus x)2
Sxx
]︸ ︷︷ ︸
2copy
1copy Variability of the random error εlowast Reflected in the extra s2
2copy Estimation of the true regression line at xlowast
β0 and β1 are only estimates of β0 and β1 and are subject tosampling fluctuations
Variance of prediction error minimized when xlowast = x and in-creases quadratically as xlowast moves away from x
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1–5

• Model assumptions:

  A1. The $y_i$'s are realizations of random variables, while the $x_i$'s are nonrandom.

  A2. $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$ are independent with

      $E[\varepsilon_i] = 0$ and $Var(\varepsilon_i) = \sigma^2$ for all $i = 1, 2, \ldots, n$.

      Almost always we further assume that the $\varepsilon_i$'s are normally distributed, i.e., $\varepsilon_i \overset{iid}{\sim} N(0, \sigma^2)$.

ACTEX Learning © 2019    CHAPTER 1. SIMPLE LINEAR REGRESSION
1–6

1.2 Model Fitting by Least Squares Method

• Idea of the least squares method: Choose $\hat{\beta}_0$ and $\hat{\beta}_1$ to make the sum of squares

  $SS(\beta_0, \beta_1) = \sum_{i=1}^n \big[\underbrace{y_i}_{\text{obs. value}} - \big(\underbrace{\beta_0 + \beta_1 x_i}_{\text{candidate fitted value}}\big)\big]^2$

  the "least."

• Least squares estimates (LSEs):

  $\hat{\beta}_1 = \dfrac{S_{xy}}{S_{xx}} = \dfrac{\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^n (x_i - \bar{x})^2}$  and  $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$,

  where

  $S_{xy} = \sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^n x_i y_i - n\bar{x}\bar{y}$

  $S_{xx} = \sum_{i=1}^n (x_i - \bar{x})^2 = \sum_{i=1}^n x_i^2 - n\bar{x}^2$

  (Suggestion: Remember the formulas for $\hat{\beta}_0$ and $\hat{\beta}_1$.)
1–7

• How can the calculation of LSEs be tested?

  Case 1: Given the raw data $\{(x_i, y_i)\}_{i=1}^n$ with a relatively small $n$ (e.g., $n \le 10$):

    Enter the data into your financial calculator and read the output from its statistics functions.

  Case 2: Given summarized data in the form of various sums, e.g.,

    $\sum_{i=1}^n x_i$, $\sum_{i=1}^n y_i$, $\sum_{i=1}^n x_i^2$, $\sum_{i=1}^n y_i^2$, $\sum_{i=1}^n x_i y_i$:

    Expand the products in the two sums that appear in $\hat{\beta}_1$ and use the alternative form

    $\hat{\beta}_1 = \dfrac{\sum_{i=1}^n x_i y_i - n\bar{x}\bar{y}}{\sum_{i=1}^n x_i^2 - n\bar{x}^2}$.
1–8

• An alternative formula for $\hat{\beta}_1$ in terms of the sample correlation:

  $\hat{\beta}_1 = r \times \dfrac{s_y}{s_x}$  (Warning: Not $r \times \dfrac{s_x}{s_y}$!)

  where

    $s_x$ and $s_y$ are the sample standard deviations of $x$ and $y$;

    $r$ is the sample correlation coefficient between $x$ and $y$.

• Application of this formula: The slope estimates when regressing $y$ on $x$ and when regressing $x$ on $y$ are related via

  $\hat{\beta}_1^{(y \sim x)} \times \hat{\beta}_1^{(x \sim y)} = r^2 = \underbrace{R^2}_{\text{see Sect. 1.3}}$
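Both slope formulas can be checked numerically. Below is a minimal Python sketch on a small hypothetical data set (the numbers are illustrative, not from the manual): it computes $\hat{\beta}_1$ from summarized sums as in Case 2, and again as $r \times s_y/s_x$.

```python
import math

# Hypothetical data set, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]
n = len(x)

# Summarized sums, as they might be supplied on the exam
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_xx = sum(xi * xi for xi in x)
sum_yy = sum(yi * yi for yi in y)
xbar, ybar = sum(x) / n, sum(y) / n

Sxx = sum_xx - n * xbar ** 2        # = sum of (x_i - xbar)^2
Syy = sum_yy - n * ybar ** 2
Sxy = sum_xy - n * xbar * ybar      # = sum of (x_i - xbar)(y_i - ybar)

b1 = Sxy / Sxx                      # beta1-hat = Sxy / Sxx
b0 = ybar - b1 * xbar               # beta0-hat = ybar - beta1-hat * xbar

# Alternative slope formula: r * (s_y / s_x)
r = Sxy / math.sqrt(Sxx * Syy)
sx, sy = math.sqrt(Sxx / (n - 1)), math.sqrt(Syy / (n - 1))
b1_alt = r * sy / sx

print(round(b1, 4), round(b0, 4))   # 0.9 1.3
```

On this data the two slope formulas agree exactly, as they must, since $r \, s_y/s_x = S_{xy}/S_{xx}$ algebraically.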
1–9

• Fitted values and residuals: Given $\hat{\beta}_0$ and $\hat{\beta}_1$, we can compute:

  1. The fitted value (a.k.a. predicted value) $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$.

     Mnemonic: Obtained from the model equation by $\beta_0 \to \hat{\beta}_0$, $\beta_1 \to \hat{\beta}_1$, $\varepsilon_i \to 0$.

     Ideally, $\hat{y}_i$ should be close to $y_i$.

  2. The residual $e_i = y_i - \hat{y}_i$. (Memory alert: Not $\hat{y}_i - y_i$!) It is completely different from $\varepsilon_i$, which is unobservable and which $e_i$ serves to approximate.
1–10

• Graphical illustration of the fitted regression line and the definitions of fitted value and residual:

  [Figure: scatterplot with the fitted regression line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$, which has intercept $\hat{\beta}_0$ and slope $\hat{\beta}_1$; the vertical distance $y_i - \hat{y}_i = e_i$ between an observed point $(x_i, y_i)$ and the corresponding point $(x_i, \hat{y}_i)$ on the line is the residual.]
1–11

• Sum-to-zero constraints on residuals:

  1. $\sum_{i=1}^n e_i = 0$, provided that $\beta_0$ is included in the model.

     Meaning: The residuals offset one another to produce a zero sum; they are negatively correlated.

  2. $\sum_{i=1}^n x_i e_i = 0$.

     Meaning: The residuals and the explanatory variable values are uncorrelated.

  Mnemonic: $\hat{\beta}_0$ and $\hat{\beta}_1$ satisfy

  $\dfrac{\partial}{\partial \beta_0} SS(\hat{\beta}_0, \hat{\beta}_1) = -2 \sum_{i=1}^n \overbrace{[y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)]}^{e_i} = 0$

  $\dfrac{\partial}{\partial \beta_1} SS(\hat{\beta}_0, \hat{\beta}_1) = -2 \sum_{i=1}^n x_i \underbrace{[y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)]}_{e_i} = 0$
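Both constraints are easy to verify numerically. A minimal Python sketch on a small hypothetical data set (illustrative values only):

```python
# Hypothetical data; fit the SLR model by least squares, then check that
# the residuals satisfy the two sum-to-zero constraints.
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = Sxy / Sxx
b0 = ybar - b1 * xbar

# Residuals e_i = y_i - yhat_i
e = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

sum_e = sum(e)                                   # constraint 1: equals 0
sum_xe = sum(xi * ei for xi, ei in zip(x, e))    # constraint 2: equals 0
print(abs(sum_e) < 1e-9, abs(sum_xe) < 1e-9)     # True True
```

Any data set fitted with an intercept will pass both checks (up to floating-point rounding), because they are exactly the normal equations above.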
1–12

1.3 Assessing Goodness of Fit of the Model

• Three kinds of sums of squares:

  Sum of Squares   Abbrev.   Definition                        What Does It Measure?
  --------------   -------   ------------------------------    ---------------------------------------------
  Total SS         TSS       Variation of response values      Amount of variability inherent in the
                             about $\bar{y}$                   response prior to performing regression
  Residual SS      RSS       Variation of response values      • Goodness of fit of the SLR model
  (or Error SS)              about the fitted regression         (the lower, the better)
                             line                              • Amount of variability of the response left
                                                                 unexplained even after the introduction of $x$
  Regression SS    Reg SS    Variation explained by SLR        How effective the SLR model is in
                             (or the knowledge of $x$)         explaining the variation in $y$
1–13

• ANOVA identity:

  $\underbrace{\sum_{i=1}^n (y_i - \bar{y})^2}_{\text{TSS}} = \underbrace{\sum_{i=1}^n (y_i - \hat{y}_i)^2}_{\text{RSS}} + \underbrace{\sum_{i=1}^n (\hat{y}_i - \bar{y})^2}_{\text{Reg SS}}$

• Coefficient of determination:

  Definition: $R^2 = \dfrac{\text{Reg SS}}{\text{TSS}} = 1 - \dfrac{\text{RSS}}{\text{TSS}}$

  It measures the proportion of variation of the response (about its mean) explained by the SLR model.

  The higher, the better.

• Specialized formulas for Reg SS and $R^2$ under SLR:

  $\text{Reg SS} = \hat{\beta}_1^2 S_{xx}$

  $R^2 = r^2 = \mathrm{Corr}(x, y)^2$ (the square of the correlation between $x$ and $y$)
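The ANOVA identity and the specialized SLR formulas can be verified directly. A Python sketch on a small hypothetical data set (illustrative values only):

```python
import math

# Hypothetical data; verify TSS = RSS + Reg SS, Reg SS = b1^2 * Sxx, R^2 = r^2.
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
Sxx = sum((xi - xbar) ** 2 for xi in x)
Syy = sum((yi - ybar) ** 2 for yi in y)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = Sxy / Sxx
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * xi for xi in x]

TSS = Syy                                              # sum (y_i - ybar)^2
RSS = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))   # sum (y_i - yhat_i)^2
RegSS = sum((yh - ybar) ** 2 for yh in yhat)           # sum (yhat_i - ybar)^2

R2 = RegSS / TSS
r = Sxy / math.sqrt(Sxx * Syy)
print(round(TSS, 4), round(RSS, 4), round(RegSS, 4), round(R2, 4))
```

On this data the three sums of squares partition exactly, and $R^2 = r^2 = 0.81$, so 81% of the variation in $y$ is explained by the fitted line.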
1–14

• ANOVA table:

  Source       Sum of Squares   df      Mean Square            F-value
  ----------   --------------   -----   --------------------   ---------------------------
  Regression   Reg SS           1       Reg SS / 1             (Reg SS/1) / [RSS/(n − 2)]
  Error        RSS              n − 2   s² = RSS/(n − 2)
  Total        TSS              n − 1

  Structure:

    Different sources of variation in $y$.

    Some "informal" rules for counting df:

      – Reg SS has 1 df because of one explanatory variable.

      – RSS has 2 df subtracted from $n$ because of two parameters, $\beta_0$ and $\beta_1$.

    Dividing an SS by its df results in a mean square (MS).
1–15

• Mean square error:

  $s^2 = \dfrac{\text{RSS}}{\text{df of RSS}} = \dfrac{\sum_{i=1}^n e_i^2}{n - 2}$

  $s = \sqrt{s^2}$ is the residual standard deviation (or residual standard error).

• F-test:

  Hypotheses: $\underbrace{H_0: \beta_1 = 0}_{\text{i.i.d. model}}$ vs. $\underbrace{H_a: \beta_1 \neq 0}_{\text{SLR model}}$, i.e., a test of the significance/usefulness of $x$ in explaining $y$.

  F-statistic:

  $F = \dfrac{\text{Reg SS}/(\text{df of Reg SS})}{\text{RSS}/(\text{df of RSS})} = \dfrac{\text{Reg SS}/1}{\text{RSS}/(n-2)}$

  Behavior of the F-statistic:

    – Under $H_0$: Its expected value is close to one.

    – Under $H_a$: It tends to be large.
1–16

• F-test (Cont.):

  Going between the F-statistic and $R^2$:

  $F = (n-2)\left(\dfrac{\text{Reg SS}/\text{TSS}}{\text{RSS}/\text{TSS}}\right) = (n-2)\left(\dfrac{R^2}{1 - R^2}\right)$

  (Mnemonic: Divide both the numerator and denominator of the F-statistic by TSS to get $R^2$.)
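The two routes to the F-statistic (from the ANOVA table, and from $R^2$) can be checked against each other. A Python sketch on a small hypothetical data set (illustrative values only):

```python
# Hypothetical data; compute F from the ANOVA table and from R^2.
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = Sxy / Sxx
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * xi for xi in x]

TSS = sum((yi - ybar) ** 2 for yi in y)
RSS = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))
RegSS = TSS - RSS                    # ANOVA identity
s2 = RSS / (n - 2)                   # MSE

F_anova = (RegSS / 1) / s2           # (Reg SS / df) / (RSS / df)
R2 = RegSS / TSS
F_from_R2 = (n - 2) * R2 / (1 - R2)  # the R^2 shortcut
print(round(F_anova, 4))             # approx. 12.7895, equal to F_from_R2
```

Both routes give the same value, as the mnemonic above predicts.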
1–17

1.4 Statistical Inference about $\beta_0$ and $\beta_1$

• Sampling distributions of $\hat{\beta}_0$ and $\hat{\beta}_1$:

  Linear combination formulas:

  $\hat{\beta}_1 = \sum_{i=1}^n w_i y_i$, where $w_i = \dfrac{x_i - \bar{x}}{S_{xx}}$

  $\hat{\beta}_0 = \sum_{i=1}^n w_{i0} y_i$, where $w_{i0} = \dfrac{1}{n} - \bar{x} w_i$

  (Suggestion: Remembering these weights is recommended, but not absolutely essential.)
1–18

• Sampling distributions of $\hat{\beta}_0$ and $\hat{\beta}_1$ (Cont.):

  Unbiasedness: $E[\hat{\beta}_j] = \beta_j$ for $j = 0, 1$.

  Variances:

  $Var(\hat{\beta}_0) = \sigma^2 \left(\dfrac{1}{n} + \dfrac{\bar{x}^2}{S_{xx}}\right)$  and  $Var(\hat{\beta}_1) = \dfrac{\sigma^2}{S_{xx}}$

  (Suggestion: Remember these two formulas.)

  Estimated variances: With $\sigma^2 \to s^2$ (the MSE),

  $\widehat{Var}(\hat{\beta}_0) = s^2 \left(\dfrac{1}{n} + \dfrac{\bar{x}^2}{S_{xx}}\right)$  and  $\widehat{Var}(\hat{\beta}_1) = \dfrac{s^2}{S_{xx}}$
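A sketch of the estimated standard errors, i.e., the square roots of the estimated variances above, on a small hypothetical data set (illustrative values only):

```python
import math

# Hypothetical data; estimated standard errors of the LSEs.
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = Sxy / Sxx
b0 = ybar - b1 * xbar
RSS = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s2 = RSS / (n - 2)                                   # MSE, estimates sigma^2

se_b1 = math.sqrt(s2 / Sxx)                          # SE(beta1-hat)
se_b0 = math.sqrt(s2 * (1 / n + xbar ** 2 / Sxx))    # SE(beta0-hat)
print(round(se_b0, 4), round(se_b1, 4))              # approx. 0.8347 0.2517
```

These standard errors are exactly what the t-tests and confidence intervals on the following cards require.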
1–19

• t-test:

  Hypotheses: $H_0: \beta_j = d$ vs. $H_a: \beta_j \neq d$, $H_a: \beta_j > d$, or $H_a: \beta_j < d$.

  Important special case: $d = 0$ (i.e., to test whether $x$ is useful).

  t-statistic:

  $t(\hat{\beta}_j) = \dfrac{\text{LSE} - \text{hypothesized value}}{\text{standard error of LSE}} = \dfrac{\hat{\beta}_j - d}{SE(\hat{\beta}_j)}$

  Null distribution: $t(\hat{\beta}_j) \overset{H_0}{\sim} t_{n-2}$

  Decision rules and p-values ($t$ is the observed value of $t(\hat{\beta}_j)$):

  $H_a$               Decision Rule                              p-value
  $\beta_j \neq d$    $|t(\hat{\beta}_j)| > t_{n-2, \alpha/2}$   $P(|t_{n-2}| > |t|) = 2P(t_{n-2} > |t|)$
  $\beta_j > d$       $t(\hat{\beta}_j) > t_{n-2, \alpha}$       $P(t_{n-2} > t)$
  $\beta_j < d$       $t(\hat{\beta}_j) < -t_{n-2, \alpha}$      $P(t_{n-2} < t)$
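A worked sketch of the two-sided t-test of $H_0: \beta_1 = 0$ at $\alpha = 0.05$ on a small hypothetical data set: here $n = 5$, so there are 3 df, and the critical value $t_{3,\,0.025} = 3.182$ is taken from a standard t-table.

```python
import math

# Hypothetical data; two-sided t-test of H0: beta1 = 0 at the 5% level.
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = Sxy / Sxx
b0 = ybar - b1 * xbar
RSS = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s2 = RSS / (n - 2)
se_b1 = math.sqrt(s2 / Sxx)

d = 0                              # hypothesized value
t_stat = (b1 - d) / se_b1          # (LSE - hypothesized value) / SE
t_crit = 3.182                     # t_{n-2, alpha/2} = t_{3, 0.025}, from a t-table
reject = abs(t_stat) > t_crit      # two-sided decision rule
print(round(t_stat, 3), reject)    # 3.576 True
```

Since $|t| = 3.576 > 3.182$, $H_0$ is rejected at the 5% level: on this (made-up) data, $x$ is a significant predictor of $y$.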
1–20

• Confidence intervals (CIs) for $\beta_0$ and $\beta_1$: The general structure is

  LSE ± t-quantile × standard error $= \hat{\beta}_j \pm t_{n-2, \alpha/2} \times SE(\hat{\beta}_j)$.

  E.g., $\hat{\beta}_1 \pm t_{n-2, \alpha/2} \times SE(\hat{\beta}_1)$ is the CI for $\beta_1$.

  Construction requires the formulas for $SE(\hat{\beta}_0)$ and $SE(\hat{\beta}_1)$.

• Relationship between the F-test and the t-test for $H_0: \beta_1 = 0$:

  Direct connection between the test statistics:

  $F = t(\hat{\beta}_1)^2$

  Importance: It connects information about $\beta_1$ (captured by $t(\hat{\beta}_1)$) with information about the whole model (captured by $F$).
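A sketch of the 95% CI for $\beta_1$ on a small hypothetical data set, together with a numerical check of the $F = t(\hat{\beta}_1)^2$ identity ($n = 5$, so the quantile $t_{3,\,0.025} = 3.182$ comes from a t-table):

```python
import math

# Hypothetical data; 95% CI for beta1 and the F = t^2 identity.
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = Sxy / Sxx
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * xi for xi in x]
RSS = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))
RegSS = sum((yh - ybar) ** 2 for yh in yhat)
s2 = RSS / (n - 2)
se_b1 = math.sqrt(s2 / Sxx)

t_crit = 3.182                          # t_{3, 0.025}, from a t-table
lo, hi = b1 - t_crit * se_b1, b1 + t_crit * se_b1   # LSE +/- quantile * SE
print(round(lo, 3), round(hi, 3))       # 0.099 1.701

F = (RegSS / 1) / s2                    # F-statistic from the ANOVA table
t_b1 = b1 / se_b1                       # t-statistic for H0: beta1 = 0
print(abs(F - t_b1 ** 2) < 1e-9)        # True
```

Note that 0 lies outside the CI, which is consistent with rejecting $H_0: \beta_1 = 0$ at the 5% level.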
1–21

1.5 Prediction

• Target (a random variable):

  $y_* = \beta_0 + \beta_1 x_* + \varepsilon_*$,

  where $x_*$ is the explanatory variable value of interest.

• Generic setting:

  Observed (past) data: the response values $y_1, y_2, \ldots, y_n$ paired with the known explanatory variable values $x_1, x_2, \ldots, x_n$.

  Unobserved (future) data: the target $y_*$, to be predicted from the known value $x_*$.
1–22

• Point predictor:

  $\hat{y}_* = \hat{\beta}_0 + \hat{\beta}_1 x_*$

  (Mnemonic: Set $\beta_0 \to \hat{\beta}_0$, $\beta_1 \to \hat{\beta}_1$, and $\varepsilon_* \to 0$; same trick as for fitted values.)

• $100(1-\alpha)\%$ prediction interval:

  point predictor ± t-quantile × standard error of the prediction error

  $= \hat{y}_* \pm t_{n-2, \alpha/2} \times SE(y_* - \hat{y}_*)$

  $= (\hat{\beta}_0 + \hat{\beta}_1 x_*) \pm t_{n-2, \alpha/2} \sqrt{s^2 \left[1 + \dfrac{1}{n} + \dfrac{(x_* - \bar{x})^2}{S_{xx}}\right]}$

  (Suggestion: Remember this formula.)
1–23

• Remarks on the structure of the prediction interval: There are two sources of uncertainty associated with prediction:

  $\widehat{Var}(y_* - \hat{y}_*) = \underbrace{s^2}_{①} + \underbrace{s^2 \left[\dfrac{1}{n} + \dfrac{(x_* - \bar{x})^2}{S_{xx}}\right]}_{②}$

  ① Variability of the random error $\varepsilon_*$: reflected in the extra $s^2$.

  ② Estimation of the true regression line at $x_*$:

    $\hat{\beta}_0$ and $\hat{\beta}_1$ are only estimates of $\beta_0$ and $\beta_1$ and are subject to sampling fluctuations.

    The variance of the prediction error is minimized when $x_* = \bar{x}$ and increases quadratically as $x_*$ moves away from $\bar{x}$.
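Finally, a sketch of the 95% prediction interval at a new point on a small hypothetical data set ($n = 5$, target point $x_* = 6$ chosen for illustration, quantile $t_{3,\,0.025} = 3.182$ from a t-table):

```python
import math

# Hypothetical data; 95% prediction interval for a new response at x* = 6.
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = Sxy / Sxx
b0 = ybar - b1 * xbar
RSS = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s2 = RSS / (n - 2)

x_star = 6
y_star_hat = b0 + b1 * x_star                        # point predictor
# SE of the prediction error: the extra "1 +" term is the variability of eps*
se_pred = math.sqrt(s2 * (1 + 1 / n + (x_star - xbar) ** 2 / Sxx))
t_crit = 3.182                                       # t_{3, 0.025}, from a t-table
lo, hi = y_star_hat - t_crit * se_pred, y_star_hat + t_crit * se_pred
print(round(y_star_hat, 2), round(lo, 2), round(hi, 2))   # 6.7 3.03 10.37
```

The interval is widest here among $x_* \in \{1, \ldots, 6\}$ because $x_* = 6$ is the farthest of those points from $\bar{x} = 3$, matching the quadratic-growth remark above.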
1ndash6
12 Model Fitting by Least Squares Method
bull Idea of least squares method Choose β0 and β1 to make the sum ofsquares
SS(β0 β1) =
nsumi=1
[ yi︸ ︷︷ ︸obs value
minus ( β0 + β1xi︸ ︷︷ ︸candidate fitted value
)]2
the ldquoleastrdquo
bull Least squares estimates (LSEs)
β1 =SxySxx
=
sumni=1(xi minus x)(yi minus y)sumn
i=1(xi minus x)2and β0 = y minus β1x
where
Sxy =sumni=1(xi minus x)(yi minus y) =
sumni=1 xiyi minus nxy
Sxx =sumni=1(xi minus x)2 =
sumni=1 x
2i minus nx2
(Suggestion Remember the formulas for β0 and β1)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash7
bull How can the calculation of LSEs be tested
Case 1 Given the raw data (xi yi)ni=1 with a relatively small n(eg n le 10)
Enter the data into your financial calculator and read the out-put from its statistics functions
Case 2 Given summarized data in the form of various sums eg
nsumi=1
xi
nsumi=1
yi
nsumi=1
x2i
nsumi=1
y2i
nsumi=1
xiyi
Expand the products in the two sums that appear in β1 anduse the alternative form
β1 =
sumni=1 xiyi minus nxysumni=1 x
2i minus nx2
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash8
bull An alternative formula for β1 in terms of sample correlation
β1 = r times sysx
(Warning Not r times sx
sy
)where
sx and sy are the sample standard deviations of x and y
r is the sample correlation coefficient between x and y
bull Application of this formula Slope estimates when regressing y on xand regressing x on y are related via
βysimx1 times βxsimy1 = r2 = R2︸ ︷︷ ︸see Sect 13
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash9
bull Fitted values and residuals Given β0 and β1 we can compute
1 The fitted value (aka predicted value) yi = β0 + β1xi
Mnemonic Obtained from the model equation by
β0 rarr β0 β1 rarr β1 εi rarr 0
Ideally yi should be close to yi
2 The residual ei = yi minus yi Memory alert Not yi minus yi Completely different from εi which is unobservable and whichei serves to approximate
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash10
bull Graphical illustration of fitted regression line and definitions offitted value and residual
0
y
x
(xi yi)
(xi yi)
Slope = β1
fitted regression line
y = β0 + β1x
yi minus yi = ei
β0
xi
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash11
bull Sum-to-zero constraints on residuals
1sumni=1 ei = 0 provided that β0 is included in the model
Meaning The residuals offset one another to produce a zero sumthey are negatively correlated
2sumni=1 xiei = 0
Meaning The residuals and the explanatory variable values areuncorrelated
Mnemonic β0 and β1 satisfy
part
partβ0SS(β0 β1) = minus2
nsumi=1
[
ei︷ ︸︸ ︷yi minus (β0 + β1xi)] = 0
part
partβ1SS(β0 β1) = minus2
nsumi=1
xi[yi minus (β0 + β1xi)︸ ︷︷ ︸ei
] = 0
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash12
13 Assessing Goodness of Fit of the Model
bull Three kinds of sums of squares
Sum of
SquaresAbbrev Def What Does It Measure
Total SS TSS
Variation of
response values
about y
Amount of variability inher-
ent in the response prior to
performing regression
Residual SS
or
Error SS
RSS
Variation of
response values
about fitted
regression line
bull Goodness of fit of the SLRmodel (the lower the better)
bull Amount of variability of
response left unexplained
even after introduction of x
Regression SS Reg SS
Variation
explained by SLR
(or the knowledge
of x)
How effective SLR model is
in explaining the variation in
y
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash13
bull ANOVA identity
nsumi=1
(yi minus y)2
︸ ︷︷ ︸TSS
=
nsumi=1
(yi minus yi)2
︸ ︷︷ ︸RSS
+
nsumi=1
(yi minus y)2
︸ ︷︷ ︸Reg SS
bull Coefficient of determination
Definition R2 =Reg SS
TSS= 1minus RSS
TSS Measures the proportion of variation of response (about its mean)
explained by the SLR model
The higher the better
bull Specialized formulas for Reg SS and R2 under SLR
Reg SS = β21Sxx
R2 = r2 = Corr(x y)2 (square of correlation between x and y)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash14
bull ANOVA table
Source Sum of Squares df Mean Square F -value
Regression Reg SS 1 Reg SS1
Error RSS nminus 2 s2 = RSS(nminus 2)
Total TSS nminus 1
Structure
Different sources of variation in y
Some ldquoinformalrdquo rules for counting df
ndash Reg SS has 1 df because of one explanatory variable
ndash RSS has 2 df subtracted from n because of two parameters β0
and β1
Dividing an SS by its df results in a mean square (MS)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash15
bull Mean square error
s2 =RSS
df of RSS=
sumni=1 e
2i
nminus 2
s =radics2 is the residual standard deviation or residual standard error
bull F -test
HypothesesH0 β1 = 0︸ ︷︷ ︸
iid model
vs Ha β1 6= 0︸ ︷︷ ︸SLR model
a test of the significanceusefulness of x in explaining y
F -statistic
F =Reg SS(df of Reg SS)
RSS(df of RSS)=
Reg SS1
RSS(nminus 2)
Behavior of F -statistic
ndash H0 Expected value close to one
ndash Ha Tends to be large
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash16
bull F -test (Cont)
Going between F -statistic and R2
F = (nminus 2)
(Reg SSTSS
RSSTSS
)= (nminus 2)
(R2
1minusR2
)(Mnemonic Divide both the numerator and denominator of the F -statistic by TSS to get R2)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash17
14 Statistical Inference about β0 and β1
bull Sampling distributions of β0 and β1
Linear combination formulas
β1 =
nsumi=1
wiyi where wi =xi minus xSxx
β0 =
nsumi=1
wi0yi where wi0 =1
nminus xwi
(Suggestion Remembering these weights is recommended but notabsolutely essential)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash18
bull Sampling distributions of β0 and β1 (Cont)
Unbiased E[βj ] = βj for j = 0 1
Variances
Var(β0) = σ2
(1
n+
x2
Sxx
)and Var(β1) =
σ2
Sxx
(Suggestion Remember these two formulas)
Estimated variances With σ2 rarr s2 (MSE)
Var(β0) = s2
(1
n+
x2
Sxx
)and Var(β1) =
s2
Sxx
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash19
bull t-test
Hypotheses H0 βj = d vs Ha
βj 6= d
βj gt d
βj lt d
Important special case d = 0 (ie to test if x is useful)
t-statistic
t(βj) =LSEminus hypothesized value
standard error of LSE=
βj minus dSE(βj)
Null distribution t(βj)H0sim tnminus2
Decision rules and p-value
Ha Decision Rule p-value (t is the observed value of t(βj))
βj 6= d |t(βj)| gt tnminus2α2 P(|tnminus2| gt |t|) = 2P(tnminus2 gt |t|)βj gt d t(βj) gt tnminus2α P(tnminus2 gt t)
βj lt d t(βj) lt minustnminus2α P(tnminus2 lt t)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash20
bull Confidence intervals (CIs) for β0 and β1 General structure is
LSEplusmn t-quantiletimes Standard error = βj plusmn tnminus2α2 times SE(βj)
Eg β1 plusmn tnminus2α2 times SE(β1) is the CI for β1
Construction requires formulas of SE(β0) and SE(β1)
bull Relationship between F -test and t-test for H0 β1 = 0
Direct connection between test statistics
F = t(β1)2
Importance Connect
information about β1 (captured by t(β1))
with
information about the whole model (captured by F )
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash21
15 Prediction
bull Target (random variable)
ylowast = β0 + β1xlowast + εlowast
where xlowast is explanatory variable value of interest
bull Generic setting
response known values of explanatory variables
y x
y1 x1
observed
(past) data
y2 x2
yn xn
Unobserved
(future) data
ylowast (target) larr xlowast
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash22
bull Point predictorylowast = β0 + β1xlowast
(Mnemonic Set β0 rarr β0 β1 rarr β1 and εlowast rarr 0 same trick as fittedvalues)
bull 100(1minusα) prediction interval
point predictorplusmn t-quantiletimes st error of prediction error
= ylowast plusmn tnminus2α2 times SE(ylowast minus ylowast)
= (β0 + β1xlowast)plusmn tnminus2α2
radics2
[1 +
1
n+
(xlowast minus x)2
Sxx
](Suggestion Remember this formula)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash23
bull Remarks on the structure of the prediction interval Two sourcesof uncertainty associated with prediction
Var(ylowast minus ylowast) = s2︸ ︷︷ ︸1copy
+ s2
[1
n+
(xlowast minus x)2
Sxx
]︸ ︷︷ ︸
2copy
1copy Variability of the random error εlowast Reflected in the extra s2
2copy Estimation of the true regression line at xlowast
β0 and β1 are only estimates of β0 and β1 and are subject tosampling fluctuations
Variance of prediction error minimized when xlowast = x and in-creases quadratically as xlowast moves away from x
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash7
bull How can the calculation of LSEs be tested
Case 1 Given the raw data (xi yi)ni=1 with a relatively small n(eg n le 10)
Enter the data into your financial calculator and read the out-put from its statistics functions
Case 2 Given summarized data in the form of various sums eg
nsumi=1
xi
nsumi=1
yi
nsumi=1
x2i
nsumi=1
y2i
nsumi=1
xiyi
Expand the products in the two sums that appear in β1 anduse the alternative form
β1 =
sumni=1 xiyi minus nxysumni=1 x
2i minus nx2
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash8
bull An alternative formula for β1 in terms of sample correlation
β1 = r times sysx
(Warning Not r times sx
sy
)where
sx and sy are the sample standard deviations of x and y
r is the sample correlation coefficient between x and y
bull Application of this formula Slope estimates when regressing y on xand regressing x on y are related via
βysimx1 times βxsimy1 = r2 = R2︸ ︷︷ ︸see Sect 13
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash9
bull Fitted values and residuals Given β0 and β1 we can compute
1 The fitted value (aka predicted value) yi = β0 + β1xi
Mnemonic Obtained from the model equation by
β0 rarr β0 β1 rarr β1 εi rarr 0
Ideally yi should be close to yi
2 The residual ei = yi minus yi Memory alert Not yi minus yi Completely different from εi which is unobservable and whichei serves to approximate
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash10
bull Graphical illustration of fitted regression line and definitions offitted value and residual
0
y
x
(xi yi)
(xi yi)
Slope = β1
fitted regression line
y = β0 + β1x
yi minus yi = ei
β0
xi
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash11
bull Sum-to-zero constraints on residuals
1sumni=1 ei = 0 provided that β0 is included in the model
Meaning The residuals offset one another to produce a zero sumthey are negatively correlated
2sumni=1 xiei = 0
Meaning The residuals and the explanatory variable values areuncorrelated
Mnemonic β0 and β1 satisfy
part
partβ0SS(β0 β1) = minus2
nsumi=1
[
ei︷ ︸︸ ︷yi minus (β0 + β1xi)] = 0
part
partβ1SS(β0 β1) = minus2
nsumi=1
xi[yi minus (β0 + β1xi)︸ ︷︷ ︸ei
] = 0
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash12
13 Assessing Goodness of Fit of the Model
bull Three kinds of sums of squares
Sum of
SquaresAbbrev Def What Does It Measure
Total SS TSS
Variation of
response values
about y
Amount of variability inher-
ent in the response prior to
performing regression
Residual SS
or
Error SS
RSS
Variation of
response values
about fitted
regression line
bull Goodness of fit of the SLRmodel (the lower the better)
bull Amount of variability of
response left unexplained
even after introduction of x
Regression SS Reg SS
Variation
explained by SLR
(or the knowledge
of x)
How effective SLR model is
in explaining the variation in
y
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash13
bull ANOVA identity
nsumi=1
(yi minus y)2
︸ ︷︷ ︸TSS
=
nsumi=1
(yi minus yi)2
︸ ︷︷ ︸RSS
+
nsumi=1
(yi minus y)2
︸ ︷︷ ︸Reg SS
bull Coefficient of determination
Definition R2 =Reg SS
TSS= 1minus RSS
TSS Measures the proportion of variation of response (about its mean)
explained by the SLR model
The higher the better
bull Specialized formulas for Reg SS and R2 under SLR
Reg SS = β21Sxx
R2 = r2 = Corr(x y)2 (square of correlation between x and y)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash14
bull ANOVA table
Source Sum of Squares df Mean Square F -value
Regression Reg SS 1 Reg SS1
Error RSS nminus 2 s2 = RSS(nminus 2)
Total TSS nminus 1
Structure
Different sources of variation in y
Some ldquoinformalrdquo rules for counting df
ndash Reg SS has 1 df because of one explanatory variable
ndash RSS has 2 df subtracted from n because of two parameters β0
and β1
Dividing an SS by its df results in a mean square (MS)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash15
bull Mean square error
s2 =RSS
df of RSS=
sumni=1 e
2i
nminus 2
s =radics2 is the residual standard deviation or residual standard error
bull F -test
HypothesesH0 β1 = 0︸ ︷︷ ︸
iid model
vs Ha β1 6= 0︸ ︷︷ ︸SLR model
a test of the significanceusefulness of x in explaining y
F -statistic
F =Reg SS(df of Reg SS)
RSS(df of RSS)=
Reg SS1
RSS(nminus 2)
Behavior of F -statistic
ndash H0 Expected value close to one
ndash Ha Tends to be large
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash16
bull F -test (Cont)
Going between F -statistic and R2
F = (nminus 2)
(Reg SSTSS
RSSTSS
)= (nminus 2)
(R2
1minusR2
)(Mnemonic Divide both the numerator and denominator of the F -statistic by TSS to get R2)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash17
14 Statistical Inference about β0 and β1
bull Sampling distributions of β0 and β1
Linear combination formulas
β1 =
nsumi=1
wiyi where wi =xi minus xSxx
β0 =
nsumi=1
wi0yi where wi0 =1
nminus xwi
(Suggestion Remembering these weights is recommended but notabsolutely essential)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash18
bull Sampling distributions of β0 and β1 (Cont)
Unbiased E[βj ] = βj for j = 0 1
Variances
Var(β0) = σ2
(1
n+
x2
Sxx
)and Var(β1) =
σ2
Sxx
(Suggestion Remember these two formulas)
Estimated variances With σ2 rarr s2 (MSE)
Var(β0) = s2
(1
n+
x2
Sxx
)and Var(β1) =
s2
Sxx
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash19
bull t-test
Hypotheses H0 βj = d vs Ha
βj 6= d
βj gt d
βj lt d
Important special case d = 0 (ie to test if x is useful)
t-statistic
t(βj) =LSEminus hypothesized value
standard error of LSE=
βj minus dSE(βj)
Null distribution t(βj)H0sim tnminus2
Decision rules and p-value
Ha Decision Rule p-value (t is the observed value of t(βj))
βj 6= d |t(βj)| gt tnminus2α2 P(|tnminus2| gt |t|) = 2P(tnminus2 gt |t|)βj gt d t(βj) gt tnminus2α P(tnminus2 gt t)
βj lt d t(βj) lt minustnminus2α P(tnminus2 lt t)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash20
bull Confidence intervals (CIs) for β0 and β1 General structure is
LSEplusmn t-quantiletimes Standard error = βj plusmn tnminus2α2 times SE(βj)
Eg β1 plusmn tnminus2α2 times SE(β1) is the CI for β1
Construction requires formulas of SE(β0) and SE(β1)
bull Relationship between F -test and t-test for H0 β1 = 0
Direct connection between test statistics
F = t(β1)2
Importance Connect
information about β1 (captured by t(β1))
with
information about the whole model (captured by F )
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash21
15 Prediction
bull Target (random variable)
ylowast = β0 + β1xlowast + εlowast
where xlowast is explanatory variable value of interest
bull Generic setting
response known values of explanatory variables
y x
y1 x1
observed
(past) data
y2 x2
yn xn
Unobserved
(future) data
ylowast (target) larr xlowast
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash22
bull Point predictorylowast = β0 + β1xlowast
(Mnemonic Set β0 rarr β0 β1 rarr β1 and εlowast rarr 0 same trick as fittedvalues)
bull 100(1minusα) prediction interval
point predictorplusmn t-quantiletimes st error of prediction error
= ylowast plusmn tnminus2α2 times SE(ylowast minus ylowast)
= (β0 + β1xlowast)plusmn tnminus2α2
radics2
[1 +
1
n+
(xlowast minus x)2
Sxx
](Suggestion Remember this formula)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash23
bull Remarks on the structure of the prediction interval Two sourcesof uncertainty associated with prediction
Var(ylowast minus ylowast) = s2︸ ︷︷ ︸1copy
+ s2
[1
n+
(xlowast minus x)2
Sxx
]︸ ︷︷ ︸
2copy
1copy Variability of the random error εlowast Reflected in the extra s2
2copy Estimation of the true regression line at xlowast
β0 and β1 are only estimates of β0 and β1 and are subject tosampling fluctuations
Variance of prediction error minimized when xlowast = x and in-creases quadratically as xlowast moves away from x
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash8
bull An alternative formula for β1 in terms of sample correlation
β1 = r times sysx
(Warning Not r times sx
sy
)where
sx and sy are the sample standard deviations of x and y
r is the sample correlation coefficient between x and y
bull Application of this formula Slope estimates when regressing y on xand regressing x on y are related via
βysimx1 times βxsimy1 = r2 = R2︸ ︷︷ ︸see Sect 13
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash9
bull Fitted values and residuals Given β0 and β1 we can compute
1 The fitted value (aka predicted value) yi = β0 + β1xi
Mnemonic Obtained from the model equation by
β0 rarr β0 β1 rarr β1 εi rarr 0
Ideally yi should be close to yi
2 The residual ei = yi minus yi Memory alert Not yi minus yi Completely different from εi which is unobservable and whichei serves to approximate
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash10
bull Graphical illustration of fitted regression line and definitions offitted value and residual
0
y
x
(xi yi)
(xi yi)
Slope = β1
fitted regression line
y = β0 + β1x
yi minus yi = ei
β0
xi
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash11
bull Sum-to-zero constraints on residuals
1sumni=1 ei = 0 provided that β0 is included in the model
Meaning The residuals offset one another to produce a zero sumthey are negatively correlated
2sumni=1 xiei = 0
Meaning The residuals and the explanatory variable values areuncorrelated
Mnemonic β0 and β1 satisfy
part
partβ0SS(β0 β1) = minus2
nsumi=1
[
ei︷ ︸︸ ︷yi minus (β0 + β1xi)] = 0
part
partβ1SS(β0 β1) = minus2
nsumi=1
xi[yi minus (β0 + β1xi)︸ ︷︷ ︸ei
] = 0
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash12
1.3 Assessing Goodness of Fit of the Model

• Three kinds of sums of squares:

  – Total SS (TSS): the variation of the response values about ȳ. Measures the amount of variability inherent in the response prior to performing regression.

  – Residual SS, or Error SS (RSS): the variation of the response values about the fitted regression line. Measures the goodness of fit of the SLR model (the lower, the better), i.e., the amount of variability of the response left unexplained even after the introduction of x.

  – Regression SS (Reg SS): the variation explained by the SLR model (or by the knowledge of x). Measures how effective the SLR model is in explaining the variation in y.
1–13
• ANOVA identity:

      ∑_{i=1}^n (yi − ȳ)²  =  ∑_{i=1}^n (yi − ŷi)²  +  ∑_{i=1}^n (ŷi − ȳ)²
            TSS                     RSS                     Reg SS

• Coefficient of determination:

  – Definition: R² = Reg SS / TSS = 1 − RSS / TSS.
  – Measures the proportion of variation of the response (about its mean) explained by the SLR model.
  – The higher, the better.

• Specialized formulas for Reg SS and R² under SLR:

      Reg SS = β̂1² Sxx
      R² = r² = [Corr(x, y)]²        (square of the correlation between x and y)
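The ANOVA identity and both specialized formulas can be confirmed numerically. A sketch with made-up toy data (values and names are illustrative):

```python
import math

# Hypothetical toy data for illustration
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = Sxy / Sxx
b0 = ybar - b1 * xbar
y_hat = [b0 + b1 * xi for xi in x]

TSS = sum((yi - ybar) ** 2 for yi in y)                      # total SS
RSS = sum((yi - yhi) ** 2 for yi, yhi in zip(y, y_hat))      # residual SS
RegSS = sum((yhi - ybar) ** 2 for yhi in y_hat)              # regression SS

r = Sxy / math.sqrt(Sxx * TSS)   # sample correlation (TSS = Syy here)
R2 = RegSS / TSS

assert math.isclose(TSS, RSS + RegSS)      # ANOVA identity
assert math.isclose(RegSS, b1 ** 2 * Sxx)  # Reg SS = b1^2 * Sxx
assert math.isclose(R2, r ** 2)            # R^2 = r^2 under SLR
```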
1–14
• ANOVA table:

      Source      | Sum of Squares | df    | Mean Square       | F-value
      Regression  | Reg SS         | 1     | Reg SS / 1        | F
      Error       | RSS            | n − 2 | s² = RSS/(n − 2)  |
      Total       | TSS            | n − 1 |                   |

• Structure:

  – Different sources of variation in y.
  – Some "informal" rules for counting df:
      Reg SS has 1 df because of the one explanatory variable.
      RSS has 2 df subtracted from n because of the two parameters, β0 and β1.
  – Dividing an SS by its df results in a mean square (MS).
1–15
• Mean square error:

      s² = RSS / (df of RSS) = ∑_{i=1}^n ei² / (n − 2)

  s = √s² is the residual standard deviation, or residual standard error.

• F-test:

  – Hypotheses: H0: β1 = 0 (iid model) vs. Ha: β1 ≠ 0 (SLR model),
    i.e., a test of the significance/usefulness of x in explaining y.

  – F-statistic:

      F = [Reg SS / (df of Reg SS)] / [RSS / (df of RSS)] = (Reg SS / 1) / [RSS / (n − 2)]

  – Behavior of the F-statistic:
      Under H0: expected value close to one.
      Under Ha: tends to be large.
1–16
• F-test (cont.):

  – Going between the F-statistic and R²:

      F = (n − 2) × (Reg SS / TSS) / (RSS / TSS) = (n − 2) × R² / (1 − R²)

    (Mnemonic: divide both the numerator and the denominator of the F-statistic by TSS to get R².)
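The conversion between F and R² is worth checking once by hand. A sketch with made-up toy data (values and names are illustrative):

```python
import math

# Hypothetical toy data for illustration
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = Sxy / Sxx
b0 = ybar - b1 * xbar
y_hat = [b0 + b1 * xi for xi in x]

TSS = sum((yi - ybar) ** 2 for yi in y)
RSS = sum((yi - yhi) ** 2 for yi, yhi in zip(y, y_hat))
RegSS = TSS - RSS            # by the ANOVA identity

s2 = RSS / (n - 2)           # mean square error
F = (RegSS / 1) / s2         # F-statistic
R2 = RegSS / TSS

assert math.isclose(F, (n - 2) * R2 / (1 - R2))  # F recovered from R^2
```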
1–17
1.4 Statistical Inference about β0 and β1

• Sampling distributions of β̂0 and β̂1:

  – Linear combination formulas:

      β̂1 = ∑_{i=1}^n wi yi,    where wi = (xi − x̄) / Sxx
      β̂0 = ∑_{i=1}^n wi0 yi,   where wi0 = 1/n − x̄ wi

    (Suggestion: remembering these weights is recommended, but not absolutely essential.)
1–18
• Sampling distributions of β̂0 and β̂1 (cont.):

  – Unbiasedness: E[β̂j] = βj for j = 0, 1.

  – Variances:

      Var(β̂0) = σ² (1/n + x̄²/Sxx)    and    Var(β̂1) = σ²/Sxx

    (Suggestion: remember these two formulas.)

  – Estimated variances, with σ² → s² (the MSE):

      V̂ar(β̂0) = s² (1/n + x̄²/Sxx)    and    V̂ar(β̂1) = s²/Sxx
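The estimated variances (and hence the standard errors used later for t-tests and CIs) follow directly from s², x̄, and Sxx. A sketch with made-up toy data (values and names are illustrative):

```python
import math

# Hypothetical toy data for illustration
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = Sxy / Sxx
b0 = ybar - b1 * xbar

RSS = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s2 = RSS / (n - 2)                        # MSE, the estimate of sigma^2

var_b0 = s2 * (1 / n + xbar ** 2 / Sxx)   # estimated Var(beta0-hat)
var_b1 = s2 / Sxx                         # estimated Var(beta1-hat)
se_b0, se_b1 = math.sqrt(var_b0), math.sqrt(var_b1)  # standard errors
```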
1–19
• t-test:

  – Hypotheses: H0: βj = d vs. Ha: βj ≠ d, βj > d, or βj < d.
    Important special case: d = 0 (i.e., to test whether x is useful).

  – t-statistic:

      t(β̂j) = (LSE − hypothesized value) / (standard error of LSE) = (β̂j − d) / SE(β̂j)

  – Null distribution: under H0, t(β̂j) ~ t_{n−2}.

  – Decision rules and p-values (t is the observed value of t(β̂j)):

      Ha: βj ≠ d    reject if |t(β̂j)| > t_{n−2, α/2}    p-value = P(|t_{n−2}| > |t|) = 2 P(t_{n−2} > |t|)
      Ha: βj > d    reject if t(β̂j) > t_{n−2, α}        p-value = P(t_{n−2} > t)
      Ha: βj < d    reject if t(β̂j) < −t_{n−2, α}       p-value = P(t_{n−2} < t)
1–20
• Confidence intervals (CIs) for β0 and β1: the general structure is

      LSE ± (t-quantile) × (standard error) = β̂j ± t_{n−2, α/2} × SE(β̂j).

  – E.g., β̂1 ± t_{n−2, α/2} × SE(β̂1) is the CI for β1.
  – Construction requires the formulas for SE(β̂0) and SE(β̂1).

• Relationship between the F-test and the t-test for H0: β1 = 0:

  – Direct connection between the test statistics:

      F = [t(β̂1)]²

  – Importance: connects information about β1 (captured by t(β̂1)) with information about the whole model (captured by F).
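The identity F = t² holds exactly in SLR and is a useful sanity check on computed output. A sketch with made-up toy data (values and names are illustrative):

```python
import math

# Hypothetical toy data for illustration
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = Sxy / Sxx
b0 = ybar - b1 * xbar

RSS = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s2 = RSS / (n - 2)
RegSS = b1 ** 2 * Sxx

t_stat = b1 / math.sqrt(s2 / Sxx)    # t-statistic for H0: beta1 = 0
F = (RegSS / 1) / s2                 # F-statistic

assert math.isclose(F, t_stat ** 2)  # F = t^2
```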
1–21
1.5 Prediction

• Target (a random variable):

      y* = β0 + β1 x* + ε*,

  where x* is the explanatory variable value of interest.

• Generic setting:

  – Observed (past) data: (x1, y1), (x2, y2), …, (xn, yn), i.e., responses with known values of the explanatory variable.
  – Unobserved (future) data: the target y*, to be predicted from the known value x*.
1–22
• Point predictor:

      ŷ* = β̂0 + β̂1 x*

  (Mnemonic: set β0 → β̂0, β1 → β̂1, and ε* → 0, the same trick as for fitted values.)

• 100(1 − α)% prediction interval:

      (point predictor) ± (t-quantile) × (standard error of the prediction error)
      = ŷ* ± t_{n−2, α/2} × SE(y* − ŷ*)
      = (β̂0 + β̂1 x*) ± t_{n−2, α/2} × √( s² [1 + 1/n + (x* − x̄)²/Sxx] )

  (Suggestion: remember this formula.)
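Putting the pieces together, a 95% prediction interval can be computed end to end. A sketch with made-up toy data; the t-quantile t_{3, 0.025} ≈ 3.182 is hardcoded since the Python standard library has no t-distribution:

```python
import math

# Hypothetical toy data for illustration
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

Sxx = sum((xi - xbar) ** 2 for xi in x)
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = Sxy / Sxx
b0 = ybar - b1 * xbar

RSS = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s2 = RSS / (n - 2)

x_star = 4
y_star_hat = b0 + b1 * x_star                                      # point predictor
se_pred = math.sqrt(s2 * (1 + 1 / n + (x_star - xbar) ** 2 / Sxx)) # SE of prediction error

t_q = 3.182  # t_{n-2, alpha/2} = t_{3, 0.025} for a 95% interval
lo, hi = y_star_hat - t_q * se_pred, y_star_hat + t_q * se_pred

assert lo < y_star_hat < hi  # the interval is centered at the point predictor
```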
1–23
• Remarks on the structure of the prediction interval: there are two sources of uncertainty associated with prediction:

      V̂ar(y* − ŷ*) = s²  +  s² [1/n + (x* − x̄)²/Sxx]
                      (①)            (②)

  ① Variability of the random error ε*: reflected in the extra s².

  ② Estimation of the true regression line at x*:
      – β̂0 and β̂1 are only estimates of β0 and β1 and are subject to sampling fluctuations.
      – The variance of the prediction error is minimized when x* = x̄ and increases quadratically as x* moves away from x̄.
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash9
bull Fitted values and residuals Given β0 and β1 we can compute
1 The fitted value (aka predicted value) yi = β0 + β1xi
Mnemonic Obtained from the model equation by
β0 rarr β0 β1 rarr β1 εi rarr 0
Ideally yi should be close to yi
2 The residual ei = yi minus yi Memory alert Not yi minus yi Completely different from εi which is unobservable and whichei serves to approximate
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash10
bull Graphical illustration of fitted regression line and definitions offitted value and residual
0
y
x
(xi yi)
(xi yi)
Slope = β1
fitted regression line
y = β0 + β1x
yi minus yi = ei
β0
xi
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash11
bull Sum-to-zero constraints on residuals
1sumni=1 ei = 0 provided that β0 is included in the model
Meaning The residuals offset one another to produce a zero sumthey are negatively correlated
2sumni=1 xiei = 0
Meaning The residuals and the explanatory variable values areuncorrelated
Mnemonic β0 and β1 satisfy
part
partβ0SS(β0 β1) = minus2
nsumi=1
[
ei︷ ︸︸ ︷yi minus (β0 + β1xi)] = 0
part
partβ1SS(β0 β1) = minus2
nsumi=1
xi[yi minus (β0 + β1xi)︸ ︷︷ ︸ei
] = 0
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash12
13 Assessing Goodness of Fit of the Model
bull Three kinds of sums of squares
Sum of
SquaresAbbrev Def What Does It Measure
Total SS TSS
Variation of
response values
about y
Amount of variability inher-
ent in the response prior to
performing regression
Residual SS
or
Error SS
RSS
Variation of
response values
about fitted
regression line
bull Goodness of fit of the SLRmodel (the lower the better)
bull Amount of variability of
response left unexplained
even after introduction of x
Regression SS Reg SS
Variation
explained by SLR
(or the knowledge
of x)
How effective SLR model is
in explaining the variation in
y
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash13
bull ANOVA identity
nsumi=1
(yi minus y)2
︸ ︷︷ ︸TSS
=
nsumi=1
(yi minus yi)2
︸ ︷︷ ︸RSS
+
nsumi=1
(yi minus y)2
︸ ︷︷ ︸Reg SS
bull Coefficient of determination
Definition R2 =Reg SS
TSS= 1minus RSS
TSS Measures the proportion of variation of response (about its mean)
explained by the SLR model
The higher the better
bull Specialized formulas for Reg SS and R2 under SLR
Reg SS = β21Sxx
R2 = r2 = Corr(x y)2 (square of correlation between x and y)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash14
bull ANOVA table
Source Sum of Squares df Mean Square F -value
Regression Reg SS 1 Reg SS1
Error RSS nminus 2 s2 = RSS(nminus 2)
Total TSS nminus 1
Structure
Different sources of variation in y
Some ldquoinformalrdquo rules for counting df
ndash Reg SS has 1 df because of one explanatory variable
ndash RSS has 2 df subtracted from n because of two parameters β0
and β1
Dividing an SS by its df results in a mean square (MS)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash15
bull Mean square error
s2 =RSS
df of RSS=
sumni=1 e
2i
nminus 2
s =radics2 is the residual standard deviation or residual standard error
bull F -test
HypothesesH0 β1 = 0︸ ︷︷ ︸
iid model
vs Ha β1 6= 0︸ ︷︷ ︸SLR model
a test of the significanceusefulness of x in explaining y
F -statistic
F =Reg SS(df of Reg SS)
RSS(df of RSS)=
Reg SS1
RSS(nminus 2)
Behavior of F -statistic
ndash H0 Expected value close to one
ndash Ha Tends to be large
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash16
bull F -test (Cont)
Going between F -statistic and R2
F = (nminus 2)
(Reg SSTSS
RSSTSS
)= (nminus 2)
(R2
1minusR2
)(Mnemonic Divide both the numerator and denominator of the F -statistic by TSS to get R2)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash17
14 Statistical Inference about β0 and β1
bull Sampling distributions of β0 and β1
Linear combination formulas
β1 =
nsumi=1
wiyi where wi =xi minus xSxx
β0 =
nsumi=1
wi0yi where wi0 =1
nminus xwi
(Suggestion Remembering these weights is recommended but notabsolutely essential)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash18
bull Sampling distributions of β0 and β1 (Cont)
Unbiased E[βj ] = βj for j = 0 1
Variances
Var(β0) = σ2
(1
n+
x2
Sxx
)and Var(β1) =
σ2
Sxx
(Suggestion Remember these two formulas)
Estimated variances With σ2 rarr s2 (MSE)
Var(β0) = s2
(1
n+
x2
Sxx
)and Var(β1) =
s2
Sxx
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash19
bull t-test
Hypotheses H0 βj = d vs Ha
βj 6= d
βj gt d
βj lt d
Important special case d = 0 (ie to test if x is useful)
t-statistic
t(βj) =LSEminus hypothesized value
standard error of LSE=
βj minus dSE(βj)
Null distribution t(βj)H0sim tnminus2
Decision rules and p-value
Ha Decision Rule p-value (t is the observed value of t(βj))
βj 6= d |t(βj)| gt tnminus2α2 P(|tnminus2| gt |t|) = 2P(tnminus2 gt |t|)βj gt d t(βj) gt tnminus2α P(tnminus2 gt t)
βj lt d t(βj) lt minustnminus2α P(tnminus2 lt t)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash20
bull Confidence intervals (CIs) for β0 and β1 General structure is
LSEplusmn t-quantiletimes Standard error = βj plusmn tnminus2α2 times SE(βj)
Eg β1 plusmn tnminus2α2 times SE(β1) is the CI for β1
Construction requires formulas of SE(β0) and SE(β1)
bull Relationship between F -test and t-test for H0 β1 = 0
Direct connection between test statistics
F = t(β1)2
Importance Connect
information about β1 (captured by t(β1))
with
information about the whole model (captured by F )
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash21
15 Prediction
bull Target (random variable)
ylowast = β0 + β1xlowast + εlowast
where xlowast is explanatory variable value of interest
bull Generic setting
response known values of explanatory variables
y x
y1 x1
observed
(past) data
y2 x2
yn xn
Unobserved
(future) data
ylowast (target) larr xlowast
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash22
bull Point predictorylowast = β0 + β1xlowast
(Mnemonic Set β0 rarr β0 β1 rarr β1 and εlowast rarr 0 same trick as fittedvalues)
bull 100(1minusα) prediction interval
point predictorplusmn t-quantiletimes st error of prediction error
= ylowast plusmn tnminus2α2 times SE(ylowast minus ylowast)
= (β0 + β1xlowast)plusmn tnminus2α2
radics2
[1 +
1
n+
(xlowast minus x)2
Sxx
](Suggestion Remember this formula)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash23
bull Remarks on the structure of the prediction interval Two sourcesof uncertainty associated with prediction
Var(ylowast minus ylowast) = s2︸ ︷︷ ︸1copy
+ s2
[1
n+
(xlowast minus x)2
Sxx
]︸ ︷︷ ︸
2copy
1copy Variability of the random error εlowast Reflected in the extra s2
2copy Estimation of the true regression line at xlowast
β0 and β1 are only estimates of β0 and β1 and are subject tosampling fluctuations
Variance of prediction error minimized when xlowast = x and in-creases quadratically as xlowast moves away from x
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash10
bull Graphical illustration of fitted regression line and definitions offitted value and residual
0
y
x
(xi yi)
(xi yi)
Slope = β1
fitted regression line
y = β0 + β1x
yi minus yi = ei
β0
xi
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash11
bull Sum-to-zero constraints on residuals
1sumni=1 ei = 0 provided that β0 is included in the model
Meaning The residuals offset one another to produce a zero sumthey are negatively correlated
2sumni=1 xiei = 0
Meaning The residuals and the explanatory variable values areuncorrelated
Mnemonic β0 and β1 satisfy
part
partβ0SS(β0 β1) = minus2
nsumi=1
[
ei︷ ︸︸ ︷yi minus (β0 + β1xi)] = 0
part
partβ1SS(β0 β1) = minus2
nsumi=1
xi[yi minus (β0 + β1xi)︸ ︷︷ ︸ei
] = 0
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash12
13 Assessing Goodness of Fit of the Model
bull Three kinds of sums of squares
Sum of
SquaresAbbrev Def What Does It Measure
Total SS TSS
Variation of
response values
about y
Amount of variability inher-
ent in the response prior to
performing regression
Residual SS
or
Error SS
RSS
Variation of
response values
about fitted
regression line
bull Goodness of fit of the SLRmodel (the lower the better)
bull Amount of variability of
response left unexplained
even after introduction of x
Regression SS Reg SS
Variation
explained by SLR
(or the knowledge
of x)
How effective SLR model is
in explaining the variation in
y
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash13
bull ANOVA identity
nsumi=1
(yi minus y)2
︸ ︷︷ ︸TSS
=
nsumi=1
(yi minus yi)2
︸ ︷︷ ︸RSS
+
nsumi=1
(yi minus y)2
︸ ︷︷ ︸Reg SS
bull Coefficient of determination
Definition R2 =Reg SS
TSS= 1minus RSS
TSS Measures the proportion of variation of response (about its mean)
explained by the SLR model
The higher the better
bull Specialized formulas for Reg SS and R2 under SLR
Reg SS = β21Sxx
R2 = r2 = Corr(x y)2 (square of correlation between x and y)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash14
bull ANOVA table
Source Sum of Squares df Mean Square F -value
Regression Reg SS 1 Reg SS1
Error RSS nminus 2 s2 = RSS(nminus 2)
Total TSS nminus 1
Structure
Different sources of variation in y
Some ldquoinformalrdquo rules for counting df
ndash Reg SS has 1 df because of one explanatory variable
ndash RSS has 2 df subtracted from n because of two parameters β0
and β1
Dividing an SS by its df results in a mean square (MS)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash15
bull Mean square error
s2 =RSS
df of RSS=
sumni=1 e
2i
nminus 2
s =radics2 is the residual standard deviation or residual standard error
bull F -test
HypothesesH0 β1 = 0︸ ︷︷ ︸
iid model
vs Ha β1 6= 0︸ ︷︷ ︸SLR model
a test of the significanceusefulness of x in explaining y
F -statistic
F =Reg SS(df of Reg SS)
RSS(df of RSS)=
Reg SS1
RSS(nminus 2)
Behavior of F -statistic
ndash H0 Expected value close to one
ndash Ha Tends to be large
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash16
bull F -test (Cont)
Going between F -statistic and R2
F = (nminus 2)
(Reg SSTSS
RSSTSS
)= (nminus 2)
(R2
1minusR2
)(Mnemonic Divide both the numerator and denominator of the F -statistic by TSS to get R2)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash17
14 Statistical Inference about β0 and β1
bull Sampling distributions of β0 and β1
Linear combination formulas
β1 =
nsumi=1
wiyi where wi =xi minus xSxx
β0 =
nsumi=1
wi0yi where wi0 =1
nminus xwi
(Suggestion Remembering these weights is recommended but notabsolutely essential)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash18
bull Sampling distributions of β0 and β1 (Cont)
Unbiased E[βj ] = βj for j = 0 1
Variances
Var(β0) = σ2
(1
n+
x2
Sxx
)and Var(β1) =
σ2
Sxx
(Suggestion Remember these two formulas)
Estimated variances With σ2 rarr s2 (MSE)
Var(β0) = s2
(1
n+
x2
Sxx
)and Var(β1) =
s2
Sxx
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash19
bull t-test
Hypotheses H0 βj = d vs Ha
βj 6= d
βj gt d
βj lt d
Important special case d = 0 (ie to test if x is useful)
t-statistic
t(βj) =LSEminus hypothesized value
standard error of LSE=
βj minus dSE(βj)
Null distribution t(βj)H0sim tnminus2
Decision rules and p-value
Ha Decision Rule p-value (t is the observed value of t(βj))
βj 6= d |t(βj)| gt tnminus2α2 P(|tnminus2| gt |t|) = 2P(tnminus2 gt |t|)βj gt d t(βj) gt tnminus2α P(tnminus2 gt t)
βj lt d t(βj) lt minustnminus2α P(tnminus2 lt t)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash20
bull Confidence intervals (CIs) for β0 and β1 General structure is
LSEplusmn t-quantiletimes Standard error = βj plusmn tnminus2α2 times SE(βj)
Eg β1 plusmn tnminus2α2 times SE(β1) is the CI for β1
Construction requires formulas of SE(β0) and SE(β1)
bull Relationship between F -test and t-test for H0 β1 = 0
Direct connection between test statistics
F = t(β1)2
Importance Connect
information about β1 (captured by t(β1))
with
information about the whole model (captured by F )
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash21
15 Prediction
bull Target (random variable)
ylowast = β0 + β1xlowast + εlowast
where xlowast is explanatory variable value of interest
bull Generic setting
response known values of explanatory variables
y x
y1 x1
observed
(past) data
y2 x2
yn xn
Unobserved
(future) data
ylowast (target) larr xlowast
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash22
bull Point predictorylowast = β0 + β1xlowast
(Mnemonic Set β0 rarr β0 β1 rarr β1 and εlowast rarr 0 same trick as fittedvalues)
bull 100(1minusα) prediction interval
point predictorplusmn t-quantiletimes st error of prediction error
= ylowast plusmn tnminus2α2 times SE(ylowast minus ylowast)
= (β0 + β1xlowast)plusmn tnminus2α2
radics2
[1 +
1
n+
(xlowast minus x)2
Sxx
](Suggestion Remember this formula)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash23
bull Remarks on the structure of the prediction interval Two sourcesof uncertainty associated with prediction
Var(ylowast minus ylowast) = s2︸ ︷︷ ︸1copy
+ s2
[1
n+
(xlowast minus x)2
Sxx
]︸ ︷︷ ︸
2copy
1copy Variability of the random error εlowast Reflected in the extra s2
2copy Estimation of the true regression line at xlowast
β0 and β1 are only estimates of β0 and β1 and are subject tosampling fluctuations
Variance of prediction error minimized when xlowast = x and in-creases quadratically as xlowast moves away from x
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash11
bull Sum-to-zero constraints on residuals
1sumni=1 ei = 0 provided that β0 is included in the model
Meaning The residuals offset one another to produce a zero sumthey are negatively correlated
2sumni=1 xiei = 0
Meaning The residuals and the explanatory variable values areuncorrelated
Mnemonic β0 and β1 satisfy
part
partβ0SS(β0 β1) = minus2
nsumi=1
[
ei︷ ︸︸ ︷yi minus (β0 + β1xi)] = 0
part
partβ1SS(β0 β1) = minus2
nsumi=1
xi[yi minus (β0 + β1xi)︸ ︷︷ ︸ei
] = 0
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash12
13 Assessing Goodness of Fit of the Model
bull Three kinds of sums of squares
Sum of
SquaresAbbrev Def What Does It Measure
Total SS TSS
Variation of
response values
about y
Amount of variability inher-
ent in the response prior to
performing regression
Residual SS
or
Error SS
RSS
Variation of
response values
about fitted
regression line
bull Goodness of fit of the SLRmodel (the lower the better)
bull Amount of variability of
response left unexplained
even after introduction of x
Regression SS Reg SS
Variation
explained by SLR
(or the knowledge
of x)
How effective SLR model is
in explaining the variation in
y
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash13
bull ANOVA identity
nsumi=1
(yi minus y)2
︸ ︷︷ ︸TSS
=
nsumi=1
(yi minus yi)2
︸ ︷︷ ︸RSS
+
nsumi=1
(yi minus y)2
︸ ︷︷ ︸Reg SS
bull Coefficient of determination
Definition R2 =Reg SS
TSS= 1minus RSS
TSS Measures the proportion of variation of response (about its mean)
explained by the SLR model
The higher the better
bull Specialized formulas for Reg SS and R2 under SLR
Reg SS = β21Sxx
R2 = r2 = Corr(x y)2 (square of correlation between x and y)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash14
bull ANOVA table
Source Sum of Squares df Mean Square F -value
Regression Reg SS 1 Reg SS1
Error RSS nminus 2 s2 = RSS(nminus 2)
Total TSS nminus 1
Structure
Different sources of variation in y
Some ldquoinformalrdquo rules for counting df
ndash Reg SS has 1 df because of one explanatory variable
ndash RSS has 2 df subtracted from n because of two parameters β0
and β1
Dividing an SS by its df results in a mean square (MS)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash15
bull Mean square error
s2 =RSS
df of RSS=
sumni=1 e
2i
nminus 2
s =radics2 is the residual standard deviation or residual standard error
bull F -test
HypothesesH0 β1 = 0︸ ︷︷ ︸
iid model
vs Ha β1 6= 0︸ ︷︷ ︸SLR model
a test of the significanceusefulness of x in explaining y
F -statistic
F =Reg SS(df of Reg SS)
RSS(df of RSS)=
Reg SS1
RSS(nminus 2)
Behavior of F -statistic
ndash H0 Expected value close to one
ndash Ha Tends to be large
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash16
bull F -test (Cont)
Going between F -statistic and R2
F = (nminus 2)
(Reg SSTSS
RSSTSS
)= (nminus 2)
(R2
1minusR2
)(Mnemonic Divide both the numerator and denominator of the F -statistic by TSS to get R2)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash17
14 Statistical Inference about β0 and β1
bull Sampling distributions of β0 and β1
Linear combination formulas
β1 =
nsumi=1
wiyi where wi =xi minus xSxx
β0 =
nsumi=1
wi0yi where wi0 =1
nminus xwi
(Suggestion Remembering these weights is recommended but notabsolutely essential)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash18
bull Sampling distributions of β0 and β1 (Cont)
Unbiased E[βj ] = βj for j = 0 1
Variances
Var(β0) = σ2
(1
n+
x2
Sxx
)and Var(β1) =
σ2
Sxx
(Suggestion Remember these two formulas)
Estimated variances With σ2 rarr s2 (MSE)
Var(β0) = s2
(1
n+
x2
Sxx
)and Var(β1) =
s2
Sxx
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash19
bull t-test
Hypotheses H0 βj = d vs Ha
βj 6= d
βj gt d
βj lt d
Important special case d = 0 (ie to test if x is useful)
t-statistic
t(βj) =LSEminus hypothesized value
standard error of LSE=
βj minus dSE(βj)
Null distribution t(βj)H0sim tnminus2
Decision rules and p-value
Ha Decision Rule p-value (t is the observed value of t(βj))
βj 6= d |t(βj)| gt tnminus2α2 P(|tnminus2| gt |t|) = 2P(tnminus2 gt |t|)βj gt d t(βj) gt tnminus2α P(tnminus2 gt t)
βj lt d t(βj) lt minustnminus2α P(tnminus2 lt t)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash20
bull Confidence intervals (CIs) for β0 and β1 General structure is
LSEplusmn t-quantiletimes Standard error = βj plusmn tnminus2α2 times SE(βj)
Eg β1 plusmn tnminus2α2 times SE(β1) is the CI for β1
Construction requires formulas of SE(β0) and SE(β1)
bull Relationship between F -test and t-test for H0 β1 = 0
Direct connection between test statistics
F = t(β1)2
Importance Connect
information about β1 (captured by t(β1))
with
information about the whole model (captured by F )
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash21
15 Prediction
bull Target (random variable)
ylowast = β0 + β1xlowast + εlowast
where xlowast is explanatory variable value of interest
bull Generic setting
response known values of explanatory variables
y x
y1 x1
observed
(past) data
y2 x2
yn xn
Unobserved
(future) data
ylowast (target) larr xlowast
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash22
bull Point predictorylowast = β0 + β1xlowast
(Mnemonic Set β0 rarr β0 β1 rarr β1 and εlowast rarr 0 same trick as fittedvalues)
bull 100(1minusα) prediction interval
point predictorplusmn t-quantiletimes st error of prediction error
= ylowast plusmn tnminus2α2 times SE(ylowast minus ylowast)
= (β0 + β1xlowast)plusmn tnminus2α2
radics2
[1 +
1
n+
(xlowast minus x)2
Sxx
](Suggestion Remember this formula)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash23
bull Remarks on the structure of the prediction interval Two sourcesof uncertainty associated with prediction
Var(ylowast minus ylowast) = s2︸ ︷︷ ︸1copy
+ s2
[1
n+
(xlowast minus x)2
Sxx
]︸ ︷︷ ︸
2copy
1copy Variability of the random error εlowast Reflected in the extra s2
2copy Estimation of the true regression line at xlowast
β0 and β1 are only estimates of β0 and β1 and are subject tosampling fluctuations
Variance of prediction error minimized when xlowast = x and in-creases quadratically as xlowast moves away from x
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash12
13 Assessing Goodness of Fit of the Model
bull Three kinds of sums of squares
Sum of
SquaresAbbrev Def What Does It Measure
Total SS TSS
Variation of
response values
about y
Amount of variability inher-
ent in the response prior to
performing regression
Residual SS
or
Error SS
RSS
Variation of
response values
about fitted
regression line
bull Goodness of fit of the SLRmodel (the lower the better)
bull Amount of variability of
response left unexplained
even after introduction of x
Regression SS Reg SS
Variation
explained by SLR
(or the knowledge
of x)
How effective SLR model is
in explaining the variation in
y
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash13
bull ANOVA identity
nsumi=1
(yi minus y)2
︸ ︷︷ ︸TSS
=
nsumi=1
(yi minus yi)2
︸ ︷︷ ︸RSS
+
nsumi=1
(yi minus y)2
︸ ︷︷ ︸Reg SS
bull Coefficient of determination
Definition R2 =Reg SS
TSS= 1minus RSS
TSS Measures the proportion of variation of response (about its mean)
explained by the SLR model
The higher the better
bull Specialized formulas for Reg SS and R2 under SLR
Reg SS = β21Sxx
R2 = r2 = Corr(x y)2 (square of correlation between x and y)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash14
bull ANOVA table
Source Sum of Squares df Mean Square F -value
Regression Reg SS 1 Reg SS1
Error RSS nminus 2 s2 = RSS(nminus 2)
Total TSS nminus 1
Structure
Different sources of variation in y
Some ldquoinformalrdquo rules for counting df
ndash Reg SS has 1 df because of one explanatory variable
ndash RSS has 2 df subtracted from n because of two parameters β0
and β1
Dividing an SS by its df results in a mean square (MS)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash15
bull Mean square error
s2 =RSS
df of RSS=
sumni=1 e
2i
nminus 2
s =radics2 is the residual standard deviation or residual standard error
bull F -test
HypothesesH0 β1 = 0︸ ︷︷ ︸
iid model
vs Ha β1 6= 0︸ ︷︷ ︸SLR model
a test of the significanceusefulness of x in explaining y
F -statistic
F =Reg SS(df of Reg SS)
RSS(df of RSS)=
Reg SS1
RSS(nminus 2)
Behavior of F -statistic
ndash H0 Expected value close to one
ndash Ha Tends to be large
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash16
bull F -test (Cont)
Going between F -statistic and R2
F = (nminus 2)
(Reg SSTSS
RSSTSS
)= (nminus 2)
(R2
1minusR2
)(Mnemonic Divide both the numerator and denominator of the F -statistic by TSS to get R2)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash17
14 Statistical Inference about β0 and β1
bull Sampling distributions of β0 and β1
Linear combination formulas
β1 =
nsumi=1
wiyi where wi =xi minus xSxx
β0 =
nsumi=1
wi0yi where wi0 =1
nminus xwi
(Suggestion Remembering these weights is recommended but notabsolutely essential)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash18
bull Sampling distributions of β0 and β1 (Cont)
Unbiased E[βj ] = βj for j = 0 1
Variances
Var(β0) = σ2
(1
n+
x2
Sxx
)and Var(β1) =
σ2
Sxx
(Suggestion Remember these two formulas)
Estimated variances With σ2 rarr s2 (MSE)
Var(β0) = s2
(1
n+
x2
Sxx
)and Var(β1) =
s2
Sxx
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash19
bull t-test
Hypotheses H0 βj = d vs Ha
βj 6= d
βj gt d
βj lt d
Important special case d = 0 (ie to test if x is useful)
t-statistic
t(βj) =LSEminus hypothesized value
standard error of LSE=
βj minus dSE(βj)
Null distribution t(βj)H0sim tnminus2
Decision rules and p-value
Ha Decision Rule p-value (t is the observed value of t(βj))
βj 6= d |t(βj)| gt tnminus2α2 P(|tnminus2| gt |t|) = 2P(tnminus2 gt |t|)βj gt d t(βj) gt tnminus2α P(tnminus2 gt t)
βj lt d t(βj) lt minustnminus2α P(tnminus2 lt t)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash20
bull Confidence intervals (CIs) for β0 and β1 General structure is
LSEplusmn t-quantiletimes Standard error = βj plusmn tnminus2α2 times SE(βj)
Eg β1 plusmn tnminus2α2 times SE(β1) is the CI for β1
Construction requires formulas of SE(β0) and SE(β1)
bull Relationship between F -test and t-test for H0 β1 = 0
Direct connection between test statistics
F = t(β1)2
Importance Connect
information about β1 (captured by t(β1))
with
information about the whole model (captured by F )
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash21
15 Prediction
bull Target (random variable)
ylowast = β0 + β1xlowast + εlowast
where xlowast is explanatory variable value of interest
bull Generic setting
response known values of explanatory variables
y x
y1 x1
observed
(past) data
y2 x2
yn xn
Unobserved
(future) data
ylowast (target) larr xlowast
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash22
bull Point predictorylowast = β0 + β1xlowast
(Mnemonic Set β0 rarr β0 β1 rarr β1 and εlowast rarr 0 same trick as fittedvalues)
bull 100(1minusα) prediction interval
point predictorplusmn t-quantiletimes st error of prediction error
= ylowast plusmn tnminus2α2 times SE(ylowast minus ylowast)
= (β0 + β1xlowast)plusmn tnminus2α2
radics2
[1 +
1
n+
(xlowast minus x)2
Sxx
](Suggestion Remember this formula)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash23
bull Remarks on the structure of the prediction interval Two sourcesof uncertainty associated with prediction
Var(ylowast minus ylowast) = s2︸ ︷︷ ︸1copy
+ s2
[1
n+
(xlowast minus x)2
Sxx
]︸ ︷︷ ︸
2copy
1copy Variability of the random error εlowast Reflected in the extra s2
2copy Estimation of the true regression line at xlowast
β0 and β1 are only estimates of β0 and β1 and are subject tosampling fluctuations
Variance of prediction error minimized when xlowast = x and in-creases quadratically as xlowast moves away from x
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash13
bull ANOVA identity
nsumi=1
(yi minus y)2
︸ ︷︷ ︸TSS
=
nsumi=1
(yi minus yi)2
︸ ︷︷ ︸RSS
+
nsumi=1
(yi minus y)2
︸ ︷︷ ︸Reg SS
bull Coefficient of determination
Definition R2 =Reg SS
TSS= 1minus RSS
TSS Measures the proportion of variation of response (about its mean)
explained by the SLR model
The higher the better
bull Specialized formulas for Reg SS and R2 under SLR
Reg SS = β21Sxx
R2 = r2 = Corr(x y)2 (square of correlation between x and y)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash14
bull ANOVA table
Source Sum of Squares df Mean Square F -value
Regression Reg SS 1 Reg SS1
Error RSS nminus 2 s2 = RSS(nminus 2)
Total TSS nminus 1
Structure
Different sources of variation in y
Some ldquoinformalrdquo rules for counting df
ndash Reg SS has 1 df because of one explanatory variable
ndash RSS has 2 df subtracted from n because of two parameters β0
and β1
Dividing an SS by its df results in a mean square (MS)
ACTEX Learning copy 2019 CHAPTER 1 SIMPLE LINEAR REGRESSION
1ndash15
• Mean square error:
\[
s^2=\frac{\text{RSS}}{\text{df of RSS}}=\frac{\sum_{i=1}^{n} e_i^2}{n-2}
\]
  \( s=\sqrt{s^2} \) is the residual standard deviation (or residual standard error).
• F-test:
  – Hypotheses: \( \underbrace{H_0:\beta_1=0}_{\text{iid model}} \) vs. \( \underbrace{H_a:\beta_1\neq 0}_{\text{SLR model}} \), a test of the significance/usefulness of x in explaining y.
  – F-statistic:
\[
F=\frac{\text{Reg SS}/(\text{df of Reg SS})}{\text{RSS}/(\text{df of RSS})}
=\frac{\text{Reg SS}/1}{\text{RSS}/(n-2)}
\]
  – Behavior of the F-statistic:
    · Under \(H_0\): expected value close to one.
    · Under \(H_a\): tends to be large.
• F-test (cont.):
  – Going between the F-statistic and \(R^2\):
\[
F=(n-2)\left(\frac{\text{Reg SS}/\text{TSS}}{\text{RSS}/\text{TSS}}\right)
=(n-2)\left(\frac{R^2}{1-R^2}\right)
\]
    (Mnemonic: divide both the numerator and the denominator of the F-statistic by TSS to get \(R^2\).)
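The identity linking F and R² can also be confirmed numerically; a short sketch on illustrative data (not from the text):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # illustrative data
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
n = len(x)

# Least-squares fit
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
yhat = b0 + b1 * x

RSS = np.sum((y - yhat) ** 2)
RegSS = np.sum((yhat - y.mean()) ** 2)
TSS = RSS + RegSS

F = (RegSS / 1) / (RSS / (n - 2))               # F-statistic
R2 = RegSS / TSS
assert np.isclose(F, (n - 2) * R2 / (1 - R2))   # F in terms of R^2
```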
1.4 Statistical Inference about β0 and β1

• Sampling distributions of \(\hat\beta_0\) and \(\hat\beta_1\):
  – Linear combination formulas:
\[
\hat\beta_1=\sum_{i=1}^{n} w_i y_i,\quad\text{where } w_i=\frac{x_i-\bar x}{S_{xx}}
\]
\[
\hat\beta_0=\sum_{i=1}^{n} w_{i0} y_i,\quad\text{where } w_{i0}=\frac{1}{n}-\bar x w_i
\]
    (Suggestion: Remembering these weights is recommended but not absolutely essential.)
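The weighted sums above reproduce the usual least-squares estimates, which is easy to verify on made-up data (illustrative, not from the text):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # illustrative data
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
n = len(x)
xbar = x.mean()
Sxx = np.sum((x - xbar) ** 2)

w = (x - xbar) / Sxx          # weights giving beta1-hat
w0 = 1.0 / n - xbar * w       # weights giving beta0-hat

b1 = np.sum(w * y)
b0 = np.sum(w0 * y)

# The weighted sums agree with the usual least-squares formulas
assert np.isclose(b1, np.sum((x - xbar) * (y - y.mean())) / Sxx)
assert np.isclose(b0, y.mean() - b1 * xbar)
```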
• Sampling distributions of \(\hat\beta_0\) and \(\hat\beta_1\) (cont.):
  – Unbiased: \( \mathrm{E}[\hat\beta_j]=\beta_j \) for \( j=0,1 \).
  – Variances:
\[
\operatorname{Var}(\hat\beta_0)=\sigma^2\left(\frac{1}{n}+\frac{\bar x^2}{S_{xx}}\right)
\quad\text{and}\quad
\operatorname{Var}(\hat\beta_1)=\frac{\sigma^2}{S_{xx}}
\]
    (Suggestion: Remember these two formulas.)
  – Estimated variances: replace \( \sigma^2 \) with \( s^2 \) (the MSE):
\[
\widehat{\operatorname{Var}}(\hat\beta_0)=s^2\left(\frac{1}{n}+\frac{\bar x^2}{S_{xx}}\right)
\quad\text{and}\quad
\widehat{\operatorname{Var}}(\hat\beta_1)=\frac{s^2}{S_{xx}}
\]
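Both variance formulas follow from the weight representation on the previous card, since \( \operatorname{Var}(\sum_i w_i y_i)=\sigma^2\sum_i w_i^2 \) for iid errors. A quick numerical check of the two weight sums (x-values are illustrative):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # illustrative x-values
n = len(x)
xbar = x.mean()
Sxx = np.sum((x - xbar) ** 2)

# Weights from the linear-combination representation of the estimators
w = (x - xbar) / Sxx
w0 = 1.0 / n - xbar * w

# Var(b1) = sigma^2 * sum(w_i^2),  and  sum(w_i^2) = 1/Sxx
assert np.isclose(np.sum(w ** 2), 1.0 / Sxx)
# Var(b0) = sigma^2 * sum(w_i0^2), and  sum(w_i0^2) = 1/n + xbar^2/Sxx
assert np.isclose(np.sum(w0 ** 2), 1.0 / n + xbar ** 2 / Sxx)
```

The second check uses the fact that the weights \(w_i\) sum to zero, so the cross term vanishes.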
• t-test:
  – Hypotheses: \( H_0:\beta_j=d \) vs. \( H_a:\beta_j\neq d \), \( H_a:\beta_j>d \), or \( H_a:\beta_j<d \).
    Important special case: \( d=0 \) (i.e., testing whether x is useful).
  – t-statistic:
\[
t(\hat\beta_j)=\frac{\text{LSE}-\text{hypothesized value}}{\text{standard error of LSE}}
=\frac{\hat\beta_j-d}{\operatorname{SE}(\hat\beta_j)}
\]
  – Null distribution: \( t(\hat\beta_j)\overset{H_0}{\sim} t_{n-2} \).
  – Decision rules and p-values (\(t\) is the observed value of \( t(\hat\beta_j) \)):
    · \( H_a:\beta_j\neq d \): reject if \( |t(\hat\beta_j)|>t_{n-2,\alpha/2} \); p-value \( =\mathrm{P}(|t_{n-2}|>|t|)=2\,\mathrm{P}(t_{n-2}>|t|) \).
    · \( H_a:\beta_j>d \): reject if \( t(\hat\beta_j)>t_{n-2,\alpha} \); p-value \( =\mathrm{P}(t_{n-2}>t) \).
    · \( H_a:\beta_j<d \): reject if \( t(\hat\beta_j)<-t_{n-2,\alpha} \); p-value \( =\mathrm{P}(t_{n-2}<t) \).
• Confidence intervals (CIs) for β0 and β1. The general structure is
\[
\text{LSE}\pm t\text{-quantile}\times\text{standard error}
=\hat\beta_j\pm t_{n-2,\alpha/2}\times\operatorname{SE}(\hat\beta_j).
\]
  – E.g., \( \hat\beta_1\pm t_{n-2,\alpha/2}\times\operatorname{SE}(\hat\beta_1) \) is the CI for β1.
  – Construction requires the formulas for \( \operatorname{SE}(\hat\beta_0) \) and \( \operatorname{SE}(\hat\beta_1) \).
• Relationship between the F-test and the t-test for \( H_0:\beta_1=0 \):
  – Direct connection between the test statistics:
\[
F=t(\hat\beta_1)^2
\]
  – Importance: connects information about β1 (captured by \( t(\hat\beta_1) \)) with information about the whole model (captured by F).
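The relation \(F=t(\hat\beta_1)^2\) is exact and can be confirmed on any dataset; a sketch on illustrative data (not from the text):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # illustrative data
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
n = len(x)
xbar = x.mean()
Sxx = np.sum((x - xbar) ** 2)

b1 = np.sum((x - xbar) * (y - y.mean())) / Sxx
b0 = y.mean() - b1 * xbar
resid = y - (b0 + b1 * x)
s2 = np.sum(resid ** 2) / (n - 2)          # MSE

t_b1 = b1 / np.sqrt(s2 / Sxx)              # t-statistic for H0: beta1 = 0
RegSS = np.sum((b0 + b1 * x - y.mean()) ** 2)
F = (RegSS / 1) / s2                       # F-statistic (RSS/(n-2) = s2)
assert np.isclose(F, t_b1 ** 2)            # F = t(b1)^2
```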
1.5 Prediction

• Target (a random variable):
\[
y_*=\beta_0+\beta_1 x_*+\varepsilon_*,
\]
  where \( x_* \) is the explanatory variable value of interest.
• Generic setting:
  – Observed (past) data: responses \( y_1, y_2, \ldots, y_n \) paired with known explanatory values \( x_1, x_2, \ldots, x_n \).
  – Unobserved (future) data: the target \( y_* \), to be predicted from the known value \( x_* \).
• Point predictor:
\[
\hat y_*=\hat\beta_0+\hat\beta_1 x_*
\]
  (Mnemonic: set \( \beta_0\to\hat\beta_0 \), \( \beta_1\to\hat\beta_1 \), and \( \varepsilon_*\to 0 \); the same trick as for fitted values.)
• 100(1 − α)% prediction interval:
\[
\text{point predictor}\pm t\text{-quantile}\times\text{standard error of prediction error}
=\hat y_*\pm t_{n-2,\alpha/2}\times\operatorname{SE}(y_*-\hat y_*)
=(\hat\beta_0+\hat\beta_1 x_*)\pm t_{n-2,\alpha/2}\sqrt{s^2\left[1+\frac{1}{n}+\frac{(x_*-\bar x)^2}{S_{xx}}\right]}
\]
  (Suggestion: Remember this formula.)
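Putting the pieces together, a prediction interval can be computed by hand; a sketch on illustrative data (the dataset and the value x* = 4.5 are made up, and the critical value is read from a t-table):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # illustrative data
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
n = len(x)
xbar = x.mean()
Sxx = np.sum((x - xbar) ** 2)

b1 = np.sum((x - xbar) * (y - y.mean())) / Sxx
b0 = y.mean() - b1 * xbar
s2 = np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2)   # MSE

xstar = 4.5
yhat_star = b0 + b1 * xstar                        # point predictor
se_pred = np.sqrt(s2 * (1 + 1.0 / n + (xstar - xbar) ** 2 / Sxx))
t_crit = 3.182     # t_{n-2, alpha/2} = t_{3, 0.025} for a 95% PI, from a t-table
lower, upper = yhat_star - t_crit * se_pred, yhat_star + t_crit * se_pred
assert lower < yhat_star < upper   # interval is centered at the point predictor
```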
• Remarks on the structure of the prediction interval. There are two sources of uncertainty associated with prediction:
\[
\operatorname{Var}(y_*-\hat y_*)
=\underbrace{s^2}_{\text{(1)}}
+\underbrace{s^2\left[\frac{1}{n}+\frac{(x_*-\bar x)^2}{S_{xx}}\right]}_{\text{(2)}}
\]
  (1) Variability of the random error \( \varepsilon_* \): reflected in the extra \( s^2 \).
  (2) Estimation of the true regression line at \( x_* \):
    – \( \hat\beta_0 \) and \( \hat\beta_1 \) are only estimates of \( \beta_0 \) and \( \beta_1 \) and are subject to sampling fluctuations.
    – The variance of the prediction error is minimized when \( x_*=\bar x \) and increases quadratically as \( x_* \) moves away from \( \bar x \).
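The "minimized at \( \bar x \), growing quadratically" behavior depends only on the term in x*; a sketch with illustrative x-values and an arbitrary positive MSE:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # illustrative x-values
n = len(x)
xbar = x.mean()
Sxx = np.sum((x - xbar) ** 2)
s2 = 0.3   # any positive MSE; only the dependence on x* matters here

def pred_var(xstar):
    """Var(y* - yhat*) = s^2 * [1 + 1/n + (x* - xbar)^2 / Sxx]."""
    return s2 * (1 + 1.0 / n + (xstar - xbar) ** 2 / Sxx)

# Minimized at x* = xbar, growing quadratically away from it
assert pred_var(xbar) < pred_var(xbar + 1) < pred_var(xbar + 2)
# Quadratic growth: the jump from +1 to +2 equals s2 * (4 - 1) / Sxx
assert np.isclose(pred_var(xbar + 2) - pred_var(xbar + 1), s2 * 3 / Sxx)
```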