MATH2831/2931
Linear Models / Higher Linear Models
August 19, 2013
Week 4 Lecture 3 - Last lecture:
Confidence Intervals for coefficients
Properties of multivariate Gaussian
Hypothesis testing for coefficients
Confidence intervals for the mean and prediction intervals.
Joint confidence regions.
Week 4 Lecture 3 - This lecture:
Decomposing variation
Introduction to the analysis of variance table
Sequential sums of squares.
Week 4 Lecture 3 - Decomposing variation
RECALL: Identity for simple linear regression
\[ \sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \]
that is,
\[ SS_{total} = SS_{reg} + SS_{res}. \]
$SS_{total}$, the total sum of squares (the sum of squared deviations of the responses about their mean).
$SS_{reg}$, the regression sum of squares (the sum of squared deviations of the fitted values about their mean, which is $\bar{y}$).
$SS_{res}$, the residual sum of squares (the sum of squared deviations of the fitted values from the responses).
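The identity can be verified numerically; the following sketch fits a simple linear regression to simulated data (the data values and seed are illustrative only, not from the lecture):

```python
import numpy as np

# Simulated data for a simple linear regression (illustrative values only).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 30)
y = 2.0 + 1.5 * x + rng.normal(scale=1.0, size=x.size)

# Least squares fit of y = b0 + b1 * x.
X = np.column_stack([np.ones_like(x), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ b

ss_total = float(np.sum((y - y.mean()) ** 2))
ss_reg = float(np.sum((y_hat - y.mean()) ** 2))
ss_res = float(np.sum((y - y_hat) ** 2))

# SS_total = SS_reg + SS_res (up to floating point rounding).
print(ss_total, ss_reg + ss_res)
```

The two printed values agree to rounding error; the decomposition relies on the model containing an intercept, so that the fitted values have mean $\bar{y}$.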
Week 4 Lecture 3 - Decomposing variation
This identity, decomposing variation into a part explained by the model and a part unexplained, holds in the general linear model.
For simple linear regression, the partitioning of variation was presented in the analysis of variance (ANOVA) table.
The ANOVA table was also a way of organizing calculations in hypothesis testing.
Week 4 Lecture 3 - Adjusted R2
For simple linear regression
\[ R^2 = \frac{SS_{reg}}{SS_{total}}. \]
We also have the adjusted $R^2$, written as $\bar{R}^2$.
$R^2 = 0.748$ here (or 74.8 percent).
What is the definition of $\bar{R}^2$?
Rewrite $R^2$ as
\[ R^2 = 1 - \frac{SS_{res}}{SS_{total}}. \qquad (1) \]
Define $\bar{R}^2$ by replacing $SS_{res}$ in (1) by $\hat{\sigma}^2$ (which is $SS_{res}/(n-p)$) and replacing $SS_{total}$ by $SS_{total}/(n-1)$.
Week 4 Lecture 3 - Adjusted R2
\[ \bar{R}^2 = 1 - \frac{(n-1)\,SS_{res}}{(n-p)\,SS_{total}} \qquad (2) \]
or
\[ \bar{R}^2 = 1 - \frac{\hat{\sigma}^2 (n-1)}{SS_{total}}. \qquad (3) \]
In terms of $R^2$,
\[ \bar{R}^2 = 1 - \frac{n-1}{n-p}\,(1 - R^2). \]
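As a numerical check, the three equivalent forms of $\bar{R}^2$ can be evaluated side by side. The sums of squares below are hypothetical, chosen so that $R^2$ matches the 74.2% quoted later for the risk assessment example ($n = 25$, $p = 8$):

```python
# Hypothetical sums of squares (scaled so R^2 = 74.2%), with n = 25
# observations and p = 8 parameters, as in the risk assessment example.
n, p = 25, 8
ss_total, ss_res = 100.0, 25.8

r2 = 1 - ss_res / ss_total                                  # equation (1)
r2_adj_a = 1 - (n - 1) * ss_res / ((n - p) * ss_total)      # equation (2)
sigma2_hat = ss_res / (n - p)
r2_adj_b = 1 - sigma2_hat * (n - 1) / ss_total              # equation (3)
r2_adj_c = 1 - (n - 1) / (n - p) * (1 - r2)                 # in terms of R^2

print(r2, r2_adj_a, r2_adj_b, r2_adj_c)
```

All three expressions for $\bar{R}^2$ agree, and the adjusted value lands close to the R-Sq(adj) = 63.5% reported later (any small discrepancy reflects the rounded sums of squares assumed here).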
Week 4 Lecture 3 - Adjusted R2
What was the motivation for introducing $\bar{R}^2$?
$R^2$ is an easily interpreted measure of fit of a linear model: the proportion of total variation explained by the model.
One might be tempted to use $R^2$ as a basis for comparing models with different numbers of parameters.
IMPORTANT: $R^2$ is not helpful here: if a new predictor is added to a linear model, the residual sum of squares always decreases, and $R^2$ will increase.
Attempting to select a subset of good predictors from a set of possible predictors using $R^2$ results in the full model, even if many of the predictors are irrelevant.
$\bar{R}^2$ does not necessarily increase as new predictors are added to a model.
Week 4 Lecture 3 - Adjusted R2
Since
\[ \bar{R}^2 = 1 - \frac{\hat{\sigma}^2 (n-1)}{SS_{total}}, \]
$\bar{R}^2$ increases as $\hat{\sigma}^2$ decreases.
Ranking models using $\bar{R}^2$ is equivalent to ranking models based on $\hat{\sigma}^2$.
QUESTION: Does $\hat{\sigma}^2$ necessarily decrease as new predictors are added to the model, and hence must $\bar{R}^2$ increase?
Week 4 Lecture 3 - Adjusted R2
Recall
\[ \hat{\sigma}^2 = \frac{(y - Xb)'(y - Xb)}{n - p}. \]
Consider two models in which one model contains a subset of the predictors included in the other.
For the larger model, the numerator in the above expression (the residual sum of squares) is smaller, but the denominator will also be smaller, as $p$ is larger.
Any reduction in the residual sum of squares must be large enough to overcome the reduction in the denominator.
$\bar{R}^2$ doesn't necessarily increase as we make the model more complicated.
So $\bar{R}^2$ may be useful as a crude device for model comparison!
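A small simulation illustrates the contrast between $R^2$ and $\bar{R}^2$. The data here are simulated (not from the lecture): the response depends on one real predictor, and a second, pure-noise predictor is then added. $R^2$ is guaranteed not to decrease, while $\bar{R}^2$ typically falls when the extra predictor is irrelevant:

```python
import numpy as np

# Simulated data (not from the lecture): y depends on x1 only; "noise"
# is an irrelevant predictor added to the larger model.
rng = np.random.default_rng(1)
n = 40
x1 = rng.normal(size=n)
noise = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + rng.normal(scale=1.0, size=n)

def fit_stats(X, y):
    """Return (R^2, adjusted R^2) for a least squares fit of y on X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    ss_res = float(resid @ resid)
    ss_total = float(np.sum((y - y.mean()) ** 2))
    n_obs, p = X.shape
    r2 = 1 - ss_res / ss_total
    return r2, 1 - (n_obs - 1) / (n_obs - p) * (1 - r2)

X_small = np.column_stack([np.ones(n), x1])
X_big = np.column_stack([np.ones(n), x1, noise])
r2_small, adj_small = fit_stats(X_small, y)
r2_big, adj_big = fit_stats(X_big, y)

# R^2 can only go up when a predictor is added; adjusted R^2 need not.
print(r2_small, r2_big, adj_small, adj_big)
```

The adjusted value falls when the added predictor reduces the residual sum of squares by less than the loss of one residual degree of freedom warrants.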
Week 4 Lecture 3 - Analysis of variance table
Notation: $\beta = (\beta_0, \ldots, \beta_k)'$, $\beta = (\beta^{(1)}, \beta^{(2)})$ where $\beta^{(1)}$ is an $r \times 1$ subvector and $\beta^{(2)}$ is a $(p-r) \times 1$ subvector.
Week 4 Lecture 3 - Sequential sums of squares
Write $R(\beta^{(2)} \mid \beta^{(1)})$ for the increase in $SS_{reg}$ when the predictors corresponding to the parameters $\beta^{(2)}$ are added to a model involving the parameters $\beta^{(1)}$.
Think of $R(\beta^{(2)} \mid \beta^{(1)})$ as the variation explained by the term involving $\beta^{(2)}$ in the presence of the term involving $\beta^{(1)}$.
Define $R(\beta_1, \ldots, \beta_k \mid \beta_0)$ as $SS_{reg}$.
Week 4 Lecture 3 - Sequential sums of squares
The sequential sums of squares shown below the analysis of variance table are the values
$R(\beta_1 \mid \beta_0)$
$R(\beta_2 \mid \beta_0, \beta_1)$
$R(\beta_3 \mid \beta_0, \beta_1, \beta_2)$
...
$R(\beta_k \mid \beta_0, \ldots, \beta_{k-1})$.
These values add up to $R(\beta_1, \ldots, \beta_k \mid \beta_0) = SS_{reg}$.
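A sketch of how sequential sums of squares can be computed, by fitting the nested sequence of models and recording the increase in $SS_{reg}$ at each step (simulated data with hypothetical coefficients, not from the lecture):

```python
import numpy as np

# Simulated design: intercept plus three predictors (hypothetical coefficients).
rng = np.random.default_rng(2)
n = 30
X_full = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y = X_full @ np.array([1.0, 0.5, -0.3, 0.8]) + rng.normal(size=n)

def ss_reg(X, y):
    """Regression sum of squares: squared deviations of fitted values about y-bar."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.sum((X @ b - y.mean()) ** 2))

# Sequential SS: R(b1|b0), R(b2|b0,b1), R(b3|b0,b1,b2).
# The intercept-only model has SS_reg = 0, so the running total starts at 0.
seq, prev = [], 0.0
for j in range(2, X_full.shape[1] + 1):
    cur = ss_reg(X_full[:, :j], y)
    seq.append(cur - prev)
    prev = cur

print(seq, sum(seq), ss_reg(X_full, y))
```

The sequential sums of squares add up to $SS_{reg}$ of the full model, and each entry is nonnegative since adding a predictor cannot reduce the explained variation.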
Week 4 Lecture 3 - Sequential sums of squares
Sequential sums of squares are useful when we have first ordered the variables in our model in a meaningful way (based on the underlying science or context).
They tell us how much a term contributes to explaining variation given all the previous terms in the table (but ignoring the terms which come after).
Week 4 Lecture 3 - Hypothesis testing
Simple linear regression model: t test (or equivalent F test) for examining the usefulness of a predictor.
General linear model: partial t test for the usefulness of a predictor in the presence of the other predictors.
Equivalent partial F test: the test statistic is the square of the partial t statistic.
Test for overall model adequacy: is the model including all the predictors better than the model containing just an intercept?
The F statistic in the analysis of variance table and the p-value relate to a test for overall model adequacy!
Week 4 Lecture 3 - Testing model adequacy
In the general linear model, if $\beta_1 = \cdots = \beta_k = 0$, then the statistic
\[ F = \frac{SS_{reg}/k}{SS_{res}/(n-p)} \]
has an $F_{k,n-p}$ distribution. This distributional result is the basis for a hypothesis test.
Week 4 Lecture 3 - Testing model adequacy
To test
\[ H_0 : \beta_1 = \cdots = \beta_k = 0 \]
versus
\[ H_1 : \text{not all } \beta_j = 0, \quad j = 1, \ldots, k, \]
we use the test statistic
\[ F = \frac{SS_{reg}/k}{SS_{res}/(n-p)}. \]
For a size $\alpha$ test the critical region is
\[ F \geq F_{\alpha;\,k,\,n-p}. \]
Alternatively, the p-value for the test is
\[ \Pr(F^* \geq F) \]
where $F^* \sim F_{k,n-p}$.
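Assuming SciPy is available, the test can be sketched numerically; the sums of squares and sample sizes below are made up for illustration:

```python
from scipy.stats import f as f_dist

# Hypothetical values for illustration: k predictors, n observations,
# and p = k + 1 parameters (including the intercept).
n, k = 30, 4
p_params = k + 1
ss_reg_val, ss_res_val = 60.0, 40.0

# Test statistic for H0: beta_1 = ... = beta_k = 0.
F = (ss_reg_val / k) / (ss_res_val / (n - p_params))

# p-value Pr(F* >= F) with F* ~ F_{k, n-p}, and the size-0.05 critical value.
p_value = f_dist.sf(F, k, n - p_params)
f_crit = f_dist.ppf(0.95, k, n - p_params)

print(F, p_value, f_crit)
```

By construction, $F$ exceeds the critical value exactly when the p-value is below the chosen size, so the two decision rules always agree.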
Week 4 Lecture 3 - Testing model adequacy
ANOVA table: columns show source of variation (Source), degrees of freedom (DF), sums of squares (SS), mean squares (MS), value of the F statistic for testing model adequacy (F) and corresponding p-value (P).

Source      DF     SS          MS                F                P
Regression  p-1    SS_reg      SS_reg/(p-1)      MS_reg/MS_res    p-value
Residual    n-p    SS_res      SS_res/(n-p)
Total       n-1    SS_total
Week 4 Lecture 3 - Model adequacy for risk assessment
Risk assessment data: the response is mean risk assessment, with seven accounting-determined measures of risk as predictors.
Week 4 Lecture 3 - Model adequacy for risk assessment
RECALL: we are testing
\[ H_0 : \beta_1 = \cdots = \beta_k = 0 \]
versus
\[ H_1 : \text{not all } \beta_j = 0, \quad j = 1, \ldots, k, \]
using the test statistic
\[ F = \frac{SS_{reg}/k}{SS_{res}/(n-p)}. \]
The F statistic for testing overall model adequacy is 6.97, and the associated p-value is
\[ p = \Pr(F^* \geq 6.97) \]
where $F^* \sim F_{7,17}$; $p = 0.001$ approximately.
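Assuming SciPy is available, the quoted p-value can be checked directly from the $F_{7,17}$ upper tail:

```python
from scipy.stats import f as f_dist

# Upper-tail probability Pr(F* >= 6.97) for F* ~ F_{7,17},
# i.e. the p-value quoted on the slide (approximately 0.001).
p = f_dist.sf(6.97, 7, 17)
print(p)
```

The computed tail probability is consistent with the slide's $p \approx 0.001$.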
Week 4 Lecture 3 - Model adequacy for risk assessment
RESULT: Reject the null hypothesis
\[ H_0 : \beta_1 = \cdots = \beta_k = 0 \]
in favour of the alternative
\[ H_1 : \text{not all } \beta_j = 0, \quad j = 1, \ldots, k. \]
What can we say about inclusion of predictors in the order we have selected?
Mean Risk Assessment = 2.19 + 0.443 Dividend Payout + 0.865 Current Ratio - 0.247 Asset Size + 1.96 Asset Growth + 3.59 Leverage + 0.135 Variability Earnings + 1.05 Covariability Earnings
We have from this ordering: $R(\beta_1 \mid \beta_0) = 18.42$; $R(\beta_2 \mid \beta_1, \beta_0) = 5.6042$; $R(\beta_3 \mid \beta_2, \beta_1, \beta_0) = 10.12$; $R(\beta_4 \mid \beta_3, \beta_2, \beta_1, \beta_0) = 1.64$; . . .
Week 4 Lecture 3 - Model adequacy for risk assessment
Under a different ordering: Mean Risk Assessment = 2.19 + 0.865 Current Ratio + 1.96 Asset Growth + 3.59 Leverage + 0.443 Dividend Payout - 0.247 Asset Size + 1.05 Covariability Earnings + 0.135 Variability Earnings
NOTE: The F test statistic for overall model adequacy does not change, and neither does the result of the hypothesis test!
NOTE: The estimates S = 0.981620, R-Sq = 74.2%, R-Sq(adj) = 63.5% are unchanged!
NOTE: $R(\beta_1 \mid \beta_0)$, $R(\beta_2 \mid \beta_1, \beta_0)$, ... clearly changed!
Week 4 Lecture 3 - Learning Expectations.
Be familiar with decomposing variation in the general linear model.
Understand sequential sums of squares and be able to interpret and calculate them.
Understand $R^2$ versus $\bar{R}^2$ (adjusted).