james-rogers
Multiple Regression Analysis
The principles of Simple Regression Analysis can be extended to two or more explanatory variables.
With two explanatory variables we get an equation
Y = α + β1X1 + β2X2. It is customary to write it as Y = β0 + β1X1 + β2X2.
As an example, if a hypotensive agent is administered prior to surgery, recovery time for blood pressure to normal value will depend on the dose of the hypotensive and the blood pressure during surgery.
This can be modelled as Recovery time = β0 + β1 (log dose) + β2 (Surgery B.P.), with β2 expected to be negative.
Recovery time for Blood Pressure and dose of hypotensive
[Figure: scatter plot of RecvTime (vertical axis, 0 to 70) against Logdose (horizontal axis, 2.5 to 6.5), with fitted regression line and 95% CI.]
RecvTime = -14.2576 + 8.00772 Logdose
S = 14.7103   R-Sq = 15.5 %   R-Sq(adj) = 13.8 %
The scatter plot shows a linear relationship. Blood pressure takes longer to come back to its normal value the larger the dose of the hypotensive.
There are many outliers because of individual variability of subjects and because of different types of surgical operations.
Recovery time for Blood Pressure and lowest Blood Pressure reading during surgery
[Figure: scatter plot of RecvTime (vertical axis, 0 to 70) against Bpsurg (horizontal axis, 50 to 90), with fitted regression line and 95% CI.]
RecvTime = 34.4692 - 0.183546 Bpsurg
S = 15.9386   R-Sq = 0.8 %   R-Sq(adj) = 0.0 %
The lower the blood pressure achieved during surgery, the longer the time for it to reach its normal value during recovery from anaesthesia.
Multiple Regression Analysis
The effect of the two explanatory variables acting jointly is described by the equation
Recov. Time = 22.3 + 10.6 Log dose – 0.740 Surg. B.P.
As noted on the scatter plots, several observations were outliers or had larger than expected X values.
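A fit of this kind can be sketched with ordinary least squares. The example below is a minimal illustration, not the analysis behind the slides: the raw data are not given, so the block generates synthetic observations around the reported equation (the sample size, noise level, and the helper name `fit_ols` are all assumptions) and checks that the coefficients are recovered.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 + ...; returns [b0, b1, b2, ...]."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


# Synthetic data for illustration only (NOT the data from the slides).
rng = np.random.default_rng(0)
n = 200                                   # assumed sample size
log_dose = rng.uniform(2.5, 6.5, n)       # range read off the Logdose axis
bp_surg = rng.uniform(50, 90, n)          # range read off the Bpsurg axis
noise = rng.normal(0.0, 1.0, n)           # assumed noise level
recv_time = 22.3 + 10.6 * log_dose - 0.740 * bp_surg + noise

coef = fit_ols(np.column_stack([log_dose, bp_surg]), recv_time)
print("intercept, log dose, surgery B.P.:", coef)
```

Each slope is a partial regression coefficient: it describes the change in recovery time per unit change in one variable while the other is held fixed.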
Categorical Explanatory Variables
Binary variables are coded 0, 1. For example a variable x1 (Gender) is coded male = 0, female = 1. Then in the regression equation Y = β0 + β1x1 + β2x2, when x1 = 1 the value of Y indicates what is obtained for females, and when x1 = 0 the value of Y indicates what is obtained for males.
If we have a nominal variable with more than two categories we have to create a number of new dummy (also called indicator) binary variables.
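Dummy coding can be sketched as follows. The slides do not say how many dummy variables to create; the example assumes the common convention of one 0/1 variable per category except a chosen reference category, which is what a model with an intercept needs. The helper `dummy_code` and the `surgery_type` values are hypothetical.

```python
def dummy_code(values, reference):
    """Return {category: [0/1, ...]} for every category except the reference."""
    categories = sorted(set(values) - {reference})
    return {c: [1 if v == c else 0 for v in values] for c in categories}


# Hypothetical nominal variable with three categories.
surgery_type = ["abdominal", "cardiac", "orthopaedic", "cardiac"]
dummies = dummy_code(surgery_type, reference="abdominal")

for name, column in dummies.items():
    print(name, column)
```

A subject in the reference category has 0 in every dummy column, so each dummy coefficient is read as a difference from that category.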
How many Explanatory Variables?
As a rule of thumb multiple regression analysis should not be performed if the total number of variables is greater than the number of subjects ÷ 10.
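The rule of thumb is simple arithmetic, shown here as a small check. The function name is hypothetical, and the figure of 61 subjects is borrowed from the water hardness example in these slides purely as an illustration.

```python
def regression_advisable(n_subjects, n_variables):
    """Rule of thumb: do not fit if n_variables exceeds n_subjects / 10."""
    return n_variables <= n_subjects / 10


# With 61 subjects the threshold is 6.1 variables.
print(regression_advisable(61, 6))
print(regression_advisable(61, 7))
```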
Analysis
In the computer output look for:
Adjusted R². It represents the proportion of variability of Y explained by the X’s. R² is adjusted so that models with different numbers of variables can be compared.
The F-test in the ANOVA table. Significant F indicates a linear relationship between Y and at least one of the X’s.
The t-test of each partial regression coefficient. Significant t indicates that the variable in question influences the Y response while controlling for other explanatory variables.
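The first two quantities can be computed from R², the number of subjects n, and the number of explanatory variables p, using the standard formulas. The sketch below applies them to the water hardness output reported in these slides (R-Sq = 42.9 %, 61 towns, one explanatory variable); the function names are assumptions, not part of any package output.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared for n subjects and p explanatory variables."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def f_statistic(r2, n, p):
    """Overall F statistic, with p and n - p - 1 degrees of freedom."""
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))


adj = adjusted_r2(0.429, n=61, p=1)
f_value = f_statistic(0.429, n=61, p=1)
print(f"adjusted R-squared = {adj:.3f}, F = {f_value:.1f}")
```

The adjusted value agrees with the R-Sq(adj) = 41.9 % printed on the water hardness plot. With a single explanatory variable the F statistic equals the square of the slope's t statistic, so the two tests agree.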
Usefulness of Scatter Plots - I
The scatter plot on the right illustrates the relationship between water hardness and mortality in 61 large towns in England and Wales.
The regression line indicates an inverse relationship between water hardness and mortality rates.
[Figure: Mortality and Water Hardness. Scatter plot of Mortal (vertical axis, 1000 to 2000) against Calcium (horizontal axis, 0 to 140), with fitted regression line and 95% CI.]
Mortal = 1676.36 - 3.22609 Calcium
S = 143.029   R-Sq = 42.9 %   R-Sq(adj) = 41.9 %
Usefulness of Scatter Plots - II
[Figure: Mortality and Water Hardness in Towns in the North. Scatter plot of MortalN (vertical axis, 1400 to 2000) against CalciumN (horizontal axis, 0 to 100), with fitted regression line and 95% CI.]
MortalN = 1692.31 - 1.93134 CalciumN
S = 129.209   R-Sq = 13.6 %   R-Sq(adj) = 11.0 %
The inverse relationship between water hardness and mortality is still maintained. But for towns in the North the regression line is less steep than for towns in the South, indicating that other causes of mortality are stronger in the North compared to the South.
[Figure: Mortality and Water Hardness in Towns in the South. Scatter plot of MortalS (vertical axis, 1100 to 1600) against CalciumS (horizontal axis, 0 to 140), with fitted regression line and 95% CI.]
MortalS = 1522.82 - 2.09272 CalciumS
S = 114.297   R-Sq = 36.3 %   R-Sq(adj) = 33.6 %
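The two regional lines can be compared by evaluating each at the same calcium value. The sketch below uses the coefficients printed on the North and South plots; the calcium value of 50 is an arbitrary illustration chosen inside the range of both plots, and the function names are assumptions.

```python
def predict_north(calcium):
    """Fitted line for towns in the North: MortalN = 1692.31 - 1.93134 CalciumN."""
    return 1692.31 - 1.93134 * calcium


def predict_south(calcium):
    """Fitted line for towns in the South: MortalS = 1522.82 - 2.09272 CalciumS."""
    return 1522.82 - 2.09272 * calcium


calcium = 50  # arbitrary value within the range of both plots
north = predict_north(calcium)
south = predict_south(calcium)
print(f"North: {north:.1f}  South: {south:.1f}  difference: {north - south:.1f}")
```

At the same water hardness the fitted mortality is higher in the North, and the smaller slope there means a given increase in calcium is associated with a smaller fall in mortality.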