Interaction in Regression (Continued), Session 25. Course: I0174 – Regression Analysis. Year: Odd semester...
- Slide 1
- Interaction in Regression (Continued). Session 25. Course:
I0174 Regression Analysis. Year: Odd semester 2007/2008
- Slide 2
- Bina Nusantara. Session 25: Interaction in Regression (Continued).
Test for equality of regression slope coefficients. A worked calculation example.
- Slide 3
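- The Multiple Regression Model: the relationship between 1
dependent and 2 or more independent variables is a linear function,
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + \varepsilon_i$,
where $Y_i$ is the dependent (response) variable, the $X_{ji}$ are the
independent (explanatory) variables, $\beta_0$ is the population
Y-intercept, $\beta_1, \ldots, \beta_k$ are the population slopes, and
$\varepsilon_i$ is the random error.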
- Slide 4
- Multiple Regression Model: bivariate model
- Slide 5
- Multiple Regression Equation: bivariate model
- Slide 6
- Interpretation of Estimated Coefficients. Slope ($b_j$): the
average value of $Y$ is estimated to change by $b_j$ for each
1-unit increase in $X_j$, holding all other variables constant
(ceteris paribus). Example: if $b_1 = -2$, then fuel oil usage ($Y$)
is expected to decrease by an estimated 2 gallons for each 1-degree
increase in temperature ($X_1$), given the inches of insulation
($X_2$). Y-intercept ($b_0$): the estimated average value of $Y$
when all $X_j = 0$.
- Slide 7
- Multiple Regression Model: Example. Develop a model for
estimating heating oil used for a single-family home in the month
of January, based on average temperature (°F) and amount of
insulation in inches.
- Slide 8
- Multiple Regression Equation: Example (Excel output). For each
degree increase in temperature, the estimated average amount of
heating oil used decreases by 5.437 gallons, holding insulation
constant. For each one-inch increase in insulation, the estimated
average use of heating oil decreases by 20.012 gallons, holding
temperature constant.
- Slide 9
- Multiple Regression in PHStat: PHStat | Regression | Multiple
Regression. Excel spreadsheet for the heating oil example.
- Slide 10
- Venn Diagrams and Explanatory Power of Regression (Oil, Temp).
Diagram labels: variations in Oil explained by Temp, or variations
in Temp used in explaining variation in Oil; variations in Oil
explained by the error term; variations in Temp not used in
explaining variation in Oil.
- Slide 11
- Venn Diagrams and Explanatory Power of Regression (Oil, Temp),
continued
- Slide 12
- Venn Diagrams and Explanatory Power of Regression (Oil, Temp,
Insulation). Diagram labels: overlapping variation in both Temp and
Insulation is used in explaining the variation in Oil but NOT in
the estimation of $\beta_1$ nor $\beta_2$; variation NOT explained
by Temp nor Insulation.
- Slide 13
- Coefficient of Multiple Determination: the proportion of total
variation in $Y$ explained by all $X$ variables taken together. It
never decreases when a new $X$ variable is added to the model,
which is a disadvantage when comparing among models.
- Slide 14
- Venn Diagrams and Explanatory Power of Regression (Oil, Temp,
Insulation)
- Slide 15
- Adjusted Coefficient of Multiple Determination: the proportion
of variation in $Y$ explained by all the $X$ variables, adjusted
for the sample size and the number of $X$ variables used. It
penalizes excessive use of independent variables, is smaller than
$r^2$, and is useful in comparing among models. It can decrease if
an insignificant new $X$ variable is added to the model.
- Slide 16
- Coefficient of Multiple Determination (Excel output). Adjusted
$r^2$ reflects the number of explanatory variables and the sample
size, and is smaller than $r^2$.
- Slide 17
- Interpretation of Coefficient of Multiple Determination: 96.56%
of the total variation in heating oil can be explained by
temperature and amount of insulation. 95.99% of the total variation
in heating oil can be explained by temperature and amount of
insulation after adjusting for the number of explanatory variables
and the sample size.
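The two percentages on this slide can be related with the usual adjustment formula, adjusted $r^2 = 1 - (1 - r^2)\,(n - 1)/(n - k - 1)$. The sketch below is only an illustration: `adjusted_r2` is a hypothetical helper name, and the sample size n = 15 is an assumption inferred from the degrees of freedom (2 and 12) quoted later in the deck, not a figure stated on this slide.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjust r^2 for sample size n and number of explanatory variables k."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# r^2 = 0.9656 comes from the slide; n = 15 is an assumption (df = 2 and 12).
adj = adjusted_r2(r2=0.9656, n=15, k=2)
print(round(adj, 4))  # 0.9599
```

With these inputs the adjustment reproduces the 95.99% quoted on the slide.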
- Slide 18
- Simple and Multiple Regression Compared. Simple: the slope
coefficient in a simple regression picks up the impact of the
independent variable plus the impacts of other variables that are
excluded from the model but are correlated with the included
independent variable and the dependent variable. Multiple:
coefficients in a multiple regression net out the impacts of other
variables in the equation; hence, they are called the net
regression coefficients. They still pick up the effects of other
variables that are excluded from the model but are correlated with
the included independent variables and the dependent variable.
- Slide 19
- Simple and Multiple Regression Compared: Example. Two simple
regressions versus one multiple regression.
- Slide 20
- Simple and Multiple Regression Compared: Slope Coefficients
- Slide 21
- Simple and Multiple Regression Compared: $r^2$
- Slide 22
- Example: Adjusted $r^2$ Can Decrease. Adjusted $r^2$ decreases
when $k$ increases from 2 to 3: Color is not useful in explaining
the variation in oil consumption.
- Slide 23
- Using the Regression Equation to Make Predictions. Predict the
amount of heating oil used for a home if the average temperature is
30 °F and the insulation is 6 inches. The predicted heating oil
used is 278.97 gallons.
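The prediction can be retraced from the slopes quoted earlier in the deck (5.437 gallons per degree, 20.012 gallons per inch). The intercept does not survive in this transcript, so the value `B0 = 562.151` below is an assumption, back-solved so that the equation reproduces the slide's own 278.97 gallons; `predict_oil` is a hypothetical helper name.

```python
B0 = 562.151   # ASSUMPTION: intercept is not shown in the transcript
B1 = -5.437    # gallons per 1 degree F of temperature (from the deck)
B2 = -20.012   # gallons per 1 inch of insulation (from the deck)


def predict_oil(temp_f: float, insulation_in: float) -> float:
    """Predicted heating oil (gallons) from temperature and insulation."""
    return B0 + B1 * temp_f + B2 * insulation_in


pred = predict_oil(temp_f=30, insulation_in=6)
print(round(pred, 2))  # roughly 278.97 gallons
```

The point of the sketch is the arithmetic: each slope is multiplied by its own variable and the results are added to the intercept.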
- Slide 24
- Test for Overall Significance (Excel output example). Output
annotations: $k = 2$, the number of explanatory variables; $n - 1$;
$p$-value.
- Slide 25
- Test for Overall Significance: Example Solution.
$H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$; $H_1$: at least one
$\beta_j \ne 0$; $\alpha = .05$; df = 2 and 12. Critical value:
$F = 3.89$. Test statistic: $F = 168.47$ (Excel output). Decision:
reject $H_0$ at $\alpha = 0.05$. Conclusion: there is evidence that
at least one independent variable affects $Y$.
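As a rough cross-check, the overall F statistic can be recovered from $r^2$ through the identity $F = (r^2/k) / ((1 - r^2)/(n - k - 1))$. The sketch below assumes n = 15 (implied by df = 2 and 12) and uses the rounded $r^2 = 0.9656$ quoted earlier in the deck, so it lands near, not exactly on, the Excel value of 168.47. `overall_f` is a hypothetical helper name.

```python
def overall_f(r2: float, n: int, k: int) -> float:
    """Overall F statistic expressed through r^2."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


F_CRIT = 3.89  # critical value for df = 2 and 12 at alpha = .05 (from the slide)

f_stat = overall_f(r2=0.9656, n=15, k=2)  # n = 15 is an assumption
reject_h0 = f_stat > F_CRIT
print(round(f_stat, 1), reject_h0)
```

The small gap from 168.47 comes from rounding $r^2$ to four decimals; the decision to reject is unaffected.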
- Slide 26
- Test for Significance: Individual Variables. Shows whether $Y$
depends linearly on a single $X_j$ individually while holding the
effects of the other $X$s fixed. Use the $t$ test statistic.
Hypotheses: $H_0: \beta_j = 0$ (no linear relationship);
$H_1: \beta_j \ne 0$ (linear relationship between $X_j$ and $Y$).
- Slide 27
- $t$ Test Statistic (Excel output example): $t$ test statistic
for $X_1$ (Temperature) and $t$ test statistic for $X_2$
(Insulation).
- Slide 28
- $t$ Test: Example Solution. Does temperature have a significant
effect on monthly consumption of heating oil? Test at
$\alpha = 0.05$. $H_0: \beta_1 = 0$; $H_1: \beta_1 \ne 0$; df = 12.
Critical values: $\pm 2.1788$ (.025 in each tail). Test statistic:
$t = -16.1699$. Decision: reject $H_0$ at $\alpha = 0.05$.
Conclusion: there is evidence of a significant effect of
temperature on oil consumption, holding constant the effect of
insulation.
- Slide 29
- Venn Diagrams and Estimation of Regression Model (Oil, Temp,
Insulation). Diagram labels: only the variation unique to one
explanatory variable is used in the estimation of its slope; the
overlapping information is NOT used in the estimation of $\beta_1$
nor $\beta_2$.
- Slide 30
- Confidence Interval Estimate for the Slope. Provide the 95%
confidence interval for the population slope $\beta_1$ (the effect
of temperature on oil consumption):
$-6.169 \le \beta_1 \le -4.704$. We are 95% confident that the
estimated average consumption of oil is reduced by between 4.70 and
6.17 gallons for each increase of 1 °F, holding insulation
constant. We can also perform the test for the significance of
individual variables, $H_0: \beta_1 = 0$ vs. $H_1: \beta_1 \ne 0$,
using this confidence interval.
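The interval can be retraced from numbers that appear elsewhere in the deck: the slope 5.437 (as a decrease), the test statistic $t = -16.1699$, and the critical value 2.1788 for df = 12. The sketch below recovers the standard error as slope divided by $t$, which is an inference rather than a figure shown in the transcript, and then forms slope ± critical value × standard error.

```python
b1 = -5.437        # estimated slope for temperature (from the deck)
t_stat = -16.1699  # t test statistic for temperature (from the deck)
t_crit = 2.1788    # two-tailed critical value, df = 12, alpha = .05

se_b1 = b1 / t_stat          # inferred standard error of b1
margin = t_crit * se_b1      # half-width of the 95% interval
lower, upper = b1 - margin, b1 + margin

print(round(lower, 3), round(upper, 3))

# Zero lies outside the interval, which matches rejecting H0: beta_1 = 0.
zero_inside = lower <= 0.0 <= upper
```

Because the inputs are rounded, the endpoints agree with the slide's −6.169 and −4.704 only to about three decimals.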
- Slide 31
- Contribution of a Single Independent Variable. Let $X_j$ be the
independent variable of interest.
$SSR(X_j \mid \text{all others except } X_j) = SSR(\text{all}) - SSR(\text{all others except } X_j)$
measures the additional contribution of $X_j$ in explaining the
total variation in $Y$ with the inclusion of all the remaining
independent variables.
- Slide 32
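- Contribution of a Single Independent Variable.
$SSR(X_1 \mid X_2 \text{ and } X_3) = SSR(X_1, X_2 \text{ and } X_3) - SSR(X_2 \text{ and } X_3)$
measures the additional contribution of $X_1$ in explaining $Y$
with the inclusion of $X_2$ and $X_3$. Each term is taken from the
ANOVA section of the corresponding regression: the model with
$X_1$, $X_2$ and $X_3$, and the model with $X_2$ and $X_3$ only.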
- Slide 33
- Coefficient of Partial Determination of $X_j$: measures the
proportion of variation in the dependent variable that is explained
by $X_j$ while controlling for (holding constant) the other
independent variables.
- Slide 34
- Coefficient of Partial Determination for $X_j$ (continued).
Example: model with two independent variables.
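The formula for this slide did not survive extraction. For reference, the usual textbook form for a model with two independent variables is sketched below; the notation ($r^2_{Y1.2}$, $SSR$, $SST$, $SSE$) is assumed to match the deck's and may differ from the original slide.

```latex
% Coefficient of partial determination of X_1, holding X_2 constant
r^2_{Y1.2}
  = \frac{SSR(X_1 \mid X_2)}
         {SST - SSR(X_1 \text{ and } X_2) + SSR(X_1 \mid X_2)},
\qquad
SSR(X_1 \mid X_2) = SSR(X_1 \text{ and } X_2) - SSR(X_2).
```

Since $SST - SSR(X_1 \text{ and } X_2) = SSE(X_1 \text{ and } X_2)$, the denominator equals $SSE(X_2)$, the variation left unexplained by $X_2$ alone.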
- Slide 35
- Venn Diagrams and Coefficient of Partial Determination for $X_j$
(Oil, Temp, Insulation)
- Slide 36
- Coefficient of Partial Determination in PHStat: PHStat |
Regression | Multiple Regression. Check the Coefficient of Partial
Determination box. Excel spreadsheet for the heating oil example.
- Slide 37
- Contribution of a Subset of Independent Variables. Let $X_s$ be
the subset of independent variables of interest.
$SSR(X_s \mid \text{all others except } X_s) = SSR(\text{all}) - SSR(\text{all others except } X_s)$
measures the contribution of the subset $X_s$ in explaining SST
with the inclusion of the remaining independent variables.
- Slide 38
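- Contribution of a Subset of Independent Variables: Example. Let
$X_s$ be $X_1$ and $X_3$:
$SSR(X_1 \text{ and } X_3 \mid X_2) = SSR(X_1, X_2 \text{ and } X_3) - SSR(X_2)$.
Each term is taken from the ANOVA section of the corresponding
regression.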
- Slide 39
- Testing Portions of Model. Examines the contribution of a subset
$X_s$ of explanatory variables to the relationship with $Y$. Null
hypothesis: variables in the subset do not improve the model
significantly when all other variables are included. Alternative
hypothesis: at least one variable in the subset is significant when
all other variables are included.
- Slide 40
- Testing Portions of Model (continued). One-tailed rejection
region. Requires comparison of two regressions: one regression
includes everything; another regression includes everything except
the portion to be tested.
- Slide 41
- Partial F Test for the Contribution of a Subset of $X$
Variables. Hypotheses: $H_0$: variables $X_s$ do not significantly
improve the model given all other variables included; $H_1$:
variables $X_s$ significantly improve the model given all others
included. Test statistic:
$F = \dfrac{SSR(X_s \mid \text{all others}) / m}{MSE(\text{all})}$,
with df = $m$ and $(n - k - 1)$, where $m$ is the number of
variables in the subset $X_s$.
- Slide 42
- Partial F Test for the Contribution of a Single $X_j$.
Hypotheses: $H_0$: variable $X_j$ does not significantly improve
the model given all others included; $H_1$: variable $X_j$
significantly improves the model given all others included. Test
statistic:
$F = \dfrac{SSR(X_j \mid \text{all others})}{MSE(\text{all})}$,
with df = 1 and $(n - k - 1)$; $m = 1$ here.
- Slide 43
- Testing Portions of Model: Example. Test at the $\alpha = .05$
level to determine if the variable of average temperature
significantly improves the model, given that insulation is
included.
- Slide 44
- Testing Portions of Model: Example. $H_0$: $X_1$ (temperature)
does not improve the model with $X_2$ (insulation) included;
$H_1$: $X_1$ does improve the model. $\alpha = .05$, df = 1 and 12,
critical value = 4.75. ANOVA tables: for $X_1$ and $X_2$; for
$X_2$. Conclusion: reject $H_0$; $X_1$ does improve the model.
- Slide 45
- Testing Portions of Model in PHStat: PHStat | Regression |
Multiple Regression. Check the Coefficient of Partial Determination
box. Excel spreadsheet for the heating oil example.
- Slide 46
- Do We Need to Do This for One Variable? The F test for the
contribution of a single variable after all other variables are
included in the model is IDENTICAL to the t test of the slope for
that variable. The only reason to perform an F test is to test
several variables together.
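The equivalence can be checked numerically with figures quoted elsewhere in the deck: squaring the temperature $t$ statistic (−16.1699) gives the partial F statistic, and squaring the $t$ critical value (2.1788, df = 12) gives the F critical value (4.75, df = 1 and 12). This is a minimal sketch of that identity, not output from the original spreadsheet.

```python
t_stat = -16.1699  # t statistic for temperature (from the deck)
t_crit = 2.1788    # two-tailed t critical value, df = 12, alpha = .05

f_stat = t_stat ** 2   # partial F for a single variable equals t squared
f_crit = t_crit ** 2   # F critical value with df = 1 and 12

# Both tests must lead to the same decision.
same_decision = (abs(t_stat) > t_crit) == (f_stat > f_crit)
print(round(f_stat, 2), round(f_crit, 2), same_decision)  # 261.47 4.75 True
```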
- Slide 47
- Dummy-Variable Models. Categorical explanatory variable with 2
or more levels: Yes or No, On or Off, Male or Female. Use dummy
variables (coded as 0 or 1). Only intercepts are different; assumes
equal slopes across categories. The number of dummy variables
needed is (number of levels − 1). The regression model has the same
form:
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + \varepsilon_i$.
- Slide 48
- Dummy-Variable Models (with 2 Levels). Given: $Y$ = assessed
value of house; $X_1$ = square footage of house; $X_2$ =
desirability of neighborhood (0 if undesirable, 1 if desirable).
Desirable ($X_2 = 1$):
$\hat{Y} = b_0 + b_1 X_1 + b_2(1) = (b_0 + b_2) + b_1 X_1$.
Undesirable ($X_2 = 0$):
$\hat{Y} = b_0 + b_1 X_1 + b_2(0) = b_0 + b_1 X_1$. Same slopes.
- Slide 49
- Dummy-Variable Models (with 2 Levels), continued. Figure: $Y$
(assessed value) against $X_1$ (square footage), with one line for
Desirable and one for Undesirable locations; intercepts $b_0 + b_2$
and $b_0$ differ, slopes are the same.
- Slide 50
- Interpretation of the Dummy-Variable Coefficient (with 2
Levels). Example: $Y$: annual salary of college graduate in
thousands of dollars; $X_1$: GPA; $X_2$: 0 for a non-business
degree, 1 for a business degree. With the same GPA, college
graduates with a business degree are making an estimated 6 thousand
dollars more than graduates with a non-business degree, on average.
- Slide 51
- Dummy-Variable Models (with 3 Levels)
- Slide 52
- Interpretation of the Dummy-Variable Coefficients (with 3
Levels). With the same footage, a Split-level will have an
estimated average assessed value of 18.84 thousand dollars more
than a Condo. With the same footage, a Ranch will have an estimated
average assessed value of 23.53 thousand dollars more than a Condo.
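A three-level category needs (3 − 1) = 2 dummy variables, with the omitted level acting as the baseline. The sketch below assumes Condo is the baseline, since both interpretations compare against it; the names `LEVELS`, `dummy_code` and `premium_over_condo` are hypothetical and only the two coefficients (18.84 and 23.53) come from the slide.

```python
LEVELS = ["Condo", "Split-level", "Ranch"]  # first entry is the assumed baseline


def dummy_code(style: str) -> tuple:
    """Return (split_level, ranch) 0/1 dummies; Condo maps to (0, 0)."""
    if style not in LEVELS:
        raise ValueError(f"unknown style: {style}")
    return tuple(1 if style == level else 0 for level in LEVELS[1:])


def premium_over_condo(style: str) -> float:
    """Estimated difference in assessed value (thousand $) at equal footage."""
    d_split, d_ranch = dummy_code(style)
    return 18.84 * d_split + 23.53 * d_ranch


for style in LEVELS:
    print(style, dummy_code(style), premium_over_condo(style))
```

Only the intercept shifts between house styles here; the slope on square footage is shared, as the dummy-variable model assumes.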
- Slide 53
- Regression Model Containing an Interaction Term. Hypothesizes
interaction between a pair of $X$ variables: the response to one
$X$ variable varies at different levels of another $X$ variable.
Contains a cross-product term. Can be combined with other models,
e.g., the dummy-variable model.
- Slide 54
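- Effect of Interaction. Given:
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{1i} X_{2i} + \varepsilon_i$.
Without the interaction term, the effect of $X_1$ on $Y$ is
measured by $\beta_1$. With the interaction term, the effect of
$X_1$ on $Y$ is measured by $\beta_1 + \beta_3 X_2$, so the effect
changes as $X_2$ changes.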
- Slide 55
- Interaction Example: $Y = 1 + 2X_1 + 3X_2 + 4X_1X_2$.
For $X_2 = 1$: $Y = 1 + 2X_1 + 3(1) + 4X_1(1) = 4 + 6X_1$.
For $X_2 = 0$: $Y = 1 + 2X_1 + 3(0) + 4X_1(0) = 1 + 2X_1$.
The effect (slope) of $X_1$ on $Y$ depends on the value of $X_2$.
Figure: $Y$ against $X_1$ for both lines.
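The two lines on this slide follow from collecting terms in $X_1$: the intercept becomes $1 + 3X_2$ and the slope becomes $2 + 4X_2$. A minimal sketch of that algebra, with hypothetical function names:

```python
def y_hat(x1: float, x2: float) -> float:
    """The slide's example equation with an interaction term."""
    return 1 + 2 * x1 + 3 * x2 + 4 * x1 * x2


def intercept_given_x2(x2: float) -> float:
    """Intercept of the Y-versus-X1 line at a fixed X2."""
    return 1 + 3 * x2


def slope_given_x2(x2: float) -> float:
    """Slope of the Y-versus-X1 line at a fixed X2."""
    return 2 + 4 * x2


for x2 in (0, 1):
    print(f"X2 = {x2}: Y = {intercept_given_x2(x2)} + {slope_given_x2(x2)} X1")
# X2 = 0: Y = 1 + 2 X1
# X2 = 1: Y = 4 + 6 X1
```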
- Slide 56
- Interaction Regression Model Worksheet. Multiply $X_1$ by $X_2$
to get $X_1X_2$, then run the regression with $Y$, $X_1$, $X_2$,
$X_1X_2$.
    Case i   Y_i   X_1i   X_2i   X_1i X_2i
    1        1     1      3      3
    2        4     8      5      40
    3        1     3      2      6
    4        3     5      6      30
    :        :     :      :      :
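The worksheet step amounts to adding one product column before running the regression. The sketch below uses the four visible worksheet rows, read from the flattened transcript as (Y, X1, X2); that reading is an assumption, although the products 3, 40, 6 and 30 in the last column agree with it.

```python
# (Y, X1, X2) for the four cases visible on the worksheet
cases = [(1, 1, 3), (4, 8, 5), (1, 3, 2), (3, 5, 6)]

# Append the cross-product X1 * X2 as a fourth column.
worksheet = [(y, x1, x2, x1 * x2) for (y, x1, x2) in cases]

products = [row[3] for row in worksheet]
print(products)  # [3, 40, 6, 30]
```

The regression is then run on Y against the three columns X1, X2 and X1X2.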
- Slide 57
- Interpretation When There Are 3+ Levels.
MALE = 1 if male; 0 if female.
MARRIED = 1 if married; 0 if not.
DIVORCED = 1 if divorced; 0 if not.
MALEMARRIED = 1 if male and married; 0 otherwise (= MALE times MARRIED).
MALEDIVORCED = 1 if male and divorced; 0 otherwise (= MALE times DIVORCED).
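Because every variable here is coded 0 or 1, each interaction dummy is simply the product of its two parent dummies. A small sketch, with a hypothetical helper name, that builds the full set of regressors for one person:

```python
def code_person(male: int, married: int, divorced: int) -> dict:
    """Main-effect dummies plus their products, as defined on the slide."""
    return {
        "MALE": male,
        "MARRIED": married,
        "DIVORCED": divorced,
        "MALEMARRIED": male * married,
        "MALEDIVORCED": male * divorced,
    }


# A single person has MARRIED = 0 and DIVORCED = 0 (the baseline level).
print(code_person(male=1, married=1, divorced=0))
print(code_person(male=0, married=0, divorced=1))
```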
- Slide 58
- Interpretation When There Are 3+ Levels (continued)
- Slide 59
- Interpreting Results. FEMALE and MALE results are compared for
Single, Married and Divorced, with the difference between them.
Main effects: MALE, MARRIED and DIVORCED. Interaction effects:
MALEMARRIED and MALEDIVORCED.
- Slide 60
- Evaluating the Presence of Interaction with Dummy-Variable.
Suppose $X_1$ and $X_2$ are numerical variables and $X_3$ is a
dummy variable. To test if the slopes of $Y$ with $X_1$ and/or
$X_2$ are the same for the two levels of $X_3$. Model:
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \beta_4 X_{1i} X_{3i} + \beta_5 X_{2i} X_{3i} + \varepsilon_i$.
Hypotheses: $H_0: \beta_4 = \beta_5 = 0$ (no interaction between
$X_1$ and $X_3$ or $X_2$ and $X_3$); $H_1$: $\beta_4$ and/or
$\beta_5 \ne 0$ ($X_1$ and/or $X_2$ interacts with $X_3$). Perform
a partial F test.
- Slide 61
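- Evaluating the Presence of Interaction with Numerical Variables.
Suppose $X_1$, $X_2$ and $X_3$ are numerical variables. To test if
the independent variables interact with each other. Model:
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \beta_4 X_{1i} X_{2i} + \beta_5 X_{1i} X_{3i} + \beta_6 X_{2i} X_{3i} + \varepsilon_i$.
Hypotheses: $H_0: \beta_4 = \beta_5 = \beta_6 = 0$ (no interaction
among $X_1$, $X_2$ and $X_3$); $H_1$: at least one of $\beta_4$,
$\beta_5$, $\beta_6 \ne 0$ (at least one pair of $X_1$, $X_2$,
$X_3$ interact with each other). Perform a partial F test.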
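With three numerical variables there are three possible pairs, so the full model carries three cross-product columns, and the partial F test compares it with the model that omits all three (m = 3). The sketch below only builds those columns for one observation; the helper name and the sample values are hypothetical.

```python
from itertools import combinations


def add_pairwise_interactions(row: dict) -> dict:
    """Return a copy of row with one product column per pair of variables."""
    out = dict(row)
    for a, b in combinations(sorted(row), 2):
        out[f"{a}*{b}"] = row[a] * row[b]
    return out


# Made-up values for one observation, for illustration only.
full_row = add_pairwise_interactions({"X1": 2, "X2": 3, "X3": 5})
print(full_row)
# {'X1': 2, 'X2': 3, 'X3': 5, 'X1*X2': 6, 'X1*X3': 10, 'X2*X3': 15}

m = len(full_row) - 3  # number of interaction terms tested together
```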