I231B QUANTITATIVE METHODS ANOVA continued and Intro to Regression


I231B QUANTITATIVE METHODS

ANOVA continued and Intro to Regression


Agenda

Exploration and Inference revisited

More ANOVA (anova_2factor.do)

Basics of Regression (regress.do)


It is "well known" to be "logically unsound and practically misleading" to make inference as if a model is known to be true when it has, in fact, been selected from the same data to be used for estimation purposes.

- Chris Chatfield, "Model Uncertainty, Data Mining and Statistical Inference", Journal of the Royal Statistical Society, Series A, 158 (1995), 419-466 (p. 421)


Never mix exploratory analysis with inferential modeling of the same variables in the same dataset.

Exploratory model building is when you hand-pick some variables of interest and keep adding/removing them until you find something that ‘works’.

Inferential models are specified in advance: there is an assumed model and you are testing whether it actually works with the current data.
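
A minimal simulation sketch of why the two should not be mixed. It assumes pure-noise data: neither the outcome nor any candidate predictor carries a real relationship. The function names (pearson_r, best_noise_predictor), the sample size, the number of candidates, and the seed are all illustrative choices, not part of the course materials. Hand-picking the candidate that "works" best still tends to turn up a correlation that looks meaningful.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def best_noise_predictor(n_obs=30, n_candidates=40, seed=1):
    """Generate a noise outcome and many noise predictors, then pick the
    predictor with the largest absolute correlation (the exploratory step)."""
    rng = random.Random(seed)
    y = [rng.gauss(0, 1) for _ in range(n_obs)]
    candidates = [[rng.gauss(0, 1) for _ in range(n_obs)]
                  for _ in range(n_candidates)]
    rs = [pearson_r(x, y) for x in candidates]
    best = max(range(n_candidates), key=lambda j: abs(rs[j]))
    return best, rs


best, rs = best_noise_predictor()
typical = sorted(abs(r) for r in rs)[len(rs) // 2]  # roughly the median |r|
print(f"typical |r| among noise predictors: {typical:.2f}")
print(f"|r| of the hand-picked 'winner':    {abs(rs[best]):.2f}")
```

Testing the hand-picked predictor on the same data as if it had been specified in advance would overstate the evidence, because the selection step already favored the largest chance correlation.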


Basic Linear Regression

(ONE IV AND ONE DV)


Regression versus Correlation

Correlation makes no assumption about whether one variable is dependent on the other; it is only a measure of general association.

Regression describes how a single dependent variable depends on one or more explanatory variables. It assumes a one-way causal link from X to Y.

Thus, correlation is a measure of the strength of a relationship (from -1 to 1), while regression measures the exact nature of that relationship (e.g., the specific slope, which is the change in Y given a change in X).
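
The contrast can be sketched with a few lines of Python. The five data points and the function names below are made up for illustration. Rescaling Y by a factor of 10 leaves the correlation unchanged, because it measures strength only, while the regression slope changes with the units of Y.

```python
def correlation_and_slope(xs, ys):
    """Return (r, slope) for a simple regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    r = sxy / (sxx * syy) ** 0.5   # unit-free strength, between -1 and 1
    slope = sxy / sxx              # change in Y per one-unit change in X
    return r, slope


# Hypothetical data, chosen only to keep the arithmetic easy to follow.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r, slope = correlation_and_slope(x, y)
r_scaled, slope_scaled = correlation_and_slope(x, [10 * v for v in y])

print(f"original Y:   r = {r:.3f}, slope = {slope:.3f}")
print(f"Y times 10:   r = {r_scaled:.3f}, slope = {slope_scaled:.3f}")
```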


Basic Linear Model

$Y_i = b_0 + b_1 x_i + e_i$

X (and X-axis) is our independent variable

Y (and Y-axis) is our dependent variable

b0 is a constant (y-intercept)

b1 is the slope (change in Y given a one-unit change in X)

e is the error term (residuals)
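
As a rough sketch of how the pieces of the model fit together, the snippet below estimates b0 and b1 by least squares and then computes the residuals e. The data and the function name fit_line are assumptions for illustration only. In Stata, the regress command reports the same two coefficients.

```python
def fit_line(xs, ys):
    """Least squares estimates (b0, b1) for the model Y = b0 + b1*x + e."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # y-intercept
    return b0, b1


# Hypothetical data points.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = fit_line(x, y)
predicted = [b0 + b1 * xi for xi in x]                 # y-hat for each x
residuals = [yi - yh for yi, yh in zip(y, predicted)]  # e = observed - predicted

print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")
print("residuals:", [round(e, 2) for e in residuals])
```

With an intercept in the model, the residuals sum to zero (up to rounding), which is a quick sanity check on the fit.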


Basic Linear Function


Slope

But...what happens if B is negative?


Statistical Inference Using Least Squares


We obtain a sample statistic, b, which estimates the population parameter.

We also have the standard error for b.

Hypothesis testing uses the standard t-distribution with n-2 degrees of freedom.

$Y_i = b_0 + b_1 x_i + e_i$
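
A hedged sketch of the slope test described above, using made-up data and a hypothetical function name (slope_t_test). It computes b1, its standard error, and the t statistic for the null hypothesis that the population slope is zero, with n-2 degrees of freedom. A p-value is not computed here; the statistic would be compared against a t table or handled by statistical software.

```python
def slope_t_test(xs, ys):
    """Return (b1, standard error of b1, t statistic, degrees of freedom)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    # Sum of squared residuals around the fitted line.
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    df = n - 2                   # two estimated parameters: b0 and b1
    residual_variance = sse / df
    se_b1 = (residual_variance / sxx) ** 0.5
    t_stat = b1 / se_b1          # tests H0: population slope = 0
    return b1, se_b1, t_stat, df


# Hypothetical data points.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b1, se_b1, t_stat, df = slope_t_test(x, y)
print(f"b1 = {b1:.3f}, SE = {se_b1:.3f}, t = {t_stat:.3f}, df = {df}")
```

For these five points the t statistic falls below 3.182, the two-sided 5% critical value for 3 degrees of freedom, so the slope would not be judged significantly different from zero.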


Why Least Squares?

For any set of (X, Y) data points in which X varies, there is one and only one line of best fit. The least squares regression equation minimizes the sum of squared errors between our observed values of Y and our predicted values of Y (often called y-hat).
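
To illustrate the "one and only one line" claim, the sketch below compares the sum of squared errors (SSE) of the least squares line with the SSE of a few nearby lines. The data, the function name sse, and the size of the nudges are illustrative assumptions. Every nudged line fits worse than the least squares line.

```python
def sse(b0, b1, xs, ys):
    """Sum of squared errors between observed ys and the line b0 + b1*x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data points.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Least squares estimates computed from the usual formulas.
n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
b1 = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
      / sum((xi - mean_x) ** 2 for xi in x))
b0 = mean_y - b1 * mean_x

best_sse = sse(b0, b1, x, y)
print(f"least squares line: b0 = {b0:.2f}, b1 = {b1:.2f}, SSE = {best_sse:.3f}")

# Nudge the intercept and slope in each direction and recompute the SSE.
nudged_sses = []
for d0, d1 in [(0.1, 0), (-0.1, 0), (0, 0.1), (0, -0.1), (0.1, -0.1)]:
    value = sse(b0 + d0, b1 + d1, x, y)
    nudged_sses.append(value)
    print(f"b0 {d0:+.1f}, b1 {d1:+.1f}: SSE = {value:.3f}")
```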


Data points and Regression

http://www.math.csusb.edu/faculty/stanton/m262/regress/regress.html