Lecture 13 Psychology 790
Chapter 1: Linear Regression With One Predictor Variable
Lecture 13, October 24, 2006, Psychology 790
Today's Lecture Schedule
Regression Concepts
Simple Linear Regression
Wrapping Up
Today's Lecture
Where we are going for the rest of the semester.
Simple linear regression.
Chapter 1 of Kutner.
Regression concepts.
Our New Schedule
Regression Concepts
Linear Regression
We use regression analysis when we want to predict one variable from another.
The most basic form of regression is called simple regression:
We have 1 independent variable and 1 dependent variable.
We are predicting a linear trend (both are continuous variables).
$Y_i = \beta_0 + \beta_1 X_i + \epsilon_i$
In regression, we attempt to determine the magnitude of the (typically imperfect) relationship between a set of independent variables and the dependent variable.
Linear Regression
Independent variable(s) (X): Also called the predictor variable. The variable that we believe influences our dependent variable.
Independent variables are on the right side of the equation; dependent variables are on the left side of the equation.
Dependent variable(s) (Y): Also called the response variable. The variable of interest that we want to predict.
Basic Concepts in Regression
A regression model is a formal way of stating both of the following:
1. A tendency of the response variable (dependent) Y to vary with the predictor variable (independent) X.
2. A scattering of points around some statistical relationship (in our case a line).
A regression model has the following two characteristics:
1. There is a probability distribution of Y for each level of X.
2. The means of these probability distributions vary in some systematic fashion with X.
Other Forms of Regression
As we will see later, regression can take on many different forms.
We can alter our simple regression in the following ways:
Add more than 1 independent variable.
Add more than 1 dependent variable.
Study a non-linear relationship.
Study the relationship with categorical independent variables (ANOVA).
What if we wanted to linearly predict Y given a value of a single variable X?
We use Simple Linear Regression.
Simple Linear Regression
Assume (for now) that X, the independent variable, is fixed at pre-determined levels in an experiment.
For example, we have an experiment where subjects are given X cups of coffee.
Subjects should be randomly assigned to a group drinking either 1, 2, 3, 4, or 5 cups of coffee.
Then we want to estimate the linear effect of the independent variable X on the dependent variable Y.
For our example, we want to see how coffee drinking affects blood pressure.
Blood pressure = Y = dependent variable.
The Basics
The linear regression model (for observation $i = 1, \ldots, N$):
$Y_i = \beta_0 + \beta_1 X_i + \epsilon_i$
Don't be confused by the Greek alphabet; this is simply the equation for a line ($y = mx + b$).
$\beta_0$ is the mean of the population when $X$ is zero, that is, the $Y$ intercept.
$\beta_1$ is the slope of the line: the amount of increase in $Y$ brought about by a unit increase in $X$ (from $X$ to $X + 1$).
$\epsilon_i$ is the random error, specific to each observation.
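As a rough illustration of the model above, the following Python sketch generates responses from a line plus random error. The parameter values (an intercept of 110, a slope of 2, an error standard deviation of 5), the design of 4 subjects per coffee level, and the function names are all hypothetical choices made for this sketch, not values from the lecture. Drawing the errors from a normal distribution is also an assumption made only for the simulation.

```python
import random

def mean_response(beta0, beta1, x):
    """Mean of Y at a given X: the straight-line part of the model."""
    return beta0 + beta1 * x

def simulate_responses(beta0, beta1, sigma, xs, seed=0):
    """Draw one Y_i = beta0 + beta1 * X_i + e_i for every X_i in xs.

    The errors e_i are drawn independently from N(0, sigma^2); normality is
    an assumption of this sketch, not a requirement of the model itself.
    """
    rng = random.Random(seed)
    return [mean_response(beta0, beta1, x) + rng.gauss(0.0, sigma) for x in xs]

# Hypothetical design: 4 subjects at each of 1, 2, 3, 4, 5 cups of coffee.
cups = [x for x in range(1, 6) for _ in range(4)]
blood_pressure = simulate_responses(beta0=110.0, beta1=2.0, sigma=5.0, xs=cups)

# Show the first few simulated (X, Y) pairs.
for x, y in zip(cups[:5], blood_pressure[:5]):
    print(f"X = {x}, Y = {y:.1f}")
```

Setting `sigma` to zero removes the scatter, so every simulated response falls exactly on the line.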
Important Features
1. The response $Y_i$ is a random variable.
2. $E(\epsilon_i) = 0$, therefore $E(Y_i) = \beta_0 + \beta_1 X_i$.
3. The response $Y_i$ deviates from its mean $E(Y_i)$ by the error term $\epsilon_i$.
4. $\sigma^2(\epsilon_i) = \sigma^2(Y_i) = \sigma^2$: each probability distribution of $Y$ has the same variance $\sigma^2$.
5. All error terms are uncorrelated.
Each response $Y_i$ comes from a probability distribution with:
Mean: $E(Y_i) = \beta_0 + \beta_1 X_i$
Variance: $\mathrm{Var}(Y_i) = \sigma^2(Y_i) = \sigma^2(\beta_0 + \beta_1 X_i + \epsilon_i) = \sigma^2(\epsilon_i) = \sigma^2$
Any two responses are uncorrelated.
Parameter Estimates
The simple linear regression model is parameterized as:
$Y_i = \beta_0 + \beta_1 X_i + \epsilon_i$
To find estimates for $\beta_0$ and $\beta_1$ there are quite a few choices:
So that $\sum_{i=1}^{N} \epsilon_i^2$ is minimized.
By making distributional assumptions about $\epsilon_i$ and using maximum likelihood estimators.
So that $\sum_{i=1}^{N} |\epsilon_i|$ is minimized.
From some guy in the hallway, or Bob Henson's (http://www.uncg.edu/ rahenson/) Dad.
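To make the two minimization criteria in the list above concrete, here is a minimal Python sketch that scores candidate lines by the sum of squared errors and by the sum of absolute errors. The data values and function names are invented for illustration and do not come from the lecture.

```python
def residuals(b0, b1, xs, ys):
    """Errors e_i = Y_i - (b0 + b1 * X_i) for a candidate line."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

def sum_squared_errors(b0, b1, xs, ys):
    """Least squares criterion: sum of e_i^2."""
    return sum(e ** 2 for e in residuals(b0, b1, xs, ys))

def sum_absolute_errors(b0, b1, xs, ys):
    """Least absolute deviations criterion: sum of |e_i|."""
    return sum(abs(e) for e in residuals(b0, b1, xs, ys))

# Hypothetical data, invented for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Score two candidate (intercept, slope) pairs under both criteria.
for b0, b1 in [(2.2, 0.6), (1.0, 1.0)]:
    sse = sum_squared_errors(b0, b1, xs, ys)
    sae = sum_absolute_errors(b0, b1, xs, ys)
    print(f"b0 = {b0}, b1 = {b1}: squared = {sse:.2f}, absolute = {sae:.2f}")
```

For these made-up data the first candidate is the least squares line, so it has the smaller sum of squared errors of the two.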
And The Winner Is...
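Finding $\beta_0$ and $\beta_1$ that minimize:
$\sum_{i=1}^{N} \epsilon_i^2$
Using calculus, these happen to be:
$\hat{\beta}_1 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} = r_{xy} \frac{s_y}{s_x} = \frac{\sum x_i y_i}{\sum x_i^2}$
where $x_i = X_i - \bar{X}$ and $y_i = Y_i - \bar{Y}$ are deviation scores.
And:
$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$
LS estimates are considered BLUE: Best Linear Unbiased Estimators.
You are in luck: the LS estimators for $\beta_0$ and $\beta_1$ are also the MLEs for $\beta_0$ and $\beta_1$ when error terms are $N(0, \sigma^2)$.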
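The calculus step can be sketched as follows. The symbol $Q$ is introduced here only as shorthand for the least squares criterion; it does not appear on the slides.

```latex
\begin{align*}
Q(\beta_0, \beta_1) &= \sum_{i=1}^{N} \epsilon_i^2
                     = \sum_{i=1}^{N} (Y_i - \beta_0 - \beta_1 X_i)^2 \\
\frac{\partial Q}{\partial \beta_0} &= -2 \sum_{i=1}^{N} (Y_i - \beta_0 - \beta_1 X_i) = 0
  \quad\Rightarrow\quad \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X} \\
\frac{\partial Q}{\partial \beta_1} &= -2 \sum_{i=1}^{N} X_i (Y_i - \beta_0 - \beta_1 X_i) = 0 \\
\text{Substituting } \hat{\beta}_0: \quad
0 &= \sum_{i=1}^{N} X_i \left[ (Y_i - \bar{Y}) - \hat{\beta}_1 (X_i - \bar{X}) \right] \\
\hat{\beta}_1 &= \frac{\sum_{i=1}^{N} X_i (Y_i - \bar{Y})}{\sum_{i=1}^{N} X_i (X_i - \bar{X})}
               = \frac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{N} (X_i - \bar{X})^2}
\end{align*}
```

The last equality holds because $\sum (Y_i - \bar{Y}) = 0$ and $\sum (X_i - \bar{X}) = 0$, so subtracting $\bar{X}$ from each $X_i$ outside the parentheses changes neither sum.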
An Example of Simple Linear Regression
The following is data from an experiment where X was the number of hours given for study, and Y is the score on a test.
Example (continued)
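We can tell that:
$\sum (X_i - \bar{X})^2 = \sum x_i^2 = 40$
$\sum (X_i - \bar{X})(Y_i - \bar{Y}) = \sum x_i y_i = 30$
$\bar{X} = 3.0$
$\bar{Y} = 7.3$
So:
$\hat{\beta}_1 = \frac{\sum x_i y_i}{\sum x_i^2} = \frac{30}{40} = 0.75$
$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X} = 7.3 - (0.75 \times 3.0) = 5.05$
Given these estimates, the linear regression line is given by:
$\hat{Y} = 5.05 + 0.75X$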
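The arithmetic in this example can be checked with a short Python sketch. It uses only the four summary statistics quoted above (40, 30, 3.0, and 7.3); the function and argument names are hypothetical.

```python
def slope_from_sums(sum_xy, sum_xx):
    """Slope estimate: sum of deviation cross-products over sum of squared X deviations."""
    return sum_xy / sum_xx

def intercept_from_means(y_bar, x_bar, slope):
    """Intercept estimate: mean of Y minus slope times mean of X."""
    return y_bar - slope * x_bar

# Summary statistics quoted in the example.
b1 = slope_from_sums(sum_xy=30.0, sum_xx=40.0)
b0 = intercept_from_means(y_bar=7.3, x_bar=3.0, slope=b1)

print(f"slope = {b1:.2f}, intercept = {b0:.2f}")
```

Rounded to two decimals, the results agree with the estimates of 0.75 and 5.05 worked out by hand.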
Example (continued)
[Figure: scatterplot of Test Score (vertical axis, 4.00 to 12.00) against Hours of Study (horizontal axis, 1.00 to 5.00), with the fitted line Test Score = 5.05 + 0.75 * X; RSquare = 0.17]
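Example SAS Code
* Library reference to the folder holding the example data;
libname ex1 'C:\Documents and Settings\Jonathan Templin\Desktop\Psych 790\Lectures\10_24\data';
* Scatterplot of y against x;
proc gplot data=ex1.sasex1; plot y*x; run;
ods html style=journal; ods graphics on;
proc print data=ex1.sasex1; run;
* Simple linear regression of y on x with parameter estimates;
proc glm data=ex1.sasex1; model y=x / solution; run;
ods graphics off; ods html close;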
Example (continued)
OK, so now you have the parameter estimates. What do they mean?
$\hat{Y}_i = 5.05 + 0.75X_i$
Meaning of $\hat{\beta}_0$
In general, it is the mean of Y when X = 0.
For this example, it is the mean test score when students do not study for the test.
So, students score a 5.05 on average when they did not study.
Example (continued)
$\hat{Y} = 5.05 + 0.75X$
Meaning of $\hat{\beta}_1$
In general, it is the increase in Y for each unit increase in X.
For this example, the mean test score increases by 0.75 for each additional hour of study.
So, adding an additional hour to your study time will result in an average score 0.75 points higher, two additional hours in a score 1.5 points higher, etc.
Point Estimation
How do we estimate or predict the value of Y given a certain value of X?
With any probability distribution, our best estimate is the mean.
How do we find the mean at a given point?
Well, $E(Y_i) = \beta_0 + \beta_1 X_i$ (use the regression equation and plug in your value of X).
Point Estimation
Back to our example, what is the expected value on the exam for a person who studies for 4 hours?
$E(Y) = 5.05 + 0.75 \times 4$
$E(Y) = 8.05$
For a person studying 4 hours, the expected score on the exam is $\hat{Y} = 8.05$.
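Plugging a value of X into the fitted line is easy to sketch in Python. The function name `predict` is a hypothetical choice; the default coefficients are the estimates 5.05 and 0.75 from the example.

```python
def predict(x, b0=5.05, b1=0.75):
    """Estimated mean test score for x hours of study, using the fitted line."""
    return b0 + b1 * x

# Estimated mean score for a person who studies 4 hours.
print(f"{predict(4):.2f}")

# Difference between 4 and 3 hours of study, which equals the slope.
print(f"{predict(4) - predict(3):.2f}")
```

The first value reproduces the hand calculation for 4 hours of study, and the second shows that one extra hour changes the estimated mean by the slope.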
Variance Estimation
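As an added note, we can also estimate the variance of $Y$, $\sigma^2$.
The long way is to compute it by:
$\hat{\sigma}^2 = \frac{\sum (Y_i - \hat{Y}_i)^2}{n}$
Dividing by $n$ gives the maximum likelihood estimate; dividing by $n - 2$ instead gives the unbiased estimate, the mean square error (MSE).
The shortcut way is to use the SAS output we have (see next slide).
You will notice on your output that you have an ANOVA table: SSE (Sum of Squares Error) divided by its degrees of freedom ($n - 2$), reported as the mean square error, is an estimate of your variance $\sigma^2$.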
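A minimal Python sketch of the long way: fit the line, sum the squared residuals to get SSE, then divide. It reports SSE divided by n and SSE divided by n - 2, the latter being the mean square error that an ANOVA table shows for simple regression. The data and function names are invented for illustration; the lecture's own data set is not reproduced here.

```python
def fit_line(xs, ys):
    """Least squares intercept and slope from raw data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

def error_variance(xs, ys):
    """Return (SSE, SSE / n, SSE / (n - 2)) for the fitted line."""
    b0, b1 = fit_line(xs, ys)
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    n = len(xs)
    return sse, sse / n, sse / (n - 2)

# Hypothetical data, invented for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
sse, var_n, mse = error_variance(xs, ys)
print(f"SSE = {sse:.2f}, SSE/n = {var_n:.2f}, MSE = SSE/(n-2) = {mse:.2f}")
```

The two divisors give noticeably different answers in a sample this small; the gap shrinks as n grows.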
Variance Estimation
Final Thought
Today we introduced regression, a topic we will cover for the rest of the semester.
We will come to see how we can use regression (as part of the general linear model) to accomplish most of our statistical tasks.
The simple linear regression model is easily extendable to more complicated regression models.
We will see the types of hypothesis tests we can use for regression next time.
Next Time
Kutner Chapter 2 (please read before class).
Inferences in Regression and Correlation.
Testing the regression parameters.
Intervals for Y