
Page 1: Regression



Page 2: Regression


The dictionary meaning of regression is “the act of returning or going back”;

The term was first used in 1877 by Francis Galton;

Regression is the statistical tool that lets us estimate (predict) the unknown values of one variable from the known values of another variable;

It helps to find the average probable change in one variable for a given amount of change in another;

Page 3: Regression


Page 4: Regression

Regression lines

For two variables X and Y, we will have two regression lines:

1. Regression line of Y on X gives values of Y for given values of X;

2. Regression line of X on Y gives values of X for given values of Y;

Page 5: Regression

Regression Equation

Regression equations are algebraic expressions of regression lines;

Y on X: regression equation expressed as

Y = a + bX

Y is the dependent variable

X is the independent variable

‘a’ & ‘b’ are constants/parameters of the line

‘a’ determines the level of the fitted line (i.e. distance of the line above or below the origin)

‘b’ determines the slope of the line (i.e. change in Y for a unit change in X)

Page 6: Regression

Regression equations are algebraic expressions of regression lines;

X on Y: regression equation expressed as

X = a + bY

X is the dependent variable

Y is the independent variable

‘a’ & ‘b’ are constants/parameters of the line

‘a’ determines the level of the fitted line (i.e. the value of X when Y is zero)

‘b’ determines the slope of the line (i.e. change in X for a unit change in Y)

Page 7: Regression

Method of Least Squares

The constants “a” & “b” can be calculated by the method of least squares;

For Y on X, the line should be drawn through the plotted points in such a manner that the sum of squares of the vertical deviations of actual Y values from estimated Y values is the least, i.e. ∑(Y − Ye)² should be minimum (for X on Y, the deviations of actual X values from estimated X values are minimised instead);

Such a line is known as the line of best fit. Algebra & calculus give the following normal equations:

For Y on X:

∑Y = Na + b∑X

∑XY = a∑X + b∑X²

For X on Y:

∑X = Na + b∑Y

∑XY = a∑Y + b∑Y²
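The normal equations can be solved directly for “a” & “b”. The following is a minimal sketch in Python, not a definitive implementation; the function name `fit_line` and the sample data are hypothetical and chosen only for illustration. Swapping the two arguments gives the X on Y line.

```python
def fit_line(xs, ys):
    """Solve the two normal equations for the line ys = a + b * xs.

    sum(y)  = N*a + b*sum(x)
    sum(xy) = a*sum(x) + b*sum(x^2)
    """
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("all x values are identical; slope is undefined")
    b = (n * sxy - sx * sy) / denom   # slope
    a = (sy - b * sx) / n             # intercept
    return a, b


# Hypothetical sample data, for illustration only.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

a_yx, b_yx = fit_line(X, Y)   # Y on X:  Y = a + bX
a_xy, b_xy = fit_line(Y, X)   # X on Y:  X = a + bY

print(f"Y on X: Y = {a_yx:.2f} + {b_yx:.2f} X")
print(f"X on Y: X = {a_xy:.2f} + {b_xy:.2f} Y")
```

For this sample the two fitted lines are different, which is why two regression lines are needed in general.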

Page 8: Regression

Multiple Regression

When we use more than one independent variable to estimate the dependent variable, in order to increase the accuracy of the estimate, the process is called multiple regression analysis.

It is based on the same assumptions & procedures that are encountered in simple regression.

The principal advantage of multiple regression is that it allows us to use more of the information available to us to estimate the dependent variable;

Page 9: Regression

Estimating equation describing relationship among three variables

Y = a + b1X1 + b2X2

where,

Y = estimated value corresponding to the dependent variable

a = Y intercept

b1 and b2 = slopes associated with X1 and X2, respectively

X1 and X2 = values of the two independent variables

Page 10: Regression

Normal Equations:

We use three equations (which statisticians call the “normal equations”) to determine the values of the constants a, b1 and b2:

∑Y = Na + b1∑X1 + b2∑X2

∑X1Y = a∑X1 + b1∑X1² + b2∑X1X2

∑X2Y = a∑X2 + b1∑X1X2 + b2∑X2²
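The three normal equations form a 3×3 linear system in a, b1 and b2. The sketch below builds that system from the sums and solves it by Gaussian elimination using only the Python standard library. The helper names `solve_linear` and `fit_two_predictors` are hypothetical, and the sample data are constructed to lie exactly on the plane Y = 1 + 2X1 + 3X2, so the fit should recover those constants up to rounding.

```python
def solve_linear(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Augmented matrix, so row operations apply to both sides at once.
    M = [[float(v) for v in A[i]] + [float(rhs[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_two_predictors(x1, x2, y):
    """Return (a, b1, b2) for Y = a + b1*X1 + b2*X2 via the normal equations."""
    n = len(y)
    s1, s2, sy = sum(x1), sum(x2), sum(y)
    s11 = sum(u * u for u in x1)
    s22 = sum(v * v for v in x2)
    s12 = sum(u * v for u, v in zip(x1, x2))
    s1y = sum(u * w for u, w in zip(x1, y))
    s2y = sum(v * w for v, w in zip(x2, y))
    A = [[n,  s1,  s2],
         [s1, s11, s12],
         [s2, s12, s22]]
    return tuple(solve_linear(A, [sy, s1y, s2y]))


# Hypothetical data generated from Y = 1 + 2*X1 + 3*X2 (no noise).
X1 = [1, 2, 3, 4, 5]
X2 = [2, 1, 4, 3, 6]
Y = [9, 8, 19, 18, 29]

a, b1, b2 = fit_two_predictors(X1, X2, Y)
print(f"Y = {a:.3f} + {b1:.3f} X1 + {b2:.3f} X2")
```

In practice a numerical library would normally be used for this step; the hand-written solver is shown only to keep the link to the normal equations visible.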

Page 11: Regression

Difference between regression & correlation

Correlation

Correlation coefficient (r) between x & y is a measure of the direction & degree of linear relationship between x & y;

It does not imply a cause & effect relationship between the variables;

It indicates the degree of association.

Regression

bxy & byx are mathematical measures expressing the average relationship between the two variables;

It is used to express a cause & effect relationship: one variable is treated as dependent and the other as independent;

It is used to forecast the value of the dependent variable when the value of the independent variable is known.
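The two measures are linked: the product of the two regression coefficients equals the square of the correlation coefficient, byx × bxy = r². The short Python sketch below checks this numerically. The function names and the sample data are hypothetical and used only for illustration.

```python
import math


def slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return (n * sxy - sx * sy) / (n * sxx - sx * sx)


def correlation(xs, ys):
    """Pearson correlation coefficient r between xs and ys."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / math.sqrt(
        (n * sxx - sx * sx) * (n * syy - sy * sy)
    )


# Hypothetical sample data, for illustration only.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

b_yx = slope(X, Y)   # regression coefficient of Y on X
b_xy = slope(Y, X)   # regression coefficient of X on Y
r = correlation(X, Y)

print(f"byx * bxy = {b_yx * b_xy:.4f}, r^2 = {r * r:.4f}")
```

The sign of r matches the common sign of byx and bxy, so r can be recovered from the two regression coefficients as the signed square root of their product.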
