REGRESSION

Page 1: Regression

REGRESSION

Page 2: Regression

Meaning

• In statistics, regression analysis is a statistical process for estimating the relationships among variables.

• There are several types of regression:
– Linear regression
– Simple linear regression
– Logistic regression
– Nonlinear regression
– Nonparametric regression
– Robust regression
– Stepwise regression

Page 3: Regression

Meaning: regression

Noun
1. The act of going back to a previous place or state; return or reversion.
2. Retrogradation; retrogression.
3. Biology. Reversion to an earlier or less advanced state or form or to a common or general type.
4. Psychoanalysis. The reversion to a chronologically earlier or less adapted pattern of behaviour and feeling.
5. A subsidence of a disease or its manifestations: a regression of symptoms.

Page 4: Regression

INTRODUCTION

• Many engineering and scientific problems are concerned with determining a relationship between a set of variables.

• In a chemical process, we might be interested in the relationship between
– the output of the process,
– the temperature at which it occurs, and
– the amount of catalyst employed.

• Knowledge of such a relationship would enable us to predict the output for various values of temperature and amount of catalyst.

Page 5: Regression

INTRODUCTION

• Situation:
– there is a single response variable Y, also called the dependent variable
– Y depends on the values of a set of input variables x1, . . . , xr, also called independent variables

• The simplest type of relationship between these variables is a linear relationship: for some constants β0, β1, . . . , βr,
Y = β0 + β1x1 + · · · + βrxr

Page 6: Regression

INTRODUCTION

• If this were the linear relationship between Y and the xi, i = 1, . . . , r, then it would be possible, once the βi were learned, to exactly predict the response for any set of input values.

• In practice, such precision is almost never attainable.

• The most that one can expect is that the linear equation would be valid subject to random error.

Page 7: Regression

INTRODUCTION

Page 8: Regression

Introduction

Page 9: Regression
Page 10: Regression

• Suppose that the responses Yi corresponding to the input values xi, i = 1, . . . , n are to be observed and used to estimate α and β in a simple linear regression model. 

• To determine estimators of α and β we reason as follows:
– If A is the estimator of α and B of β, then the estimator of the response corresponding to the input variable xi would be A + Bxi.

• Since the actual response is Yi,
– the squared difference is (Yi − A − Bxi)², and so if A and B are the estimators of α and β,
– then the sum of the squared differences between the estimated responses and the actual response values, call it SS, is given by
SS = Σ (Yi − A − Bxi)², summed over i = 1, . . . , n
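The least squares idea described above can be sketched in a few lines of Python. This is a minimal illustration, not taken from the slides: the function names and the data are made up, and the closed-form expressions for A and B are the standard ones obtained by minimizing SS.

```python
def least_squares(xs, ys):
    """Return (A, B) that minimize SS = sum((y - A - B*x)**2)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    B = s_xy / s_xx          # estimator of the slope beta
    A = y_bar - B * x_bar    # estimator of the intercept alpha
    return A, B


def sum_of_squares(xs, ys, A, B):
    """SS: squared differences between actual and estimated responses."""
    return sum((y - A - B * x) ** 2 for x, y in zip(xs, ys))


# Made-up data lying close to the line y = 1 + 2x (illustration only).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

A, B = least_squares(xs, ys)
print("A =", A, "B =", B, "SS =", sum_of_squares(xs, ys, A, B))
```

Any other choice of intercept and slope gives a larger SS on the same data, which is what makes A and B the least squares estimators.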

Page 11: Regression
Page 12: Regression
Page 13: Regression
Page 14: Regression
Page 15: Regression
Page 16: Regression

Joke on Regression

Page 17: Regression

DISTRIBUTION OF THE ESTIMATORS

• To specify the distribution of the estimators A and B, it is necessary to make additional assumptions about the random errors aside from just assuming that their mean is 0.

• The usual approach is to assume that the random errors are independent normal random variables having mean 0 and variance σ².
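A small simulation can make the normal-error assumption concrete. The sketch below assumes the simple linear model Yi = α + βxi + ei with independent normal errors, and compares the simulated slope estimates with the standard result that B is normal with mean β and variance σ²/Sxx (that result is not in the surviving slide text). All parameter values and names are hypothetical.

```python
import random
import statistics


def slope_estimate(xs, ys):
    """Least squares slope B = S_xY / S_xx."""
    x_bar = sum(xs) / len(xs)
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * y for x, y in zip(xs, ys))
    return s_xy / s_xx


def simulate_slopes(alpha, beta, sigma, xs, reps, seed=0):
    """Draw `reps` data sets with normal errors and collect the slope estimates."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        ys = [alpha + beta * x + rng.gauss(0.0, sigma) for x in xs]
        slopes.append(slope_estimate(xs, ys))
    return slopes


# Hypothetical inputs and parameters, chosen only for illustration.
xs = [float(i) for i in range(1, 11)]
alpha, beta, sigma = 2.0, 0.5, 1.0
slopes = simulate_slopes(alpha, beta, sigma, xs, reps=5000)

x_bar = sum(xs) / len(xs)
s_xx = sum((x - x_bar) ** 2 for x in xs)
print("mean of B:", statistics.mean(slopes), "target:", beta)
print("variance of B:", statistics.variance(slopes), "theory:", sigma ** 2 / s_xx)
```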

Page 18: Regression

DISTRIBUTION OF THE ESTIMATORS

Page 19: Regression

DISTRIBUTION OF THE ESTIMATORS

Page 20: Regression

DISTRIBUTION OF THE ESTIMATORS

Page 21: Regression

DISTRIBUTION OF THE ESTIMATORS

Page 22: Regression

DISTRIBUTION OF THE ESTIMATORS

Page 23: Regression

DISTRIBUTION OF THE ESTIMATORS

• Remarks

Page 24: Regression

DISTRIBUTION OF THE ESTIMATORS

Page 25: Regression

DISTRIBUTION OF THE ESTIMATORS

Page 26: Regression

DISTRIBUTION OF THE ESTIMATORS

• Notation:

Page 27: Regression

DISTRIBUTION OF THE ESTIMATORS

Page 28: Regression

DISTRIBUTION OF THE ESTIMATORS

Page 29: Regression

EXAMPLE

• The following data relate
– X: the moisture of a wet mix of a certain product
– Y: the density of the finished product

• Fit a linear curve to these data. Also determine SSR.
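The data table for this example did not survive in the transcript, so the sketch below uses made-up moisture and density values purely to show the mechanics. It assumes SSR denotes the sum of squared residuals, and computes it both directly and through the identity SSR = (Sxx·SYY − SxY²)/Sxx.

```python
def fit_and_ssr(xs, ys):
    """Return (A, B, SSR computed directly, SSR computed from the identity)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    B = s_xy / s_xx
    A = y_bar - B * x_bar
    ssr_direct = sum((y - A - B * x) ** 2 for x, y in zip(xs, ys))
    ssr_identity = (s_xx * s_yy - s_xy ** 2) / s_xx
    return A, B, ssr_direct, ssr_identity


# Hypothetical values, NOT the data from the slide.
moisture = [5.0, 6.0, 7.0, 10.0, 12.0, 15.0, 18.0, 20.0]
density = [7.4, 9.3, 10.6, 15.4, 18.1, 22.2, 24.1, 24.8]

A, B, ssr_direct, ssr_identity = fit_and_ssr(moisture, density)
print("fitted line: Y =", A, "+", B, "x")
print("SSR (direct):", ssr_direct, "SSR (identity):", ssr_identity)
```

The two SSR values agree up to rounding; the identity is convenient for hand calculation because it needs only Sxx, SYY and SxY.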

Page 30: Regression

EXAMPLE

Page 31: Regression

EXAMPLE

Page 32: Regression

STATISTICAL INFERENCES ABOUT THE REGRESSION PARAMETERS

• Inferences Concerning β

Page 33: Regression

Inferences Concerning β

Page 34: Regression

Inferences Concerning β

Page 35: Regression

Inferences Concerning β

Page 36: Regression

EXAMPLE

• An individual claims that the fuel consumption of his automobile does not depend on how fast the car is driven.

• To test the plausibility of this hypothesis, the car was tested at various speeds between 45 and 70 miles per hour. The miles per gallon attained at each of these speeds was determined, with the following data resulting.

• Do these data refute the claim that the mileage per gallon of gas is unaffected by the speed at which the car is being driven?
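The claim corresponds to the hypothesis H0: β = 0. A minimal sketch of the usual test follows, assuming normal errors: under H0 the statistic sqrt((n − 2)·Sxx/SSR)·B has a t distribution with n − 2 degrees of freedom. The mileage figures below are hypothetical, since the slide's data table is missing, and the critical value must be looked up in a t table by the reader.

```python
import math


def slope_t_statistic(xs, ys):
    """Test statistic for H0: beta = 0 in simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    B = s_xy / s_xx
    A = y_bar - B * x_bar
    ss_r = sum((y - A - B * x) ** 2 for x, y in zip(xs, ys))
    return math.sqrt((n - 2) * s_xx / ss_r) * B


# Hypothetical data: speeds in mph and miles per gallon (not from the slide).
speeds = [45.0, 50.0, 55.0, 60.0, 65.0, 70.0]
mpg = [24.0, 23.5, 22.6, 22.1, 21.0, 20.5]

t_stat = slope_t_statistic(speeds, mpg)

# Two-sided 5% critical value of the t distribution with n - 2 = 4
# degrees of freedom, taken from a standard t table.
critical_value = 2.776
reject_h0 = abs(t_stat) > critical_value
print("t statistic:", t_stat, "reject H0 at 5% level:", reject_h0)
```

With these made-up numbers the statistic is far beyond the critical value, so the claim of no dependence on speed would be rejected; the real data may of course behave differently.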

Page 37: Regression

EXAMPLE

Page 38: Regression

EXAMPLE

Page 39: Regression

Inferences Concerning α

Page 40: Regression
Page 41: Regression
Page 42: Regression
Page 43: Regression
Page 44: Regression
Page 45: Regression

Summary of Distributional Results

Page 46: Regression

Summary of Distributional Results

Page 47: Regression

THE COEFFICIENT OF DETERMINATION AND THE SAMPLE CORRELATION COEFFICIENT

Page 48: Regression

THE COEFFICIENT OF DETERMINATION AND THE SAMPLE CORRELATION COEFFICIENT

Page 49: Regression

THE COEFFICIENT OF DETERMINATION AND THE SAMPLE CORRELATION COEFFICIENT

Page 50: Regression

ANALYSIS OF RESIDUALS: ASSESSING THE MODEL

The figure shows a data set that, as indicated both by its scatter diagram and by the random nature of its standardized residuals, appears to fit the straight-line model quite well.

Page 51: Regression

The figure of the residual plot shows a discernible pattern, in that the residuals appear to be first decreasing and then increasing as the input level increases. This often means that higher-order (than just linear) terms are needed to describe the relationship between the input and response. Indeed, this is also indicated by the scatter diagram in this case.

Page 52: Regression

The standardized residual plot shows a pattern, in that the absolute values of the residuals, and thus their squares, appear to be increasing as the input level increases. This often indicates that the variance of the response is not constant but, rather, increases with the input level.
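The residual plots discussed above can be reproduced by computing standardized residuals, here taken to be (Yi − A − Bxi)/sqrt(SSR/(n − 2)). The sketch below uses hypothetical data and a hypothetical function name; plotting the returned values against the inputs is left to the reader.

```python
import math


def standardized_residuals(xs, ys):
    """Residuals of the least squares line divided by sqrt(SSR / (n - 2))."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    B = s_xy / s_xx
    A = y_bar - B * x_bar
    residuals = [y - A - B * x for x, y in zip(xs, ys)]
    ss_r = sum(r * r for r in residuals)
    scale = math.sqrt(ss_r / (n - 2))
    return [r / scale for r in residuals]


# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.3, 3.8, 6.1, 8.2, 9.7, 12.4, 13.8, 16.1]

std_res = standardized_residuals(xs, ys)
for x, r in zip(xs, std_res):
    print(x, round(r, 3))
```

If the straight-line model with constant variance holds, these values should scatter randomly around 0 with no trend in sign or magnitude as x increases.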

Page 53: Regression

TRANSFORMING TO LINEARITY

• The mean response is not always a linear function of the input.

• In such cases, if the form of the relationship can be determined, it is sometimes possible, by a change of variables, to transform it into a linear form.

• For instance, in certain applications it is known that W(t), the amplitude of a signal a time t after its origination, is approximately related to t by the functional form

Page 54: Regression

TRANSFORMING TO LINEARITY

Page 55: Regression

EXAMPLE

• The following table gives the percentages of a chemical that were used up when an experiment was run at various temperatures (in degrees Celsius).

• Use it to estimate the percentage of the chemical that would be used up if the experiment were to be run at 350 degrees.

Page 56: Regression

EXAMPLE

• Let P(x) be the percentage of the chemical that is used up when the experiment is run at 10x degrees.

• Even though a plot of P(x) looks roughly linear, we can improve upon the fit by considering a nonlinear relationship between x and P(x).

• Specifically, let us consider a relationship of the form: 1 − P(x) ≈ c(1 − d)^x
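Taking logarithms turns this relationship into a linear one: log(1 − P(x)) ≈ log c + x·log(1 − d), so a straight line can be fitted to the pairs (x, log(1 − P(x))). The sketch below assumes P(x) is expressed as a fraction between 0 and 1 rather than as a percentage, and uses synthetic values generated from hypothetical constants because the slide's table is missing.

```python
import math


def fit_transformed(xs, ps):
    """Fit 1 - P(x) ~ c * (1 - d)**x by least squares on log(1 - P(x))."""
    zs = [math.log(1.0 - p) for p in ps]   # transformed responses
    n = len(xs)
    x_bar = sum(xs) / n
    z_bar = sum(zs) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xz = sum((x - x_bar) * (z - z_bar) for x, z in zip(xs, zs))
    B = s_xz / s_xx            # estimates log(1 - d)
    A = z_bar - B * x_bar      # estimates log(c)
    return math.exp(A), 1.0 - math.exp(B)


def predict_fraction_used(c, d, x):
    """Predicted fraction used up at input level x (temperature / 10)."""
    return 1.0 - c * (1.0 - d) ** x


# Synthetic data from hypothetical constants (not the slide's table).
true_c, true_d = 0.9, 0.05
xs = [5.0, 10.0, 20.0, 30.0, 40.0]
ps = [1.0 - true_c * (1.0 - true_d) ** x for x in xs]

c, d = fit_transformed(xs, ps)
# A temperature of 350 degrees corresponds to x = 35.
print("c =", c, "d =", d, "P(35) =", predict_fraction_used(c, d, 35.0))
```

Because the synthetic data follow the assumed form exactly, the fit recovers the constants; with real measurements the recovered c and d would only be estimates.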

Page 57: Regression

EXAMPLE

Page 58: Regression

EXAMPLE

Page 59: Regression

POLYNOMIAL REGRESSION

Page 60: Regression

EXAMPLE

• Fit a polynomial to the following data.

Page 61: Regression

EXAMPLE

Page 62: Regression

EXAMPLE

Page 63: Regression

MULTIPLE LINEAR REGRESSION

Page 64: Regression

MULTIPLE LINEAR REGRESSION

Page 65: Regression

MULTIPLE LINEAR REGRESSION

Page 66: Regression

MULTIPLE LINEAR REGRESSION

Page 67: Regression

Thank you