BIDM_Unit 5

Page 1: BIDM_Unit 5

7/26/2019 BIDM_Unit 5

http://slidepdf.com/reader/full/bidmunit-5 1/3

  Regression

o  Regression analysis is a statistical process for estimating the relationships among

variables.

o It includes many techniques for modeling and analyzing several variables, when the

focus is on the relationship between a dependent variable and one or  

more independent variables (or 'predictors').

o The estimation target is a function of the independent variables called the regression

function.

o In regression analysis, it is also of interest to characterize the variation of the

dependent variable around the regression function, which can be described by a probability distribution.

o Regression analysis is widely used for prediction and forecasting, where its use has

substantial overlap with the field of machine learning.

o The performance of regression analysis methods in practice depends on the form of 

the data-generating process and how it relates to the regression approach being used.
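
The ideas above (a regression function plus variation around it) can be sketched for the simplest case, a straight line fitted by ordinary least squares. This is only an illustrative sketch: the function name `fit_line` and the data values are assumptions made for the example, and the residual standard deviation stands in for the spread of the dependent variable around the fitted function.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residuals measure the variation of y around the regression function.
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    # Residual standard deviation, using n - 2 degrees of freedom.
    sigma = math.sqrt(sum(r * r for r in residuals) / (n - 2))
    return slope, intercept, sigma

# Hypothetical data, made up for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]
slope, intercept, sigma = fit_line(xs, ys)
print(slope, intercept, sigma)
```

Here the fitted line plays the role of the regression function, and `sigma` summarizes how far the observations scatter around it.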

o Two types

1. Nonlinear regression

2. Linear regression

1. Nonlinear regression

o In statistics, nonlinear regression is a form of regression analysis in which

observational data are modeled by a function which is a nonlinear combination of the

model parameters and depends on one or more independent variables.

o The data are fitted by a method of successive approximations.

o Curve Fitting Procedure

1. Plot your variables to visualize the relationship.

a. What curve does the pattern resemble?

b. What might alternative options be?

2. Decide on the curves you want to compare and run a non-linear regression curve fitting.

a. You will have to estimate your parameters from your curve to have starting values for your curve fitting function.

3. Once you have parameters for your curves, compare models with AIC.

4. Plot the model with the lowest AIC on your point data to visualize fit.
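
Steps 2 and 3 of the procedure can be sketched in plain Python. This is a minimal sketch under stated assumptions, not a definitive implementation: the two candidate curves (exponential and power), the synthetic data, and all function names are hypothetical choices for the example. The fit uses Gauss-Newton with step halving as one possible method of successive approximations, and AIC is computed in the common least-squares form n*ln(SSE/n) + 2k, which is valid up to an additive constant.

```python
import math

def sse(model, xs, ys, p):
    """Sum of squared errors; treated as infinite if the model overflows at p."""
    try:
        return sum((y - model(x, *p)) ** 2 for x, y in zip(xs, ys))
    except OverflowError:
        return math.inf

def fit_two_param(model, grad, xs, ys, start, max_iter=100):
    """Gauss-Newton with step halving for a curve with two parameters."""
    p = tuple(start)
    best = sse(model, xs, ys, p)
    for _ in range(max_iter):
        r = [y - model(x, *p) for x, y in zip(xs, ys)]
        J = [grad(x, *p) for x in xs]
        # Normal equations (J^T J) delta = J^T r, solved directly for 2x2.
        s11 = sum(j[0] * j[0] for j in J)
        s12 = sum(j[0] * j[1] for j in J)
        s22 = sum(j[1] * j[1] for j in J)
        g1 = sum(j[0] * ri for j, ri in zip(J, r))
        g2 = sum(j[1] * ri for j, ri in zip(J, r))
        det = s11 * s22 - s12 * s12
        if abs(det) < 1e-15:
            break
        da = (s22 * g1 - s12 * g2) / det
        db = (s11 * g2 - s12 * g1) / det
        # Halve the step until the error decreases; stop if it never does.
        step, improved = 1.0, False
        while step > 1e-8:
            trial = (p[0] + step * da, p[1] + step * db)
            trial_sse = sse(model, xs, ys, trial)
            if trial_sse < best:
                p, best, improved = trial, trial_sse, True
                break
            step /= 2
        if not improved:
            break
    return p, best

def aic(sse_value, n, k):
    """Least-squares AIC up to an additive constant; k counts fitted quantities."""
    return n * math.log(sse_value / n) + 2 * k

# Candidate curves and their partial derivatives with respect to (a, b).
def exp_model(x, a, b): return a * math.exp(b * x)
def exp_grad(x, a, b): return (math.exp(b * x), a * x * math.exp(b * x))
def pow_model(x, a, b): return a * (x + 1.0) ** b
def pow_grad(x, a, b): return ((x + 1.0) ** b, a * (x + 1.0) ** b * math.log(x + 1.0))

# Synthetic data (an assumption): exponential trend plus a small alternating error.
xs = [0.5 * i for i in range(11)]
ys = [2.0 * math.exp(0.5 * x) + 0.05 * (-1) ** i for i, x in enumerate(xs)]

# Step 2a: rough starting values read off the first and last points.
ratio = math.log(ys[-1] / ys[0])
candidates = {
    "exponential": (exp_model, exp_grad, (ys[0], ratio / (xs[-1] - xs[0]))),
    "power": (pow_model, pow_grad, (ys[0], ratio / math.log(xs[-1] + 1.0))),
}

# Step 3: fit each curve and compare by AIC (two curve parameters + error variance).
results = {}
for name, (model, grad, start) in candidates.items():
    params, err = fit_two_param(model, grad, xs, ys, start)
    results[name] = (params, aic(err, len(xs), k=3))

best_name = min(results, key=lambda name: results[name][1])
print(best_name, results[best_name])
```

The curve named by `best_name` is the one that would be drawn over the point data in step 4. In practice a library curve-fitting routine would normally replace the hand-written solver.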

2. Linear regression

3. Logistic Regression

• Logistic regression measures the relationship between the categorical dependent

variable and one or more independent variables by estimating probabilities using

a logistic function, which is the cumulative logistic distribution.

• Thus, it treats the same set of problems as probit regression using similar techniques,

with the latter using a cumulative normal distribution curve instead.

• Equivalently, in the latent variable interpretations of these two methods, the first

assumes a standard logistic distribution of errors and the second a standard normal

distribution of errors.

• Logistic regression can be seen as a special case of the generalized linear model and thus

analogous to linear regression.

• The model of logistic regression, however, is based on quite different assumptions

(about the relationship between dependent and independent variables) from those of 

linear regression.

• In particular, the key differences between these two models can be seen in the following two

features of logistic regression. First, the conditional distribution is a Bernoulli

distribution rather than a Gaussian distribution, because the dependent variable is

binary. Second, the predicted values are probabilities and are therefore restricted to

(0,1) through the logistic distribution function because logistic regression predicts

the probability of particular outcomes.
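
The two features above can be sketched in plain Python: a binary (Bernoulli) outcome, and predictions squeezed into (0,1) by the logistic function. This is a minimal illustration under stated assumptions: the data are made up, the function names are hypothetical, and plain gradient ascent on the Bernoulli log-likelihood is used only because it is short, not because it is the standard fitting method. The cumulative normal curve used by probit regression is included for comparison.

```python
import math

def logistic_cdf(z):
    """Cumulative logistic distribution, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def normal_cdf(z):
    """Cumulative standard normal distribution, the curve used by probit regression."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def fit_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit P(y = 1 | x) = logistic_cdf(w0 + w1 * x) by gradient ascent."""
    w0 = w1 = 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = logistic_cdf(w0 + w1 * x)
            # Gradient of the Bernoulli log-likelihood for one observation.
            g0 += y - p
            g1 += (y - p) * x
        w0 += learning_rate * g0 / n
        w1 += learning_rate * g1 / n
    return w0, w1

# Hypothetical binary outcomes with some overlap between the two classes.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

w0, w1 = fit_logistic(xs, ys)
probabilities = [logistic_cdf(w0 + w1 * x) for x in xs]
print(w0, w1)
print(probabilities)
```

Every entry of `probabilities` lies strictly between 0 and 1, whereas a straight-line fit to the same 0/1 outcomes could predict values outside that range.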