Regression CS294 Practical Machine Learning Romain Thibaux 09/18/06


Page 1: Regression

Regression

CS294 Practical Machine Learning

Romain Thibaux, 09/18/06

Page 2: Regression

Outline

• Ordinary Least Squares regression

– Derivation from minimizing the sum of squares

– Probabilistic interpretation

– Online version (LMS)

• Overfitting and Regularization

• Numerical stability

• L1 Regression

• Kernel Regression, Spline Regression

• Multivariate Adaptive Regression Splines (MARS)

Page 3: Regression

Classification (reminder)

X → Y

X: anything:

• continuous (ℝ, ℝ^d, …)

• discrete ({0,1}, {1,…k}, …)

• structured (tree, string, …)

• …

Y: discrete:

– {0,1} binary

– {1,…k} multi-class

– tree, etc. structured

Page 4: Regression

Classification (reminder)

X: anything:

• continuous (ℝ, ℝ^d, …)

• discrete ({0,1}, {1,…k}, …)

• structured (tree, string, …)

• …

Page 5: Regression

Classification (reminder)

X: anything:

• continuous (ℝ, ℝ^d, …)

• discrete ({0,1}, {1,…k}, …)

• structured (tree, string, …)

• …

Perceptron

Logistic Regression

Support Vector Machine

Decision Tree

Random Forest

Kernel trick

Page 6: Regression

Regression

X → Y

X: anything:

• continuous (ℝ, ℝ^d, …)

• discrete ({0,1}, {1,…k}, …)

• structured (tree, string, …)

• …

Y: continuous:

– ℝ, ℝ^d

Page 7: Regression

Examples

• Voltage → Temperature

• Processes, memory → Power consumption

• Protein structure → Energy [next week]

• Robot arm controls → Torque at effector

• Location, industry, past losses → Premium

Page 8: Regression

Linear regression

[Figure: two plots of the example data; one vertical axis is labeled Temperature]

[start Matlab demo lecture2.m]

Given examples $(x_i, y_i)$, $i = 1, \dots, n$

Predict $y_{n+1}$ given a new point $x_{n+1}$

Page 9: Regression

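Linear regression

[Figure: linear fits to the example data, a line for one input and a plane for two inputs; vertical axis labeled Temperature]

Prediction (one input): $\hat{y} = w_0 + w_1 x$

Prediction (two inputs): $\hat{y} = w_0 + w_1 x_1 + w_2 x_2$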

Page 10: Regression

Ordinary Least Squares (OLS)

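[Figure: data points with a fitted line, marking an observation, its prediction, and the error between them]

Observation: $y_i$

Prediction: $\hat{y}_i = w^\top x_i$ (with a constant 1 included in $x_i$)

Error or “residual”: $y_i - \hat{y}_i$

Sum squared error: $\sum_{i=1}^{n} (y_i - w^\top x_i)^2$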

Page 11: Regression

Minimize the sum squared error

Sum squared error

Linear equation

Linear system
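The three labels above follow the usual least-squares derivation. The slide's own formulas are not preserved, so the following is a sketch in assumed notation: $X$ is the $n \times d$ matrix of inputs with one example per row, $y$ is the vector of targets, and $w$ is the weight vector.

```latex
\begin{align*}
E(w) &= \sum_{i=1}^{n} \left(y_i - w^\top x_i\right)^2 = \|y - Xw\|^2
  && \text{(sum squared error)} \\
\frac{\partial E}{\partial w_j} &= -2 \sum_{i=1}^{n} x_{ij}\left(y_i - w^\top x_i\right) = 0
  && \text{(one linear equation per } j\text{)} \\
X^\top X\, w &= X^\top y
  && \text{(linear system of } d \text{ equations)}
\end{align*}
```

Setting every partial derivative to zero gives $d$ equations that are linear in $w$; stacking them yields the system on the last line.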

Page 12: Regression

Alternative derivation

$X$ is an $n \times d$ matrix ($n$ examples, $d$ features)

Solve the system (it’s better not to invert the matrix)
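The advice to solve the system rather than invert the matrix can be illustrated with NumPy. This is a minimal sketch on synthetic data; the sizes, the variable names, and the choice of `np.linalg.solve` and `np.linalg.lstsq` are assumptions for illustration, not taken from the lecture's Matlab demo.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 3                      # n examples, d features (illustrative sizes)
X = rng.normal(size=(n, d))       # synthetic design matrix, one example per row
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=n)

# Solve the d x d system (X^T X) w = X^T y without forming an explicit inverse.
w_solve = np.linalg.solve(X.T @ X, X.T @ y)

# Alternative: a least-squares solver that works on X directly.
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(w_solve, w_lstsq))
```

Both calls avoid computing an inverse matrix; `lstsq` also avoids forming $X^\top X$, which tends to help when the columns of $X$ are nearly dependent.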

Page 13: Regression

LMS Algorithm (Least Mean Squares)

where

Online algorithm
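The update rule itself did not survive extraction. The sketch below assumes the standard LMS rule, $w \leftarrow w + \alpha\,(y_i - w^\top x_i)\,x_i$, applied to one example at a time; the step size `alpha`, the number of passes, and the synthetic data are all illustrative choices.

```python
import numpy as np

def lms(X, y, alpha=0.01, epochs=50, seed=0):
    """Least Mean Squares: one small gradient step per example."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(y)):      # visit examples in random order
            error = y[i] - w @ X[i]            # residual on this single example
            w += alpha * error * X[i]          # step against the squared-error gradient
    return w

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(200), rng.uniform(-1, 1, size=200)])  # constant + one input
y = X @ np.array([1.0, 3.0])                   # noise-free targets, for a clean check
w = lms(X, y)
print(w)
```

Each step touches a single example, so the weights can be updated as data arrives, without storing or re-solving the full system.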

Page 14: Regression

Beyond lines and planes

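everything is the same with $x$ replaced by features $\phi(x)$, e.g. $\phi(x) = (1, x, x^2)$

still linear in $w$

[Figure: fit to the one-dimensional example data]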
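Because the model stays linear in the weights, swapping raw inputs for nonlinear features leaves the solver unchanged. A minimal sketch, assuming polynomial features; the helper name `poly_features` and the synthetic quadratic data are hypothetical.

```python
import numpy as np

def poly_features(x, degree):
    """Map each scalar x to (1, x, x^2, ..., x^degree)."""
    return np.vander(x, degree + 1, increasing=True)

x = np.linspace(0.0, 2.0, 30)
y = 1.0 - 2.0 * x + 0.5 * x**2                # synthetic quadratic targets, no noise

Phi = poly_features(x, degree=2)              # n x 3 matrix of features
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)   # same least-squares solve as for a line
print(w)
```

The fitted curve is nonlinear in `x`, yet the fit is still an ordinary least-squares problem in `w`.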

Page 15: Regression

Geometric interpretation

[Matlab demo]

[Figure: three-dimensional plot from the Matlab demo]
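The figure for this slide is lost. One common geometric reading, assumed here, is that the fitted values are the orthogonal projection of $y$ onto the span of the columns of $X$, so the residual is perpendicular to every column. A small numerical check on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))          # two covariates, 20 examples (synthetic)
y = rng.normal(size=20)               # arbitrary targets

w, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ w                         # projection of y onto the column space of X
residual = y - y_hat

# The residual is orthogonal to each column of X, up to rounding error.
print(X.T @ residual)
```

The orthogonality condition $X^\top (y - Xw) = 0$ is the same linear system that comes out of minimizing the sum squared error.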

Page 16: Regression

Ordinary Least Squares [summary]

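Given examples $(x_i, y_i)$, $i = 1, \dots, n$

Let $X$ be the $n \times d$ matrix whose $i$-th row is $\phi(x_i)^\top$

For example $\phi(x) = (1, x, x^2)$

Let $y = (y_1, \dots, y_n)^\top$

Minimize $\|Xw - y\|^2$ by solving $X^\top X\, w = X^\top y$

Predict $\hat{y} = w^\top \phi(x)$ for a new point $x$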

Page 17: Regression

Probabilistic interpretation


Likelihood
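The likelihood formula is not preserved. A sketch of the standard argument, assuming independent Gaussian noise with a fixed variance $\sigma^2$ (the symbols $\varepsilon_i$ and $\sigma$ are assumptions):

```latex
\begin{align*}
y_i &= w^\top x_i + \varepsilon_i, \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2) \\
p(y \mid X, w) &= \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}}
  \exp\!\left(-\frac{(y_i - w^\top x_i)^2}{2\sigma^2}\right) \\
-\log p(y \mid X, w) &= \frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - w^\top x_i)^2
  + \frac{n}{2}\log(2\pi\sigma^2)
\end{align*}
```

Only the first term depends on $w$, so maximizing the likelihood over $w$ is the same as minimizing the sum squared error.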

Page 18: Regression

Assumptions vs. Reality

[Figure: Intel sensor network data; axis labels Voltage and Temperature]

Page 19: Regression

Overfitting

[Matlab demo]

[Figure: Degree 15 polynomial]

Page 20: Regression

Ridge Regression (Regularization)

[Figure: Effect of regularization (degree 19)]

Minimize $\|Xw - y\|^2 + \lambda \|w\|^2$ with “small” $\lambda$

by solving $(X^\top X + \lambda I)\, w = X^\top y$
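A minimal ridge sketch, assuming the usual penalized system $(X^\top X + \lambda I)\,w = X^\top y$. The function name `ridge_fit`, the synthetic data, and the two values of `lam` are illustrative; for simplicity this version penalizes every weight, including the constant term.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Solve (X^T X + lam * I) w = X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 15)
y = np.sin(3.0 * x) + 0.1 * rng.normal(size=x.size)   # synthetic noisy data
X = np.vander(x, 10, increasing=True)                  # degree 9 polynomial features

w_small = ridge_fit(X, y, lam=1e-6)
w_large = ridge_fit(X, y, lam=1.0)
print(np.linalg.norm(w_small), np.linalg.norm(w_large))
```

A larger `lam` shrinks the weight vector, which damps the wild oscillations of a high-degree polynomial at the cost of some bias.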

Page 21: Regression

Probabilistic interpretation

Likelihood

Prior

Posterior
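The three formulas are missing from the extracted slide. A sketch of the usual Bayesian reading of ridge regression, assuming Gaussian noise with variance $\sigma^2$ and a zero-mean Gaussian prior on $w$ with variance $\tau^2$ (both symbols are assumptions):

```latex
\begin{align*}
\text{Likelihood:}\quad & p(y \mid X, w) \propto
  \exp\!\left(-\frac{1}{2\sigma^2}\,\|y - Xw\|^2\right) \\
\text{Prior:}\quad & p(w) \propto
  \exp\!\left(-\frac{1}{2\tau^2}\,\|w\|^2\right) \\
\text{Posterior:}\quad & p(w \mid X, y) \propto
  \exp\!\left(-\frac{1}{2\sigma^2}\left(\|y - Xw\|^2 + \lambda\,\|w\|^2\right)\right),
  \qquad \lambda = \frac{\sigma^2}{\tau^2}
\end{align*}
```

Under these assumptions the most probable $w$ under the posterior is the ridge solution, with a tighter prior (smaller $\tau$) corresponding to stronger regularization.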

Page 22: Regression

Numerical Accuracy

Condition number

vs

We want covariates as perpendicular as possible, and roughly the same scale

• Regularization

• Preconditioning
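The effect of both remedies on the condition number can be checked numerically. This is a sketch on synthetic, deliberately ill-conditioned data; rescaling columns to unit norm stands in for preconditioning here, which is one simple choice among many.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=100)
b = a + 1e-3 * rng.normal(size=100)        # nearly parallel to a
X_bad = np.column_stack([a, 1000.0 * b])   # nearly collinear and badly scaled columns

cond_bad = np.linalg.cond(X_bad.T @ X_bad)

# Preconditioning sketch: rescale each column to unit norm.
X_scaled = X_bad / np.linalg.norm(X_bad, axis=0)
cond_scaled = np.linalg.cond(X_scaled.T @ X_scaled)

# Regularization: adding lam * I moves the smallest eigenvalue away from zero.
lam = 1e-3
cond_reg = np.linalg.cond(X_scaled.T @ X_scaled + lam * np.eye(2))

print(cond_bad, cond_scaled, cond_reg)
```

Rescaling fixes the mismatch in scale but not the near-collinearity; the regularization term handles what remains.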

Page 23: Regression

Errors in Variables (Total Least Squares)

Page 24: Regression

Sensitivity to outliers

High weight given to outliers

[Figure: Temperature at noon]

Influence function
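With squared error, each point pulls on the fit in proportion to its residual, so a single bad observation can move the fit a long way. A small sketch on synthetic data (the line, the corrupted point, and the size of the corruption are all illustrative):

```python
import numpy as np

x = np.arange(10, dtype=float)
y = 2.0 * x + 1.0                       # points exactly on the line y = 2x + 1
X = np.column_stack([np.ones_like(x), x])

w_clean, *_ = np.linalg.lstsq(X, y, rcond=None)

y_outlier = y.copy()
y_outlier[-1] += 50.0                   # corrupt a single observation
w_outlier, *_ = np.linalg.lstsq(X, y_outlier, rcond=None)

print(w_clean, w_outlier)               # compare intercept and slope
```

One corrupted point out of ten is enough to change the slope substantially, which motivates losses whose influence is bounded.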

Page 25: Regression

L1 Regression

Linear program

Influence function
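The slide's formulas are missing. A sketch of the standard way to write L1 regression, $\min_w \sum_i |y_i - w^\top x_i|$, as a linear program, using auxiliary variables $t_i$ (an assumed name) that bound each absolute residual:

```latex
\begin{align*}
\min_{w,\,t} \quad & \sum_{i=1}^{n} t_i \\
\text{subject to} \quad & -t_i \le y_i - w^\top x_i \le t_i, \qquad i = 1, \dots, n
\end{align*}
```

At the optimum $t_i = |y_i - w^\top x_i|$. The derivative of the absolute loss with respect to the residual is its sign, so each point's influence is bounded, unlike with squared error.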

Page 26: Regression

Kernel Regression

[Figure: Kernel regression (sigma=1)]
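The slide gives only the plot title. The sketch below assumes it refers to the Nadaraya-Watson estimator with a Gaussian kernel of width `sigma`, where each prediction is a weighted average of the training targets; the function name and the synthetic data are hypothetical.

```python
import numpy as np

def kernel_regression(x_train, y_train, x_query, sigma=1.0):
    """Nadaraya-Watson estimate: Gaussian-weighted average of the training targets."""
    diffs = x_query[:, None] - x_train[None, :]        # pairwise differences
    weights = np.exp(-0.5 * (diffs / sigma) ** 2)      # Gaussian kernel weights
    return (weights @ y_train) / weights.sum(axis=1)

x_train = np.linspace(0.0, 20.0, 41)
y_train = np.sin(x_train / 3.0)                        # synthetic smooth targets
x_query = np.array([5.0, 10.0, 15.0])
pred = kernel_regression(x_train, y_train, x_query, sigma=1.0)
print(pred)
```

A larger `sigma` averages over more neighbors and gives a smoother curve; a smaller one follows the data more closely.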

Page 27: Regression

Spline Regression

Regression on each interval

[Figure: regression fitted separately on each interval]

Page 28: Regression

Spline Regression

With equality constraints

[Figure: spline regression with equality constraints]
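The slide's construction is not preserved. One way to get a fit that is linear on each interval and continuous at the boundaries is a hinge basis, which builds the equality constraints into the features rather than imposing them explicitly. This swaps in a basis formulation for whatever constrained solve the lecture used; the knot positions and data below are hypothetical.

```python
import numpy as np

def linear_spline_features(x, knots):
    """Columns: 1, x, and one hinge max(0, x - k) per knot k."""
    hinges = [np.maximum(0.0, x - k) for k in knots]
    return np.column_stack([np.ones_like(x), x] + hinges)

knots = [5400.0, 5600.0]                         # hypothetical interval boundaries
x = np.linspace(5200.0, 5800.0, 61)
# Synthetic piecewise-linear targets whose slope changes at the knots.
y = 50.0 + 0.05 * np.maximum(0.0, x - 5400.0) - 0.08 * np.maximum(0.0, x - 5600.0)

Phi = linear_spline_features(x, knots)
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
y_hat = Phi @ w                                  # continuous piecewise-linear fit
print(np.max(np.abs(y - y_hat)))
```

Each hinge is zero to the left of its knot and linear to the right, so the fitted function can change slope at a knot but cannot jump there.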

Page 29: Regression

Spline Regression

With L1 cost

[Figure: spline regression with L1 cost]

Page 30: Regression

Heteroscedasticity

[Figure: #requests per minute vs. time (days)]

Page 31: Regression

MARS (Multivariate Adaptive Regression Splines)

…on the board…

Page 32: Regression

Further topics

• Generalized Linear Models

• Gaussian process regression

• Local Linear regression

• Feature Selection [next class]