49
Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization and Data Science, Novi Saad, March 2017

Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Introduction to Machine Learning and Stochastic Optimization

Robert M. Gower

Spring School on Optimization and Data Science, Novi Saad, March 2017

Page 2: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

An Introduction to Supervised Learning

Page 3: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Some References

Understanding Machine Learning: From Theory to Algorithms

Stanford Machine Learning on Coursera by Andrew Ng

Graduate level Undergraduate level

Page 4: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Is There a Cat in the Photo?

Yes

No

Page 5: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Is There a Cat in the Photo?

Yes

Page 6: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Is There a Cat in the Photo?

Yes

Page 7: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Is There a Cat in the Photo?

No

Page 8: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Is There a Cat in the Photo?

Yes

Page 9: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Find mapping h that assigns the “correct” target to each input

Is There a Cat in the Photo?

Yes

No

x: Input/Feature y: Output/Target

Page 10: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Labeled Data

Page 11: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Labeled Data

y= -1 means no/false

Page 12: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Labeled Data

Learning Algorithm

y= -1 means no/false

Page 13: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Labeled Data

Learning Algorithm

y= -1 means no/false

Page 14: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Labeled Data

Learning Algorithm

-1

y= -1 means no/false

Page 15: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Example: Linear Regression for Height

Sex Male

Age 30

Height 1,72 cm

Sex Female

Age 70

Height 1,52 cm

Labeled data

Page 16: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Example Hypothesis: Linear Model

Example: Linear Regression for Height

Sex Male

Age 30

Height 1,72 cm

Sex Female

Age 70

Height 1,52 cm

Labeled data

Page 17: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Example Training Problem:

Example Hypothesis: Linear Model

Example: Linear Regression for Height

Sex Male

Age 30

Height 1,72 cm

Sex Female

Age 70

Height 1,52 cm

Labeled data

Page 18: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Linear Regression for Height

Age

Height

Page 19: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Linear Regression for Height

The Training Algorithm

Age

Height

Page 20: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Linear Regression for Height

The Training Algorithm

Age

Height

Other options aside from linear?

Page 21: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Parametrizing the HypothesisHeight

Age

Linear:

Polinomial:

Age

Height

Neural Net:

Page 22: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Loss Functions

Why a SquaredLoss?

Page 23: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Loss Functions

Why a SquaredLoss?

Loss Functions

The Training Problem

Page 24: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Loss Functions

Why a SquaredLoss?

Loss Functions

The Training Problem

Typically a convex function

Page 25: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Choosing the Loss Function

Quadratic Loss

Binary Loss

Hinge Loss

Page 26: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Choosing the Loss Function

Quadratic Loss

Binary Loss

Hinge Loss

y=1 in all figures

Page 27: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Choosing the Loss Function

Quadratic Loss

Binary Loss

Hinge Loss

EXE: Plot the binary and hinge loss function in when

y=1 in all figures

Page 28: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Loss Functions

Is a notion of Loss enough?

What happens when we do not have enough data?

Page 29: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Loss FunctionsThe Training Problem

Is a notion of Loss enough?

What happens when we do not have enough data?

Page 30: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Overfitting and Model Complexity

Fitting 1st order polynomial

Page 31: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Overfitting and Model Complexity

Fitting 1st order polynomial

Page 32: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Overfitting and Model Complexity

Fitting 3rd order polynomial

Page 33: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Overfitting and Model Complexity

Fitting 9th order polynomial

Page 34: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Regularizor Functions

General Training Problem

Regularization

Page 35: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Regularizor Functions

General Training Problem

Regularization

Goodness of fit, fidelity term ...etc

Page 36: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Regularizor Functions

General Training Problem

Regularization

Goodness of fit, fidelity term ...etc

Penlizes complexity

Page 37: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Regularizor Functions

General Training Problem

Regularization

Goodness of fit, fidelity term ...etc

Penlizes complexity

Controls tradeoff between fit and complexity

Page 38: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Regularizor Functions

General Training Problem

Regularization

Exe:

Goodness of fit, fidelity term ...etc

Penlizes complexity

Controls tradeoff between fit and complexity

Page 39: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Overfitting and Model Complexity

Fitting kth order polynomial

Page 40: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Overfitting and Model Complexity

Fitting kth order polynomial

For big enough, λthe solution is a 2nd order polynomial

Page 41: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Linear hypothesis

Exe: Ridge Regression

Ridge Regression

L2 loss

L2 regularizor

Page 42: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Linear hypothesis

Exe: Support Vector Machines

SVM with soft margin

Hinge loss

L2 regularizor

Page 43: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

Linear hypothesis

Exe: Logistic Regression

Logistic Regression

Logistic loss

L2 regularizor

Page 44: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

The Machine Learners Job

Page 45: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

The Machine Learners Job

Page 46: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

The Machine Learners Job

Page 47: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

The Machine Learners Job

Page 48: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

The Machine Learners Job

Page 49: Introduction to Machine Learning and Stochastic Optimization · 2017-03-12 · Introduction to Machine Learning and Stochastic Optimization Robert M. Gower Spring School on Optimization

The Machine Learners Job