Introduction to Machine Learning and Stochastic Optimization
Robert M. Gower
Spring School on Optimization and Data Science, Novi Sad, March 2017
An Introduction to Supervised Learning
Some References
Understanding Machine Learning: From Theory to Algorithms (graduate level)
Stanford Machine Learning on Coursera by Andrew Ng (undergraduate level)
Is There a Cat in the Photo?
Yes / No
Find a mapping h that assigns the “correct” target to each input
x: Input/Feature y: Output/Target
Labeled Data
Learning Algorithm
y = -1 means no/false
Example: Linear Regression for Height
Labeled data:
Sex: Male, Age: 30, Height: 1.72 m
Sex: Female, Age: 70, Height: 1.52 m
Example Hypothesis: Linear Model
Example Training Problem:
Linear Regression for Height
The Training Algorithm
[Figure: height vs. age scatter plot with fitted regression line]
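For a linear hypothesis, the training algorithm can be sketched as ordinary least squares. A minimal sketch, where the age/height values are illustrative stand-ins for the slide's two examples:

```python
import numpy as np

# Hypothetical toy data: age (years) -> height (m); values are illustrative.
ages = np.array([30.0, 70.0, 25.0, 50.0])
heights = np.array([1.72, 1.52, 1.70, 1.60])

# Linear hypothesis h(x) = w1*x + w0, fit by least squares.
X = np.column_stack([ages, np.ones_like(ages)])  # design matrix with bias column
w, *_ = np.linalg.lstsq(X, heights, rcond=None)

def h(age):
    # Predict height from age with the fitted weights.
    return w[0] * age + w[1]

print(h(40.0))  # predicted height at age 40
```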
Other options aside from linear?
Parametrizing the Hypothesis
Linear:
Polynomial:
Neural Net:
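The three parametrizations can be sketched as follows; the exact forms on the slides are not reproduced here, so these are standard textbook versions (one-hidden-layer tanh network as an illustrative choice of neural net):

```python
import numpy as np

def h_linear(x, w):
    # Linear hypothesis: h(x) = w0 + w1*x
    return w[0] + w[1] * x

def h_poly(x, w):
    # Polynomial hypothesis: h(x) = w0 + w1*x + ... + wk*x^k
    return sum(wi * x**i for i, wi in enumerate(w))

def h_net(x, W1, b1, w2, b2):
    # One-hidden-layer neural net with tanh activation (illustrative choice).
    return w2 @ np.tanh(W1 * x + b1) + b2
```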
Loss Functions
Why a Squared Loss?
The Training Problem
Typically a convex function
Choosing the Loss Function
Quadratic Loss
Binary Loss
Hinge Loss
(y = 1 in all figures)
EXE: Plot the binary and hinge loss functions when y = 1.
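The three losses can be sketched as functions of the prediction t = h(x) and target y (with y in {-1, +1} for the classification losses), in their standard textbook forms:

```python
import numpy as np

def quadratic_loss(t, y):
    # Squared error between prediction and target.
    return 0.5 * (t - y) ** 2

def binary_loss(t, y):
    # 0-1 loss: 1 if the sign of the prediction disagrees with y.
    return float(np.sign(t) != y)

def hinge_loss(t, y):
    # max(0, 1 - y*t): a convex surrogate for the 0-1 loss.
    return max(0.0, 1.0 - y * t)
```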
The Training Problem
Is a notion of loss enough? What happens when we do not have enough data?
Overfitting and Model Complexity
Fitting 1st order polynomial
Fitting 3rd order polynomial
Fitting 9th order polynomial
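The effect can be sketched numerically: training error always shrinks as the polynomial degree grows, but the high-degree fit chases the noise. The sine-plus-noise data below is a standard illustrative choice, not taken from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 10)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(10)  # noisy samples

# Fit polynomials of degree 1, 3, 9 and record the training error.
errs = {}
for k in (1, 3, 9):
    coeffs = np.polyfit(x, y, k)
    errs[k] = np.mean((np.polyval(coeffs, x) - y) ** 2)
    print(k, errs[k])
```

The degree-9 fit nearly interpolates the 10 points (tiny training error) yet oscillates wildly between them, which is exactly the overfitting the slides depict.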
Regularizer Functions
General Training Problem
Regularization
Goodness of fit, fidelity term, etc.
Penalizes complexity
λ controls the tradeoff between fit and complexity
Exe:
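The general training problem can be sketched as: minimize the average loss (goodness of fit) plus λ times a regularizer (complexity penalty). Function names here are illustrative:

```python
import numpy as np

def objective(w, X, y, loss, reg, lam):
    # Average loss over the data: the goodness-of-fit / fidelity term.
    fit = np.mean([loss(X[i] @ w, y[i]) for i in range(len(y))])
    # lam controls the tradeoff between fit and complexity.
    return fit + lam * reg(w)

sq = lambda t, y: 0.5 * (t - y) ** 2   # squared loss
l2 = lambda w: 0.5 * np.dot(w, w)      # L2 regularizer penalizing complexity
```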
Overfitting and Model Complexity
Fitting kth order polynomial
For λ big enough, the solution is a 2nd order polynomial
Exe: Ridge Regression
Ridge Regression: L2 loss + L2 regularizer, linear hypothesis
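With L2 loss, L2 regularizer, and a linear hypothesis, the training problem has a closed-form solution. A sketch (the λ scaling convention varies across texts; the one below scales the regularizer by n):

```python
import numpy as np

def ridge(X, y, lam):
    # Solve (X^T X + lam*n*I) w = X^T y for the ridge regression weights.
    n, d = X.shape
    return np.linalg.solve(X.T @ X + lam * n * np.eye(d), X.T @ y)
```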
Exe: Support Vector Machines
SVM with soft margin: hinge loss + L2 regularizer, linear hypothesis
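The soft-margin SVM objective (average hinge loss plus λ/2·‖w‖²) is convex but nonsmooth, so a simple solver is subgradient descent. A sketch, not an optimized SVM implementation; the step size and iteration count are illustrative:

```python
import numpy as np

def svm_subgradient(X, y, lam=0.1, steps=200, lr=0.1):
    # Minimize (1/n) sum_i max(0, 1 - y_i w.x_i) + (lam/2)*||w||^2.
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        margins = y * (X @ w)
        active = margins < 1          # points violating the margin
        # Subgradient: hinge term contributes -y_i x_i for active points.
        g = -(y[active, None] * X[active]).sum(axis=0) / n + lam * w
        w -= lr * g
    return w
```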
Exe: Logistic Regression
Logistic Regression: logistic loss + L2 regularizer, linear hypothesis
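The logistic loss for labels y in {-1, +1} is log(1 + exp(-y·w·x)); it is smooth, so the regularized objective can be minimized by gradient descent. A sketch of the loss and its gradient:

```python
import numpy as np

def logistic_loss(w, X, y, lam):
    # Average logistic loss plus L2 regularization.
    z = y * (X @ w)
    return np.mean(np.log1p(np.exp(-z))) + 0.5 * lam * np.dot(w, w)

def logistic_grad(w, X, y, lam):
    # d/dz log(1 + exp(-z)) = -1 / (1 + exp(z)); chain rule with z = y*w.x.
    z = y * (X @ w)
    s = -1.0 / (1.0 + np.exp(z))
    return (X.T @ (s * y)) / len(y) + lam * w
```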
The Machine Learner's Job