© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Michael Brückner
Manager Machine Learning
25/02/2016
Machine Learning 101
Agenda
• What is Machine Learning and why do we need it?
• Model Building
• Model Evaluation & Tuning
What is Machine Learning?
Methods and Systems that …
• Adapt based on recorded data
• Predict new data based on recorded data
• Optimize an action given a utility function
• Extract hidden structure from the data
• Summarize data into concise descriptions
What is Machine Learning NOT?
Methods and Systems that …
• can yield Garbage-In Knowledge-Out
• perform well without data modeling & feature engineering
• avoid the curse of dimensionality
• are a replacement for business rules
Infer-Predict-Decide Cycle
Inference
Build & evaluate Predictor
Prediction
Apply the learned Predictor
Decision Making
Adjust Business loss and get new/more data
What for?
Automate tasks that typically require humans, in order to
• scale
• improve over humans (non-experts)
• preserve privacy
or solve tasks that are impossible for humans
Examples: Personalized Recommendation
• Input:
Examples: Personalized Recommendation
• Output:
Examples: Face Detection & Recognition
Face detection
• Input: image
• Output: face position
Face recognition
• Input: face (image & face position)
• Output: person’s name
Examples: Full-Text Translation
• Input: text in one language
• Output: text in another language
Examples: Spam Filtering
• Input: email (text, images, …)
• Output: spam/non-spam flag
• Challenges:
• extremely high precision for legitimate emails
• spam changes constantly
• noisy ground truth
Supervised Machine Learning
1. Model problem in terms of input data and output data
2. Collect sample of input-output pairs
3. Learn a mapping that produces the output given the input
4. Apply this mapping to new inputs to make predictions
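The four steps above can be sketched end to end in a few lines of Python. This is a minimal illustration, not the method used in the talk: the toy data is made up, and the learner is a 1-nearest-neighbour rule chosen only because it is short. The names `training_pairs` and `learn_nearest_neighbour` are hypothetical.

```python
import math

# Steps 1-2: model the problem as (input, output) pairs and collect a sample.
# Made-up toy data: input = (height_cm, weight_kg), output = size label.
training_pairs = [
    ((150.0, 50.0), "small"),
    ((160.0, 55.0), "small"),
    ((180.0, 80.0), "large"),
    ((190.0, 95.0), "large"),
]


def learn_nearest_neighbour(pairs):
    """Step 3: 'learn' a mapping; this learner simply memorises the sample."""

    def predict(x):
        # Return the output of the closest training input (Euclidean distance).
        _, nearest_output = min(pairs, key=lambda pair: math.dist(pair[0], x))
        return nearest_output

    return predict


predictor = learn_nearest_neighbour(training_pairs)

# Step 4: apply the learned mapping to a new, unseen input.
print(predictor((185.0, 90.0)))
```

Any other learner could be substituted in step 3; the surrounding workflow stays the same.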
A Programmer’s Perspective
Traditional Programming (Predicting): Input Data + Mapping → Computer → Output Data
Supervised Machine Learning: Input Data + Output Data → Computer → Mapping
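The contrast between the two diagrams can be made concrete with a small Python sketch. It assumes a deliberately simple task (temperature conversion) that is not part of the slides: in the first half the programmer writes the mapping, in the second half the computer recovers the mapping from input-output pairs by ordinary least squares. All function names are hypothetical.

```python
# Traditional programming: the programmer supplies the mapping.
def celsius_to_fahrenheit(celsius):
    return 1.8 * celsius + 32.0


# Supervised learning: the computer derives the mapping from input-output pairs.
def fit_line(inputs, outputs):
    """Ordinary least squares for y = slope * x + intercept (one input attribute)."""
    n = len(inputs)
    mean_x = sum(inputs) / n
    mean_y = sum(outputs) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(inputs, outputs))
    variance = sum((x - mean_x) ** 2 for x in inputs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


inputs = [-10.0, 0.0, 20.0, 37.0, 100.0]
# Stand-in for recorded output data; in practice these would be measurements.
outputs = [celsius_to_fahrenheit(c) for c in inputs]

slope, intercept = fit_line(inputs, outputs)


def learned_mapping(celsius):
    return slope * celsius + intercept


print(slope, intercept, learned_mapping(50.0))
```

Because the recorded outputs here are noise-free, the learned slope and intercept match the hand-written rule up to floating-point error; with real measurements they would only approximate it.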
Advantages
• Use data instead of intuition to derive the mapping
• Can solve very complex tasks
• Can adapt to new situations (collect more data)
• Does not require much expert knowledge
Input Data
Description | Type | Cost | Actual Cost | Diff | In Catalogue
Movies | Entertainment | $50 | $28 | $22 | Yes
Music (CDs, MP3s, etc.) | | $500 | $30 | $470 | No
Sporting Events | Entertainment | $0 | $40 | ($40) | No
Dining Out | Food | $1,000 | $1,200 | ($200) | Yes
Groceries | | $100 | $0 | $100 | Yes
Charity 1 | Gifts and Charity | $200 | $200 | $0 | No
Charity 2 | | $500 | $500 | $0 | No
Cable/Satellite | Housing | $100 | $100 | $0 | Yes
Electric | Housing | $45 | $40 | $5 | Yes
Mortgage or Rent | | $700 | $700 | $0 | Yes
Health | Insurance | $400 | $400 | $0 | Yes
Home | Insurance | $400 | $400 | $0 | No
Credit Card 1 | | | | $0 | Yes
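Callouts on the table:
• Dataset: the whole table
• Attribute: a column; Attribute Name: its header; Attribute Value: a single entry
• Text Data: Description
• Categorical Data: Type
• Numerical Data: Cost, Actual Cost, Diff
• Binary Data: In Catalogue
• Missing Data: empty entries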
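As a rough sketch, the dataset above could be held in Python as a list of records, one dictionary per row. Only four rows from the table are reproduced. The choice of `None` for missing values, the reading of parenthesised amounts as negative numbers, and the names `attribute_kinds` and `missing_counts` are assumptions made for this illustration.

```python
# Four rows from the table above; None marks a missing attribute value.
# Assumption: "($200)" is accounting notation for -200.
dataset = [
    {"Description": "Movies", "Type": "Entertainment",
     "Cost": 50, "Actual Cost": 28, "Diff": 22, "In Catalogue": True},
    {"Description": "Music (CDs, MP3s, etc.)", "Type": None,
     "Cost": 500, "Actual Cost": 30, "Diff": 470, "In Catalogue": False},
    {"Description": "Dining Out", "Type": "Food",
     "Cost": 1000, "Actual Cost": 1200, "Diff": -200, "In Catalogue": True},
    {"Description": "Groceries", "Type": None,
     "Cost": 100, "Actual Cost": 0, "Diff": 100, "In Catalogue": True},
]

# Assumed kind of each attribute, following the callouts on the slide.
attribute_kinds = {
    "Description": "text",
    "Type": "categorical",
    "Cost": "numerical",
    "Actual Cost": "numerical",
    "Diff": "numerical",
    "In Catalogue": "binary",
}


def missing_counts(rows):
    """Count missing (None) values per attribute name."""
    counts = {name: 0 for name in rows[0]}
    for row in rows:
        for name, value in row.items():
            if value is None:
                counts[name] += 1
    return counts


print(missing_counts(dataset))
```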
Output Data
Description | Type | Cost | Actual Cost | Diff | In Catalogue
Movies | Entertainment | $50 | $28 | $22 | Yes
Music (CDs, MP3s, etc.) | ? | $500 | $30 | $470 | No
Sporting Events | Entertainment | $0 | $40 | ($40) | No
Dining Out | Food | $1,000 | $1,200 | ($200) | Yes
Groceries | ? | $100 | $0 | $100 | Yes
Charity 1 | Gifts and Charity | $200 | $200 | $0 | No
Charity 2 | ? | $500 | $500 | $0 | No
Cable/Satellite | Housing | $100 | $100 | $0 | Yes
Electric | Housing | $45 | $40 | $5 | Yes
Mortgage or Rent | ? | $700 | $700 | $0 | Yes
Health | Insurance | $400 | $400 | $0 | Yes
Home | Insurance | $400 | $400 | $0 | No
Credit Card 1 | ? | | | $0 | Yes
Callouts on the table:
• Target Attribute: Type
• Target Attribute Values: the entries of the Type column; "?" marks a value that is unknown and has to be predicted
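One way to read this slide in code: rows whose target value is known become input-output pairs, and rows marked "?" become inputs that still need a prediction. The sketch below assumes that Type is the target attribute and uses `None` for "?"; only a few rows and attributes are reproduced, and `split_by_target` is a hypothetical helper.

```python
TARGET = "Type"  # assumed target attribute

# A few rows from the table; None stands for the "?" entries.
rows = [
    {"Description": "Movies", "Cost": 50, "Type": "Entertainment"},
    {"Description": "Music (CDs, MP3s, etc.)", "Cost": 500, "Type": None},
    {"Description": "Dining Out", "Cost": 1000, "Type": "Food"},
    {"Description": "Groceries", "Cost": 100, "Type": None},
    {"Description": "Electric", "Cost": 45, "Type": "Housing"},
]


def split_by_target(rows, target):
    """Separate rows with a known target value from rows without one."""
    labelled, unlabelled = [], []
    for row in rows:
        inputs = {name: value for name, value in row.items() if name != target}
        if row[target] is None:
            unlabelled.append(inputs)  # output still to be predicted
        else:
            labelled.append((inputs, row[target]))  # usable input-output pair
    return labelled, unlabelled


training_pairs, application_inputs = split_by_target(rows, TARGET)
print(len(training_pairs), len(application_inputs))
```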
Agenda
• What is Machine Learning and why do we need it?
• Model Building
• Model Evaluation & Tuning
Problem Setting
• Input: vector of observable attributes, x
• Output: target attribute value, y
• Training data: pairs of input and corresponding output, $D = (x_1, y_1), \dots, (x_N, y_N)$
• Application data: inputs only
• Goal: learn mapping $f_w : x \mapsto y$ (the Predictor)
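The notation can be mirrored in a short Python sketch. It assumes a hypothetical one-parameter family of predictors (a threshold on a single attribute) and a very simple learning rule (pick the candidate w with the fewest training errors). Neither is prescribed by the slides; they only make D, w and the predictor tangible.

```python
# Training data D: pairs of input x and output y (made-up values).
D = [(1.0, -1), (2.0, -1), (3.0, -1), (6.0, +1), (7.0, +1), (9.0, +1)]


def f(w, x):
    """Hypothetical predictor family: f_w(x) = +1 if x > w, else -1."""
    return +1 if x > w else -1


def training_errors(w, data):
    """Number of training pairs that f_w gets wrong."""
    return sum(1 for x, y in data if f(w, x) != y)


def learn(data, candidates):
    """Choose the parameter w with the fewest training errors."""
    return min(candidates, key=lambda w: training_errors(w, data))


w = learn(D, candidates=[0.0, 2.5, 4.5, 8.0])

# Application data: inputs only; the learned predictor supplies the outputs.
application_data = [0.5, 5.0, 10.0]
predictions = [f(w, x) for x in application_data]
print(w, predictions)
```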
Challenges in Model Building
• Which function class for Predictor (data modeling)?
• How to pre-process the data (feature engineering)?
• How to learn this Predictor from our training data?
• How to generalize to new data?
Which function class for Predictor?
Types of prediction tasks (output type):
• Binary Classification ⇒ binary target $y \in \{-1, +1\}$
• Multinomial Classification ⇒ categorical target $y \in \{1, \dots, K\}$
• Regression ⇒ numeric target $y \in [l, u] \subset \mathbb{R}$
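The three output types can be illustrated by showing how a real-valued model score might be turned into each kind of target. This is a sketch under the assumption that the predictor produces scores; the slides do not say so, and clipping a regression output to [l, u] is just one simple way to respect the range. All function names are hypothetical.

```python
def binary_output(score):
    """Binary classification: map a score to a label in {-1, +1}."""
    return +1 if score >= 0 else -1


def multinomial_output(scores):
    """Multinomial classification: one score per class, return a label in 1..K."""
    best_index = max(range(len(scores)), key=lambda k: scores[k])
    return best_index + 1  # classes are numbered from 1


def regression_output(score, lower, upper):
    """Regression: return a number inside the interval [lower, upper]."""
    return min(max(score, lower), upper)


print(binary_output(-0.3))
print(multinomial_output([0.1, 2.0, -1.0]))
print(regression_output(12.5, 0.0, 10.0))
```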
Which function class for Binary Classification?
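• Decision Tree
[Figure: decision tree with the tests x2 > 7?, x1 < 3?, x2 < 5? and x1 < 1? (each with a "no" and a "yes" branch), shown next to + and − training points in the (x1, x2) plane; axis ticks at x1 = 1, 3 and x2 = 5, 7]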
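A decision tree is a set of nested attribute tests, which maps directly onto nested `if` statements. The sketch below uses the four tests named on the slide, but the nesting order and the leaf labels are assumptions made for illustration; the figure is not available in this text, so this is not a reconstruction of the tree shown in the talk.

```python
def decision_tree(x1, x2):
    """Classify a point as +1 or -1 by a sequence of axis-parallel tests.

    ASSUMPTION: the tests come from the slide, but how they are nested and
    which label each leaf carries is made up for this example.
    """
    if x2 > 7:
        return +1
    if x1 < 3:
        if x1 < 1:
            return -1
        return +1
    if x2 < 5:
        return -1
    return +1


for point in [(0.0, 8.0), (0.5, 2.0), (2.0, 2.0), (4.0, 4.0), (4.0, 6.0)]:
    print(point, decision_tree(*point))
```

Each test compares one attribute with a threshold, so every leaf corresponds to an axis-parallel rectangle in the (x1, x2) plane.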
Which function class for Binary Classification?
• Decision Tree
[Figure: the same decision tree (tests x2 > 7?, x1 < 3?, x2 < 5?, x1 < 1?) next to the (x1, x2) plane, labelled with + and − regions]
Which function class for Binary Classification?
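• Linear function
• binary target attribute values $y \in \{-1, +1\}$
• prediction: $\hat{y}(x) = \operatorname{sign}(f_w(x))$
• separating hyperplane: $H_w = \{x \mid f_w(x) = x^\top w + w_0 = 0\}$
[Figure: (x1, x2) plane divided by the hyperplane H_w into a + side and a − side]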
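The linear predictor can be written out in a few lines of Python. The weights `w` and offset `w0` below are arbitrary values chosen for the example, not learned ones, and treating a score of exactly zero (a point on the hyperplane) as +1 is a convention assumed here.

```python
def f_w(x, w, w0):
    """Linear decision function f_w(x) = x^T w + w0."""
    return sum(x_i * w_i for x_i, w_i in zip(x, w)) + w0


def predict(x, w, w0):
    """Predicted label sign(f_w(x)); points on the hyperplane count as +1."""
    return +1 if f_w(x, w, w0) >= 0 else -1


# Hypothetical parameters for a two-attribute problem.
w = (1.0, -1.0)
w0 = 0.5

print(predict((2.0, 1.0), w, w0))  # f_w = 2 - 1 + 0.5 = 1.5
print(predict((0.0, 3.0), w, w0))  # f_w = 0 - 3 + 0.5 = -2.5
```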
Which function class for Binary Classification?
• Generalized linear function (Kernel methods)
• Layered Generalized linear function (Neural Networks)
• Ensemble of functions
• …
[Figure: + and − regions in the (x1, x2) plane]
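To see why richer function classes help, consider data that no single hyperplane separates. The sketch below assumes that "generalized linear" means a function that is linear in transformed features φ(x); kernel methods work with such features implicitly, whereas here the feature map is written out by hand. The XOR-like data, the feature map `phi` and the weights are all chosen for illustration and are not taken from the slides.

```python
def phi(x):
    """Hypothetical feature map: add the product of the two attributes."""
    x1, x2 = x
    return (x1, x2, x1 * x2)


def generalized_linear(x, w, w0=0.0):
    """f(x) = phi(x)^T w + w0: linear in the features, non-linear in x."""
    return sum(f_i * w_i for f_i, w_i in zip(phi(x), w)) + w0


# XOR-like layout: the label is +1 when both attributes have the same sign.
xor_like = [((+1, +1), +1), ((-1, -1), +1), ((+1, -1), -1), ((-1, +1), -1)]

# Hand-picked weights: only the product feature is used.
w = (0.0, 0.0, 1.0)

predictions = [+1 if generalized_linear(x, w) >= 0 else -1 for x, _ in xor_like]
print(predictions)
```

A plain linear function of (x1, x2) cannot label these four points correctly, while the same linear form applied to φ(x) can.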
How to pre-process the data?
• Predictor’s function class defined for limited input domain
⇒ transform/extract attributes first (pre-processing)
• Number to (normalized) Number:
• z-standardization, min-max normalization
• Number to Category:
• Binning (quantile, equidistant)
• Category to (numeric) Vector:
• One-hot encoding
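The transformations listed above can each be sketched in a few lines of plain Python. These are minimal versions written for illustration: they assume non-constant input lists (otherwise the divisions fail), use the population standard deviation for z-standardization, and implement quantile binning by rank. The function names are hypothetical.

```python
import statistics


def z_standardize(values):
    """Number to normalized number: zero mean, unit (population) std deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def min_max_normalize(values):
    """Number to normalized number: rescale linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def equidistant_bins(values, n_bins):
    """Number to category: bins of equal width between min and max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def quantile_bins(values, n_bins):
    """Number to category: bins holding roughly the same number of values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, index in enumerate(order):
        bins[index] = rank * n_bins // len(values)
    return bins


def one_hot(categories):
    """Category to numeric vector: one 0/1 position per distinct category."""
    vocabulary = sorted(set(categories))
    return [[1 if c == v else 0 for v in vocabulary] for c in categories]


print(min_max_normalize([0.0, 50.0, 100.0]))
print(equidistant_bins([0, 10, 20, 30, 40], 2))
print(quantile_bins([5, 1, 100, 7], 2))
print(one_hot(["Food", "Housing", "Food"]))
```

In practice the parameters of each transformation (mean, minimum, bin edges, vocabulary) would be computed on the training data and reused unchanged on new inputs.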
How to pre-process the data?
• Predictor’s function class defined for limited input domain
⇒ transform/extract attributes f