Machine Learning 101
Fred Verheul
2
What we won’t cover…
• Deep learning / Neural Networks
• Specifics of ML-algorithms
• Tools / Libraries / Code
• SAP Products, like HANA / Predictive Analytics / Vora / …
• Ethics, algorithmic transparency & fairness
• Hardware
3
Examples: Recommender systems
4
Examples, continued…
SPAM-filtering
Handwriting recognition
5
ML in the news: DeepMind’s AlphaGo
6
7
Machine Learning
"Field of study that gives computers the ability to learn without being explicitly programmed" (Arthur Samuel, 1959)
8
What is Machine Learning?
Traditional Programming: Data + Program → Computer → Output
Machine Learning: Data + Output → Computer → Program
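The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration, not something from the slides: the feature (number of exclamation marks in a message), the candidate thresholds 0 to 10 and the six labeled examples are all made-up assumptions.

```python
def handwritten_rule(exclamation_marks):
    # Traditional programming: a human writes the rule (the "program").
    return exclamation_marks > 3


def learn_threshold(examples):
    """Machine learning: derive the rule (here a threshold) from data + outputs."""
    best_threshold, best_correct = 0, -1
    for threshold in range(0, 11):  # hypothetical candidate thresholds
        correct = sum((x > threshold) == label for x, label in examples)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


# Hypothetical labeled data: (exclamation marks, is_spam)
examples = [(0, False), (1, False), (2, False), (5, True), (6, True), (8, True)]
threshold = learn_threshold(examples)


def learned_rule(exclamation_marks):
    # The "program" that came out of the learning step.
    return exclamation_marks > threshold
```

In the first function the rule is the input; in the second, the rule is the output.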
9
Sweet spot for Machine Learning
• It’s impossible to write down the rules in code:
  • Too many rules
  • Too many factors influencing the rules
  • Too finely tuned
  • We just don’t know the rules (image recognition)
• Lots of labeled data (examples) available (e.g. historical data)
10
Basic Machine Learning ‘workflow’
Training Phase: Training data → Feature Vectors + Labels → Machine Learning Algorithm → Predictive Model
Operational Phase: New data → Feature Vectors → Predictive Model → Prediction
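A minimal sketch of the two phases, assuming a nearest-centroid classifier as the learning algorithm (the slides do not name one). The feature vectors, the labels "ham" and "spam" and the function names are hypothetical.

```python
def train(feature_vectors, labels):
    """Training phase: build a predictive model (one centroid per label)."""
    sums, counts = {}, {}
    for vector, label in zip(feature_vectors, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vector)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vector)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(model, vector):
    """Operational phase: apply the model to a new feature vector."""
    def squared_distance(centroid):
        return sum((c - v) ** 2 for c, v in zip(centroid, vector))
    # Pick the label whose centroid is closest to the new vector.
    return min(model, key=lambda label: squared_distance(model[label]))


# Hypothetical training data: feature vectors plus labels
training_vectors = [[1.0, 1.0], [1.5, 2.0], [8.0, 9.0], [9.0, 8.5]]
training_labels = ["ham", "ham", "spam", "spam"]

model = train(training_vectors, training_labels)   # training phase
prediction = predict(model, [8.5, 8.0])            # operational phase
```

The model is built once from labeled data and then reused for every new feature vector.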
11
Training Phase in more detail
Raw data → Data preparation → Feature Vectors → Training Data / Test data
Data preparation: data cleansing, data transformation, normalization, feature extraction
Training Data → Model Building (by ML algorithm), aka ‘learning’ → Model Evaluation → Predictive Model
Feedback loop
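Two of the steps named here, normalization and the split into training and test data, can be sketched with the standard library alone. The raw values, the 25% test fraction and the fixed seed are assumptions chosen for illustration.

```python
import random


def min_max_normalize(column):
    """Rescale a numeric column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and hold back a fraction of them as test data."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = max(1, int(len(rows) * test_fraction))
    return rows[n_test:], rows[:n_test]


raw_ages = [18, 25, 40, 62, 33, 51, 29, 47]   # hypothetical raw data
normalized = min_max_normalize(raw_ages)      # data preparation
train_rows, test_rows = train_test_split(normalized)
```

The held-back test rows are used only for model evaluation, never for model building. In practice the normalization bounds are often computed on the training rows only, so that nothing about the test data leaks into training.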
12
CRISP-DM: data mining process
(Diagram: two phases of the CRISP-DM cycle are marked ‘ML important’)
13
Examples of ML tasks
Supervised learning
• Regression: target is numeric
• Classification: target is categorical
Unsupervised learning
• Clustering
• Dimensionality reduction
14
Modeling: so many algorithms…
15
ML Algorithms: by Representation
Representation: collection of candidate models/programs, aka hypothesis space
Decision trees
Instance-based
Neural networks
Model ensembles
ML Algorithms: by Evaluation
Evaluation: Quality measure for a model
16
Regression
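Example metric: Root Mean Squared Error
RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}
where y_i is the actual value, \hat{y}_i the predicted value, and n the number of examples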
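Root Mean Squared Error is the square root of the average squared difference between actual and predicted values. A minimal sketch follows; the function name and the three example values are assumptions.

```python
import math


def rmse(actual, predicted):
    """Root Mean Squared Error between two equally long lists of numbers."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical targets and predictions: the errors are 1, 0 and -2
example_rmse = rmse([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
```

Because the errors are squared before averaging, one large miss weighs more heavily than several small ones.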
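Binary classification: confusion matrix
Example: medical test for a disease
Counts implied by the figures on this slide: 8 true positives, 19 false positives, 2 false negatives, 971 true negatives (1,000 cases)
Accuracy: (8 + 971) / 1000 = 97.9%
Better evaluation metrics:
• Precision: 8 / (8 + 19) ≈ 29.6%
• Recall: 8 / (8 + 2) = 80%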
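The arithmetic behind these metrics can be checked with a short sketch. The four counts are an assumption reconstructed from the numbers on the slide (8, 19, 2 and 971, adding up to 1,000 cases); the function name is hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,   # share of all cases classified correctly
        "precision": tp / (tp + fp),     # share of positive test results that are right
        "recall": tp / (tp + fn),        # share of actual disease cases that are found
    }


# Counts reconstructed from the slide (assumption): TP=8, FP=19, FN=2, TN=971
metrics = classification_metrics(tp=8, fp=19, fn=2, tn=971)
```

With a rare disease, accuracy stays high even though fewer than a third of the positive test results are correct, which is why precision and recall are the more informative metrics here.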
17
ML Algorithms: by Optimization
Optimization: how the algorithm ‘learns’; depends on representation and evaluation
Greedy Search, an example of combinatorial optimization
Gradient Descent (or in general: Convex Optimization)
Linear Programming (or in general: Constrained/Nonlinear Optimization)
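As an illustration of gradient descent, the sketch below fits a straight line by repeatedly stepping against the gradient of the mean squared error. The representation (a line), the learning rate, the number of steps and the data points are all assumptions for this example.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = slope * x + intercept by gradient descent on mean squared error."""
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error
        grad_slope = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = 2 / n * sum(errors)
        # Step against the gradient
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Hypothetical data generated from y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
```

Here the representation is the family of straight lines, the evaluation is mean squared error, and the optimization is the update loop.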
18
Training error vs test error
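The gap between the two errors can be made concrete with a deliberately bad learner: a model that only memorizes its training examples. The toy task (label 1 for even numbers), the data and the default prediction are assumptions made for this sketch.

```python
def train_memorizer(examples):
    """'Learn' by storing every training example in a lookup table."""
    return dict(examples)


def predict(model, x, default=0):
    # Unseen inputs fall back to a fixed default label.
    return model.get(x, default)


def error_rate(model, examples):
    wrong = sum(predict(model, x) != label for x, label in examples)
    return wrong / len(examples)


# Hypothetical toy task: label is 1 when x is even, 0 otherwise
training = [(2, 1), (3, 0), (4, 1), (7, 0)]
test = [(6, 1), (9, 0), (10, 1), (11, 0)]

model = train_memorizer(training)
training_error = error_rate(model, training)
test_error = error_rate(model, test)
```

The memorizer is perfect on the data it has seen and no better than its default guess on new data, so a low training error alone says little about how well a model generalizes.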
19
Data Science for Business
• Focuses more on general principles than specific algorithms
• Not math-heavy, does contain some math
• O’Reilly link: http://shop.oreilly.com/product/0636920028918.do
• Book website: http://data-science-for-biz.com/DSB/Home.html
20
Take-aways
• Goal of ML: generalize from training data (not optimization!!)
• Part of ‘Data Mining Process’, not a goal in and of itself
• No magic! Just some clever algorithms…
• Increasingly important non-technical aspects:
  • Ethics
  • Algorithmic transparency
Thank you
[email protected]
@SOAPEOPLE
Fred Verheul
Big Data Consultant
+31 6 3919 [email protected]
@fredverheul