  • © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

    Michael Brückner

    Manager Machine Learning

    25/02/2016

    Machine Learning 101

    Agenda

    • What is Machine Learning and why do we need it?

    • Model Building

    • Model Evaluation & Tuning

    What is Machine Learning?

    Methods and systems that …

    • adapt based on recorded data

    • predict new data based on recorded data

    • optimize an action given a utility function

    • extract hidden structure from the data

    • summarize data into concise descriptions

    What is Machine Learning NOT?

    Methods and systems that …

    • can yield Garbage-In Knowledge-Out

    • perform well without data modeling & feature engineering

    • avoid the curse of dimensionality

    • are a replacement for business rules

    Infer-Predict-Decide Cycle

    • Inference: build & evaluate the Predictor

    • Prediction: apply the learned Predictor

    • Decision Making: adjust the business loss and get new/more data

    What for?

    Automate tasks that typically require humans, in order to

    • scale

    • improve over humans (non-experts)

    • preserve privacy

    or solve tasks that are impossible for humans

    Examples: Personalized Recommendation

    • Input:

    Examples: Personalized Recommendation

    • Output:

    Examples: Face Detection & Recognition

    Face detection

    • Input: image

    • Output: face position

    Face recognition

    • Input: face (image & face position)

    • Output: person’s name

    Examples: Full-Text Translation

    • Input: text in one language

    • Output: text in another language

    Examples: Spam Filtering

    • Input: email (text, images, …)

    • Output: spam/non-spam flag

    • Challenges:

      • extremely high precision for legitimate emails

      • spam changes constantly

      • noisy ground truth

    Supervised Machine Learning

    1. Model problem in terms of input data and output data

    2. Collect sample of input-output pairs

    3. Learn a mapping that produces the output given the input

    4. Apply this function on new inputs to make predictions
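
    The four steps above can be sketched in a few lines of Python. This is only an illustration, not the method used in the talk: the toy weather data, the names `learn_mapping` and `squared_distance`, and the choice of a 1-nearest-neighbour rule as the learned mapping are all assumptions made for the example.

```python
# Steps 1-2: model the problem as (input, output) pairs and collect a sample.
# Hypothetical toy data: input = (hours of sun, mm of rain), output = label.
training_pairs = [
    ((9.0, 0.0), "dry"),
    ((8.0, 1.0), "dry"),
    ((2.0, 12.0), "wet"),
    ((1.0, 20.0), "wet"),
]


def squared_distance(a, b):
    """Squared Euclidean distance between two equally long tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def learn_mapping(pairs):
    """Step 3: learn a mapping from inputs to outputs.

    A 1-nearest-neighbour rule memorises the sample and returns a function
    that answers with the output of the closest known input.
    """
    memory = list(pairs)

    def mapping(x):
        _, nearest_output = min(
            memory, key=lambda pair: squared_distance(pair[0], x)
        )
        return nearest_output

    return mapping


# Step 4: apply the learned function to new inputs to make predictions.
predictor = learn_mapping(training_pairs)
new_inputs = [(7.5, 0.5), (1.5, 15.0)]
predictions = [predictor(x) for x in new_inputs]
print(predictions)  # ['dry', 'wet']
```

    Note that no rule about sun or rain was written by hand: the mapping comes entirely from the collected pairs, so collecting different pairs would yield a different predictor.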

    A Programmer’s Perspective

    Traditional Programming (Predicting): Input Data + Mapping → Computer → Output Data

    Supervised Machine Learning: Input Data + Output Data → Computer → Mapping

    Advantages

    • Use data instead of intuition to derive the mapping

    • Can solve very complex tasks

    • Can adapt to new situations (collect more data)

    • Does not require much expert knowledge

    Input Data

    Description              Type               Cost    Actual Cost  Diff    In Catalogue
    Movies                   Entertainment      $50     $28          $22     Yes
    Music (CDs, MP3s, etc.)                     $500    $30          $470    No
    Sporting Events          Entertainment      $0      $40          ($40)   No
    Dining Out               Food               $1,000  $1,200       ($200)  Yes
    Groceries                                   $100    $0           $100    Yes
    Charity 1                Gifts and Charity  $200    $200         $0      No
    Charity 2                                   $500    $500         $0      No
    Cable/Satellite          Housing            $100    $100         $0      Yes
    Electric                 Housing            $45     $40          $5      Yes
    Mortgage or Rent                            $700    $700         $0      Yes
    Health                   Insurance          $400    $400         $0      Yes
    Home                     Insurance          $400    $400         $0      No
    Credit Card 1                                                    $0      Yes

    Dataset: each column is an Attribute, consisting of an Attribute Name (the header) and Attribute Values (the cells).

    Attribute types: Text Data (Description), Categorical Data (Type), Numerical Data (Cost, Actual Cost, Diff), Binary Data (In Catalogue). Empty cells are Missing Data.

    Description              Type               Cost    Actual Cost  Diff    In Catalogue
    Movies                   Entertainment      $50     $28          $22     Yes
    Music (CDs, MP3s, etc.)  ?                  $500    $30          $470    No
    Sporting Events          Entertainment      $0      $40          ($40)   No
    Dining Out               Food               $1,000  $1,200       ($200)  Yes
    Groceries                ?                  $100    $0           $100    Yes
    Charity 1                Gifts and Charity  $200    $200         $0      No
    Charity 2                ?                  $500    $500         $0      No
    Cable/Satellite          Housing            $100    $100         $0      Yes
    Electric                 Housing            $45     $40          $5      Yes
    Mortgage or Rent         ?                  $700    $700         $0      Yes
    Health                   Insurance          $400    $400         $0      Yes
    Home                     Insurance          $400    $400         $0      No
    Credit Card 1            ?                                       $0      Yes

    Output Data: Type is the Target Attribute and its entries are the Target Attribute Values; "?" marks the values that are not known.

    Agenda

    • What is Machine Learning and why do we need it?

    • Model Building

    • Model Evaluation & Tuning

    Problem Setting

    • Input: vector of observable attributes, x

    • Output: target attribute value, y

    • Training data: pairs of input and corresponding output, D = {(x1, y1), …, (xN, yN)}

    • Application data: inputs only

    • Goal: learn mapping f_w: x ↦ y (the Predictor)

    Challenges in Model Building

    • Which function class for Predictor (data modeling)?

    • How to pre-process the data (feature engineering)?

    • How to learn this Predictor from our training data?

    • How to generalize to new data?

    Which function class for Predictor?

    Types of prediction tasks (output type):

    • Binary Classification ⇒ binary target y ∈ {–1, +1}

    • Multinomial Classification ⇒ categorical target y ∈ {1, …, K}

    • Regression ⇒ numeric target y ∈ [l, u] ⊂ ℝ
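
    The three output types can be made concrete with a short sketch of how a predictor's raw scores might be turned into each kind of target. The function names, the convention that a score of exactly 0 maps to +1, and the clipping of regression outputs to [l, u] are assumptions made for illustration, not part of the slides.

```python
def predict_binary(score):
    """Binary classification: map a real-valued score to -1 or +1.

    Assumption: a score of exactly 0 is assigned to the +1 class.
    """
    return 1 if score >= 0 else -1


def predict_multinomial(scores):
    """Multinomial classification: pick the class (1..K) with the highest score."""
    best_index = max(range(len(scores)), key=lambda k: scores[k])
    return best_index + 1  # classes are numbered 1..K, not 0..K-1


def predict_regression(score, lower, upper):
    """Regression: return a number, clipped to the valid range [lower, upper]."""
    return min(max(score, lower), upper)


print(predict_binary(-0.3))                   # -1
print(predict_multinomial([0.1, 2.5, -1.0]))  # 2
print(predict_regression(120.0, 0.0, 100.0))  # 100.0
```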

    Which function class for Binary Classification?

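    • Decision Tree

    [Figure: decision tree with the tests "x2 > 7?", "x1 < 3?", "x2 < 5?" and "x1 < 1?" (each with a no/yes branch), shown next to a scatter plot of + and - training points in the (x1, x2) plane with axis ticks at x1 = 1, 3 and x2 = 5, 7]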
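
    A decision tree is a nested sequence of threshold tests on single attributes, which reads naturally as nested `if` statements. The sketch below reuses the four tests named on the slide, but how they are nested and which label each leaf carries cannot be recovered from the slide text, so both are assumptions made for illustration.

```python
def decision_tree(x1, x2):
    """Classify the point (x1, x2) as +1 or -1.

    Thresholds (7, 3, 5, 1) come from the slide; the nesting of the tests
    and the leaf labels are hypothetical.
    """
    if x2 > 7:          # top band of the plane
        return +1
    if x1 < 3:          # left part, below x2 = 7
        if x1 < 1:
            return -1
        return +1
    if x2 < 5:          # right part, bottom band
        return -1
    return +1           # right part, between x2 = 5 and x2 = 7


for point in [(0.0, 8.0), (0.5, 2.0), (2.0, 2.0), (4.0, 3.0), (4.0, 6.0)]:
    print(point, decision_tree(*point))
```

    Each test compares one attribute with one constant, so every leaf corresponds to an axis-parallel rectangle in the (x1, x2) plane.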

    Which function class for Binary Classification?

    • Linear function

    • binary target attribute values y ∈ {–1, +1}

    ŷ(x) = sign(f_w(x))

    H_w = {x | f_w(x) = xᵀw + w_0 = 0}

    [Figure: hyperplane H_w separating the + and - regions in the (x1, x2) plane]
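
    The linear predictor above can be written out directly. In this sketch the weight values are hypothetical, and mapping a score of exactly 0 (a point on the hyperplane H_w) to +1 is a convention chosen for the example.

```python
def f(x, w, w0):
    """Linear decision function f_w(x) = x^T w + w0."""
    return sum(xi * wi for xi, wi in zip(x, w)) + w0


def predict(x, w, w0):
    """Predicted label sign(f_w(x)); points on H_w (score 0) are mapped to +1."""
    return 1 if f(x, w, w0) >= 0 else -1


# Hypothetical parameters: w is the normal vector of H_w, w0 the offset.
w = [1.0, -1.0]
w0 = 0.5

print(predict([2.0, 1.0], w, w0))  # f = 2.0 - 1.0 + 0.5 =  1.5, prints 1
print(predict([0.0, 3.0], w, w0))  # f = 0.0 - 3.0 + 0.5 = -2.5, prints -1
print(f([0.5, 1.0], w, w0))        # 0.0, so this point lies on H_w
```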

    Which function class for Binary Classification?

    • Generalized linear function (Kernel methods)

    • Layered generalized linear function (Neural Networks)

    • Ensemble of functions

    • …

    [Figure: + and - regions in the (x1, x2) plane]
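
    One common reading of "generalized linear function" is a linear function applied to transformed inputs. The sketch below illustrates that idea on an XOR-like arrangement of points, which no straight line in the (x1, x2) plane separates. The feature map, the weights, and the data are assumptions made for the example; kernel methods typically use such a map implicitly rather than computing it explicitly as done here.

```python
def feature_map(x):
    """Hypothetical feature map: append the product x1 * x2 to the input."""
    x1, x2 = x
    return [x1, x2, x1 * x2]


def predict(x, w, w0):
    """Linear classifier in feature space: sign(phi(x)^T w + w0)."""
    score = sum(p * wi for p, wi in zip(feature_map(x), w)) + w0
    return 1 if score >= 0 else -1


# XOR-like labels: +1 when both coordinates have the same sign.
points = {
    (1.0, 1.0): 1,
    (-1.0, -1.0): 1,
    (1.0, -1.0): -1,
    (-1.0, 1.0): -1,
}

# Only the product feature gets weight, so the decision is sign(x1 * x2).
w = [0.0, 0.0, 1.0]
w0 = 0.0

predictions = {x: predict(x, w, w0) for x in points}
print(predictions == points)  # True
```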

    How to pre-process the data?

    • Predictor’s function class is defined only for a limited input domain ⇒ transform/extract attributes first (pre-processing)

    • Number to (normalized) Number: z-standardization, min-max normalization

    • Number to Category: binning (quantile, equidistant)

    • Category to (numeric) Vector: one-hot encoding
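
    The transformations listed above can each be written in a few lines. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, the sample values are loosely based on the budget table earlier in the deck, and only equidistant (not quantile) binning is shown.

```python
from statistics import mean, pstdev


def z_standardize(values):
    """Shift to mean 0 and scale to standard deviation 1.

    Assumes the values are not all equal (otherwise sigma is 0).
    """
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def min_max_normalize(values):
    """Rescale linearly so the smallest value maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def equidistant_bins(values, n_bins):
    """Assign each number to one of n_bins equally wide intervals (0..n_bins-1)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    # The maximum falls exactly on the upper edge, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def one_hot(categories):
    """Encode each category as a 0/1 vector with a single 1 at its position."""
    vocabulary = sorted(set(categories))
    return [[1 if c == known else 0 for known in vocabulary] for c in categories]


costs = [50.0, 0.0, 1000.0, 100.0]
types = ["Entertainment", "Entertainment", "Food", "Housing"]

standardized = z_standardize(costs)
normalized = min_max_normalize(costs)
binned = equidistant_bins(costs, 4)
encoded = one_hot(types)

print(normalized)
print(binned)
print(encoded)
```

    Which transformation is appropriate depends on the input domain the chosen function class expects; a linear function, for example, needs numeric vectors, so categorical attributes would be one-hot encoded first.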

    How to pre-process the data?

    • Predictor’s function class defined for limited input domain

    ⇒ transform/extract attributes f