56
INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

Embed Size (px)

Citation preview

Page 1: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

INTRODUCTION TO Machine LearningAdapted from: ETHEM ALPAYDIN

Lecture Slides for

Page 2: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

CHAPTER 1: Introduction

Page 3: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

3

Why “Learn” ?Why “Learn” ?

Machine learning is programming computers to optimize a performance criterion using example data or past experience.

There is no need to “learn” to calculate payroll Learning is used when:

Human expertise does not exist (navigating on Mars), Humans are unable to explain their expertise (speech

recognition) Solution changes in time (routing on a computer network) Solution needs to be adapted to particular cases (user

biometrics)

Page 4: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

4

What We Talk About When We Talk About“Learning”What We Talk About When We Talk About“Learning” Learning general models from a data of particular examples Data is cheap and abundant (data warehouses, data marts);

knowledge is expensive and scarce. Example in retail: Customer transactions to consumer

behavior: People who bought “Introduction to Machine Learning (Adaptive Computation and Machine Learning)” also bought “Data Mining: Practical Machine Learning Tools and Techniques

” (www.amazon.com) Build a model that is a good and useful approximation to

the data.

Page 5: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

5

Data MiningData Mining

Retail: Market basket analysis, Customer relationship management (CRM)

Finance: Credit scoring, fraud detection Manufacturing: Optimization, troubleshooting Medicine: Medical diagnosis Telecommunications: Quality of service

optimization Bioinformatics: Motifs, alignment Web mining: Search engines ...

Page 6: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

6

What is Machine Learning?What is Machine Learning?

Optimize a performance criterion using example data or past experience.

Role of Statistics: Inference from a sample Role of Computer science: Efficient algorithms to

Solve the optimization problem Representing and evaluating the model for

inference

Page 7: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

7

What is Machine Learning? An AbstractWhat is Machine Learning? An Abstract

LearningMachine

(Algorithm)

TRAININGDATA Answer

TrainedMachine

Query

Page 8: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

8

Some Learning MachinesSome Learning Machines

Linear models Kernel methods Neural networks Markov Model Decision trees …

Page 9: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

9

Some Area of Applications Some Area of Applications

inputs

Trai

ning

exa

mpl

es

10

102

103

104

105

Bioinformatics

Ecology

OCRHWR

MarketAnalysis

TextCategorization

Machine Vision

Syst

em d

iagn

osis

10 102 103 104 105

Page 10: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

10

Bioinformatics: Biomedical / BiometricsBioinformatics: Biomedical / Biometrics

Medicine: Screening Diagnosis and prognosis Drug discovery

Security: Face recognition Signature / fingerprint / iris

verification DNA fingerprinting

6

Page 11: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

11

Computer / InternetComputer / Internet

Computer interfaces: Troubleshooting wizards Handwriting and speech Brain waves

Internet Hit ranking Spam filtering Text categorization Text translation Recommendation

7

Page 12: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

Application Types

Page 13: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

13

Application TypesApplication Types

Learning Association Supervised Learning

Classification Regression

Unsupervised Learning Reinforcement Learning (unsupervised)

Page 14: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

14

Learning AssociationsLearning Associations

Basket analysis: P (Y | X ) probability that somebody who buys X also buys Y where X and Y are products/services.

Example: P ( pen | note book ) = 0.7

Page 15: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

15

Application TypesApplication Types

Learning Association Supervised Learning

Classification Regression

Unsupervised Learning Reinforcement Learning (unsupervised)

Page 16: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

16

ClassificationClassification

Example: Credit scoring

Differentiating between low-risk and high-risk customers from their income and savings

Discriminant: IF income > θ1 AND savings > θ2 THEN low-risk ELSE high-risk

Page 17: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

17

Classification: ApplicationsClassification: Applications

Face recognition: Pose, lighting, occlusion (glasses, beard), make-up, hair style

Character recognition: Different handwriting styles.

Speech recognition: Temporal dependency. Use of a dictionary or the syntax of the language. Sensor fusion: Combine multiple modalities; eg, visual

(lip image) and acoustic for speech Medical diagnosis: From symptoms to illnesses ...

Page 18: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

18

Face RecognitionFace Recognition

Training examples of a person

Test images

AT&T Laboratories, Cambridge UKhttp://www.uk.research.att.com/facedatabase.html

Page 19: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

19

Example 1Example 1

“Sorting incoming Fish on a conveyor according to species using optical sensing”

Sea bassSpecies

Salmon

Page 20: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

20

Problem Analysis

Set up a camera and take some sample images to extract features:

Length Lightness Width Number and shape of fins Position of the mouth, etc…

This is the set of all suggested features to explore for use in our classifier!

Page 21: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

21

Preprocessing

Use a segmentation operation to isolate fishes from one another and from the background.

Information from a single fish is sent to a feature extractor whose purpose is to reduce the data by measuring certain features.

The features are passed to a classifier.

Page 22: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

22

Page 23: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

23

Classification Select the length of the fish as a possible feature for

discrimination

The length is a poor feature alone!Select the lightness as a possible

feature.

Page 24: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

24

Classification Select the lightness of the fish as a possible feature for

discrimination

Move decision boundary toward smaller values of lightness in order to minimize the cost (reduce the number of sea bass that are classified salmon!)

Task of decision theory

Page 25: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

25

Adopt the lightness and add the width of the fish

Fish xT = [x1, x2 ]

lightness width (length)

We might add other features that are not correlated with the ones we already have.

Page 26: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

26

Ideally, the best decision boundary should be the one which provides an optimal performance such as in the following figure:

Page 27: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

27

However, our satisfaction is premature because the central aim of designing a classifier is to correctly classify novel input

Issue of generalization!

However, our satisfaction is premature because the central aim of designing a classifier is to correctly classify novel input

Issue of generalization!

Page 28: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

28

Example: Particle ClassificationExample: Particle Classification

Particles on an air filter

P1:

P2:

P3:

Page 29: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

29

Sample data setSample data set

Area Perimeter Class

3 6 P15 7 P14 4 P17 6 P1

12 11 P215 10 P214 12 P217 13 P214 19 P313 20 P315 22 P312 18 P3… … …

Features Class

Page 30: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

30

Histogram AnalysisHistogram Analysis

P1

P2

P1: P2: P3:

Area

Num

ber

P2

Area

Num

ber

P3

Page 31: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

31

Histogram AnalysisHistogram Analysis

P2

P3

P1: P2: P3:

Perimeter

Num

ber

Area

Peri

mete

r

P1

P2

P3

Page 32: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

32

Input Space PartitioningInput Space Partitioning

P1: P2: P3:

Area

Peri

mete

r P1

P2

P3

Area

Peri

mete

r P1

P2

P3

Page 33: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

33

Classification SystemsClassification Systems

Sensing

Use of a transducer (camera or microphone) PR system depends of the bandwidth, the resolution

sensitivity distortion of the transducer

Segmentation and grouping

Patterns should be well separated and should not overlap

Page 34: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

34

Page 35: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

35

Feature extraction Discriminative features Invariant features with respect to translation, rotation

and scale.

Classification Use a feature vector provided by a feature extractor to

assign the object to a category

Post Processing Exploit context input dependent information other

than from the target pattern itself to improve performance

Page 36: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

36

The Design CycleThe Design Cycle

Data collection Feature Choice Model Choice Training Evaluation Computational Complexity

Page 37: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

37

Page 38: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

38

Data Collection How do we know when we have collected an

adequately large and representative set of examples for training and testing the system?

Feature Choice Depends on the characteristics of the problem

domain. Simple to extract, invariant to irrelevant transformation insensitive to noise.

Model Choice Unsatisfied with the performance of our fish

classifier and want to jump to another class of model.

Page 39: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

39

Training Use data to determine the classifier. Many different

procedures for training classifiers and choosing models

Evaluation Measure the error rate (or performance and

switch from one set of features to another one

Page 40: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

40

Application TypesApplication Types

Learning Association Supervised Learning

Classification Regression

Unsupervised Learning Reinforcement Learning (unsupervised)

Page 41: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

41

RegressionRegression

010

2030

40

0

10

20

30

20

22

24

26

Tem

pera

ture

0 10 200

20

40

[start Matlab demo lecture2.m]

Given examples

Predict given a new point

Page 42: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

42

010

2030

40

0

10

20

30

20

22

24

26

Tem

pera

ture

Linear regression

PredictionPrediction

0 200

20

40

Page 43: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

43

Regression - Example

Price of a used car x : car attributes

y : pricey = g (x | θ)

g ( ) : model,θ : parameters (car

type or model or options …

y = wx+w0

Page 44: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

44

Some Regression Applications

Navigating a car: Angle of the steering wheel Kinematics of a robot arm

α1= g1(x,y)

α2= g2(x,y)

α1

α2

(x,y)

Response surface design

Page 45: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

45

Response Surface Design

Response Surface Design is useful for the modeling and analysis of programs in which a response of interest is influenced by several variables and the objective is to optimize this response.

For example: Find the levels of temperature (x1) and pressure (x2) to maximize the yield (y) of a process. ),( 21 xxfy

Page 46: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

46

Application TypesApplication Types

Learning Association Supervised Learning

Classification Regression

Unsupervised Learning Reinforcement Learning (unsupervised)

Page 47: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

47

Unsupervised LearningUnsupervised Learning

Learning “what normally happens” No output Clustering: Grouping similar instances Example applications

Customer segmentation in CRM Image compression: Color quantization Bioinformatics: Learning motifs

A motif is a pattern common to a set of nucleic or amino acid subsequences which share some biological property of interest (such as being DNA binding sites for a regulatory protein).

Page 48: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

48

Application TypesApplication Types

Learning Association Supervised Learning

Classification Regression

Unsupervised Learning Reinforcement Learning (unsupervised)

Page 49: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

49

Reinforcement LearningReinforcement Learning

Learning a policy: A sequence of outputs No supervised output but delayed reward Credit assignment problem Game playing Robot in a maze Multiple agents, partial observability, ...

Page 50: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

50

Reinforcement Learning (RL)Reinforcement Learning (RL) In RL problems, an agent interacts with an

environment and iteratively learns a policy, i.e., it maps actions to states to maximize a reward signal.

Agent

Environment

action at

state st+1

reward rt+1

Update V(st) or Q(st,at)

Page 51: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

51

Learning Autonomous AgentLearning Autonomous Agent

Environment

ActionsPerceptions

Page 52: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

52

Learning Autonomous AgentLearning Autonomous Agent

Environment

ActionsPerceptions

DelayedReward

Page 53: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

53

Learning Autonomous AgentLearning Autonomous Agent

Environment

ActionsPerceptions

Newbehavior

Delayed RewardThe Reward won’t be given immediately after agent’s action.

Usually, it will be given only after achieving the goal.

This delayed reward is the only clue to agent’s learning.

Page 54: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

54

Resources: Datasets

UCI Repository: http://www.ics.uci.edu/~mlearn/MLRepository.html

UCI KDD Archive: http://kdd.ics.uci.edu/summary.data.application.html

Statlib: http://lib.stat.cmu.edu/ Delve: http://www.cs.utoronto.ca/~delve/

Page 55: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

55

Resources: Journals

Journal of Machine Learning Research www.jmlr.org Machine Learning Neural Computation Neural Networks IEEE Transactions on Neural Networks IEEE Transactions on Pattern Analysis and Machine

Intelligence Annals of Statistics Journal of the American Statistical Association ...

Page 56: INTRODUCTION TO Machine Learning Adapted from: ETHEM ALPAYDIN Lecture Slides for

56

Resources: Conferences

International Conference on Machine Learning (ICML) ICML05: http://icml.ais.fraunhofer.de/

European Conference on Machine Learning (ECML) ECML05: http://ecmlpkdd05.liacc.up.pt/

Neural Information Processing Systems (NIPS) NIPS05: http://nips.cc/

Uncertainty in Artificial Intelligence (UAI) UAI05: http://www.cs.toronto.edu/uai2005/

Computational Learning Theory (COLT) COLT05: http://learningtheory.org/colt2005/

International Joint Conference on Artificial Intelligence (IJCAI) IJCAI05: http://ijcai05.csd.abdn.ac.uk/

International Conference on Neural Networks (Europe) ICANN05: http://www.ibspan.waw.pl/ICANN-2005/

...