34
Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20 th , 2011 Some pictures are from Burr and Aarti’s lectures 1

Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Active Learning

Yi Zhang10-701, Machine Learning, Spring 2011April 20th, 2011

Some pictures are from Burr and Aarti’s lectures

1

Page 2: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Outline

Basic idea of active learning Supervised, semi-supervised, and active

learning Uncertainty sampling Version space reduction and query by

committee Expected error reduction Other active learning methods

2

Page 3: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Why active learning?

Learning classifiers using labeled examples

But obtaining labels is◦ Time consuming, e.g., document classification◦ Expensive, e.g., medical decision (need doctors)◦ Sometimes dangerous, e.g., landmine detection

3

Page 4: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Active learning

The learner actively chooses which examples to label !

Goal: reduce the number of labeled examples needed for learning

4

Page 5: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Active learning

5

Page 6: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Can active learning work?

Are all labeled examples equally important?

6

Page 7: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Outline

Basic idea of active learning Supervised, semi-supervised, and

active learning Uncertainty sampling Version space reduction and query by

committee Expected error reduction Other active learning methods

7

Page 8: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

(Passive) supervised learning

8

Page 9: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Semi-supervised learning

9

Page 10: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Active learning

10

Page 11: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Active learning vs. semi-supervised learning The same goal:◦ Attain good learning performance (e.g.,

classification accuracy) without demanding too many labeled examples

Different approaches◦ Semi-supervised learning: use unlabeled data◦ Active learning: choose labeled examples

11

Page 12: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Outline

Basic idea of active learning Supervised, semi-supervised, and active

learning Uncertainty sampling Version space reduction and query by

committee Expected error reduction Other active learning methods

12

Page 13: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Three active learning scenarios

Query synthesis◦ learner constructs examples for labeling

Selective sampling◦ Unlabeled data come as a stream◦ For each arrived point, learner decides to query

or discard

Pool-based active learning (*)◦ Given a pool of unlabeled data◦ Learner chooses from the pool for labeling

13

Page 14: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Pool-based active learning

A pool of unlabeled samples A learned model (or a set of models) Choose: which sample to label next?

14

Page 15: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Uncertainty sampling

Uncertainty sampling◦ Query the sample x that the learner is most

uncertain about◦ How to measure “uncertainty”?

15

Page 16: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Uncertainty sampling

Uncertainty sampling◦ Maximum entropy

16

Page 17: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Uncertainty sampling

Uncertainty sampling◦ Smallest margin (between most likely and

second most likely labels)

17

Page 18: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Uncertainty sampling

Uncertainty sampling◦ Least confidence

18

Page 19: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Uncertainty sampling

19

Page 20: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Uncertainty sampling

20

Page 21: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Outline

Basic idea of active learning Supervised, semi-supervised, and active

learning Uncertainty sampling Version space reduction and query by

committee Expected error reduction Other active learning methods

21

Page 22: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Version space

The set of classifiers that are consistent with labeled examples

22

Page 23: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Recall: uncertainty sampling

Uncertainty sampling

23

Page 24: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Version space reduction

Version space reduction◦ Query the sample x that reduces the version most “Expected” reduction of version space “Worst case” reduction of version space “Best-case” reduction of version space

24

Page 25: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

A simplifying example

How to query? Binary search !

25

Page 26: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

A simplifying example Why binary search ?

◦ Recall: version space

◦ Maximize the “worst-case” reduction of version space

26

Page 27: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Version space reduction for SVMs

27

Page 28: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Query by committee

Query by committee◦ Keep a committee of classifiers◦ Query the instance that the committee members

disagree

28

Page 29: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Query by committee

29

Page 30: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Query by committee Query by committee◦ Keep a committee of classifiers◦ Query the instance that the committee members

disagree

QBC as version space reduction◦ Committee is an approximation to the version space

QBC as uncertainty sampling◦ Use committee members to measure the uncertainty

30

Page 31: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Outline

Basic idea of active learning Supervised, semi-supervised, and active

learning Uncertainty sampling Version space reduction and query by

committee Expected error reduction Other active learning methods

31

Page 32: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Expected error (uncertainty) reduction

32

Page 33: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Outline

Basic idea of active learning Supervised, semi-supervised, and active

learning Uncertainty sampling Version space reduction and query by

committee Expected error reduction Other active learning methods

33

Page 34: Active Learning - Carnegie Mellon School of Computer Sciencetom/10701_sp11/recitations/Recitation_13.pdf · Active Learning Yi Zhang 10-701, Machine Learning, Spring 2011 April 20th,

Other active learning methods

Active learning is an active research area Cost sensitive active learning◦ Labeling costs of examples differ

Batch mode active learning◦ Query multiple instances at once

Multi-task active learning◦ Query labels for multiple learning tasks

See survey of Burr Settles ! [Settles, 2008]

34