60
1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h such that h f given a training set of examples (This is a highly simplified model of real learning: Ignores prior knowledge Assumes examples are given)

1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

Embed Size (px)

Citation preview

Page 1: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

1

Inductive learning

• Simplest form: learn a function from examplesf is the target function

An example is a pair (x, f(x))

Problem: find a hypothesis hsuch that h ≈ fgiven a training set of examples

(This is a highly simplified model of real learning:– Ignores prior knowledge– Assumes examples are given)

Page 2: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

2

Inductive learning method

• Construct/adjust h to agree with f on training set• (h is consistent if it agrees with f on all examples)

• E.g., curve fitting:•

Page 3: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

3

Inductive learning method

• Construct/adjust h to agree with f on training set• (h is consistent if it agrees with f on all examples)

• E.g., curve fitting:•

Page 4: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

4

Inductive learning method

• Construct/adjust h to agree with f on training set• (h is consistent if it agrees with f on all examples)

• E.g., curve fitting:•

Page 5: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

5

Inductive learning method

• Construct/adjust h to agree with f on training set• (h is consistent if it agrees with f on all examples)

• E.g., curve fitting:•

Page 6: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

6

Inductive learning method

• Construct/adjust h to agree with f on training set• (h is consistent if it agrees with f on all examples)

• E.g., curve fitting:

Page 7: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

7

Inductive learning method

• Construct/adjust h to agree with f on training set• (h is consistent if it agrees with f on all examples)

• Ockham’s razor: prefer the simplest hypothesis consistent with data

• E.g., curve fitting:

Page 8: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

8

Learning decision trees

Problem: decide whether to wait for a table at a restaurant, based on the following attributes:1. Alternate: is there an alternative restaurant nearby?2. Bar: is there a comfortable bar area to wait in?3. Fri/Sat: is today Friday or Saturday?4. Hungry: are we hungry?5. Patrons: number of people in the restaurant (None, Some, Full)6. Price: price range ($, $$, $$$)7. Raining: is it raining outside?8. Reservation: have we made a reservation?9. Type: kind of restaurant (French, Italian, Thai, Burger)10. WaitEstimate: estimated waiting time (0-10, 10-30, 30-60, >60)

Page 9: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

9

Attribute-based representations

• Examples described by attribute values (Boolean, discrete, continuous)

• E.g., situations where I will/won't wait for a table:

• Classification of examples is positive (T) or negative (F)

Page 10: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

10

Decision trees

• One possible representation for hypotheses• E.g., here is the “true” tree for deciding whether to wait:

Page 11: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

11

Expressiveness• Decision trees can express any function of the input attributes.• E.g., for Boolean functions, truth table row → path to leaf:

• Trivially, there is a consistent decision tree for any training set with one path to leaf for each example (unless f nondeterministic in x) but it probably won't generalize to new examples

• Prefer to find more compact decision trees

Page 12: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

12

Hypothesis spaces

How many distinct decision trees with n Boolean attributes?

= number of Boolean functions

= number of distinct truth tables with 2n rows = 22n

• E.g., with 6 Boolean attributes, there are 18,446,744,073,709,551,616 trees

Page 13: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

13

3.1 Decision Trees An Algorithm for Building

Decision Trees1. Let T be the set of training instances.2. Choose an attribute that best differentiates the instances in T.3. Create a tree node whose value is the chosen attribute.

-Create child links from this node where each link represents a unique value for the chosen attribute.-Use the child link values to further subdivide the instances into

subclasses. 4. For each subclass created in step 3: -If the instances in the subclass satisfy predefined criteria or if the set of

remaining attribute choices for this path is null, specify the classification for new instances following this decision path. -If the subclass does not satisfy the criteria and there is at least one

attribute to further subdivide the path of the tree, let T be the current set of subclass instances and return to step 2.

Page 14: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

14

Table 3.1 • The Credit Card Promotion Database

Income Life Insurance Credit CardRange Promotion Insurance Sex Age

40–50K No No Male 4530–40K Yes No Female 4040–50K No No Male 4230–40K Yes Yes Male 4350–60K Yes No Female 3820–30K No No Female 5530–40K Yes Yes Male 3520–30K No No Male 2730–40K No No Male 4330–40K Yes No Female 4140–50K Yes No Female 4320–30K Yes No Male 2950–60K Yes No Female 3940–50K No No Male 5520–30K Yes Yes Female 19

Table 3.1

Page 15: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

15

Age

Sex

<= 43

Male

Yes (6/0)

Female

> 43

CreditCard

Insurance

YesNo

No (4/1) Yes (2/0)

No (3/0)

Page 16: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

16

IncomeRange

30-40K

4 Yes1 No

2 Yes2 No

1 Yes3 No

2 Yes

50-60K40-50K20-30K

Figure 3.1

Page 17: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

17

CreditCard

Insurance

No Yes

3 Yes0 No

6 Yes6 No

Page 18: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

18

Age

<= 43 > 43

0 Yes3 No

9 Yes3 No

Page 19: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

19

Age

Sex

<= 43

Male

Yes (6/0)

Female

> 43

CreditCard

Insurance

YesNo

No (4/1) Yes (2/0)

No (3/0)Decision Trees for the Credit Card

Promotion Database

Figure 3.4

Page 20: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

20

CreditCard

Insurance

Sex

No

Male

Yes (6/1)

Female

Yes

Yes (3/0)

No (6/1)

Figure 3.5

Page 21: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

21

Table 3.2 • Training Data Instances Following the Path in Figure 3.4 to Credit CardInsurance = No

Income Life Insurance Credit CardRange Promotion Insurance Sex Age

40–50K No No Male 4220–30K No No Male 2730–40K No No Male 4320–30K Yes No Male 29

Page 22: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

22

Decision Tree Rules A Rule for the Tree in Figure 3.4

IF Age <=43 & Sex = Male & Credit Card Insurance = No

THEN Life Insurance Promotion = No

Page 23: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

23

A Simplified Rule Obtained by Removing Attribute Age

IF Sex = Male & Credit Card Insurance = No

THEN Life Insurance Promotion = No

Page 24: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

24

Other Methods for Building Decision Trees

• CART

• CHAID

Page 25: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

25

Advantages of Decision Trees

• Easy to understand.

• Map nicely to a set of production rules.• Applied to real problems.• Make no prior assumptions about the data.• Able to process both numerical and categorical data.

Page 26: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

26

Disadvantages of Decision Trees

• Output attribute must be categorical.

• Limited to one output attribute.• Decision tree algorithms are unstable.• Trees created from numeric datasets can be complex.

Page 27: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

27

3.2 Generating Association Rules Confidence and Support

Page 28: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

28

Rule Confidence

Given a rule of the form “If A then B”, rule confidence is the conditional probability that B is true when A is known to be true.

Page 29: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

29

Rule Confidence

• If customers purchase milk they also purchase bread

• If customers purchase bread they also purchase milk

Page 30: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

30

Rule Support

The minimum percentage of instances in the database that contain all items listed in a given association rule.

Page 31: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

31

Mining Association Rules: An Example

Page 32: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

32

Table 3.3 • A Subset of the Credit Card Promotion Database

Magazine Watch Life Insurance Credit CardPromotion Promotion Promotion Insurance Sex

Yes No No No MaleYes Yes Yes No FemaleNo No No No MaleYes Yes Yes Yes MaleYes No Yes No FemaleNo No No No FemaleYes No Yes Yes MaleNo Yes No No MaleYes No No No MaleYes Yes Yes No Female

Page 33: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

33

Table 3.4 • Single-Item Sets Single-Item Sets Number of Items Magazine Promotion = Yes 7 Watch Promotion = Yes 4 Watch Promotion = No 6 Life Insurance Promotion = Yes 5 Life Insurance Promotion = No 5 Credit Card Insurance = No 8 Sex = Male 6 Sex = Female 4

Page 34: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

34

Table 3.5 • Two-Item Sets

Two-Item Sets Number of Items

Magazine Promotion = Yes & Watch Promotion = No 4Magazine Promotion = Yes & Life Insurance Promotion = Yes 5Magazine Promotion = Yes & Credit Card Insurance = No 5Magazine Promotion = Yes & Sex = Male 4Watch Promotion = No & Life Insurance Promotion = No 4Watch Promotion = No & Credit Card Insurance = No 5Watch Promotion = No & Sex = Male 4Life Insurance Promotion = No & Credit Card Insurance = No 5Life Insurance Promotion = No & Sex = Male 4Credit Card Insurance = No & Sex = Male 4Credit Card Insurance = No & Sex = Female 4

Page 35: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

35

Two-item set rules

IF Magazine Promotion = Yes

THEN Life Insurance Promotion = Yes

(5/7)(5/10)

IF Life Insurance Promotion = Yes

THEN Magazine Promotion = Yes

(5/5) (5/10)

Page 36: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

36

Three-item sets

• Watch Promotion=No & Life Insurance Promotion=No & Credit Card Insurance=No

Page 37: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

37

Three-item set rules

IF Watch Promotion=No & Life Insurance Promotion=No

THEN Credit Card Insurance=No

(4/4)

IF Watch Promotion = No

THEN Life Insurance Promotion=No & Credit Card Insurance=No

(4/6)

Page 38: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

38

General Considerations

• We are interested in association rules that show a lift in product sales where the lift is the result of the product’s association with one or more other products.

• We are also interested in association rules that show a lower than expected confidence for a particular association.

Page 39: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

39

3.3 The K-Means Algorithm

1. Choose a value for K, the total number of clusters.

2. Randomly choose K points as cluster centers.

3. Assign the remaining instances to their closest cluster center.

4. Calculate a new cluster center for each cluster.

5. Repeat steps 3-5 until the cluster centers do not change.

Page 40: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

40

Table 3.6 • K-Means Input Values

Instance X Y

1 1.0 1.52 1.0 4.53 2.0 1.54 2.0 3.55 3.0 2.56 5.0 6.0

An Example Using K-Means

Page 41: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

41

0

1

2

3

4

5

6

7

0 1 2 3 4 5 6

f(x)

x

Page 42: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

42

Table 3.7 • Several Applications of the K-Means Algorithm (K = 2)

Outcome Cluster Centers Cluster Points Squared Error

1 (2.67,4.67) 2, 4, 614.50

(2.00,1.83) 1, 3, 5

2 (1.5,1.5) 1, 315.94

(2.75,4.125) 2, 4, 5, 6

3 (1.8,2.7) 1, 2, 3, 4, 59.60

(5,6) 6

Page 43: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

43

0

1

2

3

4

5

6

7

0 1 2 3 4 5 6

x

f(x)

Page 44: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

44

General Considerations

• Requires real-valued data.

• We must select the number of clusters present in the data.

• Works best when the clusters in the data are of approximately equal size.

• Attribute significance cannot be determined.• Lacks explanation capabilities.

Page 45: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

45

3.4 Genetic Learning

Page 46: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

46

Genetic Learning Operators

• Crossover

• Mutation

• Selection

Page 47: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

47

Genetic Algorithms and Supervised Learning

Page 48: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

48

FitnessFunction

PopulationElements

Candidatesfor Crossover

& Mutation

TrainingData

Keep

Throw

Page 49: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

49

Table 3.8 • An Initial Population for Supervised Genetic Learning

Population Income Life Insurance Credit CardElement Range Promotion Insurance Sex Age

1 20–30K No Yes Male 30–392 30–40K Yes No Female 50–593 ? No No Male 40–494 30–40K Yes Yes Male 40–49

Page 50: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

50

Table 3.9 • Training Data for Genetic Learning

Training Income Life Insurance Credit CardInstance Range Promotion Insurance Sex Age

1 30–40K Yes Yes Male 30–392 30–40K Yes No Female 40–493 50–60K Yes No Female 30–394 20–30K No No Female 50–595 20–30K No No Male 20–296 30–40K No No Male 40–49

Page 51: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

51

PopulationElement

AgeSexCredit CardInsurance

Life InsurancePromotion

IncomeRange

#1 30-39MaleYesNo20-30K

PopulationElement

AgeSexCredit CardInsurance

Life InsurancePromotion

IncomeRange

#2 50-59FemNoYes30-40K

PopulationElement

AgeSexCredit CardInsurance

Life InsurancePromotion

IncomeRange

#2 30-39MaleYesYes30-40K

PopulationElement

AgeSexCredit CardInsurance

Life InsurancePromotion

IncomeRange

#1 50-59FemNoNo20-30K

Page 52: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

52

Table 3.10 • A Second-Generation Population

Population Income Life Insurance Credit CardElement Range Promotion Insurance Sex Age

1 20–30K No No Female 50–592 30–40K Yes Yes Male 30–393 ? No No Male 40–494 30–40K Yes Yes Male 40–49

Page 53: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

53

Genetic Algorithms and Unsupervised Clustering

Page 54: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

54

a1 a2 a3 . . . an

.

.

.

.

I1

Ip

I2.....

Pinstances

S1

Ek2

Ek1

E22

E21

E12

E11

SK

S2

Solutions

.

.

.

Page 55: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

55

Table 3.11 • A First-Generation Population for Unsupervised Clustering

S1

S2

S3

Solution elements (1.0,1.0) (3.0,2.0) (4.0,3.0)(initial population) (5.0,5.0) (3.0,5.0) (5.0,1.0)

Fitness score 11.31 9.78 15.55

Solution elements (5.0,1.0) (3.0,2.0) (4.0,3.0)(second generation) (5.0,5.0) (3.0,5.0) (1.0,1.0)

Fitness score 17.96 9.78 11.34

Solution elements (5.0,5.0) (3.0,2.0) (4.0,3.0)(third generation) (1.0,5.0) (3.0,5.0) (1.0,1.0)

Fitness score 13.64 9.78 11.34

Page 56: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

56

General Considerations

• Global optimization is not a guarantee.

• The fitness function determines the complexity of the algorithm.• Explain their results provided the fitness

function is understandable.• Transforming the data to a form suitable for

genetic learning can be a challenge.

Page 57: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

57

3.5 Choosing a Data Mining Technique

Page 58: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

58

Initial Considerations

• Is learning supervised or unsupervised?

• Is explanation required?–Neural networks, regression models are black-box

• What is the interaction between input and output attributes?• What are the data types of the input and output

attributes?

Page 59: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

59

Further Considerations

• Do We Know the Distribution of the Data?–Many statistical techniques assume the data to be normally distributed

• Do We Know Which Attributes Best Define the Data?

–Decision trees and certain statistical approaches–Neural network, nearest neighbor, various clustering approaches

Page 60: 1 Inductive learning Simplest form: learn a function from examples f is the target function An example is a pair (x, f(x)) Problem: find a hypothesis h

60

Further Considerations

• Does the Data Contain Missing Values?– Neural networks

• Is Time an Issue?– Decision trees

• Which Technique Is Most Likely to Give a Best Test Set Accuracy?– Multiple model approaches (Chp. 11)