Fuzzy Systems - Fuzzy Rule...

Preview:

Citation preview

Fuzzy SystemsFuzzy Rule Generation

Rudolf Kruse Christian Moewes{kruse,cmoewes}@iws.cs.uni-magdeburg.de

Otto-von-Guericke University of MagdeburgFaculty of Computer Science

Department of Knowledge Processing and Language Engineering

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 1 / 61

Outline

1. Motivation

2. Extracting Grid-Based Fuzzy Rules

3. Extracting Individual Fuzzy Rules

4. Rule Generation by Fuzzy Clustering

5. Different Approaches

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 2 / 61

Extracting Fuzzy Systems from Data

How can fuzzy systems automatically be derived from example data?

• suppose, input space X and output space Y

• we observe n training patterns (xi , yi ) ∈ S ⊆ X × Y, 1 ≤ i ≤ n

• given numerical input, X = (X1, . . . , Xp) ⊂ IRp and thus xi 7→ x i

• fuzzy rule base shall approximate S

• classical two-step approach1. find fuzzy sets by

• either predefining them on input and output variables

• or constructing them throughout learning procedure

2. find fuzzy rules• either directly

• or iteratively

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 3 / 61

Outline

1. Motivation

2. Extracting Grid-Based Fuzzy RulesWang & Mendel AlgorithmHiggins & Goodman Algorithm

3. Extracting Individual Fuzzy Rules

4. Rule Generation by Fuzzy Clustering

5. Different Approaches

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 4 / 61

Fixed Grid-Based Algorithms

• each Xj is partitioned into small set of linguistic variables

• rules use all or subset of possible combinations

⇒ global granulation of X into tiles

R1,...,1 : if x1 is µ1,1 and . . . and xp is µ1,p then . . .. . .Rl1,...,lp : if x1 is µl1,1 and . . . and xp is µlp ,p then . . .

• lj (1 ≤ j ≤ p): number of linguistic values for Xj

• problems• exponentially many rules in high-dimensional spaces• fine grid causes very high computational costs• wrong choice of grid may skip extrema

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 5 / 61

Wang & Mendel Algorithm

• basic fuzzy rule learning method

1. predefine global grid for X and Y2. determine best possible output fuzzy set for each rule

• one run through S determines closest x to geometrical rule center

• closest output fuzzy value 7→ corresponding fuzzy rule

• problems if

1. function extrema far from away center point2. number of rules gets huge

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 6 / 61

1. Granulate X and Y

x1

1

0

µ1,1 µ1,2 µ1,3

x2

1 0

µ2,1

µ2,2

x1

R1,1R

1,2 R1,3

R2,1

R2,2 R2,3

• divide each Xj into ljequidistant triangularfuzzy sets

• similarly, Y isgranulated into lytriangular fuzzy sets

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 7 / 61

2. Determine best consequence for each rule

• for each (x, y) = (x1, . . . , xp, y) ∈ Scompute membership degree to each possible tile

min{

µk1,1(x1), . . . , µkp ,p(xp), µky (y)}

with 1 ≤ kj ≤ lj and 1 ≤ ky ≤ ly

• µkj ,j = membership function of kj -th linguistic value of Xj

• ∀k1, . . . , kp tile arg max1≤ky ≤ly µ(k1, . . . , kp, ky ) is 1 rule

R(k1,...,kp) : if x1 is µk1,1 and . . . and xp is µkp ,p then y is µky

• membership degree is used as rule weight β(k1,...,kp)

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 8 / 61

Determining y based on new x

• given arbitrary x, rules determine crisp output y

1. for each rule compute degree of fulfillment

µ(k1,...,kp)(x) = min{

µk1,1(x1), . . . , µkp ,p(xp)}

2. comput y by some kind of COG defuzzification

y =

l1,...,lp∑

k1=1,...,kp=1

β(k1,...,kp) · µ(k1,...,kp)(x) · y(k1,...,kp)

β(k1,...,kp) · µ(k1,...,kp)(x)

• y(k1,...,kp) = center of output region of R(k1,...,kp)

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 9 / 61

Example: Wang & Mendel Algorithm

given data points step 1: granulate X and Y

• example data set with one input and one output

• note that closest points to corresponding rules are red

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 10 / 61

Example: Wang & Mendel Algorithm (cont.)

step 2: generate rules resulting crisp approximation

• fuzzy rules are shown by their α = 0.5-cuts

• learned model misses extrema far away from rule centers

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 11 / 61

Example: Wang & Mendel Algorithm (cont.)

• generated rule base:

R1 : if x is zerox then y is mediumy

R2 : if x is smallx then y is mediumy

R3 : if x is mediumx then y is largey

R4 : if x is largex then y is mediumy

• intuitively, rule R2 should probably be used to describe minimum

R ′2 : if x is smallx then y is smally

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 12 / 61

Summary

• one pattern per rule is used to compute rule’s outcome

• high variance would lead to model failures

• predefined fixed grid yields to fuzzy model which

• either does not fit underlying function very well

• or consists of large number of rules

⇒ wish to automatically determine granulations of X and Y

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 13 / 61

Higgins & Goodman Algorithm

• extension of Wang & Mendel algorithm

1. only one membership function is used for each Xj and Y

⇒ one large rule covering entire feature space

2. new membership functions are placed at points of maximum error

• both steps are repeated until

• maximum number of divisions is reached or

• approximation error remains below threshold

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 14 / 61

1. Initialization

• create membership functionfor each Xj covering entiredomain range

• create membership functionfor Y at corner points of X

• at corner point, each Xj ismaximal or minimal of itsdomain range

• for each corner point, closestexample from S is used to addmembership function at itsoutput value

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 15 / 61

2. Adding new Membership Functions

• find point within S withmaximum error

• defuzzification equalsWang & Mendel

• for each Xj , add newmembership function atcorresponding value of“maximal error point”

⇒ perfectly described by model

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 16 / 61

3. Create new Cell-based Rule Set

• new rules: associate outputmembership functions withnewly created cells

⇒ take closest point to allmembership functions of X(equals Wang & Mendel)

• associated output membershipfunction is closest one tooutput value of “closest point”

• if output value of “closestpoint” is far away, new outputfunction is created

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 17 / 61

4. Termination Detection

• if error is below threshold (orif certain number of iterationshave been done),then stop algorithm

• otherwise continue at step 2

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 18 / 61

Summary

• this approach can model extrema better than Wang & Mendel

• it favors extrema

⇒ strong tendency to outliers

• data driven granulation: difficult to interpret

• greedy algorithm: grid is often suboptimal

• it is possible to simplify learned rule base, e.g.,

by rule ranking and search for best rules

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 19 / 61

Outline

1. Motivation

2. Extracting Grid-Based Fuzzy Rules

3. Extracting Individual Fuzzy RulesIndividual Membership FunctionsBerthold & Huber Algorithm

4. Rule Generation by Fuzzy Clustering

5. Different Approaches

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 20 / 61

Extensions avoiding Global Grids

• in high-dimensional X , global granulation leads to many rules

⇒ now, no global dependence on granulation• individual membership functions for each rule• better modeling of local properties

R1 : if x1 is µ1,1 and . . . and xp is µ1,p then . . .. . .

Rr : if x1 is µr ,1 and . . . and xp is µr ,p then . . .

• not all attributes will be used for all rules• individual choice of constraints on few attributes per rule• better interpretability in high dimensions• no exponential number of rules with increasing dimensionality

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 21 / 61

Local Granulation

x1

1

0

x2

1 0x1

µ1,1µ1,2 µ1,3

µ2,1

µ2,2

µ2,2

R1

R2 R3

• 3 rules in 2 dimensions

• compare with globalgranulation à laWang & Mendel

• possible disadvantage

• potential loss ofinterpretation

• projection of allfuzzy sets onto oneXj is usually notmeaningful

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 22 / 61

Berthold & Huber Algorithm

• constructs rule base with individual fuzzy sets per rule

• parameters that must be specified• granulation of Y, i.e., number and shape of membership functions• c fuzzy sets defined by µk

y with 1 ≤ k ≤ c

• algorithm iterates over S and fine-tunes evolving model

• final rule base consists of fuzzy rules Rkd , 1 ≤ d ≤ rk

• rk = number of rules for output region k

• output for k−th region and some x equals Mamdani controller

µk(x) = max1≤d≤rk

{

min1≤j≤p

{µkd,j(xj)}

}

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 23 / 61

Form of RulesRules

• all rules rely on trapezoidal membership functions

⇒ each rule can be described by 4 parameters per Xj

if x1 is 〈a1, b1, c1, d1〉 and . . . and xp is 〈ap, bp, cp , dp〉then y is µk

y

• however, if some trapezoids cover entire domain of an Xj ,

then rule’s degree of fulfillment is independent from of this Xj

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 24 / 61

Algorithm 1 Berthold & Huber

Input: S = {(x i , ci) | 1 ≤ i ≤ n} with x ∈ IRp and c ∈ IN class of x

do {for each training example (x, c) ∈ S {

if correct rule of class c exists {increase weight by one // COVERadjust core region of rule to cover x

} else {insert new rule with core equals x // COMMITsupport equals ∞ (i.e., rule is not constrained)

}reduce support of all rules of conflicting class that cover x //SHRINK

}} while changes have occurred

• COVER and COMMIT are easy to implement

• SHRINK is based on heuristics (e.g., volume-based)

Some Remarks on Core and Support Regions

• algorithm finds rule base that completely describes data

• each rule is partial hypothesis for subset S ⊂ S

• core = most specific hypothesis covering S

• support = (one of the) most general hypotheses covering S

⇒ support is more general than core

• both core and support regions can be seen as

• smallest area with highest degree of confidence (evidence)

• largest area without conflict (no counter-example)

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 26 / 61

Example: Berthold & Huber Algorithm

• given two-dimensional X andtraining data S where |Y| = 2

• task: fuzzy binaryclassification

• first, start with empty rulebase for each region/class

• Java applet is available thatdemonstrates algorithm

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 27 / 61

Example: Berthold & Huber Algorithm (cont.)

• insert general rule for firstexample pattern

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 28 / 61

Example: Berthold & Huber Algorithm (cont.)

• suppose that 2nd pattern isfrom different class

⇒ new rule is inserted for 2ndpattern

• also adjust 1st (conflicting)rule

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 29 / 61

Example: Berthold & Huber Algorithm (cont.)

• suppose that 3rd pattern isfrom same class as 2nd one

⇒ adjust free feature to avoidconflict with 3rd pattern

• and so on. . .

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 30 / 61

Choosing the Right Feature to Shrink

• in p-dimensional feature space, there are p choices

• algorithm uses several heuristics:

1. maximize remaining volume ⇒ low rule numbers, good coverage

2. minimize number of constrained attributes ⇒ feature reduction

3. minimize number of constraints on free features ⇒ interpretability

4. use information theoretic measures ⇒ feature importance

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 31 / 61

Outline

1. Motivation

2. Extracting Grid-Based Fuzzy Rules

3. Extracting Individual Fuzzy Rules

4. Rule Generation by Fuzzy ClusteringExtend Membership Values to Continuous Membership FunctionsExample: The Iris DataInformation Loss from ProjectionExample: Transfer Passenger Analysis

5. Different Approaches

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 32 / 61

Rule Generation by Fuzzy Clustering

1. apply fuzzy clustering to X ⇒ fuzzy partition matrix U = [uij ]

2. use obtained U = [uij ] to define membership functions

• usually X is multidimensional

⇒ How to specify meaningful labels for multidim. membershipfunctions?

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 33 / 61

Extend uij to Continuous Membership Functions

• assigning labels for one-dimensional domains is easier ⇒

1. project U down to X1, . . . , Xp axis, respectively2. only consider upper envelope of membership degrees3. linear interpolate membership values ⇒ membership functions4. cylindrically extend membership functions

• original clusters are interpreted as conjunction of cyl. extensions

• e.g., cylindrical extensions “x1 is low”, “x2 is high”⇒ multidimensional cluster label “x1 is low and x2 is high”

• labeled clusters = classes characterized by labels

• every cluster = one fuzzy rule

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 34 / 61

Convex Completion [Höppner et al., 1999]

• problem of this approach: non-convex fuzzy sets

⇒ having upper envelope, compute convex completion

• we denote p1, . . . , pn, p1 ≤ . . . , pk as ordered projections ofx1, . . . , xn and µi1, . . . , µin as respective membership values

• eliminate each point (pt , µit), t = 1, . . . , n, for which two limitindices tl , tr = 1, . . . , n, tl < t < tr , exist s.t.

muit < min{muitl, muitr }

• after that: apply linear interpolation of remaining points

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 35 / 61

Example: The Iris Datac© Iris Species Database http://www.badbear.com/signa/

Iris setosa Iris versicolor Iris virginica

• collected by Ronald Aylmer Fischer (famous statistician)• 150 cases in total, 50 cases per Iris flower type• measurements: sepal length/width, petal length/width (in cm)• most famous dataset in pattern recognition and data analysis

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 36 / 61

Example: The Iris Data

• shown: sepal length and petal length

• Iris setosa (red), Iris versicolor (green), Iris virginica (blue)

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 37 / 61

1. Membership Degrees from FCM

4 4.5 5 5.5 6 6.5 7 7.5 80

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

1 2 3 4 5 6 70

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

sepal length pedal length

• raw, unmodified membership degrees

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 38 / 61

2. Upper Envelope

4 4.5 5 5.5 6 6.5 7 7.5 80

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

1 2 3 4 5 6 70

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

sepal length pedal length

• for every attribute value and cluster center, only considermaximum membership degree

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 39 / 61

3. Convex Completion

4 4.5 5 5.5 6 6.5 7 7.5 80

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

1 2 3 4 5 6 70

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

sepal length pedal length

• convex completion removes spikes [Höppner et al., 1999]

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 40 / 61

4. Linear Interpolation

4 4.5 5 5.5 6 6.5 7 7.5 80

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

1 2 3 4 5 6 70

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

sepal length pedal length

• interpolation for missing values (needed for normalization)

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 41 / 61

5. Stretched and Normalized Fuzzy Sets

4 4.5 5 5.5 6 6.5 7 7.5 80

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

1 2 3 4 5 6 70

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

sepal length pedal length

• every µi(xj) 7→ µi(xj)5 (extends core and support)

• normalization has been performed finally

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 42 / 61

Information Loss from Projection

• rule derived from fuzzy clusterrepresents approximation of cluster

• information gets lost by projection

• cluster shape of FCM is spherical

• cluster projection leads tohypercube

• hypercube contains hypersphere

• loss of information can be kept smallusing axes-parallel clusters

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 43 / 61

Example: Transfer Passenger Analysis[Keller and Kruse, 2002]

• German Aerospace Center (DLR) developed macroscopicpassenger flow model for simulating passenger movements onairport’s land side

• for passenger movements in terminal areas: distribution functionsare used today

• goal: build fuzzy rule base describing transfer passenger amountbetween aircrafts

• these rules can be used to improve macroscopic simulation

• idea: find rules based on probabilistic fuzzy c-means (FCM)

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 44 / 61

Attributes for Passenger Analysis

• maximal amount of passengers in certain aircraft (depending ontype of aircraft)

• distance between airport of departure and airport of destination(in three categories: short-, medium-, and long-haul)

• time of departure

• percentage of transfer passengers in aircraft

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 45 / 61

General Clustering Procedure

evaluation

calculation

preparation

extraction offuzzy rules

sufficientclassifica-

tion?

preprocessing

calculationof prototypes

parameterselection

calculation ofmembership

degrees

initialization

identificationof outliers

scale adaptionclusteringtechnique

number of clustersor validity measure

similaritymeasure

no

yes

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 46 / 61

Distance Measure

• distance between x = (x1, x2) and c = (0, 0)

−1

0

1

−1

−0.5

0

0.5

10

0.5

1

1.5

2

−1

0

1

−1

−0.5

0

0.5

10

1

2

3

4

d2(c, x) = ‖c − x‖2d2

τ(c, x) = 1

τp ‖c − x‖2

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 47 / 61

Distance Measure with Size Adaption

d2ij =

1

τpi

· ‖c i − x j‖2

c i =

∑nj=1 um

ij x j∑n

j=1 umij

τi =

(

∑nj=1 um

ij d2ij

)1

p+1

∑ck=1

(

∑nj=1 um

kj d2kj

)1

p+1

· τ

τ =c

i=1

τi

• p determines emphasis put on size adaption during clustering

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 48 / 61

Constraints for the Objective function

• probabilistic clustering

• noise clustering

• influence of outliers

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 49 / 61

Probabilistic and Noise Clustering

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 50 / 61

Influence of Outliers

• A weighting factor ωj is attached to each datum x j

• weighting factors are adapted during clustering

• using concept of weighting factors:

• outliers in data set can be identified and

• outliers’ influence on partition is reduced

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 51 / 61

Membership Degrees and Weighting Factors

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 52 / 61

Influence of Outliers

• minimize objective function

J(X , U, C) =c

i=1

n∑

j=1

umij ·

1

ωqj

· d2ij

subject to

∀j ∈ [n] :c

i=1

uij = 1, ∀i ∈ [c] :n

j=1

uij > 0,

n∑

j=1

ωj = ω

• q determines emphasis put on weight adaption during clustering

• update equations for memberships and weights, resp.

uij =d

21−m

ij

∑ck=1 d

21−m

kj

, ωj =

(

∑ci=1 um

ij d2ij

)1

q+1

∑nk=1

(∑c

i=1 umik d2

ik

)1

q+1

· ω

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 53 / 61

Determining the Number of Clusters

• here, validity measuresevaluating whole partition ofdata

⇒ global validity measures

• clustering is run for varyingnumber of clusters

• validity of resulting partitionsis compared

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 54 / 61

Fuzzy Rules and Induced Vague Areas

• intensity of color indicatesfiring strength of specific rule

• vague areas = fuzzy clusterswhere color intensity indicatesmembership degree

• tips of fuzzy partitions insingle domains = projectionsof multidimensional clustercenters

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 55 / 61

Simplification of Fuzzy Rules

• similar fuzzy sets arecombined to one fuzzy set

• fuzzy sets similar to universalfuzzy set are removed

• rules with same input sets are

• combined if they also havesame output set(s) or

• otherwise removed fromrule set

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 56 / 61

Results

• FCM with c = 18, outlier and size adaptation, Euclidean distance:

resulting fuzzy sets simplified fuzzy sets

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 57 / 61

Evaluation of the Rule Baserule max. no. of pax De st. depart. % transfer pax

1 paxmax1 R1 time1 tpax12 paxmax2 R1 time2 tpax23 paxmax3 R1 time3 tpax34 paxmax4 R1 time4 tpax45 paxmax5 R5 time1 tpax5. . . . . . . . . . . . . . .

• rules 1 and 5: aircraft with relatively small amount of maximalpassengers (80-200), short- to medium-haul destination, anddeparting late at night usually have high amount of transferpassengers (80-90%)

• rule 2: flights with medium-haul destination and small aircraft(about 150 passengers), starting about noon, carry relatively highamount of transfer passengers (ca. 70%)

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 58 / 61

Outline

1. Motivation

2. Extracting Grid-Based Fuzzy Rules

3. Extracting Individual Fuzzy Rules

4. Rule Generation by Fuzzy Clustering

5. Different Approaches

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 59 / 61

Different Approaches

• constructive: find fuzzy rules by growing singletons

• hierarchical : merge grid cells if no points are covered or sameclass is predicted

• adaptive:

• initialize rules randomly (e.g., with expert knowledge) anditeratively optimize rule parameters (e.g., location, number offuzzy sets)

• based on, e.g., gradient descent, neural networks, . . .

• evolutionary : find rules by mutation/crossover over generations

• neuro-fuzzy : inject fuzzy rules into ANN, use its learningalgorithm

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 60 / 61

Literature about Fuzzy Rule Generation

Berthold, M. R. and Hand, D. J., editors (2003).Intelligent Data Analysis: An Introduction.Springer, Berlin, Germany, 2nd edition.

Borgelt, C., Klawonn, F., Kruse, R., and Nauck, D.(2003).Neuro-Fuzzy-Systeme: Von den Grundlagen künstlicherNeuronaler Netze zur Kopplung mit Fuzzy-Systemen.Vieweg, Wiesbaden, Germany, 3rd edition.

Höppner, F., Klawonn, F., Kruse, R., and Runkler, T.(1999).Fuzzy Cluster Analysis: Methods for Classification, DataAnalysis and Image Recognition.John Wiley & Sons Ltd, New York, NY, USA.

Keller, A. and Kruse, R. (2002).Fuzzy rule generation for transfer passenger analysis.In Wang, L., Halgamuge, S. K., and Yao, X., editors,Proceedings of the 1st International Conference on FuzzySystems and Knowledge Discovery (FSDK’02), pages667–671, Orchid Country Club, Singapore.

R. Kruse, C. Moewes Fuzzy Systems – Fuzzy Rule Generation 2010/01/27 61 / 61

Recommended