Fuzzy Modeling


  • 8/4/2019 Fuzzy Modeling

    1/65

    By: Saeed [email protected]

    Fuzzy modeling


    Introduction

    Classical approach:

    - Low accuracy for complicated systems

    - Not suitable for systems for which first-principles and theoretical methods are not fully developed

    Solution:

    1- Human parallel processing ---> neural networks

    2- Human reasoning and inference system ---> fuzzy models


    - Although neural networks have many advantages, they have three main problems:

    1- The learned knowledge is stored in parameters that are not interpretable

    2- Training is a nonlinear optimization problem

    3- Capturing expert knowledge is impossible


    Fuzzy models

    - A mathematical model that in some way uses fuzzy sets is called a fuzzy model [1].

    - A method for modeling complex, ill-defined, and less tractable systems.

    Rule structure: if (premise) then (consequent)

    - Premise: fuzzy sets; determines the validity of the rule

    - Consequent: the output of the rule; fuzzy sets (Mamdani) or functions (Takagi-Sugeno)

    Example (Mamdani): If pressure is high, then volume is small

    Example (TSK): If velocity is high, then force = k *


    - Two different ideas are behind these modeling approaches: while the Mamdani model tries to imitate the human reasoning mechanism, the Takagi-Sugeno model tries to represent the system by several simple local models when it cannot be described accurately by a single model. For this reason the Takagi-Sugeno model is sometimes called a local model.


    - Input space partitioning

    Grid partitioning: 1- ANFIS (Jang), 2- FUREGA

    Tree partitioning: LOLIMOT (Nelles)

    Scatter partitioning: clustering (Babuška)


    ANFIS (Adaptive-Network-Based Fuzzy Inference System)


    Main problems of fuzzy modeling before ANFIS:

    1) No standard methods exist for transforming human knowledge or experience into the rule base and database of a fuzzy inference system.

    2) There is a need for effective methods for tuning the membership functions (MFs) so as to minimize the output error measure or maximize a performance index.


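    Neural networks

    Neuron structure:

    Output of neuron: $y_k = \varphi\left(\sum_{j=1}^{m} w_{kj} x_j + b_k\right)$

    ($x_j$: inputs, $w_{kj}$: weights, $b_k$: bias, $\varphi$: activation function)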


    - Activation function ($\varphi$):

    The logistic function: $\varphi(v) = \frac{1}{1 + \exp(-v)}$

    Hyperbolic tangent: $\varphi(v) = \tanh(v)$

    The activation function gives neural networks their nonlinear behavior!
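    As a minimal sketch of a single neuron with the two activation functions named on this slide, the following Python snippet computes the weighted sum of the inputs plus a bias and passes it through the activation. The input, weight, and bias values are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def logistic(v):
    """Logistic activation: 1 / (1 + exp(-v))."""
    return 1.0 / (1.0 + np.exp(-v))

def neuron_output(x, w, b, activation=np.tanh):
    """Weighted sum of the inputs plus bias, passed through the activation."""
    return activation(np.dot(w, x) + b)

x = np.array([0.5, -1.0, 2.0])   # inputs (illustrative values)
w = np.array([0.4, 0.3, -0.2])   # weights (illustrative values)
b = 0.1                          # bias (illustrative value)

y_tanh = neuron_output(x, w, b)                            # output in (-1, 1)
y_logistic = neuron_output(x, w, b, activation=logistic)   # output in (0, 1)
```

    Without the activation function the neuron would only compute a linear map of its inputs, which is why the choice of activation is what makes the network nonlinear.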


    Multilayer perceptron (MLP):

    An arbitrary number of hidden layers can be used!


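    Training MLPs (backpropagation)

    - Training data: $\{u(i), y(i)\}$, $i = 1, \dots, N$

    ($u(i)$: input to the MLP, $y(i)$: desired output, $\hat{y}(i)$: MLP output for $u(i)$)

    - Cost function: $J = \frac{1}{2} \sum_{i=1}^{N} e^2(i)$, with $e(i) = y(i) - \hat{y}(i)$

    - What should be optimized: the neuron weights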


    - Optimization algorithm

    Steepest descent: the search direction is opposite to the gradient direction, where the gradient is the gradient of the output error with respect to the weights.

    - The most important advantage of this algorithm is that the gradient for each weight can be calculated with the aid of the gradients of the neurons in the next layer.


    - Training procedure:

    It is a two-pass optimization method. In the forward pass the inputs go through the MLP, so that the network output and the output error can be calculated. In the backward pass the error propagates from the output layer to the input layer and all of the MLP's weights are updated. This procedure is repeated over all data samples many times.
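    The two-pass procedure can be sketched for a small MLP with one hidden layer. This is only an illustration under stated assumptions: the toy target (a sine curve), the network size, the step size `eta`, the batch update, and the number of epochs are all chosen here for demonstration and do not come from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: learn y = sin(u) on [-2, 2] (illustrative choice).
U = np.linspace(-2.0, 2.0, 40).reshape(-1, 1)
Y = np.sin(U)

# One hidden layer with 8 tanh neurons and a linear output neuron.
W1 = rng.normal(scale=0.5, size=(1, 8))
b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1))
b2 = np.zeros(1)
eta = 0.05  # step size (assumed value)

def forward(U):
    H = np.tanh(U @ W1 + b1)      # hidden layer outputs
    return H, H @ W2 + b2         # network output

def cost(U, Y):
    _, Y_hat = forward(U)
    return 0.5 * np.mean((Y - Y_hat) ** 2)

cost_before = cost(U, Y)
for epoch in range(2000):
    # Forward pass: compute the outputs and the output error.
    H, Y_hat = forward(U)
    E = Y_hat - Y
    # Backward pass: the output-layer gradient is computed first and then
    # reused to obtain the gradient of the hidden layer.
    delta2 = E / len(U)
    delta1 = (delta2 @ W2.T) * (1.0 - H ** 2)   # tanh'(v) = 1 - tanh(v)^2
    # Steepest descent: move every weight against its gradient.
    W2 -= eta * H.T @ delta2
    b2 -= eta * delta2.sum(axis=0)
    W1 -= eta * U.T @ delta1
    b1 -= eta * delta1.sum(axis=0)
cost_after = cost(U, Y)
```

    The line computing `delta1` is where the gradient of the next layer (`delta2`) is reused, which is the property highlighted for steepest descent in MLP training.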


    Fuzzy Inference System (FIS)


    Fuzzy Inference System (FIS):

    1- Compare the input variables with the membership functions on the premise part to obtain the membership values (or compatibility measures) of each linguistic label. (This step is often called fuzzification.)

    2- Combine (through a specific T-norm operator, usually multiplication or min) the membership values on the premise part to get the firing strength (weight) of each rule.

    3- Generate the qualified consequent (either fuzzy or crisp) of each rule depending on the firing strength.

    4- Aggregate the qualified consequents to produce a crisp output. (This step is called defuzzification.)
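    The four steps can be sketched for a small TSK-type system with two inputs and two rules. Everything specific here is an assumption made for illustration: the Gaussian membership functions, their centers and widths, the linear consequent coefficients `p`, `q`, `r`, and the product T-norm.

```python
import numpy as np

def gauss(x, c, s):
    """Gaussian membership function with center c and width s."""
    return np.exp(-0.5 * ((x - c) / s) ** 2)

# Two rules over the inputs (x1, x2); all numbers are illustrative.
# Each rule: (center, width) for x1, (center, width) for x2, consequent [p, q, r].
rules = [
    ((0.0, 1.0), (0.0, 1.0), np.array([1.0, 1.0, 0.0])),    # "x1 small and x2 small"
    ((2.0, 1.0), (2.0, 1.0), np.array([-1.0, 0.5, 3.0])),   # "x1 large and x2 large"
]

def tsk_output(x1, x2):
    strengths, consequents = [], []
    for (c1, s1), (c2, s2), (p, q, r) in rules:
        # Step 1: fuzzification of both inputs.
        m1, m2 = gauss(x1, c1, s1), gauss(x2, c2, s2)
        # Step 2: firing strength with the product T-norm.
        strengths.append(m1 * m2)
        # Step 3: crisp consequent of the rule (first-order TSK).
        consequents.append(p * x1 + q * x2 + r)
    w = np.array(strengths)
    f = np.array(consequents)
    # Step 4: aggregate as a firing-strength-weighted average.
    return float(np.sum(w * f) / np.sum(w))

y = tsk_output(1.0, 1.0)
```

    At the point (1, 1) both rules fire equally, so the output is simply the mean of the two rule consequents; closer to (0, 0) the first rule dominates.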


    - Example (figure): fuzzy reasoning for Mamdani rules (type 1 and type 2) and for TSK rules


    Each of these if-then rules can be represented as an adaptive network:

    - Nodes with adaptive parameters (for example, the centers and widths of the membership functions)

    - Nodes with a fixed operation


    Example of a FIS with two inputs and three membership functions for each of the inputs


    Training procedure:

                             Forward pass             Backward pass
    Premise parameters       Fixed                    Gradient descent
    Consequent parameters    Least squares estimate   Fixed
    Signals                  Node outputs             Error rates

    Two passes in the hybrid learning procedure for ANFIS


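    Why the least squares algorithm can be used for the consequent parameters (for example, for the TSK model represented as an adaptive network above):

    With the premise parameters held fixed, the model output is a linear combination of the consequent parameters, so estimating them is a linear regression problem.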
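    As a worked illustration of why this is a linear regression problem, consider a two-rule first-order TSK model. The notation is an assumption borrowed from the ANFIS paper [4], not recovered from this slide: inputs $x$ and $y$, rule consequents $f_i = p_i x + q_i y + r_i$, and normalized firing strengths $\bar{w}_i$ that depend only on the (fixed) premise parameters.

```latex
\begin{aligned}
f &= \bar{w}_1 f_1 + \bar{w}_2 f_2 \\
  &= (\bar{w}_1 x)\,p_1 + (\bar{w}_1 y)\,q_1 + \bar{w}_1\,r_1
   + (\bar{w}_2 x)\,p_2 + (\bar{w}_2 y)\,q_2 + \bar{w}_2\,r_2 \\
  &= a^{T}\theta,
  \qquad a = [\bar{w}_1 x,\ \bar{w}_1 y,\ \bar{w}_1,\ \bar{w}_2 x,\ \bar{w}_2 y,\ \bar{w}_2]^{T},
  \quad \theta = [p_1,\ q_1,\ r_1,\ p_2,\ q_2,\ r_2]^{T}
\end{aligned}
```

    Stacking one row $a^{T}$ per training sample into a matrix $A$ and the desired outputs into a vector $d$ gives the usual least squares estimate $\hat{\theta} = (A^{T}A)^{-1}A^{T}d$, provided $A^{T}A$ is invertible.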


    - In the backward pass the gradient descent algorithm is used to optimize the premise parameters while the error propagates backward through the network (like backpropagation in neural networks).


    - Remark 1: since the consequent parameters are optimized in each iteration with the least squares algorithm, the nonlinear optimization problem in the backward pass can be solved more efficiently, and problems such as being trapped in local minima or slow convergence are less severe.


    - Remark 2: the TSK model is more popular in the ANFIS structure since it has more adjustable parameters in the consequents of the rules. This reduces the training time and effort, because the model output is linear in these parameters, so they can be estimated very efficiently with the least squares algorithm.


    - Remark 3: optimizing the premise parameters (input membership functions) sometimes deteriorates the interpretability of the rule base.


    Example: $0.6 \sin(\pi u) + 0.3 \sin(3\pi u) + 0.1 \sin(5\pi u)$, $u \in [-1, 1]$

    3 membership functions for each input (9 rules)


    4 membership functions for each input (16 rules)


    5 membership functions for each input (25 rules)

    Loss of interpretability


    FUREGA

    Fuzzy Rule Extraction using

    Genetic Algorithm


    FUREGA:

    1- Start from a grid-based network built using prior knowledge

    2- Selection of rules by a genetic algorithm

    3- Least squares for output parameter optimization

    4- Constrained nonlinear optimization of the membership functions


    Properties:

    Good chance of finding the best solution (accuracy)

    Time-consuming training

    Curse of dimensionality

    Interpretability?


    Local Linear Model Tree

    LOLIMOT


    What are local models?


    Example:


    LOLIMOT algorithm:

    - The algorithm has an outer loop (upper level) that determines the input partitions (structure) where the local linear models are valid, and an inner loop (lower level) that estimates the parameters of those local linear models by an efficient weighted least squares algorithm.

    Consequent parameter estimation:

    $\hat{y} = \sum_{i=1}^{M} \left(w_{i0} + w_{i1} u_1 + \dots + w_{ip} u_p\right) \Phi_i(u, c_i, \sigma_i)$

    $w_i$: local linear model parameters

    $u$: input vector

    $\Phi_i$: normalized Gaussian weighting function for the i-th model with center coordinates $c_i$ and standard deviations $\sigma_i$


    $\Phi_i(u, c_i, \sigma_i) = \frac{\mu_i}{\sum_{j=1}^{M} \mu_j}$

    Where:

    $\mu_i = \exp\left(-\frac{1}{2} \sum_{k=1}^{p} \frac{(u_k - c_{ik})^2}{\sigma_{ik}^2}\right)$

    - Assume the weighting functions have already been determined. Then the parameters of each linear model are estimated separately by a weighted least squares technique.

    With the data matrix $X$ (the known model inputs), the diagonal weighting matrix $Q_i$ (each entry is the weighting function value for the corresponding input data) and the desired outputs $y$, the optimal parameters of the model are:

    $\hat{w}_i = (X^T Q_i X)^{-1} X^T Q_i \, y$
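    The weighted least squares estimate of one local linear model, with data matrix X, diagonal weighting matrix Q and desired outputs y, can be sketched in a few lines of NumPy. The function name `local_wls`, the toy data, and the particular Gaussian weighting used below are assumptions for illustration only.

```python
import numpy as np

def local_wls(U, y, phi):
    """Weighted least squares estimate of one local linear model.

    U   : (N, p) input data
    y   : (N,)   desired outputs
    phi : (N,)   weighting-function values of this local model (diagonal of Q)
    """
    X = np.column_stack([np.ones(len(U)), U])   # regressors [1, u_1, ..., u_p]
    XtQ = X.T * phi                             # X^T Q without forming diag(phi)
    return np.linalg.solve(XtQ @ X, XtQ @ y)    # solves (X^T Q X) w = X^T Q y

rng = np.random.default_rng(1)
U = rng.uniform(-1.0, 1.0, size=(200, 2))
y = 2.0 + 3.0 * U[:, 0] - 1.0 * U[:, 1]         # exactly linear toy data
phi = np.exp(-0.5 * np.sum((U - 0.2) ** 2, axis=1) / 0.5 ** 2)

w_hat = local_wls(U, y, phi)
```

    Because the toy data are exactly linear, the estimate recovers the offset and both slopes regardless of the weighting; on nonlinear data the weighting makes the fit accurate mainly where `phi` is large.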


    - Input space partitioning

    1- Set the first hyper-rectangle in such a way that it contains all data points. Estimate a global linear model.

    2- For all input dimensions j = 1, ..., n:

    2a. Cut the hyper-rectangle into two halves along dimension j.

    2b. Estimate local linear models for each half.

    2c. Calculate the global approximation error (output error) for the model with this cut.

    3- Determine which cut has led to the smallest approximation error.


    4- Perform this cut. Place a weighting function at the center of each of the two hyper-rectangles. Set the standard deviations of both weighting functions proportional to the extension of the hyper-rectangle in each dimension. Apply the corresponding estimated local linear models (from 2b).

    5- Calculate the local error measures J on the basis of a parallel running model for each hyper-rectangle.

    6- Choose the hyper-rectangle with the largest local error measure J.

    7- If the global approximation error of a parallel model (output error) is too large, go to step 2.

    8- Convergence. Stop.
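    The partitioning loop can be sketched compactly in Python. This is a simplified illustration, not the reference implementation: it stops after a fixed number of local models (`max_models`) instead of testing an error threshold as in step 7, it uses a proportionality factor `k = 1/3` for the standard deviations, and the helper names `validity`, `fit`, and `lolimot` are made up for this sketch.

```python
import numpy as np

def validity(U, rects, k=1.0 / 3.0):
    """Normalized Gaussian weighting functions, one column per hyper-rectangle."""
    mu = []
    for lo, hi in rects:
        c = (lo + hi) / 2.0              # center of the hyper-rectangle
        s = k * (hi - lo)                # std. dev. proportional to its extension
        mu.append(np.exp(-0.5 * np.sum(((U - c) / s) ** 2, axis=1)))
    mu = np.array(mu).T
    return mu / mu.sum(axis=1, keepdims=True)

def fit(U, y, rects):
    """Local weighted least squares per rectangle; returns model output and validities."""
    Phi = validity(U, rects)
    X = np.column_stack([np.ones(len(U)), U])
    y_hat = np.zeros(len(U))
    for i in range(len(rects)):
        XtQ = X.T * Phi[:, i]
        w = np.linalg.lstsq(XtQ @ X, XtQ @ y, rcond=None)[0]
        y_hat += Phi[:, i] * (X @ w)
    return y_hat, Phi

def lolimot(U, y, max_models=6):
    rects = [(U.min(axis=0), U.max(axis=0))]         # step 1: one global model
    for _ in range(max_models - 1):
        y_hat, Phi = fit(U, y, rects)
        local_err = Phi.T @ (y - y_hat) ** 2         # steps 5-6: local error measures
        worst = int(np.argmax(local_err))
        lo, hi = rects[worst]
        best = None
        for j in range(U.shape[1]):                  # step 2: try a cut in each dimension
            mid = (lo[j] + hi[j]) / 2.0
            hi_a, lo_b = hi.copy(), lo.copy()
            hi_a[j], lo_b[j] = mid, mid
            trial = rects[:worst] + rects[worst + 1:] + [(lo, hi_a), (lo_b, hi)]
            err = np.sum((y - fit(U, y, trial)[0]) ** 2)
            if best is None or err < best[0]:        # step 3: keep the best cut
                best = (err, trial)
        rects = best[1]                              # step 4: perform the cut
    return rects

rng = np.random.default_rng(0)
U = rng.uniform(-1.0, 1.0, size=(300, 2))
y = np.sin(3.0 * U[:, 0])                            # nonlinear in the first input only
rects = lolimot(U, y, max_models=4)
err_global = np.sum((y - fit(U, y, [(U.min(axis=0), U.max(axis=0))])[0]) ** 2)
err_final = np.sum((y - fit(U, y, rects)[0]) ** 2)
```

    Only axis-orthogonal halving cuts are tried, which is what keeps the search cheap: each iteration fits local models for one candidate cut per input dimension.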


    LOLIMOT


    Example:


    Properties:

    High interpretability of the rules

    Automatic partitioning of the input space according to the system properties

    Different objective functions for the modeling error and for structure optimization

    Low sensitivity to user-selected parameters

    No curse of dimensionality for high-dimensional problems


    Implementing Hierarchical Fuzzy Clustering in Fuzzy Identification Using Weighted Fuzzy C-means


    Clustering

    - Definition: to divide the data set in such a way that objects belonging to the same cluster are as similar as possible and objects belonging to different clusters are as dissimilar as possible

    - Types

    1- Crisp

    2- Fuzzy

    - Properties

    1- Unsupervised learning task

    2- Nonlinear optimization

    3- Computational economy

    4- Needs user-defined parameters


    Fuzzy C-means (FCM)

    Cost function with fuzziness exponent m:

    m ---> 1: clusters ---> crisp

    m ---> ∞: clusters ---> fuzzy

    Iterative training
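    The iterative training of FCM can be sketched as follows. The sketch assumes the standard FCM formulation, in which the cost is the sum over all data points and clusters of the membership raised to the power m times the squared Euclidean distance to the cluster center; the function name `fcm`, the random initialization, and the fixed iteration count are choices made for this illustration.

```python
import numpy as np

def fcm(X, C, m=2.0, n_iter=100, seed=0):
    """Plain fuzzy C-means; returns the cluster centers and the membership matrix."""
    rng = np.random.default_rng(seed)
    mu = rng.random((len(X), C))
    mu /= mu.sum(axis=1, keepdims=True)          # memberships of each point sum to 1
    for _ in range(n_iter):
        um = mu ** m
        # Center update: membership-weighted mean of the data.
        centers = (um.T @ X) / um.sum(axis=0)[:, None]
        # Squared distances of every point to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)               # guard against zero distance
        # Membership update: inversely related to the distance.
        inv = d2 ** (-1.0 / (m - 1.0))
        mu = inv / inv.sum(axis=1, keepdims=True)
    return centers, mu

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, size=(50, 2)),
               rng.normal(3.0, 0.3, size=(50, 2))])
centers, mu = fcm(X, C=2)
```

    With `m` close to 1 the exponent `-1 / (m - 1)` becomes very large in magnitude, so the nearest center takes almost all of the membership and the partition approaches a crisp one.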


    Example of fuzzy C-means


    Weighted fuzzy C-means (WFCM): some points are more important than others


    Self-organizing map (SOM):

    The most famous neural-network-based clustering method

    K-means (crisp C-means) with sequential training


    SOM algorithm:

    1. Choose initial values for the C neuron vectors $c_j$, $j = 1, \dots, C$. This can be done by randomly picking C different data samples.

    2. Choose one sample u from the data set. This can be done either randomly or by systematically going through the whole data set.

    3. Calculate the distance of the selected data sample to all neuron vectors. Typically, the Euclidean distance measure is used. The neuron with the vector closest to the data sample is called the winner neuron.

    4. Update the vector of the winner neuron in a way that moves it toward the selected data sample u:

    $c_{win}^{new} = c_{win}^{old} + \eta \, (u - c_{win}^{old})$, where $\eta$ is the step size

    5. If any neuron vector has been moved significantly in the previous step, then go to Step 2; otherwise stop.
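    The steps above can be sketched as a winner-takes-all sequential update. As assumptions for this illustration, the loop runs for a fixed number of epochs instead of testing whether a neuron vector moved significantly (step 5), and the step size `eta` and the function name `sequential_kmeans` are made up here.

```python
import numpy as np

def sequential_kmeans(X, C, eta=0.1, n_epochs=20, seed=0):
    """Winner-takes-all sequential training of C neuron vectors."""
    rng = np.random.default_rng(seed)
    # Step 1: initialize with C different, randomly picked data samples.
    centers = X[rng.choice(len(X), size=C, replace=False)].astype(float)
    for _ in range(n_epochs):
        # Step 2: go through the whole data set in random order.
        for u in X[rng.permutation(len(X))]:
            # Step 3: Euclidean distance to all neuron vectors; closest one wins.
            dist = np.linalg.norm(centers - u, axis=1)
            win = np.argmin(dist)
            # Step 4: move the winner toward the selected sample.
            centers[win] += eta * (u - centers[win])
    return centers

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, size=(50, 2)),
               rng.normal(3.0, 0.3, size=(50, 2))])
centers = sequential_kmeans(X, C=2)
```

    Only the winner is updated in this sketch; a full SOM would also move the winner's neighbors on the map, which is omitted here to match the K-means-like description on the previous slide.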


    Fuzzy clustering for fuzzy identification

    It is an unsupervised learning task, so it does not need any additional data.

    The input-space term-sets are derived directly from the result of the clustering process.

    Computational economy


    Application of clustering in fuzzy modeling

    1- Applying clustering algorithms to input data only

    2- Applying clustering algorithms to output data only

    3- Applying clustering algorithms to a vector composed of input and output data


    FCM for input space partitioning

    - FCM requires a priori knowledge of the number of clusters

      - determining the number of clusters in an iterative manner

      - using optimal fuzzy clustering methods

    - Dependence of FCM on the initialization

      - hierarchical clustering

    - Interpretability of the final fuzzy model

      - model simplification methods


    Algorithm:


    1- Apply the SOM algorithm to classify the N data samples into n crisp clusters ($i = 1, \dots, n$).

    2- Select the n cluster centers ($c_i$, $i = 1, \dots, n$) from the previous step and assign a weight to each of them according to their relative cardinality.

    3- Apply WFCM to classify the n weighted cluster centers ($c_i$, $w_i$) into C new clusters.


    4- The centers of the Gaussian membership functions in the premise of the fuzzy rules are obtained by simply projecting the final cluster centers onto each axis. To calculate the respective standard deviations, utilize the fuzzy covariance matrix [5].

    5- Use weighted least squares to optimize the consequent parameters and steepest descent for the premise parameters (formulas in [5]).

    6- Merge similar membership functions for interpretability, using a similarity measure between membership functions.

    7- Optimize the consequent parameters again.
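    Two pieces of this algorithm are easy to make concrete: the weights from relative cardinality (step 2) and a weighted center update of the kind WFCM uses (step 3). The sketch below assumes one common WFCM formulation, in which each prototype's weight multiplies its membership term; the function names and the small label vector are hypothetical and serve only as an illustration.

```python
import numpy as np

def cardinality_weights(labels, n_clusters):
    """Weight of each crisp cluster = share of the data samples assigned to it."""
    counts = np.bincount(labels, minlength=n_clusters)
    return counts / counts.sum()

def weighted_centers(P, w, mu, m=2.0):
    """Weighted center update: prototypes P (n, d), weights w (n,), memberships mu (n, C)."""
    g = w[:, None] * mu ** m
    return (g.T @ P) / g.sum(axis=0)[:, None]

# Crisp cluster index of each of 10 data samples (illustrative values).
labels = np.array([0, 0, 1, 2, 2, 2, 2, 1, 0, 2])
w = cardinality_weights(labels, 3)

# Three prototypes on a line, grouped into two new clusters (illustrative values).
P = np.array([[0.0], [1.0], [10.0]])
mu = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
new_centers = weighted_centers(P, w, mu)
```

    A prototype that represents many original samples pulls the new cluster center toward itself more strongly, which is how the second clustering stage keeps information about the data density.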


    Example I:


    Example:

    Figure: data in the (X1, X2) plane clustered first by SOM and then by WFCM. Weights of the cluster centers given in the legend: green w = 2/51, red w = 4/51, black w = 10/51, dark blue w = 8/51, blue w = 3/51.


    Example (cont.):

    Figure: initial, final, and simplified term-sets for x1 (simplified labels: small, medium, large) and for x2 (simplified labels: small, large).

    R1: if x1 is small and x2 is small then y = 17.3 - 2.6 x1 + 1.4 x2

    R2: if x1 is medium and x2 is large then y = 7.5 - 2.9 x1 - 0.02 x2

    R3: if x1 is large and x2 is small then y = 4.7 + 2.7 x1 - 7.8 x2

    R4: if x1 is large and x2 is large then y = 2.8 - 0.2 x1 - 0.2 x2

    Error measures shown on the slide: J = 0.1801, J = 0.0018, J = 0.0154


    Example II:

    Figure: time series x plotted against t (t from 0 to 500, x between 0.4 and 1.3).

    Inputs: x(t-18), x(t-12), x(t-6), x(t)

    Output: x(t+6)
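    Building the training set for this kind of prediction task is mostly an indexing exercise. The sketch below shows one way to form the input rows and the targets from a sampled series; the function name `make_regressors` and the stand-in series are assumptions, and the actual time series of the example is not reproduced here.

```python
import numpy as np

def make_regressors(x, lags=(18, 12, 6, 0), horizon=6):
    """Build rows [x(t-18), x(t-12), x(t-6), x(t)] and targets x(t+6)."""
    x = np.asarray(x, dtype=float)
    # Time indices that have both a full history and a future target.
    t = np.arange(max(lags), len(x) - horizon)
    inputs = np.column_stack([x[t - lag] for lag in lags])
    targets = x[t + horizon]
    return inputs, targets

x = np.arange(30.0)            # stand-in series so the indexing is easy to check
inputs, targets = make_regressors(x)
```

    With the stand-in series, the first row of `inputs` holds the samples at times 0, 6, 12 and 18 and the first target is the sample at time 24, which makes the alignment easy to verify before using real data.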


    Example II (cont.):

    Figure: initial, final, and simplified term-sets for x(t-18) and x(t-12).


    Example II (cont.):

    Figure: initial, final, and simplified term-sets for x(t-6) and x(t).

    Error measures shown on the slide: J = 0.0166, J = 0.0072, J = 0.0128


    Benefits compared to similar approaches:

    It does not need any additional data

    Low sensitivity to user-selected parameters and initial conditions

    Computational economy curse of dimensionality

    interpretability

    Sensitivity to data distribution


    Universal approximator


    Proof:[6]


    References:

    1- Babuška, R. and Verbruggen, H. (2003). Neuro-fuzzy methods for nonlinear system identification. Annual Reviews in Control, 27, 73-85.

    2- Haykin, S. (1998). Neural Networks: A Comprehensive Foundation. Prentice Hall.

    4- Jang, J.-S. R. (1993). ANFIS: Adaptive-network-based fuzzy inference systems. IEEE Transactions on Systems, Man & Cybernetics, 23(3), 665-685.

    3- Nelles, O. and Isermann, R. (1996). Basis function networks for interpolation of local linear models. In: IEEE Conference on Decision and Control (CDC), 470-475.

    4- Nelles, O. (2002). Nonlinear System Identification. Springer Verlag, Berlin.

    5- Oliveira, J. V. and Pedrycz, W. (2007). Advances in Fuzzy Clustering and its Applications. John Wiley & Sons, chapter 12.

    6- Espinosa, J., Vandewalle, J., Wertz, V. (2004). Fuzzy Logic, Identification and Predictive Control. Springer Verlag, Berlin.


    Questions and Discussion

    Thanks for your attention