Learning Law in Neural Networks


Learning

In a neural network, learning is the ability to adapt to the environment and to improve performance through that adaptation.

Learning implies that the processing element somehow changes its input/output behavior in response to the environment. For example, if the processing element originally gives an output of +1 in response to a particular input pattern, it might give an output of -1 to that same input pattern after learning takes place. The processing element has somehow changed its mind about what the correct response to that input should be. What does the processing element do to make this change?

The output is computed as a result of a transfer function of the weighted input. The net input for this simple case is computed by multiplying the value of each individual input by its corresponding weight, or equivalently, taking the dot product of the input and weight vectors. The processing element then takes this input value and applies the transfer function to it to compute the resulting output.
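As a minimal sketch of this computation (the function name and the choice of a simple threshold transfer function are illustrative assumptions, not taken from the text):

```python
import numpy as np

def processing_element(inputs, weights, transfer=None):
    """Output of a processing element: transfer(net input), where the
    net input is the dot product of the input and weight vectors."""
    net = np.dot(inputs, weights)                 # weighted sum of the inputs
    if transfer is None:
        transfer = lambda n: 1 if n >= 0 else -1  # simple threshold transfer function
    return transfer(net)

# The same input pattern maps to +1 or -1 depending on the weights;
# changing the weights is exactly what learning does.
x = np.array([0.5, -1.0, 0.25])
print(processing_element(x, np.array([0.8, 0.1, 0.3])))   # +1
print(processing_element(x, np.array([-0.8, 0.1, 0.3])))  # -1
```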

activation function - A function by which the new output of a basic unit is derived from a combination of the net inputs and the current state of the unit (the total input).

axon - The part of a nerve cell through which impulses travel away from the cell body; the electrically active part of a nerve cell.

    back-propagation - A learning algorithm for a multilayer network in which the weights are

    modified via the propagation of an error signal "backward" from the outputs to the inputs.

connection - A pathway between processing elements, either positive or negative, that links the processing elements into a network.

dendrite - The branched part of a nerve cell that carries impulses toward the cell body; the electrically passive parts of a nerve cell.

learning - The phase in a neural network when new data is introduced into the network, causing the weights on the processing elements to be adjusted.

neuron - The structural and functional unit of the nervous system, consisting of the nerve cell body and all its processes, including an axon and one or more dendrites.


perceptron - A large class of simple neuron-like networks with only an input layer and an output layer. Developed in 1957 by Frank Rosenblatt, this class of neural network has no hidden layer.

summation function - A function that combines the various input activations into a single activation.

synapse - The point of contact between adjacent neurons where nerve impulses are transmitted from one to another.

threshold - A minimum level of excitation energy.

    training - A process whereby a network learns to associate an input pattern with the correct

    answer.

weight - The strength of an input connection, expressed by a real number. Processing elements receive input via interconnects, and each interconnect has a weight attached to it. The weighted sum of the inputs makes up the value that updates the processing element. The output value of a processing element is described by a level of excitation that causes interconnects to be either on (i.e. excitatory output) or off (i.e. inhibitory output).

    Architecture of neural networks

    Feed-forward networks

Feed-forward ANNs (figure 1) allow signals to travel one way only: from input to output. There is no feedback (loops); i.e., the output of any layer does not affect that same layer. Feed-forward ANNs tend to be straightforward networks that associate inputs with outputs. They are extensively used in pattern recognition. This type of organisation is also referred to as bottom-up or top-down.

    Feedback networks

    Feedback networks (figure 1) can have signals travelling in both directions by

    introducing loops in the network. Feedback networks are very powerful and can get

    extremely complicated. Feedback networks are dynamic; their 'state' is changing

    continuously until they reach an equilibrium point. They remain at the equilibrium

    point until the input changes and a new equilibrium needs to be found. Feedback

    architectures are also referred to as interactive or recurrent, although the latter term is

    often used to denote feedback connections in single-layer organisations.
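As a rough sketch of this settling behaviour (the weight matrix, input values, and sign-threshold update are illustrative assumptions, not a specific published model):

```python
import numpy as np

# A toy feedback network: each step's output is fed back in as input,
# and the state keeps changing until it reaches an equilibrium point.
W = np.array([[0.0, 0.5],
              [0.5, 0.0]])  # symmetric lateral (feedback) connections

def settle(x, W, max_steps=100):
    state = np.sign(x)
    for _ in range(max_steps):
        new_state = np.sign(W @ state + x)    # feedback loop: outputs re-enter the net
        if np.array_equal(new_state, state):  # equilibrium; stays here until x changes
            break
        state = new_state
    return state

print(settle(np.array([1.0, -0.2]), W))  # settles after a few updates
```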


Competitive learning is a form of unsupervised learning in artificial neural networks, in which nodes compete for the right to respond to a subset of the input data. A variant of Hebbian learning, competitive learning works by increasing the specialization of each node in the network. It is well suited to finding clusters within data.

Models and algorithms based on the principle of competitive learning include vector quantization and self-organising maps (Kohonen maps).


Competitive learning is usually implemented with neural networks that contain a hidden layer which is commonly known as the competitive layer.[2]

Every competitive neuron i is described by a vector of weights w_i = (w_i1, ..., w_id) and calculates the similarity measure between the input data x = (x_1, ..., x_d) and the weight vector w_i.

For every input vector, the competitive neurons compete with each other to see which one of them is the most similar to that particular input vector. The winner neuron m sets its output o_m = 1, and all the other competitive neurons set their output o_i = 0, i ≠ m.

Usually, in order to measure similarity, the inverse of the Euclidean distance ||x - w_i|| between the input vector x and the weight vector w_i is used.

    Example

    Here is a simple competitive learning algorithm to find three clusters within some input data.

    1. (Set-up.) Let a set of sensors all feed into three different nodes, so that every node is connected to

    every sensor. Let the weights that each node gives to its sensors be set randomly between 0.0 and 1.0.

    Let the output of each node be the sum of all its sensors, each sensor's signal strength being multiplied

    by its weight.

    2. When the net is shown an input, the node with the highest output is deemed the winner. The input is

    classified as being within the cluster corresponding to that node.

    3. The winner updates each of its weights, moving weight from the connections that gave it weaker

    signals to the connections that gave it stronger signals.

Thus, as more data are received, each node converges on the centre of the cluster that it has come to represent and activates more strongly for inputs in this cluster and more weakly for inputs in other clusters.
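A minimal sketch of steps 1-3 in code (the toy data, the learning rate, and the common update w += lr * (x - w) used to realize step 3 are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n_sensors, n_nodes, lr = 4, 3, 0.1

# 1. Set-up: every node is connected to every sensor; weights random in [0.0, 1.0].
weights = rng.random((n_nodes, n_sensors))

def train(data, weights, epochs=50):
    for _ in range(epochs):
        for x in data:
            outputs = weights @ x        # node output: weighted sum of its sensors
            winner = np.argmax(outputs)  # 2. highest output wins; x joins that cluster
            # 3. The winner shifts its weights toward the signals it just saw.
            weights[winner] += lr * (x - weights[winner])
    return weights

# Toy data scattered around three cluster centres (illustrative).
centres = np.array([[1, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 1]], dtype=float)
data = np.vstack([c + 0.05 * rng.standard_normal((20, 4)) for c in centres])
print(train(data, weights).round(2))  # each row ends up near one cluster centre
```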


    LEARNING TYPES

Supervised or Active Learning - learning with an external teacher or supervisor who presents a training set to the network.

Unsupervised or Self-Organized Learning - does not require an external teacher. During the training session, the neural network receives a number of different input patterns, discovers significant features in these patterns, and learns how to classify input data into appropriate categories. Unsupervised learning can be used in real time.

Reinforcement Learning - the output is evaluated with the help of a feedback signal: if the output matches the desired response, the reward is +1; otherwise it is 0 or -1.

Unsupervised Learning Types:

Adaptive Resonance Theory (ART) is a theory developed by Stephen Grossberg and Gail Carpenter on aspects of how the brain processes information. ART networks consist of an input layer and an output layer.

Adaptive Resonance Theory (ART) networks perform completely unsupervised learning.

- Carpenter and Grossberg (1987)
- On-line clustering algorithm
- Recurrent ANN
- Competitive output layer
- Data clustering applications
- Stability-plasticity dilemma:
  - Stability: system behaviour doesn't change after irrelevant events
  - Plasticity: the system adapts its behaviour according to significant events
  - Dilemma: how to achieve stability without rigidity, and plasticity without chaos?
- Ongoing learning capability
- Preservation of learned knowledge


ART Architecture

- Bottom-up weights b_ij and top-down weights t_ij store the class template
- Input nodes: vigilance test, input normalisation
- Output nodes: forward matching
- Long-term memory: the ANN weights
- Short-term memory: the ANN activation pattern

    ART Types

    ART1: Unsupervised Clustering of binary input vectors.

    ART2: Unsupervised Clustering of real-valued input vectors.

ART3: Incorporates "chemical transmitters" to control the search process in a hierarchical ART structure.

ARTMAP: Supervised version of ART that can learn arbitrary mappings of binary patterns.

Fuzzy ART: Synthesis of ART and fuzzy logic.

Fuzzy ARTMAP: Supervised fuzzy ART.

dART and dARTMAP: Distributed code representations in the F2 layer (an extension of the winner-takes-all approach).

    Gaussian ARTMAP

ART 1 - Binary Adaptive Resonance Theory

ART 1 is the simplest variety of ART networks, accepting only binary inputs.

The basic structure of an ART1 neural network involves:

- an input processing field (the F1 layer), which consists of two parts:
  - an input portion (F1(a))
  - an interface portion (F1(b))
- the cluster units (the F2 layer)
- a mechanism to control the degree of similarity of patterns placed on the same cluster
- a reset mechanism
- weighted bottom-up connections between the F1 and F2 layers
- weighted top-down connections between the F2 and F1 layers

Reset Module

- Fixed connection weights
- Implements the vigilance test (a sketch follows the Gain Module list below)
- Excitatory connection from F1(b)
- Inhibitory connection from F1(a)
- Output of the reset module is inhibitory to the output layer
- Disables a firing output node if the match with the pattern is not close enough
- The reset signal lasts as long as the pattern is present

Gain Module

- Fixed connection weights
- Controls the activation cycle of the input layer
- Excitatory connection from the input lines
- Inhibitory connection from the output layer
- Output of the gain module is excitatory to the input layer
- Implements the 2/3 rule for the input layer
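As a rough sketch of the vigilance test these modules implement for binary ART1 patterns (the matching criterion |pattern AND template| / |pattern| >= vigilance is the standard ART1 rule; the function and variable names are illustrative):

```python
import numpy as np

def vigilance_test(pattern, template, rho=0.8):
    """Accept the winning F2 node only if its top-down template
    matches enough of the input pattern (vigilance rho)."""
    pattern = np.asarray(pattern, dtype=bool)
    template = np.asarray(template, dtype=bool)
    match = np.logical_and(pattern, template).sum() / pattern.sum()
    return match >= rho  # False fires the reset module, disabling that output node

print(vigilance_test([1, 1, 0, 1], [1, 1, 0, 0]))  # 2/3 match, below 0.8 -> reset
print(vigilance_test([1, 1, 0, 1], [1, 1, 0, 1]))  # perfect match -> resonance, learn
```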

ART1 example: character recognition

ART 2 - Analog Adaptive Resonance Theory

ART 2 extends network capabilities to support continuous inputs.

Unsupervised clustering for:

    Real-valued input vectors

    Binary input vectors that are noisy

    Includes a combination of normalization and noise suppression


    Fast Learning

    Weights reach equilibrium in each learning trial

Weights have some of the same characteristics as the weights found by ART1

    More appropriate for data in which the primary information is contained in the pattern of

    components that are small or large

    Slow Learning

    Only one weight update iteration performed on each learning trial

    Needs more epochs than fast learning

    More appropriate for data in which the relative size of the nonzero components is

    important

ART 2-A - Streamlined Adaptive Resonance Theory

ART 2-A is a streamlined form of ART-2 with a drastically accelerated runtime, and with qualitative results being only rarely inferior to the full ART-2 implementation.

Applications of ART

    Natural language processing

    Document clustering

    Document retrieval

    Automatic query

    Image segmentation


    Character recognition

    Data mining

    Data set partitioning

    Detection of emerging clusters

    Fuzzy partitioning

    Condition-action association

    Hebbian learning

In 1949, Donald Hebb proposed one of the key ideas in biological learning, commonly known as Hebb's Law. Hebb's Law states that if neuron i is near enough to excite neuron j and repeatedly participates in its activation, the synaptic connection between these two neurons is strengthened and neuron j becomes more sensitive to stimuli from neuron i.

Hebb's Law can be represented in the form of two rules:

    1. If two neurons on either side of a connection are activated synchronously, then the weight of

    that connection is increased.

    2. If two neurons on either side of a connection are activated asynchronously, then the weight of

    that connection is decreased.

Hebb's Law provides the basis for learning without a teacher. Learning here is a local phenomenon occurring without feedback from the environment.
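A minimal sketch of the two rules as a weight update (the learning rate and the product-of-activations form are the usual textbook formulation, assumed here):

```python
def hebbian_update(w, pre, post, lr=0.1):
    """Hebb's Law: the weight grows when the neurons on either side of the
    connection fire together, and shrinks when they fire apart."""
    return w + lr * pre * post

w = 0.5
w = hebbian_update(w, pre=+1, post=+1)  # rule 1: synchronous -> weight increases
print(w)  # 0.6
w = hebbian_update(w, pre=+1, post=-1)  # rule 2: asynchronous -> weight decreases
print(w)  # 0.5 again
```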

[Figure: input signals feed neuron i, whose output provides the stimulus to neuron j.]


    Competitive learning

    In competitive learning, neurons compete among themselves to be activated.

    While in Hebbian learning, several output neurons can be activated simultaneously, in

    competitive learning, only a single output neuron is active at any time.

The output neuron that wins the competition is called the winner-takes-all neuron.

    The basic idea of competitive learning was introduced in the early 1970s.

    In the late 1980s, Teuvo Kohonen introduced a special class of artificial neural networks called

    self-organising feature maps. These maps are based on competitive learning.

    SOM

    What is a self-organising feature map?

    Our brain is dominated by the cerebral cortex, a very complex structure of billions of neurons and

    hundreds of billions of synapses. The cortex includes areas that are responsible for different human

activities (motor, visual, auditory, somatosensory, etc.), and associated with different sensory inputs. We can say that each sensory input is mapped into a corresponding area of the cerebral cortex. The cortex is a self-organising computational map in the human brain.


The Kohonen network

- The Kohonen model provides a topological mapping. It places a fixed number of input patterns from the input layer into a higher-dimensional output or Kohonen layer.
- Training in the Kohonen network begins with the winner's neighbourhood of a fairly large size. Then, as training proceeds, the neighbourhood size gradually decreases (a sketch of this follows below).
- The lateral connections are used to create a competition between neurons. The neuron with the largest activation level among all neurons in the output layer becomes the winner. This neuron is the only neuron that produces an output signal. The activity of all other neurons is suppressed in the competition.
- The lateral feedback connections produce excitatory or inhibitory effects, depending on the distance from the winning neuron. This is achieved by the use of a Mexican hat function, which describes the synaptic weights between neurons in the Kohonen layer.
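A compact sketch of this training loop on a one-dimensional Kohonen layer; a Gaussian neighbourhood that shrinks over time stands in for the Mexican hat function, and all sizes and rates are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
n_outputs, n_inputs = 10, 2
weights = rng.random((n_outputs, n_inputs))
positions = np.arange(n_outputs)  # neuron coordinates in the Kohonen layer

def som_step(x, weights, t, t_max, lr0=0.5, sigma0=3.0):
    # Competition: the neuron whose weights are closest to x wins.
    winner = np.argmin(np.linalg.norm(weights - x, axis=1))
    # The neighbourhood starts large and gradually decreases as training proceeds.
    frac = 1 - t / t_max
    sigma = sigma0 * frac + 0.5
    lr = lr0 * frac + 0.01
    h = np.exp(-((positions - winner) ** 2) / (2 * sigma ** 2))  # neighbourhood
    weights += lr * h[:, None] * (x - weights)  # winner and neighbours move toward x
    return weights

data = rng.random((200, n_inputs))
for t, x in enumerate(data):
    weights = som_step(x, weights, t, t_max=len(data))
print(weights.round(2))  # the weights now spread over the input space
```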

    Supervised Learning Types

Hopfield Network

[Figure: a network with an input layer (x1, x2) and an output layer (y1, y2, y3); input signals flow in and output signals flow out.]


    PERCEPTRON

In machine learning, the perceptron is an algorithm for supervised classification of an input into one of several possible non-binary outputs. It is a type of linear classifier, i.e. a classification algorithm that makes its predictions based on a linear predictor function combining a set of weights with the feature vector describing a given input, using the delta rule. The learning algorithm for perceptrons is an online algorithm, in that it processes elements in the training set one at a time.

    The perceptron algorithm was invented in 1957 at the Cornell Aeronautical Laboratory by Frank

    Rosenblatt.

The perceptron is a binary classifier which maps its input x (a real-valued vector) to an output value f(x) (a single binary value):

f(x) = 1 if w · x + b > 0, and f(x) = 0 otherwise,

where w is a vector of real-valued weights, w · x is the dot product (which here computes a weighted sum), and b is the 'bias', a constant term that does not depend on any input value.

The value of f(x) (0 or 1) is used to classify x as either a positive or a negative instance, in the case of a binary classification problem. If b is negative, then the weighted combination of inputs must produce a positive value greater than |b| in order to push the classifier neuron over the 0 threshold. Spatially, the bias alters the position (though not the orientation) of the decision boundary. The perceptron learning algorithm does not terminate if the learning set is not linearly separable. If the vectors are not linearly separable, learning will never reach a point where all vectors are classified properly. The most famous example of the perceptron's inability to solve problems with linearly non-separable vectors is the Boolean exclusive-or problem. The solution spaces of decision boundaries for all binary functions and learning behaviors are studied in the reference.
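A short sketch of the perceptron learning rule on a linearly separable problem (Boolean AND); running the same loop on XOR targets never settles on weights that classify all four points, illustrating the limitation above. The learning rate and epoch count are illustrative:

```python
import numpy as np

def train_perceptron(X, y, lr=0.1, epochs=20):
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            f = 1 if np.dot(w, x) + b > 0 else 0  # f(x) = 1 if w.x + b > 0, else 0
            w += lr * (target - f) * x            # adjust only on a wrong prediction
            b += lr * (target - f)
    return w, b

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
w, b = train_perceptron(X, np.array([0, 0, 0, 1]))  # AND is linearly separable
print([int(np.dot(w, x) + b > 0) for x in X])       # [0, 0, 0, 1]
```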

In the context of artificial neural networks, a perceptron is an artificial neuron using the Heaviside step function as the activation function. The perceptron algorithm is also termed the single-layer perceptron, to distinguish it from a multilayer perceptron, which is a misnomer for a more complicated neural network. As a linear classifier, the single-layer perceptron is the simplest feedforward neural network.

ADALINE (Adaptive Linear Neuron, or later Adaptive Linear Element) is an early single-layer neural network and the name of the physical device that implemented this network.[1] It was developed by Professor Bernard Widrow and his graduate student Ted Hoff at Stanford University in 1960. It is based on the McCulloch-Pitts neuron. It consists of a weight, a bias and a summation function.

The difference between Adaline and the standard (McCulloch-Pitts) perceptron is that in the learning phase the weights are adjusted according to the weighted sum of the inputs (the net). In the standard perceptron, the net is passed to the activation (transfer) function and the function's output is used for adjusting the weights.

There also exists an extension known as Madaline.


Adaline is a single-layer neural network with multiple nodes, where each node accepts multiple inputs and generates one output. Given the following variables:

x is the input vector
w is the weight vector
n is the number of inputs
θ is some constant (the bias)
y is the output

the output is y = sum_{j=1 to n} x_j w_j + θ. If we further assume a constant extra input x_0 = 1 with weight w_0 = θ, then the output reduces to the dot product of x and w: y = x · w.
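A small numeric sketch of that reduction (all values illustrative): folding the constant θ into the weight vector, with a fixed extra input of 1, turns the output into a plain dot product.

```python
import numpy as np

x = np.array([0.5, -1.0, 2.0])  # n = 3 inputs
w = np.array([0.2, 0.4, 0.1])   # weight vector
theta = 0.3                     # the constant (bias) term

y = np.dot(x, w) + theta        # y = sum_j x_j * w_j + theta

# Fold theta in: add a constant input x_0 = 1 with weight w_0 = theta.
x_aug = np.concatenate(([1.0], x))
w_aug = np.concatenate(([theta], w))
assert np.isclose(y, np.dot(x_aug, w_aug))  # output reduces to the dot product
print(y)  # 0.2
```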