35
Last lecture summary

Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

Embed Size (px)

Citation preview

Page 1: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

Last lecture summary

Page 2: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

SOM

• supervised x unsupervised• regression x classification• Topology?• Main features?• Codebook vector?• Output from the neuron?

Page 3: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

Compettive learning

• BMU• Scaling• Which neuron gets updated?• How it will be updated?• Topology preservation• Neighborhood

Page 4: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?
Page 5: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?
Page 6: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

• Variable parameters– NS (neighborhood strength)– Neighborhood size– Learning rate

1 NS , 1j j jt t t d t t t w w x w

Page 7: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

Multidimensional data

• IRIS (attributes: sepal length, sepal widht, petal length, petal width)

Page 8: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

• Since we have class labels, we can assess the classification accuracy of the map.

• So first we train the map using all 150 patterns.

• And then we present input patterns individually again and note the winning neuron.– The class to which the input belongs is the class

associated with this BMU codebook vector (see previous slide, Class panel).

– Only the winner decides classification.

Page 9: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

Vers (2) – 100% accuracySet (1) – 86%Virg (3) – 88%Overall accuracy = 91.3%

Vers (2) – 100% accuracySet (1) – 90%Virg (3) – 94%Overall accuracy = 94.7%

Sandhya Samarasinghe, Neural Networks for Applied Sciences and Engineering, 2006

Only winner decides the classification

Neighborhood of size 2 decides the classification

Page 10: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

U-matrix• Distances between the neighboring codebook

vectors can highlight different cluster regions in the map and can be a useful visualization tool

• Two neurons: w1 = {w11, w21, … wn1}, w2 = {w12, w22, … wn2}

• Euclidean distance between them

• The average of the distance to the nearest neighbors – unified distance, U -matrix

2 2 2

12 11 12 21 22 1 2n nd w w w w w w

Page 11: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

The larger the distance between neurons, the larger (i.e., lighter color) is the U value.

Large distance between this cluster (Iris versicolor) and the middle cluster (Iris setosa). Large distances between codebook vectors indicate a sharp boundary between the clusters.

Page 12: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

Surface graphThe height represents the distance.

3rd row – large height = separation

Other two clusters are not separated.

Page 13: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

Quantization error• Measure of the distance between codebook

vectors and inputs.• If for input vector x the winner is wc, then

distortion error e can be calculated as

• Comput e for all input vectors and get average – quantization error, average map distortion error E.

, NS ,cii

e d e d c ix w x w

1 1, NS ,ci

n n i

E d E dN N

n ni ix w x w

Page 14: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

Iris quantization error

High distortion error indicates areas where the codebook vector is relatively far from the inputs. Such information can be used to refine the map to obtain a more uniform distortion error measure if a more faithful reproduction of the input distribution from the map is desired.

Page 15: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

Genetic algorithms(new stuff)

Page 16: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

Optimization in DM

f’(x) = 0f’’(x) > 0 … minimumf’’(x) < 0 … maximum

Page 17: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

Optimization in DM

• traditional methods (exact)– e.g. gradient based methods

• heuristics (approximate)– deterministic– stochastic (chance)• e.g. genetic algorithms, simulated annealing, ant colony

optimization, particle swarm optimization

Page 18: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

Optimization in DM

• Applications of optimization techniques in DM are numerous.

• Optimize parameters to obtain the best performance.

• Optimize weights in NN• From many features, find the best (small)

subset giving the best performance (feature selection).

• …

Page 19: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

Biology Inspiration• Every organism has a set of rules describing how that

organism is built up from the tiny building blocks of life. These rules are encoded in genes.

• Genes are connected together into long strings called chromosomes.

locus

• gene for color of teeth• allele for blue teeth

• Genes + alleles = genotype.• Physical expression of the

genotype = phenotype.

http://biology.unm.edu/ccouncil/Biology_124/Images/chromosome.gif

Page 20: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

• When two organisms mate they share their genes. The resultant offspring may end up having half the genes from one parent and half from the other. This process is called recombination (crossover).

• Very occasionally a gene may be mutated.

http://members.cox.net/amgough/Chromosome_recombination-01_05_04.jpg

Page 21: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

• Life on earth has evolved through the processes of natural selection, recombination and mutation.

• The individuals with better traits will survive longer and produce more offsprings.– Their survivability is given by their fitness.

• This continues to happen, with the individuals becoming more suited to their environment every generation.

• It was this continuous improvement that inspired John Holland in 1970’s to create genetic algorithms.

Page 22: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

GA step by step

• Objective: find the maximum of the function O(x1, x2) = x1

2 + x22

– This function is called objective function.– And it will be use to evaluate the fitness.

Adopted from Genetic Algorithms – A step by step tutorial, Max Moorkap, Barcelona, 29th November 2005

Page 23: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

Encoding

• A model parameters (x1, x2) are encoded into binary strings.

• How to encode (and decode back) a real number as a binary string?– For each real valued variable x we need to know:• the domain of the variable x ϵ [xL,xU]• length of the gene k

Page 24: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

5-bit x1 x2xL -1 0xU 1 3.1

Stepsize 0.0645 0.1

x1 ϵ [-1, 1] x2 ϵ [0, 3.1]

c1 = (0101110011) → (01011) = -1 + 11 * 0.0645 = -0.29

(10011) = 0 + 19 * 0.1 = 1.9chromosome

gene

stepsize=𝑥𝑈−𝑥 𝐿

2𝑛−1

number=𝑥 𝐿+stepsize×(decoded value of string )

Page 25: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

• At the start a population of N random models is generated

• c1 = (0101110011) → (01011) = -1 + 11 * 0.0645 = -0.29

(10011) = 0 + 19 * 0.1 = 1.9• c2 = (1111010110) → (11110) = -1 + 30 * 0.0645 = 0.935

(10110) = 0 + 22 * 0.1 = 2.2• c3 = (1001010001) → (10010) = -1 + 18 * 0.0645 = 0.161

(10001) = 0 + 17 * 0.1 = 1.7• c4 = (0110100001) → (01101) = -1 + 13 * 0.0645 = -0.161

(00001) = 0 + 1 * 0.1 = 0.1

Page 26: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

• For each member of the population calculate the value of the objective function O(x1, x2) = x1

2 + x22

O1 = O(-0.29, 1.9) = 3.69

O2 = O(0.935, 2.2) = 5.71

O3 = O(0.161, 1.7) = 2.92

O4 = O(-0.161, 0.1) = 0.04

genotype phenotype

Page 27: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

• Chromosome with bigger fitness has higher probability to be selected for breeding.

• We will use the following formula

O1 = 3.69O2 = 5.71O3 = 2.92O4 = 0.04

∑Oj = 12.36

P1 = 0.30P2 = 0.46P3 = 0.24P4 = 0.003

𝑃 𝑖=𝑂𝑖

∑𝑗=1

𝑛

𝑂 𝑗

Page 28: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

Roulette wheel

p2 (46%)

p3 (24%)

p1(30%)

P4 (0.3%)

Page 29: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

• Now select two chromosomes according to roulette wheel.– Allow the same chromosome to be selected more

than once for breeding.• These two chromosomes will:

1. cross over2. mutate

• Let’s say c2 = (1111010110) and c3 = (1001010001) chromosomes were selected.

• With probability Pc these two chromosomes will exchange their parts at the randomly selected locus (crossover point).

Page 30: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

1111010110

1001010001

1111010001

1001010110

Pc

1111010001

1001010110

Pm

Page 31: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

1111010110

1001010001

1111010001

1001010110

Pc

1110010001

1001010110

Pm

Page 32: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

1111010110

1001010001

1111010001

1001010110

Pc

1110010001

1001011110

Pm

Page 33: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

• Crossover point is selected randomly.• Pc generally should be high, about 80%-95%– If the crossover is not performed, just clone two parents

into new generation.

• Pm should be low, about 0.5%-1%– Perform mutation on each of the two offsprings at each

locus.

• Very big population size usually does not improve performance of GA.– Good size: 20-30, sometimes 50-100 reported as

best– Depends on size of encoded string

Page 34: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

• Repeat previous steps till the size of new population reaches N. – The new population replaces the old one.

• Each cycle throught this algorithm is called generation.

• Check whether termination criteria have been met.– Change in the mean fitness from generation to

generation.– Preset the number of generation.

Page 35: Last lecture summary. SOM supervised x unsupervised regression x classification Topology? Main features? Codebook vector? Output from the neuron?

1. [Start] Generate random population of N chromosomes (suitable solutions for the problem)

2. [Fitness] Evaluate the fitness f(x) of each chromosome x in the population 3. [New population] Create a new population by repeating following steps

until the new population is complete 1. [Selection] Select two parent chromosomes from a population

according to their fitness (the better fitness, the bigger chance to be selected)

2. [Crossover] With a crossover probability cross over the parents to form a new offspring (children). If no crossover was performed, offspring is an exact copy of parents.

3. [Mutation] With a mutation probability mutate new offspring at each locus (position in chromosome).

4. [Accepting] Place new offspring in a new population 4. [Replace] Use new generated population for a further run of algorithm 5. [Test] If the end condition is satisfied, stop, and return the best solution in

current population 6. [Loop] Go to step 2.