# Estimation Of Distribution Algorithm based on Markov Random Fields



• Estimation of Distribution Algorithm based on Markov Random Fields. Siddhartha Shakya, School of Computing, The Robert Gordon University

Siddhartha Shakya

• Outline: from GAs to EDAs; probabilistic graphical models in EDAs (Bayesian networks, Markov random fields); the fitness modelling approach to estimating and sampling an MRF in an EDA (the Gibbs distribution, energy functions and modelling the fitness); estimating the parameters (fitness modelling approach); sampling the MRF (several different approaches); conclusion


• Genetic Algorithms (GAs): a population-based optimisation technique, based on Darwin's theory of evolution. A solution is encoded as a string of symbols known as a chromosome. A population of solutions is generated, and genetic operators are then applied to the population to produce the next generation, which replaces the parent population.
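As an illustration of the loop described above, here is a minimal GA sketch on the OneMax problem (maximise the number of 1s); the population size, tournament selection, one-point crossover and mutation rate are illustrative assumptions, not taken from the slides.

```python
import random

# Minimal GA sketch for the OneMax problem (maximise the number of 1s).
# Population size, tournament selection, one-point crossover and the
# mutation rate are illustrative choices, not taken from the slides.

def fitness(x):
    return sum(x)

def one_max_ga(n=20, pop_size=50, generations=100, p_mut=0.02):
    pop = [[random.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        new_pop = []
        for _ in range(pop_size):
            # Tournament selection of two parents
            p1 = max(random.sample(pop, 2), key=fitness)
            p2 = max(random.sample(pop, 2), key=fitness)
            # One-point crossover
            cut = random.randrange(1, n)
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation
            child = [1 - b if random.random() < p_mut else b for b in child]
            new_pop.append(child)
        pop = new_pop  # the new generation replaces the parent population
    return max(pop, key=fitness)
```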


• Simple GA simulation


• GA to EDA


• Simple EDA simulation (worked example on the slide: a probability vector and the bitstrings sampled from it)


• Joint Probability Distribution (JPD): a solution is treated as a set of random variables, and the JPD is defined over those variables.

The JPD is exponential in the number of variables, and is therefore not feasible to calculate in most cases. It needs simplification!
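The equation on this slide was lost in extraction; a standard reconstruction, treating a solution as n binary random variables, is:

```latex
p(\mathbf{x}) = p(x_1, x_2, \ldots, x_n),
\qquad
\sum_{\mathbf{x}} p(\mathbf{x}) = 1
```

Specifying this distribution in full requires $2^n - 1$ independent probabilities, which is why it is infeasible for all but the smallest n.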


• Factorisation of the JPD. Univariate model: no interactions; the simplest model.

Bivariate model: pair-wise interactions.

Multivariate model: interactions of more than two variables.
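The three factorisations were shown as images on the slides; they can be reconstructed in their standard forms from the EDA literature (the bivariate chain below is one common instance of a pair-wise model):

```latex
% Univariate model (no interactions):
p(\mathbf{x}) = \prod_{i=1}^{n} p(x_i)

% Bivariate model (pair-wise interactions, e.g. a chain):
p(\mathbf{x}) = p(x_1) \prod_{i=2}^{n} p(x_i \mid x_{i-1})

% Multivariate model (conditioning sets with more than one variable):
p(\mathbf{x}) = \prod_{i=1}^{n} p(x_i \mid \mathit{pa}(x_i))
```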


• Typical estimation and sampling of the JPD in EDAs: learn the interactions between the variables in a solution; learn the probabilities associated with the interacting variables (together, these specify the JPD p(x)); sample the JPD (i.e. the learned probabilities).
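The estimate-then-sample loop above can be sketched for the univariate case (a UMDA-style EDA on OneMax); truncation selection and the parameter values are illustrative assumptions:

```python
import random

# Minimal univariate EDA (UMDA-style) sketch on OneMax.
# Truncation selection and all parameter values are illustrative.

def umda(n=20, pop_size=100, n_select=50, generations=50):
    pop = [[random.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        # Select the fittest solutions (truncation selection)
        selected = sorted(pop, key=sum, reverse=True)[:n_select]
        # Estimate the univariate marginals p(x_i = 1) from the selected set
        p = [sum(s[i] for s in selected) / n_select for i in range(n)]
        # Sample a new population from the learned marginals
        pop = [[1 if random.random() < p[i] else 0 for i in range(n)]
               for _ in range(pop_size)]
    return max(pop, key=sum)
```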


• Probabilistic Graphical Models (PGMs): an efficient tool for representing the factorisation of the JPD; a marriage between probability theory and graph theory. A PGM consists of two components: a structure and its parameters. There are two types of PGM: directed PGMs (Bayesian networks) and undirected PGMs (Markov random fields).


• Directed PGM (Bayesian networks). Structure: a directed acyclic graph (DAG).

Independence relationship: a variable is conditionally independent of the rest of the variables given its parents.

Parameters: conditional probabilities.


• Bayesian networks. The factorisation of the JPD, encoded in terms of conditional probabilities, is the JPD for a BN.
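The BN factorisation on this slide was an image; the standard form, where pa(x_i) denotes the parents of x_i in the DAG, is:

```latex
p(\mathbf{x}) = \prod_{i=1}^{n} p(x_i \mid \mathit{pa}(x_i))
```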


• Estimating a Bayesian network: estimate the structure, then estimate the parameters. This completely specifies the JPD, which can then be sampled.


• BN-based EDAs: 1. Initialise the parent solutions. 2. Select a set from the parent solutions. 3. Estimate a BN from the selected set (estimate the structure, then the parameters). 4. Sample the BN to generate a new population. 5. Replace the parents with the new set and go to 2 until the termination criterion is satisfied.


• How to estimate and sample a BN in EDAs. Estimating the structure: score + search techniques; conditional independence tests. Estimating the parameters: trivial in EDAs, since the dataset is complete; estimate the probabilities of parents before children. Sampling: probabilistic logic sampling (sample parents before children).
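Probabilistic logic sampling can be sketched on a toy chain network; the network, CPTs and topological order below are illustrative assumptions, not from the slides:

```python
import random

# Probabilistic logic (ancestral) sampling sketch for a toy BN.
# The chain X1 -> X2 -> X3 over binary variables, and its CPTs,
# are illustrative assumptions.
cpt = {
    "X1": lambda sample: 0.7,                           # p(X1 = 1)
    "X2": lambda sample: 0.9 if sample["X1"] else 0.2,  # p(X2 = 1 | X1)
    "X3": lambda sample: 0.8 if sample["X2"] else 0.1,  # p(X3 = 1 | X2)
}
order = ["X1", "X2", "X3"]  # topological order: parents precede children

def pls_sample():
    sample = {}
    for var in order:  # sample each variable given its already-sampled parents
        sample[var] = 1 if random.random() < cpt[var](sample) else 0
    return sample
```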


• BN-based EDAs are a well-established approach: BOA, EBNA, LFDA, MIMIC, COMIT, BMDA. References: Larrañaga and Lozano 2002; Pelikan 2002.


• Markov Random Fields (MRFs). Structure: an undirected graph.

Local independence: a variable is conditionally independent of the rest of the variables given its neighbours.

Global independence: two sets of variables are conditionally independent of each other if there is a third set that separates them.

Parameters: potential functions defined on the cliques.

(The slide shows an example undirected graph over X1 to X6.)


• Markov random field. The factorisation of the JPD, encoded in terms of potential functions over the maximal cliques, is the JPD for an MRF.
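The MRF factorisation on this slide was an image; the standard Hammersley–Clifford form over the set of maximal cliques C, with partition function Z, is:

```latex
p(\mathbf{x}) = \frac{1}{Z} \prod_{c \in C} \psi_c(\mathbf{x}_c),
\qquad
Z = \sum_{\mathbf{y}} \prod_{c \in C} \psi_c(\mathbf{y}_c)
```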


• Estimating a Markov random field. Estimate the structure from data. Estimate the parameters; this requires the potential functions to be numerically defined. This completely specifies the JPD.

The JPD can then be sampled. However, there is no specific sampling order (the graph is not a DAG), so sampling is somewhat problematic.


• MRFs in EDAs: MRFs have recently been proposed as an estimation-of-distribution technique in EDAs. Shakya et al. 2004, 2005; Santana et al. 2003, 2005.


• MRF-based EDAs: 1. Initialise the parent solutions. 2. Select a set from the parent solutions. 3. Estimate an MRF from the selected set (estimate the structure, then the parameters). 4. Sample the MRF to generate a new population. 5. Replace the parents with the new solutions and go to 2 until the termination criterion is satisfied.


• How to estimate and sample an MRF in EDAs. Learning the structure: conditional independence tests (MN-EDA, MN-FDA); linkage detection algorithm (LDFA). Learning the parameters: junction tree approach (FDA); junction graph approach (MN-FDA); Kikuchi approximation approach (MN-EDA); fitness modelling approach (DEUM). Sampling: probabilistic logic sampling (FDA, MN-FDA); probability vector approach (DEUMpv); direct sampling of the Gibbs distribution (DEUMd); Metropolis sampler (Is-DEUMm); Gibbs sampler (Is-DEUMg, MN-EDA).


• Fitness modelling approach. The Hammersley–Clifford theorem states that the JPD of any MRF follows a Gibbs distribution (a).

The energy of the Gibbs distribution is defined in terms of potential functions over the cliques.

Assuming that the probability of a solution is proportional to its fitness (b):

From (a) and (b), a model of the fitness function, the MRF fitness model (MFM), is derived.
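The equations on this slide were images; hedging that these are the standard DEUM forms, the derivation runs roughly as follows, with U(x) the energy and T the temperature:

```latex
% (a) Hammersley–Clifford: the JPD of an MRF is a Gibbs distribution
p(\mathbf{x}) = \frac{e^{-U(\mathbf{x})/T}}{\sum_{\mathbf{y}} e^{-U(\mathbf{y})/T}}

% (b) Assume the probability of a solution is proportional to its fitness
p(\mathbf{x}) = \frac{f(\mathbf{x})}{\sum_{\mathbf{y}} f(\mathbf{y})}

% Equating (a) and (b) and taking logarithms gives the MRF fitness model:
-\ln f(\mathbf{x}) = \frac{U(\mathbf{x})}{T} + \mathrm{const}
```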


• MRF fitness model (MFM). Properties: it completely specifies the JPD for the MRF; there is a negative relationship between fitness and energy, i.e. minimising the energy maximises the fitness. Tasks: find the structure of the MRF; numerically define the clique potential functions.


• MRF fitness model (MFM). Let us start with the simplest model, the univariate model; this eliminates structure learning.

For the univariate model there are n singleton cliques.

For each singleton clique, assign a potential function.

This gives the corresponding MFM, which can also be written in terms of the Gibbs distribution.
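The slide equations were images; for the univariate case, with each singleton clique given a potential parameterised by a single coefficient and assuming the -1/+1 encoding of x_i used in the DEUM literature, the MFM and its Gibbs form can be reconstructed as:

```latex
U(\mathbf{x}) = \sum_{i=1}^{n} \alpha_i x_i

% MFM (taking T = 1 and absorbing the normalising constant):
-\ln f(\mathbf{x}) = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_n x_n

% In terms of the Gibbs distribution:
p(\mathbf{x}) = \frac{e^{-U(\mathbf{x})/T}}{Z}
              = \frac{1}{Z}\, e^{-\sum_{i} \alpha_i x_i / T}
```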


• Estimating MRF parameters using the MFM. Each chromosome gives us a linear equation. Applying the MFM to a set of selected solutions gives a system of linear equations, and solving it gives an approximation to the MRF parameters. Knowing the MRF parameters completely specifies the JPD; the next step is to sample the JPD.
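A sketch of this estimation step (the intercept term and the use of least squares for the possibly over-determined system are assumptions; the -1/+1 encoding of the bits follows the DEUM convention):

```python
import math
import numpy as np

# Sketch of MFM parameter estimation: each selected solution x (bits
# encoded as -1/+1) and its fitness f(x) give one linear equation
#   -ln f(x) = a_0 + a_1*x_1 + ... + a_n*x_n
# (the intercept a_0 is an assumption here); solving the system by
# least squares approximates the MRF parameters.

def estimate_mfm_params(solutions, fitnesses):
    # Design matrix: a column of ones for a_0, then the -1/+1 variables
    A = np.array([[1.0] + [1.0 if b else -1.0 for b in x] for x in solutions])
    b = np.array([-math.log(f) for f in fitnesses])
    alpha, *_ = np.linalg.lstsq(A, b, rcond=None)
    return alpha  # [a_0, a_1, ..., a_n]
```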


• General DEUM framework: the Distribution Estimation Using MRF algorithm (DEUM). 1. Initialise a parent population P. 2. Select a set D from P (one can use D = P!). 3. Build an MFM, fit it to D and estimate the MRF parameters. 4. Sample the MRF to generate a new population. 5. Replace P with the new population and go to 2 until the termination criterion is satisfied.


• How to sample the MRF: the probability vector approach; direct sampling of the Gibbs distribution; Metropolis sampling; Gibbs sampling.


• Probability vector approach to sampling the MRF. Minimise U(x) to maximise f(x). To minimise U(x), each term α_i x_i should be minimal. This suggests that if α_i is negative, the corresponding x_i should be positive. We could obtain an optimum chromosome for the current population just by looking at the α_i; however, the current population does not always contain enough information to generate the optimum, so instead the sign of each α_i is used to update a vector of probabilities.


• DEUM with probability vector (DEUMpv)


• Updating rule: uses the sign of each MRF parameter to direct the search towards the value of the respective variable that minimises the energy U(x). A learning rate controls convergence.
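A hedged sketch of such an updating rule (the exact rule in the DEUM papers may differ; lam is the learning rate and p[i] the probability that x_i takes the value 1):

```python
# Hedged sketch of the DEUMpv probability-vector update. A negative
# alpha_i means x_i = 1 lowers the energy U(x), so p[i] is pushed
# towards 1; a positive alpha_i pushes p[i] towards 0. lam is the
# learning rate controlling convergence.

def update_probability_vector(p, alpha, lam=0.1):
    return [
        (1 - lam) * p_i + lam * (1.0 if a_i < 0 else 0.0)
        for p_i, a_i in zip(p, alpha)
    ]
```

Repeated application moves each p[i] towards 1 where α_i is negative and towards 0 where it is positive, at a speed set by lam.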


• Simulation of DEUMpv (worked example on the slide: population, fitnesses and the updated probability vector)


• Results: OneMax problem


• Results: F6 function optimisation


• Results: Trap-5 function, a deceptive problem; no solution found


• Sampling the MRF: the probability vector approach; direct sampling of the Gibbs distribution; Metropolis sampling; Gibbs sampling.


• Direct sampling of the Gibbs distribution. In the probability vector approach, only the sign of each MRF parameter is used.

However, one could sample directly from the Gibbs distribution and make use of the values of the MRF parameters.

One could also use a temperature coefficient to manipulate the probabilities.


• Direct Sampling of Gibbs distribution


• Direct sampling of the Gibbs distribution. The temperature coefficient plays an important role: decreasing T cools each probability towards 1 or 0, depending on the sign and value of the corresponding α. This forms the basis for the DEUM variant based on direct sampling of the Gibbs distribution (DEUMd).


• DEUM with direct sampling (DEUMd). 1. Generate an initial population P of size M. 2. Select the N fittest solutions, N ≤ M. 3. Calculate the MRF parameters. 4. Generate M new solutions by sampling the univariate marginal distribution. 5. Replace P with the new population and go to 2 until complete.
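Step 4, sampling the univariate distribution, can be sketched under the univariate MFM with x_i in {-1, +1} (this marginal form and the overflow guard are hedged reconstructions, not taken from the slides):

```python
import math
import random

# Under the univariate Gibbs model U(x) = sum_i alpha_i * x_i with
# x_i in {-1, +1}, the per-variable marginal is
#   p(x_i = +1) = e^{-alpha_i/T} / (e^{-alpha_i/T} + e^{+alpha_i/T})
#               = 1 / (1 + e^{2*alpha_i/T})
# Decreasing the temperature T cools each marginal towards 0 or 1,
# depending on the sign of alpha_i.

def marginal_p1(alpha_i, T):
    z = 2.0 * alpha_i / T
    if z > 700.0:  # guard against overflow in math.exp
        return 0.0
    return 1.0 / (1.0 + math.exp(z))

def sample_solution(alpha, T=1.0):
    return [1 if random.random() < marginal_p1(a, T) else -1 for a in alpha]
```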


• DEUMd simulation (worked example shown on the slide)


• Experimental results: OneMax problem


• F6 function optimisation


• Plateau Problem (n=180)


• Checker Board Problem (n=100)


• Trap function of order 5 (n=60)

