Paper Review: An exact mapping between the Variational Renormalization Group and Deep Learning


An exact mapping between the Variational Renormalization Group and Deep Learning

Kai-Wen Zhao, kv

Physics, National Taiwan University

kelispinor@gmail.com

December 1, 2016

Kai-Wen Zhao, kv (NTU-PHYS) Review December 1, 2016 1 / 18

Outline

Overview

Renormalization Group

Physical world with various length scales

Symmetry and Scale Invariance

Restricted Boltzmann Machine

Generative, Energy-based Model, Unsupervised Learning Algorithm

Richard Feynman: What I Cannot Create, I Do Not Understand.

Mapping

Unsupervised Deep Learning Implements the Kadanoff Real Space Variational Renormalization Group

\[ H^{\mathrm{RG}}_\lambda[\{h_j\}] = H^{\mathrm{RBM}}_\lambda[\{h_j\}] \]


Overview of Variational RG

Statistical Physics

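An ensemble of N spins $\{v_i\}$, each taking the value ±1, where i is the position index on some lattice. Boltzmann distribution and partition function:

\[ P(\{v_i\}) = \frac{e^{-H(\{v_i\})}}{Z}, \quad \text{where } Z = \mathrm{Tr}_{v_i}\, e^{-H(\{v_i\})} = \sum_{v_1, v_2, \ldots = \pm 1} e^{-H(\{v_i\})} \]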

Typically, the Hamiltonian depends on a set of couplings $\{K_s\}$:

\[ H[\{v_i\}] = -\sum_i K_i v_i - \sum_{ij} K_{ij} v_i v_j - \sum_{ijk} K_{ijk} v_i v_j v_k + \ldots \]

Free energy of the spin system:

\[ F = -\log Z = -\log\left(\mathrm{Tr}_{v_i}\, e^{-H(\{v_i\})}\right) \]
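For a handful of spins, the trace can be carried out by brute force. The following is a minimal sketch, not taken from the slides: the function names are hypothetical, the Hamiltonian is truncated after the pairwise term, and the pair sum is assumed to run over i < j so that each pair is counted once.

```python
import itertools
import math

def hamiltonian(spins, K1, K2):
    """H(v) = -sum_i K_i v_i - sum_{i<j} K_ij v_i v_j (higher-order terms omitted)."""
    n = len(spins)
    field = sum(K1[i] * spins[i] for i in range(n))
    pair = sum(K2[i][j] * spins[i] * spins[j]
               for i in range(n) for j in range(i + 1, n))
    return -field - pair

def partition_function(K1, K2):
    """Z = Tr_v exp(-H(v)): explicit sum over all 2^N spin configurations."""
    n = len(K1)
    return sum(math.exp(-hamiltonian(v, K1, K2))
               for v in itertools.product((-1, 1), repeat=n))

def boltzmann_probability(spins, K1, K2):
    """P(v) = exp(-H(v)) / Z."""
    return math.exp(-hamiltonian(spins, K1, K2)) / partition_function(K1, K2)

def free_energy(K1, K2):
    """F = -log Z."""
    return -math.log(partition_function(K1, K2))

# Illustrative couplings: two spins, K_12 = 0.5, no single-spin terms.
K1 = [0.0, 0.0]
K2 = [[0.0, 0.5], [0.0, 0.0]]
print(partition_function(K1, K2), free_energy(K1, K2))
```

For this two-spin example the sum can be done by hand, giving Z = 4 cosh(0.5), which is a convenient check on the enumeration. The cost grows as 2^N, so this is only practical for very small systems.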


Overview of Variational Renormalization Group

Idea behind RG: to find a new coarse-grained description of the spin system, in which the short-distance fluctuations have been integrated out.

N physical spins: $\{v_i\}$, couplings $\{K\}$

M coarse-grained spins: $\{h_j\}$, couplings $\{\tilde{K}\}$, where M < N

Renormalization transformation is often represented as a mapping

\[ \{K\} \mapsto \{\tilde{K}\} \]

Coarse-grained Hamiltonian

\[ H^{\mathrm{RG}}[\{h_j\}] = -\sum_i \tilde{K}_i h_i - \sum_{ij} \tilde{K}_{ij} h_i h_j - \sum_{ijk} \tilde{K}_{ijk} h_i h_j h_k + \ldots \]

From now on, we do not distinguish between $v_i$ and $\{v_i\}$ when there is no ambiguity.


Variational RG scheme (Kadanoff)

Coarse-graining procedure: an operator $T_\lambda(v_i, h_j)$ couples the auxiliary spins $h_j$ to the physical spins $v_i$.

Naturally, we marginalize over the physical spins

\[ \exp\left(-H^{\mathrm{RG}}_\lambda(h_j)\right) = \mathrm{Tr}_{v_i} \exp\left(T_\lambda(v_i, h_j) - H(v_i)\right) \]

The free energy of the coarse-grained system:

\[ F^h_\lambda = -\log\left(\mathrm{Tr}_{h_j}\, e^{-H^{\mathrm{RG}}_\lambda(h_j)}\right) \]

Choose the parameters λ so that long-distance observables are invariant. To do so, minimize the free energy difference

\[ \Delta F = F^h_\lambda - F^v \]

where $F^v = -\log Z$ is the free energy of the physical spins.
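The role of the free energy difference can be checked numerically on a toy system. The sketch below is an illustration under stated assumptions, not the slides' procedure: two physical spins with H = -J v1 v2, one coarse-grained spin, and two hypothetical choices of the coupling operator. The normalized choice satisfies Tr_h exp(T) = 1 for every visible configuration and therefore gives ΔF = 0; the unnormalized one does not.

```python
import itertools
import math

SPIN = (-1, 1)

def H(v, J=0.7):
    # Physical Hamiltonian of two spins (J = 0.7 is an arbitrary illustrative value).
    return -J * v[0] * v[1]

def T_unnormalized(v, h, lam):
    # Hypothetical coupling: the hidden spin simply couples to the block sum.
    return lam * h * (v[0] + v[1])

def T_normalized(v, h, lam):
    # Same coupling, shifted so that sum_h exp(T(v, h)) = 1 for every v.
    s = lam * (v[0] + v[1])
    return h * s - math.log(2 * math.cosh(s))

def delta_F(T, lam):
    visibles = list(itertools.product(SPIN, repeat=2))
    # exp(-H_RG(h)) = Tr_v exp(T(v, h) - H(v))
    exp_minus_H_rg = {h: sum(math.exp(T(v, h, lam) - H(v)) for v in visibles)
                      for h in SPIN}
    F_h = -math.log(sum(exp_minus_H_rg.values()))       # coarse-grained free energy
    F_v = -math.log(sum(math.exp(-H(v)) for v in visibles))  # physical free energy
    return F_h - F_v

print(delta_F(T_normalized, 0.9))
print(delta_F(T_unnormalized, 0.9))
```

With the normalized operator the difference vanishes up to rounding for any λ, while the unnormalized operator gives a strictly negative ΔF, because Tr_h exp(T) = 2 cosh(λ(v1 + v2)) is at least 2.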


RBMs and Deep Neural Networks

Restricted Boltzmann Machine

Binary data with probability distribution $P(v_i)$. Energy function:

\[ E(v_i, h_j) = \sum_{ij} w_{ij} v_i h_j + \sum_i c_i v_i + \sum_j b_j h_j \]

where we denote the parameters by $\lambda = \{w, b, c\}$. Joint probability:

\[ p_\lambda(v_i, h_j) = \frac{e^{-E(v_i, h_j)}}{Z} \]
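As a concrete illustration of the energy function and joint probability, here is a minimal brute-force sketch. The function names and the numerical parameters are hypothetical, and the units are assumed to take the values ±1 to match the spin notation used for the RG side (pass `values=(0, 1)` for conventional binary units).

```python
import itertools
import math

def energy(v, h, w, c, b):
    """E(v, h) = sum_ij w_ij v_i h_j + sum_i c_i v_i + sum_j b_j h_j."""
    visible_hidden = sum(w[i][j] * v[i] * h[j]
                         for i in range(len(v)) for j in range(len(h)))
    visible_bias = sum(c[i] * v[i] for i in range(len(v)))
    hidden_bias = sum(b[j] * h[j] for j in range(len(h)))
    return visible_hidden + visible_bias + hidden_bias

def joint_distribution(w, c, b, values=(-1, 1)):
    """p(v, h) = exp(-E(v, h)) / Z over all visible/hidden configurations."""
    states = [(v, h)
              for v in itertools.product(values, repeat=len(c))
              for h in itertools.product(values, repeat=len(b))]
    weights = {s: math.exp(-energy(s[0], s[1], w, c, b)) for s in states}
    Z = sum(weights.values())
    return {s: weight / Z for s, weight in weights.items()}

# Illustrative parameters: 2 visible units, 2 hidden units.
w = [[0.5, -0.3], [0.2, 0.1]]
c = [0.1, -0.2]
b = [0.0, 0.3]
p = joint_distribution(w, c, b)
print(sum(p.values()))
```

Enumerating every state is only feasible for a few units; it is used here purely to make the definitions explicit.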


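Variational distributions of the visible and hidden variables:

\[ p_\lambda(v_i) = \sum_{h_j} p_\lambda(v_i, h_j) = \mathrm{Tr}_{h_j}\, p_\lambda(v_i, h_j) := \frac{e^{-H^{\mathrm{RBM}}_\lambda(v_i)}}{Z} \]

\[ p_\lambda(h_j) = \sum_{v_i} p_\lambda(v_i, h_j) = \mathrm{Tr}_{v_i}\, p_\lambda(v_i, h_j) := \frac{e^{-H^{\mathrm{RBM}}_\lambda(h_j)}}{Z} \]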

Kullback-Leibler divergence

\[ D_{\mathrm{KL}}\left(P(v_i) \,\|\, p_\lambda(v_i)\right) = \sum_{v_i} P(v_i) \log \frac{P(v_i)}{p_\lambda(v_i)} \]
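The divergence can be evaluated directly when the visible layer is small. The following sketch assumes ±1 units and uses hypothetical helper names; it computes the RBM marginal over visible units by explicit summation and then the Kullback-Leibler divergence from a data distribution given as a dictionary.

```python
import itertools
import math

def rbm_visible_marginal(w, c, b, values=(-1, 1)):
    """p(v) = Tr_h exp(-E(v, h)) / Z, by brute-force enumeration."""
    def energy(v, h):
        return (sum(w[i][j] * v[i] * h[j]
                    for i in range(len(v)) for j in range(len(h)))
                + sum(c[i] * v[i] for i in range(len(v)))
                + sum(b[j] * h[j] for j in range(len(h))))
    hiddens = list(itertools.product(values, repeat=len(b)))
    unnormalized = {v: sum(math.exp(-energy(v, h)) for h in hiddens)
                    for v in itertools.product(values, repeat=len(c))}
    Z = sum(unnormalized.values())
    return {v: u / Z for v, u in unnormalized.items()}

def kl_divergence(P, p_model):
    """D_KL(P || p) = sum_v P(v) log(P(v) / p(v)); terms with P(v) = 0 contribute 0."""
    return sum(pv * math.log(pv / p_model[v]) for v, pv in P.items() if pv > 0)

# Illustrative data distribution: two visible spins that are always aligned.
P = {(1, 1): 0.5, (-1, -1): 0.5, (1, -1): 0.0, (-1, 1): 0.0}
# An RBM with all parameters zero has a uniform marginal over the four states.
p_uniform = rbm_visible_marginal([[0.0], [0.0]], [0.0, 0.0], [0.0])
print(kl_divergence(P, p_uniform))
```

For this example the divergence equals log 2, since the data distribution puts weight 1/2 on two of the four states that the untrained model treats as equally likely. Training an RBM amounts to lowering this quantity by adjusting λ.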


Exact Mapping VRG to DL

Mapping Variational RG to RBM

In the RG scheme, the couplings between the visible and hidden spins are encoded by the operator $T_\lambda$. In an RBM, the analogous role is played by the joint energy function:

\[ T_\lambda(v_i, h_j) = -E(v_i, h_j) + H(v_i) \]

To derive the equivalent statement, start from the coarse-grained Hamiltonian:

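\[ \frac{e^{-H^{\mathrm{RG}}_\lambda(h_j)}}{Z} = \frac{\mathrm{Tr}_{v_i}\, e^{T_\lambda(v_i, h_j) - H(v_i)}}{Z} = \mathrm{Tr}_{v_i} \frac{e^{-E(v_i, h_j)}}{Z} = p_\lambda(h_j) = \frac{e^{-H^{\mathrm{RBM}}_\lambda(h_j)}}{Z} \]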

Substituting the right-hand side yields

\[ H^{\mathrm{RG}}_\lambda[\{h_j\}] = H^{\mathrm{RBM}}_\lambda[\{h_j\}] \tag{1} \]
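The identity can be verified numerically on a small example. This is a sketch under assumptions not found in the slides: three visible spins on an open chain, two hidden spins, and arbitrary illustrative RBM parameters. It builds T from E and H as defined above and compares the two hidden-spin Hamiltonians configuration by configuration.

```python
import itertools
import math

SPIN = (-1, 1)
VISIBLES = list(itertools.product(SPIN, repeat=3))
HIDDENS = list(itertools.product(SPIN, repeat=2))

# Illustrative RBM parameters (3 visible units, 2 hidden units).
W = [[0.4, -0.2], [0.1, 0.3], [-0.5, 0.2]]
C = [0.0, 0.1, -0.1]
B = [0.2, -0.3]

def H(v):
    # Illustrative physical Hamiltonian: open chain with coupling 0.6.
    return -0.6 * (v[0] * v[1] + v[1] * v[2])

def E(v, h):
    # RBM energy: sum_ij w_ij v_i h_j + sum_i c_i v_i + sum_j b_j h_j.
    return (sum(W[i][j] * v[i] * h[j] for i in range(3) for j in range(2))
            + sum(C[i] * v[i] for i in range(3))
            + sum(B[j] * h[j] for j in range(2)))

def T(v, h):
    # Coupling operator identified with the RBM energy: T = -E + H.
    return -E(v, h) + H(v)

def H_rg(h):
    # exp(-H_RG(h)) = Tr_v exp(T(v, h) - H(v))
    return -math.log(sum(math.exp(T(v, h) - H(v)) for v in VISIBLES))

def H_rbm(h):
    # exp(-H_RBM(h)) = Tr_v exp(-E(v, h))
    return -math.log(sum(math.exp(-E(v, h)) for v in VISIBLES))

for h in HIDDENS:
    print(h, H_rg(h), H_rbm(h))
```

The two columns agree to rounding error for every hidden configuration, whatever the values of the parameters, because T - H reduces to -E by construction.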


The operator $T_\lambda$ can be viewed as a variational approximation for the conditional probability:

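\[ e^{T_\lambda(v_i, h_j)} = e^{-E(v_i, h_j) + H(v_i)} = \frac{p_\lambda(v_i, h_j)}{p_\lambda(v_i)}\, e^{H(v_i) - H^{\mathrm{RBM}}_\lambda(v_i)} = p_\lambda(h_j | v_i)\, e^{H(v_i) - H^{\mathrm{RBM}}_\lambda(v_i)} \]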
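One further step, which is not written on the slide but follows from the expression above, makes the connection to the free energy difference explicit. Tracing over the hidden spins and using the normalization of the conditional probability gives

```latex
\[
\mathrm{Tr}_{h_j}\, e^{T_\lambda(v_i, h_j)}
  = e^{H(v_i) - H^{\mathrm{RBM}}_\lambda(v_i)} \, \mathrm{Tr}_{h_j}\, p_\lambda(h_j | v_i)
  = e^{H(v_i) - H^{\mathrm{RBM}}_\lambda(v_i)}
\]
\[
\mathrm{Tr}_{h_j}\, e^{-H^{\mathrm{RG}}_\lambda(h_j)}
  = \mathrm{Tr}_{v_i}\, e^{-H(v_i)} \, \mathrm{Tr}_{h_j}\, e^{T_\lambda(v_i, h_j)}
  = \mathrm{Tr}_{v_i}\, e^{-H^{\mathrm{RBM}}_\lambda(v_i)}
\]
```

So if the RBM reproduces the physical Hamiltonian, $H^{\mathrm{RBM}}_\lambda(v_i) = H(v_i)$, then $\mathrm{Tr}_{h_j} e^{T_\lambda(v_i, h_j)} = 1$ for every visible configuration, the coarse-grained and physical partition functions coincide, and $\Delta F = 0$. In that case $e^{T_\lambda}$ is exactly the conditional probability $p_\lambda(h_j | v_i)$.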


Examples

Examples: 2D Ising Model

Two-dimensional nearest-neighbor Ising model with ferromagnetic coupling:

\[ H(\{v_i\}) = -J \sum_{\langle ij \rangle} v_i v_j \]

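A phase transition occurs when $J/(k_B T) = 0.4352$ (as quoted; Onsager's exact result for the infinite square lattice is $\tfrac{1}{2}\ln(1+\sqrt{2}) \approx 0.4407$).

Experiment Setup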

20,000 samples, 40x40 periodic lattice

RBM’s architecture 1600-400-100-25
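The slides do not say how the training samples were drawn. One common way to generate such configurations is single-spin-flip Metropolis sampling, sketched below as an assumption rather than as the method actually used; the function name, the number of sweeps, and the small lattice in the usage line are all illustrative.

```python
import math
import random

def metropolis_ising(L, K, sweeps, rng):
    """Return one L x L configuration of the periodic 2D Ising model.

    K = J / (k_B T) is the dimensionless coupling; `sweeps` is the number of
    attempted flips per site. Starts from a random configuration.
    """
    spins = [[rng.choice((-1, 1)) for _ in range(L)] for _ in range(L)]
    for _ in range(sweeps * L * L):
        x, y = rng.randrange(L), rng.randrange(L)
        # Sum of the four nearest neighbours with periodic boundaries.
        neighbours = (spins[(x + 1) % L][y] + spins[(x - 1) % L][y]
                      + spins[x][(y + 1) % L] + spins[x][(y - 1) % L])
        # Change in H / (k_B T) if the spin at (x, y) is flipped.
        d_energy = 2.0 * K * spins[x][y] * neighbours
        if d_energy <= 0 or rng.random() < math.exp(-d_energy):
            spins[x][y] = -spins[x][y]
    return spins

# Small illustrative run; the experiment above uses a 40x40 lattice.
rng = random.Random(0)
sample = metropolis_ising(L=8, K=0.44, sweeps=20, rng=rng)
flat = [s for row in sample for s in row]  # flatten to feed a visible layer
print(len(flat), sum(flat) / len(flat))
```

Flattening a 40x40 configuration in the same way gives the 1600 visible units of the first layer. Near the critical coupling, single-spin-flip dynamics decorrelate slowly, so many sweeps between stored samples (or a cluster algorithm) would be needed in practice.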


Figure: Top layer


Figure: Middle layer


Conclusion

Conclusion and Discussion

One-to-one mapping between RBM-based DNN and variational RG

Suggests that deep learning implements an RG-like scheme to extract important features from data


Relevance to us

Relevance to us: Auto-Encoder and Convolutional AE

z is the code extracted by the machine

\[ \phi : X \to Z, \qquad \psi : Z \to X \]

\[ \arg\min_{\phi, \psi} \| X - (\psi \circ \phi) X \|^2 \]
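The objective can be made concrete with a tiny example. The sketch below only evaluates the reconstruction loss for a fixed encoder and decoder; it does not train anything, and the particular φ and ψ (keep the first coordinate, pad with zero) are hypothetical stand-ins for learned networks.

```python
def reconstruction_loss(X, phi, psi):
    """Sum over samples x of ||x - psi(phi(x))||^2."""
    total = 0.0
    for x in X:
        x_hat = psi(phi(x))
        total += sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return total

def phi(x):
    # Hypothetical encoder: the code z is the first coordinate only.
    return (x[0],)

def psi(z):
    # Hypothetical decoder: embed the code back into two dimensions.
    return (z[0], 0.0)

X = [(1.0, 0.0), (3.0, 4.0)]
print(reconstruction_loss(X, phi, psi))  # 16.0
```

The first point lies on the axis kept by the encoder and is reconstructed exactly; the second loses its second coordinate, contributing 4² = 16. An auto-encoder searches over φ and ψ to make this loss as small as possible.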

Figure: Scheme of Auto-Encoder


Thanks
