Diffusion Wavelets for multiscale analysis on manifolds and graphs: constructions and applications Mauro Maggioni EPFL, Lausanne Dec. 19th, 2005

EPFL Multiscale Analysis and Diffusion Wavelets - Mauro Maggioni, Yale U. 1

Overview

1 - A fast, strongly biased tour of some connections between Signal Processing and classical Harmonic Analysis;

2 - Applications demand that signal processing be generalized to point clouds, metric spaces, “data sets”. We use the heat kernel and diffusion to create multiscale analyses and wavelets;

3 - Fourier global analysis for global parametrization and data organization;

4 - Multiscale analysis for organization at different scales or resolutions.

5 - Examples.

6 - Current and future work.

Signal Processing and Harmonic Analysis

Core tools in signal processing:

1 - Analysis through “waveforms”: these are “template” functions (e.g. pure sinusoidal sounds) out of which complicated real-life functions can be built.

2 - Function spaces and approximation: the analysis above becomes quantitative when we study (for example) how well, for a given number of bits, we can approximate a given complicated function in terms of waveforms. One defines function spaces which are models for different degrees and modalities of complexity. Analysis through waveforms makes it very easy to determine to which function space a complicated function belongs.

3 - Both of the above (plus more!) enable signal processing tasks, such as compression, filtering, etc. For tasks involving stochasticity (e.g. denoising), the effect of the stochastic components on the waveforms is fundamental.

The choice and construction of the “alphabet” or “dictionary” of building blocks, or “waveforms”, is fundamental for the success of the above.

“Waveforms”

1D example: audio signal processing. Classical waveforms include $\sin(\xi\,\cdot)$, $\cos(\xi\,\cdot)$, $e^{2\pi i \xi\,\cdot}$, which are pure frequencies: Fourier analysis. Example of function spaces and approximation on $[0,1]$: a (properly normalized) dictionary is $e_k(x) := e^{2\pi i k x}$, $k \in \mathbb{Z}$, which is in fact an o.n. basis of $L^2([0,1])$: a function $f \in L^2([0,1])$ (i.e. $\int_0^1 |f|^2 < +\infty$) can be written as

\[ f = \sum_{k \in \mathbb{Z}} \langle f, e_k \rangle \, e_k . \]

In fact $f \in L^2([0,1])$ iff $\sum_k |\langle f, e_k \rangle|^2 < +\infty$, and $f^{(s)} \in L^2([0,1])$ iff $\sum_k (1 + |k|^{2s}) \, |\langle f, e_k \rangle|^2 < +\infty$.

Unfortunately transient behaviors (very important in many applications) are poorly characterized in this basis. Other bases have been constructed that better analyse transient properties: Gabor, local cosines, wavelets.
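As a rough numerical illustration of the statement above, the following Python sketch compares how much of the energy $\sum_k |\langle f, e_k \rangle|^2$ falls on the lowest frequencies for a smooth periodic function and for a short transient. The two test functions, the grid size and the cutoff `k_max = 8` are arbitrary choices made for this example, and the coefficients are approximated by a discrete Fourier transform of uniform samples.

```python
import numpy as np

def fourier_coefficients(samples):
    """Approximate <f, e_k>, e_k(x) = exp(2*pi*i*k*x), from uniform samples on [0, 1)."""
    return np.fft.fft(samples) / len(samples)

def energy_in_low_modes(coeffs, k_max):
    """Fraction of sum_k |<f, e_k>|^2 carried by the modes with |k| <= k_max."""
    k = np.fft.fftfreq(len(coeffs), d=1.0 / len(coeffs))  # integer frequencies
    power = np.abs(coeffs) ** 2
    return power[np.abs(k) <= k_max].sum() / power.sum()

n = 1024
x = np.arange(n) / n
smooth = np.exp(np.cos(2 * np.pi * x))               # smooth periodic function
transient = (np.abs(x - 0.5) < 0.01).astype(float)   # short localized bump

smooth_frac = energy_in_low_modes(fourier_coefficients(smooth), 8)
transient_frac = energy_in_low_modes(fourier_coefficients(transient), 8)
print(f"smooth: {smooth_frac:.6f}  transient: {transient_frac:.6f}")
```

The smooth function is captured almost entirely by a handful of modes, while the energy of the transient is spread over many frequencies, which is the sense in which it is poorly characterized in the Fourier basis.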

2D example: images. Tensor products of 1D bases, such as Fourier, Gabor, wavelets. “Genuinely 2D waveforms”: non-tensor wavelet bases, steerable pyramids, ridgelets, curvelets.

Classical Multi-Resolution Analysis and Wavelets

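A Multi-Resolution Analysis for $L^2(\mathbb{R})$ is a sequence of subspaces $\{V_j\}_{j \in \mathbb{Z}}$, $V_{j-1} \subseteq V_j$, with $\cap V_j = \{0\}$, $\overline{\cup V_j} = L^2(\mathbb{R})$, with an orthonormal basis $\{\varphi_{j,k}\}_{k \in \mathbb{Z}} := \{2^{j/2} \varphi(2^j \cdot - k)\}_{k \in \mathbb{Z}}$ for $V_j$. Then there exists $\psi$ such that $\{\psi_{j,k}\}_{k \in \mathbb{Z}} := \{2^{j/2} \psi(2^j \cdot - k)\}_{k \in \mathbb{Z}}$ spans $W_j$, the orthogonal complement of $V_{j-1}$ in $V_j$.

In frequency, $\varphi_{j,k}$ is essentially supported in $\{|\xi| \le 2^j\}$, and $\psi_{j,k}$ is essentially supported in the Littlewood-Paley annulus $2^{j-1} \le |\xi| \le 2^j$.

Because $V_{j-1} \subseteq V_j$, $\varphi_{j-1,0} = \sum_{k'} \alpha_{k'} \varphi_{j,k'}$: refinement equations, fast wavelet transform (FWT).

One of the big novelties here is that each element has a characteristic scale, and the dictionary includes elements at “all” scales. Great for characterizing all sorts of transients, and complexity properties at different levels of resolution.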
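The refinement equation and the fast wavelet transform can be made concrete with the simplest possible choice, the Haar system, which is an assumption of this sketch and is not named on the slide. For Haar, $\varphi_{j-1,0} = (\varphi_{j,0} + \varphi_{j,1})/\sqrt{2}$, so one level of the transform splits coefficients on $V_j$ into coefficients on $V_{j-1}$ (normalized sums) and on $W_j$ (normalized differences).

```python
import numpy as np

def haar_step(coeffs):
    """One level of the Haar fast wavelet transform (input length must be even)."""
    even, odd = coeffs[0::2], coeffs[1::2]
    scaling = (even + odd) / np.sqrt(2.0)   # coefficients on V_{j-1}
    wavelet = (even - odd) / np.sqrt(2.0)   # coefficients on W_j
    return scaling, wavelet

def haar_inverse_step(scaling, wavelet):
    """Invert haar_step, recovering the coefficients on V_j."""
    out = np.empty(2 * len(scaling))
    out[0::2] = (scaling + wavelet) / np.sqrt(2.0)
    out[1::2] = (scaling - wavelet) / np.sqrt(2.0)
    return out

signal = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
s, w = haar_step(signal)
reconstructed = haar_inverse_step(s, w)
print(s, w)
```

Applying `haar_step` again to `s` gives the next coarser level; the wavelet coefficient vanishes wherever the signal is locally constant, as in the last pair of samples here.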

Fast Algorithms

The ideas developed in Harmonic Analysis have had a huge impact on the design of fast algorithms for the solution of integral and differential equations, geometric analysis, and more.

These algorithms take advantage of the new structures discovered in the harmonic analysis of these problems. Many of these structures are multiscale.

Examples of such algorithms are fast multipole methods [Greengard, Rokhlin,...] for the solution of certain integral equations from mathematical physics (e.g. Dirichlet problem, wave equation, Helmholtz equation) in domains of low-dimensional Euclidean space, multi-grid methods [Brandt, Hackbusch,...], non-standard wavelet representations for certain integral and pseudo-differential operators [Beylkin, Coifman, Rokhlin], the wavelet-element methods [Dahmen, DeVore, Cohen,...], and many others.

Ambient Space

Classically one studies Euclidean spaces $\mathbb{R}$, $\mathbb{R}^n$.

From there, we may recognize three directions towards more general structures:

• algebraic/highly symmetric structures (Lie groups),

• smooth geometric structures (Riemannian manifolds), global analysis

• less smooth geometric structures (Lipschitz curves).

Harder, and “more fundamental”:

• classes of point sets in $\mathbb{R}^n$,

• classes of metric spaces, for example measure-metric-energy spaces or graphs.

New type of harmonic analysis needed.

New type of Harmonic Analysis needed

We would like to have function spaces, operators, bases interacting as in classical Harmonic Analysis on more general structures such as point sets in $\mathbb{R}^n$ and metric spaces.

Classical definitions take advantage of symmetries, geometric transformations of the space (and associated representations), and other ingredients which are simply not available here. Need new definitions, new interactions between geometry, function spaces, operators and bases.

Additional difficulties: incorporate noise into the picture.

This has many applications: organization of data sets, learning on data sets, nonlinear image denoising, organization of bodies of documents, astronomical and satellite (hyperspectral) imagery, study of dynamical systems...

An example from Molecular Dynamics...

The dynamics of a small protein in a bath of water molecules is approximated by a Fokker-Planck system of stochastic equations $\dot{x} = -\nabla U(x) + \dot{w}$.

(Figure: the alanine molecule.)

The set of states of the protein is a noisy set of points in $\mathbb{R}^{36}$, since we have 3 coordinates for each of the 12 atoms. This set is a priori very complicated. However, we expect for physical reasons that the constraints on the molecule force this set to be essentially lower-dimensional. We can explore the space of states by running long simulations, for different initial conditions.

...continued

In fact, this set of points is much lower-dimensional. We are able to discover this lower-dimensional set, estimate its local dimensionality (it is not constant!), consider natural classes of (diffusion) operators on this set, and build Fourier and wavelet bases on it. For example, it turns out that some of the parameters discovered by chemists on the basis of experiments and chemical-physical considerations can be discovered empirically as being Fourier-like functions on the set of states!

(Figure: embedding of the set of states of the molecule.)

The Heat Kernel and the Laplacian

The heat kernel and the Laplacian are present almost everywhere in classical Harmonic Analysis. In Euclidean space the Laplacian is very natural because of its invariance under the natural symmetries of the space. The associated heat kernel is of great importance at the very least because of its connections with Brownian motion, potential theory and the heat equation.

These objects can also be defined in a natural way in much more general settings, such as metric-measure-energy spaces and graphs. They are affected by geometric properties of the space they are defined on, and can be used (for example through their spectral theory) to define function spaces, operators, and bases that are natural generalizations of their classical counterparts.

Laplacian on manifolds I

The Laplace-Beltrami operator $\Delta_{BL}$ (below simply $\Delta$) can be defined naturally on a Riemannian manifold, and is a well-studied object in global analysis. The corresponding heat kernel $e^{-t\Delta}$ is the Green’s function for the heat equation on the manifold, associated with Brownian motion “restricted” to the manifold. Spectral decomposition $\Delta \phi_i = \lambda_i \phi_i$ yields

\[ H_t(x, y) := e^{-t\Delta}(x, y) = \sum_i \underbrace{e^{-t\lambda_i}}_{\mu_i^t} \, \phi_i(x) \phi_i(y) . \]

The eigenfunctions $\phi_i$ of the Laplacian generalize Fourier modes: Fourier analysis on manifolds, global analysis.

Surprisingly, they also allow one to analyse the geometry of the manifold, and provide embeddings of the manifold (“diffusion maps”): for $m = 1, 2, \dots$, $t > 0$ and $x \in \mathcal{M}$, define

\[ \Phi_m^{(t)}(x) = \left( \mu_i^{t/2} \phi_i(x) \right)_{i=1,\dots,m} \in \mathbb{R}^m . \]

Diffusion vs. Geodesic Distance

$d_{\mathrm{geod.}}(A, B) \sim d_{\mathrm{geod.}}(C, B)$, however $d^{(t)}(A, B) \gg d^{(t)}(C, B)$.

Picture courtesy of S. Lafon

Diffusion Maps and Diffusion Distance

The map is an approximate isometry (it is an isometry for $m = +\infty$) to Euclidean $\mathbb{R}^m$ from $\mathcal{M}$ with the diffusion metric (not the Riemannian metric!) defined by

\[ d^{(t)}(x, y) = \| H_{t/2}(x, \cdot) - H_{t/2}(y, \cdot) \|_{L^2(\mathcal{M})} = \sqrt{\langle \delta_x - \delta_y, H_t(\delta_x - \delta_y) \rangle} = \sqrt{\sum_i \mu_i^t \, (\phi_i(x) - \phi_i(y))^2} . \]

This distance is sensitive to all connections and flows between a pair of points. It is less sensitive to noise, and very natural in several contexts (e.g. Markov chains, social networks...). In fact it is very natural in general; one can show that eigenfunctions provide local coordinate systems [P.W. Jones, R. Schul, MM].
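The identity between the diffusion distance and the Euclidean distance in the diffusion map can be checked numerically in a finite setting. The sketch below is only an illustration under stated assumptions: the manifold is replaced by a hypothetical path graph with unit weights, $\Delta$ by its combinatorial Laplacian, and the $L^2$ norm by the Euclidean norm over vertices.

```python
import numpy as np

# Hypothetical toy space: a path graph on n vertices with unit edge weights.
n = 12
W = np.zeros((n, n))
idx = np.arange(n - 1)
W[idx, idx + 1] = W[idx + 1, idx] = 1.0
L = np.diag(W.sum(axis=1)) - W             # symmetric, positive semi-definite

lam, phi = np.linalg.eigh(L)               # L phi_i = lam_i phi_i
t = 2.0
mu_t = np.exp(-t * lam)                    # mu_i^t = exp(-t * lam_i)

def heat_kernel(s):
    """H_s = exp(-s L), assembled from the spectral decomposition."""
    return (phi * np.exp(-s * lam)) @ phi.T

def diffusion_map(m):
    """Row x holds Phi_m^(t)(x) = (mu_i^(t/2) phi_i(x)) for i = 1..m."""
    return phi[:, :m] * np.sqrt(mu_t[:m])

x, y = 0, n - 1
H_half = heat_kernel(t / 2)
d_kernel = np.linalg.norm(H_half[x] - H_half[y])   # ||H_{t/2}(x,.) - H_{t/2}(y,.)||
Phi_full = diffusion_map(n)
d_embed = np.linalg.norm(Phi_full[x] - Phi_full[y])
Phi_trunc = diffusion_map(3)
d_trunc = np.linalg.norm(Phi_trunc[x] - Phi_trunc[y])
print(d_kernel, d_embed, d_trunc)
```

With all $n$ coordinates the two distances agree up to rounding; truncating to the first few coordinates can only shrink the distance, which is the sense in which the map is an approximate isometry for finite $m$.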

Diffusion Map for a Dumbbell

Eigenfunctions on a dumbbell-shaped manifold, and corresponding diffusion map; pictures courtesy of Stephane Lafon.

Laplacian on Graphs, I

Given a weighted graph $(G, E, W)$, the combinatorial Laplacian is defined by $L = D - W$, where $D_{ii} = \sum_j W_{ij}$, and the normalized Laplacian is defined by $\mathcal{L} = D^{-1/2} (D - W) D^{-1/2}$.

These are self-adjoint positive-semi-definite operators; let $\lambda_i$ and $\phi_i$ be the eigenvalues and eigenvectors. Fourier analysis on graphs. The heat kernel is of course defined by $H_t = e^{-tL}$; the natural random walk is $D^{-1} W$.

If

• points are drawn from a manifold according to some (unknown) probability distribution [M. Belkin, P. Niyogi; RR Coifman, S. Lafon], or

• points are generated by a stochastic dynamical system driven by a Fokker-Planck equation [RR Coifman, Y. Kevrekidis, S. Lafon, MM, B. Nadler]

then the Laplace-Beltrami operator and, respectively, the Fokker-Planck operator can be approximated by a graph Laplacian on the discrete set of points with certain appropriately estimated weights.
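The definitions on this slide translate directly into a few lines of NumPy. The 4-vertex weight matrix below is made up purely for illustration; the rest follows the formulas $L = D - W$, $\mathcal{L} = D^{-1/2}(D - W)D^{-1/2}$, $P = D^{-1}W$ and $H_t = e^{-tL}$.

```python
import numpy as np

# Hypothetical symmetric weight matrix of a small connected graph.
W = np.array([[0.0, 1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0, 0.0],
              [1.0, 1.0, 0.0, 2.0],
              [0.0, 0.0, 2.0, 0.0]])

deg = W.sum(axis=1)                         # D_ii = sum_j W_ij
D = np.diag(deg)
L_comb = D - W                              # combinatorial Laplacian
D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
L_norm = D_inv_sqrt @ L_comb @ D_inv_sqrt   # normalized Laplacian
P = W / deg[:, None]                        # random walk D^{-1} W

# Fourier analysis on the graph: eigenvalues and eigenvectors of the Laplacians.
lam_comb, phi_comb = np.linalg.eigh(L_comb)
lam_norm, phi_norm = np.linalg.eigh(L_norm)

def heat_kernel(t):
    """H_t = exp(-t L) for the combinatorial Laplacian, via its eigendecomposition."""
    return (phi_comb * np.exp(-t * lam_comb)) @ phi_comb.T

H = heat_kernel(0.5)
print(np.round(lam_comb, 4), np.round(lam_norm, 4))
```

Since $P$ is similar to $I - \mathcal{L}$, the spectrum of the random walk is $1 - \lambda_i$ for the eigenvalues $\lambda_i$ of the normalized Laplacian, which is why both operators carry the same frequency information.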

Example from Molecular Dynamics revisited

The dynamics of a small protein in a bath of water molecules is approximated by a Fokker-Planck system of stochastic equations $\dot{x} = -\nabla U(x) + \dot{w}$. Many millions of points in $\mathbb{R}^{36}$ can be generated by simulating the stochastic ODE; $U$ is needed only “on the fly” and only at the current positions (not everywhere in $\mathbb{R}^{36}$).

(Figure: embedding of the set of states of the molecule.)

Then a graph Laplacian on this set of points can be constructed that approximates the Fokker-Planck operator, and the eigenfunctions of this approximation yield a low-dimensional description and parametrization of the set, as well as a subspace onto which the long-term behavior of the system can be faithfully projected.
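The remark that $U$ is needed only at the current position is easy to see in a simulation loop. The following sketch uses the Euler-Maruyama scheme on a hypothetical two-dimensional double-well potential; the potential, the step size and the noise level are all assumptions of this example and have nothing to do with the actual molecular model.

```python
import numpy as np

def grad_U(x):
    """Gradient of the hypothetical potential U(x) = sum_i (x_i^2 - 1)^2."""
    return 4.0 * x * (x ** 2 - 1.0)

def simulate(x0, n_steps, dt, noise_scale, rng):
    """Euler-Maruyama for dx = -grad U(x) dt + noise_scale dW.

    grad_U is evaluated only at the current state, never on a grid.
    """
    x = np.array(x0, dtype=float)
    path = np.empty((n_steps + 1, x.size))
    path[0] = x
    for k in range(n_steps):
        noise = rng.standard_normal(x.size)
        x = x - grad_U(x) * dt + noise_scale * np.sqrt(dt) * noise
        path[k + 1] = x
    return path

rng = np.random.default_rng(0)
path = simulate(x0=[1.0, -1.0], n_steps=5000, dt=1e-3, noise_scale=0.5, rng=rng)
print(path.shape)
```

The rows of `path` play the role of the sampled states; a graph Laplacian would then be built on (a subsample of) these points.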

Graph associated with data sets

A deluge of data: documents, web searching, customer databases, hyper-spectral imagery (satellite, biomedical, etc.), social networks, gene arrays, proteomics data, financial transactions, traffic statistics (road traffic, computer networks)...

Assume we know how to assign local similarities: map the data set to a weighted graph. Global distances are not to be trusted!

Data are often given as points in high dimension, but constraints (natural, physical...) force them to be intrinsically low-dimensional.

Model the data as a weighted graph $(G, E, W)$: vertices represent data points (the correspondence could be stochastic), edges connect similar data points, weights represent a similarity measure. Example: have an edge between web pages connected by a link, or between documents with very similar word frequencies. When points are in high-dimensional Euclidean space, weights may be a function of Euclidean distance and/or the geometry of the points. How to define the similarity between very similar objects in each category is important but not always easy. That is where domain knowledge comes in.

Weights from a local similarity kernel

The similarity between points of a set $X$ can be summarized in a kernel $K(x, y)$ on $X \times X$. Usually we assume the following properties of $K$:

$K(x, y) = K(y, x)$ (symmetry)

$K(x, y) \ge 0$ (positivity-preserving)

$\langle v, Kv \rangle \ge 0$ (positive semi-definite)

If $X \subseteq \mathbb{R}^n$, then choices for $K$ include $e^{-(\|x - y\| / \delta)^2}$, $\frac{\delta}{\delta + \|x - y\|}$, $\frac{\langle x, y \rangle}{\|x\| \, \|y\|}$.

If some “model” for $X$ is available, the kernel can be designed to be consistent with that model.

In several applications, one starts by applying a map to $X$ (projections onto lower-dimensional subspaces, nonlinear maps, etc.) before constructing the kernel.
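For the Gaussian choice, the three listed properties can be verified directly on a sample. This is a minimal sketch on random points; the point cloud, the dimension and the value of `delta` are arbitrary assumptions made for the example.

```python
import numpy as np

def gaussian_kernel(X, delta):
    """K(x, y) = exp(-(||x - y|| / delta)^2) for all pairs of rows of X."""
    sq = np.sum(X ** 2, axis=1)
    dist2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-dist2 / delta ** 2)

rng = np.random.default_rng(0)
X = rng.standard_normal((40, 5))     # 40 hypothetical data points in R^5
K = gaussian_kernel(X, delta=1.5)

is_symmetric = np.allclose(K, K.T)
is_nonnegative = bool((K >= 0).all())
min_eigenvalue = np.linalg.eigvalsh((K + K.T) / 2).min()
print(is_symmetric, is_nonnegative, min_eigenvalue)
```

Restricted to the data set, `K` is exactly the weight matrix $W$ of the graph; in practice one often also truncates small entries to keep the graph sparse, which this sketch does not do.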

A simple example

$X$ is a set of images of a symbol “3D” under various lighting conditions. Each point is a vector in $\mathbb{R}^{32^2}$, the $i$-th coordinate being the intensity of the $i$-th pixel in the image. We use the kernel $e^{-(\|x - y\| / \delta)^2}$, for a reasonable choice of $\delta$, which restricted to $X$ yields the weights on the graph of points. We compute the graph Laplacian and its low-frequency eigenfunctions, and use them to embed the data set in two dimensions. The eigenfunctions actually discover the natural parametrization by the lighting vector.

Picture courtesy of Stephane Lafon
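The same pipeline can be tried on synthetic data where the hidden parameter is known. In the sketch below the images are replaced by points on a circle that has been isometrically lifted into $\mathbb{R}^{50}$, so the hidden parameter is an angle rather than a lighting vector; the sizes, the lift and the value of `delta` are assumptions of this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, ambient_dim, delta = 200, 50, 0.3

# Hidden one-dimensional parameter (a stand-in for the lighting direction).
theta = 2 * np.pi * np.arange(n) / n
circle = np.column_stack([np.cos(theta), np.sin(theta)])

# Isometric lift of the circle into a higher-dimensional ambient space.
Q, _ = np.linalg.qr(rng.standard_normal((ambient_dim, 2)))
X = circle @ Q.T

# Gaussian kernel weights, then the symmetric normalization D^{-1/2} W D^{-1/2}.
sq = np.sum(X ** 2, axis=1)
dist2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
W = np.exp(-dist2 / delta ** 2)
deg = W.sum(axis=1)
S = W / np.sqrt(np.outer(deg, deg))

# Eigenvalues of S near 1 are the low frequencies of the normalized Laplacian I - S.
vals, vecs = np.linalg.eigh(S)
embedding = vecs[:, [-2, -3]]        # skip the top (constant) eigenvector
radius = np.linalg.norm(embedding, axis=1)
print(radius.std() / radius.mean())
```

The two-dimensional embedding traces out a circle of constant radius, so the angle of each embedded point recovers the hidden parameter up to a rotation or reflection.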

Several Applications

Many successful applications of spectral kernel methods. For Laplacian eigenfunctions, the following works in particular:

• Classifiers in the semi-supervised learning context [M. Belkin, P. Niyogi]

• fMRI data [F. Meyer, X. Shen]

• Art data [W. Goetzmann, PW Jones, MM, J. Walden]

• Hyperspectral Imaging in Pathology [MM, GL Davis, F. Warner, F. Geshwind, A. Coppi, R. DeVerse, RR Coifman]

• Molecular dynamics simulations [RR Coifman, G. Hummer, I. Kevrekidis, S. Lafon, MM, B. Nadler]

• Text documents classification [RR Coifman, S. Lafon, A. Lee, B. Nadler; RR Coifman, MM]

Application to Hyper-spectral Pathology

For each pixel in a hyper-spectral image we have a whole spectrum (a 128-dimensional vector, for example). We view the ensemble of all spectra in a hyper-spectral image as a cloud in $\mathbb{R}^{128}$, induce a Laplacian on the point set and use the eigenfunctions for classification of spectra into different classes, which turn out to be biologically distinct and relevant.

On the left, we have mapped the values of the top 3 eigenfunctions to RGB.

Text Document Organization

Consider about 1000 Science News articles, from 8 different categories. For each we compute about 10000 coordinates, the $i$-th coordinate of document $d$ representing the frequency in document $d$ of the $i$-th word in a fixed dictionary. The diffusion map gives the embedding below.

(Figure: two three-dimensional scatter plots of the embedded documents, colored by category: Anthropology, Astronomy, Social Sciences, Earth Sciences, Biology, Mathematics, Medicine, Physics.)

Embedding $\Xi_6^{(0)}(x) = (\xi_1(x), \dots, \xi_6(x))$: on the left coordinates 3, 4, 5, and on the right coordinates 4, 5, 6.

Summary for the “Fourier part”

• it is useful to consider only local similarities between data points;

• it is possible to organize this local information by diffusion;

• parametrizations can be found by looking at the eigenvectors of a diffusion operator (Fourier modes);

• these eigenvectors yield a nonlinear embedding into low-dimensional Euclidean space;

• the eigenvectors can be used for global Fourier analysis on the set/manifold.

Problem: Either very local information or very global information: in many problems the intermediate scales are very interesting! Would like multiscale information!

Possibility 1: proceed bottom-up: repeatedly cluster together in a multi-scale fashion, in a way that is faithful to the operator: diffusion wavelets.

Possibility 2: proceed top-down: cut greedily according to global information, and repeat the procedure on the pieces: recursive partitioning, local cosines...

Possibility 3: do both!

Multiscale Analysis, I

We would like to construct multiscale bases, generalizing classical wavelets, on manifolds, graphs, point clouds.

The classical construction is based on geometric transformations (such as dilations, translations) of the space, transformed into actions (e.g. via representations) on functions. There are plenty of such transformations on $\mathbb{R}^n$, certain classes of Lie groups and homogeneous spaces (with automorphisms that resemble “anisotropic dilations”), and manifolds with large groups of transformations.

Here the space is in general highly non-symmetric, not invariant under “natural” geometric transformations, and moreover it is “noisy”.

Idea: use diffusion and the heat kernel as dilations, acting on functions on the space, to generate multiple scales.

This is connected with the work on diffusion or Markov semigroups, and the Littlewood-Paley theory of such semigroups (à la Stein).

We would like to have constructive methods for efficiently computing the multiscale decompositions and the wavelet bases.

Multiscale Analysis, II

Suppose for simplicity we have a weighted graph $(G, E, W)$, with corresponding Laplacian $L$ and random walk $P$. Let us renormalize $P$, if necessary, so that it has norm 1 as an operator on $L^2$: let $T$ be this operator. Assume for simplicity that $T$ is self-adjoint, and that high powers of $T$ are low-rank: $T$ is a diffusion, so the range of $T^t$ is spanned by smooth functions of increasingly (in $t$) smaller gradient.

A “typical” spectrum for the powers of T would look like this:

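(Figure: the spectra $\sigma(T)$, $\sigma(T^3)$, $\sigma(T^7)$, $\sigma(T^{15})$ plotted against the eigenvalue index, from 0 to 30, with values between 0 and 1; the threshold $\varepsilon$ and the subspaces $V_1, V_2, V_3, V_4, \dots$ are marked.)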

Classical Multi-Resolution Analysis

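A Multi-Resolution Analysis for $L^2(\mathbb{R})$ is a sequence of subspaces $\{V_j\}_{j \in \mathbb{Z}}$, $V_{j-1} \subseteq V_j$, with $\cap V_j = \{0\}$, $\overline{\cup V_j} = L^2(\mathbb{R})$, with an orthonormal basis $\{\varphi_{j,k}\}_{k \in \mathbb{Z}} := \{2^{j/2} \varphi(2^j \cdot - k)\}_{k \in \mathbb{Z}}$ for $V_j$. Then there exists $\psi$ such that $\{\psi_{j,k}\}_{k \in \mathbb{Z}} := \{2^{j/2} \psi(2^j \cdot - k)\}_{k \in \mathbb{Z}}$ spans $W_j$, the orthogonal complement of $V_{j-1}$ in $V_j$.

In frequency, $\varphi_{j,k}$ is essentially supported in $\{|\xi| \le 2^j\}$, and $\psi_{j,k}$ is essentially supported in the Littlewood-Paley annulus $2^{j-1} \le |\xi| \le 2^j$.

Because $V_{j-1} \subseteq V_j$, $\varphi_{j-1,0} = \sum_{k'} \alpha_{k'} \varphi_{j,k'}$: refinement equations, fast wavelet transform (FWT).

We would like to generalize this construction to graphs. The frequency domain is the spectrum of $e^{-L}$. Let $V_j := \langle \{\phi_i : \lambda_i^{2^j} \ge \varepsilon\} \rangle$ (with $\lambda_i$ here denoting the eigenvalues of the diffusion operator). We would like an o.n. basis of well-localized functions for $V_j$, and to derive refinement equations and downsampling rules in this context.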
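To get a feeling for how quickly the spaces $V_j$ shrink, one can count the eigenvalues that survive the threshold at each scale. The sketch below does this for a toy operator, a lazy symmetric random walk on a cycle; the graph, the laziness (used only to keep the spectrum in $[0, 1]$) and the value of `eps` are assumptions of this example.

```python
import numpy as np

def lazy_walk_on_cycle(n):
    """Symmetric lazy random walk T = (I + W/2) / 2 on a cycle with n vertices."""
    W = np.zeros((n, n))
    i = np.arange(n)
    W[i, (i + 1) % n] = 1.0
    W[(i + 1) % n, i] = 1.0
    return 0.5 * (np.eye(n) + W / 2.0)

def subspace_dimensions(T, eps, n_levels):
    """dim V_j = number of eigenvalues lam_i of T with lam_i^(2^j) >= eps."""
    lam = np.linalg.eigvalsh(T)
    return [int(np.sum(lam ** (2 ** j) >= eps)) for j in range(n_levels)]

T = lazy_walk_on_cycle(64)
dims = subspace_dimensions(T, eps=1e-6, n_levels=8)
print(dims)
```

The dimensions can only decrease with $j$: larger dyadic powers of $T$ retain fewer and fewer low-frequency eigenfunctions above the precision $\varepsilon$.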

Construction of Diffusion Wavelets

Diffusion Wavelets on the Sphere

Some diffusion wavelets and wavelet packets on the sphere, sampled randomly uniformly at 2000 points.

Signal Processing on Manifolds

From left to right: function $F$; reconstruction of the function $F$ with top 50 best basis packets; reconstruction with top 200 eigenfunctions of the Laplace-Beltrami operator.

(Figure: three plots of coefficients, with legend entries “Best Basis” and “Top Laplace eigenfunctions”.)

Left to right: 50 top coefficients of $F$ in its best diffusion wavelet basis; distribution of the coefficients of $F$ in the delta basis; first 200 coefficients of $F$ in the best basis and in the basis of eigenfunctions.

Diffusion Wavelets on Dumbbell manifold

Summary of the Algorithm

Input: A diffusion operator represented on some orthonormal basis (e.g. $\delta$-functions), and a precision $\varepsilon$.

Output: Multiscale orthonormal scaling function bases $\Phi_j$ and wavelet bases $\Psi_j$, encoded through the corresponding multiscale filters $M_j$, as well as $T^{2^j}$ represented (compressed) on $\Phi_j$.

One can prove that $\Psi_j$ is like a Littlewood-Paley block in frequency. Currently working out the implications for the characterization of function spaces and the corresponding approximation properties.

Can also construct wavelet packets for a more flexible space-frequency analysis.

Allows for a fast wavelet transform, best basis algorithms, signal processing on graphs and manifolds, efficient application of $T^{2^j}$, and direct inversion of the Laplacian (next slide).
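The input/output contract above can be illustrated with a heavily simplified dense sketch. It is not the construction from the talk: it uses a plain pivoted Gram-Schmidt on dense matrices as a stand-in for the orthonormalization step, ignores sparsity and localization, and computes only the scaling function bases $\Phi_j$ and the compressed powers, not the wavelets $\Psi_j$. The function names and the toy operator are hypothetical.

```python
import numpy as np

def pivoted_gram_schmidt(A, eps):
    """Orthonormalize columns of A, always picking the largest remaining column,
    and stop once every residual column has norm below eps."""
    R = np.array(A, dtype=float)
    basis = []
    for _ in range(min(R.shape)):
        norms = np.linalg.norm(R, axis=0)
        p = int(np.argmax(norms))
        if norms[p] < eps:
            break
        q = R[:, p] / norms[p]
        basis.append(q)
        R -= np.outer(q, q @ R)          # remove the new direction from all columns
    return np.column_stack(basis)

def diffusion_scaling_bases(T, eps, n_levels):
    """Return bases Phi_j (in delta coordinates) and T^(2^j) compressed on Phi_j."""
    T_j = np.array(T, dtype=float)       # T^(2^0) on Phi_0 = delta functions
    Phi = np.eye(T_j.shape[0])
    bases, operators = [Phi], [T_j]
    for _ in range(n_levels):
        Q = pivoted_gram_schmidt(T_j, eps)   # Phi_{j+1} written on Phi_j
        T_j = Q.T @ T_j @ T_j @ Q            # T^(2^(j+1)) on Phi_{j+1}
        Phi = Phi @ Q
        bases.append(Phi)
        operators.append(T_j)
    return bases, operators

# Toy diffusion: lazy symmetric random walk on a cycle with 64 vertices.
n = 64
i = np.arange(n)
W = np.zeros((n, n))
W[i, (i + 1) % n] = 1.0
W[(i + 1) % n, i] = 1.0
T = 0.5 * (np.eye(n) + W / 2.0)

bases, operators = diffusion_scaling_bases(T, eps=1e-6, n_levels=4)
print([Phi.shape[1] for Phi in bases])
```

Each `bases[j]` has orthonormal columns spanning (to precision `eps`) the range of the corresponding power of $T$, and `operators[j]` is the small matrix representing $T^{2^j}$ on that basis, so the matrices shrink as the scale increases.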

Potential Theory, Efficient Direct Solvers

The Laplacian $L = I - T$ has an inverse (on $\ker(L)^\perp$) whose kernel is the Green’s function, which, if known, would allow the solution of the Dirichlet or Neumann problem (depending on the boundary conditions imposed on the problem on $L$). If $\|T\| < 1$, one can write the Neumann series

\[ (I - T)^{-1} f = \sum_{k=0}^{\infty} T^k f = \prod_{k=0}^{\infty} \left( I + T^{2^k} \right) f . \]

Since we have compressed all the dyadic powers $T^{2^k}$, we have also computed the Green’s operator in compressed form, in the sense that the product above can be applied directly to any function $f$ (or, rather, its diffusion wavelet transform). Hence this is a direct solver, and potentially offers great advantages over iterative solvers, especially for computations with high precision, or many computations on a common geometry.
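The product formula is easy to check numerically with dense matrices. The sketch below applies the first few factors $(I + T^{2^k})$ by repeated squaring and compares the result with a direct linear solve; the random symmetric $T$, rescaled to norm $0.9$, is an assumption of this example, and no compression is involved.

```python
import numpy as np

def apply_green_operator(T, f, n_factors):
    """Apply prod_{k=0}^{n_factors-1} (I + T^(2^k)) to f, squaring T at each step."""
    g = np.array(f, dtype=float)
    power = np.array(T, dtype=float)     # holds T^(2^k)
    for _ in range(n_factors):
        g = g + power @ g                # multiply by (I + T^(2^k))
        power = power @ power            # T^(2^k) -> T^(2^(k+1))
    return g

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 20))
S = (A + A.T) / 2.0
T = 0.9 * S / np.linalg.norm(S, 2)       # symmetric, ||T|| = 0.9 < 1
f = rng.standard_normal(20)

u_product = apply_green_operator(T, f, n_factors=10)
u_direct = np.linalg.solve(np.eye(20) - T, f)
print(np.max(np.abs(u_product - u_direct)))
```

With $K$ factors the product equals the partial sum $\sum_{m=0}^{2^K - 1} T^m$, so the number of terms doubles with every factor; ten factors already reproduce the inverse to machine precision here.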

Text Document Organization, revisited

(Figure: scaling functions $\phi_{3,2}$, $\phi_{3,4}$, $\phi_{3,5}$, $\phi_{3,10}$, $\phi_{3,15}$; $\phi_{4,2}$, $\phi_{4,3}$, $\phi_{4,4}$, $\phi_{4,5}$; $\phi_{5,2}$, $\phi_{5,3}$, $\phi_{5,4}$, $\phi_{5,5}$; $\phi_{7,3}$, $\phi_{7,4}$.)

Scaling functions at different scales represented on the set embedded in $\mathbb{R}^3$ via $(\xi_3(x), \xi_4(x), \xi_5(x))$. $\phi_{3,4}$ is about Mathematics, but in particular applications to networks, encryption and number theory; $\phi_{3,10}$ is about Astronomy, but in particular papers in X-ray cosmology, black holes, galaxies; $\phi_{3,15}$ is about Earth Sciences, but in particular earthquakes; $\phi_{3,5}$ is about Biology and Anthropology, but in particular about dinosaurs; $\phi_{3,2}$ is about Science and talent awards, inventions and science competitions.

Nonlinear Analysis of Images

[Figure: two grids of scaling functions displayed as image patches (axis ticks 10, 20, 30), labeled "Scale 2" and "Scale 3".]

Scaling functions on the graph of patches extracted from a noisy image of a filled white circle on a black background.
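One way such a graph of patches could be built is sketched below. The image size, patch size, stride, Gaussian kernel and the value of sigma are assumptions chosen for the example, not parameters taken from the talk. Each patch becomes a vertex, edge weights decay with the squared distance between patches, and row normalization turns the weights into a diffusion (Markov) operator T on the patches.

```python
import math
import random


def disc_image(size=16, radius=5.0, noise=0.1, seed=0):
    """Filled white disc (1.0) on a black background (0.0) plus Gaussian noise."""
    rng = random.Random(seed)
    c = (size - 1) / 2.0
    return [[(1.0 if (r - c) ** 2 + (q - c) ** 2 <= radius ** 2 else 0.0)
             + rng.gauss(0.0, noise)
             for q in range(size)] for r in range(size)]


def extract_patches(img, p=4, stride=4):
    """Flatten every p x p patch (taken with the given stride) into a vector."""
    n = len(img)
    patches = []
    for r in range(0, n - p + 1, stride):
        for q in range(0, n - p + 1, stride):
            patches.append([img[r + i][q + j] for i in range(p) for j in range(p)])
    return patches


def diffusion_operator(points, sigma):
    """Row-stochastic matrix built from Gaussian affinities between points."""
    W = [[math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / sigma)
          for y in points] for x in points]
    return [[w / sum(row) for w in row] for row in W]


image = disc_image()
patches = extract_patches(image)             # vertices of the patch graph
T = diffusion_operator(patches, sigma=4.0)   # one step of diffusion on patches
```

Scaling functions and wavelets would then be constructed from powers of T; that step is not shown here.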


Applications to Markov Decision Processes

The state spaces of MDPs are often varifolds or graphs; it is crucial to represent certain functions (value function = potential of certain rewards) and operators (large-time expectation operators ∼ Green’s operators) efficiently. Excellent results have been obtained so far [S. Mahadevan, MM] with eigenfunctions of certain Laplacians and diffusion wavelets.

Eigenfunctions of the Laplacian in a three-room environment (left); diffusion wavelets in a two-room environment (right).
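To make the link between value functions and Green's operators concrete, here is a minimal sketch for a fixed policy. It assumes a discounted setting with discount factor gamma, transition matrix P and reward vector R, so that the value function is V = R + gamma P R + gamma^2 P^2 R + ... = (I - gamma P)^(-1) R. The five-state chain, the reward placement and all function names are hypothetical stand-ins for the room environments in the figure.

```python
def random_walk_chain(n):
    """Random walk on a path with n states; a blocked move at an end stays put."""
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        P[i][max(i - 1, 0)] += 0.5
        P[i][min(i + 1, n - 1)] += 0.5
    return P


def apply_operator(P, v):
    """Matrix-vector product P v."""
    return [sum(p * x for p, x in zip(row, v)) for row in P]


def evaluate_policy(P, R, gamma, tol=1e-12, max_iter=10000):
    """Sum the Neumann series of (I - gamma P)^(-1) applied to R."""
    V = list(R)
    term = list(R)  # holds (gamma P)^k R at step k
    for _ in range(max_iter):
        term = [gamma * x for x in apply_operator(P, term)]
        V = [v + t for v, t in zip(V, term)]
        if max(abs(t) for t in term) < tol:
            break
    return V


gamma = 0.9                      # assumed discount factor
P = random_walk_chain(5)
R = [0.0, 0.0, 0.0, 0.0, 1.0]    # reward only in the last state
V = evaluate_policy(P, R, gamma)
```

The loop above is an iterative solver; a compressed representation of the Green's operator would instead apply the inverse of I - gamma P to R directly.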


Current & Future work

Understand the complex relationships between the geometric measure theory aspects of point clouds in high dimensions and related multiscale geometric algorithms, the construction (or estimation) of the heat kernel, the multiscale functional construction of diffusion wavelets, and potential theory aspects.

Small steps at a time: I am working on better constructions of diffusion wavelets (from the computational perspective); top-bottom constructions; robustness of these constructions to noise; approximation theory and characterization of function spaces through diffusion wavelets.

Applications: signal processing on manifolds and graphs and its applications (e.g. linear and nonlinear “image” denoising); classification algorithms (e.g. text classification, protein and gene functional classification, target recognition in hyper-spectral imaging); learning; signal processing on the Earth; application to Markov decision processes; mesh compression.


Collaborators

• R.R. Coifman, P.W. Jones (Yale Math) [Diffusion geometry; Diffusion wavelets; Uniformization via eigenfunctions], S.W. Zucker (Yale CS) [Diffusion geometry];

• G.L. Davis, F.J. Warner (Yale Math), F.B. Geshwind, A. Coppi, R. DeVerse (Plain Sight Systems) [Hyperspectral Pathology];

• S. Mahadevan (U.Mass CS) [Markov decision processes];

• Y. Kevrekidis (Princeton Eng.), S. Lafon (Google), B. Nadler (Weizmann) [stochastic dynamics];

• A.D. Szlam (Yale) [Diffusion wavelet packets, top-bottom multiscale analysis, linear and nonlinear image denoising, classification algorithms based on diffusion];

• J.C. Bremer (Yale) [Diffusion wavelet packets, biorthogonal diffusion wavelets];

• R. Schul (UCLA) [Uniformization via eigenfunctions; nonhomogeneous Brownian motion];

• H. Mhaskar (LA State) [polynomial frames of diffusion wavelets, characterization of function spaces];

• M. Mahoney (Yahoo Research), F. Meyer (UC Boulder), X. Shen (UC Boulder) [Randomized algorithms for hyper-spectral and fMRI imaging];

• W. Goetzmann (Yale, Harvard Business School), J. Walden (Berkeley Business School), P.W. Jones (Yale Math) [Applications to finance].

Talks, papers, Matlab code (currently working on a Matlab toolbox) available at www.math.yale.edu/~mmm82

Thank you, and Happy Holidays!