46
Parametrization of Datasets with Low-distortion Embeddings François Meyer Center for the Study of Brain, Mind and Behavior, Program in Applied and Computational Mathematics Princeton University [email protected] http://www.princeton.edu/fmeyer Random Shapes, Tutorials, IPAM 2007 François Meyer (Princeton) March 14, 2007 1 / 41

Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

  • Upload
    others

  • View
    31

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Parametrization of Datasets withLow-distortion Embeddings

François Meyer

Center for the Study of Brain, Mind and Behavior,Program in Applied and Computational Mathematics

Princeton University

[email protected]://www.princeton.edu/∼fmeyer

Random Shapes, Tutorials, IPAM 2007

François Meyer (Princeton) March 14, 2007 1 / 41

Page 2: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Acknowledgments

R.R. Coifman, S. Lafon, M. Maggioni

M. Ramírez-Vélez, D.S. Barth

National Institutes of Health

IPAM MGA program

IPAM Graduate Summer School: Intelligent Extraction ofInformation from Graphs and High Dimensional Data

François Meyer (Princeton) March 14, 2007 2 / 41

Page 3: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Outline

1 IntroductionAssumptions about the dataOur definition of the problem

2 Constructing a new parametrizationA random walk on the datasetA new way to measure distances

3 The spectral connectionSpectral graph theoryFrom commute time to spectral geometry

4 Classification of EEG recordings

François Meyer (Princeton) March 14, 2007 3 / 41

Page 4: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Introduction

Explosion of high-dimensional datasets:web, biology, medicine, etc.New tools for data exploration and analysis address issues:

I signal of interest: complicated geometryI data corrupted by noise,I algorithm complexity

François Meyer (Princeton) March 14, 2007 4 / 41

Page 5: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Example 1: Neuroimaging

dogma: each brain region is responsible for a specific function

goal: delineation of functional anatomy in terms of spatialand temporal organizationmethod:

I very simple cognitive or sensory input stimulusI measure the output signal xi at each voxel i inside the brainI detect significant changes in the signal

François Meyer (Princeton) March 14, 2007 5 / 41

Page 6: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Example 1: fMRI of Natural Stimuli

challenge: study the response to complex stimuli (“real life”)

example: subject watches a movie in the MRI scanner

discover neuronal networks involved in complex tasks

how is the analysis performed ?

size of the problem: 200,000 time series in R1000

François Meyer (Princeton) March 14, 2007 6 / 41

Page 7: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Example 2: Prediction of seizure from EEG

Electroencephalogram: electrical recordings on the scalp

seizure → time-frequency changes in the signal

goal: predict the seizure before the onset

best existing method: brain = nonlinear dynamical system

neuronal synchrony: fewer independent variables needed todescribe the EEG recordings ?

Is the brain during a seizure a low dimensional dynamical system ?

size of the problem: 100,000 brain states in R100

François Meyer (Princeton) March 14, 2007 7 / 41

Page 8: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Outline

1 IntroductionAssumptions about the dataOur definition of the problem

2 Constructing a new parametrizationA random walk on the datasetA new way to measure distances

3 The spectral connectionSpectral graph theoryFrom commute time to spectral geometry

4 Classification of EEG recordings

François Meyer (Princeton) March 14, 2007 8 / 41

Page 9: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Assumptions about the data

dataset: large number of internal microscopic variables→ many degrees of freedom

at a macroscopic scale: many variables are coupled→ set of all possible configurations for the signals islow dimensional

signals varies smoothly as a function of “hidden” variables→ well defined low dimensional structure

François Meyer (Princeton) March 14, 2007 9 / 41

Page 10: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Outline

1 IntroductionAssumptions about the dataOur definition of the problem

2 Constructing a new parametrizationA random walk on the datasetA new way to measure distances

3 The spectral connectionSpectral graph theoryFrom commute time to spectral geometry

4 Classification of EEG recordings

François Meyer (Princeton) March 14, 2007 10 / 41

Page 11: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Our definition of the problem

1 T ×N dataset X =[x0|x1| · · · |xN−1

]2 goal: construction of a new parameterizationφ : RT → RK , with K � T ,

xi 7→ φ(xi ),

3 similar signals are mapped to the same region of the atlas:I ‖φ(xi ) − φ(xj )‖ ≈ ‖xi − xj ‖

François Meyer (Princeton) March 14, 2007 11 / 41

Page 12: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Outline

1 IntroductionAssumptions about the dataOur definition of the problem

2 Constructing a new parametrizationA random walk on the datasetA new way to measure distances

3 The spectral connectionSpectral graph theoryFrom commute time to spectral geometry

4 Classification of EEG recordings

François Meyer (Princeton) March 14, 2007 12 / 41

Page 13: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Construction of the graph

replace the dataset by a graph G,

vertex i of the graph = xi

edges: k nearest neighbors according to‖xi − xj ‖ = (

∑Tt=1(xi (t) − xj (t))2)1/2

weight wi ,j on the edge {i , j }: proximity between i and j ,

for instance,

wi ,j =

e−‖xi − xj ‖2/σ2, if xi is connected to xj ,

0 otherwise.

François Meyer (Princeton) March 14, 2007 13 / 41

Page 14: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

From the dataset to the graph

time

sign

altime

sign

al

time

sign

al

��

��

��

��

��������������

������

��������

��

��

��

��

��

��

��

��

��

��

��

��

��

��

��

��

��

��

������

������

������

������

��� ���

��� ���

������

������

������

������

��� ���

��� ���

��� ���

������

������

������

������

��

��

��

��

��

��

������

������

������

������

��������

��

��

��

��

��������

��� ���

������

������

��� ���

��

��

��

��

��������

��������

��

��

��

��

������

������

François Meyer (Princeton) March 14, 2007 14 / 41

Page 15: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

From the dataset to the graph

François Meyer (Princeton) March 14, 2007 15 / 41

Page 16: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

A random walk on the graph

weighted graph G, matrix W, Wi ,j = wi ,j

random walk on the graph with transition probability P,

Pi ,j = wi ,j /di ,

di =∑

j wi ,j : degree of the vertex i , D diagonal matrix,

Dii = di =∑j

Wi ,j . (1)

π = 1∑i,j wi,j

[d1, d2, · · · , dN ] stationary distribution

François Meyer (Princeton) March 14, 2007 16 / 41

Page 17: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Outline

1 IntroductionAssumptions about the dataOur definition of the problem

2 Constructing a new parametrizationA random walk on the datasetA new way to measure distances

3 The spectral connectionSpectral graph theoryFrom commute time to spectral geometry

4 Classification of EEG recordings

François Meyer (Princeton) March 14, 2007 17 / 41

Page 18: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

A good distance on the graph

similarity measure between any two vertices i and j

distinguish between strongly connected vertices andweakly connected vertices

solution: average commute time, κ(i , j ) = H (j , i) + H (i , j )

symmetric version of the average hitting time from i to j ,

H (i , j ) = Ei [Tj ] with Tj = min{n > 0; Zn = j }.

Ei : random walk is started at iκ is a distance:

1 κ(i , j ) = 0 =⇒ i = j ,2 κ(i , j ) 6 κ(i , k) + κ(k , j ),3 κ(i , j ) = κ(j , i).

François Meyer (Princeton) March 14, 2007 18 / 41

Page 19: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Commute time in Paris

D E F G HCBA

D E F G HCBA

4

3

2

1

5

6

7

8

4

3

2

1

5

6

7

8

Funiculaire deMontmartre

Tarif ication spéciale

RER: au delà de cette limite, en direction de la banlieue, la tarification dépend de la distance. Les tickets Métro Urbain ne sont pas valables.

Correspondances

Fin de lignes en correspondance

Correspondance Métro–RER ou Métro–SNCF avec trajet par la voie publique

Pôle d’échange multimodal, métro, RER, tramway

Légende

Meudonsur-Seine

Parcde St-Cloud

Les Coteaux

Les Milons

SuresnesLongchamp

Belvédère

Puteaux

Muséede Sèvres

Brimborion LesMoulineaux

Hôtel de Villede La Courneuve

HôpitalDelafontaine Cosmonautes

La Courneuve6 Routes

Cimetièrede St-Denis

Marchéde St-Denis

Basilique deSt-Denis

ThéâtreGérard Philipe

Hôtel de Villede Bobigny

La Ferme

Libération

Escadrille Normandie–Niémen

Auguste Delaune

Pontde Bondy

PetitNoisy

Jean Rostand

Gaston Roulaud

Hôpital Avicenne

La Courneuve–8 Mai 1945

Maurice Lachâtre

Danton

Stade Géo André

Drancy–Avenir

T

ram

wa

y –

ta

rifi

cati

on

bu

s

Tra

mw

ay

– t

ari

fica

tio

n b

us

T

ram

w

ay –

tari

f ica

tion b

us

Tramway –

t

arif i

ca

tion b

us Jacques-HenriLartigue

Tramway - tari f ication bus

Tramw

ay - tarif ication bus

Poternedes Peupliers

StadeCharléty

Montsouris

JeanMoulin

DidotBrancion

Desnouettes

GeorgesBrassens

ColonelFabien

Stade de FranceSaint-Denis

Saint-DenisPorte de Paris

Basiliquede St-Denis

Portede Saint-Ouen

LamarckCaulaincourt

JulesJoffrin

Saint-Denis

GuyMôquet

Porte de Clichy

ChâteletLes Halles

Stalingrad

MarcadetPoissonniers

Hôtel de Ville

Arts etMétiers

ChâteauRouge

StrasbourgSaint-Denis

Saint-Mandé

Maisons-AlfortAlfortville

Hoche

MarxDormoy

La CourneuveAubervilliers

Le Bourget

Jaurès

Placedes FêtesBelleville

Jourdain Télégraphe

Ourcq Portede Pantin

JacquesBonsergent Saint-Fargeau

Pelleport

Portede Bagnolet

Danube

BotzarisButtes

Chaumont

République

Parmentier

SèvresBabylone

RichelieuDrouot

Palais RoyalMusée du

Louvre

RéaumurSébastopol

Raspail

Pyramides

MontparnasseBienvenüe

Mabillon

ClunyLa Sorbonne

MalakoffRue Étienne Dolet

Chausséed’Antin

La Fayette

BonneNouvelle

GrandsBoulevards

Bourse Sentier

BarbèsRochechouart La Chapelle

Pigalle

Poissonnière

Liège

Notre-Damede-Lorette

Saint-Georges

Abbesses

Anvers

La Fourche

Placede Clichy

Pereire–Levallois Blanche

Villiers

Opéra

Auber

HavreCaumartin

Trinitéd’Estienne

d’Orves

FranklinD. Roosevelt

Neuilly–Porte Maillot

Avenue Foch

Bir-Hakeim

Invalides

Pasteur

Pontde l’Alma

JavelAndréCitroën

Javel

MichelAnge

Molitor

La Muette

MichelAnge

Auteuil

BoulainvilliersChamp de MarsTour Eiffel

ChampsÉlysées

Clemenceau

TrocadéroAvenue

Henri Martin

Avenuedu Pdt Kennedy

Commerce

Félix Faure

Ruede la Pompe Iéna

Ranelagh

Jasmin

Exelmans

ChardonLagache

Églised’Auteuil

Duroc

La MottePicquetGrenelle

Assemblée Nationale

Varenne

St-MichelNotre-DameSolférino

Musée d’Orsay

Concorde Les Halles

Jussieu

Odéon

ÉtienneMarcel

Vavin

SaintGermaindes-Prés

Saint-Sulpice

St-Placide

PlaceMonge

Pont NeufTuileries

Notre-Damedes-Champs

Luxembourg

Port-Royal

LouvreRivoli

St-Michel

Bastille

Daumesnil

Reuilly–Diderot

PèreLachaise

Oberkampf

Quai dela Rapée

SaintMarcel Bercy

Fillesdu Calvaire

Chemin Vert

St-SébastienFroissart

Ruedes Boulets

Charonne

SèvresLecourbe

Cambronne

Pont Marie

SullyMorland

PhilippeAuguste

AlexandreDumas

Avron

Voltaire

QuatreSeptembre

Saint-Ambroise

RueSaint-Maur

Porte deVincennes

Vincennes

Maraîchers

Porte de Montreuil

Robespierre

Croix de Chavaux

Buzenval

Pyrénées

Laumière

Châteaud’Eau

Porte Dorée

Porte de Charenton

Ivrysur-Seine

Portede Choisy

Ported’Italie

Ported’Ivry

Créteil–Université

Créteil–L’Échat

Maisons-AlfortLes Juilliottes

Maisons-Alfort–Stade

DenfertRochereau

MaisonBlanche

CensierDaubenton

Tolbiac

Le KremlinBicêtre

VillejuifLéo Lagrange

VillejuifPaul Vaillant-Couturier

Glacière

Corvisart

Nationale

Chevaleret

CitéUniversitaire

Gentilly

OrlyOuestAntony

MalakoffPlateau de Vanves

MoutonDuvernet

Gaîté

EdgarQuinet

Bd Victor

Issy

Meudon–Val-Fleury

Chaville–Vélizy

Portede St-Cloud

Pantin

Magenta

LaplaceVitrysur-Seine

Les Ardoines

Le Vertde Maisons

Saint-Ouen

Les Grésillons

Ledru-Rollin

Ménilmontant

Couronnes

Montgallet

Michel Bizot

Charenton–Écoles

Liberté

FaidherbeChaligny

École Vétérinairede Maisons-Alfort

St-Paul

MaubertMutualité

CardinalLemoine

Temple

ChâteauLandon

Bolivar

Église de Pantin

Bobigny–PantinRaymond Queneau

Riquet

Crimée

Corentin Cariou

Porte de la Villette

Aubervilliers–PantinQuatre Chemins

Fortd’Aubervilliers

LesGobelins

CampoFormio

Quaide la Gare

CourSt-Émilion

Pierre Curie

Dugommier

Bel-Air

Picpus

Saint-Jacques

Dupleix

Passy

Alésia

Pernety

Plaisance

Porte de Vanves

Convention

Vaugirard

Porte de Versailles

Corentin Celton

Volontaires

Falguière

Lourmel

Boucicaut

Ségur

Ruedu Bac

Rennes

SaintFrançoisXavier

Vaneau

ÉcoleMilitaire

La TourMaubourg

AvenueÉmile Zola

CharlesMichels

MirabeauPorte

d’AuteuilBoulogne

Jean Jaurès

Billancourt

Marcel Sembat

Cité

AlmaMarceau

Boissière

KléberGeorge V

Argentine

Victor Hugo

Les Sablons

Porte Maillot

Pont de Neuilly

Esplanadede La Défense

Saint-Philippedu-Roule

Miromesnil

Saint-Augustin

Courcelles

Ternes

Monceau

RomeMalesherbes

Wagram

Porte de Champerret

Anatole France

Louise Michel

Europe

Le Peletier

Cadet

Pereire

Brochant

Mairiede Clichy

Garibaldi

Mairie de Saint-Ouen

CarrefourPleyel

Simplon

Bérault

RichardLenoir

Goncourt

BréguetSabin

Rambuteau

La PlaineStade de France

Arcueil–Cachan

Bourg-la-Reine

Bagneux

Madeleine

Portede Clignancourt

Garede Saint-Denis

Orry-la-Ville–Coye

Garede l’Est

Garede Lyon

Saint-Denis–Université

Portede la Chapelle

La Courneuve8 Mai 1945

AéroportCharles de Gaulle

Mitry–Claye

BobignyPablo Picasso

PréSt-Gervais Mairie des Lilas

Mairiede Montreuil

Gallieni

LouisBlanc

Porte des Lilas

Gambetta

Gare du Nord

Charlesde Gaulle

Étoile

Pont de LevalloisBécon

PorteDauphine

St-Germainen-Laye La Défense

BoulognePont de St-Cloud

Châtelet

Gared’Austerlitz

Nation

Châteaude Vincennes

Créteil–Préfecture

Olympiades

Mairie d’Ivry

Malesherbes Melun

Massy–PalaiseauVersailles–ChantiersRobinson

DourdanSaint-Martin-d’Étampes

Saint-Rémylès-Chevreuse

Placed’Italie

Villejuif–Louis Aragon

OrlySud

Mairie d’Issy

Châtillon–Montrouge

Pont de Sèvres IssyVal de Seine

Gare Saint-Lazare

GareMontparnasse

ChellesGournay

Noisy-le-Sec

HaussmannSaint-Lazare

Gabriel PériAsnières–Gennevilliers

Saint-Quentin-en-Yvelines

Ported’Orléans

Balard

Versailles–Rive Gauche

Marne-la-Vallée

Boissy-Saint-Léger

Pontoise

Poissy

Cergy

Saint-Lazare

Tournan

Pontdu Garigliano

BibliothèqueFrançois Mitterrand

Avril 2007

Parcs Disneyland

CDG

Château de Versailles

Grande Arche

Orly

Paris

www.ratp.fr

Pro

prié

té d

e la

RAT

P -

Age

nce

Car

togr

aph

iqu

e - P

M1

07.2

006-

015

-C.C

C -

Des

ign

: bdc

con

seil

01 5

3 02

02

20 -

Rep

rodu

ctio

n in

terd

ite

François Meyer (Princeton) March 14, 2007 19 / 41

Page 20: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

How does κ(i , j ) compare to δ(i , j ) ?

κ(i , j ) can be compared to the standard distance δ on the graph

TheoremIf i and j are at a distance δ(i , j ) on the graph, then

2δ(i , j ) 6 κ(i , j ) 6 Cδ(i , j ),

where C = maxi ,j1

πiPi,j=

∑i,j wi,j

mini,j wi,j

Markov chain is reversible, πiPi ,j = πjPj ,i

C can be large

François Meyer (Princeton) March 14, 2007 20 / 41

Page 21: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Your worst commute time in L.A.: Sepulveda Blvd or 405 ?

François Meyer (Princeton) March 14, 2007 21 / 41

Page 22: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Maximum commute time: lost in the city...

among all graphs with N vertices,what is the graph with the largest κ(i , j ) ?

lollipop graph: path with (N − 1)/3 vertices,complete subgraph with (2N + 1)/3 vertices

(N−1)/3

i

j

(2N+1)/3

κ(i , j ) = 427N

3 + O(N )

δ(i , j ) = 23N , C = 2N (2N + 1)/18

[Jonasson, 2000]

François Meyer (Princeton) March 14, 2007 22 / 41

Page 23: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Maximum commute time: lost in the city...

among all graphs with N vertices,what is the graph with the largest κ(i , j ) ?

lollipop graph: path with (N − 1)/3 vertices,complete subgraph with (2N + 1)/3 vertices

(N−1)/3

i

j

(2N+1)/3

κ(i , j ) = 427N

3 + O(N )

δ(i , j ) = 23N , C = 2N (2N + 1)/18

[Jonasson, 2000]

François Meyer (Princeton) March 14, 2007 22 / 41

Page 24: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Maximum commute time: lost in the city...

among all graphs with N vertices,what is the graph with the largest κ(i , j ) ?

lollipop graph: path with (N − 1)/3 vertices,complete subgraph with (2N + 1)/3 vertices

(N−1)/3

i

j

(2N+1)/3

κ(i , j ) = 427N

3 + O(N )

δ(i , j ) = 23N , C = 2N (2N + 1)/18

[Jonasson, 2000]

François Meyer (Princeton) March 14, 2007 22 / 41

Page 25: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Maximum commute time: lost in the city...

among all graphs with N vertices,what is the graph with the largest κ(i , j ) ?

lollipop graph: path with (N − 1)/3 vertices,complete subgraph with (2N + 1)/3 vertices

(N−1)/3

i

j

(2N+1)/3

κ(i , j ) = 427N

3 + O(N )

δ(i , j ) = 23N , C = 2N (2N + 1)/18

[Jonasson, 2000]

François Meyer (Princeton) March 14, 2007 22 / 41

Page 26: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Outline

1 IntroductionAssumptions about the dataOur definition of the problem

2 Constructing a new parametrizationA random walk on the datasetA new way to measure distances

3 The spectral connectionSpectral graph theoryFrom commute time to spectral geometry

4 Classification of EEG recordings

François Meyer (Princeton) March 14, 2007 23 / 41

Page 27: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

The spectral connection

Fundamental matrix Z = (I − (P − Π))−1 = I +∑

k>1 Pk − Π

with ΠT = [π1| · · · |πN ]

Z is the Green function of the Laplacian, I − L

Theorem[Bremaud, 1999] Hitting time Ei [Tj ] = (Zj ,j − Zi ,j )/πj .

Ei [Tj ] = 1 +∑

k ;k 6=j Pi ,kEi [Tk ]

eigenfunctions φ1, · · · , φN of

D12 PD− 1

2 , (2)

with eigenvalues −1 6 λN · · · 6 λ2 < λ1 = 1.

François Meyer (Princeton) March 14, 2007 24 / 41

Page 28: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

The spectral connection

commute time:

κ(i , j ) =

N∑k=2

11 − λk

(φk (i)√πi

−φk (j )√πj

)2

. (3)

define an embedding

i 7→ Ik (i) =1√

1 − λk

φk (i)√πi

, k = 2, · · · , N (4)

Euclidean distance on the image of the embedding= commute time

κ(i , j ) = ‖I (i) − I (j )‖2 =

N∑k=2

|Ik (i) − Ik (j )|2

François Meyer (Princeton) March 14, 2007 25 / 41

Page 29: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Outline

1 IntroductionAssumptions about the dataOur definition of the problem

2 Constructing a new parametrizationA random walk on the datasetA new way to measure distances

3 The spectral connectionSpectral graph theoryFrom commute time to spectral geometry

4 Classification of EEG recordings

François Meyer (Princeton) March 14, 2007 26 / 41

Page 30: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

The Laplacian connection

φk is also an eigenfunction of the Laplacian

L = I − D12 PD− 1

2 ,

with the eigenvalues βk = 1 − λk .

φk minimizes the “distortion”

minφ,‖φ‖=1

∑[i ,j ] wi ,j (φ(i) − φ(j ))2∑

i diφ2(i)

with φk orthogonal to {φ0, φ1, · · · , φk−1}.

Laplacian eigenmaps [Belkin and Niyogi, 2003]

François Meyer (Princeton) March 14, 2007 27 / 41

Page 31: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

The diffusion distance connection

[Lafon, 2004, Coifman and Lafon, 2006]

diffusion distance,

D2t (i , j ) =

N∑k=2

λ2tk

(φk (i)√πi

−φk (j )√πj

)2

. (5)

commute time = sum of the diffusion distance at all scale t

∞∑t=0

D2t/2(i , j ) =

N∑k=2

11 − λk

(φk (i)√πi

−φk (j )√πj

)2

= κ(i , j )

François Meyer (Princeton) March 14, 2007 28 / 41

Page 32: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

The spectral geometry connection

data sampled on a n-dimensional manifold M

M embedded by its heat kernel KM(t , x , y)

Theorem[Bérard et al., 1994]

ψt :M 7→ l2(R) (6)

x 7→{√

2(4π)n/4t(n+2)/4e−λj t/2φk (x )}

k>1(7)

∀t > 0, the map ψt is an embedding of M into l2(R).

scale parameter t : similar to diffusion distance

François Meyer (Princeton) March 14, 2007 29 / 41

Page 33: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

The spectral geometry connection

ψt : composition of1 embedding of M by the heat kernel:

each point on M is mapped to a bump function.

M 7→ L2(M) (8)

x 7→ KM(t/2, x , .) (9)

2 isometry given by the choice of basis, {Φ1,Φ2, · · · }, of L2(M),each function of L2(M) is expanded into the basis ofeigenfunctions of the Laplace-Beltrami operator

L2(M) 7→ l2(R) (10)

f 7→ {< f ,φk >}k>1 (11)

François Meyer (Princeton) March 14, 2007 30 / 41

Page 34: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Algorithm 1: Construction of the embedding

Input:I xi (t), t = 0, · · · , T − 1, i = 1, · · · , N ,I σ ; nn number of nearest neighbors.I K : number of eigenfunctions.

Algorithm:1 construct the graph defined by the nn nearest (according to‖xi − xj ‖) neighbors of each xi

2 compute P3 find the first K eigenfunctions, φk , of D

12 PD− 1

2

Output: For all xi

I new co-ordinates of xi :{

11−λk

φk (i)√πi

}, k = 2, · · · , N

François Meyer (Princeton) March 14, 2007 31 / 41

Page 35: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

A toy example

−3

−2

−1

0

1

2

3

−3

−2

−1

0

1

2

3

−1.5

−1

−0.5

0

0.5

1

1.5

−0.03

−0.02

−0.01

0

0.01

0.02

0.03

−0.03−0.02

−0.010

0.010.02

0.03

−0.04

−0.03

−0.02

−0.01

0

0.01

0.02

0.03

The 2nd EigenVector

The 3rd EigenVector

The

4th

Eig

enV

ecto

r

François Meyer (Princeton) March 14, 2007 32 / 41

Page 36: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Classification of EEG recordings

classification of EEG recordings into baseline and ictal states

Hypothesis: we can find a lower dimensional representation forclassification

More details can be found in[Ramírez-Vélez et al., 2006, Meyer and Shen, 2007]

François Meyer (Princeton) March 14, 2007 33 / 41

Page 37: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Human dataset

scalp electroencephalograms

55 electrode channels, lowpass filtered at 256 Hz.

baseline, pre-ictal, ictal,and post-ictal time segments

each node of the graph is in R55

François Meyer (Princeton) March 14, 2007 34 / 41

Page 38: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

The graph of the brain dynamicsti

t0

t j

��������

��

��

��

��������

������

��������

������

������

������

������

������

������

������

������

��

��

��

��

��

��

��� ���

��� ���

������

������

������

������

��� ���

��� ���

��� ���

������

������

������

������

������

������

������

������

������

������

��������

��������

��������

��� ���

������

������

��� ���

��������

��������

��������

��������

������

������

François Meyer (Princeton) March 14, 2007 35 / 41

Page 39: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Embedding

−0.1−0.05

00.05

−0.2

0

0.2

−0.1

−0.05

0

0.05

0.1

φ3

φ2

φ 4

baseline

pre−ictal

ictal

post−ictal

François Meyer (Princeton) March 14, 2007 36 / 41

Page 40: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Embedding (2)

−0.1

−0.05

0

0.05 −0.2

−0.1

0

0.1

0.2

−0.1

−0.05

0

0.05

0.1

φ 4baseline

pre−ictal

ictal

post−ictal

François Meyer (Princeton) March 14, 2007 37 / 41

Page 41: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Classification

d = 10 dimensions

10-fold cross validation

Table: Classification error (kernel ridge regression)

Baseline Ictal TotalRaw Data 93.20 70.40 81.80

PCA 81.60 78.80 82.20Random walk 100 82.80 91.40

François Meyer (Princeton) March 14, 2007 38 / 41

Page 42: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Classification

−0.04−0.03

−0.02−0.01

00.01

0.020.03

−0.1

−0.08

−0.06

−0.04

−0.02

0

0.02

−0.06

−0.04

−0.02

0

0.02

0.04

φ2

φ3

φ 4

IctalBaseline

Estimated labels: red=ictal, blue=baselineFrançois Meyer (Princeton) March 14, 2007 39 / 41

Page 43: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Computational details

Curse of dimensionality:

Fast nearest neighbors in high dimension

Eigensolvers for large (N = 105 − 106) sparse matrices

François Meyer (Princeton) March 14, 2007 40 / 41

Page 44: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Open questions

much faster eigensolvers are needed...Matlab blows up forN > 10, 000

real time update φk with new incoming data ?

φk : sensitive to σ and nn

how many new co-ordinates (local dimension) ?

François Meyer (Princeton) March 14, 2007 41 / 41

Page 45: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Belkin, M. and Niyogi, P. (2003).Laplacian eigenmaps for dimensionality reduction and datarepresentation.Neural Computations, 15:1373–1396.

Bérard, P., Besson, G., and Gallot, S. (1994).Embeddings Riemannian manifolds by their heat kernel.Geometric and Functional Analysis, 4(4):373–398.

Bremaud, P. (1999).Markov Chains.Springer Verlag.

Coifman, R. and Lafon, S. (2006).Diffusion maps.Applied and Computational Harmonic Analysis, 21:5–30.

Jonasson, J. (2000).Lollipop graphs are extremal for commute times.Random Structures and Algorithms, 16(2):131–142.

François Meyer (Princeton) March 14, 2007 41 / 41

Page 46: Parametrization of Datasets with Low-distortion Embeddingshelper.ipam.ucla.edu/publications/rstut/rstut_6961.pdf · Parametrization of Datasets with Low-distortion Embeddings François

Lafon, S. (2004).Diffusion maps and geometric harmonics.PhD thesis, Yale University, New Haven.

Meyer, F. and Shen, X. (2007).Exploration of high dimensional biomedical datasets withlow-distortion embedding.In Proc. Data Mining for Biomedical Informatics Workshop,7th SIAM International Conference on Data Mining.To appear.

Ramírez-Vélez, M., Staba, R., Barth, D., and Meyer, F. (2006).Nonlinear classification of EEG data for seizure detection.In Proc. IEEE International Symposium on BiomedicalImaging: Macro to Nano, pages 956–959.

François Meyer (Princeton) March 14, 2007 41 / 41