52
Algorithms on graphs and networks Leonid Zhukov

Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Algorithms on graphs and networks

Leonid Zhukov

Page 2: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Networks

• Explicit: graphs & networks – Power grids, phone lines– Internet, Web graph– Social networks (friends, business etc)

• Implicit: transaction – Email, Messenger, ICQ– Online auctions, E-store (Amazon) – Foreign trade

• Constructed: affinity between data points– Music (online radio)– Books, videos, …

Page 3: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Flickr – social network

Page 4: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Flickr – a graph

Page 5: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Adjacency matrix

Page 6: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Adjacency matrix

Page 7: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Flickr: reverse Cuthill-McKee

580,000 users, 3,500,000 links

Page 8: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Flickr: some stats

• # nodes = 584,207 • # edges (nnz) = 3,555,115

• max in-degree = 3531• max out-degree = 8976• < in-degree > = < out-degree > = 6• diameter = 18

• # strongly connected components = 152,324• largest strongly connected components = 274,649 374 186 155 11

• # connected component = 43,189• largest connected components = 404,893 378 112 108 103• highest core number = 249 (size 668)

Page 9: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Flickr: scale free graph

in-degrees

out-degrees

Page 10: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Graph partitioning

Page 11: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Graph partitioning

11

1 32

54

76

8

Graph separators:

A B

Normalized cut:

Page 12: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Spectral graph partitioning

12

• assign each node indicator p= {-1,-1,-1,-1, +1, +1, +1}

• smallest cut:

• combinatorial optimization, NP hard, relax:

1 32

54

76

8

=>

Page 13: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Spectral: solution

13

• Quadratic optimization:

• Eigenvalue problem:

• Rounding off

Page 14: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Example: normalized cuts

14

1 325

4

76 8

L = x =

-1 +1

Page 15: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Spectral: ordering

15

Adjacency matrix Adjacency matrix – re-ordered

Eigenvector Eigenvector – sorted

perm = [ 1 2 6 7 3 8 4 5 ]

Page 16: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Spectral: graph cut

16

Eigenvector – sortednode layout

Cut values

Page 17: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Clustering

17

Page 18: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Clustering

18

Page 19: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Recursive partitioning tree

19

Page 20: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Flickr: “10-core” spectral ordering

Page 21: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Partition tree

21

Page 22: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Sponsored search

22

View Bids Tool

Page 23: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Sponsored search data

23

accessory desk .83 Office Maxalfred hitchcock .01 Videoflicks.comeducational software .13 Buy.comeducational software .4 eBAY International AGeducational software .17 OfficeMaxepson .28 Buy.comepson .4 eBAY International AGfax .13 Buy.comfax .4 eBAY International AG fax .38 OfficeMaxgame .02 Net Businessgame .25 Buy.comgeorge harrison .15 eBAY International AGgeorge harrison .05 MusicStackgeorge harrison .01 Videoflicks.comjackson michael .05 MusicStackjackson michael .01 Videoflicks.com

bidded term $ bid advertiser

Page 24: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Sponsored search bid graph

24

Terms

Advertisers

accessory desk

educational software

epson

fax

game

george harrison

Office Max

eBAY International AG

Buy.com

Net Business

MusicStack

Videoflicks.com

jackson michael

alfred hitchcock

Page 25: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Graph partitioning

25

Terms

Advertisers

accessory desk

educational software

epson

fax

game

george harrison

Office Max

eBAY International AG

Buy.com

Net Business

MusicStack

Videoflicks.com

jackson michael

alfred hitchcock

Page 26: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Sponsored search bid matrix

26

Office Max

Buy.com

Net Business

MusicStack

eBAY

Videoflicks.com

1 2 3 4 5 6 7 8

1. accessory desk2. educational software3. epson4. fax

5. game6. george harrison7. jackson michael8. alfred hitchcock

Page 27: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Overture bid graph : real life

27

Page 28: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Bi-partite graph partitioning

28

Terms Advertisers

Page 29: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Bi-partite formulation

29

• Eigensystem:

• SVD:

• Solution:

Page 30: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Back to data

30

Page 31: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Clusters

31

Page 32: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Sample clusters

32

'A.S.C. Incorporated''Atlantic Telecom'

'AudioLink''Cost Plus Electronics'

'Headset Express''Hello Direct'

'PDA Mountain''PK TECH INC‘

…………………..

'accessory phone''cordless headset'

'cordless headset phone''free hands'

'headset''headset microphone'

'headset phone''headset plantronics'

'headset wireless''isdn'

'plantronics‘…………….

'Berent Associates Center forShyness and Social Therapy'

'Dr. Puff''New York Psychotherapy

Collective''Ruby Shoes'

'Self-employed psychologist''www.kabbalah.com‘

……………

'health mental''help self'

'improvement self''parenting‘

Page 33: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Yahoo! Music – LAUNCHcast radio

33

user profileplayer

Page 34: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Launch cast data

34

190389 1034047 100190389 1034060 20190389 1034129 50190389 1034276 100190389 1034388 80190389 1034801 100190389 1034831 100190389 1034882 0190389 1034883 40190389 1035010 20190389 1035473 60190389 1035840 0190389 1035926 30190389 1036157 30190389 1036454 60190389 1036736 30190389 1036737 100190389 1036740 70190389 1036746 80190389 1036762 100

1037729 Coral1037730 Rick Braun1037731 Britney Spears1037732 Gary Motley1037733 Candido Fabre1037734 K'Stalia Y Los Salchichas1037735 Donny Hoffa1037736 Rafael Mendez1037737 Lithops1037738 The Paris Ensemble1037739 The Sunset Orchestra1037740 The Mariachis Of Chiapas1037741 Jenny Simpson1037742 Doc West & The Yard Dogs1037743 Geoff Bartley1037744 Twilight Circus Dub Sound...1037745 Tekneek

user ID artist ID ratingsartist ID artist

Page 35: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Ratings: data representation

a2a1

u1

u3

u2

Vector space model:w_ij = cos(a_i,a_j) =

(a_i * a_j) /(||a_i|| ||a_j||)

a3

u4

u4

u1

u2

u3

a1

a2

a3

ratings: users - artists

Page 36: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Similarity graph

0 3 1 0 4 51 0 0 0 0 50 4 0 0 0 0

2 0 2 0 4 00 5 0 0 0 51 0 3 0 0 4

similarities

artists

user

s

artist-artistsim graph

Compute

1 0.3 0.8 0.0 0.5 0.3 1 0.4 0.0 0.2 0.8 0.4 1 0.3 0.0 0.0 0.0 0.3 1 0.0 0.5 0.2 0.0 0.0 1

artists

artis

ts

Page 37: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Yahoo! Music, artist-artist

Page 38: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Artist clusters

Page 39: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Rank Label

1 Mozart

2 Royal Philharmonic Orchestra

3 Yo-Yo Ma

4 Scottish Chamber Orchestra

5 New York Philharmonic

6 Gilbert Johnson

7 New York Pro Music Antiqua

8 New Philharmonia Orchestra

9 Izzy

10 London Symphony Orchestra

11 Philadelphia Orchestra

12 Roberto Michelucci

13 Columbia Symphony Orchestra

14 San Francisco Symphony

15 London Philharmonic Orchestra

16 Berlin Philharmonic Orchestra

Rank Label

1 Enrique Iglesias

2 Avril Lavigne

3 Backstreet Boys

4 Kylie Minogue

5 Good Charlotte

6 Simple Plan

7 T.A.T.U.

8 Hilary Duff

9 Paula Abdul

10 No Doubt

11 Britney Spears

12 Westlife

13 O-Town

14 LFO (Lyte Funky Ones)

15 98 Degrees

16 Shakira

cluster # 6 cluster # 79

Sample clusters

Page 40: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Ordering: graph embedding in 1D

40

• every node –coordinate x(i)

-1 1 x

1 32

54

76

8

1 2 6 7 3 8 4 5

0create ordering/permutation

Page 41: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Graph embedding : 2D

• Quadratic optimization problem

• Replace

• Quadratic optimization problem

• Embedding: x = ev(2) , y = ev(3)

Page 42: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

2D spectral embedding

Page 43: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

intuition

-1 1 x

1 32

54

76

8

1 2 6 7 3 8 4 5

0

Page 44: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Embedding 2D - better

• Change constraints: average -> “per node”

• Optimization:

Page 45: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Semi Definite Programming (SDP)

• Minimum bisection SDP relaxation

• Standard form primal SDP

• Factor:

Page 46: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

SDP embedding

• Spectral:

• SDP:

-1 1

r=1

Page 47: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

SDP spherical embedding

Page 48: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

SDP sphere unrolled

Page 49: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Clusters

Page 50: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

Visualization pipeline

Solver

SDPMap 3D to 2D

Visualize3D mapping

0 3 1 0 4 51 0 0 0 0 50 4 0 0 0 0

2 0 2 0 4 00 5 0 0 0 51 0 3 0 0 4

similarities

artists

user

s

artist-artistsim graph

Compute

Page 51: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100
Page 52: Algorithms on graphs and networks...34. 190389 1034047 100 190389 1034060 20 190389 1034129 50 190389 1034276 100 190389 1034388 80 190389 1034801 100 190389 1034831 100

David Gleich Leonid Zhukov Matthew Rasmussen Kevin LangStanford University Yahoo!, Inc. MIT Yahoo! Research

The World of Music: SDP layout of high-dimensional data