Tools and Algorithms for Querying and Mining Large Graphs Hanghang Tong Machine Learning Department Carnegie Mellon University [email protected] http://www.cs.cmu.edu/~htong 1


Tools and Algorithms for Querying and Mining Large Graphs

Hanghang Tong

Machine Learning Department

Carnegie Mellon University

[email protected]

http://www.cs.cmu.edu/~htong

1

Thesis Committee

• Christos Faloutsos• William Cohen• Jeff Schneider• Philip S. Yu

2

Graphs are everywhere!

3

Motivating Questions (high level)

• Given a large graph, we want to
– Task A: Querying
– Task B: Mining

[Figure: CePS on DBLP [Tong+ KDD 06], a co-authorship subgraph connecting R. Agrawal, Jiawei Han, V. Vapnik, M. Jordan, H.V. Jagadish, Laks V.S. Lakshmanan, Heikki Mannila, Christos Faloutsos, Padhraic Smyth, Corinna Cortes and Daryl Pregibon; T3 on CIKM [Tong+ CIKM 08]]

Will return to this later…

Motivating Questions (in details)

• Querying [Goal: query complex relationship]– Q.1. Find complex user-specific patterns;– Q.2. Link Prediction & Proximity Tracking;– Q.3. Answer all the above questions quickly.

• Mining [Goal: find interesting patterns]– M.1. Spot Anomalies; – M.2. Mine time & space;– M.3. Detect communities.

5

Thesis Overview

6

[Diagram: overview grid of the querying questions (Q1, Q2, Q3) and the mining questions (M1, M2, M3)]

Thesis Overview

7

Questions That We Ask (Completed / Proposed)

• Q1 (completed): CePS, G-Ray, ProSIN (KDD06, KDD07 a, ICDM08)
• Q2 (completed): DAP (KDD07 b); pTrack/cTrack (SDM08, SAM08)
• Q3 (completed): FastProx (ICDM06, KAIS07, KDD07 b, ICDM08; SDM08, SAM08); proposed: P3
• M1 (completed): Colibri-S, Colibri-D (KDD08 b); proposed: P3
• M2 (completed): T3/MT3 (CIKM08); proposed: P2, P3
• M3 (proposed): P1

Thesis Overview: Impact

Tasks: Impact, Applications

• Querying
– Q1: Identify master-mind criminals and money-laundering rings; interactive search & summarization
– Q2: Predict who-calls-whom; trend analysis on the graph level
– Q3: Scale all of the above applications to large, disk-resident graphs
• Mining
– M1: Efficient anomaly detection in an intuitive, dynamic way
– M2: Mine time/space in complex settings
– M3: Detect communities w/ optional constraints

Footnote: Our work for Q1 has been transferred into an IBM product (Cyano)

Roadmap
• Introduction
• Completed Work
– Querying
• Preliminary
• Q1
• Q2
• Q3
– Mining
• Proposed Work

Preliminary: Proximity Measurement

10

[Figure: example graph with nodes A, B, D, E, F, G, H, I and J, all edges with weight 1]

a.k.a. Relevance, Closeness, 'Similarity'…


Completed work on Q1

• Goal: Find complex user-specific patterns
– Q1.1. Center-Piece Subgraph Discovery
  e.g., master-mind criminal given some suspects X, Y and Z?
– Q1.2. Best-Effort Pattern Match
  e.g., money-laundering ring
– Q1.3. Interactive querying (e.g., Negation)
  e.g., find the conferences most similar to KDD, but not like ICML?

12

Q1.1 Center-Piece Subgraph Discovery [Tong+ KDD 06]

Q: How to find hub for the black nodes?

[Figure: Input: the original graph with query nodes A, B and C. Output: the CePS, with the CePS node shown in red.]

Red: max (Prox(A, Red) × Prox(B, Red) × Prox(C, Red))
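The scoring rule on this slide (pick the node that maximizes the product of its proximities to every query node) can be sketched as below. This is only an illustration under stated assumptions: the proximity is taken to be random walk with restart (introduced later in the deck), and the toy graph, the helper names `column_normalize`, `rwr` and `center_piece_scores`, and the value c = 0.9 are all hypothetical rather than taken from the CePS paper.

```python
def column_normalize(adj):
    """Return a column-stochastic copy of a dense adjacency matrix (list of lists)."""
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    return [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
            for i in range(n)]

def rwr(w, source, c=0.9, iters=200):
    """Random walk with restart from `source`, iterating r <- c*W*r + (1-c)*e."""
    n = len(w)
    r = [1.0 if i == source else 0.0 for i in range(n)]
    for _ in range(iters):
        r = [c * sum(w[i][j] * r[j] for j in range(n))
             + ((1.0 - c) if i == source else 0.0) for i in range(n)]
    return r

def center_piece_scores(adj, queries, c=0.9):
    """AND-query score of node j: product over query nodes q of Prox(q, j)."""
    w = column_normalize(adj)
    scores = [1.0] * len(adj)
    for q in queries:
        r = rwr(w, q, c)
        scores = [s * p for s, p in zip(scores, r)]
    return scores

# Hypothetical toy graph: node 3 links the query nodes 0, 1 and 2;
# node 4 hangs off node 0 only.
adj = [
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [1, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
]
queries = [0, 1, 2]
scores = center_piece_scores(adj, queries)
candidates = [j for j in range(len(adj)) if j not in queries]
best = max(candidates, key=lambda j: scores[j])
print("center-piece candidate:", best)
```

Multiplying the per-query scores is what makes this an AND query: a node that is far from even one query node gets a small product.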

CePS: Example (AND Query)

[Figure: CePS for an AND query on DBLP; nodes shown: R. Agrawal, Jiawei Han, V. Vapnik, M. Jordan, H.V. Jagadish, Laks V.S. Lakshmanan, Heikki Mannila, Christos Faloutsos, Padhraic Smyth, Corinna Cortes and Daryl Pregibon]

DBLP co-authorship network: - 400,000 authors, 2,000,000 edges

K_SoftAND: Relaxation of AND

Asking AND query? No Answer!

Disconnected Communities

Noise

15

CePS: 2 SoftAND

[Figure: CePS for a 2_SoftAND query on DBLP, with a Stat. group and a DB group; nodes shown: R. Agrawal, Jiawei Han, V. Vapnik, M. Jordan, H.V. Jagadish, Laks V.S. Lakshmanan, Umeshwar Dayal, Bernhard Scholkopf, Peter L. Bartlett and Alex J. Smola]

16

Q1.2. Best-Effort Pattern Match [Tong+ KDD 2007 b]

Q: How to find matching subgraph?

[Figure: Input: a data graph and a query graph over the node attributes Accountant, CEO, Manager and SEC. Output: the matching subgraph, which may include an interception node.]

G-Ray: How to?

[Figure: data graph with the four matching nodes 4, 7, 11 and 12 highlighted]

Goodness = Prox(12, 4) × Prox(4, 12) × Prox(7, 4) × Prox(4, 7) × Prox(11, 7) × Prox(7, 11) × Prox(12, 11) × Prox(11, 12)

Observation: …, etc.
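The goodness score above is a product of proximities in both directions over every matched query edge. A minimal sketch of that arithmetic follows; the proximity values are made up for illustration, and the function name `goodness` and the dictionary layout are assumptions, not part of G-Ray itself.

```python
import math

def goodness(prox, matched_edges):
    """Product of Prox(u, v) * Prox(v, u) over the data-graph node pairs
    that the query edges were matched to."""
    return math.prod(prox[(u, v)] * prox[(v, u)] for u, v in matched_edges)

# The four matched pairs from the slide; the proximity values are hypothetical.
matched_edges = [(12, 4), (7, 4), (11, 7), (12, 11)]
prox = {}
for u, v in matched_edges:
    prox[(u, v)] = 0.5
    prox[(v, u)] = 0.25

score = goodness(prox, matched_edges)  # (0.5 * 0.25) ** 4
print(score)
```

For longer queries it is common to sum log-proximities instead of multiplying, which avoids underflow; the ranking of candidate matches is unchanged.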

Effectiveness: star-query

[Figure: star query and its result; node attributes: Databases, Bio-medical, Intelligent Agent]

19

Effectiveness: line-query

[Figure: line query and its result; node attributes: Databases, Learning, Bio-medical, Theory]

20

Q1.3: Interactive Querying

21

What are the most related conferences wrt KDD? (DBLP author-conference bipartite graph)

• Initial results: ICDM, ICML, SDM, VLDB, ICDE, SIGMOD, NIPS, PKDD, IJCAI, PAKDD
• After user feedback "No to ICML": ICDM, SDM, PKDD, ICDE, VLDB, SIGMOD, PAKDD, CIKM, SIGIR, WWW
• After user feedback "Yes to SIGIR": SIGIR, TREC, CIKM, ECIR, CLEF, ICDM, JCDL, VLDB, ACL, ICDE

Observations:
• Two main sub-communities in KDD: DBs (green) vs. Stat (red)
• Negative feedback on ICML will exclude other stats confs (NIPS, IJCAI)
• Positive feedback on SIGIR will bring more IR (brown) conferences

Q1.3 ProSIN for Interactive Querying

[Tong+ ICDM 08]



Q2.1 Link Prediction: direction [Tong+ KDD 07 a]

• Q: Given the existence of the link, what is the direction of the link?
• A: (DAP) Compare Prox(i→j) and Prox(j→i)

[Figure: density of Prox(i→j) − Prox(j→i), annotated ">70%"]

Data: Web Link, 4,000 nodes, 10,000 edges
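The comparison rule can be sketched as follows. Note the swapped-in technique: the sketch uses random walk with restart on the directed graph as a stand-in proximity, which is an assumption made here for brevity and not the direction-aware proximity defined in the DAP paper. The toy edge list and the function names are hypothetical.

```python
def rwr_scores(edges, n, source, c=0.9, iters=200):
    """Random walk with restart on a directed graph given as (u, v) edges u -> v."""
    out_deg = [0] * n
    for u, _ in edges:
        out_deg[u] += 1
    r = [1.0 if k == source else 0.0 for k in range(n)]
    for _ in range(iters):
        nxt = [(1.0 - c) if k == source else 0.0 for k in range(n)]
        for u, v in edges:
            nxt[v] += c * r[u] / out_deg[u]  # walk along u -> v
        r = nxt
    return r

def predict_direction(edges, n, i, j):
    """Guess the direction of an i-j link by comparing Prox(i -> j) with Prox(j -> i)."""
    prox_ij = rwr_scores(edges, n, i)[j]
    prox_ji = rwr_scores(edges, n, j)[i]
    return (i, j) if prox_ij >= prox_ji else (j, i)

# Hypothetical toy graph: every known path leads from node 0 towards node 3.
edges = [(0, 1), (1, 3), (0, 2), (2, 3)]
direction = predict_direction(edges, 4, 0, 3)
print("predicted direction:", direction)
```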

Q2.2 pTrack/cTrack: Challenge[Tong+ SDM 08]

• Observations (CePS, G-Ray, ProSIN…)
– All for static graphs
– Proximity: main tool
• Graphs are evolving over time!
– New nodes/edges show up;
– Existing nodes/edges die out;
– Edge weights change…

Q: How to make everything incremental?
A: Track Proximity!

pTrack/cTrack: Trend analysis on graph level

[Figure: rank of influence vs. year for M. Jordan, G. Hinton, C. Koch and T. Sejnowski]

pTrack: Problem Definitions

• [Given]
– (1) a large, skewed, time-evolving bipartite graph,
– (2) the query nodes of interest
• [Track]
– (1) top-k most related nodes for each query node at each time step t;
– (2) the proximity score (or rank of proximity) between any two query nodes at each time step t

29

pTrack: Philip S. Yu's Top-5 conferences up to each year

• 1992: ICDE, ICDCS, SIGMETRICS, PDIS, VLDB
• 1997: CIKM, ICDCS, ICDE, SIGMETRICS, ICMCS
• 2002: KDD, SIGMOD, ICDM, CIKM, ICDCS
• 2007: ICDM, KDD, ICDE, SDM, VLDB

Topics: Databases, Performance, Distributed Sys. → Databases, Data Mining

DBLP (Au. x Conf.): 400k authors, 3.5k conferences, 20 years

KDD's Rank wrt. VLDB over years

[Figure: proximity rank vs. year; the rank moves closer over time]

Data Mining and Databases are getting closer & closer

cTrack: 10 most influential authors in NIPS community up to each year

[Figure: author rankings over the years, with T. Sejnowski and M. Jordan highlighted]

Author-paper bipartite graph from NIPS 1987-1999: 1740 papers, 2037 authors, spread over 13 years


Proximity is the main tool
• Q.1: CePS, G-Ray, ProSIN
• Q.2: DAP, pTrack/cTrack

Q: What is a `good' Score?

[Figure: example graph with nodes A, B, D, E, F, G, H, I and J, all edges with weight 1]

a.k.a. Relevance, Closeness, 'Similarity'…

Random walk with restart [Pan+ KDD 2004]

Ranking vector $r_4$ (query node: Node 4) on the 12-node example graph:

Node   1     2     3     4     5     6     7     8     9     10    11    12
Score  0.13  0.10  0.13  0.22  0.13  0.05  0.05  0.08  0.04  0.03  0.04  0.02

More red, more relevant. Nearby nodes, higher scores.

Why RWR is a good score?

$Q = (I - c\tilde{W})^{-1} = I + c\tilde{W} + c^2\tilde{W}^2 + c^3\tilde{W}^3 + \cdots$, with $Q(i, j) \propto r_{i,j}$

• $\tilde{W}$: adjacency matrix; $c$: damping factor
• $c\tilde{W}$: all paths from i to j with length 1
• $c^2\tilde{W}^2$: all paths from i to j with length 2
• $c^3\tilde{W}^3$: all paths from i to j with length 3

RWR summarizes all the weighted paths from i to j
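The "all weighted paths" reading can be checked numerically: summing the damped powers of the normalized adjacency matrix reproduces the matrix inverse. The sketch below assumes a column-normalized adjacency matrix and c = 0.9; the 4-node graph is a hypothetical example, not one from the slides.

```python
import numpy as np

# Hypothetical undirected toy graph (4 nodes).
adj = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

W = adj / adj.sum(axis=0)   # column-normalize: each column sums to 1
c = 0.9                     # damping factor (assumed value)
n = W.shape[0]

# Closed form: Q = (I - cW)^{-1}
Q = np.linalg.inv(np.eye(n) - c * W)

# Path view: I + cW + c^2 W^2 + ..., truncated after 400 terms.
series = np.zeros((n, n))
term = np.eye(n)
for _ in range(400):
    series += term
    term = c * (W @ term)   # next power, damped by one more factor of c

print(np.allclose(Q, series))
```

The k-th term weights every length-k path by c^k, so long detours contribute less than short connections.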

Computing RWR

On-line: $r_i[t+1] = c\tilde{W}\,r_i[t] + (1 - c)\,e_i$

Off-line: $Q = (I - c\tilde{W})^{-1}$

• OnTheFly
– No Pre-Computation;
– Light Storage Cost ($\tilde{W}$)
– Slow On-Line Response: O(mE)
• Pre-Compute
– Fast On-Line Response
– Prohibitive Pre-Compute Cost: O(n^3)
– Prohibitive Storage Cost: O(n^2)

Q: How to Balance? (On-line vs. Off-line)

Goal: Efficiently Get (elements) of $Q = (I - c\tilde{W})^{-1}$
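The two extremes can be put side by side in a few lines. This is a sketch under assumptions: a column-normalized adjacency matrix, c = 0.9, a fixed iteration count instead of a convergence test, and hypothetical function names `rwr_on_the_fly` and `rwr_precompute`.

```python
import numpy as np

def rwr_on_the_fly(W, i, c=0.9, iters=300):
    """No pre-computation: iterate r <- c W r + (1 - c) e_i at query time."""
    n = W.shape[0]
    e = np.zeros(n)
    e[i] = 1.0
    r = e.copy()
    for _ in range(iters):
        r = c * (W @ r) + (1 - c) * e
    return r

def rwr_precompute(W, c=0.9):
    """Pre-compute and store (1 - c)(I - cW)^{-1}: O(n^3) time, O(n^2) space."""
    n = W.shape[0]
    return (1 - c) * np.linalg.inv(np.eye(n) - c * W)

# Hypothetical toy graph.
adj = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
W = adj / adj.sum(axis=0)

Q_scaled = rwr_precompute(W)      # off-line stage
r2_fast = Q_scaled[:, 2]          # on-line: just read one column
r2_slow = rwr_on_the_fly(W, 2)    # on-line: many matrix-vector products
print(np.allclose(r2_fast, r2_slow))
```

Both routes give the same ranking vector; they differ only in where the cost is paid, which is the trade-off the slide asks how to balance.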

B_Lin: Basic Idea [Tong+ ICDM 2006]

[Figure: the 12-node example graph with its RWR scores, processed in three steps: Find Community, Fix the remaining, Combine]

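B_Lin: details

[Figure: $\tilde{W}$ written as the sum of the within-community part $\tilde{W}_1$ and the cross-community part]

$(I - c\tilde{W})^{-1} \approx (I - c\tilde{W}_1 - cUSV)^{-1}$

• $I - c\tilde{W}_1$: easy to be inverted
• $cUSV$: LRA of the difference
• Sherman–Morrison Lemma: If …, then …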

B_Lin: summary

• Pre-Compute Stage
– Q: Efficiently compute and store Q
– A: A few small, instead of ONE BIG, matrix inversions
• On-Line Stage
– Q: Efficiently recover one column of Q
– A: A few, instead of MANY, matrix-vector multiplications
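The two stages summarized above can be sketched with NumPy. Several things here are assumptions rather than the paper's procedure: the community partition is given by hand (`blocks`), the cross-community part is approximated with a truncated SVD, and the combination step uses the Sherman–Morrison–Woodbury identity $(A - cUSV)^{-1} = A^{-1} + cA^{-1}U(S^{-1} - cVA^{-1}U)^{-1}VA^{-1}$. The function names `blin_precompute` and `blin_query` are hypothetical.

```python
import numpy as np

def blin_precompute(W, blocks, c=0.9, rank=2):
    """Off-line stage: invert each community block, low-rank the rest."""
    W1 = np.zeros_like(W)
    Q1_inv = np.zeros_like(W)
    for idx in blocks:                       # a few SMALL inversions
        ix = np.ix_(idx, idx)
        W1[ix] = W[ix]
        Q1_inv[ix] = np.linalg.inv(np.eye(len(idx)) - c * W[ix])
    W2 = W - W1                              # cross-community links
    U, s, Vt = np.linalg.svd(W2)
    U, S, V = U[:, :rank], np.diag(s[:rank]), Vt[:rank, :]
    # Small (rank x rank) core matrix from the Woodbury identity.
    Lam = np.linalg.inv(np.linalg.inv(S) - c * (V @ Q1_inv @ U))
    return Q1_inv, U, Lam, V

def blin_query(pre, i, c=0.9):
    """On-line stage: recover the ranking vector of node i with a few mat-vecs."""
    Q1_inv, U, Lam, V = pre
    e = np.zeros(Q1_inv.shape[0])
    e[i] = 1.0
    x = Q1_inv @ e
    return (1 - c) * (x + c * (Q1_inv @ (U @ (Lam @ (V @ x)))))

# Hypothetical toy graph: two triangles joined by the single edge 2-3.
adj = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    adj[u, v] = adj[v, u] = 1.0
W = adj / adj.sum(axis=0)

pre = blin_precompute(W, blocks=[[0, 1, 2], [3, 4, 5]])
r_blin = blin_query(pre, 0)
r_exact = 0.1 * np.linalg.inv(np.eye(6) - 0.9 * W)[:, 0]
print(np.allclose(r_blin, r_exact))
```

In this toy case the cross-community part has rank 2, so a rank-2 approximation is exact; on real graphs the rank is a knob that trades accuracy for speed.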

42

Query Time vs. Pre-Compute Time

[Figure: log query time vs. log pre-compute time, with our results marked]

• Quality: 90%+
• On-line: up to 150x speedup
• Pre-computation: two orders of magnitude saving

More on Scalability Issues for Querying (the spectrum of "FastProx")

• B_Lin: one large linear system [Tong+ ICDM06, KAIS08]
• BB_Lin: the intrinsic complexity is small [Tong+ KAIS08]
• FastUpdate: time-evolving linear system [Tong+ SDM08, SAM08]
• FastAllDAP: multiple linear systems [Tong+ KDD07 a]
• Fast-ProSIN: dealing w/ on-line feedback [Tong+ ICDM 2008]

Roadmap
• Introduction
• Completed Work
– Querying
– Mining
• M1: Spotting Anomalies
• M2: Mining Time
• Proposed Work


Motivation [Tong+ KDD 08 b]

• Q: How to find patterns?
– e.g., communities, anomalies, etc.
• A: Low-Rank Approximation (LRA) for Adjacency Matrix of the Graph.

$A \approx L \times M \times R$

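LRA for Graph Mining: Example

[Figure: author-conference bipartite graph. Authors: John, Tom, Bob, Carl, Van, Roy. Conferences: KDD, ICDM, RECOMB, ISMB.]

Adj. matrix: $A \approx L \times M \times R$
• L: Au. clusters
• M: Interaction
• R: Conf. clusters

Recon. error is high → 'Carl' is abnormal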

48
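The anomaly test on the example slide (a row that the low-rank model reconstructs poorly is suspicious) can be sketched with a truncated SVD standing in for (L, M, R). The author and conference names come from the slide, but the 0/1 matrix entries are invented for illustration, and the rank k = 2 is an assumption.

```python
import numpy as np

authors = ["John", "Tom", "Bob", "Carl", "Van", "Roy"]
confs = ["KDD", "ICDM", "RECOMB", "ISMB"]

# Hypothetical author-conference matrix: two clean groups, plus Carl,
# who publishes across both groups.
A = np.array([
    [1, 1, 0, 0],   # John
    [1, 1, 0, 0],   # Tom
    [1, 1, 0, 0],   # Bob
    [1, 0, 1, 0],   # Carl
    [0, 0, 1, 1],   # Van
    [0, 0, 1, 1],   # Roy
], dtype=float)

k = 2  # assumed rank: one component per community
U, s, Vt = np.linalg.svd(A, full_matrices=False)
L, M, R = U[:, :k], np.diag(s[:k]), Vt[:k, :]
A_hat = L @ M @ R

# Per-author reconstruction error; the largest one flags the anomaly.
row_err = np.linalg.norm(A - A_hat, axis=1)
most_abnormal = authors[int(np.argmax(row_err))]
print(most_abnormal)
```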

Challenges: How to get (L, M, R)?

• Efficiently
– both time and space
• Intuitively
– easy for interpretation
• Dynamically
– track patterns over time

None of the existing methods fully meets our wish list!

Why Not SVD and CUR/CX?

• SVD: Optimal in $L_2$ and $L_F$
– Efficiency
• Time: $O(\min(n^2 m, n m^2))$
• Space: (L, R) are dense
– Interpretation
• Linear Combination of many columns
– Dynamic: Not Easy

• CUR: Example-based
– Efficiency
• Better than SVD
• Redundancy in L
– Interpretation
• Actual Columns from A
– Dynamic: Not Easy

Solutions: Colibri [Tong+ KDD 08 b]

• Colibri-S: for static graph
– Basic idea: remove linear redundancy
– Same accuracy as CUR/CX
– Significant savings in both time & space
• Up to 53x speed-up
• Colibri-D: for dynamic graph
– Basic idea: leverage smoothness between time steps
– Same accuracy as CUR/CMD
• Up to 112x speed-up
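The "remove linear redundancy" idea for the static case can be sketched as follows: among the sampled columns, keep only those that are not already a linear combination of the kept ones, then project A onto their span. This is a simplified illustration, not the paper's algorithm; in particular it re-solves a least-squares problem for every candidate column instead of updating the core matrix incrementally, and the name `colibri_s_sketch`, the tolerance and the toy matrix are assumptions.

```python
import numpy as np

def colibri_s_sketch(A, sampled_cols, tol=1e-8):
    """Return (L, M, R) with L = linearly independent sampled columns of A."""
    kept = []
    for j in sampled_cols:
        col = A[:, j]
        if kept:
            basis = A[:, kept]
            coef, *_ = np.linalg.lstsq(basis, col, rcond=None)
            residual = col - basis @ coef
        else:
            residual = col
        if np.linalg.norm(residual) > tol:   # adds new information: keep it
            kept.append(j)
    L = A[:, kept]
    M = np.linalg.inv(L.T @ L)               # small core matrix
    R = L.T @ A
    return L, M, R

# Hypothetical rank-2 matrix: column 2 = column 0 + column 1, column 3 = 2 * column 0.
A = np.array([
    [1, 0, 1, 2],
    [0, 1, 1, 0],
    [1, 1, 2, 2],
], dtype=float)

# Column 0 is sampled twice and column 2 is redundant; both get dropped.
L, M, R = colibri_s_sketch(A, sampled_cols=[0, 0, 1, 2])
print(L.shape[1], np.allclose(L @ M @ R, A))
```

Because the dropped columns lie in the span of the kept ones, the approximation spans the same subspace as one built from all sampled columns, while L and M are smaller.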

51

details

A Pictorial Comparison (for static graphs)

[Figure: four panels (SVD, CUR, CMD, Colibri-S); the SVD panel shows the 1st and 2nd singular vectors]

Comparison: SVD, CUR vs. Colibris

[Table: wish list (Efficiency, Interpretation, Dynamics) against the methods SVD [Golub+ 1989], CUR/CX [Drineas+ 2005] and Colibri [Tong+ 2008]]

Performance of Colibri-S

[Figure: time and space of CUR, CMD and ours (Colibri-S)]

• Accuracy
– Same, 91%+
• Time
– 12x faster than CMD
– 28x faster than CUR
• Space
– ~1/3 of CMD
– ~10% of CUR

Data set: Network traffic, 21,837 sources/destinations, 158,805 edges

Performance of Colibri-D

[Figure: time vs. # of changed cols for CMD, Colibri-S and Colibri-D]

Colibri-D achieves up to 112x speedups

Data set: Network traffic, 21,837 nodes, 1,220 hours, 22,800 edges/hr


M2: How to mine time in some complex context?

[Tong+ CIKM 08]

57

A Motivating Example: Inputs

Time      Event (e.g., Session)   Entity
Oct. 26   Link Analysis           Tom, Bob
Oct. 26   Clustering              Bob, Alan
Oct. 27   Classification          Bob, Alan
Oct. 27   Anomaly Detection       Alan, Beck
Oct. 28   Party                   Beck, Dan
Oct. 29   Web Search              Dan, Jack
Oct. 29   Advertising             Jack, Peter
Oct. 30   Enterprise Search       Jack, Peter
Oct. 31   Q & A                   Peter, Smith

A Motivating Example: Outputs

[Figure: the time stamps Oct. 26, 27, 28, 29 and 30 grouped into time clusters, with one abnormal time stamp]

• Time Cluster, Rep. Entities: "Tom", "Bob", "Alan"
• Abnormal Time, Rep. Entities: "Beck", "Dan"
• Time Cluster, Rep. Entities: "Jack", "Peter", "Smith"

Problem Definitions (How to mine time in such complex context)

• Given data sets collected at different time stamps;
• We want to find
– 1: Time Clusters
– 2: Abnormal Time stamps
– 3: Interpretations
– 4: Right time granularity

Our Solutions: T3, MT3

Data Sets

• CIKM: from CIKM proceedings
– Time: Publication year (1993-2007, 15)
– Event: Paper published (952)
– Entities: Author (1895) & Session (279)
– Attribute: Keyword (158)
• DeviceScan: from MIT Reality Mining
– Time: the day scanning happened (1/1/2004-5/5/2005, 294)
– Event: Bluetooth device scanning person (114,046)
– Entities: Device (103) & Person (97)
– Attribute: NA

T3 on `CIKM' Data Set

• Rep. Authors: James P. Callan, W. Bruce Croft, James Allan, Philip S. Yu, George Karypis, Charles Clarke
  Rep. Keywords: Web, Cluster, Classification, XML, Language, Stream
• Rep. Authors: Elke Rundensteiner, Daniel Miranker, Andreas Henrich, Il-Yeol Song, Scott B Huffman, Robert J. Hall
  Rep. Keywords: Knowledge, System, Unstructured, Rule, Object-oriented, Deductive

MT3 on `DeviceScan’ Data Set

Aggregate by Month

Apr. 2004 is anomaly

Aggregate by Day

Work day

Semester Break & Holiday

63

Roadmap
• Introduction
• Completed Work
– Querying
– Mining
• Proposed Work
– P1: Community detection
– P2: Mining Space
– P3: Diffusion Wavelets


Detecting Communities

• Observations: two seemingly opposite efforts in community detection
– E1: parameter-free (no user intervention)
– E2: cluster w/ constraints (listen to users)
• Challenge: How to fill the gap?
• Idea: MDL-based method, encoding the constraints in descriptions.

66

P1

Mining Space

67

P2

Diffusion Wavelets

68

P3

Time Line
• Dec. '08: Thesis Proposal
• Jan. – Feb. '09: Research on Community Detection (P1)
• Mar. – Apr. '09: Research on Mining Space (P2)
• May – Jul. '09: Research on Diffusion Wavelets (P3)
• Aug. '09: Thesis Write-up
• Sep. '09: Defense

Selected References
• H. Tong & C. Faloutsos. (2006) Center-piece subgraphs: problem definition and fast solutions. In KDD, 404-413.
• H. Tong, C. Faloutsos, & J.Y. Pan. (2006) Fast Random Walk with Restart and Its Applications. In ICDM, 613-622. (b.p. award)
• H. Tong, Y. Koren, & C. Faloutsos. (2007) Fast direction-aware proximity for graph mining. In KDD, 747-756.
• H. Tong, B. Gallagher, C. Faloutsos, & T. Eliassi-Rad. (2007) Fast best-effort pattern matching in large attributed graphs. In KDD, 737-746.
• H. Tong, S. Papadimitriou, P.S. Yu & C. Faloutsos. (2008) Proximity Tracking on Time-Evolving Bipartite Graphs. In SDM. (b.p. award)
• H. Tong, S. Papadimitriou, J. Sun, P.S. Yu & C. Faloutsos. (2008) Fast Mining of Static and Dynamic Graphs. In KDD.
• H. Tong, Y. Sakurai, T. Eliassi-Rad, & C. Faloutsos. (2008) Fast Mining of Complex Time-Stamped Events. In CIKM.
• H. Tong, H. Qu, & H. Jamjoom. (2008) Measuring Proximity on Graphs with Side Information. In ICDM.

My other work during Ph.D study
• GhostEdge (w/ Brian, Christos and Tina, in KDD 08)
– Classification in Sparsely Labeled Networks
• GMine (w/ Junio, Agma, Christos and Jure, in VLDB 06)
– Interactive Graph Visualization and Mining
• Graphite (w/ Polo, Christos, Jason, Brian and Tina, in ICDM 08)
– Visual Query System for Attributed Graphs
• TANGENT (w/ Kensuke and Christos)
– "surprise-me" recommendation
• PaCK (w/ Jingrui, Spiros, Tina, Jaime and Christos)
– Community detection for heterogeneous graphs

Acknowledgements (the old way)

• Christos Faloutsos, Jia-Yu Pan, Yehuda Koren, Spiros Papadimitriou, Philip S. Yu, Jimeng Sun, Huiming Qu, Hani Jamjoom, Tina Eliassi-Rad, Brian Gallagher, Yasushi Sakurai,
• Kensuke Oonuma, Duen Horng (Polo) Chau, Jason I. Hong, Jingrui He, Jaime Carbonell, José Fernando Rodrigues Jr., Jure Leskovec, Agma J. M. Traina,
• Charalampos (Babis) Tsourakakis, Meng Su

A Graph Miner's Way: My Collaboration Graph (During Ph.D Study)

[Figure: collaboration graph with nodes labeled by project: CePS, ProSIN, G-Ray, DAP, pTrack, cTrack, BLin, BBLin, FastUpdate, FastDAP, Fast-ProSIN, Colibri, T3/MT3, P1, P2, P3, GhostEdge, Graphite, PaCK, TANGENT, GMine; grouped under Q1, Q2, Q3, M1, M2 and M3]

Legend: Green: Querying; Blue: Mining; Purple: Others. Separate markers distinguish Completed from Proposed.

Q & A

Thank you!

74