QUINT: On Query-Specific Optimal Networkskeg.cs.tsinghua.edu.cn/jietang/publications/KDD16-slides... · 2016. 8. 14. · Arizona State University Conclusion: QUINT § Goals: Learn

Arizona State University

QUINT: On Query-Specific Optimal Networks

Presenter: Liangyue Li

Joint work with Yuan Yao (NJU), Jie Tang (Tsinghua), Hanghang Tong (ASU), and Wei Fan (Baidu)


Node Proximity: What?

§ Node proximity: the closeness (a.k.a. relevance or similarity) between two nodes

[Figure: example graph with 12 nodes (labeled 1 to 12), annotated with scores between 0.02 and 0.13]

What is the closest node to 4?


Node Proximity: Why?

§ Biology [Ni+]
§ Social networks [Lerman+]
§ E-commerce [Chen+]
§ Disaster management [Zheng+]


Node Proximity: How?

§ Random Walk with Restart (RWR)
– Idea: summarize multiple weighted relationships between nodes
– Variants:
  • Electric networks: SAEC [Faloutsos+]
  • Katz [Katz], [Huang+]
  • Matrix-forest-based algorithm [Chebotarev+]

[Figure: example graph with nodes labeled A, B, D, E, F, G, H, I, J and unit edge weights; several A-to-B paths are highlighted in different colors]

Prox(A, B) = Score(Red Path) + Score(Green Path) + Score(Blue Path) + Score(Purple Path) + …
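The path-sum view of proximity can be illustrated with a Katz-style score, one of the variants listed on this slide, in which every walk of length k between two nodes contributes c^k. The following is a minimal sketch under stated assumptions: the 4-node graph, the decay factor `c`, and the function names are hypothetical and do not come from the slides.

```python
import numpy as np

def katz_truncated(A, c, max_len):
    """Sum c^k * A^k for k = 1..max_len: each walk of length k scores c^k."""
    n = A.shape[0]
    total = np.zeros((n, n))
    power = np.eye(n)
    for _ in range(max_len):
        power = c * (power @ A)  # now holds c^k * A^k
        total += power
    return total

def katz_closed_form(A, c):
    """Closed form of the infinite sum; valid when c < 1 / spectral_radius(A)."""
    n = A.shape[0]
    return np.linalg.inv(np.eye(n) - c * A) - np.eye(n)

# Hypothetical undirected graph: a triangle 0-1-2 with node 3 attached to node 2.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
c = 0.2  # assumed decay; the spectral radius of A is about 2.17, so c < 1/2.17

approx = katz_truncated(A, c, max_len=50)
exact = katz_closed_form(A, c)
```

In this toy graph `approx[0, 1]` exceeds `approx[0, 3]`, because node 1 is reached from node 0 by more and shorter walks than node 3 is.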


Node Proximity: RWR

[Figure: the example graph with nodes 1 to 12]


Node Proximity -- RWR

§ Detail: a random walker starts from s
– (a) transmit to one neighbor with probability $p \sim c A_{ij}$
– (b) go back to s with probability $(1 - c)$
§ Formulation

$$r_s = c A r_s + (1 - c) e_s$$

where $r_s$ is the ranking vector, $A$ is the adjacency matrix, $(1 - c)$ is the restart probability, and $e_s$ is the starting vector.

§ Assumption
– How to best leverage the fixed input graph $A$
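The RWR formulation can be solved either by iterating the update or by solving the linear system directly. The sketch below assumes a column-normalized adjacency matrix (the slides do not say which normalization is used), a hypothetical 4-node graph, and an assumed value of `c`.

```python
import numpy as np

def column_normalize(W):
    """Scale each column to sum to 1, so A[i, j] is the chance of stepping from j to i."""
    col_sums = W.sum(axis=0)
    col_sums[col_sums == 0] = 1.0  # isolated nodes keep an all-zero column
    return W / col_sums

def rwr_power_iteration(A, s, c=0.85, tol=1e-12, max_iter=1000):
    """Iterate r <- c * A @ r + (1 - c) * e_s until the update is smaller than tol."""
    n = A.shape[0]
    e = np.zeros(n)
    e[s] = 1.0
    r = e.copy()
    for _ in range(max_iter):
        r_next = c * (A @ r) + (1 - c) * e
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next
    return r

def rwr_closed_form(A, s, c=0.85):
    """Solve (I - c A) r = (1 - c) e_s directly."""
    n = A.shape[0]
    e = np.zeros(n)
    e[s] = 1.0
    return (1 - c) * np.linalg.solve(np.eye(n) - c * A, e)

# Hypothetical undirected graph: a triangle 0-1-2 with node 3 attached to node 2.
W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
A = column_normalize(W)

r_iter = rwr_power_iteration(A, s=3)
r_exact = rwr_closed_form(A, s=3)
```

With a column-stochastic `A` the scores sum to 1, so `r_iter` can be read as a distribution over nodes. The closed form is only practical for small graphs; the iteration is what scales.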


Node Proximity: Learning RWR

§ Goal
– Use side information to learn a better graph
– Side info: user feedback, node attributes
§ Key Idea: Infer optimal edge weights

$$\min_{w} \; \|w\|^2 + \lambda \sum_{x \in \mathcal{P},\, y \in \mathcal{N}} h\big(Q(y, s) - Q(x, s)\big), \qquad Q = (I - cA)^{-1}$$

– Annotations on the formula: map edge attributes to weights ($w$); match user preferences (the loss term)
§ Limitation: Fixed topology

J. Tang, T. Lou and J. Kleinberg. Transfer Link Prediction across Heterogeneous Social Networks. TOIS, 2015.
L. Backstrom and J. Leskovec. Supervised random walks: predicting and recommending links in social networks. WSDM, 2011.
A. Agarwal, S. Chakrabarti, and S. Aggarwal. Learning to rank networked entities. KDD, 2006.


Algorithmic Questions

§ Q1: optimal weights or optimal topology?
§ Q2: one-fits-all or one-fits-one?
§ Q3: offline learning or online learning?


Q1: Optimal Weights or Topology?

§ Observation: real networks are noisy and incomplete
§ Challenge: learn both the optimal weights and the optimal topology

[Figure: the example graph with nodes 1 to 12 and scores between 0.02 and 0.13, with a missing edge and a noisy edge marked]


Q2: One-fits-all, or one-fits-one?

§ Observation: the optimal network for different queries might be different
§ Challenge:
– How to tailor learning for each query

[Figure: two copies of the example graph with nodes 1 to 12, each marking a query node, positive nodes (P), and negative nodes (N)]


Q3: Offline or Online Learning

§ Observation:
– Learning RWR: costly iterative sub-routine to compute a single gradient vector
– Learning topology: parameter space expands to $O(n^2)$
– One-fits-one: one optimal network for each query
§ Challenge:
– How to perform query-specific online learning?


Query-specific Optimal Network Learning

[Figure: the example graph with nodes 1 to 12, marking the query node, the positive nodes, and the negative nodes]

Given: an input network $A$, a query node $s$, positive nodes $\mathcal{P}$, and negative nodes $\mathcal{N}$
Learn: an optimal network $A_s$ specific to the query $s$


Roadmap

§ Motivations
§ Proposed Solutions: QUINT
§ Empirical Evaluations
§ Conclusions


QUINT - Formulations

§ Optimization Formulation (hard version)

$$\arg\min_{A_s} \; \|A_s - A\|_F^2 \quad \text{s.t. } Q(x, s) > Q(y, s), \;\; \forall x \in \mathcal{P}, \; \forall y \in \mathcal{N}, \qquad Q = (I - cA)^{-1}$$

– Objective: matching the input network
– Constraint: matching the preference (hard), with $x$ ranging over positive nodes and $y$ over negative nodes
§ Remarks
– Larger parameter space: $O(n^2)$
– Query-specific optimal network
– No exception is allowed in the constraint


QUINT - Formulations

§ Optimization Formulation (soft version)

$$\arg\min_{A_s} \; L(A_s) = \lambda \|A_s - A\|_F^2 + \sum_{x \in \mathcal{P},\, y \in \mathcal{N}} g\big(Q(y, s) - Q(x, s)\big), \qquad Q = (I - cA)^{-1}$$

– The sum is the penalty for violating the preferences
§ Remarks
– Characteristic of the loss function $g(\cdot)$:
  $Q(y, s) < Q(x, s) \Rightarrow g(\cdot) = 0$
  $Q(y, s) > Q(x, s) \Rightarrow g(\cdot) > 0$
– Wilcoxon-Mann-Whitney (WMW) loss
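To make the soft objective concrete, the sketch below evaluates it on a toy graph. Several things here are assumptions rather than details from the slides: the sigmoid surrogate for the WMW loss and its width `b` (a sigmoid is only close to 0 for satisfied preferences, not exactly 0), the values of `c` and `lam`, the choice to evaluate Q on the candidate network `A_s`, and the 4-node graph itself.

```python
import numpy as np

def wmw_loss(d, b=0.1):
    """Sigmoid surrogate: close to 0 when d << 0 and close to 1 when d >> 0."""
    return 1.0 / (1.0 + np.exp(-d / b))

def proximity_matrix(A_s, c):
    """Q = (I - c A_s)^{-1}, computed densely (small graphs only)."""
    n = A_s.shape[0]
    return np.linalg.inv(np.eye(n) - c * A_s)

def soft_objective(A_s, A, s, pos, neg, c=0.2, lam=0.1, b=0.1):
    """lam * ||A_s - A||_F^2 plus one penalty per (positive, negative) pair."""
    Q = proximity_matrix(A_s, c)
    fit = lam * np.sum((A_s - A) ** 2)
    penalty = sum(wmw_loss(Q[y, s] - Q[x, s], b) for x in pos for y in neg)
    return fit + penalty

# Hypothetical undirected graph: a triangle 0-1-2 with node 3 attached to node 2.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# Preference that agrees with the graph: node 1 is closer to node 0 than node 3 is.
agree = soft_objective(A, A, s=0, pos=[1], neg=[3])
# Preference that contradicts the graph.
conflict = soft_objective(A, A, s=0, pos=[3], neg=[1])
```

With `A_s = A` the fit term vanishes, so `agree` falls below 0.5 and `conflict` above it: the penalty is what pushes the learned network away from the input when the user's preference disagrees with the current proximity ranking.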


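QUINT -- Optimization

§ Gradient Descent Based Solution
– Gradient (the loss $g$ is differentiable), writing $d_{yx} = Q(y, s) - Q(x, s)$:

$$\frac{\partial L(A_s)}{\partial A_s} = 2\lambda (A_s - A) + \sum_{x \in \mathcal{P},\, y \in \mathcal{N}} \frac{\partial g\big(Q(y, s) - Q(x, s)\big)}{\partial A_s} = 2\lambda (A_s - A) + \sum_{x, y} \frac{\partial g(d_{yx})}{\partial d_{yx}} \left( \frac{\partial Q(y, s)}{\partial A_s} - \frac{\partial Q(x, s)}{\partial A_s} \right)$$

– Derivative of an Inverse, where $J_{ij}$ is the single-entry matrix with a 1 at position $(i, j)$:

$$\frac{\partial Q}{\partial A_s(i, j)} = -Q \, \frac{\partial (I - cA_s)}{\partial A_s(i, j)} \, Q = c \, Q J_{ij} Q$$

$$\frac{\partial Q(x, s)}{\partial A_s(i, j)} = c \, Q(x, i) \, Q(j, s)$$

$Q = (I - cA)^{-1}$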
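The closed-form derivative of Q(x, s) with respect to each entry of the network can be checked numerically. The sketch below compares it against central finite differences on a hypothetical 4-node graph, assuming Q is evaluated on the matrix being perturbed; the graph, `c`, and the function names are illustrative only.

```python
import numpy as np

def proximity_matrix(A_s, c):
    """Q = (I - c A_s)^{-1}, computed densely (small graphs only)."""
    n = A_s.shape[0]
    return np.linalg.inv(np.eye(n) - c * A_s)

def dQ_entry(Q, x, s, c):
    """Matrix whose (i, j) entry is dQ(x, s)/dA_s(i, j) = c * Q(x, i) * Q(j, s)."""
    return c * np.outer(Q[x, :], Q[:, s])

def numeric_dQ_entry(A_s, x, s, c, eps=1e-6):
    """Central finite differences over every entry A_s(i, j)."""
    n = A_s.shape[0]
    grad = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            E = np.zeros((n, n))
            E[i, j] = eps
            plus = proximity_matrix(A_s + E, c)[x, s]
            minus = proximity_matrix(A_s - E, c)[x, s]
            grad[i, j] = (plus - minus) / (2 * eps)
    return grad

# Hypothetical undirected graph: a triangle 0-1-2 with node 3 attached to node 2.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
c = 0.2  # assumed value, small enough that I - cA is well conditioned

Q = proximity_matrix(A, c)
analytic = dQ_entry(Q, x=3, s=0, c=c)
numeric = numeric_dQ_entry(A, x=3, s=0, c=c)
```

The analytic form needs one row and one column of Q for all n^2 entries, whereas the finite-difference version recomputes an inverse per entry; that gap is why the closed form matters.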


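QUINT -- Optimization

§ Intuition

$$\frac{\partial Q(x, s)}{\partial A_s(i, j)} = c \, Q(x, i) \, Q(j, s) \;\propto\; Q(j, s) \times Q(x, i)$$

[Figure: query node $s$, positive node $x$, and an edge $(i, j)$; the factor $Q(j, s)$ is labeled "Neighbor of s" and the factor $Q(x, i)$ is labeled "Neighbor of x"]

§ Complexity: $O\big(T_1 |\mathcal{P}| \cdot |\mathcal{N}| (T_2 m + n^2)\big)$
§ Observation
– Usually $T_1, T_2, |\mathcal{P}|, |\mathcal{N}| \ll m, n$
– Complexity: quadratic

Q: how to scale up?

$Q = (I - cA)^{-1}$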
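Putting the gradient together with a descent loop gives a rough picture of the basic (quadratic-cost) procedure. This is a sketch under stated assumptions, not the authors' implementation: it uses a dense inverse instead of an iterative sub-routine, a sigmoid surrogate for the loss, a simple step-halving rule, and hypothetical values for `c`, `lam`, `b`, and the toy graph. It also places no non-negativity or symmetry constraint on the learned matrix.

```python
import numpy as np

def proximity_matrix(A_s, c):
    n = A_s.shape[0]
    return np.linalg.inv(np.eye(n) - c * A_s)

def sigmoid(d, b):
    return 1.0 / (1.0 + np.exp(-d / b))

def objective(A_s, A, s, pos, neg, c, lam, b):
    Q = proximity_matrix(A_s, c)
    penalty = sum(sigmoid(Q[y, s] - Q[x, s], b) for x in pos for y in neg)
    return lam * np.sum((A_s - A) ** 2) + penalty

def gradient(A_s, A, s, pos, neg, c, lam, b):
    Q = proximity_matrix(A_s, c)
    G = 2.0 * lam * (A_s - A)
    for x in pos:
        for y in neg:
            p = sigmoid(Q[y, s] - Q[x, s], b)
            dp = p * (1.0 - p) / b  # derivative of the sigmoid surrogate
            # dQ(y,s)/dA_s - dQ(x,s)/dA_s, with dQ(x,s)/dA_s(i,j) = c*Q(x,i)*Q(j,s)
            G += dp * c * np.outer(Q[y, :] - Q[x, :], Q[:, s])
    return G

def learn_query_network(A, s, pos, neg, c=0.2, lam=1.0, b=0.1, steps=50, lr=0.5):
    """Gradient descent on A_s; a step is halved until the objective decreases."""
    A_s = A.copy()
    current = objective(A_s, A, s, pos, neg, c, lam, b)
    for _ in range(steps):
        G = gradient(A_s, A, s, pos, neg, c, lam, b)
        step = lr
        while step > 1e-8:
            candidate = A_s - step * G
            value = objective(candidate, A, s, pos, neg, c, lam, b)
            if value < current:
                A_s, current = candidate, value
                break
            step *= 0.5
    return A_s

# Hypothetical undirected graph: a triangle 0-1-2 with node 3 attached to node 2.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
s, pos, neg = 0, [3], [1]  # preference that contradicts the input graph

initial_loss = objective(A, A, s, pos, neg, c=0.2, lam=1.0, b=0.1)
A_learned = learn_query_network(A, s, pos, neg)
final_loss = objective(A_learned, A, s, pos, neg, c=0.2, lam=1.0, b=0.1)
```

Every iteration touches all n^2 entries of `A_s`, which mirrors the quadratic term in the complexity above and motivates the question of how to scale up.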


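QUINT – Scale-up

§ Key idea: the optimal network is a rank-one perturbation of the original network, $A_s = A + \mathbf{f} \mathbf{g}'$
§ Details:

$$\arg\min_{\mathbf{f}, \mathbf{g}} \; L(\mathbf{f}, \mathbf{g}) = \lambda \|\mathbf{f} \mathbf{g}'\|_F^2 + \beta \big(\|\mathbf{f}\|^2 + \|\mathbf{g}\|^2\big) + \sum_{x \in \mathcal{P},\, y \in \mathcal{N}} g\big(Q(y, s) - Q(x, s)\big), \qquad Q = (I - cA)^{-1}$$

§ Optimization: alternating gradient descent
§ Complexity: $O\big(T_1 |\mathcal{P}| \cdot |\mathcal{N}| (T_2 m + n)\big)$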
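For the rank-one formulation, the gradients with respect to the two vectors follow from the chain rule applied to the full-matrix gradient. The sketch below assumes the learned network is the input plus the outer product of `f` and `g`, uses a sigmoid surrogate for the loss, and picks hypothetical values for `c`, `lam`, `beta`, and `b`; a finite-difference helper is included only to sanity-check the formulas.

```python
import numpy as np

def penalty_and_grad(A_s, s, pos, neg, c, b):
    """Preference penalty and its gradient with respect to the full matrix A_s."""
    n = A_s.shape[0]
    Q = np.linalg.inv(np.eye(n) - c * A_s)
    value, G = 0.0, np.zeros((n, n))
    for x in pos:
        for y in neg:
            p = 1.0 / (1.0 + np.exp(-(Q[y, s] - Q[x, s]) / b))
            value += p
            G += (p * (1.0 - p) / b) * c * np.outer(Q[y, :] - Q[x, :], Q[:, s])
    return value, G

def rank_one_loss_and_grads(f, g, A, s, pos, neg, c=0.2, lam=0.1, beta=0.1, b=0.1):
    """L(f, g) and its gradients, with A_s = A + outer(f, g)."""
    penalty, G = penalty_and_grad(A + np.outer(f, g), s, pos, neg, c, b)
    ff, gg = f @ f, g @ g
    loss = lam * ff * gg + beta * (ff + gg) + penalty  # ||f g'||_F^2 = ||f||^2 ||g||^2
    grad_f = 2 * lam * gg * f + 2 * beta * f + G @ g    # chain rule through A_s(i,j) = f_i g_j
    grad_g = 2 * lam * ff * g + 2 * beta * g + G.T @ f
    return loss, grad_f, grad_g

def numeric_grad(fun, v, eps=1e-6):
    """Central finite differences, used only as a check."""
    grad = np.zeros_like(v)
    for i in range(v.size):
        step = np.zeros_like(v)
        step[i] = eps
        grad[i] = (fun(v + step) - fun(v - step)) / (2 * eps)
    return grad

# Hypothetical undirected graph: a triangle 0-1-2 with node 3 attached to node 2.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
f0 = 0.1 * rng.standard_normal(4)
g0 = 0.1 * rng.standard_normal(4)

loss0, grad_f0, grad_g0 = rank_one_loss_and_grads(f0, g0, A, 0, [3], [1])
num_f = numeric_grad(lambda f: rank_one_loss_and_grads(f, g0, A, 0, [3], [1])[0], f0)
num_g = numeric_grad(lambda g: rank_one_loss_and_grads(f0, g, A, 0, [3], [1])[0], g0)
```

An alternating scheme would update `f` using `grad_f0` with `g` held fixed, then update `g` with `f` held fixed. The number of free parameters drops from n^2 to 2n, which matches the change from n^2 to n in the complexity.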


QUINT – Variant #1

§ Key idea: apply Taylor approximation for $Q$
§ Details:

$$Q = (I - cA)^{-1} \approx I + \sum_{i=1}^{k} c^i A^i$$

§ Complexity: $O\big(T_1 |\mathcal{P}| \cdot |\mathcal{N}| \, n\big)$ using 1st order Taylor
§ Benefit: accessing $Q(i, j)$ faster
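The effect of truncating the series can be seen on a small example. The sketch below compares truncations of different order against the exact inverse; the toy graph and the value of `c` are assumptions chosen so that the series converges.

```python
import numpy as np

def taylor_proximity(A, c, order):
    """Approximate (I - cA)^{-1} by I + sum_{i=1..order} c^i A^i."""
    n = A.shape[0]
    approx = np.eye(n)
    term = np.eye(n)
    for _ in range(order):
        term = c * (term @ A)  # now holds c^i * A^i
        approx += term
    return approx

# Hypothetical undirected graph: a triangle 0-1-2 with node 3 attached to node 2.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
c = 0.2  # assumed; c times the spectral radius of A is below 1

exact = np.linalg.inv(np.eye(4) - c * A)
first_order = taylor_proximity(A, c, order=1)  # equals I + cA
errors = [np.abs(taylor_proximity(A, c, k) - exact).max() for k in (1, 3, 10)]
```

With the first-order form, an entry of Q is read straight from the adjacency matrix (plus 1 on the diagonal), so no linear system has to be solved to look it up.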


QUINT – Variant #2

§ Key idea: Only update the neighborhood of the query node and the pos/neg nodes (Localized Rank-One Perturbation)
§ Complexity: $O\big(T_1 |\mathcal{P}| \cdot |\mathcal{N}| \max(|N(s)|, |N(\mathcal{P}, \mathcal{N})|)\big)$
– $N(s)$: neighbors of $s$
– $N(\mathcal{P}, \mathcal{N})$: neighbors of pos/neg nodes
– $\max(|N(s)|, |N(\mathcal{P}, \mathcal{N})|) \ll n$
§ Benefit: usually sub-linear in $n$
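As a rough illustration of why the localized variant touches few parameters, the sketch below collects the two neighborhoods on a toy path graph. The helper name, the decision to count a node as part of its own neighborhood, and the graph are all assumptions; the slides do not specify how the restriction is implemented.

```python
import numpy as np

def neighborhood(A, nodes):
    """Sorted indices adjacent to any node in `nodes`, plus the nodes themselves."""
    mask = np.zeros(A.shape[0], dtype=bool)
    for v in nodes:
        mask[v] = True
        mask |= (A[v, :] != 0) | (A[:, v] != 0)
    return np.flatnonzero(mask)

# Hypothetical path graph 0 - 1 - ... - 9.
n = 10
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0

s, pos, neg = 0, [9], [5]
N_s = neighborhood(A, [s])          # neighborhood of the query node
N_pn = neighborhood(A, pos + neg)   # neighborhood of the positive/negative nodes

# Entries that a localized update would be allowed to change, versus 2n for
# an unrestricted rank-one perturbation.
free_params = len(N_s) + len(N_pn)
```

Here `free_params` is 7 against 20 for the unrestricted rank-one case; on a large sparse graph the gap is far wider, since neighborhood sizes depend on node degree rather than on n.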





Datasets


10+ diverse networks


Effectiveness: MAP (Higher is better)

[Figure: bar chart of MAP (y-axis 0 to 0.9) on Astro-Ph, GR-QC, Hep-TH, Hep-PH, Protein, Airport, Oregon, NBA, Email, Gene, and Last.fm. Methods compared: Adamic/Adar, Common Nbr, SRW, RWR, wiZAN_Dual, ProSIN, QUINT-Basic, QUINT-Basic1st, QUINT-rankOne]

MAP: Mean Average Precision
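For reference, MAP can be computed as below. This follows the standard definition of average precision over a ranked list; the function names and the toy rankings are illustrative and are not taken from the evaluation in the slides.

```python
def average_precision(ranked, relevant):
    """Mean of precision@k taken at each rank k where a relevant item appears."""
    relevant = set(relevant)
    hits, total = 0, 0.0
    for rank, node in enumerate(ranked, start=1):
        if node in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(queries):
    """`queries` is a list of (ranked_list, relevant_items) pairs, one per query node."""
    return sum(average_precision(r, rel) for r, rel in queries) / len(queries)

# Two hypothetical queries.
ap_first = average_precision([1, 2, 3, 4], [1, 3])   # (1/1 + 2/3) / 2
ap_second = average_precision([3, 1, 2], [1])        # (1/2) / 1
map_score = mean_average_precision([([1, 2, 3, 4], [1, 3]), ([3, 1, 2], [1])])
```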


Effectiveness: HLU (Higher is better)

[Figure: bar chart of HLU (y-axis 0 to 90). Methods compared: Adamic/Adar, Common Nbr, SRW, RWR, wiZAN_Dual, ProSIN, QUINT-Basic, QUINT-Basic1st, QUINT-rankOne]

HLU: Half-life Utility


Effectiveness: AUC (Higher is better)

[Figure: bar chart of AUC (y-axis 0 to 1). Methods compared: Adamic/Adar, Common Nbr, SRW, RWR, wiZAN_Dual, ProSIN, QUINT-Basic, QUINT-Basic1st, QUINT-rankOne]
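AUC is closely related to the pairwise preference loss used in the formulation: it is the fraction of (positive, negative) pairs that a scoring ranks in the right order, which is the Wilcoxon-Mann-Whitney statistic. A minimal sketch, with hypothetical scores and the usual convention of counting ties as half:

```python
def auc(scores, pos, neg):
    """Fraction of (positive, negative) pairs where the positive scores higher."""
    wins = 0.0
    for x in pos:
        for y in neg:
            if scores[x] > scores[y]:
                wins += 1.0
            elif scores[x] == scores[y]:
                wins += 0.5  # ties count as half
    return wins / (len(pos) * len(neg))

# Hypothetical proximity scores for four nodes.
scores = {0: 0.9, 1: 0.8, 2: 0.3, 3: 0.1}
value = auc(scores, pos=[0, 2], neg=[1, 3])  # 3 of the 4 pairs are ordered correctly
```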


Effectiveness: Precision@20 (Higher is better)

[Figure: bar chart of Precision@20 (y-axis 0 to 1). Methods compared: Adamic/Adar, Common Nbr, SRW, RWR, wiZAN_Dual, ProSIN, QUINT-Basic, QUINT-Basic1st, QUINT-rankOne]


Effectiveness: Recall@5 (Higher is better)

[Figure: bar chart of Recall@5 (y-axis 0 to 0.2). Methods compared: Adamic/Adar, Common Nbr, SRW, RWR, wiZAN_Dual, ProSIN, QUINT-Basic, QUINT-Basic1st, QUINT-rankOne]


Effectiveness: MPR (Lower is better)

[Figure: bar chart of MPR (y-axis 0 to 0.5). Methods compared: Adamic/Adar, Common Nbr, SRW, RWR, wiZAN_Dual, ProSIN, QUINT-Basic, QUINT-Basic1st, QUINT-rankOne]

MPR: Mean Percentile Ranking
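One common way to compute a mean percentile ranking is sketched below: each positive item gets a percentile of 0 at the top of the ranked list and 1 at the bottom, and the percentiles are averaged. This definition is an assumption, since the slides name the metric without spelling it out, and the ranking used here is a toy example.

```python
def mean_percentile_ranking(ranked, positives):
    """Average percentile position of the positive items (0 = top, 1 = bottom)."""
    n = len(ranked)
    position = {node: idx for idx, node in enumerate(ranked)}
    percentiles = [position[p] / (n - 1) for p in positives]
    return sum(percentiles) / len(percentiles)

# Hypothetical ranking of five nodes; "a" is at the top and "c" in the middle.
mpr = mean_percentile_ranking(["a", "b", "c", "d", "e"], ["a", "c"])
```

A value near 0 means the positive nodes sit at the top of the ranking, which is why lower is better for this metric.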


Efficiency -- Twitter

[Figure: four panels of running time (seconds) on Twitter. Two panels plot running time against # edges (x-axis 0 to 15 × 10^8) and two against # nodes (x-axis 0 to 4 × 10^7). The linear-scale panels show QUINT-rankOne (y-axis 0.16 to 0.34); the log-scale panels (y-axis 10^-1 to 10^3) compare QUINT-Basic1st and QUINT-rankOne, with an annotation at 1 s]

QUINT-rankOne scales sub-linearly


Conclusion: QUINT

§ Goals: Learn optimal network (for node proximity)

            Q1                  Q2             Q3
Existing    Optimal weights     One-fits-all   Offline
QUINT       Optimal topology    One-fits-one   Online

§ Algorithms: VERY efficient way to compute $\partial Q(x, s) / \partial A_s(i, j)$
– Rank-1 approx + Taylor approx + local search
§ Results:
– consistently better on 10+ networks & 6 metrics
– sublinear scalability, near real-time response on billion-scale networks

[Figures repeated from earlier slides: the gradient intuition diagram, the MAP bar chart, and the running time on Twitter versus # edges]
