Maximizing Time-decaying Influence
in Social Networks
2016/9/22 ECML-PKDD 2016
Naoto Ohsaka (UTokyo)
Yutaro Yamaguchi (Osaka Univ.)
Naonori Kakimura (UTokyo)
Ken-ichi Kawarabayashi (NII)
ERATO Kawarabayashi Large Graph Project
Given
βΆ Social graph πΊ = π, πΈβΆ Diffusion model β³βΆ Integer π
Goal
βΆ maximize πβ³ π ( π = π)πβ³ π β π[size of cascade initiated by π under β³]
Influence maximization[Kempe-Kleinberg-Tardos. KDD'03]
2
Viral marketing application[Domingos-Richardson. KDD'01]
incentive
incentive
incentive
Formulation by[Kempe-Kleinberg-Tardos. KDD'03]
3
Used classical diffusion models
independent cascade [Goldenberg-Libai-Muller. Market. Lett.'01]
linear threshold [Granovetter. Am. J. Sociol.'78]
οΌ Tractable
Monotonicity & submodularity of π β [Kempe-Kleinberg-Tardos. KDD'03]
Near-linear time approx. algorithms[Borgs-Brautbar-Chayes-Lucier. SODA'14] [Tang-Shi-Xiao. SIGMOD'15]
οΌ Too simple
No time-decay effects, no time-delay propagation
βπ β π, π£ β ππ π βͺ π£ β π π β₯ π π βͺ π£ β π π
The power of influence (rumor)
depends on arrival time
decreases over time
Our aspect : Time-decay effects
4
Our question
What if incorporate time-decay effects?Monotonicity? Submodularity? Scalable algorithm?
Time-delay propagations have been studied[Saito-Kimura-Ohara-Motoda. ACML'09] [Goyal-Bonchi-Lakshmanan. WSDM'10]
[Saito-Kimura-Ohara-Motoda. PKDD'10] [Rodriguez-Balduzzi-SchΓΆlkopf. ICML'11]
[Chen-Lu-Zhang. AAAI'12] [Liu-Cong-Xu-Zeng. ICDM'12]
Time-decay effects have not much β¦[Cui-Yang-Homan. CIKM'14] No guarantee for influence maximization
u
Incentive
vDay 1
v
s uDay 2
Day 1
Incentive
Our contribution
5
Generalize independent cascade & linear threshold with
Time-decay effects & Time-delay propagationNon-increasing probability Arbitrary length distribution
Diffusion process of TV-ICReachability on a random
shrinking dynamic graph=βΆ Characterization
βΆ Monotonicity & submodularity of πTVIC β
βΆ Scalable and accurate greedy algorithmsbased on reverse influence set [Tang-Shi-Xiao. SIGMOD'15]
πͺ π πΈ + πΈ 2
π
log2 |π|
π2time & 1 β eβ1 β π -approx.
Essential
Time-varying independent cascade (TV-IC) model
πΊ = π, πΈ Social graph
πππ Non-increasing probability function
πππ Arbitrary length distribution
6
Objective function
πTVIC π β π[# active vertices initiated by π]
0. π’ became active at π‘π’1. Sample a delay πΏπ’π£ from πππ
v
Incentive
u
s
time π‘π’
0. π’ became active at π‘π’1. Sample a delay πΏπ’π£ from πππ2A. Success w.p. πππ ππ + πΉππ
π£ becomes active at π‘π’ + πΏπ’π£
Time-varying independent cascade (TV-IC) model
πΊ = π, πΈ Social graph
πππ Non-increasing probability function
πππ Arbitrary length distribution
7
v
Incentive
u
s
time π‘π’ time π‘π’ + πΏπ’π£
Objective function
πTVIC π β π[# active vertices initiated by π]
Success
0. π’ became active at π‘π’1. Sample a delay πΏπ’π£ from πππ2B. Failure w.p. π β πππ ππ + πΉππ
π£ remains inactive
Time-varying independent cascade (TV-IC) model
πΊ = π, πΈ Social graph
πππ Non-increasing probability function
πππ Arbitrary length distribution
8
v
Incentive
u
s
time π‘π’
Objective function
πTVIC π β π[# active vertices initiated by π]
Failure
Characterization of the IC model[Kempe-Kleinberg-Tardos. KDD'03]
9
π influences π§in the IC process
π can reach π§in a random subgraph πΊβ²β
S z S z
No need to consider time
βΆ Dynamic β¦
πΊt changes over time
β΅ ππ’π£ changes over time
βΆ Shrinking β¦
πΊt has only edge deletions
β΅ ππ’π£ is non-increasing
Characterization of the TV-IC model
10
π influences π§in the TV-IC process
π can reach π§ in a random
shrinking dynamic graph πΊtβ
S z S z
S z
S z
time 1
time 2
time 3
Monotonicity & submodularity of πTVIC(β )
11
β Greedy algorithm is 1 β eβ1 -approx.[Nemhauser-Wolsey-Fisher. Math. Program.'78]
πTVIC π = ππΊt # vertices reachable from π in πΊt
Our results : Monotone & submodularShrinking is a MUST
π influences π§in the TV-IC process
π can reach π§ in a random
shrinking dynamic graph πΊtβ
Reverse influence set [Tang-Shi-Xiao. SIGMOD'15]
π π β {vertices that would have influenced π§π}
π π β π[# sets in β intersecting π]
Efficient approximation of πTVIC β
12
β chosen u.a.r.
BFS-like alg. for the IC [Borgs-Brautbar-Chayes-Lucier. SODA'14]
Dijkstra-like alg. for the continuous IC [Tang-Shi-Xiao. SIGMOD'15]
Cannot be applied to our model
β =
Greedy algorithm requires estimations of πTVIC β
a
bz1
ca
f
z3
az2
π 1 π 2 π 3
Naive reverse influence set generation
under the TV-IC model
13
π£ β π ππ£ can reach π§π in a random
shrinking dynamic graphβ
β Ξ© π β πΈ time
Check each vertex separately
Efficient reverse influence set generation
under the TV-IC model
14
Computing π π£β π£ must pass through π£,π€ for some π€ to reach π§π β
π π£ = maxπ£,π€ βπΈ
min π π€ β πΏπ£π€, ππ£π€β1 π₯π£π€
π£ β π π π π β₯ πβ
Key = latest activation time π π£max. π π£ s.t. (π£ gets active in π π£ ) β (π§π gets active)
Dynamic programming yields
+β = π π§π β₯ π π£1 β₯ β― β₯ π π£π β₯ 0
v
w1
w2
w3
zi
Performance analysis
15
Lemma 2 Correctness
Lemma 3 Running time = πͺ πΈ β OPT
πlog π in exp.
Theorem 4
Our reverse influence set generation method +
IMM (Influence Maximization via Martingales) framework[Tang-Shi-Xiao. SIGMOD'15]
1 β eβ1 β π -approx. w.h.p.
πͺ π πΈ + πΈ 2
π
log2 |π|
π2expected time
πΈ
πis small for real graphs
Experiments: Setting
βΆComparative algorithms
This work
IMM-CTIC[Tang-Shi-Xiao. SIGMOD'15]
IMM-IC[Tang-Shi-Xiao. SIGMOD'15]
LazyGreedy[Minoux. Optim. Tech.'78]
Degree
Efficient algorithmsunder different models
Accurately estimateπTVIC β by simulations
Simple baseline
βΆππ’π£ π‘ = inβdeg π£ β π β π‘ β1
βΆππ’π£ β = (Weibull distribution) [Lawless. '02]
Experiments: Solution quality
17
LazyGreedy > This work > IMM-IC > IMM-CTIC > Degree
ego-Twitter |π|=81K πΈ =2,421Kca-GrQc |π|=5K πΈ =29K
πTVIC β of seed sets produced by each method
Bett
er
Accuracy guarantee Ignore time-decay effects
Dataset : Stanford Large Network Dataset Collection. http://snap.stanford.edu/data
Environment : Intel Xeon E5540 2.53 GHz CPU + 48 GB memory & g++v4.8.2 (-O2)
Ob
ject
ive
va
lue
Seed size k Seed size k
Ob
ject
ive
va
lue
Experiments: Scalability
18
Degree β« This work β IMM-IC > IMM-CTIC β« LazyGreedy
Running time for selecting seed sets of size π for each method
Quadratic timeNear-linear time
Just sorting
Bett
er
Dataset : Stanford Large Network Dataset Collection. http://snap.stanford.edu/data
Environment : Intel Xeon E5540 2.53 GHz CPU + 48 GB memory & g++v4.8.2 (-O2)
Seed size k Seed size k
ego-Twitter |π|=81K πΈ =2,421Kca-GrQc |π|=5K πΈ =29K
Summary & future work
19
ClassicalConstant probability
ε³
Monotone & submodular
[Kempe+'03]
Influence maximization
πͺ π π + πΈlog π
π2time
1 β eβ1 β π -approx.[Tang+'15]
Our studyTime-decay probability
ε³
Monotone & submodular
Influence maximization
πͺ π πΈ + πΈ 2
π
log2 |π|
π2time
1 β eβ1 β π -approx.
More generalTime-varying probability
ε³
Simple extension violates
submodularity
Efficient influence max.?
π‘
ππ’π£ π‘
π‘
ππ’π£ π‘
π‘
ππ’π£ π‘
generalize generalize