Monotone 𝑘-Submodular Function Maximization with Size Constraints
Naoto Ohsaka (UTokyo)
Joint work with Yuichi Yoshida (NII & PFI)
2016/3/23
My research interests: graph algorithms in practice & theory
Work in [Ohsaka, Akiba, Yoshida, Kawarabayashi. AAAI'14]: a fast algorithm for influence maximization [Kempe, Kleinberg, Tardos. KDD'03]
Q. How to distribute free items to 𝐵 people to maximize the spread of influence?
(e.g., rumors & product adoptions)
This is an instance of monotone submodular maximization.
■Introduction
[Figure: a social network of people receiving an item; edge labels 0.1–0.9 are influence probabilities]
Monotone submodular maximization
Monotonicity: 𝑓(𝑆) ≤ 𝑓(𝑇) for 𝑆 ⊆ 𝑇
Submodularity (diminishing returns): 𝑓(𝑆 + 𝑒) − 𝑓(𝑆) ≥ 𝑓(𝑇 + 𝑒) − 𝑓(𝑇) for 𝑆 ⊆ 𝑇 & 𝑒 ∉ 𝑇
Goal: max 𝑓(𝑆) s.t. |𝑆| = 𝐵
Such problems arise naturally, and a (1 − 1/𝑒) ≈ 0.63-approximation is achievable in polynomial time [Nemhauser, Wolsey, Fisher. Math. Program. '78]
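As an illustration not taken from the slides, the greedy algorithm behind the 0.63-approximation can be sketched in Python on a toy coverage function (the set family below is assumed example data):

```python
def coverage(S, sets):
    """f(S) = number of ground elements covered by the chosen sets.
    Coverage is monotone and submodular."""
    covered = set()
    for i in S:
        covered |= sets[i]
    return len(covered)

def greedy(sets, B):
    """Greedy (1 - 1/e)-approximation for max f(S) s.t. |S| = B."""
    S = []
    for _ in range(B):
        # pick the index with the largest marginal gain
        best = max((i for i in range(len(sets)) if i not in S),
                   key=lambda i: coverage(S + [i], sets) - coverage(S, sets))
        S.append(best)
    return S

sets = [{1, 2, 3}, {3, 4}, {4, 5, 6}, {1, 6}]
S = greedy(sets, B=2)
print(S, coverage(S, sets))  # → [0, 2] 6
```

Each round costs one marginal-gain evaluation per remaining candidate, which is the cost the later slides reduce with lazy evaluations and random sampling.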
■Introduction
Document summarization [Lin,Bilmes. ACL-HLT'11]
Network inference [Gomez-Rodriguez,Leskovec,Krause. KDD'10]
Sensor placement [Krause,Singh,Guestrin. J. Mach. Learn. Res.'08]
[Figure: diminishing returns — investment vs. return]
Motivation of this research
We often have multiple kinds of items 1, 2, 3
Each item has a different topic ⇨ different cascades
Q. How to distribute 𝒌 kinds of items to 𝐵 people?
This can be modeled with 𝒌-submodular functions.
■Introduction
[Figure: cascades of items 1, 2, 3 in separate graphs, combined by union ⋃]
Our contributions
Presented in [Ohsaka, Yoshida. NIPS'15]
■Introduction
Monotone 𝑘-submodular function maximization under:
Total size constraint (#1 + #2 + #3 = 𝐵)
  Approx. ratio = 1/2; # function eval.: Greedy quadratic, Randomized almost linear
  ⇨ influence maximization with 𝑘 topics
Individual size constraint (#1 = 𝐵1, #2 = 𝐵2, #3 = 𝐵3)
  Approx. ratio = 1/3; # function eval.: Greedy quadratic, Randomized almost linear
  ⇨ sensor placement with 𝑘 kinds of sensors
Approximation algorithms
Experimental evaluations
Related work
Theoretical results under NO constraint [Iwata, Tanigawa, Yoshida. SODA'16]
  Non-monotone: 1/2-approx. algorithm
  Monotone: 𝑘/(2𝑘 − 1)-approx. algorithm
  ((𝑘 + 1)/(2𝑘) + 𝜖)-approx. needs exponentially many queries
Application of bi(2-)submodular functions [Singh, Guillory, Bilmes. AISTATS'12]
  Sensor placement & feature selection
■Introduction
Function form
■𝑘-submodular functions
We consider a function of the form 𝑓: {0, 1, …, 𝑘}^𝑉 → ℝ with |𝑉| = 𝑛, equivalently written 𝑓: (𝑘 + 1)^𝑉 → ℝ.
Another interpretation: a vector 𝒙 ∈ {0, 1, …, 𝑘}^𝑉 is in one-to-one correspondence with a family of 𝑘 disjoint subsets 𝑋1, …, 𝑋𝑘 of 𝑉, where 𝑋𝑖 collects the elements assigned label 𝑖. For example:
𝑒:     𝒂  𝒃  𝒄  𝒅  𝒆  𝒇
𝒙(𝑒):  1   3   2   1   0   3
⇨ 𝑋1 = {𝒂, 𝒅}, 𝑋2 = {𝒄}, 𝑋3 = {𝒃, 𝒇}
Characterization of 𝑘-submodular functions [Huber, Kolmogorov. ISCO'12] [Ward, Živný. ACM Trans. Algor. '15]
■𝑘-submodular functions
Let Δ𝑒,𝑖𝑓(𝒙) denote the change of 𝑓 when label 𝑖 is assigned to the 𝑒-th element of 𝒙.
𝑓 is monotone 𝒌-submodular iff ① & ②:
① Monotone: Δ𝑒,𝑖𝑓(𝒙) ≥ 0 for every 𝒙 with 𝒙(𝑒) = 0
② Orthant submodular: Δ𝑒,𝑖𝑓(𝒙) ≥ Δ𝑒,𝑖𝑓(𝒚) for every 𝒙 ≼ 𝒚 with 𝒚(𝑒) = 0
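As a sketch not from the slides, conditions ① and ② can be verified by brute force for small 𝑛 and 𝑘. The toy function below (a coverage-style function on assumed example data, where label 𝑖 on element 𝑒 covers a fixed set) is monotone 𝑘-submodular:

```python
from itertools import product

def is_monotone_orthant_submodular(f, n, k):
    """Brute-force check of (1) monotonicity and (2) orthant submodularity
    for f over {0, ..., k}^n, where 0 means "unassigned"."""
    def delta(x, e, i):
        y = list(x); y[e] = i
        return f(tuple(y)) - f(x)
    for x in product(range(k + 1), repeat=n):
        for e in range(n):
            if x[e] != 0:
                continue
            for i in range(1, k + 1):
                if delta(x, e, i) < 0:          # (1) gains are non-negative
                    return False
                for y in product(range(k + 1), repeat=n):
                    # x ⪯ y: wherever x is assigned, y agrees
                    if y[e] == 0 and all(a == b or a == 0 for a, b in zip(x, y)):
                        if delta(x, e, i) < delta(y, e, i) - 1e-9:  # (2)
                            return False
    return True

# Toy data: COVER[i][e] = targets covered when element e gets label i.
COVER = {1: [{0, 1}, {1, 2}], 2: [{2}, {0, 3}]}
def f(x):
    covered = set()
    for e, lab in enumerate(x):
        if lab != 0:
            covered |= COVER[lab][e]
    return len(covered)

ok = is_monotone_orthant_submodular(f, n=2, k=2)
print(ok)  # → True
```

Coverage-style functions pass because a larger assignment can only cover more targets, so marginal gains shrink.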
Modeling influence maximization with 𝑘 topics
■𝑘-submodular functions
We have 𝑘 graphs 𝐺1, …, 𝐺𝑘 and a strategy 𝒔 ∈ (𝑘 + 1)^𝑉, e.g., 𝒔 = (1, 0, 3, 3, 0, 2) on vertices 𝑎–𝑓.
Influence from 𝑣 (with 𝒔(𝑣) = 𝑖) stochastically spreads in 𝐺𝑖.
𝑓(𝒔) ≔ Exp. # vertices influenced in at least one of the 𝑘 graphs
[Figure: random spread events; e.g., the union over the 𝑘 graphs influences 5 vertices in event 1, …, 4 vertices in event 𝑟]
⇨ 𝑓 is monotone & 𝑘-submodular
Monotone submodular maximization by the greedy algorithm with lazy evaluations [Nemhauser, Wolsey, Fisher. Math. Program. '78] [Minoux '78]
■Preliminaries
𝐟𝐨𝐫 𝑗 = 1 𝐭𝐨 𝐵:
  𝑒𝑗 ← argmax_𝑒 Δ𝑒(𝑆𝑗−1) ≔ 𝑓(𝑆𝑗−1 + 𝑒) − 𝑓(𝑆𝑗−1)
  𝑆𝑗 ← 𝑆𝑗−1 + 𝑒𝑗
By submodularity, gains computed in earlier rounds are upper bounds on current gains, so many Δ𝑒 need not be recomputed:
𝑗   𝑆𝑗       Δ𝑎(𝑆𝑗−1)  Δ𝑏(𝑆𝑗−1)  Δ𝑐(𝑆𝑗−1)  Δ𝑑(𝑆𝑗−1)  Δ𝑒(𝑆𝑗−1)
1   {𝑏}      7          10         3          8          6
2   {𝑏, 𝑒}   ―          ―         ≤ 3        2          5
3   …        ―          ―         ≤ 3        ≤ 2        ―
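A minimal sketch (not from the slides) of lazy evaluations with a max-heap, again on an assumed toy coverage function:

```python
import heapq

def lazy_greedy(f, elements, B):
    """Greedy with lazy evaluations [Minoux '78]: since marginal gains of a
    monotone submodular f only shrink, a stale gain is an upper bound, and
    an element whose refreshed gain still tops the heap can be taken."""
    S = []
    heap = [(-(f([e]) - f([])), e) for e in elements]  # (-upper bound, element)
    heapq.heapify(heap)
    for _ in range(B):
        while True:
            _, e = heapq.heappop(heap)
            gain = f(S + [e]) - f(S)  # refresh the possibly-stale gain
            if not heap or gain >= -heap[0][0]:
                S.append(e)           # still the best: no need to refresh others
                break
            heapq.heappush(heap, (-gain, e))
    return S

# toy coverage function (assumed example data)
sets = {'a': {1, 2, 3}, 'b': {3, 4}, 'c': {4, 5, 6}}
def f(S):
    return len(set().union(*(sets[e] for e in S))) if S else 0

S = lazy_greedy(f, ['a', 'b', 'c'], B=2)
print(S)  # → ['a', 'c']
```

The output matches plain greedy; only the number of evaluations of f changes.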
Greedy algorithm for the total size constraint
■Proposed algorithms
Constraint: #1 + #2 + #3 + ⋯ + #𝑘 = 𝐵
𝑘-Greedy-TS (total size):
𝒔0 ← 𝟎
𝐟𝐨𝐫 𝑗 = 1 𝐭𝐨 𝐵:
  (𝑒𝑗, 𝑖𝑗) ← argmax_{𝑒: 𝒔𝑗−1(𝑒)=0, 𝑖} Δ𝑒,𝑖𝑓(𝒔𝑗−1)
  𝒔𝑗 ← assign 𝑖𝑗 to the 𝑒𝑗-th element of 𝒔𝑗−1
# function eval. = 𝑶(𝒌𝒏𝑩)
Approx. ratio = 𝟏/𝟐
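The pseudocode above can be sketched in Python; the toy 𝑘-coverage function below is assumed example data, not the paper's influence objective:

```python
def k_greedy_ts(f, n, k, B):
    """k-Greedy-TS (sketch): B rounds; each round assigns the (element, label)
    pair with the largest marginal gain. 1/2-approx. for monotone k-submodular f.
    f maps a tuple in {0, ..., k}^n to a real; 0 means "unassigned"."""
    s = [0] * n
    for _ in range(B):
        best, best_val = None, float('-inf')
        for e in range(n):
            if s[e] != 0:
                continue
            for i in range(1, k + 1):
                s[e] = i
                val = f(tuple(s))  # f(s_{j-1}) is fixed, so max value = max gain
                s[e] = 0
                if val > best_val:
                    best, best_val = (e, i), val
        e, i = best
        s[e] = i
    return s

# toy k-coverage: label i on element e covers COVER[i][e] (assumed data)
COVER = {1: [{0, 1}, {1, 2}, {3}], 2: [{2, 3}, {0}, {0, 1, 2, 3}]}
def f(x):
    covered = set()
    for e, lab in enumerate(x):
        if lab:
            covered |= COVER[lab][e]
    return len(covered)

s = k_greedy_ts(f, n=3, k=2, B=2)
print(s, f(tuple(s)))  # → [1, 0, 2] 4
```

Each of the 𝐵 rounds tries at most 𝑘𝑛 pairs, giving the 𝑂(𝑘𝑛𝐵) evaluation count on the slide.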
Proof sketch
■Proposed algorithms
Construct a sequence 𝒐0 = 𝒐∗, 𝒐1, …, 𝒐𝐵 = 𝒔𝐵 from the optimal solution 𝒐∗ to the greedy output 𝒔𝐵, preserving (# nonzero in 𝒐𝑗) = 𝐵: at step 𝑗, swap 𝒐𝑗−1 so that it agrees with the first 𝑗 greedy assignments.
Example on elements 𝑎–𝑓 (𝐵 = 4):
Greedy: 𝒔0 = (0,0,0,0,0,0) → 𝒔1 = (3,0,0,0,0,0) → 𝒔2 = (3,1,0,0,0,0) → 𝒔3 = (3,1,4,0,0,0) → 𝒔4 = (3,1,4,1,0,0)
Swap: 𝒐0 = 𝒐∗ = (2,0,4,2,3,0) → 𝒐1 → 𝒐2 → 𝒐3 → 𝒐4 = 𝒔4 = (3,1,4,1,0,0)
Key inequality: 𝑓(𝒔𝑗) − 𝑓(𝒔𝑗−1) ≥ 𝑓(𝒐𝑗−1) − 𝑓(𝒐𝑗)
Proof sketch (continued)
■Proposed algorithms
Summing the key inequality over 𝑗 = 1, …, 𝐵 telescopes on both sides:
𝑓(𝒐∗) − 𝑓(𝒔) ≤ 𝑓(𝒔) − 𝑓(𝟎) ≤ 𝑓(𝒔)  ⇨  𝒇(𝒔) ≥ (𝟏/𝟐) 𝒇(𝒐∗)
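The telescoping step behind the 1/2 bound can be written out as:

```latex
\sum_{j=1}^{B}\bigl(f(\mathbf{s}_j)-f(\mathbf{s}_{j-1})\bigr)
\;\ge\; \sum_{j=1}^{B}\bigl(f(\mathbf{o}_{j-1})-f(\mathbf{o}_j)\bigr)
\;\Longrightarrow\;
f(\mathbf{s}_B)-f(\mathbf{0}) \;\ge\; f(\mathbf{o}^*)-f(\mathbf{s}_B)
\;\Longrightarrow\;
f(\mathbf{s}_B) \;\ge\; \tfrac{1}{2}\,f(\mathbf{o}^*),
```

using 𝑓(𝟎) ≥ 0 and 𝒐𝐵 = 𝒔𝐵 in the last two implications.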
Speeding-up by random sampling
■Proposed algorithms
𝑂(𝑘𝑛𝐵) may be quadratic (e.g., 𝐵 = 𝑛/2). Can it be reduced?
We combine random sampling [Mirzasoleiman, Badanidiyuru, Karbasi, Vondrák, Krause. AAAI'15].
At step 𝑗: instead of looking at all elements, look at a single random element.
With probability ≥ (𝐵 − 𝑗 + 1)/(𝑛 − 𝑗 + 1), 𝑓(𝒔𝑗) − 𝑓(𝒔𝑗−1) ≥ 𝑓(𝒐𝑗−1) − 𝑓(𝒐𝑗)
Looking at more elements boosts this success probability.
Almost linear-time algorithm for the total size constraint
■Proposed algorithms
𝑘-Stochastic-Greedy-TS:
𝒔0 ← 𝟎
𝐟𝐨𝐫 𝑗 = 1 𝐭𝐨 𝐵:
  𝑅𝑗 ← a random subset of size min{((𝑛 − 𝑗 + 1)/(𝐵 − 𝑗 + 1)) log(𝐵/𝛿), 𝑛}
  (𝑒𝑗, 𝑖𝑗) ← argmax_{𝑒∈𝑅𝑗: 𝒔𝑗−1(𝑒)=0, 𝑖} Δ𝑒,𝑖𝑓(𝒔𝑗−1)
  𝒔𝑗 ← assign 𝑖𝑗 to the 𝑒𝑗-th element of 𝒔𝑗−1
# function eval. = 𝑶(𝒌𝒏 𝐥𝐨𝐠 𝑩 ⋅ 𝐥𝐨𝐠(𝑩/𝜹))
Approx. ratio = 𝟏/𝟐 w.p. 𝟏 − 𝜹
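A hedged Python sketch of this sampled variant, reusing the same assumed toy 𝑘-coverage function (with tiny 𝑛, the sample sizes cover everything, so it behaves like plain greedy):

```python
import math
import random

def k_stochastic_greedy_ts(f, n, k, B, delta=0.1, rng=None):
    """k-Stochastic-Greedy-TS (sketch): per step, evaluate only a random
    subset R_j of unassigned elements; 1/2-approx. w.p. >= 1 - delta."""
    rng = rng or random.Random(0)
    s = [0] * n
    for j in range(1, B + 1):
        remaining = [e for e in range(n) if s[e] == 0]
        size = min(max(1, math.ceil((n - j + 1) / (B - j + 1)
                                    * math.log(B / delta))),
                   len(remaining))
        R = rng.sample(remaining, size)
        best, best_val = None, float('-inf')
        for e in R:
            for i in range(1, k + 1):
                s[e] = i
                val = f(tuple(s))
                s[e] = 0
                if val > best_val:
                    best, best_val = (e, i), val
        e, i = best
        s[e] = i
    return s

# assumed toy data, as before
COVER = {1: [{0, 1}, {1, 2}, {3}], 2: [{2, 3}, {0}, {0, 1, 2, 3}]}
def f(x):
    covered = set()
    for e, lab in enumerate(x):
        if lab:
            covered |= COVER[lab][e]
    return len(covered)

s = k_stochastic_greedy_ts(f, n=3, k=2, B=2)
print(f(tuple(s)))  # → 4
```

Only the candidate set changes relative to 𝑘-Greedy-TS; the per-candidate work is identical.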
Influence maximization with 𝑘 topics
■Experimental evaluations
Given: 𝑘 graphs 𝐺1, …, 𝐺𝑘 & a budget 𝐵
Goal: max 𝑓(𝒔) s.t. #1 + #2 + #3 + ⋯ + #𝑘 = 𝐵
𝑓(𝒔) ≔ Exp. # vertices influenced in at least one of the 𝑘 graphs
Exact computation of 𝑓(⋅) is #P-hard [Chen, Wang, Wang. KDD'10] ⇨ we average 100 simulations
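A sketch (toy graphs, not the Digg data) of estimating 𝑓(𝒔) by averaging independent-cascade simulations over the 𝑘 graphs:

```python
import random

def simulate_ic(graph, seeds, rng):
    """One independent-cascade run; graph[u] = list of (v, p) edges."""
    active, frontier = set(seeds), list(seeds)
    while frontier:
        u = frontier.pop()
        for v, p in graph.get(u, []):
            if v not in active and rng.random() < p:
                active.add(v)
                frontier.append(v)
    return active

def estimate_f(graphs, s, runs=100, rng=None):
    """f(s) ~ average over runs of |union over labels i of vertices
    influenced in graph G_i from seeds {v : s[v] = i}|."""
    rng = rng or random.Random(0)
    total = 0
    for _ in range(runs):
        influenced = set()
        for i, g in enumerate(graphs, start=1):
            seeds = [v for v, lab in enumerate(s) if lab == i]
            influenced |= simulate_ic(g, seeds, rng)
        total += len(influenced)
    return total / runs

# assumed toy instance with deterministic (p = 1.0) edges for a sanity check
graphs = [{0: [(1, 1.0)]}, {2: [(3, 1.0)]}]
s = (1, 0, 2, 0)
print(estimate_f(graphs, s))  # → 4.0
```

With probabilistic edges the estimate fluctuates around the true expectation, which is why the experiments average 100 simulations.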
Settings
■Experimental evaluations
Digg dataset (a social news website): http://www.isi.edu/~lerman/downloads/digg2009.html
A friendship network: 𝑛 = 3,522 vertices & 90,244 edges
A log of user votes for stories
Learning parameters: edge probabilities in 𝑘 = 10 graphs, learned with the method of [Barbieri, Bonchi, Manco. ICDM'12]
Algorithms
■Experimental evaluations
𝒌-Greedy-TS w/ lazy evaluations
𝒌-Stochastic-Greedy-TS (𝛿 = 0.1) w/ lazy evaluations
Baselines:
  Single(𝒊): greedy strategy that considers only the 𝑖-th topic
  Degree: select vertices of high degree & assign random topics
  Random: select vertices randomly & assign random topics
Result: Objective function value
Environment: Intel Xeon E5-2690 2.90GHz CPU with 256GB memory
■Experimental evaluations
Both proposed algorithms outperformed the baselines.
Single(𝟑) was worse than Degree, showing the effectiveness of assigning multiple kinds of items.
Result: # function evaluations
■Experimental evaluations
Random sampling works well: the evaluation count of 𝒌-Greedy-TS rapidly grows, while 𝒌-Stochastic-Greedy-TS stays cheap.
∵ Many (𝑒, 𝑖) pairs give similar marginal gains around the 40th iteration, so even lazy evaluation must check Δ𝑒,𝑖𝑓(𝒔) for most of the (𝑒, 𝑖)'s.
Summary
Monotone 𝑘-submodular function maximization:
Constraint                                      Approx. ratio     # function eval.
Total size (#1 + #2 + #3 = 𝐵)                  1/2               𝑂(𝑘𝑛𝐵)
                                                1/2 w.p. 1 − 𝛿   𝑂(𝑘𝑛) (up to log factors)
Individual size (#1 = 𝐵1, #2 = 𝐵2, #3 = 𝐵3)   1/3               𝑂(𝑘𝑛 Σ𝑖 𝐵𝑖)
                                                1/3 w.p. 1 − 𝛿   𝑂(𝑘²𝑛) (up to log factors)
Applications: influence maximization with 𝑘 topics; sensor placement with 𝑘 kinds of sensors