TAM 5.1.'05 Boosted Sampling 1
Boosted Sampling: Approximation Algorithms for Stochastic
Problems
Martin Pál
Joint work with Anupam Gupta R. Ravi Amitabh Sinha
TAM 5.1.'05 Boosted Sampling 2
Anctarticast, Inc.
TAM 5.1.'05 Boosted Sampling 3
Optimization Problem:
Build a solution Sol of minimal cost, so that every user is satisfied.
minimize cost(Sol)
subject to happy(j,Sol) for j=1, 2, …, n
For example, Steiner tree:
Sol: set of links to build
happy(j,Sol) iff there is a path from terminal j to root
cost(Sol) = eSol ce
TAM 5.1.'05 Boosted Sampling 4
Unknown demand?
?
?
??
?
??
TAM 5.1.'05 Boosted Sampling 5
The model
Two stage stochastic model with recourse:
On Monday, links are cheap, but we do not know how many/which clients will show up. We can buy some links.
On Tuesday, clients appear. Links are now σ times more expensive. We have to buy enough links to satisfy all clients.
drawn from a known distribution π
TAM 5.1.'05 Boosted Sampling 6
The model
Two stage stochastic model with recourse:
Find Sol1 Edges and Sol2 : 2Users 2Edges to
minimize cost(Sol1) + σ Eπ(T)[cost(Sol2(T))]
subject to happy(j, Sol1 Sol2(T))
for all sets TUsers and all jT
Want compact representation of Sol2 by an algorithm
TAM 5.1.'05 Boosted Sampling 7
What distribution?
•Scenario model: There are k sets of users – scenarios; each scenario Ti has probability pi. (e.g.[Ravi & Sinha 03])
•Independent decisions model: each client j appears with prob. pj independently of others (e.g.[Immorlica et al 04])
•Oracle model: at our request, an oracle gives us a sample T; each set T can have probability pT. [Our work].
TAM 5.1.'05 Boosted Sampling 8
Example
S1 = { }S2 = { }S3 = { }
…
L
L
σ = 2
OPT = (1+σ) L =3 L
Connected OPT = min(2+σ,2σ) L = 4 L
TAM 5.1.'05 Boosted Sampling 9
Deterministc Steiner tree
cost(Steiner) cost(MST)
cost(MST) 2 cost(Steiner)
Theorem: Finding a Minimum Spanning Tree is a 2-approximation algorithm for Minimum Steiner Tree.
TAM 5.1.'05 Boosted Sampling 10
The Algorithm BS
1. Boosted Sampling: Draw σ samples of clients S1,S2 ,…,Sσ from the distribution π.
2. Build the first stage solution Sol1: use Alg to build a tree for clients S = S1S2 … Sσ.
3. Actual set T of clients appears. To build second stage solution Sol2, use Alg to augment Sol1 to a feasible tree for T.
Given Alg, an -approx. for deterministic Steiner Tree
.
TAM 5.1.'05 Boosted Sampling 11
Is BS any good?
Theorem: Boosted sampling algorithm is a 4-approximation for Stochastic Steiner Tree (assuming Alg is a 2-approx..).
Nothing special about Steiner Tree; BS works for other problems: Facility Location, Steiner Network, Vertex Cover..
Idea: Bound stage costs separately
•First stage cheap, because not too many samples, and Alg is good
•Second stage is cheap, because samples “dense enough”
TAM 5.1.'05 Boosted Sampling 12
First stage cost
Recall: BS builds a tree on samples S1,S2 ,…,Sσ from π.
Lemma: There is a (random) tree Sol1 on S=i Si
such that E[cost(Sol1)] Z*.
stochastic optimum cost = Z* = cost(Opt1) + σ Eπ[cost(Opt2(T))].
Lemma: BS pays at most α Z* in the first stage.
TAM 5.1.'05 Boosted Sampling 13
Second stage cost
After Stage 2, have a tree for S’ = S1 … Sσ T.
There is a cheap tree Sol’ covering S’.
Sol’ = Opt1 [ Opt2(S1) … Opt2(Sσ) Opt2(T)].
Fact: E[cost(Sol’)] (σ+1)/σ Z*.
T is “responsible” for 1/(σ+1) part of Sol’.
At Stage 1 costs, it would pay Z*/σ.
Need to pay Stage 2 premium pay Z*.
Problem: do not know T when building Sol’.
TAM 5.1.'05 Boosted Sampling 14
Idea: cost sharing
Scenario 1:
Pretend to build a solution for S’ = S T.
Charge each jS’ some amount ξ(S’,j).
Scenario 2:
Build a solution Alg(S) for S.
Augment Alg(S) to a valid solution for S’ = S T.
Assume: jS’ ξ(S’,j) Opt(S’)
We argued: E[jT ξ(S’,j)] Z*/σ (by symmetry)
Want to prove:
Augmenting cost in Scenario 2 β jT ξ(S’,j)
TAM 5.1.'05 Boosted Sampling 15
Cost sharing function
Input: A set of users S’
Output: cost share ξ(S,j) for each user jS’
Example: Build a spanning tree on S’ root.
Let ξ(S’,j) = cost of parental edge.
Note:
• jS’ ξ(S’,j) = cost of MST(S’)
• jS’ ξ(S’,j) 2 cost of Steiner(S’)
TAM 5.1.'05 Boosted Sampling 16
Strictness
A csf ξ(,) is β-strict, if
cost of Augment(Alg(S), T) β jT ξ(S T, j)
for any S,TUsers.
Second stage cost = σ cost(Augment(Alg(i Si), T)) σ β jT ξ(j Sj T, j)
Fact: E[shares of T] Z*/σ
Hence:E[second stage cost] σ β Z*/σ = β Z*.
S
T
TAM 5.1.'05 Boosted Sampling 17
Strictness for Steiner Tree
Alg(S) = Min-cost spanning tree MST(S)
ξ(S,j) = cost of parental edge in MST(S)
Augment(Alg(S), T):
for all jT build its parental edge in MST(S T)
Alg is a 2-approx for Steiner Tree
ξ is a 2-strict cost sharing function for Alg.
Theorem: We have a 4-approx forStochastic Steiner Tree.
TAM 5.1.'05 Boosted Sampling 18
Removing the root
L >> OPT
Problem: Building a Steiner tree on too expensive
Soln: build a Steiner forest on samples {S1, S2,…,Sσ}.
TAM 5.1.'05 Boosted Sampling 19
Works for other problems, too
BS works for any problem that is
- subadditive (union of solns for S,T is a soln for S T)
- has α-approx algo that admits β-strict cost sharing
Constant approximation algorithms for stochastic Facility Location, Vertex Cover, Steiner Network..
(hard part: prove strictness)
TAM 5.1.'05 Boosted Sampling 20
What if σ is random?
Suppose σ is also a random variable.
π(S, σ) – joint distribution
For i=1, 2, …, σmax do sample (Si, σi) from π with prob. σi/σmax accept Si
Let S be the union of accepted Si’s
Output Alg(S) as the first stage solution
TAM 5.1.'05 Boosted Sampling 21
Multistage problems
Three stage stochastic Steiner Tree:
•On Monday, edges cost 1. We only know the probability distribution π.
•On Tuesday, results of a market survey come in. We gain some information I, and update π to the conditional distribution π|I. Edges cost σ1.
•On Wednesday, clients finally show up. Edges now cost σ2 (σ2>σ1), and we must buy enough to connect all clients.
Theorem: There is a 6-approximation for three stage stochastic Steiner Tree (in general, 2k approximation for k stage problem)
TAM 5.1.'05 Boosted Sampling 22
Conclusions
We have seen a randomized algorithm for a stochastic problem: using sampling to solve problems involving randomness.
•Do we need strict cost sharing? Our proof requires strictness – maybe there is a weaker property? Maybe we can prove guarantees for arbitrary subadditive problems?
•Prove full strictness for Steiner Forest – so far we have only uni-strictness.
•Cut problems: Can we say anything about Multicut? Single-source multicut?
TAM 5.1.'05 Boosted Sampling 23
+++THE++END+++
Note that if π consists of a small number of scenarios, this can be transformed to a deterministic problem.
Find Sol1 Elems and Sol2 : 2Users 2Elems to
minimize cost(Sol1) + σ Eπ(T)[cost(Sol2(T))]
subject to satisfied(j, Sol1 Sol2(T))
for all sets TUsers and all jT
.
TAM 5.1.'05 Boosted Sampling 24
Related work
•Stochastic linear programming dates back to works of Dantzig, Beale in the mid-50’s
•Scheduling literature, various distributions of job lengths
•Resource installation [Dye,Stougie&Tomasgard03]
•Single stage stochastic: maybecast [Karger&Minkoff00], bursty connections [Kleinberg,Rabani&Tardos00]…
•Stochastic versions of NP-hard problems (restricted π) [Ravi&Sinha03], [Immorlica,Karger,Minkoff&Mirrokni 04]
•Stochastic LP solving&rounding: [Shmoys&Swamy04]
TAM 5.1.'05 Boosted Sampling 25
Infrastructure Design Problems
Assumption: Sol is a set of elements
cost(Sol) = elemSol cost(elem)
Facility location: satisfied(j) iff j connected to an open facility
Vertex Cover: satisfied(e={uv}) iff u or v in the cover
Connectivity problems: satisfied(j) iff j’s terminals connected
Cut problems: satisfied(j) iff j’s terminals disconnected
TAM 5.1.'05 Boosted Sampling 26
Vertex Cover8
3
3
10 9
4
5
Users: edges
Solution: Set of vertices that covers all edges
Edge {uv} covered if at least one of u,v picked.
1
1
1 1
1
1 1
1
11
1
1 1
1
1 1
1
11
1
2 2
1
2 2
2
21
1
2 3
1
4 2
2
31
1
2 3
1
3 2
2
3
Alg: Edges uniformly raise contributions
Vertex can be paid for by neighboring edges freeze all edges adjacent to it. Buy the vertex.
Edges may be paying for both endpoints 2-approximation
Natural cost shares: ξ(S, e) = contribution of e
TAM 5.1.'05 Boosted Sampling 27
Strictness for Vertex Cover1
1
1
1
1
n+1n+1n
S = blue edges1
1
1
1
1
T = red edge
Alg(S) = blue vertices:
Augment(Alg(S), T) costs (n+1)
ξ(S T, T) =1
•Find a better ξ? Do not know how. Instead, make Alg(S) buy a center vertex.
gap Ω(n)!
TAM 5.1.'05 Boosted Sampling 28
Making Alg strictAlg’: - Run Alg on the same input.
- Buy all vertices that are at least 50% paid for.
1
1
1
1
1
n+1n+1n
1
1
1
1
1
½ of each vertex paid for, each edge paying for two vertices still a 4-approximation.
Augmentation (at least in our example) is free.
TAM 5.1.'05 Boosted Sampling 29
Why should strictness hold?Alg’: - Run Alg on the same input.
- Buy all vertices that are at least 50% paid for.
Suppose vertex v fully paid for in Alg(S T).
•If jT αj’ ≥ ½ cost(v) , then T can pay for ¼ of v in the augmentation step.
•If jS αj ≥ ½ cost(v), then v would be open in Alg(S).
(almost.. need to worry that Alg(S T) and Alg(S) behave differently.)
α1
α2
α3
α1’
α2’Alg(S T) S = blue edges
T = red edgesv
TAM 5.1.'05 Boosted Sampling 30
Metric facility location
Input: a set of facilities and a set of cities living in a metric space.
Solution: Set of open facilities, a path from each city to an open facility.“Off the shelf” components:
3-approx. algorithm [Mettu&Plaxton00].
Turns out that cost sharing fn [P.&Tardos03] is 5.45 strict.
Theorem: There is a 8.45-approx for stochastic FL.
TAM 5.1.'05 Boosted Sampling 31
Steiner Network
client j = pair of terminals sj, tj
satisfied(j): sj, tj connected by a path2-approximation algorithms known ([Agarwal,Klein&Ravi91], [Goemans&Williamson95]), but do not admit strict cost sharing.
[Gupta,Kumar,P.,Roughgarden03]: 4-approx algorithm that admits 4-uni-strict cost sharing Theorem: 8-approx for Stochastic Steiner Network in the “independent coinflips” model.
TAM 5.1.'05 Boosted Sampling 32
The Buy at Bulk problem
client j = pair of terminals sj, tj
Solution: an sj, tj path for j=1,…,n
cost(e) = ce f(# paths using e)
cost
# paths using e
f(e):
# paths using e
cost Rent or Buy: two pipes
Rent: $1 per path
Buy: $M, unlimited # of paths
TAM 5.1.'05 Boosted Sampling 33
Special distributions: Rent or BuyStochastic Steiner Network:
client j = pair of terminals sj, tj
satisfied(j): sj, tj connected by a path
cost(e) = ce min(1, σ/n #paths using e)# paths using e
costn/σ
Suppose.. π({j}) = 1/n
π(S) = 0 if |S|1
Sol2({j}) is just a path!
TAM 5.1.'05 Boosted Sampling 34
Rent or Buy
The trick works for any problem P. (can solve Rent-or-Buy Vertex Cover,..)
These techniques give the best approximation for Single-Sink Rent-or-Buy (3.55 approx [Gupta,Kumar,Roughgarden03]), and Multicommodity Rent or Buy (8-approx [Gupta,Kumar,P.,Roughgarden03], 6.83-approx [Becchetti, Konemann, Leonardi,P.04]).
“Bootstrap” to stochastic Rent-or-Buy: - 6 approximation for Stochastic Single-Sink RoB - 12 approx for Stochastic Multicommodity RoB (indep. coinflips)
TAM 5.1.'05 Boosted Sampling 35
Performance Guarantee
Theorem:Let P be a sub-additive problem, with α-approximation algorithm, that admits β-strict cost sharing.
Stochastic(P) has (α+β) approx.
Corollary: Stochastic Steiner Tree, Facility Location, Vertex Cover, Steiner Network (restricted model)… have constant factor approximation algorithms.
Corollary: Deterministic and stochastic Rent-or-Buy versions of these problems have constant approximations.