A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees
Presented by: 黃則翰 (R96922141), 蘇承祖 (R96922077), 張紘睿 (R96922136), 許智程 (D95922022), 戴于晉 (R96922171)
Paper authors: David R. Karger, Philip N. Klein, Robert E. Tarjan
Outline
◦ Introduction
◦ Basic Property & Definition
◦ Algorithm
◦ Analysis
Introduction
◦ Borůvka [1926]: O(m log n)
◦ Gabow et al. [1984]: O(m log β(m,n)), where β(m,n) = min { i | log^(i) n ≤ m/n }
◦ Verification algorithm: King [1993], O(m)
◦ This work: a randomized algorithm that runs in O(m) time with high probability
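To get a feel for how slowly β(m,n) grows, here is a minimal Python sketch of the definition above. The function name `beta` and the choice of base-2 logarithms are assumptions made for illustration; log^(i) n is read as the logarithm applied i times to n.

```python
import math

def beta(m, n):
    """Smallest i such that the logarithm applied i times to n is <= m/n.

    Assumption: logarithms are base 2.
    """
    i, x = 0, float(n)
    while x > m / n:
        x = math.log2(x)  # one more application of log
        i += 1
    return i

# Sparse graph: several iterations are needed.
print(beta(16, 16))
# Dense graph: m/n is already at least n, so no iteration is needed.
print(beta(100, 10))
```

For any graph that fits in memory the value is a very small constant, which is why O(m log β(m,n)) is close to linear without being linear.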
Cycle property
For any cycle C in a graph, the heaviest edge in C does not appear in the minimum spanning forest.
[Figure: example graph with edge weights 2, 3, 5, 6]
Cut Property
For any proper nonempty subset X of the vertices, the lightest edge with exactly one endpoint in X belongs to the minimum spanning tree.
[Figure: example graph with edge weights 2, 3, 5, 6 and a vertex subset X]
Definition
Let G be a graph with weighted edges.
◦ w(x, y): the weight of edge {x, y}
If F is a forest of a subgraph of G:
◦ F(x, y): the path (if any) connecting x and y in F
◦ w_F(x, y): the maximum weight of an edge on F(x, y)
◦ w_F(x, y) = ∞ if x and y are not connected in F
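F-heavy & F-light
An edge {x, y} is F-heavy if w(x, y) > w_F(x, y), and F-light otherwise.
The edges of F are all F-light.
[Figure: two example graphs, on vertices A, B, C, D and on vertices E, F, G, H, each with edge weights 2, 3, 5, 6]
◦ w(B, D) = 6, w_F(B, D) = max{2, 3, 5} = 5: F-heavy
◦ w(F, H) = 6, w_F(F, H) = ∞: F-light
◦ w(C, D) = 5, w_F(C, D) = 5: F-light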
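The F-heavy / F-light test can be sketched in Python as follows. This is an illustration only: the helper names `path_max` and `is_f_heavy` are made up here, and the forest layout (path B–A–C–D with weights 2, 3, 5) is an assumption chosen to match the weights in the example, since the original figure is not recoverable.

```python
import math
from collections import defaultdict

def path_max(forest_edges, x, y):
    """Return w_F(x, y): the heaviest edge weight on the forest path from x
    to y, or math.inf if x and y are not connected in F."""
    adj = defaultdict(list)
    for u, v, w in forest_edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    # Depth-first search from x, carrying the heaviest weight seen so far.
    stack = [(x, None, -math.inf)]
    while stack:
        node, parent, heaviest = stack.pop()
        if node == y:
            return heaviest
        for nxt, w in adj[node]:
            if nxt != parent:  # F is a forest, so this avoids walking back
                stack.append((nxt, node, max(heaviest, w)))
    return math.inf

def is_f_heavy(forest_edges, x, y, w):
    """An edge {x, y} of weight w is F-heavy iff w > w_F(x, y)."""
    return w > path_max(forest_edges, x, y)

# Assumed forest: B-A (2), A-C (3), C-D (5); vertices F and H are not in it.
forest = [("A", "B", 2), ("A", "C", 3), ("C", "D", 5)]
print(is_f_heavy(forest, "B", "D", 6))  # heavier than every edge on the path
print(is_f_heavy(forest, "C", "D", 5))  # an edge of F itself
print(is_f_heavy(forest, "F", "H", 6))  # endpoints not connected in F
```

This per-edge search takes time proportional to the size of F for every query; it only illustrates the definition and is not the linear-time verification procedure the analysis relies on.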
Observation
◦ No F-heavy edge can be in the minimum spanning forest of G (cycle property), so F-heavy edges can be discarded.
◦ Only F-light edges remain candidates for the minimum spanning tree of G.
Boruvka Algorithm
◦ For each vertex, select the minimum-weight edge incident to the vertex.
◦ Replace each connected component defined by the selected edges by a single vertex.
◦ Delete all resulting isolated vertices, loops, and all but the lowest-weight edge among each set of multiple edges.
Algorithm
Step 1: Apply two successive Boruvka steps to the graph, thereby reducing the number of vertices by at least a factor of four.
Step 2: Choose a subgraph H of the contracted graph by selecting each edge independently with probability 1/2. Apply the algorithm recursively to H, producing a minimum spanning forest F of H. Find all the F-heavy edges and delete them from the contracted graph.
Step 3: Apply the algorithm recursively to the remaining graph to compute a spanning forest F'. Return those edges contracted in Step 1 together with the edges of F'.
[Figure: recursion diagram. Original problem G → Boruvka × 2 → G*. Sample G* with p = 0.5 → H (left sub-problem), which returns the minimum spanning forest F of H. Delete F-heavy edges from G* → G' (right sub-problem), which returns F'.]
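The three steps can be put together in a short Python sketch. It assumes distinct edge weights and orderable vertex ids, and it swaps the linear-time verification step for a naive path search per edge, so it illustrates the structure of the recursion but does not run in linear time. All function names (`boruvka_step`, `heavy_filter`, `msf`, `kruskal`) and the `(weight, u, v, edge_id)` edge format are choices made for this example; `kruskal` is included only as a reference to compare against.

```python
import random
from collections import defaultdict

def boruvka_step(edges):
    """One Boruvka step. Returns (contracted edges, ids of selected edges)."""
    cheapest = {}
    for e in edges:
        for x in (e[1], e[2]):
            if x not in cheapest or e[0] < cheapest[x][0]:
                cheapest[x] = e
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    selected = set()
    for w, u, v, eid in cheapest.values():
        selected.add(eid)
        parent[find(u)] = find(v)  # contract the selected edge
    lightest = {}
    for w, u, v, eid in edges:
        a, b = find(u), find(v)
        if a != b:  # drop loops; keep the lightest of each parallel bundle
            key = (min(a, b), max(a, b))
            if key not in lightest or w < lightest[key][0]:
                lightest[key] = (w, a, b, eid)
    return list(lightest.values()), selected

def heavy_filter(edges, forest):
    """Keep only the F-light edges (naive search, not linear time)."""
    adj = defaultdict(list)
    for w, u, v, _ in forest:
        adj[u].append((v, w))
        adj[v].append((u, w))
    def path_max(x, y):
        stack = [(x, None, float("-inf"))]
        while stack:
            node, par, best = stack.pop()
            if node == y:
                return best
            for nxt, w in adj[node]:
                if nxt != par:
                    stack.append((nxt, node, max(best, w)))
        return float("inf")  # not connected in F
    return [e for e in edges if e[0] <= path_max(e[1], e[2])]

def msf(edges, rng):
    """Return the set of edge ids of the minimum spanning forest."""
    if not edges:
        return set()
    # Step 1: two Boruvka steps.
    contracted, picked1 = boruvka_step(edges)
    contracted, picked2 = boruvka_step(contracted)
    picked = picked1 | picked2
    if not contracted:
        return picked
    # Step 2: sample H, recurse, and discard F-heavy edges.
    sample = [e for e in contracted if rng.random() < 0.5]
    forest_ids = msf(sample, rng)
    forest = [e for e in sample if e[3] in forest_ids]
    remaining = heavy_filter(contracted, forest)
    # Step 3: recurse on the remaining graph.
    return picked | msf(remaining, rng)

def kruskal(edges):
    """Reference implementation used only to check the result."""
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    chosen = set()
    for w, u, v, eid in sorted(edges):
        a, b = find(u), find(v)
        if a != b:
            parent[a] = b
            chosen.add(eid)
    return chosen
```

Because the weights are distinct, the minimum spanning forest is unique, so `msf(edges, rng)` should return the same edge ids as `kruskal(edges)` whatever the coin flips turn out to be; only the running time depends on the randomness.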
Correctness
◦ By the cut property, every edge contracted during Step 1 is in the MSF.
◦ By the cycle property, the edges deleted in Step 2 do NOT belong to the MSF.
◦ By the induction hypothesis, the MSF of the remaining graph is correctly determined in the recursive call of Step 3.
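Candidate Edges of the MST
◦ Let H be obtained from G by including each edge independently with probability p, and let F be the minimum spanning forest of H. The expected number of F-light edges in G is at most n/p (negative binomial).
◦ In other words, whatever sample H is drawn, the expected number of edges of G that remain candidates for the MST (the F-light edges) is at most n/p.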
Random-sampling
To help discard edges that cannot be in the minimum spanning tree, construct the sample graph H and its forest F together:
◦ Process the edges in increasing order of weight.
◦ To process an edge e:
◦ 1. Test whether both endpoints of e are in the same component of F (if so, e is F-heavy; otherwise it is F-light).
◦ 2. Include the edge in H with probability p.
◦ 3. If e is in H and is F-light, add e to the forest F.
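The edge-by-edge sampling process can be simulated directly. The sketch below is a minimal illustration under stated assumptions: vertices are numbered 0..n-1, edge weights are distinct, and the function name `sample_and_count` is made up for this example. It builds F with a union-find structure and counts how many edges are F-light at the moment they are processed.

```python
import random

def sample_and_count(n, edges, p, rng):
    """Run the sampling process on edges given as (weight, u, v).

    Returns (forest_edges, number_of_F_light_edges).
    """
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    forest, light = [], 0
    for w, u, v in sorted(edges):      # increasing order of weight
        a, b = find(u), find(v)
        if a == b:
            continue                   # same component: e is F-heavy
        light += 1                     # different components: e is F-light
        if rng.random() < p:           # coin flip: e is included in H
            parent[a] = b
            forest.append((w, u, v))
    return forest, light

# Complete graph on 20 vertices with distinct random weights.
n, p = 20, 0.5
rng = random.Random(0)
pairs = [(u, v) for u in range(n) for v in range(u + 1, n)]
rng.shuffle(pairs)
edges = [(w, u, v) for w, (u, v) in enumerate(pairs)]
counts = [sample_and_count(n, edges, p, rng)[1] for _ in range(2000)]
print(sum(counts) / len(counts), "<=", n / p)
```

The coin flip for an F-heavy edge is skipped here because its outcome cannot change F. The average count printed at the end can be compared with the n/p bound on the expected number of F-light edges.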
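Random-sampling
[Figure: example graph G on vertices A to G with edge weights 3, 4, 5, 6, 7, 9, 10, 11, 13, 14, shown together with the sample graph H and its forest F]
◦ w(E, G) = 14, w_F(E, G) = max{5, 6, 9, 13} = 13: F-heavy
◦ w(E, F) = 11, w_F(E, F) = max{5, 6, 9} = 9: F-heavy
◦ w(D, F) = 9, w_F(D, F) = 9: F-light
◦ w(A, B) = 7, w_F(A, B) = ∞: F-light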
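Random-sampling
[Figure: the same example graph G and the forest F, built in two ways]
◦ Edge by edge: 1. Process the edges in increasing order. 2. If the edge is F-light, throw the coin and select the edge on heads. 3. Else, throw the coin but do not select the edge.
◦ All at once: 1. Randomly select edges into H. 2. Find the forest F of H.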
Observation
◦ No F-heavy edge can be in the minimum spanning forest of G (cycle property).
◦ Only F-light edges are candidates for the minimum spanning tree of G.
◦ The forest F produced is the forest that Kruskal's algorithm would produce on H, and the F-light edges include every edge of the MSF of G.
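Observation
◦ The size of F is at most n − 1.
◦ The expected number of F-light edges in G is at most n/p (negative binomial).
The number k of failures before the n-th success, with success probability p, follows the negative binomial distribution
$f(k; n, p) = \binom{k+n-1}{k} p^{n} (1-p)^{k}$
Mean of k: $\sum_{k \ge 0} k \binom{k+n-1}{k} p^{n} (1-p)^{k} = \frac{n(1-p)}{p}$
Expected number of trials: $\frac{n(1-p)}{p} + n = \frac{n}{p}$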
Analysis of the Algorithm
◦ The worst case.
◦ The expected running time.
◦ The probability of achieving the expected running time.
Running Time Analysis
Total running time = sum of the running times of the steps.
◦ Step 1: two steps of Boruvka's algorithm.
◦ Step 2: the Dixon-Rauch-Tarjan verification algorithm.
Each takes time linear in the number of edges.
◦ So it suffices to estimate the total number of edges.
Observe the recursion tree
G = (V, E), |V| = n, |E| = m.
◦ m ≥ n/2 since there are no isolated vertices.
Each problem generates at most 2 subproblems.
◦ At depth d, there are at most 2^d nodes.
◦ Each node at depth d has at most n/4^d vertices.
The depth d is at most log_4 n.
◦ There are at most $\sum_{d=0}^{\infty} 2^{d} \cdot n/4^{d} = \sum_{d=0}^{\infty} n/2^{d} = 2n$ vertices in all subproblems.
The worst case
Theorem 4.1: The worst-case running time of the minimum-spanning-forest algorithm is O(min{n^2, m log n}), the same as the bound for Boruvka's algorithm.
Proof: There are two different ways to estimate the number of edges.
1. A subproblem at depth d contains at most (n/4^d)^2/2 edges. The total number of edges in all subproblems is at most
$\sum_{d=0}^{\log_4 n} 2^{d} \cdot \frac{(n/4^{d})^{2}}{2} = O(n^{2})$
The worst case
2. Consider a subproblem G = (V, E). After Step 1 we have G' = (V', E') with |E'| ≤ |E| − |V|/2 and |V'| ≤ |V|/4.
◦ Edges in the left child = |H|
◦ Edges in the right child ≤ |E'| − |H| + |F|
So the number of edges in the two subproblems is at most
(|H|) + (|E'| − |H| + |F|) = |E'| + |F| ≤ |E| − |V|/2 + |V|/4 ≤ |E|
The two subproblems together contain at most |E| edges.
The worst case
[Figure: recursion tree with at most m edges on each level and about log n levels, for a total of at most m log n edges]
The depth is at most log_4 n and each level has at most m edges, so there are at most O(m log n) edges in total.
The worst-case running time of the minimum-spanning-forest algorithm is O(min{n^2, m log n}).
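The two estimates from the proof can be evaluated numerically to see which one is smaller for a given graph. This is a small sketch, not part of the original slides: the function name `worst_case_edge_bounds` is made up, and the depth is taken as the smallest D with 4^D ≥ n. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def worst_case_edge_bounds(n, m):
    """Return (bound_dense, bound_levels) on the total number of edges over
    all subproblems, following the two estimates in the proof."""
    # Depth of the recursion tree: smallest D with 4**D >= n.
    depth = 0
    while 4 ** depth < n:
        depth += 1
    # Estimate 1: 2**d subproblems at depth d, each with at most
    # (n / 4**d)**2 / 2 edges.
    bound_dense = sum(2 ** d * Fraction(n, 4 ** d) ** 2 / 2
                      for d in range(depth + 1))
    # Estimate 2: at most m edges on each of the depth + 1 levels.
    bound_levels = m * (depth + 1)
    return bound_dense, bound_levels

# Dense graph: the n^2 estimate is the smaller one.
print(worst_case_edge_bounds(64, 2016))
# Sparse graph: the m log n estimate is the smaller one.
print(worst_case_edge_bounds(1024, 2048))
```

The first estimate is a geometric series with ratio 1/8, so it stays below 4n^2/7; the second grows with the number of levels.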
Analysis – Average Case (1/8)
Theorem: the expected running time of the minimum spanning forest algorithm is O(m).
◦ Calculating the expected total number of edges for all left path problems
[Figure: recursion tree. The Original Problem has a Left Sub-problem and a Right Sub-problem; the next level has a Left Subsub-problem and a Right Subsub-problem.]
Analysis – Average Case (2/8)
◦ Calculating the expected total number of edges for one left path starting at a problem with m' edges
◦ Evaluating the total number of edges for all right sub-problems
[Figure: a left path starting at a problem with m' edges; the expected total number of edges on the path is ≤ 2m']
Analysis – Average Case (3/8)
Calculating the expected total number of edges for one left path starting at a problem with m' edges
[Figure: Original problem G → Boruvka × 2 → G*; sample with p = 0.5 → H (left sub-problem); G' (right sub-problem)]
1. E[number of edges of H] = 0.5 × number of edges of G*
2. Because of the two Boruvka steps, number of edges of G* ≤ number of edges of G
Hence E[number of edges of H] ≤ 0.5 × number of edges of G
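Analysis – Average Case (4/8)
Calculating the expected total number of edges for one left path starting at a problem with m' edges
[Figure: Original problem G → Boruvka × 2 → G*; sample with p = 0.5 → H (left sub-problem); G' (right sub-problem)]
E[number of edges of H] ≤ 0.5 × number of edges of G
◦ The problem at the top of the path has m' edges.
◦ Its left sub-problem has at most 0.5 × m' edges in expectation, and so on down the path.
Expected total number of edges ≤ $\sum_{i=0}^{\infty} m'/2^{i} = 2m'$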
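The 2m' bound for one left path can be checked with a short simulation. The sketch below makes a simplifying assumption: it ignores the Boruvka steps (which can only remove edges) and simply keeps each edge with probability 1/2 at every level, so it models the upper bound rather than the algorithm itself. The function names are made up for this example.

```python
import random

def left_path_total(m, rng):
    """Total number of edges along one left path that starts with m edges,
    keeping each edge independently with probability 1/2 at every level."""
    total = 0
    while m > 0:
        total += m
        m = sum(rng.random() < 0.5 for _ in range(m))
    return total

def average_left_path_total(m, trials, seed=0):
    """Average of left_path_total over a number of independent trials."""
    rng = random.Random(seed)
    return sum(left_path_total(m, rng) for _ in range(trials)) / trials

# With m' = 100 the average should be close to 2m' = 200.
print(average_left_path_total(100, 500))
```

Each edge appears on level i of the path with probability 2^(-i), so it is counted twice in expectation, which is where the factor 2 comes from.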
Analysis – Average Case (5/8)
◦ Calculating the expected total number of edges for one left path L starting at a problem with m' edges: expected total number of edges on L ≤ 2m'
◦ Evaluating the total number of edges of all right sub-problems: E[total edges of all right sub-problems] ≤ n
Analysis – Average Case (6/8)
Evaluating the total number of edges for all right sub-problems
◦ To prove: E[total edges of all right sub-problems] ≤ n
[Figure: Original problem G → Boruvka × 2 → G*; sample with p = 0.5 → H (left sub-problem), which returns the minimum spanning forest F of H; delete F-heavy edges from G* → G' (right sub-problem)]
1. Because of the two Boruvka steps, number of vertices of G* ≤ 0.25 × number of vertices of G
2. By Lemma 2.1, E[number of edges of G'] ≤ 2 × number of vertices of G*
Hence E[number of edges of G'] ≤ 0.5 × number of vertices of G
Analysis – Average Case (7/8)
Evaluating the total number of edges for all right sub-problems
◦ To prove: E[total edges of all right sub-problems] ≤ n
E[number of edges of G'] ≤ 0.5 × number of vertices of G
[Figure: Original problem G → Boruvka × 2 → G*; sample with p = 0.5 → H (left sub-problem); G' (right sub-problem)]
◦ Depth 0: # of vertices of the original problem = n
◦ Depth 1: # of vertices of sub-problems ≤ 2 × n/4; # of edges of right sub-problems ≤ n/2
◦ Depth 2: # of vertices of sub-problems ≤ 4 × n/4^2; # of edges of right sub-problems ≤ 2 × n/8
◦ Depth 3: # of vertices of sub-problems ≤ 8 × n/4^3; # of edges of right sub-problems ≤ 4 × n/(4^2 × 2)
◦ Depth 4: # of vertices of sub-problems ≤ 16 × n/4^4; # of edges of right sub-problems ≤ 8 × n/(4^3 × 2)
Total # of edges of right sub-problems ≤ n/2 + n/4 + n/8 + n/16 + … = n
Analysis – Average Case (8/8)
◦ Calculating the expected total number of edges for one left path starting at a problem with m' edges: expected total number of edges for one left path ≤ 2m'
◦ Evaluating the total number of edges for all right sub-problems: E[total edges of all right sub-problems] ≤ n
Every subproblem lies on exactly one left path that starts either at the original problem or at a right sub-problem, so
E[processed edges in the original problem and all sub-problems] ≤ 2 × (m + n)
The Probability of Linearity
Theorem 4.3
◦ The minimum spanning forest algorithm runs in O(m) time with probability 1 − exp(−Ω(m)).
The Probability of Linearity
Chernoff bound: given i.i.d. random variables x_i for 0 < i ≤ n, with X the sum of all x_i, for t > 0 we have
$\Pr[X \le A] \le e^{tA} \prod_{i=1}^{n} E[e^{-t x_i}]$
Thus, the probability of fewer than s successes (each with chance p) within k trials is
$\Pr[X < s] \le e^{st} \prod_{i=1}^{k} E[e^{-t x_i}] = e^{st} (1 - p + p e^{-t})^{k}$
which is $e^{-\Omega(s)}$, for example with t = 1/2, p = 1/2 and k ≥ 3s.
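A quick numerical check can make the bound concrete. The sketch below compares the exact lower tail of a binomial distribution with the Chernoff-style bound e^{st}(1 − p + p e^{−t})^k, which holds for any t > 0. The function names and the parameter choices t = 1/2, p = 1/2, k = 3s are assumptions picked for illustration.

```python
import math

def binomial_lower_tail(k, p, s):
    """Exact Pr[X < s] for X ~ Binomial(k, p)."""
    return sum(math.comb(k, j) * p ** j * (1 - p) ** (k - j)
               for j in range(s))

def chernoff_lower_tail(k, p, s, t):
    """Upper bound e^{st} (1 - p + p e^{-t})^k on Pr[X < s], for t > 0."""
    return math.exp(s * t) * (1 - p + p * math.exp(-t)) ** k

# Fewer than s heads in 3s fair coin tosses: both columns shrink
# exponentially as s grows.
for s in (5, 10, 20, 40):
    exact = binomial_lower_tail(3 * s, 0.5, s)
    bound = chernoff_lower_tail(3 * s, 0.5, s, 0.5)
    print(s, exact, bound)
```

With these parameters the bound decreases geometrically in s, which is the exp(−Ω(s)) behaviour used in the probability argument.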
The Probability of Linearity
Right Subproblems
◦ The total number of vertices in all right subproblems is at most n/2 (proved in Theorem 4.2).
◦ n/2 is therefore an upper bound on the total number of heads in the nickel flips.
Right Subproblems
The probability
◦ The bad event is that fewer than n/2 heads occur in a sequence of 3m nickel tosses.
◦ m + n ≤ 3m since n/2 ≤ m.
◦ The probability is exp(−Ω(m)) by a Chernoff bound.
The Probability of Linearity
Left Subproblems
◦ Sequence: every sequence of coin tosses ends with a tail, that is, HH…HHT.
◦ The number of tails is at most the number of sequences.
◦ Assume that there are at most m' edges in the root problem and in all right subproblems.
Left Subproblems
The probability
◦ The bad event is that m' tails occur in a sequence of more than 3m' coin tosses.
◦ The probability is exp(−Ω(m)) by a Chernoff bound.
The Probability of Linearity
Combining Right & Left Subproblems
◦ The total number of edges is O(m) with probability at least 1 − exp(−Ω(m)).