Online Topological Ordering Siddhartha Sen, COS 518 11/20/2007

Online Topological Ordering



Page 1: Online Topological Ordering

Online Topological Ordering

Siddhartha Sen, COS 518, 11/20/2007

Page 2: Online Topological Ordering

Outline
• Problem statement and motivation
• Prior work (summary)
• Result by Ajwani et al.
  – Algorithm
  – Correctness
  – Running time
  – Implementation
• Comparison to prior work
  – Incremental complexity analysis
  – Practical implications
• Open problems
• Breaking news

Page 3: Online Topological Ordering

Problem statement

• Offline or static version (STO)
  – Given a DAG G = (V,E) (with n = |V| and m = |E|), find a linear ordering T of its nodes such that for all directed paths from x ∈ V to y ∈ V (x ≠ y), T(x) < T(y), where T: V → [1..n] is a bijective mapping
• Online version (DTO)
  – Edges of G are not known beforehand, but are revealed one by one
  – Each time an edge is added to the graph, T must be updated
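The STO condition above can be checked mechanically. The following is a minimal sketch, assuming nodes are hashable labels and edges are (x, y) pairs; the function name `is_topological_order` is hypothetical. Checking every edge is enough, because T(x) < T(y) along each edge implies it along every directed path.

```python
def is_topological_order(order, edges):
    """Return True if `order` is a bijection onto positions 1..n
    and every edge (x, y) satisfies T(x) < T(y)."""
    # T maps each node to its 1-based position in the ordering.
    T = {node: i + 1 for i, node in enumerate(order)}
    if len(T) != len(order):
        return False  # a repeated node means T is not a bijection
    return all(T[x] < T[y] for x, y in edges)


edges = [("a", "b"), ("b", "c")]
print(is_topological_order(["a", "b", "c"], edges))
print(is_topological_order(["b", "a", "c"], edges))
```

In the online version, a check like this would have to hold again after every single edge insertion.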

Page 4: Online Topological Ordering

Problem statement

[Figure: nodes a, b, c, d, u, v drawn in topological order with the affected region marked. Inserting the edge u → v invalidates the topological order.]

Page 5: Online Topological Ordering

Motivation

• Traditional applications
  – Online cycle detection in pointer analysis
  – Incremental evaluation of computational circuits
  – Semantic checking by structure-based editors
  – Maintaining dependences between modules during compilation
• Other applications
  – Scheduling jobs in grid computing systems, where dependences arise between the subtasks of a job

Page 6: Online Topological Ordering

Prior work (summary)

• Offline problem: O(m + n) per edge
  – O(m^2 + mn) for m edges
• Alpern et al. (AHRSZ, '90): O(‖K*min‖ log ‖K*min‖) per edge
• Marchetti-Spaccamela et al. (MNR, '96): O(n) per edge (amortized)
  – O(mn) for m edges
• Pearce and Kelly (PK, '04): O(|δuv| log |δuv| + ‖δuv‖) per edge
• Katriel and Bodlaender (KB, '05): O(min{m^(1/2) log n, m^(1/2) + (n^2 log n)/m}) per edge (amortized)
  – O(min{m^(3/2) log n, m^(3/2) + n^2 log n}) for m edges
• AHRSZ and PK bounds: incremental complexity analysis
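To make the offline baseline concrete, here is a minimal sketch that rebuilds the order from scratch after every insertion, which is where the O(m + n) per edge cost comes from. It uses Kahn's algorithm as one standard choice of static topological sort; the names `kahn_order` and `insert_all_by_recomputing` are hypothetical.

```python
from collections import deque


def kahn_order(nodes, edges):
    """Static topological sort (Kahn's algorithm), O(n + m)."""
    indegree = {x: 0 for x in nodes}
    successors = {x: [] for x in nodes}
    for x, y in edges:
        successors[x].append(y)
        indegree[y] += 1
    queue = deque(x for x in nodes if indegree[x] == 0)
    order = []
    while queue:
        x = queue.popleft()
        order.append(x)
        for y in successors[x]:
            indegree[y] -= 1
            if indegree[y] == 0:
                queue.append(y)
    if len(order) != len(nodes):
        raise ValueError("cycle detected")
    return order


def insert_all_by_recomputing(nodes, edge_stream):
    """Naive online ordering: rerun the static sort after each new edge."""
    edges, order = [], list(nodes)
    for edge in edge_stream:
        edges.append(edge)
        order = kahn_order(nodes, edges)  # O(n + m) work per insertion
    return order


print(insert_all_by_recomputing(["a", "b", "c", "d"],
                                [("c", "a"), ("a", "b"), ("d", "c")]))
```

The online algorithms listed above try to avoid this full recomputation by only touching nodes near the inserted edge.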

Page 7: Online Topological Ordering

Ajwani et al. (AFM)
• Contributions
  – Solves DTO in O(n^2.75) time, regardless of the number of edges m inserted
  – Uses generic bucket data structure with efficient support for: insert, delete, collect-all
    • Analysis based on tunable parameter t = max number of nodes in each bucket
• Shortcomings
  – Poor discussion of motivating applications
  – No insight into how algorithm works or achieves running time
  – No intuitive comparison with prior algorithms (AHRSZ, MNR, etc.)

Page 8: Online Topological Ordering

Notation

• d(u,v) denotes |T(u) – T(v)|
• u < v is shorthand for T(u) < T(v)
• u → v denotes an edge from u to v
• u ⇝ v means v is reachable from u

Page 9: Online Topological Ordering

Algorithm AFM

Page 10: Online Topological Ordering

Algorithm AFM: worked example (animation frames, pages 10 to 26)

[Figure: nodes a, b, c, d, u, v drawn in topological order. Inserting u → v invalidates the topological order. Each frame shows the current call with its Set A and Set B.]

• Page 10: REORDER(u,v); Set A = { v, a }; Set B = { c, u }
• Page 11: REORDER(c,a); Set A = Ø; Set B = Ø
• Page 12: REORDER(c,a); Swap! (c and a exchange positions)
• Page 13: REORDER(u,v); Set A = { v, a }; Set B = { c, u }
• Page 14: REORDER(u,a); Set A = { a, b }; Set B = { u }
• Page 15: REORDER(u,b); Set A = Ø; Set B = Ø
• Page 16: REORDER(u,b); Swap! (u and b exchange positions)
• Page 17: REORDER(u,a); Set A = { a, b }; Set B = { u }
• Page 18: REORDER(u,a); Set A = Ø; Set B = Ø
• Page 19: REORDER(u,a); Swap! (u and a exchange positions)
• Page 20: REORDER(u,v); Set A = { v, a }; Set B = { c, u }
• Page 21: REORDER(c,v); Set A = Ø; Set B = Ø
• Page 22: REORDER(c,v); Swap! (c and v exchange positions)
• Page 23: REORDER(u,v); Set A = { v, a }; Set B = { c, u }
• Page 24: REORDER(u,v); Set A = Ø; Set B = Ø
• Page 25: REORDER(u,v); Swap! (u and v exchange positions)
• Page 26: Done!
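The walkthrough above can be replayed in code. This is a minimal sketch of INSERT and REORDER as the later slides describe them (cycle check first, sets A and B, swap when both are empty, otherwise recursive calls over {v} ∪ A in decreasing order and B ∪ {u} in increasing order). Several things are assumptions, not taken from the deck: the class name `OnlineTopo`, plain Python sets in place of the bucketed adjacency lists, the test u' ≥ v' evaluated at call time, and the example graph (initial order v, a, d, c, b, u with edges v → a, a → b, c → u), which is merely one graph consistent with the frames.

```python
class OnlineTopo:
    """Sketch of AFM-style INSERT/REORDER without the bucket structure."""

    def __init__(self, nodes):
        self.order = list(nodes)                              # T^-1: position -> node
        self.pos = {x: i for i, x in enumerate(self.order)}   # T: node -> position
        self.succ = {x: set() for x in nodes}
        self.pred = {x: set() for x in nodes}
        self.calls = []                                       # log of REORDER calls

    def insert(self, u, v):
        if self.pos[v] <= self.pos[u]:   # edge u -> v violates the current order
            self.reorder(u, v)
        self.succ[u].add(v)
        self.pred[v].add(u)

    def reorder(self, u, v):
        self.calls.append((u, v))
        if u == v:
            raise ValueError("cycle detected")
        pos = self.pos
        A = [w for w in self.succ[v] if pos[w] <= pos[u]]   # v -> w and w <= u
        B = [w for w in self.pred[u] if pos[v] <= pos[w]]   # w -> u and v <= w
        if not A and not B:
            self._swap(u, v)
            return
        vs = sorted([v] + A, key=pos.get, reverse=True)     # decreasing order
        us = sorted(B + [u], key=pos.get)                   # increasing order
        for v2 in vs:
            for u2 in us:
                if pos[u2] >= pos[v2]:                      # assumed test: u' >= v'
                    self.reorder(u2, v2)

    def _swap(self, u, v):
        i, j = self.pos[u], self.pos[v]
        self.order[i], self.order[j] = v, u
        self.pos[u], self.pos[v] = j, i


g = OnlineTopo(["v", "a", "d", "c", "b", "u"])
for x, y in [("v", "a"), ("a", "b"), ("c", "u")]:
    g.insert(x, y)            # these edges already agree with the order
g.insert("u", "v")            # this one triggers REORDER(u, v)
print(g.calls)
print(g.order)
```

Under these assumptions the logged calls follow the same sequence as the frames: (u,v), (c,a), (u,a), (u,b), (u,a), (c,v), (u,v). Each swap here is O(1) because there are no buckets to update; the running-time analysis in the deck is about the cost of finding A and B and of maintaining the buckets.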

Page 27: Online Topological Ordering

Data structures

• Store T and T^-1 as arrays
  – O(1) lookup for topological order and inverse
• Graph stored as array of vertices, where each vertex has two adjacency lists (for incoming/outgoing edges)
• Each adjacency list stored as array of buckets
  – Each bucket contains at most t nodes for a fixed t
  – i-th bucket of node u contains all adjacent nodes v with i·t < d(u,v) ≤ (i + 1)·t
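As an illustration of the bucketed adjacency list, here is a minimal sketch. It assumes the half-open range i·t < d(u,v) ≤ (i + 1)·t with buckets numbered from 0, and d(u,v) = |T(u) – T(v)|; the function names and the use of Python sets as buckets are hypothetical choices.

```python
def bucket_index(pos_u, pos_v, t):
    """Index i of the bucket of u that holds v, i.e. i*t < d(u,v) <= (i+1)*t."""
    d = abs(pos_u - pos_v)
    if d == 0:
        raise ValueError("u and v must be distinct nodes")
    return (d - 1) // t


def build_buckets(u, neighbours, pos, t):
    """Distribute the neighbours of u over buckets by distance in the order."""
    n = len(pos)
    buckets = [set() for _ in range(-(-n // t))]   # ceil(n / t) buckets suffice
    for w in neighbours:
        buckets[bucket_index(pos[u], pos[w], t)].add(w)
    return buckets


pos = {name: i for i, name in enumerate("uabcdefg")}   # u at 0, a at 1, ..., g at 7
print(build_buckets("u", ["a", "c", "d", "g"], pos, t=3))
```

Because all outgoing neighbours of u lie on one side of u in a valid order, their distances are distinct, so each bucket holds at most t of them.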

Page 28: Online Topological Ordering

Data structures

• A bucket is any data structure with efficient support for the following operations:
  – Insert: insert an element into a given bucket
  – Delete: given an element and a bucket, delete the element from the bucket (if found; otherwise, return 0)
  – Collect-all: copy all elements from a given bucket to some vector
• Analysis assumes a generic bucket data structure and counts the number of bucket operations
  – Later, we will consider different implementations of the data structure and corresponding running times/space usage
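The three bucket operations can be sketched with a hash set, which corresponds loosely to the hashing variant discussed later in the deck. The class name `HashBucket` and the choice to return 1 on a successful delete are assumptions; the slides only say that a failed delete returns 0.

```python
class HashBucket:
    """Bucket backed by a Python set: expected O(1) insert and delete."""

    def __init__(self):
        self._items = set()

    def insert(self, element):
        self._items.add(element)

    def delete(self, element):
        """Remove element if present; return 1 if removed, 0 if not found."""
        if element in self._items:
            self._items.remove(element)
            return 1
        return 0

    def collect_all(self, out):
        """Copy every element of the bucket into the list `out`."""
        out.extend(self._items)


bucket = HashBucket()
bucket.insert(3)
bucket.insert(5)
collected = []
bucket.collect_all(collected)
print(sorted(collected))
```

Any structure with these three operations fits the analysis, which only counts how many times each operation is invoked.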

Page 29: Online Topological Ordering

Correctness

• Theorem 1. Algorithm AFM returns a valid topological order after each edge insertion.

• Lemma 1. Given a DAG G and a valid topological order, if u ⇝ v and u < v, then all subsequent calls to REORDER will maintain u < v.

• Lemma 2. Given a DAG G with v ⇝ y and x ⇝ u, a call of REORDER(u,v) will ensure that x < y.

• Theorem 2. The algorithm detects a cycle iff there is a cycle in the given edge sequence.

Page 30: Online Topological Ordering

Correctness

• Theorem 1. Algorithm AFM returns a valid topological order after each edge insertion.

• Proof: use Lemmas 1 and 2.
  – For graph with no edges, any ordering is a topological ordering
  – Need to show that INSERT(u,v) maintains correct topological order of G' = G ∪ {(u,v)}
    • If u < v, this is trivial; otherwise,
    • Show that x < y for all nodes x,y of G' with x ⇝ y. If there was a path x ⇝ y in G, Lemma 1 gives x < y. Otherwise, x ⇝ y was introduced to G' by (u,v), and Lemma 2 gives x < y in G' since there is x ⇝ u → v ⇝ y in G'.

Page 31: Online Topological Ordering

Correctness

• Lemma 1. Given a DAG G and a valid topological order, if u ⇝ v and u < v, then all subsequent calls to REORDER will maintain u < v.
• Proof: by contradiction
  – Consider the first call of REORDER that leads to u > v. Either this led to swapping u and w with w > v or swapping w and v with w < u. In the first case:
    • Call was REORDER(w,u) and A = Ø
    • However, there exists x ∈ A for which u → x ⇝ v (since v is between u and w), leading to a contradiction

Page 32: Online Topological Ordering

Correctness

• Lemma 2. Given a DAG G with v ⇝ y and x ⇝ u, a call of REORDER(u,v) will ensure that x < y.
• Proof: by induction on recursion depth of REORDER(u,v)
  – For leaf nodes, A = B = Ø. If x < y before, Lemma 1 ensures x < y will continue; otherwise, x = u and y = v and swapping gives x < y.
  – Assume lemma is true up to a certain tree level (show this implies higher levels). If A ≠ Ø, there is a v' such that v → v' ⇝ y, otherwise v' = v = y. If B ≠ Ø, there is a u' such that x ⇝ u' → u, otherwise u' = u = x. Hence v' ≤ y < x ≤ u'.
    • For loops will call REORDER(u',v'), which ensures x < y by inductive hypothesis
    • Lemma 1 ensures further calls to REORDER maintain x < y

Page 33: Online Topological Ordering

Correctness

• Theorem 2. The algorithm detects a cycle iff there is a cycle in the given edge sequence.

• Proof (⇒):
  – Within a call to INSERT(u,v), there are paths v ⇝ v' and u' ⇝ u for each recursive call to REORDER(u',v')
    • Trivial for first call and follows by definition of A and B for subsequent calls
    • If algorithm detects a cycle in line 1, then we have v ⇝ v' = u' ⇝ u and adding u → v completes the cycle

Page 34: Online Topological Ordering

Correctness
• Theorem 2. The algorithm detects a cycle iff there is a cycle in the given edge sequence.
• Proof (⇐), by induction on number of nodes in path v ⇝ u
  – Consider edge (u,v) of the cycle v ⇝ u → v inserted last. Since v ⇝ u before inserting this edge, Theorem 1 states that v < u, so REORDER(u,v) will be called.
    • Call of REORDER(u',v') with u' = v' or v' → u' clearly reports a cycle
    • Consider path v → x ⇝ y → u of length k ≥ 2 and call to REORDER(u,v). Since v < x ≤ y < u before the call, x ∈ A and y ∈ B, so REORDER(y,x) will be called. x ⇝ y has k – 2 nodes in the path, so call to REORDER will detect the cycle (by the inductive hypothesis).

Page 35: Online Topological Ordering

Algorithm AFM

Page 36: Online Topological Ordering

Running time
• Theorem 3. Online topological ordering can be computed using O(n^3.5/t) bucket inserts and deletes, O(n^3/t) bucket collect-all operations collecting O(n^2 t) elements, and O(n^2.5 + n^2 t) operations for sorting.
  – Lemma 4. REORDER is called O(n^2) times.
  – Lemma 5. The summation of |A| + |B| over all calls of REORDER is O(n^2).
  – Lemma 6. Calculating the sorted sets A and B over all calls of REORDER can be done by O(n^3/t) bucket collect-all operations touching a total of O(n^2 t) elements and O(n^2.5 + n^2 t) operations for sorting these elements.
  – Lemma 9. Updating the data structure over all calls of REORDER requires O(n^3.5/t) bucket inserts and deletes.

Page 37: Online Topological Ordering

Running time

• Theorem 3. Online topological ordering can be computed using O(n^3.5/t) bucket inserts and deletes, O(n^3/t) bucket collect-all operations collecting O(n^2 t) elements, and O(n^2.5 + n^2 t) operations for sorting.
• Proof:
  – Use lemmas 4, 6, and 9. Additionally, show that merging sets A and B (lines 6-7 in the algorithm) takes O(n^2) time
    • Merging takes O(|A| + |B|), which is O(n^2) over all calls to REORDER by Lemma 5; finding vertices in B that exceed the chosen v' takes O(the number of those vertices), which is also the number of recursive calls to REORDER made. Lemma 4 says the latter value is O(n^2).

Page 38: Online Topological Ordering

Running time
• Lemma 4. REORDER is called O(n^2) times.
• Proof:
  – Consider the first time REORDER(u,v) is called. If A = B = Ø, then u and v are swapped. Otherwise, REORDER(u',v') is called recursively for all v' ∈ {v} ∪ A and u' ∈ B ∪ {u} with u' ≥ v'. The order in which recursive calls are made and the fact that REORDER is local (only touches the affected region) ensure that REORDER(u,v) is not called except as the last recursive call. In this second call to REORDER(u,v), A = B = Ø
    • Consider all v' ∈ A and u' ∈ B from the first call of REORDER(u,v). REORDER(u,v') and REORDER(u',v) must have been called by the for loops before the second call to REORDER(u,v). Therefore, u < v' and u' < v for all v' ∈ A and u' ∈ B, so u and v are swapped during the second call.
    • REORDER(u,v) will not be called again because u < v.

Page 39: Online Topological Ordering

Running time

• Lemma 9. Updating the data structure over all calls of REORDER requires O(n^3.5/t) bucket inserts and deletes.
• Proof: use LP
  – Data structure requires O(d(u,v)·n/t) bucket inserts and deletes to swap two nodes u and v.
    • Need to update adjacency lists of u and v and all w adjacent to u and/or v. If d(u,v) > t, build from scratch in O(n). Otherwise, can show that at most d(u,v) nodes need to transfer between any pair of consecutive buckets. This yields a bound of O(d(u,v)·n/t).
  – Each node pair is swapped at most once (Lemma 7), so summing up over all calls of REORDER(u,v) where u and v are swapped, we need O(Σ d(u,v)·n/t) bucket inserts and deletes. Σ d(u,v) = O(n^2.5) by Lemma 8, so the result follows.

Page 40: Online Topological Ordering

Running time

• How to prove Σ d(u,v) = O(n^2.5)?
• Use an LP:
  – Let T* denote the final topological ordering and

\[
X(T^*(u), T^*(v)) =
\begin{cases}
d(u,v) & \text{if and when REORDER}(u,v)\text{ leads to a swapping} \\
0 & \text{otherwise}
\end{cases}
\]

  – Model some linear constraints on X(i,j):
    • 0 ≤ X(i,j) ≤ n for all i,j ∈ [1..n]
    • X(i,j) = 0 for all j ≤ i
    • Σ_{j>i} X(i,j) – Σ_{j<i} X(j,i) ≤ n for all 1 ≤ i ≤ n
      – Over insertion of all edges, a node's net movement right and left in the topological ordering must be less than n
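Collecting the definition of X and the three constraints from this slide, the optimization problem can be written out as follows. This is a reconstruction from the listed constraints, not a copy of the deck's own formula; the objective is assumed to be the total swap distance Σ d(u,v), which equals the sum of all X(i,j) because each node pair is swapped at most once.

```latex
\begin{aligned}
\text{maximize}\quad   & \sum_{1 \le i < j \le n} X(i,j) \\
\text{subject to}\quad & 0 \le X(i,j) \le n              && \text{for all } i,j \in [1..n] \\
                       & X(i,j) = 0                      && \text{for all } j \le i \\
                       & \sum_{j > i} X(i,j) - \sum_{j < i} X(j,i) \le n && \text{for all } 1 \le i \le n
\end{aligned}
```

An upper bound of O(n^2.5) on the optimum of this program then bounds Σ d(u,v), which is what Lemma 9 needs.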

Page 41: Online Topological Ordering

Running time

• Yields the following LP: [equation not preserved]

• And its dual: [equation not preserved]

Page 42: Online Topological Ordering

Running time

• Which yields the following feasible solution: [equation not preserved]

• This solution has a value of O(n^2.5)

Page 43: Online Topological Ordering

Implementation of data structure

• Balanced binary tree gives O(1 + log τ) time insert and delete and O(1 + τ) collect-all, where τ is the number of elements in the bucket
  – Total time is O(n^2 t + n^3.5 log n / t) by Theorem 3. Setting t = n^0.75 (log n)^(1/2), we get a total time of O(n^2.75 (log n)^(1/2)) and O(n^2) space
• n-bit array gives O(1) insert and delete and O(total output size + total # of deletes) collect-all operation
  – Total time is O(n^2 t + n^3.5/t). Setting t = n^0.75 gives O(n^2.75) time and O(n^2.25) space for O(n^2/t) buckets
• Uniform hashing is similar to n-bit array
  – O(n^2.75) expected time and O(n^2) space
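The choices of t on this slide come from balancing the two terms of the total time. A short worked derivation, using only the bounds quoted above:

```latex
\begin{aligned}
\text{n-bit array:}\quad
  & n^{2} t = \frac{n^{3.5}}{t}
    \;\Rightarrow\; t^{2} = n^{1.5}
    \;\Rightarrow\; t = n^{0.75},
  && \text{total } O\!\left(n^{2} \cdot n^{0.75}\right) = O\!\left(n^{2.75}\right) \\
\text{balanced tree:}\quad
  & n^{2} t = \frac{n^{3.5} \log n}{t}
    \;\Rightarrow\; t = n^{0.75} (\log n)^{1/2},
  && \text{total } O\!\left(n^{2.75} (\log n)^{1/2}\right) \\
\text{n-bit array space:}\quad
  & \frac{n^{2}}{t} \text{ buckets} \times n \text{ bits}
    = \frac{n^{3}}{t} = n^{2.25}
\end{aligned}
```

A smaller t makes the collect-all and sorting terms cheaper but increases the number of bucket inserts and deletes, and a larger t does the opposite.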

Page 44: Online Topological Ordering

Empirical comparison

• Compared against PK, MNR, and AHRSZ for the following “hard-case” graph:

Page 45: Online Topological Ordering

Empirical comparison

Page 46: Online Topological Ordering

Comparison to prior work
• No insight provided by Ajwani et al.
• Pearce and Kelly compare PK, AHRSZ, and MNR using incremental complexity analysis
  – In dynamic problems, typically no fixed input captures the minimal amount of work to be performed
  – Use complexity analysis based on input size: measure work in terms of a parameter δ representing the (minimal) change in input and output required
    • For DTO problem, input is current DAG and topological order, output after an edge insertion is updated DAG and (any) valid ordering
  – Algorithm is bounded if time complexity can be expressed only in terms of δ; otherwise, it is unbounded

Page 47: Online Topological Ordering

Comparison to prior work
• Runtime comparisons:
  – AHRSZ is bounded by Kmin, the minimal cover of vertices that are incorrectly ordered after an edge insertion, plus adjacent edges
  – PK is bounded by δuv, the set of vertices in the affected region which reach u or are reachable from v, plus adjacent edges; PK is worst-case optimal wrt number of vertices reordered
  – MNR takes Θ(‖δuv^F‖ + |ARuv|) in the incremental complexity model, where ARuv is the set of vertices in the affected region
  – |Kmin| ≤ |δuv| ≤ |ARuv|, so AHRSZ is strictly better than PK, but PK and MNR are more difficult to compare (former expected to outperform the latter on sparse graphs)
  – KB analyzes a variant of AHRSZ
  – AFM appears to improve the bound on the time to insert m edges for AHRSZ

Page 48: Online Topological Ordering

Comparison to prior work
• Intuitive comparison
  – AHRSZ performs simultaneous forward and backward searches from u and v until the two frontiers meet; nodes with incorrect priorities are placed in a set and corrected using DFSs in this set
  – MNR does a similar DFS to discover incorrect priorities, but visits all nodes in the affected region during reassignment
  – PK is similar to MNR but reassigns priorities using only positions previously held by members of δuv
  – KB and AFM appear to be improvements in the runtime analysis of variants of AHRSZ

Page 49: Online Topological Ordering

Comparison to prior work
• Practical implications
  – PK and MNR use simpler data structures (arrays) than AHRSZ (priority queues and Dietz and Sleator ordered list structure)
  – PK and MNR use simpler traversal algorithms than AHRSZ
  – PK visits fewer nodes during reassignments
• Experiments run by Pearce and Kelly
  – MNR performs poorly on sparse graphs, but is the most efficient on dense graphs
  – PK performs well on very sparse/dense graphs, but not so well in between
  – AHRSZ is relatively poor on sparse graphs, but has constant performance otherwise (competitive with the others)

Page 50: Online Topological Ordering

Open problems
• Only lower bound in the problem is Ω(n log n) for inserting n – 1 edges, by Ramalingam and Reps; better lower bounds?
• Reduce the (wide) gap between best known lower and upper bounds
• Answer: does the definition of δ for DTO need to include adjacent edges?
• Does the bounded complexity model capture the power of amortization?
• Include edge deletions in the analysis of AFM or any of the other algorithms
• Perform a theoretical and empirical analysis of a parallel version of AFM or any of the other algorithms

Page 51: Online Topological Ordering

Breaking news

• Kavitha and Mathew improve the upper bound to O(min{n^2.5, (m + n log n) m^0.5})
  – Doesn't appear to be anything wildly unique about their algorithm
  – Do a better job of keeping the sizes of sets δuv^F and δuv^B close to each other

Page 52: Online Topological Ordering

Thank you