
Network Flow

Analysis of Algorithms

Network Flow

• A flow network is a directed graph in which each directed edge (u,v) has a positive capacity c(u,v) > 0
– Capacity is the maximum rate of flow that the edge can carry

• Two special nodes: source, s, and target, t

Network Flow

• Flow is the rate at which material moves from the source to the target
– For example, liquid through pipes, electrical current through a circuit, data through a network, or many other types of information
• The value of a flow f is defined to be the total flow leaving the source vertex
– Can be positive, negative, or zero
– The value of f is also the in-flow at the target

Network Flow

• A flow network is a directed graph with a source vertex s, a target vertex t, and every edge labeled with a positive capacity, c

Initial network with capacities on the edges. No flow yet.

Network Flow

• A flow is then induced at the source
• At every vertex except s and t, the inflow must be equal to the outflow at that vertex

Network Flow

• Every flow must satisfy the following 3 properties:
– Capacity constraint: for all u,v ∈ V, f(u,v) ≤ c(u,v)
– Skew symmetry: for all u,v ∈ V, f(u,v) = - f(v,u)
– Flow conservation: for all u ∈ V - {s,t}, Σ_{v∈V} f(u,v) = 0

• The maxflow problem: given a directed graph (network) with a capacity on each edge, find a flow of maximum value
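The properties above can be checked mechanically. The following is a minimal sketch, assuming a representation that is not in the slides: capacities and flows are stored as dictionaries keyed by directed edge (u, v), and each edge carries a non-negative flow (so skew symmetry is implicit rather than stored). The function names and the small example network are hypothetical.

```python
def flow_value(flow, s):
    """Value of the flow: total flow leaving s minus any flow entering s."""
    leaving = sum(f for (u, v), f in flow.items() if u == s)
    entering = sum(f for (u, v), f in flow.items() if v == s)
    return leaving - entering


def is_valid_flow(capacity, flow, s, t):
    """Check the capacity constraint and flow conservation."""
    # Capacity constraint: 0 <= f(u,v) <= c(u,v) on every edge.
    for edge, f in flow.items():
        if f < 0 or f > capacity.get(edge, 0):
            return False
    # Flow conservation: inflow equals outflow at every vertex except s and t.
    vertices = {x for edge in capacity for x in edge}
    for x in vertices - {s, t}:
        inflow = sum(f for (u, v), f in flow.items() if v == x)
        outflow = sum(f for (u, v), f in flow.items() if u == x)
        if inflow != outflow:
            return False
    return True


# Hypothetical example network (not one of the figures in the slides).
capacity = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}
flow = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}
print(is_valid_flow(capacity, flow, "s", "t"))  # True
print(flow_value(flow, "s"))                    # 5
```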

Ford-Fulkerson Algorithm

• Ford-Fulkerson is a greedy, iterative method for solving the maxflow problem:
– Initially, f(u,v) = 0 for all u,v ∈ V
– At each stage, an augmenting path is found
• An augmenting path is a path from s to t along which we can push more flow
– Repeat until no more augmenting paths can be found


• The residual network
• Consider a flow network (G,c) together with a flow f
• Define a new flow network (Gf,cf), where Gf has the same node set as G, and cf(u,v) = c(u,v) - f(u,v)
– The capacities of Gf are the unused capacities of G

• Gf is called the residual network, cf is called the residual capacity

• If we can find an additional flow f’(u,v) in Gf, then we can define f*(u,v) to be the current total flow in the network G, where f*(u,v) = f(u,v) + f’(u,v)


• The “leftover” capacity in the original network with flow f defines the residual network; any path from s to t in the residual network is an augmenting path available for more flow

When the capacity of an edge is used up, the residual network contains only the reversed edge, which allows flow to be pushed back in the opposite direction.


• The residual capacity cf = the capacity that is left over after a flow has “run” along an edge

• The bottleneck capacity b = the minimum residual capacity of the edges along the augmenting path

b = min { cf(u,v): (u,v) is on path p }
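As a rough illustration of these two definitions, the sketch below computes residual capacities and the bottleneck of a path. It assumes a dictionary-of-edges representation and a hypothetical example network; the reverse entry cf(v,u) = f(u,v) follows from cf(u,v) = c(u,v) - f(u,v) together with skew symmetry when c(v,u) = 0.

```python
def residual_capacities(capacity, flow):
    """Return cf as a dict keyed by (u, v), including reversed edges."""
    residual = {}
    for (u, v), c in capacity.items():
        f = flow.get((u, v), 0)
        # Forward edge: capacity that is still unused.
        residual[(u, v)] = residual.get((u, v), 0) + c - f
        # Reversed edge: flow that could be pushed back (cancelled).
        residual[(v, u)] = residual.get((v, u), 0) + f
    return residual


def bottleneck(residual, path):
    """b = min { cf(u,v) : (u,v) is on path p }, with p given as a vertex list."""
    return min(residual[(u, v)] for u, v in zip(path, path[1:]))


# Hypothetical example: 4 units currently flow along s -> a -> t.
capacity = {("s", "a"): 10, ("a", "t"): 4, ("s", "b"): 5, ("b", "t"): 8, ("a", "b"): 6}
flow = {("s", "a"): 4, ("a", "t"): 4}
cf = residual_capacities(capacity, flow)
print(cf[("s", "a")], cf[("a", "s")])        # 6 4
print(bottleneck(cf, ["s", "a", "b", "t"]))  # 6
print(bottleneck(cf, ["s", "a", "t"]))       # 0 (edge a -> t is full)
```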


• Initialize: Start with a flow of 0 on each edge
• Repeat: Push more flow along an augmenting path until a maximum flow has been found

Ford-Fulkerson Algorithm

• An augmenting path is any undirected path from s to t such that, on every edge of the path, either:
– Flow can be increased along a forward edge (an edge with capacity that is not full) or
– Flow can be decreased along a backward edge (a reversed edge with a flow that is not zero)

Send 10 units of flow from s to t.






Terminate when all paths from s to t are blocked by either a full forward edge or an empty backward edge
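The initialize/repeat/terminate loop can be sketched as follows. This is one possible implementation, not the one behind the slides: it assumes integer capacities, stores residual capacities in nested dictionaries, and finds augmenting paths with a depth-first search (the method itself does not prescribe how paths are chosen). The example network is hypothetical.

```python
def ford_fulkerson(capacity, s, t):
    """Return the value of a maximum flow; capacity maps (u, v) -> integer."""
    # Residual capacities: forward edges start at c(u,v), reversed edges at 0.
    residual = {}
    for (u, v), c in capacity.items():
        residual.setdefault(u, {})
        residual.setdefault(v, {})
        residual[u][v] = residual[u].get(v, 0) + c
        residual[v].setdefault(u, 0)

    def find_path(u, visited):
        """Depth-first search for a path from u to t with leftover capacity."""
        if u == t:
            return [t]
        visited.add(u)
        for v, r in residual[u].items():
            if r > 0 and v not in visited:
                rest = find_path(v, visited)
                if rest is not None:
                    return [u] + rest
        return None

    value = 0
    while True:
        path = find_path(s, set())
        if path is None:
            return value  # every path from s to t is blocked
        edges = list(zip(path, path[1:]))
        b = min(residual[u][v] for u, v in edges)  # bottleneck capacity
        for u, v in edges:
            residual[u][v] -= b  # use up forward capacity
            residual[v][u] += b  # allow this flow to be undone later
        value += b


# Hypothetical example network.
capacity = {("s", "a"): 10, ("a", "t"): 4, ("s", "b"): 5, ("b", "t"): 8, ("a", "b"): 6}
print(ford_fulkerson(capacity, "s", "t"))  # 12
```

The recursive search keeps the sketch short; for large graphs an iterative search would avoid Python's recursion limit.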


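Question: In the flow network shown below, how many different augmenting paths are there with respect to the given flow f ?

[Figure: flow network on vertices s, 1, 2, 3, 4, t; edge labels (flow/capacity): 10/10, 6/10, 10/10, 6/10, 8/8, 6/6, 8/9, 2/2, and 0/4 on the edge between vertices 1 and 3.]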


When the capacities are integers, Ford-Fulkerson runs in time proportional to the number of edges in the graph times the maximum flow in the graph.

Analysis of Ford-Fulkerson

• How do you choose an augmenting path?
• Will any sequence of paths do just as well?
• Consider the following flow network:

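[Figure: flow network on vertices s, u, v, t with four edges of capacity 1,000,000 and one edge of capacity 1 between u and v, followed by the residual networks after each of two augmenting paths through that edge (residual capacities 999,999 and 1).]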

Two augmenting paths with bottleneck capacities of 1, and resulting residual network. Running time is proportional to E f, where f is the value of the maximum flow; for this example, f = 2,000,000.

Improvements to Ford-Fulkerson

• Edmonds and Karp suggested two heuristics for improving performance by selecting better augmenting paths
– Always augment by a path of maximum bottleneck capacity (the fattest path) or
– Always augment by a path with the fewest number of edges (the shortest path)
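The effect of the shortest-path rule can be seen on the four-vertex example with capacities of 1,000,000. The sketch below is an illustrative implementation that picks augmenting paths by breadth-first search and counts augmentations; the function name, the data layout, and the assumed direction u -> v of the capacity-1 edge are choices made here, not details taken from the slides.

```python
from collections import deque


def shortest_path_max_flow(capacity, s, t):
    """Return (max flow value, number of augmentations) using BFS paths."""
    residual = {}
    for (u, v), c in capacity.items():
        residual.setdefault(u, {})
        residual.setdefault(v, {})
        residual[u][v] = residual[u].get(v, 0) + c
        residual[v].setdefault(u, 0)

    value, augmentations = 0, 0
    while True:
        # BFS finds a path with the fewest edges among those with leftover capacity.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, r in residual[u].items():
                if r > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return value, augmentations
        # Walk back from t to collect the path edges.
        edges, v = [], t
        while parent[v] is not None:
            edges.append((parent[v], v))
            v = parent[v]
        b = min(residual[u][w] for u, w in edges)  # bottleneck capacity
        for u, w in edges:
            residual[u][w] -= b
            residual[w][u] += b
        value += b
        augmentations += 1


big = 1_000_000
capacity = {("s", "u"): big, ("s", "v"): big, ("u", "v"): 1,
            ("u", "t"): big, ("v", "t"): big}
print(shortest_path_max_flow(capacity, "s", "t"))  # (2000000, 2)
```

Here two augmentations suffice, compared with up to 2,000,000 when every augmenting path is routed through the capacity-1 edge.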

Is Ford-Fulkerson Optimal?

• Since Ford-Fulkerson is a greedy method, how do we know if we have found the optimal (maximum) flow?

• Consider:
– A flow is maximum if and only if its residual network contains no augmenting paths
– There will be no more augmenting paths when all the bottleneck capacities have been used up
– The bottleneck capacities are found from the minimum cut of the network

Network Cuts

• Definition: A cut is a partition of the vertices into two disjoint sets, A and B

• An st-cut is a cut where the source s is in A and the sink t is in B

[Figure: an st-cut (A, B) with s in A and t in B.]

Network Cuts

• The capacity of a cut is the sum of the capacities of the edges from A to B

• In the example below, A = {s} and B = {all other vertices}

Network Cuts

• In this example, A = {the dark-colored vertices} and B = {the light-colored vertices}
– Only include the edges from A to B when determining the capacity across the cut; do not include the edges from B to A or from A to A
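The rule for which edges count toward a cut can be written in a few lines. This is a small sketch on a hypothetical network (not the one in the figure), with capacities stored as a dictionary keyed by directed edge; the brute-force search over all st-cuts is only practical for tiny graphs and is included purely to illustrate the definition.

```python
from itertools import combinations


def cut_capacity(capacity, A):
    """Sum the capacities of edges that go from A to B (everything not in A)."""
    return sum(c for (u, v), c in capacity.items() if u in A and v not in A)


def min_cut_brute_force(capacity, s, t):
    """Try every st-cut; exponential in the number of vertices."""
    others = sorted({x for edge in capacity for x in edge} - {s, t})
    best = None
    for k in range(len(others) + 1):
        for extra in combinations(others, k):
            A = {s, *extra}
            cap = cut_capacity(capacity, A)
            if best is None or cap < best[0]:
                best = (cap, A)
    return best


capacity = {("s", "a"): 10, ("a", "t"): 4, ("s", "b"): 5, ("b", "t"): 8, ("a", "b"): 6}
print(cut_capacity(capacity, {"s"}))       # 15
print(cut_capacity(capacity, {"s", "b"}))  # 18 (edge a -> b runs from B to A, so it is ignored)
print(min_cut_brute_force(capacity, "s", "t")[0])  # 12
```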

Network Cuts

Question: In the flow network shown here, what is the capacity of the cut with A = {s,2,4} and B = {1,3,t}?

Mincut problem: For all possible ways to cut a graph with s ∈ A and t ∈ B, find the cut of minimum capacity.

[Figure: flow network on vertices s, 1, 2, 3, 4, t with edge capacities 10, 10, 10, 10, 8, 6, 9, 2, 4.]

Flows and Cuts

• The net flow across a cut (A, B) is the sum of the flows on its edges from A (gray vertices) to B (white vertices) minus the sum of the flows on its edges from B to A

• Flow-value lemma: Let f be any flow and let (A, B) be any cut. Then, the net flow across (A, B) equals the value of f.

• In other words, every cut must have the same net flow!

Maxflow-Mincut Theorem

• According to the flow-value lemma, the net flow is the same across all s-t cuts, including the minimum s-t cut
– If you send as much flow as you possibly can (the maximum flow), then the capacities of the minimum cut will be saturated and you cannot send more flow
• Therefore, the value of the maximum flow = the capacity of the minimum cut

Maxflow-Mincut Theorem

• Proof:
– Suppose (A, B) is a cut such that vertex u ∈ A, where u is neither s nor t
– We can move u to B without altering the flow because of the conservation of flow (flow in = flow out), thus f(A, B) = f(A - {u}, B + {u})
– Therefore, the net flow of any flow (even the maximum flow) is the same across every cut (even the minimum cut)

[Figure: the cut (A, B) with u ∈ A, and the cut (A - {u}, B + {u}) after moving u to B; both carry the same net flow. Vertices shown: s, t, u, and w1 through w5.]

Flow Equivalence Conditions

• The following three conditions are equivalent for any flow f:

1. There exists a cut whose capacity equals the value of the flow f

2. Flow f is a maxflow

3. There is no augmenting path with respect to f


1. There exists a cut whose capacity equals the value of the flow f

2. Flow f is a maxflow

1 implies 2:
– Suppose that (A, B) is a cut with capacity equal to the value of f
– Then, the value of any other flow f ' ≤ capacity of cut(A, B) = value of f (by assumption on 1 above)
– Thus, f is a maxflow


2. Flow f is a maxflow

3. There is no augmenting path with respect to f

2 implies 3, using the contrapositive (not 3 implies not 2):
– Suppose that there is an augmenting path with respect to the current flow f
– We could send more flow along this path
– Thus, the current flow f is not a maxflow


3. There is no augmenting path with respect to f

1. There exists a cut whose capacity equals the value of the flow f

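3 implies 1:
– If there is no augmenting path with respect to f (i.e., if there is no path from s to t where we could send more flow), then this creates a cut: let A be the set of vertices that can still be reached from s along edges with leftover residual capacity, and let B be all other vertices, so that t ∈ B
– Every edge from A to B is full and every edge from B to A carries no flow, so the capacity of this cut defines the bottleneck and equals the value of f (from the flow-value lemma)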
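The argument for "3 implies 1" is constructive, and can be sketched in code: once no augmenting path remains, the vertices still reachable from s in the residual network form the set A of a cut whose capacity equals the flow value. The function below is an illustrative implementation using breadth-first search on a hypothetical network; names and data layout are assumptions.

```python
from collections import deque


def max_flow_and_min_cut(capacity, s, t):
    """Return (flow value, A, capacity of the cut (A, B))."""
    residual = {}
    for (u, v), c in capacity.items():
        residual.setdefault(u, {})
        residual.setdefault(v, {})
        residual[u][v] = residual[u].get(v, 0) + c
        residual[v].setdefault(u, 0)

    value = 0
    while True:
        # Find every vertex reachable from s through leftover capacity.
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, r in residual[u].items():
                if r > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            # No augmenting path: the reachable set is the A side of a cut.
            A = set(parent)
            cut = sum(c for (u, v), c in capacity.items() if u in A and v not in A)
            return value, A, cut
        edges, v = [], t
        while parent[v] is not None:
            edges.append((parent[v], v))
            v = parent[v]
        b = min(residual[u][w] for u, w in edges)
        for u, w in edges:
            residual[u][w] -= b
            residual[w][u] += b
        value += b


capacity = {("s", "a"): 10, ("a", "t"): 4, ("s", "b"): 5, ("b", "t"): 8, ("a", "b"): 6}
value, A, cut = max_flow_and_min_cut(capacity, "s", "t")
print(value, sorted(A), cut)  # 12 ['a', 'b', 's'] 12
```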

Applications of Network Flow

• Data mining
• Open-pit mining
• Bipartite matching
• Network reliability
• Baseball elimination
• Image segmentation
• Network connectivity
• Distributed computing
• Security of statistical data
• Egalitarian stable matching
• Multi-camera scene reconstruction
• Sensor placement for homeland security
• Many, many more

Edge-Disjoint Paths

• A set of paths is edge-disjoint if their edge sets are disjoint, i.e., no two paths share an edge
– Multiple paths may go through some of the nodes

• The Edge-Disjoint Paths Problem is to find the maximum number of edge-disjoint s-t paths in G

• The maximum number of edge-disjoint s-t paths is equal to the minimum number of edges whose removal separates s from t

Edge-Disjoint Paths

• The Ford-Fulkerson algorithm can be used to find a maximum set of edge-disjoint s-t paths in runtime proportional to VE

• Given a graph G = (V,E) with source s and sink t, create a flow network with a capacity of 1 on each edge

Maximum number of edge-disjoint paths = 3
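The reduction can be sketched as follows: give every edge capacity 1 and compute a maximum flow, whose value is the number of edge-disjoint s-t paths. The max-flow routine here is a generic breadth-first augmenting-path implementation written for this illustration, and the example graph is hypothetical (it is not the graph in the figure). The sketch assumes the edge list contains no duplicate edges.

```python
from collections import deque


def max_flow(capacity, s, t):
    """Value of a maximum flow; capacity maps (u, v) -> integer."""
    residual = {}
    for (u, v), c in capacity.items():
        residual.setdefault(u, {})
        residual.setdefault(v, {})
        residual[u][v] = residual[u].get(v, 0) + c
        residual[v].setdefault(u, 0)
    value = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, r in residual[u].items():
                if r > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return value
        edges, v = [], t
        while parent[v] is not None:
            edges.append((parent[v], v))
            v = parent[v]
        b = min(residual[u][w] for u, w in edges)
        for u, w in edges:
            residual[u][w] -= b
            residual[w][u] += b
        value += b


def count_edge_disjoint_paths(edges, s, t):
    """Capacity 1 on each edge means each edge is used by at most one path."""
    return max_flow({edge: 1 for edge in edges}, s, t)


edges = [("s", "a"), ("s", "b"), ("s", "c"), ("a", "t"), ("b", "t"), ("c", "t"),
         ("a", "b"), ("b", "c")]
print(count_edge_disjoint_paths(edges, "s", "t"))  # 3
```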


Maximum Bipartite Matching

• The assignment/matching/marriage problem
• Goal: Find a maximum bipartite matching
– Example: match L machines with R tasks to be performed
– Example: match L applicants to R job offers
• Edge (x,y) represents “applicant x wants job y” and “company y wants applicant x”
• The goal is to make the maximum number of matches of applicants to jobs
– Multiple acceptance is forbidden!


[Figure: two copies of a bipartite graph with vertex sets X and Y.]

A bipartite graph. On the left is a matching with cardinality 2, and on the right is a matching with cardinality 3. The matching on the right is optimal because 3 > 2.


• To find a maximum bipartite matching: construct a flow network, where a flow corresponds to a matching
1. Construct a flow network G’
2. Set all weights to 1 (initial capacities)
3. Run Ford-Fulkerson
4. Maximum bipartite matching = maximum flow

• Runtime is proportional to VE
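The four steps can be sketched as follows, assuming the usual construction in which a source s is connected to every vertex of X and every vertex of Y is connected to a sink t. The max-flow routine is a generic breadth-first augmenting-path implementation written for this illustration; the applicant and job names are made up, and vertices are tagged with "x"/"y" so that they cannot collide with the names s and t.

```python
from collections import deque


def max_flow(capacity, s, t):
    """Value of a maximum flow; capacity maps (u, v) -> integer."""
    residual = {}
    for (u, v), c in capacity.items():
        residual.setdefault(u, {})
        residual.setdefault(v, {})
        residual[u][v] = residual[u].get(v, 0) + c
        residual[v].setdefault(u, 0)
    value = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, r in residual[u].items():
                if r > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return value
        edges, v = [], t
        while parent[v] is not None:
            edges.append((parent[v], v))
            v = parent[v]
        b = min(residual[u][w] for u, w in edges)
        for u, w in edges:
            residual[u][w] -= b
            residual[w][u] += b
        value += b


def max_bipartite_matching_size(pairs):
    """pairs: list of (x, y) edges of the bipartite graph."""
    capacity = {}
    for x, y in pairs:
        capacity[("s", ("x", x))] = 1         # source to each applicant
        capacity[(("x", x), ("y", y))] = 1    # applicant wants job
        capacity[(("y", y), "t")] = 1         # each job to the sink
    return max_flow(capacity, "s", "t")


pairs = [("A", 1), ("A", 2), ("B", 1), ("C", 2), ("C", 3)]
print(max_bipartite_matching_size(pairs))  # 3, e.g. B-1, A-2, C-3
```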


[Figure: the bipartite graph G with vertex sets X and Y, and the corresponding flow network G’ with an added source s and sink t.]

The corresponding flow network G’ with maximum flow shown. Each edge in G’ has capacity of 1. Shaded edges have a flow of 1, and all other edges carry no flow. The shaded edges from X to Y correspond to those in a maximum matching of the bipartite graph.

Baseball Elimination

Four baseball teams are trying to finish in first place. Currently, each team has the following number of wins:

New York: 92
Baltimore: 91
Toronto: 91
Boston: 90

There are five games left in the season; these consist of all possible pairings of the four teams, except for New York and Boston.

Can Boston finish in first place, or at least tie for first?



Boston can finish with at most 92 wins. Cumulatively, the other three teams have 274 wins currently, and their three games against each other will produce exactly three more wins, for a final total of 277. But 277 wins divided by three teams means that one of them must end up with more than 92 wins.

Therefore, Boston cannot end up in first place, or even tie for first.


Suppose each team has the following number of wins:

New York: 90
Baltimore: 88
Toronto: 87
Boston: 79

The remaining games are as follows: Boston still has four games against each of the other three teams. Baltimore has one more game against each of New York and Toronto. New York and Toronto still have six games left to play against each other.

Is Boston eliminated?



Boston can end with at most 91 wins. Together, New York and Toronto already have 177 wins; their six remaining games will result in a total of 183, and since 183/2 > 91, one of them must end up with more than 91 wins. Therefore, Boston is eliminated.

Interestingly, in this instance, we cannot prove that Boston is eliminated by averaging all three teams. The three teams ahead of Boston together have a total of 265 wins with 8 games left among them. This is a total of 273, and 273/3 = 91. Averaging over all three teams does not prove that Boston could not tie for first. You must be careful to choose the correct set!
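The averaging argument can be checked for every subset of the other teams: a subset T proves elimination when its current wins plus the games remaining inside T exceed m times |T|. The sketch below is illustrative only; the function name and data layout are assumptions, and the numbers come from the two examples above (m is the most wins Boston can reach).

```python
from itertools import combinations


def eliminating_subsets(wins, games, m):
    """Return the subsets whose average final win count must exceed m.

    wins:  dict team -> current wins (excluding the team being tested)
    games: dict frozenset({x, y}) -> games left between x and y
    """
    teams = sorted(wins)
    found = []
    for k in range(1, len(teams) + 1):
        for subset in combinations(teams, k):
            members = set(subset)
            total = sum(wins[x] for x in subset)
            # Every game played inside the subset adds one win to the subset.
            total += sum(g for pair, g in games.items() if pair <= members)
            if total > m * k:
                found.append(subset)
    return found


# Second example: Boston can reach at most m = 91 wins.
wins = {"New York": 90, "Baltimore": 88, "Toronto": 87}
games = {frozenset({"New York", "Toronto"}): 6,
         frozenset({"New York", "Baltimore"}): 1,
         frozenset({"Baltimore", "Toronto"}): 1}
print(eliminating_subsets(wins, games, 91))  # [('New York', 'Toronto')]
```

Only the pair New York and Toronto certifies elimination here; the set of all three teams totals exactly 273 = 3 × 91 and proves nothing, as noted above.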


In general, suppose we have a set S of teams, and for each x ∈ S, its current number of wins is wx. Also, any two teams x, y ∈ S still have to play gxy games against each other. We also have a specific team z, for which we want to know the best outcome. Let m be the total number of wins that z ends up with if it wins all of its remaining games.

Let S' = S – {z}, and let g* = the total number of games left between all pairs of teams in S'. Now construct a flow network G to determine whether z has been eliminated: include nodes s and t, a node uxy for each pair of teams x, y ∈ S' with a non-zero number of games left to play against each other, and a node vx for each team x ∈ S'.


Include the following edges:

• (s, uxy) with capacity gxy

• (uxy, vx) and (uxy, vy) with capacities gxy

• (vx, t) with capacity m - wx

If there is a flow of value at least g*, then it is possible for the outcomes of all remaining games to yield a situation where no team has more than m wins. Hence, if team z wins all its remaining games, it can still achieve at least a tie for first place.

Team z has been eliminated if and only if the maximum flow in G has value strictly less than g*.
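The construction can be sketched in code. The network follows the edge list above; the max-flow routine is a generic breadth-first augmenting-path implementation written for this illustration, and two guards are additions made here rather than part of the slides: a team that already has more than m wins eliminates z outright, and with no games left there is nothing to route. Team abbreviations and function names are assumptions.

```python
from collections import deque


def max_flow(capacity, s, t):
    """Value of a maximum flow; capacity maps (u, v) -> integer."""
    residual = {}
    for (u, v), c in capacity.items():
        residual.setdefault(u, {})
        residual.setdefault(v, {})
        residual[u][v] = residual[u].get(v, 0) + c
        residual[v].setdefault(u, 0)
    value = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, r in residual[u].items():
                if r > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return value
        edges, v = [], t
        while parent[v] is not None:
            edges.append((parent[v], v))
            v = parent[v]
        b = min(residual[u][w] for u, w in edges)
        for u, w in edges:
            residual[u][w] -= b
            residual[w][u] += b
        value += b


def is_eliminated(wins, games, m):
    """wins/games describe the teams other than z; m is z's best possible total."""
    if any(w > m for w in wins.values()):
        return True  # added guard: some team is already ahead of m
    g_star = sum(games.values())
    if g_star == 0:
        return False  # added guard: no games left among the other teams
    capacity = {}
    for (x, y), g in games.items():
        if g == 0:
            continue
        u_xy = ("game", x, y)
        capacity[("s", u_xy)] = g
        capacity[(u_xy, ("team", x))] = g
        capacity[(u_xy, ("team", y))] = g
    for x, w in wins.items():
        capacity[(("team", x), "t")] = m - w
    return max_flow(capacity, "s", "t") < g_star


wins = {"NY": 90, "Balt": 88, "Tor": 87}
games = {("NY", "Tor"): 6, ("NY", "Balt"): 1, ("Balt", "Tor"): 1}
print(is_eliminated(wins, games, 91))  # True (max flow 7 < g* = 8)
```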


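The maximum flow has value 7, whereas g* = 6 + 1 + 1 = 8 (g* is the total number of games left between the teams other than Boston). Since max flow < g*, Boston cannot even tie for first place.

wNewYork = 90, wBaltimore = 88, wToronto = 87, wBoston = 79
gNY-Tor = 6, gNY-Bal = 1, gBal-Tor = 1
Boston has 12 games left, so m = 91

[Figure: flow network G. Edges from s: (s, NY-Tor) with capacity 6, (s, NY-Balt) with capacity 1, (s, Balt-Tor) with capacity 1. Each game node has an edge of the same capacity to each of its two team nodes. Edges into t: (NY, t) with capacity 91 - 90 = 1, (Balt, t) with capacity 91 - 88 = 3, (Tor, t) with capacity 91 - 87 = 4.]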

Baseball Elimination

QUESTION: How many vertices and edges, respectively, are there in a flow network that is constructed to determine whether one team is mathematically eliminated from a baseball league containing N teams? Assume the worst case (when there is a game remaining between every pair of teams in the league).

a) N and N^2

b) N^2 and N^2

c) N^2 and N^3

d) N^2 and N^4

Network Flow Holy Grail

• Worst-case analysis is generally not useful for predicting or comparing maxflow algorithm performance in practice

• Current best in practice: Push-relabel method with gap relabeling runs in time proportional to E^(3/2)

Network Flow Holy Grail

Year   Method                     Worst Case                  Discovered By
1951   Simplex                    E^3 C                       Dantzig
1955   Augmenting path            E C*                        Ford-Fulkerson
1970   Shortest augmenting path   E^3                         Dinitz and Edmonds-Karp
1970   Fattest augmenting path    E^2 log E log(E C)          Dinitz and Edmonds-Karp
1977   Blocking flow              E^(5/2)                     Cherkasky
1978   Blocking flow              E^(7/3)                     Galil
1983   Dynamic trees              E^2 log E                   Sleator and Tarjan
1985   Capacity scaling           E^2 log C                   Gabow
1997   Length function            E^(3/2) log E log C         Goldberg and Rao
2012   Compact network            E^2 / log E                 Orlin
?      ?                          E                           ?
