Equilibrium Solution Methods
CE 392C

Stephen D. Boyles
Fall 2013

1 Introduction

Thus far in the course, we’ve learned what the principle of user equilibrium is, covered some basic optimization techniques, and derived the Beckmann formulation, which allows you to find user equilibrium link flows by solving an optimization problem. Now, we’ll talk about how you actually solve this problem, especially on large networks. We could just use the trial-and-error method described in earlier notes, but that isn’t the most efficient way, and as we’ll soon see, when it comes to large-scale transportation networks, finding an efficient algorithm is of the utmost importance.

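Briefly reviewing, user equilibrium link demands can also be represented as the optimal solution to the program

$$
\begin{aligned}
\min_{x,h} \quad & \sum_{(i,j) \in A} \int_0^{x_{ij}} t_{ij}(x)\,dx && (1) \\
\text{s.t.} \quad & x_{ij} = \sum_{\pi \in \Pi} h_\pi \delta_{ij}^\pi \qquad \forall (i,j) \in A && (2) \\
& \sum_{\pi \in \Pi_{rs}} h_\pi = d_{rs} \qquad \forall (r,s) \in Z^2 && (3) \\
& h_\pi \geq 0 \qquad \forall \pi \in \Pi && (4)
\end{aligned}
$$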

where $x_{ij}$ is the travel demand on a link, $t_{ij}(\cdot)$ the link performance function mapping link demand to travel time, $h_\pi$ the number of people choosing path $\pi$, $d_{rs}$ the total number of people traveling from origin $r$ to destination $s$, $\delta_{ij}^\pi$ an indicator variable (one if link $(i,j)$ is part of path $\pi$, and zero otherwise), and $A$ and $Z$ the sets of all links and zones. Substituting constraint (2) into the objective function, writing the optimality conditions, reversing the substitution (2), and simplifying further with some algebra, these conditions include

$$
\begin{aligned}
& C_\pi \geq \kappa_{rs} \qquad \forall (r,s) \in Z^2,\ \pi \in \Pi_{rs} && (5) \\
& h_\pi (C_\pi - \kappa_{rs}) = 0 \qquad \forall \pi \in \bigcup_{(r,s) \in Z^2} \Pi_{rs} && (6) \\
& h_\pi \geq 0 \qquad \forall \pi \in \Pi && (7)
\end{aligned}
$$

where $C_\pi$ is the total travel time on path $\pi$. Condition (5) states that the travel time on each path is at least the minimum travel time $\kappa_{rs}$ among all paths connecting OD pair $(r,s)$. To satisfy the complementarity condition (6), we must have either $C_\pi = \kappa_{rs}$ (path $\pi$ is a shortest path between $r$ and $s$) or $h_\pi = 0$ (the path is unused). This is exactly the definition of user equilibrium flows.
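As a quick sanity check on conditions (5)–(7), the following Python sketch tests whether the path flows and path travel times of a single OD pair satisfy them. The function name `satisfies_user_equilibrium`, the tolerance, and the sample numbers are illustrative assumptions and do not come from the notes.

```python
def satisfies_user_equilibrium(path_flows, path_times, tol=1e-9):
    """Check conditions (5)-(7) for the paths of ONE OD pair.

    path_flows[k] plays the role of h_pi and path_times[k] of C_pi.
    (Hypothetical helper; the name is not from the notes.)
    """
    kappa = min(path_times)  # shortest travel time for this OD pair
    for h, c in zip(path_flows, path_times):
        if h < -tol:                    # (7): path flows are nonnegative
            return False
        if c < kappa - tol:             # (5): automatic, since kappa is the minimum
            return False
        if abs(h * (c - kappa)) > tol:  # (6): every used path has travel time kappa
            return False
    return True


# Two parallel routes with made-up numbers:
print(satisfies_user_equilibrium([30, 20], [40, 40]))  # both used, equal times
print(satisfies_user_equilibrium([50, 0], [40, 45]))   # the slower route is unused
print(satisfies_user_equilibrium([50, 0], [60, 20]))   # travelers sit on a slower route
```

Only the third case violates the conditions, because flow is assigned to a path that is strictly slower than the best one.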

We’ll now describe a few methods to solve this formulation, keeping in mind applicability to large networks. We’ll start with two simple methods (the method of successive averages and Frank-Wolfe), then move on to some other topics of importance. Later in the course, we’ll return to this original problem and cover more advanced solution methods. Eventually, we’ll see all of these algorithms:


Method of Successive Averages (MSA): No longer a practical method, but the simplest to understand, and it illustrates the basic concepts. Advantage: requires very little computer memory; disadvantage: convergence to the equilibrium solution is glacially slow.

Frank-Wolfe (FW): For many decades, the method most commonly used by transportation planners and software programs. Advantage: requires no more memory than MSA, but is a good deal faster; disadvantage: convergence is still far too slow on large networks.

Gradient Projection (GP): Developed in the mid-1990s, with a resurgence in the past few years. This is the quintessential “path-based” algorithm. Advantage: can find much more accurate solutions than MSA and FW, and much faster; disadvantage: can require a huge amount of computer memory.

Algorithm B: Developed in 2006 by Bob Dial, this method is fast, intuitive, and relatively easy to implement. Not yet included in these notes.

Traffic Assignment by Paired Alternative Segments (TAPAS): One of the most recent algorithms (published in 2010), developed by Hillel Bar-Gera, seems to offer performance superior to any of the above. Not yet included in these notes.

The importance of selecting a good algorithm is shown by considering the progress of these five algorithms on the Chicago Regional network, which is commonly used as a realistic large-scale network for testing purposes. This network has 12,982 nodes; 39,018 links; and 2,297,945 OD pairs.[1] Current wisdom is that a relative gap of $10^{-6}$ is small enough for practical purposes; TAPAS can reach this level of precision in about 20 minutes, gradient projection and Algorithm B in roughly 2 hours. Frank-Wolfe would require approximately 42 days, and the method of successive averages over 100 years! For the sake of stress-testing, let’s say we want to run each algorithm until the gap is reduced below the precision of standard data types in the computer (roughly $10^{-15}$). TAPAS would require about an hour to reach this level, gradient projection about 10 hours, and Algorithm B a few days. Such precision with FW and MSA is frankly impossible: extrapolating their convergence rates, FW needs 80,000 years to reach this level, and MSA over $10^{16}$ years, nearly a million times longer than the age of the universe. The moral of the story: even though each of these algorithms “works” in the sense that it will eventually get to the right answer, for large-scale practical problems it is hugely important to pick a good one. The good news is that all examples and homework problems in this class will be small enough that any of them can be usefully applied.

2 History

The earliest methods used for traffic assignment were heuristic in nature, and not guaranteed to find an exact equilibrium solution, or even to converge towards one. Examples include simply assigning all vehicles to the shortest paths using free-flow travel times, or “incremental assignment” methods, where a certain number of vehicles was assigned to free-flow shortest paths; travel times were recomputed; an additional number of vehicles assigned to the new shortest paths; travel times recalculated again; more vehicles assigned; and so forth.

The first exact method to achieve popularity in practice, and one of the longest-lasting, was the Frank-Wolfe method, first applied to transportation problems in the late 1960s and 1970s. It was acceptably fast given the network sizes being used and standards of the day, more accurate than incremental assignment, and most importantly, required very little computer memory in a day when RAM was a scarce and expensive commodity. Furthermore, the algorithm had an appealing intuitive interpretation which was easily understood. Faster and more accurate methods were developed in the 1980s and 1990s, but these all failed to reach the same ubiquity in practice that Frank-Wolfe did. Some of these techniques, including path-based algorithms such as gradient projection, were developed during this time, but computers had yet to reach the point where increased speed could justify the extravagance of higher memory consumption.

[1] Please don’t try to solve this one by hand.

A seminal development in the history of algorithms for the traffic assignment problem is the concept of a bush, which we’ll return to later in this course. Bush-based methods were independently developed by several researchers. The first approach to gain wide attention for traffic assignment was the “origin-based assignment” algorithm developed by Hillel Bar-Gera in his doctoral dissertation, around the turn of the millennium. This method was substantially faster than link-based methods such as Frank-Wolfe, yet did not have the exorbitant memory requirements of path-based algorithms. Bushes also proved highly useful in other network-related problems, and improved bush-based methods have been developed since that time. Much of the commercially available planning software in use today uses a bush-based method.

3 Framework

As the above discussion might suggest, none of these solution methods get to the right answer immediately. That is, there isn’t any “step one, step two, step three, and then we’re done” recipe for solving large-scale equilibrium problems.[2] Instead, an iterative approach is used where we start with some assignment of drivers to paths and links, and move closer and closer to the equilibrium solution as we repeat a certain set of steps over and over, until we’re “close enough” to quit and call it good.[3]

Broadly speaking, all equilibrium solution algorithms repeat the following three steps:

1. Find the fastest path between each origin and each destination.

2. Shift travelers from slower paths to faster ones.

3. Recalculate link demands and travel times after the shift, and return to step one unless we’re close enough to equilibrium.

We’ll soon talk about how to find the fastest paths (the first step) in a systematic way. The good news is that this can be done quickly and efficiently even in large networks; for now we can spot the fastest paths by inspection. The third step is even more straightforward, and is nothing more than re-evaluating the link performance functions on each link with the new volumes. The second step requires the most care; the danger here is shifting either too few travelers onto faster paths, or shifting too many. If we shift too few, then it will take a long time to get to the equilibrium solution. On the other hand, systematically shifting too many can be even more dangerous, because it creates the possibility of “infinite cycling” and never finding the true equilibrium.

Recall the simple example in Figure 1, where the equilibrium is for thirty travelers to choose the top route, and twenty to choose the bottom route, with an equal travel time of 40 minutes on both paths. Solving this example using the above process, initially (i.e., with nobody on the network) the fastest path is the top one (step one), so let’s assign all 50 travelers onto the top path (step two). Performing the third step, we recalculate the travel times as 60 minutes on the top link, and 20 on the bottom. This is not at all an equilibrium, so we go back to the first step, and see that the bottom path is now faster, so we have to shift some people from the top to the bottom. If we wanted, we could shift travelers one at a time, that is, assigning 49 to the top route and 1 to the bottom, seeing that we still haven’t found equilibrium, so trying 48 and 2, then 47 and 3, and so forth, until finally reaching the equilibrium with 30 and 20. Clearly this is not efficient, and is an example of shifting too few travelers at a time.

Figure 1: Two-link example for demonstration. (Nodes 1 and 2 are joined by a top link with travel time $10 + x_1$ and a bottom link with travel time $20 + x_2$; 50 vehicles travel from 1 to 2.)

[2] If you can think of one, please let me know. You’ll win the Nobel prize in economics. I’m not exaggerating.

[3] One iterative algorithm you probably saw in calculus was Newton’s method for finding zeroes of a function. Repeat the same step over and over until the function is sufficiently close to zero.

At the other extreme, let’s say we shift everybody onto the fastest path in the second step. That is, we go from assigning 50 to the top route and 0 to the bottom, to assigning 0 to the top and 50 to the bottom. Recalculating link travel times, the top route now has a travel time of 10 minutes, and the bottom a travel time of 70. This is even worse![4] Repeating the process, we try to fix this by shifting everybody back (50 on top, 0 on bottom), but now we’re just back in the original situation. If we kept up this process, we’d keep bouncing back and forth between these solutions. This is even worse than shifting too few, because we never reach the equilibrium no matter how long we work! You might think it’s obvious to detect if something like this is happening. With this small example, it might be. Trying to train a computer to detect this, or trying to detect cycles with over 2 million OD pairs (as in Chicago), is much, much harder.
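The bouncing described here is easy to reproduce. The following Python sketch, assuming the two-link network of Figure 1 (travel times 10 + x on top, 20 + x on the bottom, 50 travelers), repeatedly moves everybody onto the currently fastest link; the function names are hypothetical.

```python
def travel_times(x_top, x_bottom):
    """Link performance functions of the two-link example."""
    return 10 + x_top, 20 + x_bottom


def all_or_nothing_history(demand=50, iterations=6):
    """Shift EVERYBODY onto the currently fastest link, over and over."""
    x = (0, 0)  # start with nobody on the network
    history = []
    for _ in range(iterations):
        t_top, t_bottom = travel_times(*x)
        # Step one: find the fastest path. Step two: move all travelers to it.
        x = (demand, 0) if t_top <= t_bottom else (0, demand)
        history.append(x)
    return history


print(all_or_nothing_history())
# The flows alternate between (50, 0) and (0, 50) and never reach (30, 20).
```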

At this point, it’s worth using the Beckmann formulation to show that this intuitive approach has mathematical justification. If $x$ represents the current set of link demands, and $x^*$ the set of link demands if everybody was loaded on the shortest paths using the “current” travel times $t(x)$, shifting travelers onto shortest paths corresponds to moving from $x$ in the direction of the vector $x^* - x$, to some new solution $x' = x + \lambda(x^* - x) = (1 - \lambda)x + \lambda x^*$, where $\lambda$ represents the size of the step taken in this direction: $\lambda = 0$ corresponds to no shift at all, while $\lambda = 1$ corresponds to shifting everybody onto the current shortest paths.

As a result of this shift, the Beckmann function will be changed from $f(x)$ to $f(x')$, and we want to show that it’s possible to choose $\lambda$ in some way to guarantee $f(x') \leq f(x)$. That is, we want to show that we can reduce the Beckmann function (and thus move closer to the equilibrium solution) by taking a (correctly-sized) step in the direction $x^* - x$. Define $f(x(\lambda)) = f((1 - \lambda)x + \lambda x^*)$ to be the Beckmann function after taking a step of size $\lambda$. Using the multivariate chain rule, the derivative of $f$ with respect to $\lambda$ is

$$
\frac{df}{d\lambda} = \sum_{(i,j) \in A} \frac{\partial f}{\partial x_{ij}} \frac{dx_{ij}}{d\lambda} = \sum_{(i,j) \in A} t_{ij}\bigl((1 - \lambda)x_{ij} + \lambda x^*_{ij}\bigr)(x^*_{ij} - x_{ij})
$$

Evaluating this derivative at $\lambda = 0$ gives

$$
\frac{df}{d\lambda}(0) = \sum_{(i,j) \in A} t_{ij}(x_{ij})(x^*_{ij} - x_{ij})
$$

Now, $x^*$ was specifically chosen to put all vehicles on the shortest paths at travel times $t(x)$, and so $\sum_{(i,j) \in A} x^*_{ij} t_{ij}(x_{ij}) \leq \sum_{(i,j) \in A} x_{ij} t_{ij}(x_{ij})$, and therefore $\frac{df}{d\lambda}(0) \leq 0$. Furthermore, if we are not at the equilibrium solution already, $\sum_{(i,j) \in A} x^*_{ij} t_{ij}(x_{ij})$ is strictly less than $\sum_{(i,j) \in A} x_{ij} t_{ij}(x_{ij})$. This implies $\frac{df}{d\lambda}(0) < 0$ or, equivalently, we can decrease the Beckmann function if we take a small enough step in the direction $x^* - x$, by shifting people from longer paths onto shorter ones.

[4] By “worse” I mean farther from equilibrium.
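To make the descent argument concrete, here is a small numerical check in Python. It assumes the two-link network of Figure 1, whose Beckmann function works out to $10x_1 + x_1^2/2 + 20x_2 + x_2^2/2$, with everybody currently on the top link; the helper names are hypothetical.

```python
def beckmann(x_top, x_bottom):
    """Beckmann function: integrals of 10 + x and 20 + x from 0 to the link flow."""
    return 10 * x_top + x_top**2 / 2 + 20 * x_bottom + x_bottom**2 / 2


def slope_at_zero(x, x_star):
    """df/dlambda at lambda = 0: sum of t_ij(x_ij) * (x*_ij - x_ij)."""
    t = (10 + x[0], 20 + x[1])
    return sum(t_a * (xs_a - x_a) for t_a, x_a, xs_a in zip(t, x, x_star))


def beckmann_after_step(lam, x, x_star):
    """Beckmann function at (1 - lam) * x + lam * x_star."""
    moved = [(1 - lam) * x_a + lam * xs_a for x_a, xs_a in zip(x, x_star)]
    return beckmann(*moved)


x, x_star = (50.0, 0.0), (0.0, 50.0)      # current flows and all-or-nothing target
print(slope_at_zero(x, x_star))            # negative, so x* - x is a descent direction
print(beckmann(*x))                        # value before the step
print(beckmann_after_step(0.1, x, x_star)) # a small step lowers the Beckmann function
print(beckmann_after_step(1.0, x, x_star)) # the full step overshoots and raises it
```

The last two lines illustrate why the step size matters: the direction is always downhill at first, but going all the way can end up higher than where we started.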

The question of exactly how large a step we ought to take will be taken up in subsequent sections, and is the main step in which algorithms differ.

3.1 Stopping Criteria

A general issue is how one chooses to stop the iterative process, that is, how one knows when a solution is “good enough” or close enough to equilibrium. This is called a convergence criterion. Many convergence criteria have been proposed over the years; perhaps the most common is the relative gap, which is defined here. Remembering that the multiplier $\kappa_{rs}$ represents the time spent on the fastest path between origin $r$ and destination $s$, the relative gap $\gamma$ is commonly defined as follows:

$$
\gamma = \frac{\sum_{(i,j) \in A} t_{ij} x_{ij}}{\sum_{(r,s) \in Z^2} \kappa_{rs} d_{rs}} - 1 = \frac{t \cdot x}{\kappa \cdot d} - 1 \qquad (8)
$$

Notice that the numerator of the fraction is the total system travel time (TSTT). The relative gap is always nonnegative, and it is equal to zero if and only if the flows $x$ satisfy the principle of user equilibrium.[5] It is these properties which make the relative gap a useful convergence criterion: once it is close enough to zero, our solution is “close enough” to equilibrium. For most practical purposes, a relative gap of $10^{-6}$ is small enough.

One drawback of the relative gap is that it is unitless and does not have an intuitive meaning. Furthermore, there are several slightly different variations of the relative gap which are currently used, and while they all retain the above flavor, the exact definitions may vary. A more recently proposed metric is the average excess cost, defined as

$$
\text{AEC} = \frac{\sum_{(i,j) \in A} t_{ij} x_{ij} - \sum_{(r,s) \in Z^2} \kappa_{rs} d_{rs}}{\sum_{(r,s) \in Z^2} d_{rs}} = \frac{t \cdot x - \kappa \cdot d}{d \cdot \mathbf{1}} \qquad (9)
$$

This quantity represents the average difference between the travel time on each traveler’s actual path, and the travel time on the shortest path available to him or her. Unlike the relative gap, AEC has units of time, and is thus easier to interpret.
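Both convergence measures are one-liners once the vectors are available. The Python sketch below follows equations (8) and (9) directly; the function names are hypothetical, and the sample numbers assume the two-link network of Figure 1 with all 50 travelers loaded on the top link.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(a_k * b_k for a_k, b_k in zip(a, b))


def relative_gap(t, x, kappa, d):
    """Equation (8): total system travel time over shortest-path travel time, minus 1."""
    return dot(t, x) / dot(kappa, d) - 1.0


def average_excess_cost(t, x, kappa, d):
    """Equation (9): average extra time per traveler compared with the shortest path."""
    return (dot(t, x) - dot(kappa, d)) / sum(d)


# Link travel times and flows (one entry per link), then shortest-path times and
# demands (one entry per OD pair).
t, x = [60.0, 20.0], [50.0, 0.0]
kappa, d = [20.0], [50.0]
print(relative_gap(t, x, kappa, d))         # unitless
print(average_excess_cost(t, x, kappa, d))  # minutes per traveler
```

Here every traveler spends 60 minutes while a 20-minute route is available, so the average excess cost is 40 minutes and the relative gap is 2.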

4 Method of Successive Averages

Although the method of successive averages (MSA) is not competitive with other equilibrium solution algorithms, its simplicity and clarity in applying the three-step iterative process make it an ideal starting place. One advantage of MSA is that you never need to work with the path flows $h_\pi$: this algorithm operates entirely in the space of link flows $x_{ij}$. In large networks, there are many, many more used paths than links (93,026,894 used paths vs. 39,018 links in the Chicago Regional network), so this is important.

The first and third steps of MSA operate the same as in all other equilibrium algorithms, so this section and all following ones focus only on step two: once you’ve found the shortest paths, how do you decide how many travelers to shift onto these, and how many stay on their current paths? As shown above, there are problems if you shift too few travelers, and potentially even bigger problems if you shift too many. MSA adopts a reasonable middle ground: initially, we shift a lot of travelers, but as the algorithm progresses, we shift fewer and fewer until we settle down on the average. The hope is that this avoids both the problems of shifting too few (at first, we’re taking big steps, so hopefully we get somewhere close to equilibrium quickly) and of shifting too many (eventually, we’ll only be moving small amounts of flow so there is no worry of infinite cycling).

[5] If this is not apparent to you, it is worthwhile studying this equation closely until you understand why.

Specifically, on the $i$-th iteration, MSA shifts $1/i$ of the travelers onto the shortest paths. So, the first time through the three steps, everybody is assigned to shortest paths. The second time through, half of the people stay on their current paths and half shift to the new shortest paths. On the third iteration, a third of the people shift to new paths, and two thirds stay on their old paths, and so forth. A complete description of MSA is as follows; in these steps, $x^i$ is the vector of link flows after the $i$-th iteration of MSA.

0. Set the iteration counter $i = 1$.

1. Find the shortest path between each origin and destination, and calculate the relative gap (unless it is the first iteration). If the relative gap is sufficiently small, stop.

2. Shift travelers onto shortest paths:

(a) Find the link flows if everybody were traveling on the shortest paths found in step 1, and store these in $x^*$.

(b) If this is the first iteration, $x^1 = x^*$. Otherwise, $x^i = (1/i)x^* + (1 - 1/i)x^{i-1}$.

3. Calculate the new link travel times and the relative gap. Increase the iteration counter $i$ by one and return to step 1.
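The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a general implementation: it assumes the two-link network of Figure 1 (travel times 10 + x and 20 + x, 50 travelers), so the shortest path can be found by comparing two numbers, and the function name and tolerance are hypothetical.

```python
def msa_two_link(demand=50.0, tolerance=1e-6, max_iterations=1000):
    """Method of successive averages on two parallel links.

    Returns the final link flows and the iteration at which the relative gap
    first dropped to the tolerance.
    """
    x = (0.0, 0.0)  # nobody is on the network before the first iteration
    for i in range(1, max_iterations + 1):
        # Link travel times at the current flows, and the shortest one.
        t = (10 + x[0], 20 + x[1])
        if i > 1:
            gap = (t[0] * x[0] + t[1] * x[1]) / (min(t) * demand) - 1
            if gap <= tolerance:
                return x, i
        # All-or-nothing flows x*: everybody on the current shortest link.
        x_star = (demand, 0.0) if t[0] <= t[1] else (0.0, demand)
        # Shift 1/i of the travelers; for i = 1 this sets x equal to x*.
        x = tuple(xs / i + (1 - 1 / i) * xo for xs, xo in zip(x_star, x))
    return x, max_iterations


flows, iterations = msa_two_link()
print(flows, iterations)
```

On this network the sketch stops at the sixth iteration with flows of (approximately, up to floating-point rounding) 30 and 20.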

4.1 Small network example

Here we solve the small example of Figure 1 by MSA, using the relative gap to measure how close we are to equilibrium.

Initialization. Set i = 1.

Iteration 1. Find the shortest paths: with no travelers on the network, the top link has a travel time of 10, and the bottom link has a travel time of 20. Therefore the top link is the shortest path, so $x^* = [50 \;\; 0]$. Since this is the first iteration, we simply set $x^1 = x^* = [50 \;\; 0]$. Recalculating the travel times, we have $t_1 = 10 + x_1 = 60$ and $t_2 = 20 + x_2 = 20$ (or, in vector form, $t^1 = [60 \;\; 20]$). We set $i = 2$ and return to step 1.

Iteration 2. With the new travel times, the shortest path is now the bottom link, so $\kappa = 20$ and the relative gap is

$$
\gamma = \frac{t \cdot x}{\kappa \cdot d} - 1 = \frac{50 \times 60 + 0 \times 20}{20 \times 50} - 1 = 2
$$

This is far too big, so we continue with the second iteration. If everyone were to take the new shortest path, the flows would be $x^* = [0 \;\; 50]$. Because this is iteration 2, we shift 1/2 of the travelers onto this path, so $x^2 = (1/2)x^* + (1/2)x^1 = [0 \;\; 25] + [25 \;\; 0] = [25 \;\; 25]$. The new travel times are thus $t^2 = [35 \;\; 45]$. We set $i = 3$ and return to step 1.

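Figure 2: Larger example with two OD pairs. (Link numbers shown.) [Network diagram with nodes 1 through 6 and links 1 through 7. OD matrix: 5,000 trips from origin 1 to destination 3, 10,000 trips from origin 2 to destination 4, and no trips from 1 to 4 or from 2 to 3.]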

Iteration 3. With the new travel times, the shortest path is now the top link, so $\kappa = 35$ and the relative gap is

$$
\gamma = \frac{t \cdot x}{\kappa \cdot d} - 1 = \frac{25 \times 35 + 25 \times 45}{35 \times 50} - 1 = 0.143
$$

This is still too big, so we continue with the third iteration. If everyone were to take the new shortest path, the flows would be $x^* = [50 \;\; 0]$. Because this is iteration 3, we shift 1/3 of the travelers onto this path, so $x^3 = (1/3)x^* + (2/3)x^2 = [50/3 \;\; 0] + [50/3 \;\; 50/3] = [100/3 \;\; 50/3]$. The new travel times are thus $t^3 = [43.33 \;\; 36.67]$. Set $i = 4$ and return to step 1.

Iteration 4. With the new travel times, the shortest path is now the bottom link, so $\kappa = 36.67$ and the relative gap is $\gamma = 0.121$. A bit better, but still too big, so we carry on. Here $x^* = [0 \;\; 50]$, so $x^4 = (1/4)x^* + (3/4)x^3 = [0 \;\; 50/4] + [25 \;\; 50/4] = [25 \;\; 25]$. The new travel times are $t^4 = [35 \;\; 45]$. Set $i = 5$ and return to step 1. Note that we have returned to the same solution found in Iteration 2. Don’t despair; this just means the last shift was too big. Next time we’ll shift fewer vehicles (because $1/i$ is smaller).

Iteration 5. With the new travel times, the shortest path is now the top link, so $\kappa = 35$ and the relative gap is $\gamma = 0.143$. This is the same as in Iteration 3, but take courage and carry on. Here $x^* = [50 \;\; 0]$, so $x^5 = (1/5)x^* + (4/5)x^4 = [30 \;\; 20]$. The new travel times are $t^5 = [40 \;\; 40]$. Set $i = 6$ and return to step 1. We can tell this is the equilibrium by inspection, but for the sake of rigor we’ll continue until the algorithm formally ends.

Iteration 6. With the new travel times, the shortest path is the top link, so $\kappa = 40$ and the relative gap is $\gamma = 0$, so we stop. In fact, either path could have been chosen for the shortest path. Whenever there is a tie between shortest paths, you are free to choose among them.

So, for the small example it took MSA six iterations to find the right solution.

4.2 Larger network example

Here we apply MSA to a slightly larger network with two OD pairs, shown in Figure 2, where each link has the link performance function $t(x) = 10 + x/100$.

There are four paths in this network, written here as the sequence of nodes they visit; for OD pair (1,3) these are [1, 3] and [1, 5, 6, 3], and for OD pair (2,4) these are [2, 5, 6, 4] and [2, 4]. Link flow and travel time vectors list the seven links in order of their link numbers. In this example, we’ll calculate the average excess cost, rather than the relative gap.

Initialization. Set i = 1.

Iteration 1. Find the shortest paths: with no travelers on the network, paths [1, 3], [1, 5, 6, 3], [2, 5, 6, 4], and [2, 4] respectively have travel times of 10, 30, 30, and 10. Therefore [1, 3] is shortest for OD pair (1,3), and [2, 4] is shortest for OD pair (2,4), so[6]

$$
x^* = \begin{bmatrix} 5000 & 0 & 0 & 0 & 0 & 0 & 10000 \end{bmatrix}
$$

Since this is the first iteration, we simply set

$$
x^1 = x^* = \begin{bmatrix} 5000 & 0 & 0 & 0 & 0 & 0 & 10000 \end{bmatrix}
$$

Recalculating the travel times, we have

$$
t^1 = \begin{bmatrix} 60 & 10 & 10 & 10 & 10 & 10 & 110 \end{bmatrix}
$$

We set $i = 2$ and return to step 1.

Iteration 2. With the new travel times, the shortest path for (1,3) is now [1, 5, 6, 3], with a travel time of 30, so $\kappa_{13} = 30$. Likewise, the new shortest path for (2,4) is [2, 5, 6, 4], so $\kappa_{24} = 30$ and the average excess cost is

$$
\text{AEC} = \frac{t \cdot x - \kappa \cdot d}{d \cdot \mathbf{1}} = \frac{5000 \times 60 + 10000 \times 110 - 30 \times 5000 - 30 \times 10000}{5000 + 10000} = 63.33
$$

This is far too big and suggests that the average trip is 63 minutes slower than the shortest path available! We are nowhere near equilibrium, so we continue with the second iteration. If everyone were to take the new shortest paths, the flows would be

$$
x^* = \begin{bmatrix} 0 & 5000 & 15000 & 5000 & 10000 & 10000 & 0 \end{bmatrix}
$$

(Be sure you understand how we calculated this.) Because this is iteration 2, we shift 1/2 of the travelers onto these paths, so

$$
x^2 = (1/2)x^* + (1/2)x^1 = \begin{bmatrix} 2500 & 2500 & 7500 & 2500 & 5000 & 5000 & 5000 \end{bmatrix}
$$

The new travel times are thus

$$
t^2 = \begin{bmatrix} 35 & 35 & 85 & 35 & 60 & 60 & 60 \end{bmatrix}
$$

We set $i = 3$ and return to step 1.

Iteration 3. With the new travel times, the shortest path for (1,3) is now [1, 3], with $\kappa_{13} = 35$. The new shortest path for (2,4) is [2, 4], so $\kappa_{24} = 60$ and the average excess cost is

$$
\text{AEC} = \frac{2500 \times 35 + 2500 \times 35 + 7500 \times 85 + 2500 \times 35 + 5000 \times 60 + 5000 \times 60 + 5000 \times 60 - 35 \times 5000 - 60 \times 10000}{15000} = 68.33
$$

This is still big (and in fact worse), but we persistently continue with the third iteration. If everyone were to take the new shortest paths, the flows would be

$$
x^* = \begin{bmatrix} 5000 & 0 & 0 & 0 & 0 & 0 & 10000 \end{bmatrix}
$$

so

$$
x^3 = (1/3)x^* + (2/3)x^2 = \begin{bmatrix} 3333 & 1667 & 5000 & 1667 & 3333 & 3333 & 6667 \end{bmatrix}
$$

The new travel times are

$$
t^3 = \begin{bmatrix} 43.3 & 26.7 & 60 & 26.7 & 43.3 & 43.3 & 76.7 \end{bmatrix}
$$

We set $i = 4$ and return to step 1.

[6] For each OD pair, we add the total demand from the OD matrix onto each link in the shortest path.

Iteration 4. With the new travel times, the shortest path for (1,3) is still [1, 3], with $\kappa_{13} = 43.3$, and the shortest path for (2,4) is still [2, 4] with $\kappa_{24} = 76.7$, and the average excess cost is AEC = 23.33. Continuing the fourth iteration, as before

$$
x^* = \begin{bmatrix} 5000 & 0 & 0 & 0 & 0 & 0 & 10000 \end{bmatrix}
$$

so

$$
x^4 = (1/4)x^* + (3/4)x^3 = \begin{bmatrix} 3750 & 1250 & 3750 & 1250 & 2500 & 2500 & 7500 \end{bmatrix}
$$

The new travel times are

$$
t^4 = \begin{bmatrix} 47.5 & 22.5 & 47.5 & 22.5 & 35 & 35 & 85 \end{bmatrix}
$$

We set $i = 5$ and return to step 1.

Iteration 5. With the new travel times, the shortest path for (1,3) is still [1, 3], with $\kappa_{13} = 47.5$, and the shortest path for (2,4) is still [2, 4] with $\kappa_{24} = 85$, and the average excess cost is AEC = 9.17. Continuing the fifth iteration, as before

$$
x^* = \begin{bmatrix} 5000 & 0 & 0 & 0 & 0 & 0 & 10000 \end{bmatrix}
$$

so

$$
x^5 = (1/5)x^* + (4/5)x^4 = \begin{bmatrix} 4000 & 1000 & 3000 & 1000 & 2000 & 2000 & 8000 \end{bmatrix}
$$

The new travel times are

$$
t^5 = \begin{bmatrix} 50 & 20 & 40 & 20 & 30 & 30 & 90 \end{bmatrix}
$$

We set $i = 6$ and return to step 1.

Iteration 6. With the new travel times, the shortest path for (1,3) is still [1, 3], with $\kappa_{13} = 50$, and the shortest path for (2,4) is still [2, 4] with $\kappa_{24} = 90$, and the average excess cost is AEC = 3.33. Note that the shortest paths have stayed the same over the last three iterations. This means that we really could have shifted more flow than we actually did. The Frank-Wolfe algorithm, described in the next section, fixes this problem. We have

$$
x^* = \begin{bmatrix} 5000 & 0 & 0 & 0 & 0 & 0 & 10000 \end{bmatrix}
$$

so

$$
x^6 = (1/6)x^* + (5/6)x^5 = \begin{bmatrix} 4167 & 833 & 2500 & 833 & 1667 & 1667 & 8333 \end{bmatrix}
$$

The new travel times are

$$
t^6 = \begin{bmatrix} 51.7 & 18.3 & 35 & 18.3 & 26.7 & 26.7 & 93.3 \end{bmatrix}
$$

We set $i = 7$ and return to step 1.

Iteration 7. With the new travel times, the shortest path for (1,3) is still [1, 3], with $\kappa_{13} = 51.7$, but the shortest path for (2,4) is now [2, 5, 6, 4] with $\kappa_{24} = 88.3$. The average excess cost is AEC = 3.89. Note that the OD pairs are no longer behaving “symmetrically”: the shortest path for (1,3) stayed the same, but the shortest path for (2,4) has changed. We have

$$
x^* = \begin{bmatrix} 5000 & 0 & 10000 & 0 & 10000 & 10000 & 0 \end{bmatrix}
$$

so

$$
x^7 = (1/7)x^* + (6/7)x^6 = \begin{bmatrix} 4286 & 714 & 3571 & 714 & 2857 & 2857 & 7143 \end{bmatrix}
$$

The new travel times are

$$
t^7 = \begin{bmatrix} 52.9 & 17.1 & 45.7 & 17.1 & 38.6 & 38.6 & 81.4 \end{bmatrix}
$$

We set $i = 8$ and return to step 1.

This process continues over and over until the average excess cost is sufficiently small. Even with such a small network, MSA requires a very long time to converge. An average excess cost of 1 is obtained after twelve iterations, 0.1 after sixty-four iterations, 0.01 after three hundred thirty-three, and I’m not patient enough to go further.
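For readers who would rather let a computer grind through these iterations, here is a hedged Python sketch of MSA on this network. The link-to-path mapping in `LINKS_ON_PATH` is an assumption inferred from the flow vectors in the example (link 1 joins nodes 1 and 3, links 2, 3, 4 form the route 1-5-6-3, links 5, 3, 6 form 2-5-6-4, and link 7 joins 2 and 4); all names are hypothetical, and paths are enumerated by hand rather than found with a shortest-path algorithm.

```python
# Zero-based link indices along each path, per OD pair (assumed topology).
LINKS_ON_PATH = {
    (1, 3): [[0], [1, 2, 3]],   # paths [1, 3] and [1, 5, 6, 3]
    (2, 4): [[6], [4, 2, 5]],   # paths [2, 4] and [2, 5, 6, 4]
}
DEMAND = {(1, 3): 5000.0, (2, 4): 10000.0}


def link_times(x):
    """Link performance function t(x) = 10 + x / 100 on every link."""
    return [10 + x_a / 100 for x_a in x]


def shortest_paths(t):
    """For each OD pair, return (links on the fastest path, its travel time)."""
    best = {}
    for od, paths in LINKS_ON_PATH.items():
        costs = [sum(t[a] for a in path) for path in paths]
        k = min(range(len(paths)), key=costs.__getitem__)
        best[od] = (paths[k], costs[k])
    return best


def msa(num_iterations):
    """Run MSA; return final link flows and the AEC seen at iterations 2, 3, ..."""
    x = [0.0] * 7
    aec_history = []
    for i in range(1, num_iterations + 1):
        t = link_times(x)
        best = shortest_paths(t)
        if i > 1:
            tstt = sum(t_a * x_a for t_a, x_a in zip(t, x))
            sptt = sum(best[od][1] * DEMAND[od] for od in DEMAND)
            aec_history.append((tstt - sptt) / sum(DEMAND.values()))
        # All-or-nothing flows: load each OD pair's demand on its fastest path.
        x_star = [0.0] * 7
        for od, (path, _) in best.items():
            for a in path:
                x_star[a] += DEMAND[od]
        # Shift 1/i of the travelers toward x*.
        x = [xs / i + (1 - 1 / i) * xo for xs, xo in zip(x_star, x)]
    return x, aec_history


flows, aec_history = msa(7)
print([round(v) for v in flows])
print([round(v, 2) for v in aec_history])
```

Calling `msa` with a larger iteration count and inspecting `aec_history` is an easy way to see how slowly the average excess cost shrinks.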

5 Frank-Wolfe

One of the biggest drawbacks with MSA is that it has a fixed step size (or, more informally, a “dumb” step size). Iteration $i$ moves exactly $1/i$ of the travelers onto the new shortest paths, no matter how close or far away we are from the equilibrium. Essentially, MSA decides its course of action before it even gets started, then sticks stubbornly to the plan of moving $1/i$ of the travelers each iteration. The Frank-Wolfe (FW) algorithm fixes this problem by using an adaptive step size. At each iteration, FW calculates exactly the right amount of flow to shift to get as close to equilibrium as possible.

So, at each iteration we calculate the new flows with the equation $x^i = \lambda x^* + (1 - \lambda)x^{i-1}$. With MSA we always chose $\lambda = 1/i$, but with FW $\lambda$ is chosen adaptively. The extreme values $\lambda = 0$ and $\lambda = 1$ mean we keep everybody on the current path, or shift everybody to the shortest path, respectively. We want to pick $\lambda$ in this range in such a way that $x^i$ is as close to equilibrium as possible. We might try to do this by picking $\lambda$ to minimize the relative gap or average excess cost, but this turns out to be harder to compute. Instead, we pick $\lambda$ to minimize the Beckmann function.

Recall the discussion above, where we considered the value of the Beckmann function after taking a step of size $\lambda$. Writing $x^0$ for the current link flows and $\zeta(\lambda) = f((1 - \lambda)x^0 + \lambda x^*)$ for this value, we found the derivative of $\zeta$ to be

$$
\frac{d\zeta}{d\lambda} = \sum_{(i,j) \in A} t_{ij}\bigl((1 - \lambda)x^0_{ij} + \lambda x^*_{ij}\bigr)(x^*_{ij} - x^0_{ij}) \qquad (10)
$$

It is not difficult to show that $\zeta$ is a convex function, so we can find its minimum by setting the derivative equal to zero, which occurs if the condition

$$
\sum_{(i,j) \in A} x^*_{ij} t_{ij}\bigl((1 - \lambda)x^0_{ij} + \lambda x^*_{ij}\bigr) = \sum_{(i,j) \in A} x^0_{ij} t_{ij}\bigl((1 - \lambda)x^0_{ij} + \lambda x^*_{ij}\bigr) \qquad (11)
$$

is satisfied. Study this equation carefully: the coefficients $x^0_{ij}$ and $x^*_{ij}$ are constants and do not change with $\lambda$; the only part of this condition which is affected by $\lambda$ is the travel times. You can interpret this equation as trying to find a balance between $x^0$ and $x^*$ in the following sense: different values of $\lambda$ correspond to shifting a different number of travelers from their current paths to shortest paths, which will result in different travel times on all the links. You want to pick $\lambda$ so that, after you make the switch, both the old paths $x^0$ and the old shortest paths $x^*$ are equally attractive in terms of their travel times. (If you don’t find this intuitive justification convincing, then you can focus on the mathematical one: this condition entails choosing the $\lambda$ value minimizing the Beckmann function.)

More practically, you might ask how to solve equation (11) for $\lambda$, since the link performance functions are typically nonlinear. General techniques such as Newton’s method or an equation solver can be used, but it’s not too difficult to use an enlightened trial-and-error method such as a binary search or bisection: because $\zeta$ is convex, its derivative is increasing, and typically $d\zeta/d\lambda \leq 0$ for $\lambda = 0$ and $d\zeta/d\lambda \geq 0$ for $\lambda = 1$. Pick $\lambda = 1/2$ and calculate $d\zeta/d\lambda$. If it’s negative, you know the zero occurs in the interval [1/2, 1], and you can try $\lambda = 3/4$ next. Alternatively, if the derivative is positive at $\lambda = 1/2$, the zero occurs in the interval [0, 1/2], and you can try $\lambda = 1/4$ next. Depending on the sign here, you can eliminate half of the remaining search space as well, and continue until you’ve found the zero with as much precision as desired. This is relatively easy to set up in a spreadsheet, and would be a useful exercise.
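The bisection just described translates almost word for word into code. Below is a minimal Python sketch; `bisection_step_size` is a hypothetical name, and the sample derivative assumes the two-link network of Figure 1 with current flows [50 0] and all-or-nothing flows [0 50], so that equation (10) has only two terms.

```python
def bisection_step_size(derivative, reductions=5):
    """Locate the zero of an increasing function on [0, 1] by interval halving.

    `derivative` maps a step size lambda to d(zeta)/d(lambda). After the given
    number of interval reductions, the midpoint of the final interval is returned.
    """
    lower, upper = 0.0, 1.0
    for _ in range(reductions):
        middle = (lower + upper) / 2
        if derivative(middle) > 0:
            upper = middle   # zero lies below the midpoint: discard the upper half
        else:
            lower = middle   # zero lies at or above the midpoint: discard the lower half
    return (lower + upper) / 2


def two_link_derivative(lam):
    """Equation (10) for x0 = [50, 0] and x* = [0, 50] on the two-link network."""
    x_top, x_bottom = 50 * (1 - lam), 50 * lam
    return (0 - 50) * (10 + x_top) + (50 - 0) * (20 + x_bottom)


print(bisection_step_size(two_link_derivative))                 # five reductions
print(bisection_step_size(two_link_derivative, reductions=40))  # many more reductions
```

With five reductions the estimate is 25/64; with many more it approaches the exact zero of the derivative.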

This is the only difference between MSA and FW, but it is a significant one, as seen in the following examples.

5.1 Small network example

Here we solve the small example of Figure 1 by FW. Some steps are similar to MSA, and therefore omitted. Here, when we do the bisection method, we do five interval reductions, so we are within $1/2^5 = 1/32$ of the correct $\lambda^*$ value. (When solving by computer, you would usually perform more steps than this, because the bisection calculations are very fast.)

Iteration 1. As before, we load everybody on the initial shortest path, so $x^1 = x^* = [50 \;\; 0]$ and $t^1 = [60 \;\; 20]$.

Iteration 2. As before, the relative gap is $\gamma = 2$. With the new shortest paths, $x^* = [0 \;\; 50]$. Begin the bisection method.

Bisection Iteration 1. Initially $\lambda^* \in [0, 1]$. Calculate $\frac{d\zeta}{d\lambda}(1/2) = (0 - 50) \times (10 + 25) + (50 - 0) \times (20 + 25) = 500 > 0$ so we discard the upper half.

Bisection Iteration 2. Now we know $\lambda^* \in [0, 1/2]$. Calculate $\frac{d\zeta}{d\lambda}(1/4) = (0 - 50) \times (10 + 37.5) + (50 - 0) \times (20 + 12.5) = -750 < 0$ so we discard the lower half.

Bisection Iteration 3. Now we know $\lambda^* \in [1/4, 1/2]$. Calculate $\frac{d\zeta}{d\lambda}(3/8) = (0 - 50) \times (10 + 31.25) + (50 - 0) \times (20 + 18.75) = -125 < 0$ so we discard the lower half.

Bisection Iteration 4. Now we know $\lambda^* \in [3/8, 1/2]$. Calculate $\frac{d\zeta}{d\lambda}(7/16) = (0 - 50) \times (10 + 28.125) + (50 - 0) \times (20 + 21.875) = 187.5 > 0$ so we discard the upper half.

Bisection Iteration 5. Now we know $\lambda^* \in [3/8, 7/16]$. Calculate $\frac{d\zeta}{d\lambda}(13/32) = (0 - 50) \times (10 + 29.6875) + (50 - 0) \times (20 + 20.3125) = 31.25 > 0$ so we discard the upper half.

From here we take the midpoint of the last interval [3/8, 13/32] to estimate $\lambda^* \approx 25/64 = 0.390625$, so $x^2 = (25/64)x^* + (39/64)x^1 = [30.47 \;\; 19.53]$ and $t^2 = [40.47 \;\; 39.53]$.

Iteration 3. The relative gap is calculated as $\gamma = 0.014$. (This is an order of magnitude smaller than the relative gap MSA found by this point.) The shortest paths are still $x^* = [0 \;\; 50]$, and we begin bisection.

Bisection Iteration 1. Initially $\lambda^* \in [0, 1]$. Calculate $\frac{d\zeta}{d\lambda}(1/2) = 900 > 0$ so we discard the upper half.

Bisection Iteration 2. Now we know $\lambda^* \in [0, 1/2]$. Calculate $\frac{d\zeta}{d\lambda}(1/4) = 435 > 0$ so we discard the upper half.

Bisection Iteration 3. Now we know $\lambda^* \in [0, 1/4]$. Calculate $\frac{d\zeta}{d\lambda}(1/8) = 203 > 0$ so we discard the upper half.

Bisection Iteration 4. Now we know $\lambda^* \in [0, 1/8]$. Calculate $\frac{d\zeta}{d\lambda}(1/16) = 87 > 0$ so we discard the upper half.

Bisection Iteration 5. Now we know $\lambda^* \in [0, 1/16]$. Calculate $\frac{d\zeta}{d\lambda}(1/32) = 29 > 0$ so we discard the upper half.

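The midpoint of the final interval is $\lambda^* \approx 1/64$, so $x_3 = \frac{1}{64} x^* + \frac{63}{64} x_2 = \begin{bmatrix} 29.99 & 20.01 \end{bmatrix}$ and $t_3 = \begin{bmatrix} 39.99 & 40.01 \end{bmatrix}$.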


Iteration 4. The relative gap is now $\gamma = 0.00014$, so we quit and claim we have found flows that are “good enough” (the difference in travel times between the routes is less than a second).
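The relative gap values quoted in this example can be checked with a short script. The sketch below assumes the relative gap is defined as the ratio of total system travel time to shortest-path travel time, minus one; this assumption reproduces the $\gamma = 2$ quoted at Iteration 2. The function name `relative_gap` is hypothetical, and the shortcut of taking the minimum link time works only because each of the two parallel links is itself a path.

```python
def relative_gap(x, t, demand):
    # Total system travel time at the current flows
    tstt = sum(xi * ti for xi, ti in zip(x, t))
    # Travel time if all demand used the current shortest link
    # (valid here only because each link is a complete path)
    sptt = demand * min(t)
    return tstt / sptt - 1

# Flows and times after Iteration 1 (gap checked at the start of Iteration 2)
gap_iter2 = relative_gap([50, 0], [60, 20], 50)

# Unrounded flows and times after the first line search
gap_iter3 = relative_gap([30.46875, 19.53125], [40.46875, 39.53125], 50)

print(gap_iter2, gap_iter3)
```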

Alternately, using calculus, we could have identified $\lambda^*$ during the second iteration as exactly 0.40, which would have found the exact equilibrium after only one step.
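One way to see where 0.40 comes from is to write the derivative as a function of $\lambda$ and set it to zero. The derivation below is a sketch that assumes the link performance functions $t_1 = 10 + x_1$ and $t_2 = 20 + x_2$ implied by the bisection terms above.

```latex
\begin{align*}
x(\lambda) &= \lambda x^* + (1 - \lambda) x_1
            = \begin{bmatrix} 50(1 - \lambda) & 50\lambda \end{bmatrix} \\
\frac{dz'}{d\lambda}(\lambda)
           &= (0 - 50)\bigl(10 + 50(1 - \lambda)\bigr) + (50 - 0)\bigl(20 + 50\lambda\bigr)
            = 5000\lambda - 2000 \\
\frac{dz'}{d\lambda}(\lambda^*) = 0
           &\iff \lambda^* = 0.4,
\qquad x_2 = \begin{bmatrix} 30 & 20 \end{bmatrix},
\quad t_2 = \begin{bmatrix} 40 & 40 \end{bmatrix}
\end{align*}
```

As a check, this linear expression gives 500 at $\lambda = 1/2$ and $-750$ at $\lambda = 1/4$, matching the first two bisection steps.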

5.2 Large network example

Here we apply FW to the network shown in Figure 2, using the same notation as in the MSA example.

Iteration 1. Path [1, 3] is shortest for OD pair (1,3), and path [2, 4] is shortest for OD pair (2,4), so

$x^* = \begin{bmatrix} 5000 & 0 & 0 & 0 & 0 & 0 & 10000 \end{bmatrix}$

and

$x_1 = x^* = \begin{bmatrix} 5000 & 0 & 0 & 0 & 0 & 0 & 10000 \end{bmatrix}$.

Recalculating the travel times, we have

$t_1 = \begin{bmatrix} 60 & 10 & 10 & 10 & 10 & 10 & 110 \end{bmatrix}$.

Iteration 2. With the new travel times, the shortest path for (1,3) is now [1, 5, 6, 3], and the new shortest path for (2,4) is [2, 5, 6, 4], so AEC = 63.33. If everyone were to take the new shortest paths, the flows would be

$x^* = \begin{bmatrix} 0 & 5000 & 15000 & 5000 & 10000 & 10000 & 0 \end{bmatrix}$.

Begin the bisection method to find the right combination of $x^*$ and $x_1$.

Bisection Iteration 1. Initially $\lambda^* \in [0, 1]$. Calculate $dz'/d\lambda(1/2) = (0 - 5000) \times (10 + 2500/100) + \cdots + (0 - 10000) \times (10 + 5000/100) = 2050000 > 0$, so we discard the upper half.

Bisection Iteration 2. Now we know $\lambda^* \in [0, 1/2]$. Calculate $dz'/d\lambda(1/4) = (0 - 5000) \times (10 + 3750/100) + \cdots + (0 - 10000) \times (10 + 7500/100) = 550000 > 0$, so we discard the upper half.

Bisection Iteration 3. Now we know $\lambda^* \in [0, 1/4]$. Calculate $dz'/d\lambda(1/8) = (0 - 5000) \times (10 + 4375/100) + \cdots + (0 - 10000) \times (10 + 8750/100) = -200000 < 0$, so we discard the lower half.

Bisection Iteration 4. Now we know $\lambda^* \in [1/8, 1/4]$. Calculate $dz'/d\lambda(3/16) = (0 - 5000) \times (10 + 4062.5/100) + \cdots + (0 - 10000) \times (10 + 8125/100) = 175000 > 0$, so we discard the upper half.

Bisection Iteration 5. Now we know $\lambda^* \in [1/8, 3/16]$. Calculate $dz'/d\lambda(5/32) = (0 - 5000) \times (10 + 4218.75/100) + \cdots + (0 - 10000) \times (10 + 8437.5/100) = -12500 < 0$, so we discard the lower half.
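The whole of Iteration 2 (all-or-nothing loading followed by five bisection steps) can be sketched as below. Several things here are assumptions rather than facts taken from Figure 2, which is not reproduced: the link ordering (1,3), (1,5), (5,6), (6,3), (2,5), (6,4), (2,4) is inferred from the flow vectors, each OD pair is assumed to have only the two paths named in the text, and every link is assumed to have travel time $10 + x/100$ as suggested by the derivative terms. All function and variable names are hypothetical.

```python
from fractions import Fraction

# Assumed link indices 0..6 = (1,3), (1,5), (5,6), (6,3), (2,5), (6,4), (2,4)
PATHS = {
    (1, 3): [[0], [1, 2, 3]],  # paths [1,3] and [1,5,6,3]
    (2, 4): [[6], [4, 2, 5]],  # paths [2,4] and [2,5,6,4]
}
DEMAND = {(1, 3): 5000, (2, 4): 10000}

def travel_times(x):
    # Assumed link performance function: t = 10 + x/100 on every link
    return [10 + Fraction(xa) / 100 for xa in x]

def all_or_nothing(t):
    # Put each OD pair's entire demand on its cheapest enumerated path
    x_star = [0] * len(t)
    for od, paths in PATHS.items():
        best = min(paths, key=lambda links: sum(t[a] for a in links))
        for a in best:
            x_star[a] += DEMAND[od]
    return x_star

def derivative(lam, x, x_star):
    # Sum over links of (x*_a - x_a) * t_a evaluated at lam*x* + (1 - lam)*x
    trial = [lam * s + (1 - lam) * c for s, c in zip(x_star, x)]
    t = travel_times(trial)
    return sum((s - c) * ta for s, c, ta in zip(x_star, x, t))

def bisection(x, x_star, reductions=5):
    lo, hi = Fraction(0), Fraction(1)
    for _ in range(reductions):
        mid = (lo + hi) / 2
        if derivative(mid, x, x_star) > 0:
            hi = mid  # discard the upper half
        else:
            lo = mid  # discard the lower half
    return (lo + hi) / 2

x1 = [5000, 0, 0, 0, 0, 0, 10000]
x_star = all_or_nothing(travel_times(x1))
lam = bisection(x1, x_star)
x2 = [float(lam * s + (1 - lam) * c) for s, c in zip(x_star, x1)]
print(x_star)
print(lam)
print([round(v) for v in x2])
```

Only the sign of the derivative matters to bisection, so any positive rescaling of it leads to the same sequence of intervals.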

The final interval is $\lambda^* \in [5/32, 3/16]$, so the estimate is $\lambda^* = 11/64$ and

$x_2 = (11/64) x^* + (53/64) x_1 = \begin{bmatrix} 4141 & 859 & 2578 & 859 & 1719 & 1719 & 8281 \end{bmatrix}$.

The new travel times are thus

$t_2 = \begin{bmatrix} 51.4 & 18.6 & 35.8 & 18.6 & 27.2 & 27.2 & 92.8 \end{bmatrix}$.

Iteration 3. With the new travel times, the shortest path for (1,3) is now [1, 3], but the shortest path for (2,4) is still [2, 5, 6, 4]. The average excess cost is AEC = 2.67 (roughly 30 times smaller than at the corresponding point in the MSA algorithm!). We have

$x^* = \begin{bmatrix} 5000 & 0 & 10000 & 0 & 10000 & 10000 & 0 \end{bmatrix}$.

We begin the bisection method to find the right combination of $x^*$ and $x_2$.


Bisection Iteration 1. Initially $\lambda^* \in [0, 1]$. Calculate $dz'/d\lambda(1/2) \approx 1274658 > 0$, so we discard the upper half.

Bisection Iteration 2. Now we know $\lambda^* \in [0, 1/2]$. Calculate $dz'/d\lambda(1/4) \approx 617065 > 0$, so we discard the upper half.

Bisection Iteration 3. Now we know $\lambda^* \in [0, 1/4]$. Calculate $dz'/d\lambda(1/8) \approx 288269 > 0$, so we discard the upper half.

Bisection Iteration 4. Now we know $\lambda^* \in [0, 1/8]$. Calculate $dz'/d\lambda(1/16) \approx 123871 > 0$, so we discard the upper half.

Bisection Iteration 5. Now we know $\lambda^* \in [0, 1/16]$. Calculate $dz'/d\lambda(1/32) \approx 41672 > 0$, so we discard the upper half.

The final interval is $\lambda^* \in [0, 1/32]$, so the estimate is $\lambda^* = 1/64$ and

$x_3 = (1/64) x^* + (63/64) x_2 = \begin{bmatrix} 4154 & 845 & 2694 & 845 & 1848 & 1848 & 8152 \end{bmatrix}$.

The new travel times are thus

$t_3 = \begin{bmatrix} 51.5 & 18.5 & 36.9 & 18.5 & 28.5 & 28.5 & 91.5 \end{bmatrix}$.

At this point, the average excess cost is around 1.56 min; note that FW is able to decrease the gap much faster than MSA. However, we’re still quite far from equilibrium if you compute the actual path travel times. In this case, even though we’re allowing the step size to vary for each iteration, we are forcing travelers from all OD pairs to shift in the same proportion. In reality, OD pairs farther from equilibrium should see bigger flow shifts, and OD pairs closer to equilibrium should see smaller ones. We’ll return to this issue later in the semester, but first we’ll introduce some other network models.
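The average excess cost figures quoted in this example can be reproduced from the rounded flow and travel time vectors as printed. The sketch below assumes AEC is total system travel time minus shortest-path travel time, divided by total demand, which matches the 63.33 quoted at Iteration 2. The link ordering and the two candidate paths per OD pair are inferred from the flow vectors rather than taken from Figure 2, and the function name is hypothetical.

```python
# Assumed link indices 0..6 = (1,3), (1,5), (5,6), (6,3), (2,5), (6,4), (2,4)
PATHS = {
    (1, 3): [[0], [1, 2, 3]],  # paths [1,3] and [1,5,6,3]
    (2, 4): [[6], [4, 2, 5]],  # paths [2,4] and [2,5,6,4]
}
DEMAND = {(1, 3): 5000, (2, 4): 10000}

def average_excess_cost(x, t):
    # Total system travel time at the current link flows
    tstt = sum(xa * ta for xa, ta in zip(x, t))
    # Travel time if every traveler used the current shortest path
    sptt = sum(
        d * min(sum(t[a] for a in links) for links in PATHS[od])
        for od, d in DEMAND.items()
    )
    return (tstt - sptt) / sum(DEMAND.values())

# Flow and time vectors as printed in the text (rounded values)
aec1 = average_excess_cost([5000, 0, 0, 0, 0, 0, 10000],
                           [60, 10, 10, 10, 10, 10, 110])
aec2 = average_excess_cost([4141, 859, 2578, 859, 1719, 1719, 8281],
                           [51.4, 18.6, 35.8, 18.6, 27.2, 27.2, 92.8])
aec3 = average_excess_cost([4154, 845, 2694, 845, 1848, 1848, 8152],
                           [51.5, 18.5, 36.9, 18.5, 28.5, 28.5, 91.5])
print(round(aec1, 2), round(aec2, 2), round(aec3, 2))
```

Because the inputs are rounded, the results are sensitive in the second decimal place; unrounded flows give slightly different values.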
