Software Multiagent Systems: Lecture 10

Milind Tambe, University of Southern California

Announcements

From now on, slides will be posted on our class web site

Password: teamcore

Homework answers will be sent out by email next week

DCOP Definition

Variables {x1,x2,…,xn} distributed among agents

Domains D1, D2, ..., Dn

Link functions fij: Di x Dj → N.

[Figure: constraint graph over the variables x1, x2, x3, x4]

Find assignment $A^*$ such that $F(A^*)$ is minimal, where

$F(A) = \sum_{x_i, x_j} f_{ij}(d_i, d_j)$, with $x_i \leftarrow d_i$ and $x_j \leftarrow d_j$ in $A$

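[Figure: three example assignments on the constraint graph x1, x2, x3, x4, with Cost = 0, Cost = 4, and Cost = 7. Cost table columns: di, dj, f(di,dj); the f values shown are 1, 2, 2, 0.]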
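The definition above can be made concrete with a small centralized sketch. The edge list and the assignment of the table values 1, 2, 2, 0 to value pairs are assumptions for illustration (the slide figure does not survive in text form); they were chosen because they reproduce the three costs shown, 0, 4, and 7. The names `EDGES`, `COST`, `total_cost`, and `brute_force_optimum` are hypothetical.

```python
from itertools import product

# ASSUMPTION: constraint edges and the pairing of table values to (di, dj)
# are not recoverable from the slide; these choices reproduce costs 0, 4, 7.
EDGES = [("x1", "x2"), ("x1", "x3"), ("x2", "x3"), ("x2", "x4")]
COST = {("b", "b"): 1, ("b", "w"): 2, ("w", "b"): 2, ("w", "w"): 0}


def total_cost(assignment):
    """F(A): sum of f_ij(d_i, d_j) over every constrained pair (x_i, x_j)."""
    return sum(COST[(assignment[i], assignment[j])] for i, j in EDGES)


def brute_force_optimum(variables=("x1", "x2", "x3", "x4"), domain=("b", "w")):
    """Centralized reference: enumerate all complete assignments, keep the cheapest."""
    candidates = (
        dict(zip(variables, values))
        for values in product(domain, repeat=len(variables))
    )
    return min(candidates, key=total_cost)


all_white = {"x1": "w", "x2": "w", "x3": "w", "x4": "w"}
all_black = {"x1": "b", "x2": "b", "x3": "b", "x4": "b"}
mixed = {"x1": "b", "x2": "b", "x3": "w", "x4": "w"}
```

Enumeration is exponential in the number of variables, which is why it only serves here as a ground truth for checking distributed algorithms on tiny instances.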

Branch and Bound Search

Familiar with branch and bound search?

Synchronous Branch and Bound (Hirayama97)

• Agents prioritized into chain

• Choose value, send partial solution (with cost) to child

• When cost exceeds upper bound, backtrack

• Agent explores all its values before reporting to parent

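[Figure: the chain x1, x2, x3, x4 explored step by step. The cost of the partial solution grows from 0 to 1 to 3, then reaches 4 = UB at x4. Cost table columns: di, dj, f(di,dj); the f values shown are 1, 2, 2, 0.]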
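The chain search described above can be sketched as a single-process simulation. This is not Hirayama's distributed implementation: recursion stands in for passing the partial solution down the chain, and the edges and cost table are assumed for illustration. The sketch prunes when the partial cost reaches the upper bound (not only when it exceeds it), since an equal-cost branch cannot improve the bound.

```python
# ASSUMPTION: edges and cost table are illustrative, not taken verbatim from the slide.
EDGES = [("x1", "x2"), ("x1", "x3"), ("x2", "x3"), ("x2", "x4")]
COST = {("b", "b"): 1, ("b", "w"): 2, ("w", "b"): 2, ("w", "w"): 0}
ORDER = ["x1", "x2", "x3", "x4"]  # priority chain, highest priority first
DOMAIN = ("b", "w")


def added_cost(var, value, partial):
    """Cost of constraints between `var` and variables already in the partial solution."""
    total = 0
    for i, j in EDGES:
        if i == var and j in partial:
            total += COST[(value, partial[j])]
        elif j == var and i in partial:
            total += COST[(partial[i], value)]
    return total


def sbb(depth=0, partial=None, cost=0, best=(float("inf"), None)):
    """Return (upper_bound, assignment) after exploring the chain below `depth`."""
    partial = {} if partial is None else partial
    if depth == len(ORDER):
        return cost, dict(partial)  # complete solution cheaper than the old bound
    var = ORDER[depth]
    for value in DOMAIN:  # the agent tries all its values before reporting upward
        new_cost = cost + added_cost(var, value, partial)
        if new_cost >= best[0]:
            continue  # backtrack: this branch cannot beat the upper bound
        partial[var] = value
        best = sbb(depth + 1, partial, new_cost, best)
        del partial[var]
    return best


upper_bound, solution = sbb()
```

Under these assumptions the first complete solution found is all "b", with partial costs 0, 1, 3, 4, so the first upper bound is 4; the search then tightens it to 0.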

DCOP before ADOPT

Branch and Bound

Backtrack condition - when cost exceeds upper bound

Problem – sequential, synchronous

Asynchronous Backtracking

Backtrack condition - when constraint unsatisfiable

Problem - only hard constraints allowed

Observation: Backtrack only when sub-optimality is proven

Can we backtrack without proving optimality?

Adopt: Idea #1

Weak backtracking: When lower bound gets too high

Why lower bounds?

Allows asynchrony!

Yet allows quality guarantees

Downside?

Backtrack before sub-optimality is proven

Can't throw away solutions; need to revisit!

Adopt: Idea #2

Solutions need revisiting

How could we do that?

Remember all previous solutions

Efficient reconstruction of abandoned solutions

Adopt Overview

Agents are ordered in a DFS TREE

Constraint graph need not be a tree

[Figure: agents x1, x2, x3, x4 ordered in a DFS tree]
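One way to obtain such an ordering is a depth-first traversal of the constraint graph, sketched below. The edge list is an assumption for illustration and `dfs_tree` is a hypothetical helper, not part of Adopt itself. A useful property of a DFS ordering is that every constraint connects an agent to one of its ancestors or descendants, never to an agent in a sibling subtree.

```python
# ASSUMPTION: illustrative constraint graph; the slide figure is not recoverable.
EDGES = [("x1", "x2"), ("x1", "x3"), ("x2", "x3"), ("x2", "x4")]


def dfs_tree(edges, root):
    """Return a parent map describing a DFS tree of the constraint graph."""
    neighbours = {}
    for i, j in edges:
        neighbours.setdefault(i, []).append(j)
        neighbours.setdefault(j, []).append(i)

    parent = {root: None}

    def visit(node):
        for nxt in neighbours[node]:
            if nxt not in parent:  # first visit fixes the tree parent
                parent[nxt] = node
                visit(nxt)

    visit(root)
    return parent


tree = dfs_tree(EDGES, "x1")
```

Here the graph contains the cycle x1, x2, x3, yet the resulting ordering is still a tree: the x1 to x3 constraint simply links x3 to a non-parent ancestor.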

Adopt Overview

Agents concurrently choose values

VALUE messages sent down

COST messages sent up only to parent

THRESHOLD messages sent down only to child

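[Figure: the constraint graph over x1, x2, x3, x4 with its cost table (columns di, dj, f(di,dj); f values 1, 2, 2, 0), shown beside the tree of x1, x2, x3, x4 annotated with VALUE messages, COST messages, and THRESH messages.]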
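The routing of the three message types can be sketched as follows. The tree and edge list are assumptions for illustration, and the helper names are hypothetical. VALUE recipients are taken to be the constraint-graph neighbours that lie below the sender in the tree, which is how the later slide phrases it ("send to all connected descendants").

```python
# ASSUMPTION: illustrative tree (child -> parent) and constraint edges.
PARENT = {"x1": None, "x2": "x1", "x3": "x2", "x4": "x2"}
EDGES = [("x1", "x2"), ("x1", "x3"), ("x2", "x3"), ("x2", "x4")]


def ancestors(agent):
    """All agents on the path from `agent` up to the root, nearest first."""
    chain = []
    while PARENT[agent] is not None:
        agent = PARENT[agent]
        chain.append(agent)
    return chain


def value_recipients(agent):
    """VALUE messages go down to connected descendants."""
    neighbours = {j if i == agent else i for i, j in EDGES if agent in (i, j)}
    return sorted(n for n in neighbours if agent in ancestors(n))


def cost_recipient(agent):
    """COST messages go up, only to the parent (None for the root)."""
    return PARENT[agent]


def threshold_recipients(agent):
    """THRESHOLD messages go down, only to direct children."""
    return sorted(child for child, parent in PARENT.items() if parent == agent)
```

Note that x1 sends VALUE to x3 even though x3 is not its child, while COST and THRESHOLD traffic stays on tree edges.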

Asynchronous, concurrent search

Each variable has two values: b and w

Each value is initialized with a lower bound of 0

Asynchronous, concurrent search

Concurrently choose values, send to descendants

[Figure: the optimal solution on the constraint graph x1, x2, x3, x4]

. . . Concurrently report local costs, with context, e.g. x3 sends cost 2 with x1=b, x2=b

[Figure: tree of x1, x2, x3, x4 with the reported costs 1, 2, and 1]

x1 switches to “better?” value

• x2, x3 switch to best value, report cost, with context

• x2 disregards x3’s report (context mismatch)

[Figure: tree of x1, x2, x3, x4 with the new reported costs 0 and 2]

Asynchronous, concurrent search

Algorithm:

Agents are prioritized into tree

Agents:

Initialize lower bounds of values to zero

Concurrently choose values, send to all connected descendants.

Choose the best value given what ancestors chose:

immediately send cost message to parent

Cost = lower bound + cost with ancestors

Costs asynchronously reach parent

Asynchronous costs: context attachment
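The agent steps above can be sketched for a single agent. This is a minimal illustration, not the full Adopt procedure: the cost table is an assumed pairing of the values 1, 2, 2, 0, and `choose_and_report` and `compatible` are hypothetical helper names. The context attached to a report is the set of ancestor values the cost was computed under, so a receiver can discard reports that no longer match its view.

```python
# ASSUMPTION: illustrative symmetric cost table.
COST = {("b", "b"): 1, ("b", "w"): 2, ("w", "b"): 2, ("w", "w"): 0}


def cost_with_ancestors(value, context, constrained_ancestors):
    """Local cost of `value` against the ancestor values recorded in `context`."""
    return sum(COST[(context[a], value)] for a in constrained_ancestors if a in context)


def choose_and_report(lower_bound, context, constrained_ancestors):
    """Pick the value minimising lower bound + cost with ancestors; build the COST report."""

    def estimate(value):
        return lower_bound[value] + cost_with_ancestors(value, context, constrained_ancestors)

    best = min(lower_bound, key=estimate)
    return best, {"cost": estimate(best), "context": dict(context)}


def compatible(report_context, current_view):
    """A report is usable only if it agrees with every value the receiver currently sees."""
    return all(current_view.get(var, val) == val for var, val in report_context.items())


# x3 with all lower bounds at zero, having heard x1=b and x2=b.
value, report = choose_and_report({"b": 0, "w": 0}, {"x1": "b", "x2": "b"}, ["x1", "x2"])
```

Under the assumed table this reproduces the slide's example of x3 sending cost 2 with context x1=b, x2=b. If x2 has since switched to w, `compatible` returns False for that report, which is the context mismatch case.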

Weak Backtracking

Suppose parent has two values, “white” and “black”

Explore “white” first: LB(w) = 0, LB(b) = 0

Receive cost msg: LB(w) = 2, LB(b) = 0

Now explore “black”: LB(w) = 2, LB(b) = 0

Receive cost msg: LB(w) = 2, LB(b) = 3

Go back to “white”: LB(w) = 2, LB(b) = 3

. . . .

Termination condition true: LB(w) = 10 = UB(w), LB(b) = 12
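The walk-through above can be replayed with a short sketch. It assumes a single parent whose children's reports arrive one at a time and always refer to the parent's current value; `weak_backtrack` is a hypothetical name and the rule shown (switch as soon as another value has a strictly smaller lower bound) is the threshold-free version of weak backtracking.

```python
def weak_backtrack(reports, values=("w", "b")):
    """Replay cost reports; return the sequence of explored values and final lower bounds."""
    lower_bound = {v: 0 for v in values}  # every value starts with lower bound 0
    current = values[0]
    trace = [current]
    for reported in reports:
        # A report can only raise the bound of the value it was computed for.
        lower_bound[current] = max(lower_bound[current], reported)
        best = min(values, key=lower_bound.get)
        if lower_bound[best] < lower_bound[current]:
            current = best  # abandon the current value without proving it sub-optimal
        trace.append(current)
    return trace, lower_bound


trace, bounds = weak_backtrack([2, 3])
```

The abandoned value keeps its lower bound, which is what lets the parent return to "white" later without having stored any of the solutions explored under it.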

Key Lemma for soundness/correctness

Lemma: Assuming no context change, an agent’s report of cost is non-decreasing and is never greater than the actual cost.

Inductive Proof Sketch: Leaf agents never overestimate cost. Each agent sums the costs from its children and chooses its best choice and reports to parent.

x2 receives costs from its children and computes a total cost of 2 + 1 + 2 = 5.

Instead of reporting 5, x2 switches to an unexplored value and reports that value's lower bound, 0.

Reporting 5 would be an OVERestimate!

Revisiting Abandoned Solutions

Problem

reconstructing from scratch: inefficient

remembering solutions: expensive

Solution

remember only lower bounds: polynomial space

use lower bounds to efficiently re-search

[Figure: chain ordering. The parent, with lower bound = 10, sends threshold = 10 to its single child.]


Suppose parent has two values, “a” and “b”

Explore “a” first: LB(a) = 10, LB(b) = 0

Now explore “b”

Return to “a”: LB(a) = 10, LB(b) = 11; the parent sends threshold = 10 to its single child

Backtrack Thresholds

agent i received threshold = 10 from parent

Explore “white” first: LB(w) = 0, LB(b) = 0, threshold = 10

Receive cost msg

Stick with “white”

Receive more cost msgs

Now try black

Key Point: Don’t change value until LB(current value) > threshold.
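The key point can be written as a one-line decision rule. The sketch below is an illustration with a hypothetical function name; the lower-bound numbers in the usage lines are made up to show the two cases and do not come from the slide.

```python
def next_value(current, lower_bound, threshold):
    """Keep the current value until its lower bound exceeds the backtrack threshold."""
    if lower_bound[current] <= threshold:
        return current  # still within the cost the parent already knows about
    return min(lower_bound, key=lower_bound.get)  # otherwise move to the cheapest bound


stay = next_value("w", {"w": 4, "b": 0}, threshold=10)
switch = next_value("w", {"w": 11, "b": 0}, threshold=10)
```

With a threshold of 0 the rule degenerates to switching as soon as any cost is reported, which is the thrashing behaviour that thresholds are meant to avoid.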

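Tree Ordering

[Figure: a parent with lower bound = 10 and multiple children. What threshold should each child receive (thresh = ?, thresh = ?)?]

Idea: Rebalance threshold

[Figure: rebalancing over time. Time T1: the parent sends thresh = 5 and thresh = 5 to its two children. Time T2: one child reports cost = 6. Time T3: the parent sends thresh = 4 and thresh = 6.]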
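The rebalancing idea can be sketched as a simple reallocation. This is a simplification under stated assumptions, not Adopt's full invariant maintenance: the total allotted to the children is kept fixed, the reporting child's threshold is raised to its reported cost, and the difference is taken from siblings. It ignores the siblings' own lower bounds, and the names `rebalance`, `c1`, and `c2` are hypothetical.

```python
def rebalance(thresholds, child, reported_cost):
    """Raise `child`'s threshold to its reported cost, taking the excess from siblings."""
    new = dict(thresholds)
    excess = reported_cost - new[child]
    if excess <= 0:
        return new  # the report fits inside the current allotment
    new[child] = reported_cost
    for sibling in new:
        if sibling == child or excess == 0:
            continue
        taken = min(excess, new[sibling])  # never push a sibling below zero
        new[sibling] -= taken
        excess -= taken
    # If excess is still positive here, the siblings could not absorb it and the
    # total has grown; a real agent would have to account for that separately.
    return new


at_t1 = {"c1": 5, "c2": 5}
at_t3 = rebalance(at_t1, "c1", reported_cost=6)
```

Which of the two children ends up with 6 at time T3 is an assumption here: the sketch gives it to the child that reported cost 6.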

Is Adopt completely distributed?

Evaluation of Speedups

Conclusions

• Adopt’s asynchrony and parallelism yield significant efficiency gains

• Sparse graphs (density 2) solved optimally, efficiently by Adopt.

Metric: Cycles

Cycle = one unit of algorithm progress in which all agents receive incoming messages, perform computation, and send outgoing messages

Independent of machine speed, network conditions, etc.
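A cycle counter of this kind can be sketched as a lock-step simulator. This is an illustrative harness with hypothetical names, not the one used in the evaluation: each agent is modelled as a function from its delivered messages to a list of (recipient, message) pairs, and the run ends when no messages are in flight.

```python
def run_cycles(agents, inboxes, max_cycles=1000):
    """Count cycles until quiescence. `inboxes` must have a key for every agent."""
    cycles = 0
    while any(inboxes.values()) and cycles < max_cycles:
        # Everything sent in the previous cycle is delivered at once.
        delivered, inboxes = inboxes, {name: [] for name in agents}
        for name, step in agents.items():
            for recipient, message in step(delivered[name]):
                inboxes[recipient].append(message)
        cycles += 1
    return cycles


def forward_to(next_agent):
    """Agent behaviour for the demo: pass every received message along."""
    return lambda messages: [(next_agent, m) for m in messages]


agents = {"a": forward_to("b"), "b": forward_to("c"), "c": lambda messages: []}
cycles = run_cycles(agents, {"a": ["token"], "b": [], "c": []})
```

Because every agent's work within a cycle counts as one unit regardless of how much computation it does, the count does not depend on machine speed or network conditions.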


Number of Messages

Conclusion

• Communication grows linearly

• Only local communication (no broadcast)

Is optimality a good goal to reach for?

Bounded error approximation

Motivation: Quality control for approximate solutions

Problem: User provides error bound b

Goal: Find any solution S where

cost(S) ≤ cost(optimal soln) + b

[Figure: the root, with lower bound = 10, uses thresh = 10 + b.]

• Adopt’s ability to provide quality guarantees naturally leads to bounded error approximation!
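Why the guarantee follows can be sketched in a few lines. The sketch assumes the root stops once the upper bound on its current solution is no greater than its threshold, lower bound + b; since the lower bound never exceeds the optimal cost, any solution accepted this way costs at most optimal + b. The function names are hypothetical.

```python
def root_threshold(lower_bound, b):
    """Root's backtrack threshold when the user tolerates an error of b."""
    return lower_bound + b


def root_can_stop(upper_bound, lower_bound, b):
    """Stop when the current solution is provably within b of the lower bound."""
    return upper_bound <= root_threshold(lower_bound, b)


def within_bound(cost, optimal_cost, b):
    """The user-facing guarantee: cost(S) <= cost(optimal soln) + b."""
    return cost <= optimal_cost + b


threshold = root_threshold(lower_bound=10, b=2)
```

With b = 0 the stopping test reduces to lower bound = upper bound, which is the optimal termination condition.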

Evaluation of Bounded Error

Conclusion

• Varying b is an effective method for doing time-to-solution/solution-quality tradeoffs.

Adopt summary – Key Ideas

First-ever optimal, asynchronous algorithm for DCOP

polynomial space at each agent

Weak Backtracking

lower bound based search method

Parallel search in independent subtrees

Efficient reconstruction of abandoned solutions

backtrack thresholds to control backtracking

Bounded error approximation

sub-optimal solutions faster

bound on worst-case performance

Discussion

Can we improve Adopt efficiency?

Can we allow n-ary constraints in Adopt?

Does Adopt preserve privacy?

What are some key applications of Adopt?

New Ideas for Efficiency

Communication Structure

Idea: Reach a solution faster if end-to-end messaging is shorter

Application: Shorter depth trees in ADOPT

Intelligent Preprocessing of Bounds

PASSUP heuristic: bounds via one-time message up the tree

PASSUP extended via a framework of several preprocessing heuristics

Performance (EAV)

Orders of Magnitude Speedup!

OptAPO 2004

OPTAPO

[Figure: Cycles (axis 0 to 5000) versus number of Variables (8, 12, 16, 20), with curves for Adopt and OptAPO.]

Performance

• J. Davin, P. J. Modi, "Impact of Problem Centralization in Distributed Constraint Optimization Algorithms," Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), 2005.

Defining DCOP Centralization

Centralization: Aggregating problem information into a single agent

information was initially distributed among multiple agents, and

aggregation results in a larger local search space.

For example, constraints on external variables can be centralized.

Motivation

Adopt and OptAPO:

Adopt does no centralization.

OptAPO does partial centralization.

OptAPO completes in fewer cycles than Adopt for graph coloring

But cycles do not capture performance differences when algorithms use different levels of centralization.

Metric: Cycles

What is missing in measuring cycles?


Key Questions

How do we measure performance of DCOP algorithms that differ in their level of centralization?

How do Adopt and OptAPO compare when we use such a measure?

Results

Tested on graph coloring problems, |D|=3 (3-coloring).

# Variables = 8, 12, 16, 20, with link density = 2n or 3n.

50 randomly generated problems for each size.

OptAPO takes fewer cycles, but more constraint checks.

[Figure, two panels. Cycles: Cycles (axis 0 to 5000) versus number of Variables (8, 12, 16, 20), with curves for Adopt and OptAPO. CCC: CCC (axis 0 to 90000) versus number of Variables (8, 12, 16, 20), with curves for Adopt and OptAPO.]
