Software Multiagent Systems: Lecture 10
Milind Tambe
University of Southern California
[email protected]
Announcements
From now on, slides posted on our class web site
Password: teamcore
Homework answers will be sent out by email next week
DCOP Definition
Variables {x1,x2,…,xn} distributed among agents
Domains D1, D2, ..., Dn
Link functions fij: Di x Dj → N.
[Figure: constraint graph over x1, x2, x3, x4]
Find assignment A* such that F(A*) is minimal, where
$F(A) = \sum_{x_i, x_j} f_{ij}(d_i, d_j)$, with $x_i \leftarrow d_i$, $x_j \leftarrow d_j$ in $A$
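[Figure: three example assignments on the constraint graph, with Cost = 0, Cost = 4, and Cost = 7]
[Table: f(di, dj) for the four value pairs (di, dj); costs 1, 2, 2, 0]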
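The DCOP objective can be made concrete with a small sketch. The edge list and the mapping of the costs 1, 2, 2, 0 onto value pairs are assumptions (the figure and the value columns of the table did not survive in the text); they were chosen so that the example costs 0, 4, and 7 are reproducible. The helper names `total_cost` and `brute_force_optimum` are hypothetical.

```python
from itertools import product

# ASSUMPTION: edges of the constraint graph (not recoverable from the slide text).
EDGES = [("x1", "x2"), ("x1", "x3"), ("x2", "x3"), ("x2", "x4")]

# ASSUMPTION: which value pair carries which of the costs 1, 2, 2, 0.
COST = {("b", "b"): 1, ("b", "w"): 2, ("w", "b"): 2, ("w", "w"): 0}

VARIABLES = ["x1", "x2", "x3", "x4"]
DOMAIN = ["b", "w"]


def total_cost(assignment):
    """F(A): sum of the link costs f_ij(d_i, d_j) over all constrained pairs."""
    return sum(COST[(assignment[i], assignment[j])] for i, j in EDGES)


def brute_force_optimum(variables, domain):
    """Centralized reference: enumerate every complete assignment, keep the cheapest."""
    best = None
    for values in product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        cost = total_cost(assignment)
        if best is None or cost < best[0]:
            best = (cost, assignment)
    return best


best_cost, best_assignment = brute_force_optimum(VARIABLES, DOMAIN)
```

Enumeration is exponential in the number of variables and fully centralized, so it only serves as a reference point for the distributed algorithms that follow.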
Branch and Bound Search
Familiar with branch and bound search?
Synchronous Branch and Bound (Hirayama97)
• Agents prioritized into chain
• Choose value, send partial solution (with cost) to child
• When cost exceeds upper bound, backtrack
• Agent explores all its values before reporting to parent
[Figure: synchronous branch and bound on the chain x1, x2, x3, x4; partial costs 0, 1, 3, then 4 = UB; cost table f(di, dj) with costs 1, 2, 2, 0]
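A minimal sequential sketch of the chain search described above, not the distributed protocol itself: recursion stands in for passing the partial solution to the child. The edge list and cost mapping are assumptions (they are not recoverable from the slide text), and `added_cost` and `branch_and_bound` are hypothetical names. The sketch prunes as soon as the partial cost reaches the upper bound, since such a branch cannot improve on the best solution found so far.

```python
# ASSUMPTION: constraint edges and cost mapping, not recoverable from the slides.
EDGES = [("x1", "x2"), ("x1", "x3"), ("x2", "x3"), ("x2", "x4")]
COST = {("b", "b"): 1, ("b", "w"): 2, ("w", "b"): 2, ("w", "w"): 0}
ORDER = ["x1", "x2", "x3", "x4"]  # priority chain, highest priority first
DOMAIN = ["b", "w"]


def added_cost(var, value, partial):
    """Cost of the constraints between var and the already-assigned variables."""
    total = 0
    for i, j in EDGES:
        if i == var and j in partial:
            total += COST[(value, partial[j])]
        elif j == var and i in partial:
            total += COST[(partial[i], value)]
    return total


def branch_and_bound(depth=0, partial=None, cost=0, best=None):
    """Depth-first search down the chain, backtracking at the upper bound."""
    partial = {} if partial is None else partial
    best = {"ub": float("inf"), "assignment": None} if best is None else best
    if depth == len(ORDER):
        # Only reached with cost < best["ub"], so this is always an improvement.
        best["ub"], best["assignment"] = cost, dict(partial)
        return best
    var = ORDER[depth]
    for value in DOMAIN:  # the agent explores all of its values
        new_cost = cost + added_cost(var, value, partial)
        if new_cost >= best["ub"]:
            continue  # backtrack: this branch cannot beat the upper bound
        partial[var] = value
        branch_and_bound(depth + 1, partial, new_cost, best)
        del partial[var]
    return best


result = branch_and_bound()
```

Under these assumptions the first descent assigns b everywhere and accumulates partial costs 0, 1, 3, 4, which sets the first upper bound to 4.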
DCOP before ADOPT
Branch and Bound
Backtrack condition - when cost exceeds upper bound
Problem – sequential, synchronous
Asynchronous Backtracking
Backtrack condition - when constraint unsatisfiable
Problem - only hard constraints allowed
Observation: Both backtrack only when sub-optimality is proven
Can we backtrack without proving sub-optimality?
Adopt: Idea #1
Weak backtracking: When lower bound gets too high
Why lower bounds?
Allows asynchrony!
Yet allows quality guarantees
Downside?
Backtrack before sub-optimality is proven
Can't throw away solutions; need to revisit!
Adopt: Idea #2
Solutions need revisiting
How could we do that?
Remember all previous solutions
Efficient reconstruction of abandoned solutions
Adopt Overview
Agents are ordered in a DFS TREE
Constraint graph need not be a tree
[Figure: DFS tree over x1, x2, x3, x4]
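One way to obtain such an ordering is a depth-first traversal of the constraint graph. The sketch below is illustrative only: the edge list is an assumption, and `dfs_tree` and `is_ancestor` are hypothetical helpers. The property that makes a DFS tree useful here is that every constraint links an ancestor and a descendant, so no constraint crosses between separate subtrees.

```python
# ASSUMPTION: constraint edges, not recoverable from the slide text.
EDGES = [("x1", "x2"), ("x1", "x3"), ("x2", "x3"), ("x2", "x4")]


def dfs_tree(edges, root):
    """Return a parent map {node: parent} for a DFS tree rooted at root."""
    neighbors = {}
    for i, j in edges:
        neighbors.setdefault(i, []).append(j)
        neighbors.setdefault(j, []).append(i)
    parent = {root: None}

    def visit(node):
        for nxt in neighbors[node]:
            if nxt not in parent:
                parent[nxt] = node
                visit(nxt)

    visit(root)
    return parent


def is_ancestor(parent, a, d):
    """True if a lies on the path from d up to the root (excluding d itself)."""
    node = parent[d]
    while node is not None:
        if node == a:
            return True
        node = parent[node]
    return False


tree = dfs_tree(EDGES, "x1")
# Every constraint should connect an ancestor/descendant pair.
no_cross_edges = all(
    is_ancestor(tree, i, j) or is_ancestor(tree, j, i) for i, j in EDGES
)
```

With the assumed edges, x3 and x4 end up as siblings under x2, and the x1-x3 constraint is a non-tree edge between an ancestor and a descendant.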
Adopt Overview
Agents concurrently choose values
VALUE messages sent down
COST messages sent up only to parent
THRESHOLD messages sent down only to child
[Figure: constraint graph over x1, x2, x3, x4 with cost table f(di, dj) (costs 1, 2, 2, 0), and the corresponding tree annotated with VALUE, COST, and THRESH messages]
Asynchronous, concurrent search
Each variable has two values: b and w
Each value is initialized with a lower bound of 0
[Figure: tree over x1, x2, x3, x4; cost table f(di, dj) with costs 1, 2, 2, 0]
Asynchronous, concurrent search
Concurrently choose values, send to descendants
Optimal Solution
. . .
Concurrently report local costs, with context
e.g. x3 sends cost 2 with context x1=b, x2=b
[Figure: children report local costs up the tree]
x1 switches to “better?” value
• x2, x3 switch to best value, report cost, with context
• x2 disregards x3’s report (context mismatch)
Asynchronous, concurrent search
Algorithm:
Agents are prioritized into a tree
Agents:
Initialize lower bounds of values to zero
Concurrently choose values, send to all connected descendants
Choose the best value given what ancestors chose:
immediately send cost message to parent
Cost = lower bound + cost with ancestors
Costs asynchronously reach parent
Asynchronous costs: context attachment
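Context attachment can be sketched as follows. Because costs arrive asynchronously, a report is only meaningful for the ancestor values it was computed under, so the receiver compares the attached context with its own current view before using the cost. The `Agent` class, its fields, and `compatible` are hypothetical names for illustration and are not taken from the slides.

```python
def compatible(context, view):
    """A context is compatible if it agrees with every value the receiver knows."""
    return all(view.get(var, val) == val for var, val in context.items())


class Agent:
    def __init__(self, name, value):
        self.name = name
        self.value = value        # this agent's currently chosen value
        self.current_view = {}    # latest known values of ancestors
        self.child_lb = {}        # (own value, child) -> reported lower bound

    def receive_value(self, ancestor, value):
        """VALUE message from an ancestor: update the current view."""
        self.current_view[ancestor] = value

    def receive_cost(self, child, cost, context):
        """COST message from a child: keep it only if the context still holds."""
        view = {**self.current_view, self.name: self.value}
        if not compatible(context, view):
            return False  # stale report: context mismatch, disregard
        self.child_lb[(self.value, child)] = cost
        return True


# x3 reports cost 2 under context x1=b, x2=b.
x2 = Agent("x2", "b")
x2.receive_value("x1", "b")
accepted = x2.receive_cost("x3", 2, {"x1": "b", "x2": "b"})

# x2 switches to w; the same report no longer matches and is disregarded.
x2.value = "w"
rejected = not x2.receive_cost("x3", 2, {"x1": "b", "x2": "b"})
```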
Weak Backtracking
Suppose parent has two values, “white” and “black”
Explore “white” first: LB(w) = 0, LB(b) = 0
Receive cost msg: LB(w) = 2, LB(b) = 0
Now explore “black”: LB(w) = 2, LB(b) = 0
Receive cost msg: LB(w) = 2, LB(b) = 3
Go back to “white”: LB(w) = 2, LB(b) = 3
. . . .
Termination condition true: LB(w) = 10 = UB(w), LB(b) = 12
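The white/black sequence above can be replayed with a short sketch of the weak-backtracking rule: stay with the current value unless some other value has a strictly smaller lower bound. The function name `choose_value` is hypothetical, and thresholds are deliberately left out at this stage.

```python
def choose_value(lower_bounds, current):
    """Weak backtracking: switch only when another value has a smaller lower bound."""
    best = min(lower_bounds, key=lower_bounds.get)
    if lower_bounds[current] <= lower_bounds[best]:
        return current
    return best


lb = {"w": 0, "b": 0}
trace = []

current = choose_value(lb, "w")      # explore white first
trace.append(current)

lb["w"] = 2                          # cost message raises LB(w)
current = choose_value(lb, current)  # now explore black
trace.append(current)

lb["b"] = 3                          # cost message raises LB(b)
current = choose_value(lb, current)  # go back to white
trace.append(current)
```

The switch to black happens before white has been proven sub-optimal, which is why white must remain revisitable.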
Key Lemma for soundness/correctness
Lemma: Assuming no context change, an agent’s report of cost is non-decreasing and is never greater than the actual cost.
Inductive Proof Sketch: Leaf agents never overestimate cost. Each agent sums the costs from its children and chooses its best choice and reports to parent.
x2 receives costs from children, computes total cost of 2 + 1 + 2 = 5.
5 is an OVERestimate!
Instead, x2 switches to unexplored value, reports lower bound.
Revisiting Abandoned Solutions
Problem
reconstructing from scratch: inefficient
remembering solutions: expensive
Solution
remember only lower bounds: polynomial space
use lower bounds to efficiently re-search
[Figure: Chain Ordering. Parent with lower bound = 10 sends threshold = 10 to its single child]
Suppose parent has two values, “a” and “b”, and a single child
Explore “a” first: LB(a) = 10, LB(b) = 0
Now explore “b”
Return to “a”: threshold = 10, LB(a) = 10, LB(b) = 11
Backtrack Thresholds
agent i received threshold = 10 from parent
Explore “white” first: LB(w) = 0, LB(b) = 0, threshold = 10
Receive cost msg: LB(w) = 2, LB(b) = 0, threshold = 10
Stick with “white”: LB(w) = 2, LB(b) = 0, threshold = 10
Receive more cost msgs: LB(w) = 11, LB(b) = 0, threshold = 10
Now try black: LB(w) = 11, LB(b) = 0, threshold = 10
Key Point: Don’t change value until LB(current value) > threshold.
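The key point can be expressed as a small decision rule and replayed on the numbers above. This is a sketch only; `next_value` is a hypothetical name.

```python
def next_value(lower_bounds, current, threshold):
    """Keep the current value until its lower bound exceeds the threshold."""
    if lower_bounds[current] <= threshold:
        return current
    # Only now switch, to the value with the smallest lower bound.
    return min(lower_bounds, key=lower_bounds.get)


threshold = 10
lb = {"w": 0, "b": 0}
steps = []

current = "w"                                   # explore white first
lb["w"] = 2                                     # receive cost msg
current = next_value(lb, current, threshold)    # 2 <= 10: stick with white
steps.append(current)

lb["w"] = 11                                    # receive more cost msgs
current = next_value(lb, current, threshold)    # 11 > 10: now try black
steps.append(current)
```

Compared with switching as soon as another value has a smaller lower bound, the threshold keeps the agent from abandoning white at LB(w) = 2, which is what makes re-search of a previously explored solution efficient.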
Tree Ordering
Parent with lower bound = 10 and multiple children: thresh = ? for each child
Idea: Rebalance threshold
[Figure: Time T1: parent sends thresh = 5 and thresh = 5 to its two children. Time T2: a child reports cost = 6. Time T3: thresholds become thresh = 4 and thresh = 6]
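A rough sketch of rebalancing, under two assumptions that are not stated on the slide: the children's thresholds must sum to a fixed budget, and no child's threshold may fall below the cost that child has reported. The function `rebalance`, the child names `c1` and `c2`, and the choice of which child reported cost 6 are all hypothetical.

```python
def rebalance(thresholds, child_lb, budget):
    """Raise each child's threshold to its reported lower bound, then take the
    excess back from children that still have slack, so the sum equals budget."""
    new = {c: max(t, child_lb.get(c, 0)) for c, t in thresholds.items()}
    excess = sum(new.values()) - budget
    for c in new:
        if excess <= 0:
            break
        slack = new[c] - child_lb.get(c, 0)
        take = min(slack, excess)
        new[c] -= take
        excess -= take
    # If excess is still positive, the budget itself is too low; in that case
    # the parent's own threshold would have to grow (not modeled here).
    return new


# Time T1: budget 10 split evenly. Time T2: c2 reports cost 6.
t1 = {"c1": 5, "c2": 5}
t3 = rebalance(t1, {"c2": 6}, budget=10)
```

Under these assumptions the result matches the thresholds 4 and 6 shown at time T3.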
Is Adopt completely distributed?
Evaluation of Speedups
Conclusions
• Adopt’s asynchrony and parallelism yield significant efficiency gains
• Sparse graphs (density 2) solved optimally, efficiently by Adopt.
Metric: Cycles
Cycle = one unit of algorithm progress in which all agents receive incoming messages, perform computation, and send outgoing messages
Independent of machine speed, network conditions, etc.
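To illustrate how cycles are counted, here is a toy synchronous simulator. It is a sketch and not the evaluation harness behind the reported results; `run_cycles`, `forward`, and the token-passing example are hypothetical. Messages sent during a cycle are delivered only at the start of the next one.

```python
def run_cycles(agents, initial_messages, max_cycles=1000):
    """agents maps name -> handler(name, inbox) returning [(recipient, payload)].
    Returns the number of cycles until no messages remain in flight."""
    inboxes = {name: [] for name in agents}
    for recipient, payload in initial_messages:
        inboxes[recipient].append(payload)
    cycles = 0
    while any(inboxes.values()) and cycles < max_cycles:
        outgoing = []
        for name, handler in agents.items():
            # Receive incoming messages, compute, collect outgoing messages.
            inbox, inboxes[name] = inboxes[name], []
            outgoing.extend(handler(name, inbox))
        for recipient, payload in outgoing:
            inboxes[recipient].append(payload)  # delivered next cycle
        cycles += 1
    return cycles


CHAIN = ["x1", "x2", "x3", "x4"]


def forward(name, inbox):
    """Toy behaviour: pass every received message on to the next agent."""
    idx = CHAIN.index(name)
    if idx + 1 == len(CHAIN):
        return []
    return [(CHAIN[idx + 1], msg) for msg in inbox]


cycles = run_cycles({name: forward for name in CHAIN}, [("x1", "token")])
```

The count depends only on the message pattern, not on how long each handler takes to run, which is the sense in which the metric is independent of machine speed.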
Number of Messages
Conclusion
• Communication grows linearly
• Only local communication (no broadcast)
Is optimality a good goal to reach for?
Bounded error approximation
Motivation: Quality control for approximate solutions
Problem: User provides error bound b
Goal: Find any solution S where $\mathrm{cost}(S) \le \mathrm{cost}(\text{optimal soln}) + b$
[Figure: root with lower bound = 10 sets thresh = 10 + b]
• Adopt’s ability to provide quality guarantees naturally leads to bounded error approximation!
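The guarantee follows because a lower bound never exceeds the optimal cost: any solution whose cost is at most lower bound + b is also within b of optimal. A minimal sketch, in which the function names and the exact stopping test are assumptions rather than the algorithm's specification:

```python
def within_error_bound(solution_cost, optimal_cost, b):
    """The user's goal: cost(S) <= cost(optimal soln) + b."""
    return solution_cost <= optimal_cost + b


def root_threshold(lower_bound, b):
    """Root relaxes its threshold by the user-supplied error bound b."""
    return lower_bound + b


def root_can_stop(upper_bound, lower_bound, b):
    """ASSUMED stopping test: a solution of cost upper_bound is known and it
    does not exceed the relaxed threshold. Since lower_bound <= optimal cost,
    this implies upper_bound <= optimal cost + b."""
    return upper_bound <= root_threshold(lower_bound, b)


threshold = root_threshold(10, 3)
stop_early = root_can_stop(upper_bound=12, lower_bound=10, b=3)
keep_searching = not root_can_stop(upper_bound=14, lower_bound=10, b=3)
```

Setting b = 0 recovers the optimal case, where the root stops only when its lower and upper bounds meet.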
Evaluation of Bounded Error
Conclusion
• Varying b is an effective method for doing time-to-solution/solution-quality tradeoffs.
Adopt summary – Key Ideas
First-ever optimal, asynchronous algorithm for DCOP
polynomial space at each agent
Weak Backtracking
lower bound based search method
Parallel search in independent subtrees
Efficient reconstruction of abandoned solutions
backtrack thresholds to control backtracking
Bounded error approximation
sub-optimal solutions faster
bound on worst-case performance
Discussion
Can we improve Adopt efficiency?
Can we allow n-ary constraints in Adopt?
Does Adopt preserve privacy?
What are some key applications of Adopt?
New Ideas for Efficiency
Communication Structure
Idea: Reach a solution faster if end-to-end messaging is shorter
Application: Shorter depth trees in ADOPT
Intelligent Preprocessing of Bounds
PASSUP heuristic: bounds via one-time message up the tree
PASSUP extended via a framework of several preprocessing heuristics
Performance (EAV)
Orders of Magnitude Speedup!
OptAPO 2004
[Figure: Cycles (axis 0 to 5000) vs. Variables (8, 12, 16, 20) for Adopt and OptAPO]
Performance
• J. Davin, P. J. Modi, "Impact of Problem Centralization in Distributed Constraint Optimization Algorithms," Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), 2005.
Defining DCOP Centralization
Centralization: Aggregating problem information into a single agent, where
the information was initially distributed among multiple agents, and
the aggregation results in a larger local search space.
For example, constraints on external variables can be centralized.
Motivation
Adopt and OptAPO:
Adopt does no centralization.
OptAPO does partial centralization.
OptAPO completes in fewer cycles than Adopt for graph coloring
But cycles do not capture performance differences when algorithms use different levels of centralization.
Metric: Cycles
What is missing in measuring cycles?
Key Questions
How do we measure performance of DCOP algorithms that differ in their level of centralization?
How do Adopt and OptAPO compare when we use such a measure?
Results
Tested on graph coloring problems, |D|=3 (3-coloring).
# Variables = 8, 12, 16, 20, with link density = 2n or 3n.
50 randomly generated problems for each size.
OptAPO takes fewer cycles, but more constraint checks.
[Figure, two panels comparing Adopt and OptAPO over 8, 12, 16, 20 variables. Cycles: axis 0 to 5000. CCC: axis 0 to 90000]