CS584 - Software Multiagent Systems
Lecture 12
Distributed constraint optimization II: Incomplete algorithms and recent theoretical results
University of Southern California
Example applications: personal assistant agents for scheduling, sensor networks, multi-spacecraft coordination.
Distributed Constraint Optimization
• DCOP (Distributed Constraint Optimization Problem): agents cooperate to assign values to variables, subject to constraints, to maximize a global objective.
– R(a) = ∑ R_S(a), summed over all constraints S
Algorithms for DCOP
• Complete algorithms
– Pro: finds the optimal solution
– Con: not always feasible (exponential in time or space)
• Local algorithms (today)
– Pro: usually finds a high-quality solution quickly
– Con: not optimal (but can guarantee quality within a percentage of the optimum)
k-optimality
• k-optimal solution (k-optimum)
– No deviation by ≤ k agents can increase solution quality
– A local optimum
• Example:
– Globally optimal solution: 000
– 1-optimal solutions: 000, 111
• k-optimal algorithms: DSA (k=1), MGM-2 (k=2), ...
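The definition above can be checked by brute force on a small example. A minimal sketch on a hypothetical 3-variable chain whose reward tables are invented so that, as on the slide, 000 is globally optimal while both 000 and 111 are 1-optima:

```python
import itertools

# Hypothetical chain x1 - x2 - x3, values {0, 1}; edge rewards are invented
# so that 000 is the global optimum and 111 is a second 1-optimum.
EDGES = {(0, 1): {(0, 0): 5, (1, 1): 4}, (1, 2): {(0, 0): 5, (1, 1): 4}}

def reward(assign):
    # total team reward R(a) = sum of rewards over all constraints
    return sum(t.get((assign[i], assign[j]), 0) for (i, j), t in EDGES.items())

def is_k_optimal(assign, k):
    # k-optimal: no deviation by <= k agents increases solution quality
    base = reward(assign)
    for other in itertools.product((0, 1), repeat=len(assign)):
        deviators = sum(a != b for a, b in zip(assign, other))
        if 1 <= deviators <= k and reward(other) > base:
            return False
    return True

all_assigns = list(itertools.product((0, 1), repeat=3))
one_optima = [a for a in all_assigns if is_k_optimal(a, 1)]
# → [(0, 0, 0), (1, 1, 1)]: both are 1-optima, but only 000 is global
```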
Approach
• Decompose the DCOP.
• Each agent sees only its local constraints.
• Agents maximize individual utility
– (an individual view of team utility)
1-optimal algorithms
• Algorithm (for one agent):
– Broadcast current value to all neighbors and receive neighbors’ values.
– Find the new value that gives the highest gain in utility, assuming neighbors stay fixed.
– Decide whether or not to change value, and act accordingly.
(Figure: example on a chain x1–x2–x3 with link rewards +5 and +8; all variables start at value 0.)
MGM – Maximal Gain Message
• Monotonic algorithm, but can get stuck in a local optimum!
• Only one agent in a neighborhood moves at a time.
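One MGM round can be sketched in a few lines. This is a minimal sketch on a hypothetical 3-variable chain (reward tables invented for illustration; tie-breaking by agent ID is omitted): each agent computes its best unilateral gain, and only an agent whose gain strictly beats all its neighbors' gains moves.

```python
# Hypothetical chain x1 - x2 - x3, binary values, invented reward tables.
R = {(1, 2): {(0, 0): 5, (0, 1): 0, (1, 0): 0, (1, 1): 8},
     (2, 3): {(0, 0): 5, (0, 1): 0, (1, 0): 0, (1, 1): 8}}
neighbors = {1: [2], 2: [1, 3], 3: [2]}
values = {1: 0, 2: 0, 3: 0}

def local_reward(i, v, vals):
    """Sum of constraints on agent i if it takes value v, neighbors fixed."""
    total = 0
    for (a, b), table in R.items():
        if a == i:
            total += table[(v, vals[b])]
        elif b == i:
            total += table[(vals[a], v)]
    return total

def mgm_round(vals):
    # 1. each agent finds its best unilateral move and the resulting gain
    best = {}
    for i in vals:
        cur = local_reward(i, vals[i], vals)
        best[i] = max(((v, local_reward(i, v, vals) - cur) for v in (0, 1)),
                      key=lambda t: t[1])
    # 2. an agent moves only if its gain strictly beats every neighbor's gain
    new = dict(vals)
    for i in vals:
        v, g = best[i]
        if g > 0 and all(g > best[j][1] for j in neighbors[i]):
            new[i] = v
    return new

assert mgm_round(values) == values  # stuck: 0 0 0 is a 1-optimum here
```

From 1 0 1, only x2 (gain 16) beats its neighbors (gain 5 each), so a single round reaches 1 1 1, and solution quality never decreases.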
(Figure: the same chain x1–x2–x3 with per-agent gains shown.)
DSA – Distributed Stochastic Algorithm
• One possible path (say p = 0.5):
(Figure: starting from 0 0 0, each round every agent draws a random number, e.g. .1/.2/.3 and then .6/.1/.7, and moves only if its draw is below p.)
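A minimal DSA sketch on a hypothetical 3-variable chain (reward tables invented for illustration): every agent simultaneously computes its best move against the current view and takes it with probability p. Because moves are parallel, agents can overshoot each other, which is why DSA is not monotonic.

```python
import random

# Hypothetical chain x1 - x2 - x3, binary values, invented reward tables.
R = {(1, 2): {(0, 0): 5, (0, 1): 0, (1, 0): 0, (1, 1): 8},
     (2, 3): {(0, 0): 5, (0, 1): 0, (1, 0): 0, (1, 1): 8}}

def local_reward(i, v, vals):
    """Sum of constraints on agent i if it takes value v, neighbors fixed."""
    total = 0
    for (a, b), table in R.items():
        if a == i:
            total += table[(v, vals[b])]
        elif b == i:
            total += table[(vals[a], v)]
    return total

def dsa_round(vals, p, rng):
    # all agents decide simultaneously from the same snapshot of values
    new = dict(vals)
    for i in vals:
        cur = local_reward(i, vals[i], vals)
        v, g = max(((v, local_reward(i, v, vals) - cur) for v in (0, 1)),
                   key=lambda t: t[1])
        if g > 0 and rng.random() < p:  # move with probability p
            new[i] = v
    return new

step = dsa_round({1: 1, 2: 0, 3: 1}, 1.0, random.Random(0))
# with p = 1 all three agents move at once and overshoot to 0 1 0
```

With p = 1 every improving agent moves and the team thrashes; a smaller p makes simultaneous neighbor moves less likely, which is the role of the random draws in the figure.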
Experimental Domains
• Regular graph coloring (~sensor networks)
– Cost if neighbors choose the same value.
• Randomized DCOP
– Each combination of neighbors’ values gets a uniform random reward.
• High-stakes scenario (~UAVs)
– Large cost if neighbors choose the same value.
– Otherwise, a small uniform random reward is given.
– Add a “safe” value where all agents start; no reward or penalty if neighbors choose this value.
DSA vs. MGM
• Graph coloring and randomized DCOP:
– DSA gives higher solution quality than MGM.
– DSA improves more quickly than MGM.
• High-stakes scenario:
– DSA and MGM give the same solution quality.
– MGM generally improves more quickly than DSA.
• But these graphs are averages...
DSA vs. MGM
• MGM increases solution quality monotonically
• Much better for:
– anytime use
– high-stakes domains
Algorithms with higher k
• Until now (DSA, MGM), agents have acted based only on their own local constraints
– a myopic worldview
• Now we look at algorithms where agents form groups and act based on all constraints in the group
– enlarging the worldview
• First step: groups of 2
– “2-optimality”
– Maheswaran, Pearce, and Tambe ‘04
Coordinated Algorithms
• Each agent is an offerer (with probability q) or a receiver.
• Offerers:
– Pick a neighbor j at random, and calculate my gains from all combinations of values for myself and j.
– Send this information (several offers) as a message to j:
• < <myGain1, myValue1, yourValue1>, <myGain2, myValue2, yourValue2>, ... >
• Receivers:
– Accept the offer that makes my group’s gain the highest, or just move alone instead.
– groupGain = offererGain + receiverGain − gain on the common link (which both agents counted).
– If I accept an offer, tell the offerer which one I am accepting, and how much our group will gain.
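The groupGain bookkeeping can be made concrete. A small sketch on a hypothetical chain x1–x2–x3 (reward tables invented): x2 is the offerer, x3 the receiver, and the gain on the shared link (x2, x3) is subtracted once because both agents counted it.

```python
# Hypothetical chain x1 - x2 - x3 with invented binary reward tables.
# x2 (offerer) proposes joint values to x3 (receiver); x3 scores each offer:
#   groupGain = offererGain + receiverGain - gain on the common link
R12 = {(0, 0): 5, (0, 1): 0, (1, 0): 0, (1, 1): 12}
R23 = {(0, 0): 5, (0, 1): 0, (1, 0): 0, (1, 1): 12}
cur = {1: 0, 2: 0, 3: 0}  # current assignment; x1 stays fixed

def offerer_gain(v2, v3):
    # change over all constraints involving x2 (links to x1 and x3)
    new = R12[(cur[1], v2)] + R23[(v2, v3)]
    return new - (R12[(cur[1], cur[2])] + R23[(cur[2], cur[3])])

def receiver_gain(v2, v3):
    # change over all constraints involving x3 (just the link to x2)
    return R23[(v2, v3)] - R23[(cur[2], cur[3])]

def group_gain(v2, v3):
    shared = R23[(v2, v3)] - R23[(cur[2], cur[3])]  # counted by both agents
    return offerer_gain(v2, v3) + receiver_gain(v2, v3) - shared

offers = [(v2, v3) for v2 in (0, 1) for v3 in (0, 1)]
best = max(offers, key=lambda o: group_gain(*o))
# → best = (1, 1): the joint move to 1 1 raises team reward by 2
```

Note that group_gain(1, 1) = 2 is exactly the true change in team reward (10 → 12), which is what subtracting the double-counted shared link achieves.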
2-optimal algorithms
• To improve solution quality, agents can form groups of 2.
• Groups move according to group utility
– the sum of all constraints on any group member
• 2-optimal algorithm
– any connected group of 2 agents can coordinate to make a joint move
• 2-optimum
– a state at which no group of up to 2 agents can make a joint move that increases group reward
• Any 2-optimum is also a 1-optimum.
MGM-2
• Form groups of 2 agents, then:
– Send my gain (possibly a group gain) to all my neighbors.
– Receive gain messages from my neighbors.
– If I am involved in an accepted offer:
• If my gain > my neighbors’ gains (not counting my partner), send “yes” to my partner.
• If not, send “no” to my partner.
• If I sent “yes” and got “yes”, make the move in the offer.
– If I am not involved in an offer:
• If my gain > my neighbors’ gains, make my best move.
• 5 message cycles per move (offer, accept, gain, confirm, move).
• Monotonically increasing solution quality.
MGM-2 Example
(Figure: trace on a chain x1–x2–x3 with link rewards +5 and +12, starting with no gains. Round 1: x1 and x3 are offerers, x2 the receiver; x2 receives offers (x1, x2) with gain 7 and (x2, x3) with gain 7, and accepts (x1, x2) for a group gain of 2. Round 2: x2 is the offerer; its offer (x2, x3) with gain 12 is accepted for a group gain of 12.)
SCA-2 (Stochastic Coordination Algorithm)
• Based on DSA.
• If offerer:
– Send offers to a randomly chosen neighbor.
– If an offer is accepted, prepare to make the move in that offer.
– If not, prepare to move alone (pick the move with the highest individual gain).
• If receiver:
– If accepting an offer, send an acceptance message back to the offerer, and prepare to make the move in the offer.
– Else, prepare to move alone.
• Move, with probability p.
• 3 message cycles per move (offer, accept, move).
Experimental Trends
(Figure: solution quality over time for the monotonic algorithms (1-opt, 2-opt) vs. the stochastic algorithms (1-opt, 2-opt).)
Guarantees on Solution Quality
• Guarantee on a k-optimum as a % of the global optimum
• Factors:
– k (how local an optimum)
– m (maximum arity of constraints)
– n (number of agents)
– constraint graph structure (if known)
• Note: the actual costs/rewards on constraints are distributed among agents, not known a priori.
Guarantees on Solution Quality
• Three results. Guarantees for:
– Fully connected DCOP graphs
• Applies to all graphs (i.e., when the graph is unknown)
• Closed-form equation
– Particular graphs (stars, rings)
• Closed-form equation
– Arbitrary DCOP graphs
• Linear program
Fully-Connected Graph
• Reward of a k-optimum in terms of the global optimum (C(·,·) denotes the binomial coefficient):

R(a) ≥ [ C(n−m, k−m) / ( C(n, k) − C(n−m, k) ) ] · R(a*)

For binary constraint graphs (m = 2), this reduces to

R(a) ≥ [ (k − 1) / (2n − k − 1) ] · R(a*)

• Independent of rewards
• Independent of domain size
• Provably tight (in paper)
• One assumption: rewards are non-negative
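The closed-form guarantee is easy to evaluate with binomial coefficients. A small sketch (the function name is mine), checked against the slide's example of n = 5 agents, binary constraints, k = 3, where the guarantee is 1/3:

```python
from math import comb

def k_opt_bound(n, k, m):
    """Guaranteed fraction R(a)/R(a*) for any k-optimum in a fully connected
    DCOP with n agents and m-ary constraints (non-negative rewards)."""
    return comb(n - m, k - m) / (comb(n, k) - comb(n - m, k))

# Slide example: n = 5, binary constraints (m = 2), k = 3
ratio = k_opt_bound(5, 3, 2)  # → 1/3

# For m = 2 the formula collapses to (k - 1) / (2n - k - 1)
binary = (3 - 1) / (2 * 5 - 3 - 1)
```

Larger k tightens the guarantee: for example, k_opt_bound(5, 4, 2) = 3/5 > 1/3, matching the intuition that a more coordinated local optimum is closer to the global one.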
Proof sketch / example
• Setting: fully connected graph, n = 5 agents, m = 2 (binary constraints), k = 3.
• a* = 11111 (global optimum); a = 00000 (a 3-optimum).
• Goal: express R(a) in terms of R(a*).
• a dominates the C(5, 3) = 10 assignments in which exactly 3 agents deviate to a*’s values:
Â = {11100, 11010, 11001, 10110, 10101, 10011, 01110, 01101, 01011, 00111}
• Summing R(a) ≥ R(â) over Â:
10 R(a) ≥ ∑ R(â) ≥ 3 R(a*) + 1 R(a)
since each constraint is fully deviated in C(3, 1) = 3 of the â, fully undeviated in C(3, 3) = 1 of them, and all other rewards are ≥ 0.
• Hence 9 R(a) ≥ 3 R(a*), i.e. R(a) ≥ (1/3) R(a*).
Other graph types
• Similar analysis, but exploit the graph structure
– Only consider deviations â by connected subsets of k agents.
• Ring: R(a) ≥ [ (k − 1) / (k + 1) ] · R(a*)
• Star: R(a) ≥ [ (k − 1) / (n − 1) ] · R(a*)
Proof sketch / example
• Setting: ring graph, n = 5 agents, m = 2 (binary constraints), k = 3.
• a* = 11111 (global optimum); a = 00000 (a 3-optimum).
• Goal: express R(a) in terms of R(a*).
• Only the n = 5 deviations by connected subsets of 3 agents are considered:
Â = {11100, 01110, 00111, 10011, 11001}
• Summing R(a) ≥ R(â) over Â:
5 R(a) ≥ ∑ R(â) ≥ 2 R(a*) + 1 R(a)
since each edge is fully deviated in k − 1 = 2 of the â and fully undeviated in n − k − 1 = 1 of them.
• In general, n R(a) ≥ (k − 1) R(a*) + (n − k − 1) R(a), so R(a) ≥ [ (k − 1) / (n − (n − k − 1)) ] · R(a*) = [ (k − 1) / (k + 1) ] · R(a*); here, R(a) ≥ (1/2) R(a*).
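The ring guarantee can also be checked empirically by brute force on small random instances. A sketch (the instance generator and names are mine; rewards are kept non-negative, as the guarantee requires): every 3-optimum found must earn at least (k−1)/(k+1) = 1/2 of the global optimum.

```python
import itertools
import random

def random_ring(n, rng):
    # one binary constraint per ring edge, with non-negative random rewards
    return {(i, (i + 1) % n): {(a, b): rng.randint(0, 9)
                               for a in (0, 1) for b in (0, 1)}
            for i in range(n)}

def reward(constraints, assign):
    return sum(t[(assign[i], assign[j])] for (i, j), t in constraints.items())

def is_k_optimal(constraints, assign, k):
    # no deviation by <= k agents may increase solution quality
    base = reward(constraints, assign)
    for other in itertools.product((0, 1), repeat=len(assign)):
        if 1 <= sum(a != b for a, b in zip(assign, other)) <= k:
            if reward(constraints, other) > base:
                return False
    return True

n, k = 5, 3
rng = random.Random(0)
for trial in range(25):
    ring = random_ring(n, rng)
    assigns = list(itertools.product((0, 1), repeat=n))
    optimum = max(reward(ring, a) for a in assigns)
    for a in assigns:
        if is_k_optimal(ring, a, k):
            # ring guarantee: R(a) >= (k - 1) / (k + 1) * R(a*)
            assert reward(ring, a) >= (k - 1) / (k + 1) * optimum
```

With random rewards most instances have few 3-optima besides the global optimum itself; the worst-case ratio of 1/2 needs an adversarially constructed instance, which is what the tightness argument in the paper provides.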
Arbitrary graph
• Arbitrary graph ⇒ linear program.
• Minimize R(a)/R(a*) such that:
– for all dominated assignments â: R(a) − R(â) ≥ 0.
• Each constraint S in the DCOP becomes 2 variables in the LP:
– R_S(a): reward on S in the k-optimal solution
– R_S(a*): reward on S in the global optimum
– All other rewards on S are taken as 0 (as before). Why is this OK?
• R(a) and R(a*) don’t change
• a is still k-optimal (no group of ≤ k agents would change)
• a* is still globally optimal
Experimental Results
• Designer can choose appropriate k or topology!
Conclusions
• Guarantees for k-optima in DCOPs as a % of the optimum
– Despite not knowing constraint rewards
– Helps choose which algorithm to use
– Helps choose which topology to use
• Big idea:
– Single agent: rationality → bounded rationality
– Multiagent: global optimality → k-optimality
– The ability to centralize information (coordinate) is bounded (only groups of k agents)
– Guarantees on the performance of “bounded coordination”
Readings
• J. P. Pearce and M. Tambe, "Quality Guarantees on k-Optimal Solutions for Distributed Constraint Optimization Problems," IJCAI-07.
• R. T. Maheswaran, J. P. Pearce and M. Tambe, "Distributed Algorithms for DCOP: A Graphical-Game-Based Approach," PDCS-04.
– (just read the algorithms; no need to read the proofs)
• Recommended: W. Zhang, Z. Xing, G. Wang and L. Wittenburg, "An Analysis and Application of Distributed Constraint Satisfaction and Optimization Algorithms in Sensor Networks," AAMAS-03.