CS584 - Software Multiagent Systems
Lecture 12
Distributed constraint optimization II: Incomplete algorithms and recent theoretical results
University of Southern California
Example applications: personal assistant agents for scheduling, sensor networks, multi-spacecraft coordination.
Distributed Constraint Optimization
• DCOP (Distributed Constraint Optimization Problem): agents cooperate to assign values to variables, subject to constraints, to maximize a global objective.
– R(a) = ∑ R_S(a), summed over all constraints S
Algorithms for DCOP
• Complete algorithms
– Pro: finds the optimal solution
– Con: not always feasible (exponential in time or space)
• Local algorithms (today)
– Pro: usually finds a high-quality solution quickly
– Con: not optimal (but can guarantee quality within a percentage of the optimum)
k-optimality
• k-optimal solution (k-optimum)
– No deviation by ≤ k agents can increase solution quality
– A local optimum
• Example:
– Globally optimal solution: 000
– 1-optimal solutions: 000, 111
• k-optimal algorithms: DSA (k=1), MGM-2 (k=2), ...
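The definition above can be checked by brute force on a small example. A minimal sketch on a hypothetical 3-variable chain whose reward tables are invented so that, as on the slide, 000 is globally optimal while both 000 and 111 are 1-optima:

```python
import itertools

# Hypothetical chain x1 - x2 - x3, values {0, 1}; edge rewards are invented
# so that 000 is the global optimum and 111 is a second 1-optimum.
EDGES = {(0, 1): {(0, 0): 5, (1, 1): 4}, (1, 2): {(0, 0): 5, (1, 1): 4}}

def reward(assign):
    # total team reward R(a) = sum of rewards over all constraints
    return sum(t.get((assign[i], assign[j]), 0) for (i, j), t in EDGES.items())

def is_k_optimal(assign, k):
    # k-optimal: no deviation by <= k agents increases solution quality
    base = reward(assign)
    for other in itertools.product((0, 1), repeat=len(assign)):
        deviators = sum(a != b for a, b in zip(assign, other))
        if 1 <= deviators <= k and reward(other) > base:
            return False
    return True

all_assigns = list(itertools.product((0, 1), repeat=3))
one_optima = [a for a in all_assigns if is_k_optimal(a, 1)]
# → [(0, 0, 0), (1, 1, 1)]: both are 1-optima, but only 000 is global
```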
Approach
• Decompose the DCOP.
• Each agent sees only its local constraints.
• Agents maximize individual utility
– (an individual view of team utility)
1-optimal algorithms
• Algorithm (for one agent):
– Broadcast current value to all neighbors and receive neighbors’ values.
– Find the new value that gives the highest gain in utility, assuming neighbors stay fixed.
– Decide whether or not to change value, and act accordingly.
(Figure: example on a chain x1–x2–x3 with link rewards +5 and +8; all variables start at value 0.)
MGM – Maximal Gain Message
• Monotonic algorithm, but can get stuck in a local optimum!
• Only one agent in a neighborhood moves at a time.
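One MGM round can be sketched in a few lines. This is a minimal sketch on a hypothetical 3-variable chain (reward tables invented for illustration; tie-breaking by agent ID is omitted): each agent computes its best unilateral gain, and only an agent whose gain strictly beats all its neighbors' gains moves.

```python
# Hypothetical chain x1 - x2 - x3, binary values, invented reward tables.
R = {(1, 2): {(0, 0): 5, (0, 1): 0, (1, 0): 0, (1, 1): 8},
     (2, 3): {(0, 0): 5, (0, 1): 0, (1, 0): 0, (1, 1): 8}}
neighbors = {1: [2], 2: [1, 3], 3: [2]}
values = {1: 0, 2: 0, 3: 0}

def local_reward(i, v, vals):
    """Sum of constraints on agent i if it takes value v, neighbors fixed."""
    total = 0
    for (a, b), table in R.items():
        if a == i:
            total += table[(v, vals[b])]
        elif b == i:
            total += table[(vals[a], v)]
    return total

def mgm_round(vals):
    # 1. each agent finds its best unilateral move and the resulting gain
    best = {}
    for i in vals:
        cur = local_reward(i, vals[i], vals)
        best[i] = max(((v, local_reward(i, v, vals) - cur) for v in (0, 1)),
                      key=lambda t: t[1])
    # 2. an agent moves only if its gain strictly beats every neighbor's gain
    new = dict(vals)
    for i in vals:
        v, g = best[i]
        if g > 0 and all(g > best[j][1] for j in neighbors[i]):
            new[i] = v
    return new

assert mgm_round(values) == values  # stuck: 0 0 0 is a 1-optimum here
```

From 1 0 1, only x2 (gain 16) beats its neighbors (gain 5 each), so a single round reaches 1 1 1, and solution quality never decreases.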
(Figure: the same chain x1–x2–x3 with per-agent gains shown.)
DSA – Distributed Stochastic Algorithm
• One possible path (say p = 0.5):
(Figure: starting from 0 0 0, each round every agent draws a random number, e.g. .1/.2/.3 and then .6/.1/.7, and moves only if its draw is below p.)
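A minimal DSA sketch on a hypothetical 3-variable chain (reward tables invented for illustration): every agent simultaneously computes its best move against the current view and takes it with probability p. Because moves are parallel, agents can overshoot each other, which is why DSA is not monotonic.

```python
import random

# Hypothetical chain x1 - x2 - x3, binary values, invented reward tables.
R = {(1, 2): {(0, 0): 5, (0, 1): 0, (1, 0): 0, (1, 1): 8},
     (2, 3): {(0, 0): 5, (0, 1): 0, (1, 0): 0, (1, 1): 8}}

def local_reward(i, v, vals):
    """Sum of constraints on agent i if it takes value v, neighbors fixed."""
    total = 0
    for (a, b), table in R.items():
        if a == i:
            total += table[(v, vals[b])]
        elif b == i:
            total += table[(vals[a], v)]
    return total

def dsa_round(vals, p, rng):
    # all agents decide simultaneously from the same snapshot of values
    new = dict(vals)
    for i in vals:
        cur = local_reward(i, vals[i], vals)
        v, g = max(((v, local_reward(i, v, vals) - cur) for v in (0, 1)),
                   key=lambda t: t[1])
        if g > 0 and rng.random() < p:  # move with probability p
            new[i] = v
    return new

step = dsa_round({1: 1, 2: 0, 3: 1}, 1.0, random.Random(0))
# with p = 1 all three agents move at once and overshoot to 0 1 0
```

With p = 1 every improving agent moves and the team thrashes; a smaller p makes simultaneous neighbor moves less likely, which is the role of the random draws in the figure.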
Experimental Domains
• Regular graph coloring (~sensor networks)
– Cost if neighbors choose the same value.
• Randomized DCOP
– Each combination of neighbors’ values gets a uniform random reward.
• High-stakes scenario (~UAVs)
– Large cost if neighbors choose the same value.
– Otherwise, a small uniform random reward is given.
– Add a “safe” value where all agents start; no reward or penalty if neighbors choose this value.
DSA vs. MGM
• Graph coloring and randomized DCOP:
– DSA gives higher solution quality than MGM.
– DSA improves more quickly than MGM.
• High-stakes scenario:
– DSA and MGM give the same solution quality.
– MGM generally improves more quickly than DSA.
• But these graphs are averages...
DSA vs. MGM
• MGM increases solution quality monotonically
• Much better for:
– anytime use
– high-stakes domains
Algorithms with higher k
• Until now (DSA, MGM), agents have acted based only on their own local constraints
– a myopic worldview
• Now we look at algorithms where agents form groups and act based on all constraints in the group
– enlarging the worldview
• First step: groups of 2
– “2-optimality”
– Maheswaran, Pearce, and Tambe ‘04
Coordinated Algorithms
• Each agent is an offerer (with probability q) or a receiver.
• Offerers:
– Pick a neighbor j at random, and calculate my gains from all combinations of values for myself and j.
– Send this information (several offers) as a message to j:
• < <myGain1, myValue1, yourValue1>, <myGain2, myValue2, yourValue2>, ... >
• Receivers:
– Accept the offer that makes my group’s gain the highest, or just move alone instead.
– groupGain = offererGain + receiverGain − gain on the common link (which both agents counted).
– If I accept an offer, tell the offerer which one I am accepting, and how much our group will gain.
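The groupGain bookkeeping can be made concrete. A small sketch on a hypothetical chain x1–x2–x3 (reward tables invented): x2 is the offerer, x3 the receiver, and the gain on the shared link (x2, x3) is subtracted once because both agents counted it.

```python
# Hypothetical chain x1 - x2 - x3 with invented binary reward tables.
# x2 (offerer) proposes joint values to x3 (receiver); x3 scores each offer:
#   groupGain = offererGain + receiverGain - gain on the common link
R12 = {(0, 0): 5, (0, 1): 0, (1, 0): 0, (1, 1): 12}
R23 = {(0, 0): 5, (0, 1): 0, (1, 0): 0, (1, 1): 12}
cur = {1: 0, 2: 0, 3: 0}  # current assignment; x1 stays fixed

def offerer_gain(v2, v3):
    # change over all constraints involving x2 (links to x1 and x3)
    new = R12[(cur[1], v2)] + R23[(v2, v3)]
    return new - (R12[(cur[1], cur[2])] + R23[(cur[2], cur[3])])

def receiver_gain(v2, v3):
    # change over all constraints involving x3 (just the link to x2)
    return R23[(v2, v3)] - R23[(cur[2], cur[3])]

def group_gain(v2, v3):
    shared = R23[(v2, v3)] - R23[(cur[2], cur[3])]  # counted by both agents
    return offerer_gain(v2, v3) + receiver_gain(v2, v3) - shared

offers = [(v2, v3) for v2 in (0, 1) for v3 in (0, 1)]
best = max(offers, key=lambda o: group_gain(*o))
# → best = (1, 1): the joint move to 1 1 raises team reward by 2
```

Note that group_gain(1, 1) = 2 is exactly the true change in team reward (10 → 12), which is what subtracting the double-counted shared link achieves.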
2-optimal algorithms
• To improve solution quality, agents can form groups of 2.
• Groups move according to group utility
– the sum of all constraints on any group member
• 2-optimal algorithm
– any connected group of 2 agents can coordinate to make a joint move
• 2-optimum
– a state at which no group of up to 2 agents can make a joint move that increases group reward
• Any 2-optimum is also a 1-optimum.
MGM-2
• Form groups of 2 agents, then:
– Send my gain (possibly a group gain) to all my neighbors.
– Receive gain messages from my neighbors.
– If I am involved in an accepted offer:
• If my gain > my neighbors’ gains (not counting my partner), send “yes” to my partner.
• If not, send “no” to my partner.
• If I sent “yes” and got “yes”, make the move in the offer.
– If I am not involved in an offer:
• If my gain > my neighbors’ gains, make my best move.
• 5 message cycles per move (offer, accept, gain, confirm, move).
• Monotonically increasing solution quality.
MGM-2 Example
(Figure: trace on a chain x1–x2–x3 with link rewards +5 and +12, starting with no gains. Round 1: x1 and x3 are offerers, x2 the receiver; x2 receives offers (x1, x2) with gain 7 and (x2, x3) with gain 7, and accepts (x1, x2) for a group gain of 2. Round 2: x2 is the offerer; its offer (x2, x3) with gain 12 is accepted for a group gain of 12.)
SCA-2 (Stochastic Coordination Algorithm)
• Based on DSA.
• If offerer:
– Send offers to a randomly chosen neighbor.
– If an offer is accepted, prepare to make the move in that offer.
– If not, prepare to move alone (pick the move with the highest individual gain).
• If receiver:
– If accepting an offer, send an acceptance message back to the offerer, and prepare to make the move in the offer.
– Else, prepare to move alone.
• Move, with probability p.
• 3 message cycles per move (offer, accept, move).
Experimental Trends
(Figure: solution quality over time for the monotonic algorithms (1-opt, 2-opt) vs. the stochastic algorithms (1-opt, 2-opt).)
Guarantees on Solution Quality
• Guarantee on a k-optimum as a % of the global optimum
• Factors:
– k (how local an optimum)
– m (maximum arity of constraints)
– n (number of agents)
– constraint graph structure (if known)
• Note: the actual costs/rewards on constraints are distributed among agents, not known a priori.
Guarantees on Solution Quality
• Three results. Guarantees for:
– Fully connected DCOP graphs
• Applies to all graphs (i.e., when the graph is unknown)
• Closed-form equation
– Particular graphs (stars, rings)
• Closed-form equation
– Arbitrary DCOP graphs
• Linear program
Fully-Connected Graph
• Reward of a k-optimum in terms of the global optimum (C(·,·) denotes the binomial coefficient):

R(a) ≥ [ C(n−m, k−m) / ( C(n, k) − C(n−m, k) ) ] · R(a*)

For binary constraint graphs (m = 2), this reduces to

R(a) ≥ [ (k − 1) / (2n − k − 1) ] · R(a*)

• Independent of rewards
• Independent of domain size
• Provably tight (in paper)
• One assumption: rewards are non-negative
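The closed-form guarantee is easy to evaluate with binomial coefficients. A small sketch (the function name is mine), checked against the slide's example of n = 5 agents, binary constraints, k = 3, where the guarantee is 1/3:

```python
from math import comb

def k_opt_bound(n, k, m):
    """Guaranteed fraction R(a)/R(a*) for any k-optimum in a fully connected
    DCOP with n agents and m-ary constraints (non-negative rewards)."""
    return comb(n - m, k - m) / (comb(n, k) - comb(n - m, k))

# Slide example: n = 5, binary constraints (m = 2), k = 3
ratio = k_opt_bound(5, 3, 2)  # → 1/3

# For m = 2 the formula collapses to (k - 1) / (2n - k - 1)
binary = (3 - 1) / (2 * 5 - 3 - 1)
```

Larger k tightens the guarantee: for example, k_opt_bound(5, 4, 2) = 3/5 > 1/3, matching the intuition that a more coordinated local optimum is closer to the global one.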
Proof sketch / example
• Setting: fully connected graph, n = 5 agents, m = 2 (binary constraints), k = 3.
• a* = 11111 (global optimum); a = 00000 (a 3-optimum).
• Goal: express R(a) in terms of R(a*).
• a dominates the C(5, 3) = 10 assignments in which exactly 3 agents deviate to a*’s values:
Â = {11100, 11010, 11001, 10110, 10101, 10011, 01110, 01101, 01011, 00111}
• Summing R(a) ≥ R(â) over Â:
10 R(a) ≥ ∑ R(â) ≥ 3 R(a*) + 1 R(a)
since each constraint is fully deviated in C(3, 1) = 3 of the â, fully undeviated in C(3, 3) = 1 of them, and all other rewards are ≥ 0.
• Hence 9 R(a) ≥ 3 R(a*), i.e. R(a) ≥ (1/3) R(a*).
Other graph types
• Similar analysis, but exploit the graph structure
– Only consider deviations â by connected subsets of k agents.
• Ring: R(a) ≥ [ (k − 1) / (k + 1) ] · R(a*)
• Star: R(a) ≥ [ (k − 1) / (n − 1) ] · R(a*)
Proof sketch / example
• Setting: ring graph, n = 5 agents, m = 2 (binary constraints), k = 3.
• a* = 11111 (global optimum); a = 00000 (a 3-optimum).
• Goal: express R(a) in terms of R(a*).
• Only the n = 5 deviations by connected subsets of 3 agents are considered:
Â = {11100, 01110, 00111, 10011, 11001}
• Summing R(a) ≥ R(â) over Â:
5 R(a) ≥ ∑ R(â) ≥ 2 R(a*) + 1 R(a)
since each edge is fully deviated in k − 1 = 2 of the â and fully undeviated in n − k − 1 = 1 of them.
• In general, n R(a) ≥ (k − 1) R(a*) + (n − k − 1) R(a), so R(a) ≥ [ (k − 1) / (n − (n − k − 1)) ] · R(a*) = [ (k − 1) / (k + 1) ] · R(a*); here, R(a) ≥ (1/2) R(a*).
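The ring guarantee can also be checked empirically by brute force on small random instances. A sketch (the instance generator and names are mine; rewards are kept non-negative, as the guarantee requires): every 3-optimum found must earn at least (k−1)/(k+1) = 1/2 of the global optimum.

```python
import itertools
import random

def random_ring(n, rng):
    # one binary constraint per ring edge, with non-negative random rewards
    return {(i, (i + 1) % n): {(a, b): rng.randint(0, 9)
                               for a in (0, 1) for b in (0, 1)}
            for i in range(n)}

def reward(constraints, assign):
    return sum(t[(assign[i], assign[j])] for (i, j), t in constraints.items())

def is_k_optimal(constraints, assign, k):
    # no deviation by <= k agents may increase solution quality
    base = reward(constraints, assign)
    for other in itertools.product((0, 1), repeat=len(assign)):
        if 1 <= sum(a != b for a, b in zip(assign, other)) <= k:
            if reward(constraints, other) > base:
                return False
    return True

n, k = 5, 3
rng = random.Random(0)
for trial in range(25):
    ring = random_ring(n, rng)
    assigns = list(itertools.product((0, 1), repeat=n))
    optimum = max(reward(ring, a) for a in assigns)
    for a in assigns:
        if is_k_optimal(ring, a, k):
            # ring guarantee: R(a) >= (k - 1) / (k + 1) * R(a*)
            assert reward(ring, a) >= (k - 1) / (k + 1) * optimum
```

With random rewards most instances have few 3-optima besides the global optimum itself; the worst-case ratio of 1/2 needs an adversarially constructed instance, which is what the tightness argument in the paper provides.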
Arbitrary graph
• Arbitrary graph ⇒ linear program.
• Minimize R(a)/R(a*) such that:
– for all dominated assignments â: R(a) − R(â) ≥ 0.
• Each constraint S in the DCOP becomes 2 variables in the LP:
– R_S(a): reward on S in the k-optimal solution
– R_S(a*): reward on S in the global optimum
– All other rewards on S are taken as 0 (as before). Why is this OK?
• R(a) and R(a*) don’t change
• a is still k-optimal (no group of ≤ k agents would change)
• a* is still globally optimal
Experimental Results
• Designer can choose appropriate k or topology!
Conclusions
• Guarantees for k-optima in DCOPs as a % of the optimum
– Despite not knowing constraint rewards
– Helps choose which algorithm to use
– Helps choose which topology to use
• Big idea:
– Single agent: rationality → bounded rationality
– Multiagent: global optimality → k-optimality
– The ability to centralize information (coordinate) is bounded (only groups of k agents)
– Guarantees on the performance of “bounded coordination”
Readings
• J. P. Pearce and M. Tambe, "Quality Guarantees on k-Optimal Solutions for Distributed Constraint Optimization Problems," IJCAI-07.
• R. T. Maheswaran, J. P. Pearce and M. Tambe, "Distributed Algorithms for DCOP: A Graphical-Game-Based Approach," PDCS-04.
– (just read the algorithms; no need to read the proofs)
• Recommended: W. Zhang, Z. Xing, G. Wang and L. Wittenburg, "An Analysis and Application of Distributed Constraint Satisfaction and Optimization Algorithms in Sensor Networks," AAMAS-03.