[Studies in Computational Intelligence] Novel Insights in Agent-based Complex Automated Negotiation, Volume 535

Studies in Computational Intelligence 535

Ivan Marsa-Maestre • Miguel A. Lopez-Carmona • Takayuki Ito • Minjie Zhang • Quan Bai • Katsuhide Fujita, Editors

Series Editor

Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland
e-mail: [email protected]

For further volumes: http://www.springer.com/series/7092

About this Series

The series “Studies in Computational Intelligence” (SCI) publishes new developments and advances in the various areas of computational intelligence, quickly and with a high quality. The intent is to cover the theory, applications, and design methods of computational intelligence, as embedded in the fields of engineering, computer science, physics and life sciences, as well as the methodologies behind them. The series contains monographs, lecture notes and edited volumes in computational intelligence spanning the areas of neural networks, connectionist systems, genetic algorithms, evolutionary computation, artificial intelligence, cellular automata, self-organizing systems, soft computing, fuzzy systems, and hybrid intelligent systems. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution, which enable both wide and rapid dissemination of research output.

Editors

Ivan Marsa-Maestre, University of Alcala, Alcala de Henares, Spain

Miguel A. Lopez-Carmona, University of Alcala, Alcala de Henares, Spain

Takayuki Ito, School of Techno-Business Administration, Nagoya Institute of Technology, Nagoya, Japan

Minjie Zhang, School of Computer Science and Software Engineering, The University of Wollongong, Wollongong, NSW, Australia

Quan Bai, School of Computer and Mathematical Sciences, Auckland University of Technology, Auckland, New Zealand

Katsuhide Fujita, Faculty of Engineering, Tokyo University of Agriculture and Technology, Tokyo, Japan

ISSN 1860-949X    ISSN 1860-9503 (electronic)
ISBN 978-4-431-54757-0    ISBN 978-4-431-54758-7 (eBook)
DOI 10.1007/978-4-431-54758-7
Springer Tokyo Heidelberg New York Dordrecht London

Library of Congress Control Number: 2013957142

© Springer Japan 2014

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’s location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

Preface

Complex automated negotiations have been widely studied and have become an emerging area in the field of autonomous agents and multi-agent systems. Complexity in automated negotiations depends on several factors, including the number of negotiated issues, the dependency between issues, the representation of utility, the negotiation protocol, the negotiation form (bilateral or multi-party), and time constraints, among others. Complex automated negotiation scenarios are concerned with negotiation encounters where we may have, for instance, a large number of agents, a large number of issues with strong interdependencies, non-monotonic utility functions, or strong time constraints. Many real-world negotiation scenarios present one or more of these elements. Software agents can support the automation or simulation of complex negotiations on behalf of their owners and provide them with adequate strategies for achieving realistic, win-win agreements. To provide solutions in such complex automated negotiation scenarios, we need to incorporate different advanced artificial intelligence technologies including search, constraint satisfaction problems, graphical utility models, Bayesian nets, auctions, utility graphs, optimization, and predicting and learning methods. Applications of complex automated negotiation could include e-commerce tools, decision-making support tools, negotiation support tools, and collaboration tools.

This book includes extended versions of selected papers from the 5th International Workshop on Agent-Based Complex Automated Negotiation (ACAN 2012), which was held in Valencia, Spain, in June 2012. For the workshop we solicited papers on all aspects of such complex automated negotiations in the field of autonomous agents and multi-agent systems. Researchers from different communities in autonomous agents and multi-agent systems are exploring these issues. They are, for instance, being studied in agent negotiation, multi-issue negotiations, auctions, mechanism design, electronic commerce, voting, secure protocols, matchmaking and brokering, argumentation, and co-operation mechanisms. The goal of this workshop was to bring together researchers from these communities to learn about one another’s approaches, form long-term collaborations, and cross-fertilize the different areas to accelerate progress towards scaling up to larger and more realistic applications.

ACAN is closely cooperating with ANAC (Automated Negotiating Agents Competition), in which automated agents that have different negotiation strategies and are implemented by different developers compete against one another in different negotiation domains in a tournament setting. Based on the great success of ANAC 2010 and ANAC 2011, ANAC 2012 was also held within the International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS) 2012 in Valencia. This book includes an ANAC special section, where authors of selected agents explain the strategies used.

Alcala de Henares, Spain    Ivan Marsa-Maestre, Miguel A. Lopez-Carmona
Nagoya, Japan    Takayuki Ito
NSW, Australia    Minjie Zhang
Auckland, New Zealand    Quan Bai
Tokyo, Japan    Katsuhide Fujita

Contents

Part I Agent-Based Complex Automated Negotiations

1 Intra-Team Strategies for Teams Negotiating Against Competitor, Matchers, and Conceders
Victor Sanchez-Anguix, Reyhan Aydogan, Vicente Julian, and Catholijn M. Jonker

2 Alternative Social Welfare Definitions for Multiparty Negotiation Protocols
Enrique de la Hoz, Miguel Angel Lopez-Carmona, Mark Klein, and Ivan Marsa-Maestre

3 Multilateral Mediated Negotiation Protocols with Feedback
Reyhan Aydogan, Koen V. Hindriks, and Catholijn M. Jonker

4 Decoupling Negotiating Agents to Explore the Space of Negotiation Strategies
Tim Baarslag, Koen Hindriks, Mark Hendrikx, Alexander Dirkzwager, and Catholijn Jonker

5 A Dynamic, Optimal Approach for Multi-Issue Negotiation Under Time Constraints
Fenghui Ren, Minjie Zhang, and Quan Bai

6 On Dynamic Negotiation Strategy for Concurrent Negotiation over Distinct Objects
Khalid Mansour and Ryszard Kowalczyk

7 Reducing the Complexity of Negotiations Over Interdependent Issues
Raiye Hailu and Takayuki Ito

8 Evaluation of the Reputation Network Using Realistic Distance Between Facebook Data
Takanobu Otsuka, Takuya Yoshimura and Takayuki Ito

Part II Automated Negotiating Agents Competition

9 An Overview of the Results and Insights from the Third Automated Negotiating Agents Competition (ANAC2012)
Colin R. Williams, Valentin Robu, Enrico H. Gerding, and Nicholas R. Jennings

10 An Adaptive Negotiation Strategy for Real-Time Bilateral Negotiations
Alexander Dirkzwager and Mark Hendrikx

11 CUHKAgent: An Adaptive Negotiation Strategy for Bilateral Negotiations over Multiple Items
Jianye Hao and Ho-fung Leung

12 AgentMR: Concession Strategy Based on Heuristic for Automated Negotiating Agents
Shota Morii and Takayuki Ito

13 OMAC: A Discrete Wavelet Transformation Based Negotiation Agent
Siqi Chen and Gerhard Weiss

14 The Simple-Meta Agent
Litan Ilany and Ya’akov (Kobi) Gal

Index

Contributors

Reyhan Aydogan Interactive Intelligence Group, Delft University of Technology, Delft, The Netherlands

Tim Baarslag Interactive Intelligence Group, Delft University of Technology, Delft, The Netherlands

Quan Bai School of Computing and Mathematical Sciences, Auckland University of Technology, Auckland, New Zealand

Siqi Chen Department of Knowledge Engineering, Maastricht University, Maastricht, The Netherlands

Alexander Dirkzwager Interactive Intelligence Group, Delft University of Technology, Delft, The Netherlands

Ya’akov (Kobi) Gal Ben-Gurion University, Be’er Sheva, Israel

Enrico H. Gerding School of Electronics and Computer Science, University of Southampton, Southampton, UK

Raiye Hailu Department of Computer Science and Engineering, Nagoya Institute of Technology, Nagoya, Aichi, Japan

Jianye Hao Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong

Mark Hendrikx Interactive Intelligence Group, Delft University of Technology, Delft, The Netherlands

Koen V. Hindriks Interactive Intelligence Group, Delft University of Technology, Delft, The Netherlands

Enrique de la Hoz Computer Engineering Department, Universidad de Alcala, Escuela Politecnica, Alcala de Henares, Madrid, Spain

Litan Ilany Ben-Gurion University, Be’er Sheva, Israel

Takayuki Ito School of Techno-Business Administration, Nagoya Institute of Technology, Nagoya, Aichi, Japan

Nicholas R. Jennings School of Electronics and Computer Science, University of Southampton, Southampton, UK

Catholijn M. Jonker Interactive Intelligence Group, Delft University of Technology, Delft, The Netherlands

Vicente Julian Universitat Politecnica de Valencia, Departamento de Sistemas Informaticos y Computacion, Valencia, Spain

Mark Klein Sloan School of Management, Massachusetts Institute of Technology, Cambridge, MA, USA

Ryszard Kowalczyk Faculty of Information & Communication Technologies, Swinburne University of Technology, Melbourne, VIC, Australia

Ho-fung Leung Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong

Miguel Angel Lopez-Carmona Computer Engineering Department, Universidad de Alcala, Escuela Politecnica, Alcala de Henares, Madrid, Spain

Khalid Mansour Faculty of Information & Communication Technologies, Swinburne University of Technology, Melbourne, VIC, Australia

Ivan Marsa-Maestre Computer Engineering Department, Universidad de Alcala, Escuela Politecnica, Alcala de Henares, Madrid, Spain

Shota Morii Nagoya Institute of Technology, Nagoya, Aichi, Japan

Takanobu Otsuka Center for Green Computing, Nagoya Institute of Technology, Nagoya, Aichi, Japan

Fenghui Ren School of Computer Science and Software Engineering, University of Wollongong, Wollongong, NSW, Australia

Valentin Robu School of Electronics and Computer Science, University of Southampton, Southampton, UK

Victor Sanchez-Anguix Universitat Politecnica de Valencia, Departamento de Sistemas Informaticos y Computacion, Valencia, Spain

Gerhard Weiss Department of Knowledge Engineering, Maastricht University, Maastricht, The Netherlands

Colin R. Williams School of Electronics and Computer Science, University of Southampton, Southampton, UK

Takuya Yoshimura Master of Information Engineering, Nagoya Institute of Technology, Nagoya, Aichi, Japan

Minjie Zhang School of Computer Science and Software Engineering, University of Wollongong, Wollongong, NSW, Australia

Part I Agent-Based Complex Automated Negotiations

Chapter 1 Intra-Team Strategies for Teams Negotiating Against Competitor, Matchers, and Conceders

Victor Sanchez-Anguix, Reyhan Aydogan, Vicente Julian, and Catholijn M. Jonker

Abstract Under some circumstances, a group of individuals may need to negotiate together as a negotiation team against another party. Unlike bilateral negotiation between two individuals, this type of negotiation requires the team to adopt an intra-team strategy in order to make team decisions and negotiate with the opponent accordingly. It is crucial to be able to negotiate successfully with heterogeneous opponents, since opponents’ negotiation strategies and behavior may vary in an open environment. While one opponent might collaborate and concede over time, another may not be inclined to concede. This paper analyzes the performance of recently proposed intra-team strategies for negotiation teams against different categories of opponents: competitors, matchers, and conceders. Furthermore, it provides an extension of the negotiation tool GENIUS for negotiation teams in bilateral settings. Consequently, this work facilitates research on negotiation teams.

Keywords Agreement technologies • Collective decision making • Negotiation teams

V. Sanchez-Anguix (✉) • V. Julian
Departamento de Sistemas Informáticos y Computación, Universitat Politècnica de València, Cami de Vera s/n, 46022 Valencia, Spain
e-mail: [email protected]; [email protected]

R. Aydogan • C.M. Jonker
Interactive Intelligence Group, Delft University of Technology, Delft, The Netherlands
e-mail: [email protected]; [email protected]

I. Marsa-Maestre et al. (eds.), Novel Insights in Agent-based Complex Automated Negotiation, Studies in Computational Intelligence 535, DOI 10.1007/978-4-431-54758-7__1, © Springer Japan 2014


1.1 Introduction

A negotiation team is a group of two or more interdependent individuals that join together as a single negotiation party because they share some common goals related to the negotiation at hand [5, 30]. This kind of party participates in many real-life situations, such as the negotiation between a married couple and a house seller, the negotiation between a group of traveling friends and a booking agency, and the negotiation between two or more organizations. Despite acting as a single party, most of the time a negotiation team cannot be considered a unitary player. As a matter of fact, team members may have different and conflicting preferences that need to be reconciled when making a team decision regarding the negotiation. Agent-based negotiation teams (ABNT) constitute a novel topic of research in automated negotiation, where efforts in the last few years have focused mostly on bilateral and multiparty negotiations with unitary players [8, 9, 13]. Mechanisms that allow ABNT to take decisions on the negotiation process, namely intra-team strategies or team dynamics [22, 24], are needed in order to support multi-agent systems for complex applications like group travel markets, group buying in electronic commerce, and negotiations between agent organizations (e.g., organizational merging). An intra-team strategy for a specific negotiation protocol (e.g., alternating bilateral negotiation protocol) defines what decisions are taken by the negotiation team, and how and when those decisions are taken.

Although there are some studies investigating negotiation among team members [29], automated negotiation between a team and an opponent is still open to research. Sanchez-Anguix et al. have proposed several intra-team strategies [22, 24] for ABNT following the alternating-offers protocol in a bilateral setting. The proposed intra-team strategies have been studied under different environmental conditions to assess the most appropriate intra-team strategy for a given environmental setting [22]. However, these studies make several assumptions regarding the opponent. For instance, it is assumed that the opponent employs time-based concession tactics such as Boulware or Conceder [7] in a cooperative context. These assumptions may not hold for some opponents in open and dynamic environments. For instance, one opponent may adopt a “take it or leave it” strategy, while another may choose to observe the other negotiating agent’s behavior and concede accordingly. An immediate question is which intra-team strategies negotiate well against types of opponents other than those using time-based tactics.

Without a doubt, an opponent’s negotiation attitude may affect the performance of intra-team strategies. An opponent’s behavior is not limited to classic time-based concession strategies. Baarslag et al. classify negotiation strategies according to their negotiation behavior against the opponent into four categories [3]: inverters, conceders, competitors, and matchers. Conceders always concede regardless of the opponent’s strategy, while competitors do not yield, independently of the behavior shown by opponents. A matcher mimics its opponent’s behavior, while an inverter inverts it. When the opponent concedes, the matcher concedes accordingly while the inverter does not. Based on this classification, we investigate how intra-team strategies proposed in the literature perform against opponents belonging to different families of negotiation strategies. To do this, we extend GENIUS [14] to allow negotiation teams and enable it to perform bilateral negotiations between a team (a group of agents) and an individual agent. The contributions of this paper do not solely focus on the study of intra-team strategies’ performance against different types of opponent; we also describe how GENIUS has been modified to support such negotiations. This extension will allow researchers to (1) design and test domain-independent intra-team strategies, which is desirable given the increasing number of application domains for automated negotiation; (2) engage negotiation teams in open environments where any kind of opponent behavior is possible; (3) make use of a wide repository of negotiation domains, utility functions, and automated negotiators; and (4) focus on the design of intra-team strategies, while leaving simulation aspects to be governed by GENIUS.

Our contributions are twofold. First, we extend GENIUS to support ABNT; thus GENIUS can facilitate research on ABNT. Second, we analyze the performance of different intra-team strategies proposed by Sanchez-Anguix et al. against different types of heterogeneous opponents. The rest of this paper is organized as follows. First, we present our general framework. After that, we briefly introduce the intra-team strategies analyzed in this paper. Then, we describe how the extension has been included inside the GENIUS framework. Next, we describe how the experiments were carried out and present and discuss their results. Finally, we describe our future work and briefly conclude.

1.2 General Framework

In our framework, one negotiation team is involved in a negotiation with an opponent. Independently of whether or not the other party is also a team, both parties interact with each other by means of the alternating-offers protocol. Team dynamics or intra-team strategies define what decisions have to be taken by a negotiation team, how those decisions are taken, and when those decisions are taken. In a bilateral negotiation between a team and an opponent, the decisions that must be taken are which offers are sent to the opponent, and whether or not the opponent’s offers are accepted. A general view of our framework is represented in Fig. 1.1. Dashed lines depict communications inside the team, while the other lines represent communications with the opponent.

A team $A$ is formed by a team mediator $TM_A$ and team members $a_i$. The team mediator communicates with the other party following the alternating-offers protocol, and team members communicate with the team mediator. Communications between the team and the opponent are carried out by means of the team mediator. This mediator sends team decisions to the opponent and receives, and later broadcasts, decisions from the opponent to team members. Thus, the opponent does not know that it is communicating with a team, since it only interacts with the trusted mediator. In this framework, the mechanisms employed by the team to decide which offers to send and whether or not to accept offers are carried out during the negotiation process itself. How these decisions are taken depends on the specific intra-team strategy that is implemented by the team and the team mediator. Each team mediator can implement its own intra-team protocol to coordinate team members as long as team members know how to play such an intra-team protocol. It should be noted that we assume that team membership remains static during the negotiation process. Thus, members do not leave or enter the team while the negotiation is being carried out. It is acknowledged that team members may leave or join the group in certain specific situations. However, membership dynamics are not considered in this article and are left as future work.

Fig. 1.1 This picture shows our general negotiation framework

1.3 Intra-Team Strategies

In this section we briefly describe the intra-team strategies proposed by Sanchez-Anguix et al. [22, 24] that are the focus of our study. These intra-team strategies have been selected according to the minimum level of unanimity that they are able to guarantee regarding each team decision: no unanimity guaranteed (representative), majority/plurality (similarity simple voting), semi-unanimity (similarity Borda voting), and unanimity (full unanimity mediated).

1.3.1 Representative (RE)

The intra-team protocol employed by the representative intra-team strategy is the simplest possible protocol for a negotiation team. Basically, one of the team members is selected as representative and acts on behalf of the team. Interactions among team members are non-existent, and therefore every decision is taken by the representative according to its own criterion. Obviously, the performance of the team will be determined by the similarity among the team members’ utility functions and the negotiation skills of the representative. It is expected that if team members’ utility functions are very similar and the representative’s negotiation strategy is appropriate, the team performance will be reasonably good.

The mechanism used by the team to select its representative may vary depending on the domain: trust, past experiences, rational voting based on who is more similar to oneself, etc. Since GENIUS is a general simulation framework, a random team member is selected as representative. This random team member will receive the messages from the opponent and act accordingly by sending an offer/counter-offer or accepting the opponent’s offer. Generally, any GENIUS agent that knows how to play the alternating bilateral game can act as representative.

1.3.2 Similarity Simple Voting (SSV)

The similarity simple voting intra-team strategy relies on voting processes to decide which offer is proposed and whether or not the opponent’s offer is accepted. Being based on voting, the strategy requires the action of team members and team coordination by means of a mediator. The intra-team strategy proceeds as follows in every round:

• Accept/Reject opponent’s offer: The team mediator receives an offer from the other party. Then, the mediator broadcasts this offer to the team members, indicating that it comes from the opponent party. The team mediator opens a voting process, where each team member should respond to the mediator with an Accept or Reject depending on the acceptability of the offer from the point of view of the team member. The team mediator gathers the responses from every team member and applies a majority rule. If the number of Accept actions received from team members is greater than half the size of the team, the offer is accepted and the corresponding Accept action is sent to the opponent. Otherwise, the team mediator starts the offer proposal mechanism.

• Offer Proposal: Each team member is allowed to propose an offer to be sent. This offer is communicated solely to the mediator, who makes the offers public and starts a voting process. In this voting process, each team member must state, for each of the proposed offers, whether or not he considers it acceptable. For instance, if three different offers $(x_1, x_2, x_3)$ have been proposed, each team member should state the acceptability of all three of them, e.g., (yes, no, yes). The mediator applies a plurality rule to determine the most supported offer, which is the one that is sent to the opponent.
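The two SSV decision rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `majority_accepts` and `plurality_offer` are hypothetical, offers are represented by arbitrary values, each team member is modelled as a predicate over offers, and ties are broken in favour of the earliest proposal because the text does not specify a tie-break.

```python
from typing import Callable, Hashable, Sequence

Offer = Hashable  # hypothetical: any value describing an offer


def majority_accepts(votes: Sequence[bool]) -> bool:
    """Majority rule: accept only if Accept votes exceed half the team size."""
    return sum(votes) > len(votes) / 2


def plurality_offer(
    proposals: Sequence[Offer],
    is_acceptable: Sequence[Callable[[Offer], bool]],
) -> Offer:
    """Plurality rule: return the proposal marked acceptable by most members.

    Assumption: ties go to the earliest proposal in the list.
    """
    support = [
        sum(1 for judge in is_acceptable if judge(offer)) for offer in proposals
    ]
    best = max(range(len(proposals)), key=lambda i: support[i])
    return proposals[best]
```

With three members voting (yes, yes, no) on an opponent's offer, `majority_accepts` returns `True`; with an even split it returns `False`, so the mediator would move on to the offer proposal step.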

Standard team members for SSV employ time tactics to decide on the acceptance or rejection of the opponent’s offer, and on the offer proposal. More specifically, the current aspiration of a team member follows the expression below [13]:


$$s_{a_i}(t) = 1 - \left(1 - RU_{a_i}\right)\left(\frac{t}{T_A}\right)^{1/\beta_{a_i}} \qquad (1.1)$$

where $s_{a_i}(t)$ is the utility demanded by the team member at time $t$, $RU_{a_i}$ is the reservation utility of the agent, $T_A$ is the team’s deadline, and $\beta_{a_i}$ is the concession speed. When $\beta_{a_i} < 1$ we have a classic Boulware strategy, when $\beta_{a_i} = 1$ we have a linear concession, and when $\beta_{a_i} > 1$ we have a conceder strategy. On the one hand, a team member considers an opponent’s offer acceptable if it reports a utility which is greater than or equal to $s_{a_i}(t)$. On the other hand, a team member considers an offer proposed by a teammate acceptable if it reports a utility which is greater than or equal to the utility of the offer that he proposed to the team in the same round. As for the offer proposed to the team, team members attempt to select the offer from the iso-utility curve which is closest to the last opponent’s offer and the offer sent by the team in the last round (similarity heuristic based on Euclidean distance).
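As a small illustration of Eq. (1.1), the following sketch computes the aspiration level of a team member. The function and parameter names are hypothetical, and utilities are assumed to be normalized to [0, 1].

```python
def aspiration(t: float, deadline: float, reservation_utility: float, beta: float) -> float:
    """Utility demanded at time t according to Eq. (1.1).

    beta < 1 gives Boulware behavior, beta == 1 linear concession,
    and beta > 1 conceder behavior.
    """
    if deadline <= 0 or beta <= 0:
        raise ValueError("deadline and beta must be positive")
    if not 0.0 <= t <= deadline:
        raise ValueError("t must lie in [0, deadline]")
    return 1.0 - (1.0 - reservation_utility) * (t / deadline) ** (1.0 / beta)


def accepts_opponent_offer(offer_utility: float, current_aspiration: float) -> bool:
    """A member accepts an opponent's offer whose utility reaches its aspiration."""
    return offer_utility >= current_aspiration
```

At `t = 0` the demanded utility is 1 and at the deadline it equals the reservation utility; halfway through the negotiation a Boulware member (for example `beta = 0.5`) still demands more than a linear one, which in turn demands more than a conceder (for example `beta = 2`).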

1.3.3 Similarity Borda Voting (SBV)

This intra-team strategy attempts to guarantee a higher level of unanimity by incorporating voting mechanisms that select broadly accepted candidates, like Borda count [19], and unanimity voting processes. The intra-team strategy proceeds as SSV with the following differences:

• Accept/Reject opponent’s offer: Instead of using majority voting to decide whether or not the opponent’s offer is accepted, the team mediator opens a unanimity vote. Hence, an opponent offer is accepted if and only if the number of Accept actions is equal to the number of team members. Otherwise, the team mediator starts the offer proposal mechanism.

• Offer Proposal: Each team member is allowed to propose an offer to be sent by the same mechanisms described in SSV. Then, the team mediator makes the team members privately score each proposal by means of a Borda count. The team members give a different score to each offer from the set $[0, |A| - 1]$, where $|A|$ is the number of proposals received by the team mediator. Once the scores have been received by the team mediator, it selects the candidate offer that received the highest sum of scores and sends it to the opponent.

Standard team members are governed by an individual time-based concession tactic like the one in Eq. (1.1). Similarly to SSV, an opponent’s offer is acceptable for a team member if the utility that it reports is equal to or greater than the utility it currently demands, $s_{a_i}(t)$. When scoring team offers, each team member privately ranks the candidates in descending order of utility and then assigns to each offer a score equal to the number of candidates minus the position of the offer in the ranking.
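The Borda scoring step can be sketched as follows. This is a minimal illustration, not the GENIUS implementation: the function names are hypothetical, each member is represented only by the utilities it assigns to the candidate offers, and ties (within a member's ranking and in the final sum) are broken by proposal order, which is an assumption.

```python
from typing import List, Sequence


def unanimous_accepts(votes: Sequence[bool]) -> bool:
    """Unanimity rule: every team member must vote Accept."""
    return all(votes)


def borda_scores(utilities: Sequence[float]) -> List[int]:
    """Score each candidate as (number of candidates - rank position).

    Positions are 1-based in descending order of utility, so the best
    candidate receives n - 1 and the worst receives 0.
    """
    n = len(utilities)
    ranking = sorted(range(n), key=lambda i: utilities[i], reverse=True)
    scores = [0] * n
    for position, index in enumerate(ranking, start=1):
        scores[index] = n - position
    return scores


def select_offer(team_utilities: Sequence[Sequence[float]]) -> int:
    """Return the index of the candidate with the highest sum of Borda scores."""
    n = len(team_utilities[0])
    totals = [0] * n
    for member_utilities in team_utilities:
        for i, score in enumerate(borda_scores(member_utilities)):
            totals[i] += score
    return max(range(n), key=lambda i: totals[i])
```

For example, with three members and three candidates, a candidate that is nobody's worst choice and one member's favourite can win over a candidate that is one member's favourite but another member's worst choice, which is the "broadly accepted" behavior that motivates the Borda count.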


1.3.4 Full Unanimity Mediated (FUM)

Full Unanimity Mediated is capable of reaching unanimous decisions as long as the negotiation domain is composed of predictable issues whose type of valuation function is the same for all team members (e.g., either monotonically increasing or decreasing). The type of unanimity that it is capable of guaranteeing is strict in the sense that every decision reports a utility which is greater than or equal to the current aspiration level of each team member [24]. The team mediator governs intra-team interactions as follows:

• Accept/Reject opponents’ offer: The interaction protocol followed by the team inthis decision is the same as the one presented in SSV. However, the decision ruleapplied by the team mediator in this case is unanimity. Therefore, an opponentoffer is only accepted if it is acceptable to each team member.

• Offer Proposal: Every team member is involved in the offer proposal, which consists of an iterated process where the offer is built attribute by attribute. The mediator starts the iterated building process with an empty partial offer (no attribute is set). Then, he selects the first attribute to be set following an agenda. The mediator makes public the current partial offer and the attribute that needs to be set. Each active team member states privately to the mediator the value that he wants for the requested issue. When all of the responses have been gathered, the mediator aggregates the values sent by team members using the max (monotonically increasing valuation function) or the min (monotonically decreasing valuation function) and makes the new partial offer public among active team members. Since it is assumed that team members share the same type of valuation function for predictable attributes, increasing the welfare of one of the members results in the other team members increasing their welfare or staying at the same utility. Then, each active team member must evaluate the partial offer and state whether it is acceptable at the current state (Accept or Reject action). Those team members that respond with an Accept action are no longer considered active in the current construction process. The team mediator selects the next attribute in the agenda and follows the same process until all of the attributes have been set or until there are no more active team members (in which case the rest of the attributes are maximized to match the opponent’s preferences). The agenda of attributes is set by the mediator by observing the concessions made by the opponent in the first interactions. Under a rationality criterion, the opponent should have conceded less in its most important attributes during the first negotiation rounds. The amount of concession in each attribute during the first rounds is summed up and an attribute agenda is inferred at each round. The first attributes in the agenda are those inferred as less important for the opponent (larger amount of concession), whereas the last attributes in the agenda are those considered more important for the opponent. The heuristic behind this agenda and the iterated building process is to attempt to satisfy team members first with those attributes that are less important for the opponent.
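The agenda heuristic described above can be sketched as follows. This is a minimal illustration rather than the chapter's implementation: the function name `infer_agenda`, the dictionary-based offer representation, and the choice to count only moves in the team's favor as concessions are all assumptions made for the example.

```python
from typing import Dict, List

Offer = Dict[str, float]  # attribute name -> value scaled to [0, 1]


def infer_agenda(opponent_offers: List[Offer],
                 opponent_prefers_high: Dict[str, bool]) -> List[str]:
    """Order attributes from most conceded to least conceded by the opponent.

    Attributes on which the opponent conceded most are assumed to be the
    least important for it, so they come first in the agenda.
    """
    totals = {name: 0.0 for name in opponent_prefers_high}
    # Compare each opponent offer with the one that preceded it.
    for previous, current in zip(opponent_offers, opponent_offers[1:]):
        for name, prefers_high in opponent_prefers_high.items():
            # A concession is a move against the opponent's own preference.
            if prefers_high:
                delta = previous[name] - current[name]
            else:
                delta = current[name] - previous[name]
            # Assumption: moves in the opponent's own favor are ignored.
            totals[name] += max(delta, 0.0)
    return sorted(totals, key=lambda name: totals[name], reverse=True)


first_rounds = [
    {"pp": 1.0, "cf": 1.00, "pd": 0.0},
    {"pp": 0.9, "cf": 1.00, "pd": 0.2},
    {"pp": 0.7, "cf": 0.95, "pd": 0.5},
]
agenda = infer_agenda(first_rounds, {"pp": True, "cf": True, "pd": False})
```

In this toy history the opponent gives up most on the payment deadline and least on the cancellation fee, so the payment deadline would be offered to team members first.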


As for the standard team member behavior, team members have their demands governed by an individual time-based concession tactic like the one in Eq. (1.1). In the iterated building process, each team member requests the attribute value which, given the current partial offer, brings the partial offer closest to its current demands $s_{a_i}(t)$. Additionally, a partial offer is acceptable when, considering only those attributes that have been set, the partial offer reports a partial utility which is greater than or equal to the current demands of the team member. Each team member considers an opponent offer acceptable when it has a utility which is greater than or equal to its current demands $s_{a_i}(t)$.
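The offer-building process and the unanimity rule for opponent offers can be illustrated with the sketch below. It is not the authors' code: the names `Member`, `build_offer` and `team_accepts`, the linear valuation functions, the small tolerance `EPS`, and the weights and demands in the usage example are assumptions chosen to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import Dict, List

Offer = Dict[str, float]  # attribute -> value in [0, 1]
EPS = 1e-9  # tolerance for floating-point comparisons (implementation detail)


def valuation(value: float, prefers_high: bool) -> float:
    """Monotonic valuation in [0, 1]; linearity is an assumption."""
    return value if prefers_high else 1.0 - value


@dataclass
class Member:
    weights: Dict[str, float]
    demand: float  # current aspiration level s_{a_i}(t)

    def partial_utility(self, partial: Offer, prefers_high: Dict[str, bool]) -> float:
        # Only attributes that have already been set contribute.
        return sum(self.weights[a] * valuation(v, prefers_high[a])
                   for a, v in partial.items())

    def request(self, attribute: str, partial: Offer,
                prefers_high: Dict[str, bool]) -> float:
        # Ask for the value that brings the partial offer closest to the demand.
        missing = max(self.demand - self.partial_utility(partial, prefers_high), 0.0)
        weight = self.weights[attribute]
        needed = min(missing / weight, 1.0) if weight > 0 else 0.0
        return needed if prefers_high[attribute] else 1.0 - needed

    def accepts(self, offer: Offer, prefers_high: Dict[str, bool]) -> bool:
        return self.partial_utility(offer, prefers_high) >= self.demand - EPS


def build_offer(members: List[Member], agenda: List[str],
                prefers_high: Dict[str, bool]) -> Offer:
    partial: Offer = {}
    active = list(members)
    for attribute in agenda:
        if not active:
            # Nobody is left to satisfy: give the opponent its best value.
            partial[attribute] = 0.0 if prefers_high[attribute] else 1.0
            continue
        requests = [m.request(attribute, partial, prefers_high) for m in active]
        # max for increasing valuations, min for decreasing ones.
        partial[attribute] = max(requests) if prefers_high[attribute] else min(requests)
        active = [m for m in active if not m.accepts(partial, prefers_high)]
    return partial


def team_accepts(members: List[Member], offer: Offer,
                 prefers_high: Dict[str, bool]) -> bool:
    """Unanimity rule: every member must find the opponent offer acceptable."""
    return all(m.accepts(offer, prefers_high) for m in members)


prefers_high = {"pp": False, "cf": False, "pd": True, "db": True}
team = [
    Member({"pp": 0.50, "cf": 0.10, "pd": 0.05, "db": 0.35}, demand=0.3),
    Member({"pp": 0.25, "cf": 0.25, "pd": 0.25, "db": 0.25}, demand=0.4),
    Member({"pp": 0.30, "cf": 0.50, "pd": 0.05, "db": 0.15}, demand=0.5),
]
offer = build_offer(team, ["db", "pd", "cf", "pp"], prefers_high)
```

In this run the first member is satisfied after the bar discount is set, the second after the payment deadline, and the third after the cancellation fee, so the price is left at the opponent's preferred value.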

1.4 Implementation in Genius

GENIUS [14] is a well-known negotiation simulation framework. It supports simulation of sessions and tournaments based on bilateral negotiations. Users are able to design their own agents and test them against a wide variety of different agents designed by the community. The framework provides information critical for analysis (e.g., utility, Pareto optimality, etc.) which is extremely useful for research tasks. Moreover, the use of GENIUS as a testbed for bilateral negotiations is testified by its use in the annual automated negotiating agent competition (ANAC) [2]. The ANAC competition provided GENIUS with a large repository of agents. The repository of available agents contains conceder, inverter, matcher, and competitor agents. The integration of ABNT in GENIUS additionally facilitates the following objectives:

• The framework includes several negotiation domains and utility functions for test purposes. Even though most of these domains are designed for bilateral negotiations with unitary players, it is easy to add new negotiation domains and utility functions. In fact, we are in the process of adding new team negotiation domains (i.e., advanced hotel group booking) besides the one employed for the experiments of this paper (i.e., hotel group booking, see Sect. 1.5.2).

• The use of GENIUS in ANAC has provided a wide variety of conceder, matcher, inverter and competitor opponents. Previous research in ABNT had only considered opponents with time-based tactics [22].

• Current research in ABNT has only considered team members following the same kind of homogeneous behavior inside the intra-team strategy, which may not be the case in some open environments. Due to its open nature, GENIUS may be able to simulate ABNT whose team members are heterogeneous since they have been designed by different scholars.

• GENIUS is a consolidated testbed among the agent community. Thus, the inclusion of ABNT inside GENIUS can facilitate research on ABNT by other scholars, and even give room to a future negotiating competition involving teams.


• Researchers can either design new team members for teams following the intra-team protocols included in the framework (i.e., team mediators), or they can design new intra-team protocols and team members.

In order to implement negotiation teams, two new classes have been introduced in GENIUS: TeamMediator and TeamMember. These two classes can be extended by users to include new intra-team strategies and types of team members in the system. Next, we depict the main traits of these classes, and how they can be used to include new features in GENIUS:

• TeamMember: Team members extend the Agent class, so they have all of its methods available. Actions that come from the opponent party are received by the ReceiveMessage method, whereas actions that come from the team mediator are received in the ReceiveTeamMessage method. The method chooseAction is used to decide the agent’s action, regardless of whether the next action involves communication with the team mediator or with the opponent.

• TeamMediator: The team mediator is the agent that communicates with the opponent party, and transmits the opponent’s decisions to the team members. Thus, it has access to the public interface of all of the team members. Depending on the kind of intra-team strategy, the mediator also coordinates other processes like voting mechanisms, offer proposal mechanisms, and so forth. Like the Agent class, it receives communications from the opponent through the ReceiveMessage method. In the chooseAction method, the mediator can either directly send a decision to the opponent, or communicate with team members to decide the next action to be taken. When interacting with team members, the mediator uses the ReceiveTeamMessage method in the team members’ API to send messages to team members. Team members can respond to the mediator with the chooseAction method in their public API. The TeamMediator class is completely flexible, as the only mandatory actions are receiving the opponent’s decisions and sending decisions to the other party. Therefore, any kind of mediated communication protocol can be implemented by extending the TeamMediator class.

Of course, it should be noted that team members and team mediators are tightly coupled. For a team member to participate in a negotiation team governed by a specific mediator, the team member should know the intra-team communication protocol implemented by that mediator.
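The division of responsibilities between the two classes can be pictured with the following language-agnostic sketch written in Python. GENIUS itself is a Java framework, so none of the signatures below belong to its API: the snake_case method names only mirror ReceiveMessage, ReceiveTeamMessage and chooseAction, and `ThresholdMember`, `UnanimityMediator` and the string-valued actions are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Optional

Offer = Dict[str, float]


class TeamMember(ABC):
    """Receives opponent actions and mediator messages, then picks an action."""

    def __init__(self) -> None:
        self.last_opponent_offer: Optional[Offer] = None
        self.last_team_message: Optional[str] = None

    def receive_message(self, opponent_offer: Offer) -> None:
        self.last_opponent_offer = opponent_offer

    def receive_team_message(self, team_message: str) -> None:
        self.last_team_message = team_message

    @abstractmethod
    def choose_action(self) -> str:
        ...


class TeamMediator(ABC):
    """Talks to the opponent and coordinates the team members."""

    def __init__(self, members: List[TeamMember]) -> None:
        self.members = list(members)
        self.last_opponent_offer: Optional[Offer] = None

    def receive_message(self, opponent_offer: Offer) -> None:
        # Relay the opponent's decision to every team member.
        self.last_opponent_offer = opponent_offer
        for member in self.members:
            member.receive_message(opponent_offer)

    @abstractmethod
    def choose_action(self) -> str:
        ...


class ThresholdMember(TeamMember):
    """Hypothetical member: accepts offers at or above a fixed demand."""

    def __init__(self, utility: Callable[[Offer], float], demand: float) -> None:
        super().__init__()
        self.utility = utility
        self.demand = demand

    def choose_action(self) -> str:
        offer = self.last_opponent_offer
        if offer is not None and self.utility(offer) >= self.demand:
            return "Accept"
        return "Reject"


class UnanimityMediator(TeamMediator):
    """Hypothetical mediator: accepts only if every member accepts."""

    def choose_action(self) -> str:
        votes = []
        for member in self.members:
            member.receive_team_message("vote on opponent offer")
            votes.append(member.choose_action())
        return "Accept" if all(v == "Accept" for v in votes) else "Reject"


mediator = UnanimityMediator([
    ThresholdMember(lambda o: o["db"], demand=0.5),
    ThresholdMember(lambda o: 1.0 - o["pp"], demand=0.3),
])
mediator.receive_message({"pp": 0.6, "db": 0.7})
decision = mediator.choose_action()
```

The coupling noted above shows up directly: `ThresholdMember` only works with a mediator that asks for a vote on the last opponent offer, so a different intra-team protocol would need members written for it.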

GENIUS provides several measures to assess the quality of negotiating agents. The current version of GENIUS is capable of running team negotiation sessions between two parties and providing online information about the minimum utility of team members, the average utility of team members, the maximum utility of team members, the joint utility of team members, the current round, and the current negotiation time for each offer exchanged between both parties. A screenshot of the environment being configured for a team negotiation session can be observed in Fig. 1.2. In the upper part of the menu, the user can select the intra-team strategies to be used by each party, whereas the user can add and remove team members for each party in the lower part of the menu.

Fig. 1.2 Screenshot showing the menu for configuring a team negotiation session

1.5 Experiments and Results

As stated in Sect. 1.1, one of the purposes of this paper is assessing the performance of intra-team strategies against negotiation strategies different from classic time-based tactics. With that purpose, we tested RE, SSV, SBV and FUM against agents from the ANAC 2010 competition who have been previously classified into competitors, conceders, and matchers1 [3]. First, we briefly describe the agents that we selected from the agent competition to represent the different families of negotiation strategies. Then, we introduce the negotiation domain used for the experiments. After that, we describe how the experiments were carried out, and, finally, we show and analyze the results of the experiments.

1 Because of technical inconsistencies, we could not use ANAC’s inverters directly in our settings. Thus, they are not included in this analysis.

1.5.1 ANAC 2010 Agents

In this section we present the different ANAC 2010 agents employed for our experiments. These agents pertain to three of the four categories presented previously: matchers, conceders, and competitors.

• IAMHaggler & IAMCrazyHaggler [2, 32]: On the one hand, IAMCrazyHaggler is basically a take-it-or-leave-it agent that proposes offers over a high threshold. The only aspect taken into consideration for accepting an offer is the utility of such offer, and not time. Due to this behavior, the experiments carried out in [3] classified IAMCrazyHaggler as the most competitive strategy in the ANAC 2010 competition. On the other hand, IAMHaggler is a much more complicated agent. It employs Bayesian learning and non-linear regression to attempt to model the opponent party and updates its acceptance threshold based on information like time, the model of the opponent, and so forth. It was classified as a competitor agent in the experiments carried out in [3].

• Agent Smith [2, 31]: This agent is a conceder agent [3] that starts by demanding the highest utility for himself and slowly concedes to attempt to satisfy the preferences of the opponent by means of a learning heuristic. When the timeline is approaching 2 min, it proposes the best offer received up until that moment in an effort to finish the negotiation.

• Agent K [2, 11]: This agent was the winner of the ANAC 2010 competition. It adjusts its aspirations (i.e., target utility) in the negotiation process considering an estimation of the maximum utility that will be offered by the other party. More specifically, the agent gradually reduces its target utility based on the average utility offered by the opponent and its standard deviation. If an offer has been proposed by the opponent that satisfies such threshold, it is sent back since, rationally, it should also be good enough for the opponent. In [3], Agent K was classified as a competitor agent.

• Nice Tit-for-Tat [3, 4, 10]: This strategy is a matcher agent from the 2011 ANAC competition that reciprocates the other party’s moves by means of a Bayesian model of the other party’s preferences. According to the Bayesian model, the Nice Tit-for-Tat agent attempts to calculate the Nash point and it reciprocates moves by calculating the distance of the last opponent offer to the aforementioned point. When the negotiation time is reaching its deadline, the Nice TFT agent will wait for an offer that is not expected to improve in the remaining time and accept it in order to secure an agreement.


Table 1.1 Preference profiles used in the experiments

      w_pp   w_cf   w_pd   w_db
a1    0.50   0.10   0.05   0.35
a2    0.25   0.25   0.25   0.25
a3    0.30   0.50   0.05   0.15
op    0.10   0.50   0.25   0.15

a_i represents team members and op represents the opponent

1.5.2 Test Domain: Hotel Group Booking

A group of friends who have decided to spend their holidays together has to book accommodation for their stay. Their destination is Rome, and they want to spend a whole week. The group of agents engages in a negotiation with a well-known hotel in their city of destination. Both parties have to negotiate the following issues:

• Price per person (pp): The price per person to pay.

• Cancellation fee per person (cf): The fee that should be paid in case the reservation is cancelled.

• Full payment deadline (pd): It indicates when the group of friends has to pay the booking.

• Discount in bar (db): As a token of respect for good clients, the hotel offers nice discounts at the hotel bar.

In our experimental setup, preference profiles are represented by means of additive utility functions in the form:

$$U_{p_i}(X) = w_{p_i,1} V_{p_i,1}(x_1) + \ldots + w_{p_i,n} V_{p_i,n}(x_n) \tag{1.2}$$

where $w_{p_i,j}$ is the weight given by agent $p_i$ to attribute $j$, $V_{p_i,j}$ is the valuation function for attribute $j$, and $x_j$ is the value of attribute $j$ in the offer $X$. The domain of the attribute values is continuous and scaled to [0,1]. It should be noted that all of the team members share the same type of monotonic valuation function for the attributes (monotonically increasing for payment deadline and discount, and decreasing for price and cancellation fee) so that there is potential for cooperation among team members. Despite this, team members give different weights to the negotiation issues. The type of valuation function for the opponent is the opposite type (increasing for price and cancellation fee, and decreasing for the payment deadline and the discount) and weights may be different too. The preference profiles of the agents can be found in Table 1.1. Even though SSV, SBV, and RE are able to handle other types of domain where unpredictable attributes are present, we only use domains with predictable attributes in our analysis because FUM does not support domains having unpredictable attributes.
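As a worked instance of Eq. (1.2), the sketch below evaluates one offer with the weights of Table 1.1. Two points are assumptions rather than facts from the text: the valuation functions are taken to be linear (the chapter only states that they are monotonic), and the first weight column of Table 1.1 is read as the price-per-person weight. The offer values are invented for illustration.

```python
from typing import Dict

ISSUES = ("pp", "cf", "pd", "db")

# Weights from Table 1.1 (first column read as w_pp, which is an assumption).
WEIGHTS: Dict[str, Dict[str, float]] = {
    "a1": {"pp": 0.50, "cf": 0.10, "pd": 0.05, "db": 0.35},
    "a2": {"pp": 0.25, "cf": 0.25, "pd": 0.25, "db": 0.25},
    "a3": {"pp": 0.30, "cf": 0.50, "pd": 0.05, "db": 0.15},
    "op": {"pp": 0.10, "cf": 0.50, "pd": 0.25, "db": 0.15},
}

# Team members: decreasing in price and cancellation fee,
# increasing in payment deadline and bar discount.
TEAM_PREFERS_HIGH = {"pp": False, "cf": False, "pd": True, "db": True}


def utility(agent: str, offer: Dict[str, float]) -> float:
    """Additive utility of Eq. (1.2) with linear valuations (assumed)."""
    is_opponent = agent == "op"
    total = 0.0
    for issue in ISSUES:
        # The opponent's valuation functions are of the opposite type.
        prefers_high = TEAM_PREFERS_HIGH[issue] != is_opponent
        value = offer[issue] if prefers_high else 1.0 - offer[issue]
        total += WEIGHTS[agent][issue] * value
    return total


example_offer = {"pp": 0.4, "cf": 0.2, "pd": 0.5, "db": 1.0}
utilities = {agent: utility(agent, example_offer) for agent in WEIGHTS}
```

Because all team members share the direction of each valuation function, moving any single attribute in the team's favor never lowers any member's utility; only the size of the gain differs with the weights.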


1.5.3 Experimental Setting

In order to evaluate the performance of the intra-team strategies introduced in Sect. 1.3, we set up a negotiation team consisting of three members that negotiates with each ANAC agent presented in Sect. 1.5.1. We tested the intra-team strategies with different parameter configurations: the FUM strategy where the concession speed of each team member is drawn from the uniform distribution $\beta_{a_i} = U[0.5, 0.99]$ (FUM Boulware or FUM B) or $\beta_{a_i} = U[0.01, 0.4]$ (FUM Very Boulware or FUM VB), the SSV strategy where $\beta_{a_i} = U[0.5, 0.99]$ (SSV Boulware or SSV B) or $\beta_{a_i} = U[0.01, 0.4]$ (SSV Very Boulware or SSV VB), the SBV strategy where $\beta_{a_i} = U[0.5, 0.99]$ (SBV Boulware or SBV B) or $\beta_{a_i} = U[0.01, 0.4]$ (SBV Very Boulware or SBV VB), and the representative approach employing Agent K as the negotiation strategy (RE K). Both parties have a shared deadline $T = 180$ s. If the deadline is reached and no final agreement has been found, both parties get a utility equal to 0.
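To give a feel for what the two concession-speed ranges mean, the sketch below evaluates a time-based tactic at the midpoint of the negotiation. Eq. (1.1) is not reproduced in this section, so the functional form used here, $s_{a_i}(t) = 1 - (1 - RU_{a_i})(t/T)^{1/\beta_{a_i}}$, is an assumption based on the usual time-dependent tactics; the `reservation_utility` parameter and the helper names are hypothetical.

```python
import random
from typing import List


def demand(t: float, deadline: float, beta: float,
           reservation_utility: float = 0.0) -> float:
    """Aspiration level at time t under an assumed time-based tactic.

    Smaller beta values keep the demand high for longer (more Boulware).
    """
    ratio = min(max(t / deadline, 0.0), 1.0)
    return 1.0 - (1.0 - reservation_utility) * ratio ** (1.0 / beta)


def sample_betas(n_members: int, low: float, high: float,
                 rng: random.Random) -> List[float]:
    """Draw one concession speed per team member from U[low, high]."""
    return [rng.uniform(low, high) for _ in range(n_members)]


rng = random.Random(0)  # fixed seed only to make the sketch repeatable
boulware = sample_betas(3, 0.5, 0.99, rng)
very_boulware = sample_betas(3, 0.01, 0.4, rng)

T = 180.0  # shared deadline in seconds, as in the experimental setting
halfway_b = [demand(90.0, T, beta) for beta in boulware]
halfway_vb = [demand(90.0, T, beta) for beta in very_boulware]
```

Under this assumed form, every Very Boulware member still demands more at the halfway point than any Boulware member, which matches the intuition that VB teams wait longer before conceding.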

In the experiments, each intra-team strategy was faced against each ANAC agent ten different times to capture stochastic differences in the results. Out of those ten repetitions, the team was the initiating party in half of them and the ANAC agent in the other half. We gathered information on the average utility of team members in the final agreement, and the joint utility of both parties (product of the utilities of team members and opponent). A one-way ANOVA ($\alpha = 0.05$) and a post-hoc analysis with Tukey’s test were carried out to assess the differences in the averages.
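The two metrics gathered per negotiation can be computed as in the following sketch. The function names and the sample utilities are hypothetical; the only facts taken from the text are that the joint utility is the product of the utilities of each team member and the opponent, and that a failed negotiation yields a utility of 0 for everyone.

```python
from math import prod
from statistics import mean
from typing import List


def average_team_utility(team_utilities: List[float]) -> float:
    """Average utility of team members in the final agreement."""
    return mean(team_utilities)


def joint_utility(team_utilities: List[float], opponent_utility: float) -> float:
    """Product of the utilities of each team member and the opponent."""
    return prod(team_utilities) * opponent_utility


# One hypothetical successful negotiation and one failed negotiation.
success = ([0.8, 0.5, 0.5], 0.5)
failure = ([0.0, 0.0, 0.0], 0.0)

avg_success = average_team_utility(success[0])
joint_success = joint_utility(*success)
joint_failure = joint_utility(*failure)
```

Since the joint utility is a product, a single party left with a very low utility pulls the whole metric toward zero, which is why exploiting a conceder can score well on the average metric and poorly on the joint one.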

1.5.4 Results

One of the goals of this paper is identifying which intra-team strategies work better against different opponents. Therefore, we start by analyzing the results for the average utility of team members in Table 1.2. The results in bold font indicate which intra-team strategy obtains statistically better results according to ANOVA ($\alpha = 0.05$) and post-hoc analysis with Tukey’s test. As expected, all of the intra-team strategies, especially when their concession speed is very Boulware, get higher average utility for team members while negotiating with an opponent employing a conceder strategy like Agent Smith than while negotiating with competitive opponents like Agent K, IAMHaggler, and IAMCrazyHaggler. This result supports the observation of [3] that a successful negotiating agent, if we only consider a single negotiation with the opponent (short term relationship), should behave competitively, especially against cooperative strategies.

When the opponent is a conceder (Agent Smith), we observe that, in our experiments, the best intra-team strategies are those that wait as much as possible to concede and exploit the opponent. We refer to FUM, SBV, and SSV strategies employing a very Boulware time tactic (FUM VB, SBV VB, and SSV VB).


Table 1.2 Average of the average utility for team members in the final agreement

           Competitive               Matcher   Conceder
           Crazy   Haggler   K       TFT       Smith
FUM B      0.19    0.38      0.29    0.72      0.68
FUM VB     0.16    0.42      0.65    0.72      0.97
RE K       0.00    0.26      0.57    0.70      0.86
SSV B      0.14    0.34      0.36    0.45      0.57
SSV VB     0.08    0.36      0.57    0.44      0.98
SBV B      0.14    0.35      0.31    0.49      0.55
SBV VB     0.13    0.39      0.59    0.50      0.98

Crazy: IAMCrazyHaggler, Haggler: IAMHaggler, K: Agent K, TFT: Nice Tit-for-Tat, Smith: Agent Smith, B: $\beta = U[0.5, 0.99]$, VB: $\beta = U[0.01, 0.49]$

FUM VB, SBV VB, and SSV VB statistically get the same average for the average utility of team members.2 This can be explained by the fact that the conceder agent has fully conceded before FUM VB, SBV VB, and SSV VB have started to concede. Since a concession from the opponent generally results in all of the team members increasing their welfare, these intra-team strategies perform similarly even though they ensure different levels of unanimity regarding team decisions. A representative using Agent K also performs reasonably well for the same reason. However, since only one of the team members takes decisions, it may not reach an average utility comparable to the ones obtained by FUM VB, SBV VB, and SSV VB.

When the opponent is a matcher, it is observed that employing FUM strategies (B and VB) and a representative strategy with a competitor representative (RE K) results in higher average utility for the team than employing SSV and SBV strategies (B and VB). According to the one-way ANOVA test, the performances of FUM strategies (B and VB) and RE K are statistically significantly better than those of SSV and SBV strategies (B and VB). The fact that SSV and SBV do not guarantee unanimity regarding team decisions has an important impact on the average utility of team members when faced against Nice Tit-for-Tat. There are no significant differences between FUM B, FUM VB, and RE K. Even though using an Agent K representative guarantees less unanimity regarding team decisions than other intra-team strategies, it is shown that, against certain types of opponents, a representative with a competitor negotiation strategy may be enough in practice to achieve results comparable to those obtained by strategies that guarantee unanimity like FUM.

When the opponent is a competitor, team strategies that employ FUM VB, SBV VB, and SSV VB perform better than their counterparts FUM B, SBV B, and SSV B, respectively. That is, if the opponent is competitive, taking a competitive approach and conceding less results in better average team utility than taking cooperative approaches. In any case, we can observe that the average utility obtained by team members in some competitive settings (i.e., against IAMHaggler and IAMCrazyHaggler) is much lower than the one obtained by the same intra-team strategies against conceders or matchers. This suggests the need to explore new intra-team strategies that are able to cope with some competitor agents. If we compare the performances of FUM VB, SBV VB, and SSV VB, the results show that the team using FUM VB gathers higher utility on average than the rest of the cases. The fact that FUM approaches are usually the best options may be explained by the fact that FUM ensures that all of the team members are satisfied with the offers sent to the opponent and the offers sent by the opponent. Note that when the opponent is IAMCrazyHaggler, which is a take-it-or-leave-it strategy, FUM B gets a higher average for the average utility of team members, but there is no statistically significant difference with the runner-up, FUM VB. In any case, FUM B is statistically different from the rest of the intra-team strategies. In this case, RE K is not capable of retaining an average utility for team members comparable to FUM VB, SBV VB, and SSV VB. In fact, all of the negotiations between RE K and IAMCrazyHaggler failed.

2 A one-way ANOVA ($\alpha = 0.05$) and a post-hoc Tukey test were carried out to support our claims.

Table 1.3 Average for the joint utility (product) of team members and opponent in the final agreement

           Competitive               Matcher   Conceder
           Crazy   Haggler   K       TFT       Smith
FUM B      0.005   0.04      0.03    0.17      0.16
FUM VB     0.004   0.06      0.15    0.17      0.04
RE K       0.00    0.05      0.11    0.15      0.09
SSV B      0.002   0.03      0.04    0.07      0.10
SSV VB     0.001   0.04      0.10    0.04      0.02
SBV B      0.002   0.03      0.03    0.06      0.09
SBV VB     0.002   0.05      0.11    0.06      0.02

Our second evaluation metric is the joint utility of the final agreement3. The joint utility of all of the participants is a crucial metric in situations where both parties not only want to get a deal, but also build a long term relationship and engage in multiple negotiations in the future. An agent that has been exploited in the negotiation process may be reluctant to negotiate with the same team/opponent in the future. Table 1.3 shows the average for the joint utility in the final agreement. The best intra-team strategies for each opponent in the average utility case are also the best intra-team strategies in the joint utility case. The only exception to this rule is the conceder case. In that scenario, the best results are obtained by employing FUM with a Boulware strategy instead of exploiting the opponent with very Boulware strategies (FUM VB, SSV VB). Thus, if a long term relationship is to be built with conceder agents, it may be wise to employ more concessive intra-team strategies. Very Boulware strategies exploit the opponent, and get very high results for the average utility of team members, but they do not allow the opponent to get high utilities, which results in low joint utilities. On average, the highest joint utility is gathered when the team employs a FUM strategy against a matcher opponent, namely Nice Tit-for-Tat. Since a matcher matches its opponent, TFT matches its behavior with FUM. Even if FUM concedes slowly over time, TFT will also concede, precluding both parties from being exploited. In competitive settings, FUM needs to adjust its concession speed (very Boulware for IAMHaggler and Agent K, and Boulware for IAMCrazyHaggler) to be able to get the most for the joint utility, and, still, the results are especially low against the Hagglers. This again suggests the need to explore new intra-team strategies that are able to cope with some competitor agents.

3 The product of the utilities of each team member and the opponent

1.6 Related Work

The artificial intelligence community has focused on bilateral or multi-party negotiations where parties are composed of single individuals. The most relevant difference in our work is that we consider multi-individual parties. Next, we analyze and discuss related work in artificial intelligence.

First, we review some relevant work in bilateral negotiations. Faratin et al. [7] introduced some of the most widely used families of concession tactics in negotiation. The authors proposed concession strategies for negotiation issues that are a mix of different families of concession tactics. The authors divide these concession tactics into three different families: (1) time-dependent concession tactics; (2) behavior-dependent concession tactics; and (3) resource-dependent tactics. Our negotiation framework also considers time as a crucial element in negotiation. Therefore, team members employ time tactics inspired by those introduced by Faratin et al. In another work, Lai et al. [13] propose an extension of the classic alternating bargaining model where agents are allowed to propose up to k different offers at each negotiation round. Offers are proposed from the current iso-utility curve according to a similarity mechanism that selects the most similar offer to the last offer received from the opponent. This present work also considers extending the bilateral alternating protocol by including layers of intra-team negotiation among team members. This way, team members can decide on the actions that should be taken during the negotiation. Robu et al. [20, 21] introduce a bilateral negotiation model where agents represent their preferences by means of utility graphs. Utility graphs are graphical models that represent binary dependencies between issues. The authors propose a negotiation scenario where the buyer’s preferences and the seller’s preferences are modeled through utility graphs. The seller is the agent that carries out a more thorough exploration of the negotiation space in order to search for agreements where both parties are satisfied. With this purpose, the seller builds a model of the buyer’s preferences based on historic information of past deals and expert knowledge about the negotiation domain. Unlike this work, we introduce multi-individual parties and add layers of intra-team negotiation to make it possible for team members to decide on which actions to take during the negotiation.

With regard to multi-party negotiations, several works have been proposed in the literature [1, 6, 9, 12, 15, 18, 33]. For instance, Ehtamo et al. [6] propose a mediated multi-party negotiation protocol which looks for joint gains in an iterated way and where a single agreement should be found to satisfy all of the parties. The algorithm starts from a tentative agreement and moves in a direction according to what the agents prefer regarding some offers’ comparison. Klein et al. [12] propose a mediated negotiation model which can be extended to multiple parties negotiating the same agreement. Similarly, Ito et al. [9] propose different types of n-ary utility functions and efficient multiparty models for multiple parties negotiating on the same agreement. Marsa-Maestre et al. [16, 17] carry out further research in the area of negotiation models for complex utility functions. More specifically, they extend the constraint based model proposed by Ito et al. [9] by proposing different bidding mechanisms for agents. One-to-many negotiations and many-to-many negotiations also represent special cases of multi-party negotiations. One-to-many negotiations represent settings where one party negotiates simultaneously with multiple parties. It can be a party negotiating in parallel negotiation threads for the same good with different opponent parties [1, 15, 18, 33] or a party that negotiates simultaneously with multiple parties like in the Contract-Net protocol, and the English and Dutch auction [25–27]. Many-to-many negotiations consider the fact that many parties negotiate with many parties, the double auction being the most representative example [26]. Unlike the aforementioned concepts, negotiation teams are not related to the cardinality of the parties but to the nature of the party itself. When addressing a negotiation team, we consider a negotiation party that is formed by multiple individuals whose preferences have to be represented in the final agreement. This complex negotiation party can participate in bilateral negotiations, one-to-many negotiations, or many-to-many negotiations. The reason to model it as one complex negotiation party instead of as multiple individual parties is the potential for cooperation. Despite having possibly different individual preferences, a negotiation team usually exists because there is a shared common goal among team members which is of particular importance.

As far as we know, only our previous works [22–24, 28] have considered negotiation teams in computational models. More specifically, the four different computational models introduced in this article are analyzed there in different negotiation conditions when facing opponents governed by time tactics. However, that analysis does not include variability with respect to the strategy carried out by the opponent, unlike the experiments carried out in this present article.

1.7 Conclusions and Future Work

This paper presents preliminary results on the performance of existing intra-team strategies for bilateral negotiations against heterogeneous opponents: competitors, conceders, and matchers. According to our analysis, intra-team strategies like Full Unanimity Mediated (FUM), Similarity Borda Voting (SBV), Simple Similarity Voting (SSV) and Representative (RE) are able to negotiate with different degrees of success against different types of heterogeneous opponents. For the average utility of team members on the final agreement and the joint utility of both parties, we found similar results. In the case of conceders, FUM, SBV, and SSV seem the best options as long as they wait for the opponent to concede and exploit conceders. In the case of matchers, using either FUM or RE employing Agent K’s negotiation strategy seem the best choices. This suggests that, for certain types of opponents, a representative approach with an appropriate negotiation strategy may be enough in practice. Finally, the results against competitors show that while strategies like FUM obtain reasonably good results against some competitors like Agent K, all of them suffer from exploitation against other competitor agents like IAMHaggler and IAMCrazyHaggler. Since existing intra-team strategies such as FUM, SBV, and SSV employ time tactics, they are inclined to concede during the negotiation. This may suggest that new intra-team strategies are needed to tackle negotiations against a broader set of competitors.

Additionally, we have extended the well-known negotiation testbed GENIUS to support bilateral negotiations where at least one of the parties is a team. The extension allows developers to design their own intra-team strategies by extending the type of mediator used by the team and the type of team member. We expect that extending GENIUS with negotiation teams will facilitate and further advance research on negotiation teams.

For future work, we consider designing intra-team negotiation strategies that analyze the behavior of the opponent and act accordingly. If a team understands that the opponent is cooperative, the team may act cooperatively and find a mutually acceptable agreement early. Otherwise, if the opponent is a competitor, the team may decide to take a strong position and not concede during the negotiation.

Acknowledgements One part of this research is supported by TIN2011-27652-C03-01 and TIN2012-36586-C03-01 of the Spanish government. Another part of this research is supported by the Dutch Technology Foundation STW, applied science division of NWO and the Technology Program of the Ministry of Economic Affairs; the Pocket Negotiator project with grant number VICI-project 08075 and the New Governance Models for Next Generation Infrastructures project with NGI grant number 04.17. We would also like to thank Tim Baarslag for his helpful and valuable comments and feedback about GENIUS.

References

1. An, B., Sim, K., Tang, L., Li, S., et al.: Continuous-time negotiation mechanism for software agents. IEEE Trans. Syst. Man Cybern. B Cybern. 36(6), 1261–1272 (2006)

2. Baarslag, T., Hindriks, K., Jonker, C., Kraus, S., Lin, R.: The first automated negotiating agents competition (ANAC 2010). In: New Trends in Agent-Based Complex Automated Negotiations, pp. 113–135. Springer Berlin Heidelberg (2012)

3. Baarslag, T., Hindriks, K.V., Jonker, C.M.: Towards a quantitative concession-based classification method of negotiation strategies. In: Agents in Principle, Agents in Practice. Lecture Notes of The 14th International Conference on Principles and Practice of Multi-Agent Systems (2011)

4. Baarslag, T., Hindriks, K.V., Jonker, C.M.: A Tit for Tat Negotiation Strategy for Real-Time Bilateral Negotiations. Studies in Computational Intelligence, vol. 435, pp. 229–233. Springer, Berlin (2013)


5. Brodt, S., Thompson, L.: Negotiating teams: a levels of analysis. Group Dyn. 5(3), 208–219 (2001)

6. Ehtamo, H., Kettunen, E., Hamalainen, R.P.: Searching for joint gains in multi-party negotiations. Eur. J. Oper. Res. 130(1), 54–69 (2001)

7. Faratin, P., Sierra, C., Jennings, N.R.: Negotiation decision functions for autonomous agents. Int. J. Rob. Auton. Syst. 24(3–4), 159–182 (1998)

8. Faratin, P., Sierra, C., Jennings, N.R.: Using similarity criteria to make issue trade-offs in automated negotiations. Artif. Intell. 142, 205–237 (2002)

9. Fujita, K., Ito, T., Klein, M.: Secure and efficient protocols for multiple interdependent issues negotiation. J. Intell. Fuzzy Syst. 21(3), 175–185 (2010)

10. Hindriks, K., Jonker, C., Tykhonov, D.: The benefits of opponent models in negotiation. In: Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology, pp. 439–444 (2009)

11. Kawaguchi, S., Fujita, K., Ito, T.: Compromising strategy based on estimated maximum utility for automated negotiation agents competition. In: Modern Approaches in Applied Intelligence, vol. 6704, pp. 501–510. Springer, Berlin (2011)

12. Klein, M., Faratin, P., Sayama, H., Bar-Yam, Y.: Negotiating complex contracts. Group Decis. Negot. 12(2), 111–125 (2003)

13. Lai, G., Sycara, K., Li, C.: A decentralized model for automated multi-attribute negotiations with incomplete information and general utility functions. Multiagent Grid Syst. 4(1), 45–65 (2008)

14. Lin, R., Kraus, S., Baarslag, T., Tykhonov, D., Hindriks, K., Jonker, C.M.: Genius: an integrated environment for supporting the design of generic automated negotiators. Comput. Intell. (2012)

15. Mansour, K., Kowalczyk, R.: A meta-strategy for coordinating of one-to-many negotiation over multiple issues. In: Foundations of Intelligent Systems, vol. 122, pp. 343–353. Springer, Berlin (2012)

16. Marsa-Maestre, I., Lopez-Carmona, M.A., Velasco, J.R., de la Hoz, E.: Effective bidding and deal identification for negotiations in highly nonlinear scenarios. In: Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems (AAMAS’09), pp. 1057–1064. International Foundation for Autonomous Agents and Multiagent Systems, Richland (2009)

17. Marsa-Maestre, I., López-Carmona, M.A., Velasco, J.R., Ito, T., Klein, M., Fujita, K.: Balancing utility and deal probability for auction-based negotiations in highly nonlinear utility spaces. In: International Joint Conference on Artificial Intelligence, pp. 214–219 (2009)

18. Nguyen, T., Jennings, N.: Coordinating multiple concurrent negotiations. In: Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 1064–1071. IEEE Computer Society, Washington, DC (2004)

19. Nurmi, H.: Voting systems for social choice. In: Handbook of Group Decision and Negotiation, pp. 167–182. Springer Netherlands (2010)

20. Robu, V., La Poutré, J.A.: Retrieving the structure of utility graphs used in multi-item negotiation through collaborative filtering of aggregate buyer preferences. In: Rational, Robust and Secure Negotiations. Computational Intelligence, vol. 89. Springer, Berlin (2008)

21. Robu, V., Somefun, D.J.A., La Poutré, J.A.: Modeling complex multi-issue negotiations using utility graphs. In: Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS’05), pp. 280–287. ACM, New York (2005)

22. Sanchez-Anguix, V., Julian, V., Botti, V., García-Fornes, A.: Analyzing intra-team strategies for agent-based negotiation teams. In: 10th International Conference on Autonomous Agents and Multiagent Systems, pp. 929–936 (2011)

23. Sanchez-Anguix, V., Dai, T., Semnani-Azad, Z., Sycara, K., Botti, V.: Modeling power distance and individualism/collectivism in negotiation team dynamics. In: 45 Hawaii International Conference on System Sciences (HICSS-45), pp. 628–637 (2012)

24. Sanchez-Anguix, V., Julian, V., Botti, V., García-Fornes, A.: Reaching unanimous agreements within agent-based negotiation teams with linear and monotonic utility functions. IEEE Trans. Syst. Man Cybern. B Cybern. 42(3), 778–792 (2012)

25. Sandholm, T.: An implementation of the contract net protocol based on marginal cost calculations. In: Proceedings of the Eleventh National Conference on Artificial Intelligence, pp. 256–262. AAAI Press, Menlo Park (1993)

26. Shoham, Y., Leyton-Brown, K.: Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations. Cambridge University Press, Cambridge (2009)

27. Smith, R.G.: The contract net protocol: High-level communication and control in a distributed problem solver. IEEE Trans. Comput. 100(12), 1104–1113 (1980)

28. Sánchez-Anguix, V., Julian, V., Botti, V., García-Fornes, A.: Studying the impact of negotiation environments on negotiation teams’ performance. Inf. Sci. 219, 17–40 (2013)

29. Tambe, M., Jung, H.: The benefits of arguing in a team. AI Mag. 20, 85–92 (1999)

30. Thompson, L., Peterson, E., Brodt, S.: Team negotiation: an examination of integrative and distributive bargaining. J. Pers. Soc. Psychol. 70, 66–78 (1996)

31. van Galen Last, N.: Agent Smith: Opponent Model Estimation in Bilateral Multi-issue Negotiation. In: New Trends in Agent-Based Complex Automated Negotiations, pp. 167–174. Springer Berlin Heidelberg (2012)

32. Williams, C.R., Robu, V., Gerding, E.H., Jennings, N.R.: Iamhaggler: a negotiation agent for complex environments. In: New Trends in Agent-based Complex Automated Negotiations, pp. 151–158. Springer Berlin Heidelberg (2012)

33. Williams, C.R., Robu, V., Gerding, E.H., Jennings, N.R.: Negotiating concurrently with unknown opponents in complex, real-time domains. In: 20th European Conference on Artificial Intelligence, 242, pp. 834–839 (2012)

Chapter 2
Alternative Social Welfare Definitions for Multiparty Negotiation Protocols

Enrique de la Hoz, Miguel Angel Lopez-Carmona, Mark Klein, and Ivan Marsa-Maestre

Abstract Multiagent negotiation protocols, understood as a group decision making process, try to reach an agreement among all the negotiating agents. Traditionally, this agreement is a unanimous agreement. Consensus as unanimity may be quite difficult to achieve in practice or even undesirable in some situations. We propose a framework to incorporate alternative consensus definitions into multiagent negotiations in terms of utility sharing among the agents. The consensus definition is enforced by a mediator, which implements a linguistically expressed mediation rule based on Ordered Weighted Averaging (OWA) operators. In each step of the mediation process, agents send offers to the mediator. To avoid zones of no agreement, the mediator applies Hierarchical Clustering (HC) to the offers to form groups of agents. Then, the mediator computes a social contract, taking into account the desired consensus and the distance from an ideal consensus. The social contract is submitted as feedback to the agents, which explore the negotiation space locally using a variation of the Generalized Pattern Search (GPS) non-linear optimization technique to generate new offers that take into account the social contract. Finally, we show how these mechanisms are able to reach agreements according to different consensus policies while avoiding zones of no agreement.

Keywords Coalition formation coordination • Negotiation • Teamwork

E. de la Hoz (✉) • M.A. Lopez-Carmona • I. Marsa-Maestre
Computer Engineering Department, Universidad de Alcala, Alcala de Henares, Spain
e-mail: [email protected]; [email protected]; [email protected]

M. Klein
Center for Collective Intelligence, MIT Sloan School of Management, Massachusetts Institute of Technology, Cambridge, MA, USA
e-mail: [email protected]



2.1 Introduction

Multi-attribute negotiation may be seen as an interaction between two or more agents with the goal of reaching an agreement about a range of issues, which usually involves solving a conflict of interests between the agents. Although this should constitute an incentive for them to cooperate and search for possible joint gains, self-interested agents often fail to reach consensus or end up with inefficient agreements.

On the one hand, self-interested agents would like to reach an agreement that is as favourable to them as possible. On the other hand, the final decision is jointly made and needs to be agreed to by the agents involved. As a result, negotiation agents have to consider how much they could gain individually if they cooperate and in which way of cooperating they could gain more, or at least receive a fair deal. Negotiation protocols should include techniques for dealing fairly with rational agents that are also able to lead them to mutually beneficial agreements. Because of this, a fundamental objective of any negotiation protocol should be to optimize some type of social welfare measurement [1]. There are many different social welfare measurements, such as the sum or product of utilities, the min utility, etc. [2–4].

In spite of that, social welfare has not been taken into account as an integral part of the negotiation process. There are some works that incorporate a social welfare criterion within the search process, though. In [5], the mediator generates jointly preferred proposals for agreements. By iteratively moving along jointly improving directions from the tentative agreements produced by the method, negotiating parties can achieve joint gains and finally reach a Pareto-optimal agreement. The procedure is repeated until no further joint improvements can be found. In [6], a mediator assists decision makers in finding Pareto-optimal solutions. Decision makers have to indicate their most preferred points on different sets of linear constraints. The method can be used to generate either one Pareto-optimal solution dominating the status quo solution of the negotiation or an approximation to the Pareto frontier. In [7], a non-biased mediator agent supports negotiation agents in reaching an agreement: at each stage of the negotiation, the mediator searches for the compromise direction based on an E-DD (Equal Directional Derivative) approach and computes the new tentative agreement.

These solutions have some important restrictions. First, the utility functions have to be differentiable and quasiconcave. Second, the absolute value of the gradient is not considered, so the marginal utility obtained by the agents may not be fair. Third, the protocol is prone to untruthful revelations of information intended to bias the direction generated by the mediator. Finally, the protocols do not allow the desired consensus on the final agreement to be specified.

The traditional or strict notion of consensus in multi-agent negotiation protocols, commonly known as unanimity, assumes that consensus exists only if all agents agree on a contract. Unanimous agreements may be quite difficult or even impossible to achieve in practice and, in some cases, undesirable. Alternative definitions of consensus, such as soft consensus [8], have been proposed that consider different degrees of partial agreement among agents to decide about the existence of consensus on a contract. Consensus measures based on soft consensus can be used to reflect linguistic expressions of mediation rules by using linguistic quantifiers.

In this work, we propose a framework to incorporate the type of consensus desired for an agreement as an integral part of multiparty negotiation protocols. We propose HCPMF, a Hierarchical Consensus Policy based Mediation Framework for Multi-Agent Negotiation. HCPMF implements a mediation protocol that is based on the Generalized Pattern Search (GPS) non-linear optimization technique [9], the use of Ordered Weighted Averaging (OWA) operators [10, 11], and the use of Hierarchical Clustering (HC) [12]. GPS is used by the agents to perform local exploration of the negotiation space, HC lets the mediator form clusters of agents to avoid zones of no agreement, and OWA operators are used to apply the consensus policies, which are captured using linguistic quantifiers. Globally, HCPMF allows an efficient search for agreements following predefined consensus policies, which may take the form of linguistic expressions. The protocol is designed to minimize the revelation of private information. Agents only propagate offers to the mediator, not their preferences for the offers. Furthermore, agents’ offers need not be known by their opponents.

The next section presents the basic operation of the negotiation protocol. Then we present a variation of the GPS algorithm to perform local exploration of the negotiation space, followed by the mediation mechanisms. The last two sections describe the experimental evaluation and present our conclusions.

2.2 The Negotiation Protocol

We shall assume a set of $n$ agents $A = \{A_1, \ldots, A_n\}$ and a finite set of issues $X = \{x_1, \ldots, x_m\}$ in a continuous or discrete domain. A contract is a vector $x = \{x'_1, \ldots, x'_m\}$ defined by an instance of issue values. Each agent $A_i$ has a real-valued function $U_i : X \to \mathbb{R}$ that associates with each contract $x$ a value $U_i(x)$, the payoff the agent assigns to that contract. The preference function can be described as any mapping function between the negotiation space contracts and the set of real numbers, and it can be non-monotonic and non-differentiable. The aim of the agents will be to reach an agreement on a contract $x$ maximizing their individual payoff while minimizing the revelation of private information.

2.2.1 Basic Operation of the Negotiation Protocol

The basic protocol of the negotiation process is as follows:

1. Each agent sends the mediator an initial contract offer. This offer may be theresult of a local utility maximization process, or a contract generated at random.


2. Based on the received offers, the mediator applies the HC algorithm to formclusters of agents. The cluster with the highest number of agents is selected.

3. The mediator applies the OWA operator to the offers in the selected cluster toobtain a feedback contract. The OWA operator synthesizes the consensus policyto apply. Finally, the mediator verifies if the deadline has been reached. If so,negotiation ends with an agreement on the feedback contract. Otherwise, goto step 4.

4. The mediator computes the group distance, which is a distance estimate to thecurrent feedback contract from the offers in the cluster. If the group distanceis below a threshold the negotiation ends with an agreement on the feedbackcontract. Otherwise go to step 5.

5. The mediator proposes the feedback contract to the agents.

6. Each agent performs a local exploration of the negotiation space using GPS to generate a new offer. The agent’s exploration considers the feedback contract and utility. Go to step 2.

In the next section we will present the GPS non-linear optimization algorithm thatwill be used by agents to explore the contract space.

2.3 Agents’ Local Exploration (GPS)

Each agent privately explores the negotiation space using a variation of the GPS [9] non-linear optimization algorithm. GPS belongs to the family of Direct Search Based optimization algorithms. Formally, the optimization problem can be defined as $\max f(x)$, where $f : \mathbb{R}^m \to \mathbb{R}$, $x \in \mathbb{R}^m$, represents the evaluation of the contracts in terms of distance, utility or both. At an iteration $k$ of the protocol, we have an iterate $x(k) \in \mathbb{R}^m$ and a step-length parameter $\Delta_k > 0$. We will use the notation $x^{+o}(k)$ to designate the mesh at round $k$ plus the current point $x(k)$ (see Fig. 2.1).

Fig. 2.1 An illustration of a mesh for $m = 2$ at round $k$. The reference point is $x(k)$


This set of points or mesh is an instance of what we call a pattern. One important feature of pattern search, which plays a significant role in a global convergence analysis, is that we do not need an estimate of the derivative of $f$ at $x(k)$, so long as the search includes a sufficient set of directions to form a positive spanning set for the cone of feasible directions, which in the unconstrained case is all of $\mathbb{R}^m$. The set $e$ is defined by the number of independent variables in the objective function $m$ and the positive standard basis set. A commonly used positive basis is the maximal basis, with $2m$ vectors. For example, if there are two independent variables in the optimization problem, the default $2m$ positive basis consists of the following pattern vectors: $e_1 = \{1, 0\}$, $e_2 = \{0, 1\}$, $-e_1 = \{-1, 0\}$ and $-e_2 = \{0, -1\}$.

The exploration begins at the first negotiation round with the generation of an initial random contract (reference contract) and a set of contracts (mesh) around the reference contract at a predefined distance. The reference contract is the offer submitted to the mediator, which computes a feedback contract, taking into account the reference contracts received from all the agents, and sends it back. Then, we successively evaluate the points in the mesh $x^{+}(k) = x(k) \pm \Delta_k e_j$, $j \in \{1, \ldots, m\}$, in terms both of utility and of distance to the feedback contract provided by the mediator (evaluations are better for higher utilities and shorter distances). If one or more contracts $x'(k)$ in the mesh $x^{+}(k)$ improve the reference contract both in utility and distance, the contract with the highest improvement becomes the current reference contract ($x(k+1) = x'(k)$), and a new mesh is generated with the step length increased by a factor of 2, $\Delta_{k+1} = 2 \cdot \Delta_k$. Otherwise, the agent has to decide whether to behave as a utility maximizer, considering only the contracts’ utility in the evaluation, or as a utility conceder, considering only the distance to the feedback contract. We model the agents’ attitude using a random variable. In either case, if the mesh contains an improvement, that is, there exists at least one $x'(k)$ that improves $x(k)$ either in terms of utility or distance but not in both, the contract with the highest improvement ($x'(k)$) becomes the current reference contract, and a new mesh $x^{+}(k+1)$ is generated with the step length increased by a factor of 2, $\Delta_{k+1} = 2 \cdot \Delta_k$. If there is no point $x'(k)$ in the mesh $x^{+}(k)$ that improves the current reference contract $x(k)$, the reference contract remains the same ($x(k+1) = x(k)$) and a new mesh $x^{+}(k+1)$ is generated at half the current step length, $\Delta_{k+1} = 0.5 \cdot \Delta_k$.
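The exploration round described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `gps_step`, the parameter `concede_prob`, and in particular the way the "highest improvement" in both criteria is scored (utility gain plus distance reduction, which mixes two scales) are assumptions made for the example.

```python
import random

def gps_step(x, step, utility, distance, concede_prob, rng=random):
    """One mesh-exploration round; returns (new reference contract, new step length)."""
    # Mesh: x +/- step * e_j for every issue j (maximal positive basis, 2m points).
    mesh = []
    for j in range(len(x)):
        for sign in (1.0, -1.0):
            p = list(x)
            p[j] += sign * step
            mesh.append(tuple(p))

    u0, d0 = utility(x), distance(x)

    # Case 1: some mesh point improves both utility and distance.
    both = [p for p in mesh if utility(p) > u0 and distance(p) < d0]
    if both:
        # Assumption: joint improvement = utility gain + distance reduction.
        best = max(both, key=lambda p: (utility(p) - u0) + (d0 - distance(p)))
        return best, 2.0 * step

    # Case 2: the agent's attitude is drawn at random.
    if rng.random() < concede_prob:
        # Utility conceder: only the distance to the feedback contract counts.
        candidates = [p for p in mesh if distance(p) < d0]
        best = max(candidates, key=lambda p: d0 - distance(p), default=None)
    else:
        # Utility maximizer: only the utility counts.
        candidates = [p for p in mesh if utility(p) > u0]
        best = max(candidates, key=lambda p: utility(p) - u0, default=None)
    if best is not None:
        return best, 2.0 * step

    # Case 3: no improvement, keep the reference contract and halve the step.
    return tuple(x), 0.5 * step
```

Here `utility` is the agent's private utility function and `distance` measures the distance from a contract to the current feedback contract; both are passed in as plain callables.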

2.4 The Mediation Mechanisms

The goal of the mediation process is to provide a useful feedback to the agentsto guide the joint exploration of the negotiation space implementing the desiredconsensus while avoiding zones of no agreement. This feedback is represented bythe feedback contract or social contract. The mediation process takes into accountnot only the utility of the offers but also their distance to the social contract.


This mediation process, at any round k, can be described as follows:

1. The HC algorithm is applied to the agents’ offers $O^k = \{o^k_1, \ldots, o^k_n\}$ in order to form clusters of agents.

2. For the contracts in the largest cluster $O^k_c = \{o^k_{c1}, \ldots, o^k_{cl}\}$, the centroid $c^k$, the distances $D^k_c = \{d^k_{c1}, \ldots, d^k_{cl}\}$ from the contracts to the centroid and the set of direction vectors $R^k_c = \{r^k_{c1}, \ldots, r^k_{cl}\}$ from the centroid to the contracts are computed.

3. The sets $O^k_c$, $D^k_c$ and $R^k_c$ are ordered from lower to higher distances (distances in $D^k_c$). The set $D^k_c$ is normalized to the range $[\min(D^k_c), 0]$, with $\min(D^k_c)$ representing the lowest distance and 0 the highest distance.

4. The OWA operator that represents the desired consensus policy is applied to these values in order to obtain the feedback contract.

5. To assess the convergence to a solution, the mediator also computes the group distance as the OWA-weighted distances to the feedback contract.

Next we will go into detail in each of the steps performed by the mediator at eachround k. First, we will describe the clustering mechanism, second, the procedureto obtain the feedback contract, which includes the description of the aggregationprocedures used to model the consensus policy, and finally, the computation of thegroup distance.

2.4.1 Forming Clusters of Agents (HC)

Here we look at the process whereby the mediator obtains the largest cluster of agents at each negotiation round. We have used a Hierarchical Clustering (HC) algorithm [12] to perform this task. HC groups data over a variety of scales by creating a cluster tree or dendrogram. The tree is not a single set of clusters, but rather a multilevel hierarchy, where clusters at one level are joined as clusters at the next level. This allows us to decide the level or scale of clustering that is most appropriate at each step of the negotiation process.

In our case, we assume that the mediator has defined an upper bound on the number of rounds as a deadline. This number of rounds $nr$ is divided into $ns$ stages, with $nr/ns$ rounds per stage. At each stage, a predefined scale of clustering is applied. The mediator applies the scales of clustering in descending order, which means that as the negotiation progresses the clustering process is more prone to generate clusters. The rationale behind this is that we first try to reach agreements with as many agents as possible, and if we are not able to reach a global agreement we progressively form smaller groups where the negotiation process is focused on agents with closer preferences. In order to vary the scale of clustering, a cutoff level is varied which specifies the level at which the hierarchy of clusters is cut.
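As a rough illustration of the staged clustering, the sketch below computes a per-round cutoff and extracts the largest cluster. Several choices here are assumptions, since the text does not fix them: single linkage, a cutoff interpreted as a Euclidean distance threshold, linear interpolation of the cutoff between stages, and the function names `cutoff_for_round` and `largest_cluster`.

```python
import math

def cutoff_for_round(k, nr, ns, first, last):
    """Cutoff used at round k (0-based) when nr rounds are split into ns stages."""
    stage = min(k // (nr // ns), ns - 1)
    # Linear interpolation from the first-stage cutoff to the last-stage cutoff.
    return first + (last - first) * stage / (ns - 1)

def largest_cluster(offers, cutoff):
    """Indices of the largest group when a single-linkage tree is cut at `cutoff`.

    Cutting a single-linkage dendrogram at a distance threshold is equivalent to
    taking connected components of the graph linking offers closer than the threshold.
    """
    parent = list(range(len(offers)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(offers)):
        for j in range(i + 1, len(offers)):
            if math.dist(offers[i], offers[j]) <= cutoff:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(offers)):
        groups.setdefault(find(i), []).append(i)
    # Ties between equally sized clusters are broken arbitrarily (first found).
    return max(groups.values(), key=len)
```

A large cutoff in early stages tends to merge most agents into one cluster; as the cutoff shrinks, the offers split into smaller groups and the mediator focuses on the largest one.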


2.4.2 Computing the Feedback Contract

Our point of departure here is the collection of $l$ contracts corresponding to the largest cluster. For this set of contracts, the mediator computes the centroid $c^k$, the distances $D^k_c$ and the set of direction vectors $R^k_c$. The mediator’s objective is to obtain a feedback contract that best represents a predefined consensus policy.

If the consensus policy is to keep as many agents satisfied as possible, under complete uncertainty, the mediator could propose the centroid as a compromise solution. On the other hand, if the consensus policy is to have, for instance, at least one agent satisfied with a high utility, the feedback contract should be biased towards the contracts closer to the centroid. To develop these ideas we use the quantifier guided aggregation technique, which is implemented through the use of OWA operators. This mechanism is a refinement with respect to the clustering mechanism. While the purpose of HC is to avoid zones of no agreement, the aim of using OWA operators is to apply a predefined consensus policy.

2.4.2.1 OWA Operators

Our goal is to elicit a function $M$ which takes $c^k$, $D^k_c$ and $R^k_c$ in order to obtain a feedback contract following a consensus policy. The form of $M$ is called the mediation rule; it describes the process of combining the individual agents’ preferences. The form of $M$ can be used to reflect a desired mediation imperative or consensus policy for aggregating the preferences of the individual agents to get the feedback contract. The most widespread consensus policy found in the automated negotiation literature suggests using as an aggregation imperative a desire to satisfy all the agents. We propose to use application-dependent mediation rules to manage the negotiation processes. The idea is to use a quantifier guided aggregation, which allows a natural language expression of the quantity of agents that need to agree on an acceptable solution. As we shall see, the OWA operators [11] provide a tool to model this kind of softer mediation rule.

We define two types of aggregation operators, scalar and vectorial.

Definition 2.1. A scalar OWA operator of dimension $l$ is a mapping $M : S^l \to G$, $(S, G \in [0, 1])$, such that $M(S_1, \ldots, S_l) = \sum_{t=1}^{l} w_t b_t$, where $b_t$ is the $t$-th largest element of the aggregates $\{S_1, \ldots, S_l\}$ and the $w_t$ are weights such that $w_t \in [0, 1]$ and $\sum_{t=1}^{l} w_t = 1$.

Definition 2.2. A vectorial OWA operator of dimension $l$ is a mapping $M : S^l \to G$, $(S, G \in \mathbb{R}^m)$, such that $M(S_1, \ldots, S_l) = \sum_{t=1}^{l} w_t b_t$, where $b_t$ is the $t$-th largest element of the vectorial aggregates $\{S_1, \ldots, S_l\}$ and the $w_t$ are weights such that $w_t \in [0, 1]$ and $\sum_{t=1}^{l} w_t = 1$.


It can be shown [11] that OWA aggregation has the following properties:

1. Commutativity: The indexing of the arguments is irrelevant.

2. Monotonicity: If $S_i \geq \hat{S}_i$ for all $i$ then $M(S_1, \ldots, S_l) \geq M(\hat{S}_1, \ldots, \hat{S}_l)$.

3. Idempotency: $M(S, \ldots, S) = S$.

4. Boundedness: $\max_i [S_i] \geq M(S_1, \ldots, S_l) \geq \min_i [S_i]$.

In the OWA aggregation the weights are not directly associated with a particular argument but with the ordered position of the arguments. If $\mathrm{ind}$ is an index function such that $\mathrm{ind}(t)$ is the index of the $t$-th largest argument, then we can express $M$ as:

$$M(S_1, \ldots, S_l) = \sum_{t=1}^{l} w_t S_{\mathrm{ind}(t)} \tag{2.1}$$

The form of the aggregation is dependent upon the associated weighting vector. We have a number of special cases of weighting vectors. The vector $W^{*}$ defined such that $w_1 = 1$ and $w_t = 0$ for all $t \neq 1$ gives us the aggregation $\max_i [S_i]$. Thus, it provides the largest possible aggregation. The vector $W_{*}$ defined such that $w_l = 1$ and $w_t = 0$ for all $t \neq l$ gives the aggregation $\min_i [S_i]$. An interesting family of OWA operators are the E-Z OWA operators [13]. There are two families. In the first family we have $w_t = 1/q$ for $t = 1$ to $q$, and $w_t = 0$ for $t = q + 1$ to $l$. Here we are taking the average of the $q$ largest arguments. The other family defines $w_t = 0$ for $t = 1$ to $q$, and $w_t = \frac{1}{l - q}$ for $t = q + 1$ to $l$. We can see that this operator can provide a softening of the original min and max mediation rules by modifying $q$.
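To make the role of the weighting vector concrete, here is a small sketch of the scalar OWA operator of Definition 2.1 together with the special weighting vectors just described. The function names (`owa`, `max_weights`, `min_weights`, `ez_top`, `ez_bottom`) are illustrative choices, not names used by the authors.

```python
def owa(values, weights):
    """Scalar OWA: weights apply to the values sorted from largest to smallest."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    ordered = sorted(values, reverse=True)  # b_1 >= b_2 >= ... >= b_l
    return sum(w * b for w, b in zip(weights, ordered))

def max_weights(l):
    """W^*: all weight on the largest argument."""
    return [1.0] + [0.0] * (l - 1)

def min_weights(l):
    """W_*: all weight on the smallest argument."""
    return [0.0] * (l - 1) + [1.0]

def ez_top(l, q):
    """First E-Z family: average of the q largest arguments."""
    return [1.0 / q] * q + [0.0] * (l - q)

def ez_bottom(l, q):
    """Second E-Z family: average of the l - q smallest arguments."""
    return [0.0] * q + [1.0 / (l - q)] * (l - q)
```

For example, with the arguments `[0.2, 0.9, 0.5, 0.7]`, `max_weights(4)` and `min_weights(4)` recover the plain maximum and minimum, while `ez_top(4, 2)` averages the two largest values.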

2.4.2.2 Quantifier Guided Aggregation

There are several approaches to perform OWA weights identification [14], including methods based on maximum entropy and on previous observations of decision makers’ performance [15]. In this work, we will derive OWA weights from linguistic quantifiers [11]. Our final objective is to define consensus policies in the form of a linguistic agenda. For example, the mediator should make decisions regarding the generation of the feedback contract following mediation rules like “most agents must be satisfied by the contract”, “at least $\alpha$ agents must be satisfied by the contract”, “many agents must be satisfied”, and so on.

The previous examples are examples of quantifier guided aggregations, whichare aligned with the notion of soft-consensus, which we discussed earlier. Linguisticquantifiers [16] can be used to semantically express aggregation policies andactually capture Kacprzyk’s notion of soft consensus.

OWA weights identification based on linguistic quantifiers is possible thanks to fuzzy set theory. There are two types of linguistic quantifiers: absolute and relative [16]. Any relative linguistic quantifier can be expressed as a fuzzy subset $Q$ of the unit interval $I = [0, 1]$ [10]. In this representation, for any proportion $y \in I$, $Q(y)$ indicates the degree to which $y$ satisfies the concept expressed by the term $Q$. Relative linguistic quantifiers can be classified into three categories: Regular Increasing Monotone (RIM) quantifiers, Regular Decreasing Monotone (RDM) quantifiers and Regular UniModal (RUM) quantifiers [11]. RIM quantifiers allow us to model the notion of soft consensus [17]. Formally, these quantifiers are characterized in the following way:

1. $Q(0) = 0$

2. $Q(1) = 1$

3. $Q(x) \geq Q(y)$ if $x > y$.

Examples of this kind of quantifier are all, most, many, at least $\alpha$. According to this representation, the quantifier all can be represented by $Q_{*}$, where $Q_{*}(1) = 1$ and $Q_{*}(x) = 0$ for all $x \neq 1$, and the quantifier any by $Q^{*}$, which is defined as $Q^{*}(0) = 0$ and $Q^{*}(x) = 1$ for all $x \neq 0$. It has been shown [11] that the OWA weights can be parametrized using this kind of function.

Under the quantifier guided mediation approach, a group mediation protocol is expressed in terms of a linguistic quantifier $Q$ indicating the proportion of agents whose agreement is necessary for a solution to be acceptable. The basic form of the mediation rule in this approach is “$Q$ agents must be satisfied by the contract”, where $Q$ is a quantifier. The formal procedure used to implement the mediation rule is as follows:

1. Use $Q$ to generate a set of OWA weights $W = w_1, \ldots, w_l$.

2. Use the weights $W$ to calculate the feedback contract.

The procedure used for generating the weights from the quantifier is to divide the unit interval into $l$ equally spaced intervals and then to compute the length of the mapped intervals using $Q$:

$$w_t = Q\left(\frac{t}{l}\right) - Q\left(\frac{t - 1}{l}\right) \quad \text{for } t = 1, \ldots, l. \tag{2.2}$$

In Fig. 2.2 we show an example of a linguistic quantifier and illustrate the process of determining the weights from the quantifier. The weights depend on the number of agents as well as the form of $Q$. In Fig. 2.3 we show the functional form for the quantifiers all ($Q_{*}$), any ($Q^{*}$), at least $\alpha$ percent, the linear quantifier, piecewise $Q_{Z\beta}$ and piecewise $Q_{Z\alpha}$.

The quantifiers all, any and at least $\alpha$ describe the consensus policy using a natural language verbal description. For example, given $Q$ = at least $\alpha$, $Q(x) = 1$ if $x > \alpha$, which means that the proportion $x$ fulfils the concept conveyed by the quantifier, whereas $Q(x) = 0$ if $x < \alpha$, because the proportion $x$ is not compatible with the concept expressed by the quantifier (the minimum proportion $\alpha$ is not reached).

However, more generally, any function $Q : [0, 1] \to [0, 1]$ that meets the requirements previously stated for the quantifiers can be seen as an appropriate form for generating mediation rules or consensus policies.
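The weight generation of Eq. (2.2) is short enough to sketch directly. The quantifier helpers below are illustrative; in particular, the text leaves the value of "at least α" at exactly $x = \alpha$ unspecified, so treating it as satisfied there is an assumption of this example.

```python
def owa_weights(Q, l):
    """Eq. (2.2): w_t = Q(t/l) - Q((t-1)/l) for t = 1, ..., l."""
    return [Q(t / l) - Q((t - 1) / l) for t in range(1, l + 1)]

def q_all(y):
    """Quantifier 'all': satisfied only by the full proportion."""
    return 1.0 if y >= 1.0 else 0.0

def q_any(y):
    """Quantifier 'any': satisfied by any non-zero proportion."""
    return 1.0 if y > 0.0 else 0.0

def q_at_least(alpha):
    """Quantifier 'at least alpha' (assumed satisfied at y == alpha)."""
    return lambda y: 1.0 if y >= alpha else 0.0

def q_power(p):
    """Family Q_p(y) = y**p, p > 0; p = 1 is the linear quantifier."""
    return lambda y: y ** p
```

With five agents, `owa_weights(q_all, 5)` puts all the weight on the last ordered position and `owa_weights(q_any, 5)` on the first, while `owa_weights(q_power(1), 4)` yields four equal weights of 0.25.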


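[Figure: plot of $Q(y)$ against $y$, with the unit interval divided at $i/5$; the increments of $Q$ over consecutive subintervals give the weights $w_1, \ldots, w_5$]

Fig. 2.2 Example of how to obtain the weights from the quantifier for $n = 5$ agents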

[Figure: six panels plotting $Q(y)$ against $y$ on $[0, 1]$, titled ALL, ANY, At least $\alpha$, Linear, $Q_{Z\beta}$ and $Q_{Z\alpha}$]

Fig. 2.3 Functional form of typical quantifiers: all, any, at least, linear, piecewise linear $Q_{Z\beta}$ and piecewise linear $Q_{Z\alpha}$


Table 2.1 VOID values for different quantifiers

Quantifier          VOID
All                 $1$
Any                 $0$
At least $\alpha$   $\alpha$
Linear              $0.5$
$Q_{Z\alpha}$       $\frac{\alpha}{2}$
$Q_{Z\beta}$        $\frac{1}{2} + \frac{\beta}{2}$
$Q_p$               $\frac{p}{p+1}$

One feature which distinguishes the different types of mediation rules is thepower of an individual agent to eliminate an alternative. For example, in the caseof all this power is complete, and any agent could force an alternative to be rejectedby voting zero. In order to capture this idea, we introduce the Value Of IndividualDisapproval (VOID) [10], which is defined as:

$$\mathrm{VOID}(Q) = 1 - \int_0^1 Q(y)\,dy \tag{2.3}$$

VOID measures this power of an individual agent to eliminate an alternative. For the all, any, at least $\alpha$ and linear quantifiers the VOID measures are respectively $1$, $0$, $\alpha$ and $0.5$. For the $Q_{Z\beta}$ quantifier, $\mathrm{VOID}(Q_{Z\beta}) = \frac{1}{2} + \frac{\beta}{2}$ and therefore $\mathrm{VOID}(Q_{Z\beta}) \in [0.5, 1]$. The $Q_{Z\alpha}$ quantifier gets $\mathrm{VOID}(Q_{Z\alpha}) = \frac{\alpha}{2}$ and $\mathrm{VOID}(Q_{Z\alpha}) \in [0, 0.5]$.

Another family of quantifiers are those defined by $Q_p(y) = y^p$ for $p > 0$. In this case $\mathrm{VOID}(Q_p) = 1 - \int_0^1 r^p\,dr = \frac{p}{p+1}$. For $Q_p$ we see that as $p$ increases we get closer to the min and that as $p$ gets closer to zero we get the max (Table 2.1).
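The closed-form VOID values above can be checked numerically. The sketch below approximates the integral in Eq. (2.3) with a midpoint rule; the function name `void` and the number of integration steps are choices made for this illustration only.

```python
def void(Q, steps=100000):
    """VOID(Q) = 1 - integral of Q over [0, 1], approximated by the midpoint rule."""
    h = 1.0 / steps
    area = sum(Q((i + 0.5) * h) for i in range(steps)) * h
    return 1.0 - area
```

For instance, `void(lambda y: y)` is close to 0.5 (linear quantifier) and `void(lambda y: y ** 20)` is close to $20/21 \approx 0.952$, in line with $p/(p+1)$.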

2.4.2.3 Computation of the Feedback Contract

Finally, once $W$ has been obtained, the feedback contract at round $k$ is computed as

$$fc(k) = c^k + \frac{v}{\|v\|} \cdot \sum_{i=1}^{l} w_i \cdot d^k_{ci}, \tag{2.4}$$

where

$$v = \sum_{i=1}^{l} w_i \cdot r^k_{ci}. \tag{2.5}$$

Vector $v$ results from applying the vectorial OWA operator to the direction vectors. The feedback contract is generated in the direction pointed to by $v$ from the origin $c^k$. The distance at which the feedback contract is generated is obtained by applying the scalar OWA operator to the distances to the centroid. Now, for instance, let us assume a quantifier $Q_p(y) = y^p$ with $p = 20$, which means that $\mathrm{VOID} = 0.95$ (i.e. we want many agents satisfied), and that we have four contracts in the selected cluster. In this case $w_l$ will approach 1 and vector $v$ will approximate $r^k_{cl}$, pointing to the contract farthest from the centroid. However, the feedback contract will be the centroid $c^k$ because $\sum_{i=1}^{l} w_i \cdot d^k_{ci} = d^k_{cl} = 0$. For a very low VOID, $w_1$ will approximate 1, which means that $v = r^k_{c1}$, pointing to one of the contracts. In addition, the second summand in $fc(k)$ will be $\frac{v}{\|v\|} \cdot d^k_{c1}$ with $d^k_{c1} = \min(D^k_c)$, which means that the feedback contract will be very close to one of the contracts (the closest one). These are only two examples of the effect that $W$ has on the generation of the feedback offer. For high VOID values the feedback contract approaches the centroid to satisfy many agents. For low VOID values the feedback contract approaches the contracts closest to the centroid.
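The two limiting cases discussed above can be reproduced with a small sketch of Eqs. (2.4) and (2.5). The text states only that the distances are normalized to the range $[\min(D^k_c), 0]$; the linear rescaling used below is an assumption, as are the function name `feedback_contract` and the fallback to the centroid when $v$ is the zero vector.

```python
import math

def feedback_contract(offers, weights):
    """Sketch of Eqs. (2.4)-(2.5) for the offers of the selected cluster."""
    m = len(offers[0])
    centroid = [sum(o[j] for o in offers) / len(offers) for j in range(m)]

    # Order the offers from the closest to the farthest from the centroid.
    ordered = sorted(offers, key=lambda o: math.dist(o, centroid))
    dists = [math.dist(o, centroid) for o in ordered]
    dirs = [[o[j] - centroid[j] for j in range(m)] for o in ordered]

    # Assumed linear rescaling onto [min(D), 0]: the closest offer keeps
    # min(D) and the farthest offer is mapped to 0.
    d_min, d_max = dists[0], dists[-1]
    span = d_max - d_min
    norm = [d_min * (d_max - d) / span if span > 0 else d_min for d in dists]

    # Eq. (2.5): vectorial OWA over the direction vectors.
    v = [sum(w * r[j] for w, r in zip(weights, dirs)) for j in range(m)]
    length = math.hypot(*v)
    if length == 0.0:
        return centroid  # no preferred direction: stay at the centroid

    # Eq. (2.4): move from the centroid along v by the OWA-weighted distance.
    step = sum(w * d for w, d in zip(weights, norm))
    return [centroid[j] + v[j] / length * step for j in range(m)]
```

With the weights of the quantifier all, the result is the centroid; with the weights of any, it coincides with the offer closest to the centroid.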

2.4.3 Measuring the Quality of the Agreement

Once a feedback contract has been generated, it is important to evaluate the degree to which this feedback contract satisfies the desired consensus policy. This serves as a signal to know when to stop the negotiation process. We use the group distance as a measure of closeness to the desired agreement. To compute this group distance, we again employ the OWA weights computed previously and use them to calculate the weighted sum of the distances from the offers in the cluster to the feedback contract. The formula is as follows:

$$Gd^k = \sum_{i=1}^{l} w_i \cdot \| o^k_{ci} - fc(k) \|. \tag{2.6}$$

Notice that we use W to OWA-weight the distance estimate to take into accountthe consensus policy. If the group distance falls below a threshold, the negotiationends with an agreement on the feedback contract.
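The stopping test based on the group distance can be sketched as below. It assumes the offers are passed in the same order used to pair them with the OWA weights (closest to the centroid first), and the function name `group_distance` is illustrative.

```python
import math

def group_distance(ordered_offers, weights, fc):
    """Eq. (2.6): OWA-weighted sum of distances from the cluster offers to fc."""
    return sum(w * math.dist(o, fc) for w, o in zip(weights, ordered_offers))

def has_converged(ordered_offers, weights, fc, threshold):
    """Negotiation ends when the group distance falls below the threshold."""
    return group_distance(ordered_offers, weights, fc) < threshold
```

Because the weights encode the consensus policy, a policy that puts all the weight on one ordered position only requires the offer in that position to be close to the feedback contract.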

2.5 Experimental Evaluation

In this section, we show that the proposed mechanisms provide the mediator with the tools to efficiently conduct multiagent negotiations following different consensus policies. In the first experimental setup we have considered seven agents, two issues and two different types of negotiation spaces: a negotiation space where agents’ utility functions are strategically built to define a proof of concept negotiation scenario, and a complex negotiation scenario where utility functions exhibit a more complex structure. In both cases utility functions are built using an aggregation of Bell functions. This type of utility function captures the intuition that agents’ utilities for a contract usually decline gradually with distance from their ideal contract. Bell functions are ideally suited to model, for instance, spatial and temporal preferences and to simulate different levels of complexity.


Fig. 2.4 Utility Functions for the proof of concept Scenario (surface plot of the utility functions of Agents 1 to 7; axes: x1, x2, Utility)

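A Bell is defined by a center c, a height h, and a radius r. Let $\|s - c\|$ be the Euclidean distance from the center c to a contract s; then the Bell function is defined as

$$f_{bell}(s, c, h, r) = \begin{cases} h - 2h\,\dfrac{\|s - c\|^{2}}{r^{2}} & \text{if } \|s - c\| < \dfrac{r}{2} \\[2mm] \dfrac{2h}{r^{2}}\,(\|s - c\| - r)^{2} & \text{if } r > \|s - c\| \ge \dfrac{r}{2} \\[2mm] 0 & \text{if } \|s - c\| \ge r \end{cases} \qquad (2.7)$$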

and the Bell utility function as

$$U_{b,s}(s) = \sum_{i}^{nb} f_{bell}(s, c_i, h_i, r_i) \qquad (2.8)$$

where nb is the number of generated bells. The complexity of the negotiation space can be modulated by varying $c_i$, $h_i$, $r_i$ and nb.
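As an illustration of Eqs. (2.7) and (2.8), the following minimal Python sketch evaluates a single Bell and the aggregated Bell utility. The function names and the representation of a bell as a `(center, height, radius)` tuple are assumptions made for this example only.

```python
import math


def f_bell(s, c, h, r):
    """Bell function of Eq. (2.7) for contract s, center c, height h, radius r."""
    d = math.dist(s, c)  # Euclidean distance ||s - c||
    if d < r / 2:
        return h - 2 * h * d ** 2 / r ** 2
    if d < r:
        return 2 * h / r ** 2 * (d - r) ** 2
    return 0.0


def bell_utility(s, bells):
    """Bell utility of Eq. (2.8): sum of the bells evaluated at contract s.

    bells is an iterable of (center, height, radius) tuples.
    """
    return sum(f_bell(s, c, h, r) for c, h, r in bells)
```

The two non-zero branches meet at distance r/2, where both evaluate to h/2, so the function declines smoothly from h at the center to 0 at distance r.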

In the proof of concept negotiation scenario each agent has a utility function with a single optimum. Figure 2.4 shows in the same graph the agents’ utility functions in the bidimensional negotiation space $[0, 100]^2$. Four agents (Agents 1, 2, 3 and 4) are in weak opposition (i.e. their preferences are quite similar), Agents 6 and 7 are in weak opposition with each other and in very strong opposition with respect to the other agents, and Agent 5 is in very strong opposition with respect to the rest of the agents. In the complex negotiation scenario (Fig. 2.5) each agent’s utility function is generated using two randomly located bells. The radius and height of each bell are randomly distributed


Fig. 2.5 Utility Functions for the Complex Negotiation Scenario (surface plots of the utility functions of Agents 1 to 7; axes: x1, x2, Utility)

within the ranges $r_i \in [20, 35]$ and $h_i \in [0.1, 1]$. The configuration of parameters in the mediator is: $n_r = 50$ rounds, $n_s = 10$ stages and a group distance threshold of 0.001. The cutoffs applied in HC go from 2 in the first stage to 0.1 in the last stage following linear decrements. The probability for an agent to concede (i.e. to attend exclusively to the feedback contract) is modelled for each agent using a probability value obtained from a uniform distribution between 0.25 and 0.5. For instance, an agent with probability 0.5 will concede with a 50% probability whenever it is not possible to improve both utility and distance from the feedback contract. We tested the performance of the protocol for three different consensus policies with VOID degrees 0, 0.5 and 0.95, using the quantifier $Q_p(y) = y^p$. Each experiment consists of 100 negotiations where we capture the utilities achieved by each agent. To analyze the results we first build a 7 agents × 100 negotiations utility matrix where each row provides each agent’s utilities and each column is a negotiation. The matrix is then reorganized such that each column is individually sorted from higher to lower utility values. Note that after this transformation the association between a row and a particular agent disappears. Given the matrix, we form seven different utility groups: a first group named group level 1 where we take the highest utility from each negotiation (i.e. the first row), a second group named group level 2 with the first two rows, and so on. In order to show the performance of the protocol we have used the Kaplan-Meier estimate of the cumulative distribution function (cdf) [18] of agents’ utilities for each group. Thus, we compute the cdf for the highest utilities, for the two highest utilities and so on. The cdf estimates the probability of finding agents’ utilities below


Fig. 2.6 Cumulative distributions of utilities for the proof of concept scenario (panels: 1 Agent to 7 Agents; x-axis: Utility; y-axis: Probability; curves: Voidness 0, Voidness 0.5, Voidness 1)

a certain value. The rationale behind using grouping in the analysis is to evaluate the ability of the protocol to find solutions which satisfy groups of agents.
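The grouping procedure described above can be sketched as follows. This is an illustrative reconstruction, not the code used in the experiments: the function names are hypothetical, and the plain empirical cdf is used in place of the Kaplan-Meier estimate, to which it is equivalent only under the assumption that no observation is censored.

```python
def group_levels(utility_matrix):
    """Build the utility groups from an agents x negotiations matrix.

    utility_matrix[a][n] is the utility of agent a in negotiation n.
    Each column is sorted from higher to lower utility, and group level g
    pools the g highest utilities of every negotiation.
    """
    n_agents = len(utility_matrix)
    n_negotiations = len(utility_matrix[0])
    columns = [sorted((row[n] for row in utility_matrix), reverse=True)
               for n in range(n_negotiations)]
    return {g: [u for col in columns for u in col[:g]]
            for g in range(1, n_agents + 1)}


def empirical_cdf(samples, x):
    """Fraction of samples with a value less than or equal to x."""
    return sum(1 for u in samples if u <= x) / len(samples)
```

With three agents and two negotiations, for example, group level 2 contains four utilities: the two best results of each negotiation, regardless of which agents obtained them.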

In the proof of concept scenario (see Fig. 2.4) it can be seen that when a unanimous agreement is needed, the best alternative is to satisfy Agents 1, 2, 3 and 4. If it is enough to have one agent satisfied, any of the utility peaks would be a good solution. In Fig. 2.6 we show the results for the proof of concept scenario. Each line shows the cdf for a VOID value, and each plot focuses on the results obtained for each group level. For instance, in group level 1 (i.e. one Agent) there is a 75%


Fig. 2.7 Cumulative distributions of utilities for the complex negotiation scenario (panels: 1 Agent to 7 Agents; x-axis: Utility; y-axis: Probability; curves: Voidness 0, Voidness 0.5, Voidness 1)

probability of having agents with utility 1 for VOID 0, a 40% probability of having one agent with utility 1 for VOID 0.5, and a 2% probability of having agents with utility 1 for a VOID approaching 1. As we evaluate the utility distribution for more agents, we can see that if we want many agents satisfied, the best we can do is to use a high VOID value. In this case utility is shared in a more uniform way, possibly at the cost of not having highly satisfied agents.

In Fig. 2.7 the results for the complex negotiation scenario are shown. The results also show that as VOID increases, the mediator biases the search toward agreements where more agents are satisfied at the expense of the individual satisfaction level.


Fig. 2.8 Social Welfare Optimality Rate vs VOID (x-axis: VOIDNESS, 0 to 0.95; y-axis: Social Welfare Optimality Rate, 40% to 100%; curves: High Complexity Scenario, Lowest Complexity Scenario)

In general, it is worth noting that the application of a consensus policy may incur a cost in terms of social welfare. To evaluate this issue, in a second experimental setup we have considered seven agents, two issues and four different types of negotiation spaces of increasing complexity.

Figure 2.8 shows the social welfare measurements (sum of utilities) for different VOID degrees. Social welfare is normalized to its optimal value. VOID ranges from 0 to 0.95. We can see how the application of consensus policies comes at a cost in terms of social welfare, both for low and for high VOID values. For example, in scenarios where there exists strong opposition among the agents, if we want to have many agents satisfied, individual utilities cannot be simultaneously large for all the agents, and therefore social welfare decreases. Also note that there exists a VOID value which maximizes social welfare. For complex scenarios, there will be a trade-off between VOID and social welfare.

2.6 Conclusion

We argue that there exist situations where a unanimous agreement is not possible or where the rules imposed by the system simply do not seek such unanimous agreement. Thus, we developed a hierarchical consensus policy based mediation framework (HCPMF) to perform multiparty negotiations. To explore the negotiation space, agents use a variation of the GPS non-linear optimization technique. The mediator guides the joint exploration of a solution by using


aggregation rules which take the form of linguistic expressions. These rules are applied over the agents’ offered contracts in order to generate a feedback contract which is submitted to the agents in order to guide their exploration. To avoid zones of no agreement the mediator uses Hierarchical Clustering to form clusters of agents. We showed empirically that HCPMF efficiently manages negotiations following predefined consensus policies, which have been modelled using OWA operators.

The negotiation framework presented is one of the first proposals that incorporate alternate consensus definitions for the mediation rule as an integral part of multiparty negotiation protocols. This framework can be extended to incorporate more complex consensus rules that would take into consideration, for instance, the different importance of the negotiating agents or their attitudes. There are also open aspects that we expect to deal with in future work. It is expected that the performance of the protocol deviates from the optimal if agents act strategically. Alternative ways of generating the feedback contract, based for instance on the history of past offers, and not only on their current position, should be considered. Finally, we plan to explore its possible application to domains such as consortium formation in brokering events.

Acknowledgements This work has been supported by Spanish Ministry of Economy and Innovation grant IPT-2012-0808-370000, STIMULO research project.

References

1. Hindriks, K., Jonker, C., Tykhonov, D.: A multi-agent environment for negotiation. In: El Fallah Seghrouchni, A., Dix, J., Dastani, M., Bordini, R.H. (eds.) Multi-Agent Programming, pp. 333–363. Springer, New York (2009)

2. Klein, M., Faratin, P., Sayama, H., Bar-Yam, Y.: Protocols for negotiating complex contracts. IEEE Intell. Syst. 18(6), 32–38 (2003)

3. Endriss, U., Maudet, N., Sadri, F., Toni, F.: Negotiating socially optimal allocations of resources. J. Artif. Intell. Res. 25, 315–348 (2006)

4. Lai, G., Sycara, K.: A generic framework for automated multi-attribute negotiation. Group Decis. Negot. 18, 169–187 (2009)

5. Ehtamo, H., Hamalainen, R.P., Heiskanen, P., Teich, J., Verkama, M., Zionts, S.: Generating pareto solutions in a two-party setting: constraint proposal methods. Manag. Sci. 45(12), 1697–1709 (1999)

6. Heiskanen, P., Ehtamo, H., Hamalainen, R.P.: Constraint proposal method for computing pareto solutions in multi-party negotiations. Eur. J. Oper. Res. 133(1), 44–61 (2001)

7. Li, M., Vo, Q.B., Kowalczyk, R.: Searching for fair joint gains in agent-based negotiation. In: Decker, K., Sichman, J., Sierra, C., Castelfranchi, C. (eds.) Proceedings of 8th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2009), Budapest, pp. 1049–1056, 10–15 May 2009

8. Kacprzyk, J.: Group decision making with a fuzzy linguistic majority. Fuzzy Sets Syst. 18(2), 105–118 (1986)

9. Lewis, R.M., Torczon, V., Trosset, M.W.: Direct search methods: then and now. J. Comput. Appl. Math. 124, 191–207 (2000)

10. Yager, R., Kacprzyk, J.: The Ordered Weighted Averaging Operators: Theory and Applications. Kluwer, Dordrecht (1997)

11. Yager, R.: Quantifier guided aggregation using OWA operators. Int. J. Intell. Syst. 11, 49–73 (1996)

12. Ward, J.H.: Hierarchical grouping to optimize an objective function. J. Am. Stat. Assoc. 58(301), 236–244 (1963)

13. Yager, R.: E-z OWA weights. In: Proceedings of 10th International Fuzzy Systems Association World Congress, Istanbul, pp. 39–42 (2003)

14. Grabisch, M., Orlovski, S.A., Yager, R.R.: Fuzzy Sets in Decision Analysis, Operations Research and Statistics, pp. 31–68. Kluwer, Norwell (1998)

15. Yager, R.R., Filev, D.P.: Essentials of Fuzzy Modeling and Control. Wiley-Interscience, New York (1994)

16. Zadeh, L.: A computational approach to fuzzy quantifiers in natural languages. Comput. Math. Appl. 9, 149–184 (1983)

17. Liu, X., Han, S.: Orness and parameterized RIM quantifier aggregation with OWA operators: a summary. Int. J. Approx. Reason. 48(1), 77–97 (2008)

18. Kaplan, E.L., Meier, P.: Nonparametric estimation from incomplete observations. J. Am. Stat. Assoc. 53(282), 457–481 (1958)

Chapter 3
Multilateral Mediated Negotiation Protocols with Feedback

Reyhan Aydogan, Koen V. Hindriks, and Catholijn M. Jonker

Abstract When more than two participants have a conflict of interest, finding a mutual agreement may entail a time-consuming process, especially when the number of participants is high. Automated negotiation tools can play a key role in providing effective solutions. This paper presents two variants of a feedback based multilateral negotiation protocol in which a mediator agent generates bids and negotiating agents give their feedback about those bids. We investigate different types of feedback given to the mediator. The mediator uses the agents’ feedback to model each agent’s preferences and accordingly generates well-targeted bids over time rather than arbitrary bids. Furthermore, the paper investigates the performance of the protocols in an experimental setting. Experimental results show that the proposed protocols result in a reasonably good outcome for all agents in a relatively short time.

Keywords Multilateral negotiation • Protocols • Smart mediators

3.1 Introduction

Much attention has been paid to bilateral negotiation in which the dispute is between only two parties. However, automated multilateral negotiation, in which more than two negotiating parties need to reach a joint agreement, has received relatively less attention [4], even though such negotiations are required in many circumstances. For instance, decision-making processes in organizations (i.e. business or governmental organizations) mostly involve more than two individuals, and in personal life a group of friends or family members may need to reach an agreement on a particular matter such as their holiday.

R. Aydogan (✉) • K.V. Hindriks • C.M. Jonker
Interactive Intelligence Group, Delft University of Technology, Delft, The Netherlands
e-mail: [email protected]; [email protected]; [email protected]



Multilateral negotiation is more complicated than bilateral negotiation because reaching an agreement among more than two parties means more conflicts and more interactions. An important issue is to decide on the protocol that governs the interaction between parties and determines when the final agreement will be reached. In this paper, we focus on and investigate different mediator-based protocols. In such protocols, a mediator generates and proposes bids. We investigate the feedback that agents provide in response to such mediator-generated bids. We take [6] as a starting point and propose two variants of the protocol. In that protocol, a mediator generates bids and asks negotiating parties for their approval or disapproval of the bids; finally it determines the negotiation outcome based on the votes of the parties during the negotiation. The protocol is convenient for both software and human agents since the participants just need to compare the current bid with the last bid accepted by all parties, and vote accordingly. The mediator searches the outcome space based only on the most recent bid mutually accepted by all parties, without taking the preferences of the parties into consideration. Due to privacy concerns the negotiating parties may (and possibly would) be reluctant to reveal their preferences entirely to the mediator, so it is reasonable for the mediator not to ask for the preferences of the parties directly. However, the mediator may try to understand the preferences of the parties based on their feedback during the negotiation and revise its bids accordingly. This approach may allow the mediator to complete the negotiation earlier.

This paper presents two variants of a feedback based multilateral negotiation protocol in which the mediator models the negotiating parties’ preferences based on their feedback during the negotiation and generates bids by taking the utility of each negotiating party into consideration. Similar to the protocol in [6], it does not require high computational effort from the negotiating parties, so human agents may take part in the negotiation as negotiating parties. Furthermore, the mediator agent searches the outcome space based on the knowledge acquired from the feedback given by the negotiating parties during the negotiation. We experimentally compare the original protocol proposed in [6] with the two new variants we introduce in this paper. Experimental results show that the agents benefit utility-wise.

The rest of this paper is organized as follows: Sect. 3.2 gives a brief introduction to the mediated single text negotiation presented in [6]. Section 3.3 explains the proposed multilateral negotiation protocols and the mediator’s preference modeling approach. Section 3.4 explains our experimental setup, metrics, and results. Finally, Sect. 3.5 discusses our work.

3.2 Mediated Negotiation

According to the mediated single text negotiation protocol presented in [6], the mediator initially generates a bid randomly and asks the negotiating agents to vote for this bid. Each agent can vote to either “accept” or “reject” in accordance with its negotiation strategy. If all negotiating agents vote to accept, the bid is labeled as the


most recent mutually accepted bid. In further rounds, the mediator modifies the most recent mutually accepted bid by randomly exchanging one value in the bid with another and asks the negotiating agents to vote for the current bid. This process continues iteratively until a predefined number of bids is reached.
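The mediator loop of the single text protocol can be sketched as below. This is a minimal sketch, not the implementation of [6]: the function name, the representation of voters as callables, and the `rng` parameter are hypothetical. In particular, the text does not say which bid is mutated before any bid has been mutually accepted, so the sketch assumes the mediator then mutates its last proposed bid.

```python
import random


def mediated_single_text(domains, voters, n_bids, rng=None):
    """Run a single text mediated negotiation for n_bids proposals.

    domains: list with the possible values of each issue.
    voters:  callables vote(candidate, accepted) -> bool, one per agent;
             accepted is None until a bid has been mutually accepted.
    Returns the most recent mutually accepted bid, or None.
    """
    rng = rng or random.Random()
    accepted = None
    candidate = [rng.choice(d) for d in domains]  # first bid is random
    for _ in range(n_bids):
        if all(vote(candidate, accepted) for vote in voters):
            accepted = list(candidate)
        # Assumption: fall back to the last proposal if nothing is accepted yet.
        base = accepted if accepted is not None else candidate
        candidate = list(base)
        issue = rng.randrange(len(domains))
        alternatives = [v for v in domains[issue] if v != candidate[issue]]
        if alternatives:  # exchange one value with another, chosen randomly
            candidate[issue] = rng.choice(alternatives)
    return accepted
```

Passing the voting behaviour in as callables keeps the mediator independent of the agents' strategies, which matches the protocol's premise that the mediator only sees votes.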

In that study, two voting strategies are defined for the agents: “Hill-climber” and “Annealer”. An agent employing the Hill-climber strategy only accepts a bid if its utility is greater than the utility of the most recent mutually accepted bid. The problem with the Hill-climber approach is that if the utility of the initial bid is quite high for one of the negotiating agents, that agent may not accept other bids even though those bids might be better for the majority. By contrast, an agent employing the Annealer strategy calculates the probability of acceptance for the current bid based on the utility difference and a virtual temperature, which gradually declines over time. The probability is higher when the difference is small and the virtual temperature is high. That is, an agent employing Annealer has a tendency to accept individually worse bids earlier so that the agents can find win-win bids later. Towards the end of the negotiation, the agent has a tendency to accept only the bids whose utility is greater than the utility of the most recent mutually accepted bid. The authors also propose some other approaches to handle exaggerator agents, but those are beyond the scope of this paper since we consider that all negotiating agents are truthful in our negotiation framework.
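The two voting strategies can be illustrated with the following sketch. The text only states that the Annealer's acceptance probability grows as the utility loss shrinks and as the virtual temperature rises; the exponential form used here is the standard simulated annealing rule and is an assumption, as are the function names and the linear cooling schedule.

```python
import math


def hill_climber_accepts(u_candidate, u_accepted):
    """Hill-climber: accept only bids strictly better than the accepted one."""
    return u_candidate > u_accepted


def annealer_accept_probability(u_candidate, u_accepted, temperature):
    """Annealer: probability of accepting the candidate bid.

    Assumption: exp(-loss / temperature), the usual simulated annealing rule.
    """
    loss = u_accepted - u_candidate
    if loss < 0:
        return 1.0  # an improving bid is always accepted
    if temperature <= 0:
        return 0.0  # fully cooled: behaves like a Hill-climber
    return math.exp(-loss / temperature)


def linear_temperature(step, n_steps, t0=1.0):
    """Hypothetical cooling schedule: declines linearly from t0 to 0."""
    return t0 * (1 - step / n_steps)
```

Under this rule a small loss at a high temperature is accepted almost surely, while the same loss near the end of the negotiation is almost always rejected, which reproduces the qualitative behaviour described above.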

3.3 Proposed Mediated Negotiation

Inspired by the mediated negotiation approach explained above, we present two variants of a feedback based mediated multilateral protocol and a preference modeling approach for the mediator based on the feedback given by the negotiating agents during the negotiation. In both variants, the mediator agent tries to model the preferences of each negotiating agent by using their feedback about the mediator’s bids. Consequently, the mediator aims to generate better bids for all of the agents by using the learnt model over time.

In the proposed approach, the mediator generates its first bid randomly. For further bids it modifies its previous bid by exchanging one value in the bid with another, either randomly or according to a heuristic based on the preference models learnt during the negotiation. When the negotiating agents receive a bid from the mediator, they give feedback such as “better”, “worse”, or “same” rather than simply voting to accept or reject the mediator’s current bid. To do this, the agents compare the mediator’s current bid with its previous bid and give their feedback accordingly. For example, if the current bid is better than the previous one for the agent, it says “better”. Based on this feedback, the mediator tries to model the preferences of each negotiating party. To achieve this, the mediator only assumes that the negotiating agents give their feedback truthfully, that preferences form a total preorder, and that there is no preferential interdependency among the issues. It is worth noting that the mediator does not make any other


assumptions about the negotiating agents’ preference representation. The agents may use a qualitative preference model to represent their preferences, or they may represent their preferences by means of additive utility functions. Furthermore, this allows each negotiating agent to choose its preference representation freely. Unless there exist preferential interdependencies among the issues, the agents can employ different preference representations for their preferences.

In the following section, we first describe how the mediator models the preferences of each negotiating agent based on their feedback and then present two variants of the mediated multilateral protocol in which the mediator models the preferences and generates its bids accordingly.

3.3.1 Feedback Based Preference Modeling

As stated before, the mediator mutates its previous bid by flipping one of the issues at a time and gets feedback from the negotiating agents. This allows the mediator to have some information about each agent’s preferences on that issue. To illustrate this, consider that one of the agents specifies that the current bid, (x1, y1), is better than the previous one, (x2, y1), where xi and yi denote the values of the first and second issue in the bid, say X and Y respectively. By interpreting this feedback, the mediator can deduce that the value x1 is preferred over x2 for the first issue by that agent. If the mediator keeps the preferential information gathered from the agent’s feedback in a graphical model such as a preference graph, it can extract more preferential information by using some properties such as transitivity of the preferences. That is, if we know that x1 is preferred over x2, and x2 is preferred over x3, then we can infer that x1 is also preferred over x3 by using the transitivity of the preference orderings.

Accordingly, in the proposed approach the mediator generates a model, Mi, for each negotiating agent, Ai, and updates those models after receiving feedback from the agents. Mi is a set of preference graphs, $M_i = \{PG_1, PG_2, \ldots, PG_n\}$, where $PG_k$ is the preference graph for the kth issue. The nodes of these graphs denote the values for the given issue and the edges show the improving flips, i.e. changing the value of an issue to a more desired value. In other words, the edges are directed from less preferred to more preferred values. Figure 3.1 shows a sample preference graph for the issue X, whose possible values are denoted as $D(X) = \{x_1, x_2, x_3, x_4, x_5\}$. From the given preference graph, it is seen that the values x2 and x4 are equally preferred and that those values are preferred over the values x3 and x5. Moreover, it can be inferred that the value x1 is preferred over all other values by using the transitivity of the preference ordering. According to this preference graph, x3 and x5 cannot be compared since there is no path between them. By modeling the agent’s preferences via preference graphs, the mediator is able to extract more information from the given feedback. Transitivity can also be applied to “equally preferred” values. For instance, if the mediator knows that x4 is preferred over x5, it can deduce that x2 would be preferred over x5 since x2 and x4 are equally preferred. Consequently, it will be able to compare more pairs with less information.


Fig. 3.1 A sample preference graph for issue X

For example, the preference graph in Fig. 3.1 can be constructed by using only four pieces of feedback as follows:

• Feedback 1: x2 is better than x3.
• Feedback 2: x1 is better than x2.
• Feedback 3: x4 is the same as x2.
• Feedback 4: x5 is worse than x4.

Even though only four comparisons are given, the mediator can compare nine value pairs: {(x3 ≺ x2), (x3 ≺ x4), (x3 ≺ x1), (x5 ≺ x2), (x5 ≺ x4), (x5 ≺ x1), (x4 = x2), (x2 ≺ x1), (x4 ≺ x1)}.
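The inference over the four feedback items can be checked with a small sketch of a preference graph. The class and method names are hypothetical, and the graph is stored here as plain adjacency sets searched on demand; the chapter does not prescribe a particular data structure.

```python
from itertools import combinations


class PreferenceGraph:
    """Preference graph of one issue for one agent (illustrative sketch)."""

    def __init__(self):
        self.better = {}  # value -> values directly known to be preferred over it
        self.same = {}    # value -> values directly known to be equally preferred

    def add_feedback(self, current, previous, feedback):
        for v in (current, previous):
            self.better.setdefault(v, set())
            self.same.setdefault(v, set())
        if feedback == "better":
            self.better[previous].add(current)
        elif feedback == "worse":
            self.better[current].add(previous)
        elif feedback == "same":
            self.same[current].add(previous)
            self.same[previous].add(current)
        else:
            raise ValueError("unknown feedback: " + feedback)

    def _reachable(self, start):
        """States (value, strict) reachable from start via improving or
        'same' edges; strict is True once an improving edge has been used."""
        seen = {(start, False)}
        stack = [(start, False)]
        while stack:
            v, strict = stack.pop()
            steps = [(w, True) for w in self.better[v]]
            steps += [(w, strict) for w in self.same[v]]
            for state in steps:
                if state not in seen:
                    seen.add(state)
                    stack.append(state)
        return seen

    def compare(self, a, b):
        """Return '<' if a is less preferred than b, '>' if more preferred,
        '=' if equally preferred, and None if the pair is incomparable."""
        up_a = self._reachable(a)
        if (b, True) in up_a:
            return "<"
        if (a, True) in self._reachable(b):
            return ">"
        if (b, False) in up_a:
            return "="
        return None


pg = PreferenceGraph()
pg.add_feedback("x2", "x3", "better")  # Feedback 1
pg.add_feedback("x1", "x2", "better")  # Feedback 2
pg.add_feedback("x4", "x2", "same")    # Feedback 3
pg.add_feedback("x5", "x4", "worse")   # Feedback 4
comparable = [(a, b) for a, b in combinations(sorted(pg.better), 2)
              if pg.compare(a, b) is not None]
```

Of the ten possible pairs over five values, only (x3, x5) remains incomparable, so `comparable` holds nine pairs.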

The immediate question is how the mediator uses these models to generate better bids for all the agents. As the mediator is unbiased, it would be willing to increase the social welfare. To achieve this, it would try to increase one of the social welfare metrics, such as the Nash product, i.e. maximize the product of the utilities of the agents. However, it does not have a quantitative measurement such as utilities. Furthermore, we might not be able to compare some value pairs in the constructed graph. This problem is similar to the problem of negotiating with CP-nets [1, 2], where the agents try to negotiate with respect to the preference graph induced from a given CP-net. In that preference graph, the nodes denote the outcomes and there are some incomparable outcomes. In those studies, the authors present some heuristics to obtain estimated utilities; consequently, the negotiating agents generate their offers and decide whether to accept the opponent’s counter offer by employing those estimated utilities.

We adopt an approach similar to those studies and generate estimated utilities from the constructed graph by using a scoring approach similar to the depth approach proposed in [1, 2]. In their approach, the depth of an outcome in a preference graph is estimated as the length of the longest path from the root node, so it indicates how far the outcome is from the least preferred outcome. Thus, an outcome whose depth is higher is preferred over one whose depth is lower. Further, if two outcomes are at the same depth, it is assumed that these outcomes are equally


preferred by the user. Based on this intuition, they estimate the utility values between zero and one by applying the formula shown in Eq. (3.1).

$$U(x) = \frac{Depth(x, PG)}{Depth(PG)} \qquad (3.1)$$

Since in that study the preference graph is induced from a given CP-net, there is only one root node (the least preferred outcome). Therefore, it is straightforward to estimate the depth of an outcome in the preference graph by applying graph search algorithms. However, in our case we may not know which value is the least preferred value. Therefore, we estimate a score that is similar to the concept of depth but slightly different. The main principle is that if a value xm is preferred over another value xk, the score of xm would be higher than that of xk. If xm is less preferred than xk, the score of xm would be lower. If they are equally preferred, their scores would be equal.
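For reference, the depth heuristic of Eq. (3.1) for a graph with a single root can be sketched as below. The function name and the edge representation are hypothetical, and the sketch assumes that Depth(PG) denotes the largest depth found in the graph and that the graph is acyclic.

```python
def depth_utilities(improving_edges, root):
    """Estimate utilities as Depth(x, PG) / Depth(PG), as in Eq. (3.1).

    improving_edges maps an outcome to the outcomes directly preferred
    over it; root is the least preferred outcome. The depth of an outcome
    is the length of the longest path from the root.
    """
    depth = {root: 0}
    changed = True
    while changed:  # repeated relaxation; terminates on an acyclic graph
        changed = False
        for u, targets in improving_edges.items():
            if u not in depth:
                continue
            for v in targets:
                if depth.get(v, -1) < depth[u] + 1:
                    depth[v] = depth[u] + 1
                    changed = True
    max_depth = max(depth.values())
    if max_depth == 0:
        raise ValueError("the graph needs at least one improving edge")
    return {x: d / max_depth for x, d in depth.items()}
```

Note that this requires knowing the root in advance, which is exactly what the mediator lacks in the feedback setting and why a score is used instead.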

When the mediator generates its first bid randomly, it initializes the preference graphs for each issue with respect to the first bid. Each value in the first bid is added separately to the related graph (i.e. xi would be added to the graph belonging to issue X). To illustrate this, assume that we have two issues, X and Y, and the first bid is (x3, y1). The preference model would consist of two preference graphs: one for X and another for Y. The former graph would have a node associated with x3 while the latter graph would have a node associated with y1. The score of the first node in each preference graph is initialized to one (x3.SC = 1).

As the mediator mutates its previous bid by flipping the value of one of the issues and requests the agents’ feedback, it needs to update the preference models for each negotiating agent. When updating a preference model, only the preference graph associated with the issue whose value has just been changed is taken into account. Other preference graphs do not need to be updated. For instance, consider that the mediator generates its second bid by replacing x3 with x2 and proposes this bid (x2, y1) to the agents. Since only the value of X is changed, the agents’ feedback reflects their preferences on that issue. If an agent gives its feedback as “better”, that means that the agent prefers x2 to x3. Therefore, only the preference graph belonging to issue X should be updated in that case.

While updating the preference graph based on the agent’s feedback, in addition to adding edges between nodes the mediator estimates or updates the scores of the nodes. Algorithm 1 shows how this process is performed. In this algorithm, the previous value xp is the value of the issue in the previous bid while the current value xc is the value of that issue in the current bid. If xc does not exist in the preference graph, the mediator creates a node and links it to the node associated with xp based on the feedback and assigns a score to xc accordingly. If the feedback is “better” then its score will be higher than the score of xp. In that case, we increase the score of xp by one and assign it to xc. For example, the score of x2 would be equal to two (= x3.SC + 1). If the feedback is “same”, then the score of the current value would be equal to the score of the previous value. For example, in the further bid, if the mediator generates the bid by flipping x2 by x4 and


Algorithm 1: Pseudo-algorithm for updating the score of the nodes in the preference graph when the mediator flips the previous value xp by the current value xc for a given issue and gets a feedback from the agent

if xc not exists then
    if feedback is BETTER then
        xc.SC ← xp.SC + 1;
    if feedback is WORSE then
        xc.SC ← xp.SC − 1;
    if feedback is SAME then
        xc.SC ← xp.SC;
else
    if feedback is BETTER and xp.SC ≥ xc.SC then
        foreach xi ∈ { Comparable(xc, xi) \ {xp ∪ AllLessPreferred(xp)} } do
            xi.SC ← xi.SC + xp.SC − xc.SC + 1;
        end
    if feedback is WORSE and xp.SC ≤ xc.SC then
        foreach xi ∈ { Comparable(xp, xi) \ {xc ∪ AllLessPreferred(xc)} } do
            xi.SC ← xi.SC + xc.SC − xp.SC + 1;
        end
    if feedback is SAME then
        if xp.SC < xc.SC then
            foreach xi ∈ { Comparable(xp, xi) \ {xc ∪ AllLessPreferred(xc)} } do
                xi.SC ← xi.SC + xc.SC − xp.SC;
            end
            xp.SC ← xc.SC;
        if xp.SC > xc.SC then
            foreach xi ∈ { Comparable(xc, xi) \ {xp ∪ AllLessPreferred(xp)} } do
                xi.SC ← xi.SC + xp.SC − xc.SC;
            end
end

the agent gives the feedback as “same”, the score of x4 would also be equal to two (= x2.SC). If the feedback is “worse” then the score of the current value would be equal to the score of the previous value minus one. Consider that the further bid includes x5 and the feedback is “worse”. In that case, the score of x5 would be equal to one (= x4.SC − 1).
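The first branch of Algorithm 1, where the current value is not yet in the graph, can be illustrated with the short sketch below. The function name and the use of a plain dictionary of scores are assumptions made for this example; the conflict-resolution branch for values that already exist is deliberately left out.

```python
def score_new_value(scores, previous, current, feedback):
    """Assign a score to a value that is not yet in the preference graph.

    scores maps each known value to its score (SC); feedback compares the
    current value with the previous one from the agent's point of view.
    """
    if current in scores:
        raise ValueError("existing values need the conflict-resolution branch")
    offset = {"better": 1, "same": 0, "worse": -1}[feedback]
    scores[current] = scores[previous] + offset
    return scores


scores = {"x3": 1}                              # first bid contains x3
score_new_value(scores, "x3", "x2", "better")   # x2.SC = x3.SC + 1
score_new_value(scores, "x2", "x4", "same")     # x4.SC = x2.SC
score_new_value(scores, "x4", "x5", "worse")    # x5.SC = x4.SC - 1
```

Replaying the example from the text yields the scores 1, 2, 2 and 1 for x3, x2, x4 and x5 respectively.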

When the current value already exists in the graph, the process might be more complicated if a conflict emerges. A conflict may occur when the previous value and the current value are incomparable before the feedback. In that case, when the score of the previous value is higher than the score of the current value and the feedback is “better”, we need to update the score of the current value. If we only update the score of the current value, some inconsistencies may occur; therefore, we increase the score of all values related to the current value except the


Fig. 3.2 A sample preference graph for issue Y

Fig. 3.3 After updating the graph in Fig. 3.2

previous value and all values less preferred than the previous value. To illustrate this, consider the graph shown in Fig. 3.2. According to this graph, the values y6 and y1 are incomparable. When the previous value is y6 and the current value is y1, if the feedback given by the agent is “better”, we need to update the score of y1 and of all values related to it, except the nodes that are less preferred than y6 (e.g. y3). The nodes to be updated are y5, y1 and y4. Their scores will be increased by two (= 2 − 1 + 1). Then, the graph will look like the graph drawn in Fig. 3.3. A similar update process is performed when the feedback is “worse” or “same” and there is a conflict between the scores of the previous and current values with respect to the given feedback.

We scale each score in such a way that all scores are greater than zero and the highest score is one. These scaled scores correspond to the estimated


utilities with respect to our heuristic approach. The mediator uses these estimated utilities to find the values giving the Nash product. To illustrate this, consider that we have three negotiating agents that need to reach an agreement on two issues, say W and Z, whose domains are D(W) = {w1, w2, w3} and D(Z) = {z1, z2}. Accordingly, the mediator constructs three models, each consisting of two preference graphs (one for W and another for Z), for those agents after generating its first bid. During the negotiation, the mediator updates these models based on the agents’ feedback as explained above. When the mediator decides to use its knowledge and to choose the value that increases the social welfare in terms of the Nash product, it calculates the product of the estimated utilities of the agents for each value and selects the value that maximizes the product. Assume that the estimated utilities of the values for issue W are as follows:

• M1 (for the first agent): EU(w1) = 1.0; EU(w2) = 0.66; EU(w3) = 0.33.
• M2 (for the second agent): EU(w1) = 0.5; EU(w2) = 1; EU(w3) = 1.
• M3 (for the third agent): EU(w1) = 0.33; EU(w2) = 0.66; EU(w3) = 1.

Based on the estimated utilities above, the mediator estimates the products as P(w1) = 0.17, P(w2) = 0.44 and P(w3) = 0.33 by multiplying the values EU(wi) across the three models. According to this example, the mediator chooses w2 for issue W, since its product is the maximum. As stated before, final scores should be greater than zero: if one of them were equal to zero, the product of those scores would also be zero.
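The selection in this example can be reproduced with a few lines of Python. This is a minimal sketch rather than the authors' implementation: the function names `nash_products` and `nash_value` and the dictionary layout are assumptions made for illustration, while the numbers are the estimated utilities listed for issue W.

```python
from math import prod

# Estimated utilities for issue W, one model per agent (values from the example).
estimated_utilities = {
    "M1": {"w1": 1.0, "w2": 0.66, "w3": 0.33},
    "M2": {"w1": 0.5, "w2": 1.0, "w3": 1.0},
    "M3": {"w1": 0.33, "w2": 0.66, "w3": 1.0},
}


def nash_products(models):
    """Map each value to the product of its estimated utilities over all agents."""
    values = next(iter(models.values())).keys()
    return {v: prod(model[v] for model in models.values()) for v in values}


def nash_value(models):
    """Return the value whose product of estimated utilities is the largest."""
    products = nash_products(models)
    return max(products, key=products.get)


products = nash_products(estimated_utilities)
best = nash_value(estimated_utilities)
```

Rounded to two decimals, the products are approximately 0.17, 0.44 and 0.33, so `best` is `"w2"`. A score of zero in any model would force the product for that value to zero, which is why the scaled scores are kept strictly positive.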

3.3.2 Feedback Based Protocol

We present two variants of a feedback based protocol for multilateral negotiation. The first protocol is called Feedback Based Protocol (FBP). According to this protocol, the mediator generates its first bid randomly and sends it to the negotiating agents. After each bid, each negotiating agent gives feedback to the mediator, namely “better”, “worse” or “same”, by comparing the current bid with the mediator’s previous bid. For its further bids, the mediator mutates its previous bid by flipping one of the issues intelligently. This process continues iteratively until a predefined number of bids is reached.

In order to mutate its previous bid intelligently, the mediator needs to decide which issue will be changed and which value will be used for that issue. It can use the learnt model to generate values maximizing the product (Nash), but this may not work well at the beginning, since there is not yet sufficient knowledge about the agents’ preferences. Therefore, the mediator first searches the outcome space smartly for a while and then uses its learnt model to generate values maximizing the product.

Until half of the negotiation time has passed, it changes its previous bid by following the procedure below:


1. Unused Values: The mediator checks whether any issue contains a value that has not been used in its bids yet. If such issues exist, it randomly chooses one of them and assigns one of the unused values to that issue.

2. Incomparable Values: If all the issue values have been used before, the mediator checks whether one of the learnt models includes some issues whose values cannot be compared with the values in the previous bid. If there are incomparable values, the mediator chooses one of them randomly. This allows the mediator to learn more preferential information from the agents. For instance, consider that the previous offer is (w1, z2) and that, according to the model for the second agent, w1 cannot be compared with w2. If the mediator replaces w1 with w2 and sends (w2, z2) to the agents as a bid for their feedback, it will be able to compare these values afterwards.

3. Random Values: If there are no unused or incomparable values, the mediator randomly chooses an issue that has a value which may improve the bid for all agents. That is, the chosen value should not be worse than the current value in the previous bid.

4. Nash Values: If none of the issue values can improve the previous bid for all agents, the mediator chooses an issue randomly and selects the value for that issue whose product of estimated utilities of the agents is the maximum (Nash value) with respect to the learnt preference model.
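The four heuristics form a cascade in which each step is tried only when the previous one offers no candidate. The sketch below illustrates that control flow under stated assumptions: the function name `choose_mutation`, the argument layout, and the utilities for issue Z are hypothetical, and the set of incomparable values is assumed to be computed elsewhere from the preference graphs.

```python
import random
from math import prod


def choose_mutation(prev_bid, domains, used, incomparable, est_utils, rng=random):
    """Return (issue, value, heuristic) for the next single-issue change.

    prev_bid:     issue -> value in the mediator's previous bid
    domains:      issue -> list of all values of that issue
    used:         issue -> set of values that already appeared in some bid
    incomparable: issue -> set of values that at least one learnt model
                  cannot yet compare with prev_bid[issue]
    est_utils:    one dict per agent, issue -> value -> estimated utility
    """
    def pick(options, label):
        # options maps each eligible issue to its eligible values.
        issue = rng.choice(sorted(options))
        return issue, rng.choice(sorted(options[issue])), label

    # 1. Unused values: values never offered so far.
    unused = {i: [v for v in vals if v not in used[i]] for i, vals in domains.items()}
    unused = {i: vs for i, vs in unused.items() if vs}
    if unused:
        return pick(unused, "unused")

    # 2. Incomparable values: offering one yields new preference information.
    unknown = {i: vs for i, vs in incomparable.items() if vs}
    if unknown:
        return pick(unknown, "incomparable")

    # 3. Random values: not worse than the previous value for any agent.
    not_worse = {
        i: [v for v in vals
            if v != prev_bid[i]
            and all(m[i][v] >= m[i][prev_bid[i]] for m in est_utils)]
        for i, vals in domains.items()
    }
    not_worse = {i: vs for i, vs in not_worse.items() if vs}
    if not_worse:
        return pick(not_worse, "random")

    # 4. Nash value: random issue, value maximizing the product of utilities.
    issue = rng.choice(sorted(domains))
    value = max(domains[issue], key=lambda v: prod(m[issue][v] for m in est_utils))
    return issue, value, "nash"


domains = {"W": ["w1", "w2", "w3"], "Z": ["z1", "z2"]}
est_utils = [  # W utilities from the earlier example; Z utilities are made up
    {"W": {"w1": 1.0, "w2": 0.66, "w3": 0.33}, "Z": {"z1": 1.0, "z2": 0.5}},
    {"W": {"w1": 0.5, "w2": 1.0, "w3": 1.0}, "Z": {"z1": 1.0, "z2": 0.5}},
    {"W": {"w1": 0.33, "w2": 0.66, "w3": 1.0}, "Z": {"z1": 1.0, "z2": 1.0}},
]
```

With the previous bid (w1, z2), all values used and nothing incomparable, step 3 selects z1, because no agent's model rates z1 below z2. With the previous bid (w1, z1) no single change is weakly better for everyone, so the sketch falls through to the Nash value.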

After half of the negotiation time has passed, the mediator mostly exploits its knowledge. That is, it chooses an issue randomly and replaces the issue value in the previous bid with the issue value whose product of estimated utilities of the agents is the maximum with respect to the learnt models. Moreover, it can still search the outcome space, as explained in the procedure above, with a certain probability. This probability drops by a fixed amount over time and becomes zero at the end of the negotiation. According to this probability, the mediator either searches the outcome space or exploits its knowledge about the agents’ preferences. Equation (3.2) shows how we calculate the probability of search.

$$P(\mathrm{Search}) = \frac{\mathrm{TotalRound} - \mathrm{CurrentRound} - 1}{\mathrm{TotalRound}} \tag{3.2}$$
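As a small illustration, Eq. (3.2) and the search-or-exploit decision can be written as follows. The sketch assumes zero-based round numbering, under which the probability reaches exactly zero in the final round; the function names and the handling of the first half of the negotiation are illustrative choices, not taken from the authors' code.

```python
import random


def search_probability(current_round, total_rounds):
    """Eq. (3.2): probability of searching instead of exploiting."""
    return (total_rounds - current_round - 1) / total_rounds


def should_search(current_round, total_rounds, rng=random):
    """Decide whether the mediator searches the outcome space in this round."""
    # Assumption: rounds are numbered 0 .. total_rounds - 1, and the mediator
    # always follows the search procedure during the first half.
    if current_round < total_rounds / 2:
        return True
    return rng.random() < search_probability(current_round, total_rounds)
```

With 50 rounds, the probability is 0.48 in round 25 and falls by 1/50 per round until it is zero in round 49, so the mediator only exploits its learnt models in the last round.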

Up to now, we have explained how the mediator generates its bids and updates its preference models for the agents based on the feedback given by those agents. We now turn to how the mediator decides the final agreement. The mediator keeps and updates the “last recent better bid” with respect to the agents’ feedback and completes the negotiation with this bid. A bid is accepted as the “last recent better bid” if none of the agents’ current feedback is “worse”. Accordingly, the mediator updates the last recent better bid after each round of feedback. According to this protocol, the last recent better bid will be one of the recent bids, which are generated mostly by choosing the values maximizing the product of the estimated utilities of the agents with respect to the learnt models, since the mediator tends only to exploit its knowledge towards the end of the negotiation.
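The bookkeeping for the final agreement is simple enough to sketch directly. The function name `update_last_better_bid` and the sample bids and feedback below are hypothetical; only the rule itself (keep the current bid when no agent answers “worse”) comes from the description above.

```python
def update_last_better_bid(last_better_bid, current_bid, feedbacks):
    """Keep current_bid as the agreement candidate if no agent says "worse"."""
    if "worse" not in feedbacks:
        return current_bid
    return last_better_bid


def final_agreement(rounds):
    """Replay (bid, feedbacks) pairs and return the last recent better bid."""
    agreement = None
    for bid, feedbacks in rounds:
        agreement = update_last_better_bid(agreement, bid, feedbacks)
    return agreement


# Hypothetical trace with three agents; each feedback compares the bid with
# the mediator's previous bid.
rounds = [
    (("w1", "z2"), ["better", "same", "better"]),
    (("w2", "z2"), ["better", "worse", "same"]),   # rejected: one "worse"
    (("w2", "z1"), ["same", "better", "better"]),
]
agreement = final_agreement(rounds)
```

In this trace the second bid is skipped because one agent reports “worse”, and the negotiation ends with (w2, z1).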


3.3.3 Feedback and Voting Based Protocol

Our second protocol is called Feedback and Voting Based Protocol (FVBP). This protocol consists of two phases:

• Searching and learning: In this phase, the mediator generates its bids and models the negotiating agents’ preferences based on their feedback in a similar way as in Feedback Based Protocol. The only difference is that in this protocol the mediator does not try to generate Nash values in this phase. It only mutates its previous offers by flipping one of the issues, using the heuristics of unused values, incomparable values and random values that may improve the previous bid for all agents, as explained in Sect. 3.3.2. Furthermore, if there is no such value, it considers that it is time to move on to the second phase and acts accordingly.

• Voting with estimated Nash bids: In this phase, the mediator generates estimated Nash bids maximizing the product of the estimated utilities of the agents with respect to the learnt model and asks the negotiating agents to vote to either accept or reject them. Negotiating agents act according to the single text mediated protocol explained in Sect. 3.2 and vote on the mediator’s current Nash bid by comparing it with the most recent accepted bid. In our protocol, the negotiating agents adopt the Hill-Climber approach to vote on the bid. After proposing all estimated Nash bids, the mediator finalizes the negotiation.

It is worth noting that the mediator does not have to wait until the given deadline. If it realizes there is no need for further search in the first phase, it immediately moves on to the second phase, in which the estimated Nash bids are generated by the mediator and voted on by the negotiating agents. Consequently, the mediator is able to complete the negotiation earlier. Another advantage of this protocol is that the first mutually accepted bid is chosen among the estimated Nash bids rather than being a random bid. This decreases the chance of an unfair negotiation outcome in the end. Notice that the first mutually accepted bid has a great influence on the negotiation outcomes in the single text mediated protocol. To illustrate this, consider three negotiating agents for which the utilities of the first mutually accepted bid are 1.0, 0.4 and 0.5, respectively. Since the first agent already gets the best bid for itself, it will tend not to accept the mediator’s further bids (i.e. Hill-Climber), even though there might exist better bids as far as all agents’ preferences are concerned.
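The lock-in effect in this example can be checked numerically. The sketch below assumes that a bid becomes the new accepted bid only when every agent votes to accept and that a Hill-Climber accepts only strict improvements for itself; the utilities of the later bid are hypothetical and chosen purely for illustration.

```python
from math import prod


def hill_climber_accepts(bid_utility, accepted_utility):
    """Hill-Climber vote: accept only a strict improvement for oneself."""
    return bid_utility > accepted_utility


def mutually_accepted(bid_utilities, accepted_utilities):
    """A bid replaces the accepted one only if every agent votes to accept."""
    return all(hill_climber_accepts(b, a)
               for b, a in zip(bid_utilities, accepted_utilities))


first_accepted = (1.0, 0.4, 0.5)   # utilities from the example above
later_bid = (0.9, 0.8, 0.9)        # hypothetical bid, not from the source

blocked = not mutually_accepted(later_bid, first_accepted)
```

Here the later bid has a product of utilities of about 0.65 against 0.2 for the first accepted bid, yet it is blocked because the first agent would drop from 1.0 to 0.9. Starting the voting phase from estimated Nash bids is intended to make such an unbalanced first agreement less likely.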

3.4 Experiments

To evaluate the proposed protocols, we have extended GENIUS [7], which is a platform for bilateral negotiation. Our extension enables more than two agents to negotiate on this platform. Consequently, we compare the performance of the


Table 3.1 Group configurations and the maximum product of utilities of the agents in each group

| Group | Agents | Maximum product of utilities (Nash product) |
|---|---|---|
| Group-1 | A1-A2-A3 | 0.76 |
| Group-2 | A1-A4-A5 | 0.61 |
| Group-3 | A2-A4-A6 | 0.50 |
| Group-4 | A3-A5-A6 | 0.64 |
| Group-5 | A1-A7-A6 | 0.78 |

proposed protocols with the mediated single text protocol presented in [6] with respect to the product of the utilities of the agents on the agreement and the negotiation duration. We first give brief information about our experimental setup and then show our results in the following sections.

3.4.1 Experimental Setup

In our experiments, we use the party domain from the repository of the GENIUS platform. This domain consists of the following six issues: food, drinks, locations, invitations, music, and cleanup. For each issue, there are three or four possible values. For example, the music issue has three possible values (MP3, DJ and Band), while the invitations issue has four values (plain, custom-handmade, custom-printed and photo). The total number of possible outcomes is 3,072. We asked seven students and faculty members from Delft University of Technology about their preferences on the party domain. These preferences were elicited by means of additive utility functions using the GENIUS platform. For multilateral negotiation, we set up five different groups, each consisting of three individuals. Note that in our experiment, agents negotiate on behalf of the individuals. Table 3.1 shows the configuration of each group and the maximum product of the utilities of the agents in each group.

To investigate the performance of the proposed protocols, each group negotiates under four different negotiation settings. These are:

• Hill-Climber: In this setting, Mediated Single Text Negotiation Protocol [6] is employed. Each negotiating agent in the group adopts a Hill-Climber strategy to decide its vote (accept/reject).

• Annealer: This setting also uses Mediated Single Text Negotiation Protocol [6], but the negotiating agents employ an Annealer strategy to decide their votes (accept/reject).

• Feedback: In this setting, Feedback Based Protocol (Sect. 3.3.2) governs the negotiation and each negotiating agent gives truthful feedback with respect to its preferences.


Table 3.2 Average product of utilities of the agents over 100 negotiations when the deadline is 50 rounds

| Group | Hill-Climber | Annealer | Feedback | Feedback and voting^a |
|---|---|---|---|---|
| Group-1 | 0.42 | 0.42 | 0.65 | 0.71 |
| Group-2 | 0.37 | 0.40 | 0.48 | 0.47 |
| Group-3 | 0.25 | 0.23 | 0.30 | 0.30 |
| Group-4 | 0.53 | 0.46 | 0.62 | 0.64 |
| Group-5 | 0.47 | 0.48 | 0.56 | 0.57 |
| Overall | 0.41 | 0.40 | 0.52 | 0.54 |

^a It completes the negotiation in 30 rounds on average

• Feedback and Voting: The last setting employs Feedback and Voting Based Protocol (Sect. 3.3.3). In the voting phase, the negotiating agents vote on the mediator’s bids by employing the Hill-Climber strategy. That is, they only accept an offer if its utility is greater than the utility of the most recent mutually accepted bid.

In our experiments, each negotiation group negotiates 100 times in each negotiation setting described above. We evaluate the protocols in terms of the product of the utilities of the agents and the negotiation duration. Note that, to achieve a fair comparison, the same random seed is used for the same negotiation run in all negotiation settings (Hill-Climber, Annealer, Feedback, and Feedback and Voting).

3.4.2 Results

Table 3.2 shows the average product of the utilities of the agents over 100 negotiations when the deadline is set to 50 rounds. The results highlighted in bold are the statistically best settings. We have analyzed these negotiation results using ANOVA (Analysis of Variance). Feedback and Voting Based Protocol and Feedback Based Protocol outperform Mediated Single Text Negotiation Protocol with the Hill-Climber and Annealer settings in each group and overall with respect to the product of the utilities of the agents on the agreement. Overall, there is no statistically significant difference between the performance of Feedback and that of Feedback and Voting. However, the performance of Feedback and Voting is statistically significantly better than that of Feedback as far as the results for Group-1 and Group-2 are concerned. Furthermore, all protocols except Feedback and Voting complete the negotiation at 50 rounds. Although Feedback and Voting completes the negotiation earlier (30 rounds), it outperforms the others on average.

When we set the deadline to 250 rounds, we obtain the results in Table 3.3. Firstly, we observe that the performance of Mediated Single Text Negotiation Protocol with Annealer increases drastically when the number of rounds increases, while the performance of that with Hill-Climber does not change at all. As stated before, the problem with Hill-Climber is that when one of the agents gets a high utility in the previous rounds, it will not accept any bid whose utility is less than the previous one, even though those offers might be win-win solutions for all agents. By contrast, Annealer has a tendency to accept bids that are worse for itself early on, so that the agents can find win-win bids later. The performance of Feedback Based Protocol slightly increases when it has a longer negotiation duration. Further, there is no change in the performance of Feedback and Voting Based Protocol since it completes the negotiation in 30 rounds on average. At the 95% confidence level, the overall performances of Annealer, Feedback, and Feedback and Voting are not statistically significantly different from each other. That is, they perform equally well, and better than Hill-Climber. It is worth noting that Feedback and Voting Based Protocol not only results in good agreements for all parties but also completes the negotiation earlier.

Tables 3.3 and 3.4 (column headers only): Group | Hill-Climber | Annealer | Feedback | Feedback and voting^a

Table 3.4 shows the product of the utilities of the agents when the deadline is set to 500 rounds. The performance of Annealer slightly increases when the deadline goes up from 250 to 500. An interesting result is that, overall, the performance of Mediated Single Text Negotiation Protocol with Annealer is better than that of our feedback based protocols when the deadline is 500. This is because Annealer searches more of the outcome space and finds win-win bids. However, Feedback and Voting Based Protocol completes the negotiation much earlier than Annealer (30 versus 500 rounds). Furthermore, its performance is also close to the performance of the Annealer. If we consider both the performance and the negotiation duration, it can be concluded that Feedback and Voting Based Protocol is a promising protocol that results in reasonably good agreements in a short time.

3.5 Discussion

In this paper, we have presented two variants of a feedback based multilateral negotiation protocol: Feedback Based Protocol, and Feedback and Voting Based Protocol. In those protocols, a mediator agent generates bids and asks negotiating agents for their feedback about those bids. Accordingly, the mediator generates and updates a preference model for each negotiating agent by interpreting the agents’ feedback during the negotiation. By using the learnt model, the mediator generates better bids for all agents over time.

We have compared the performance of the proposed protocols with the performance of the mediated single text negotiation protocol presented in [6] in an experimental setting in terms of both the product of the utilities of the agents and the negotiation duration. Our results show that Feedback and Voting Based Protocol not only completes the negotiation with a reasonably good agreement for all agents but also completes the negotiation early. Furthermore, when the deadline is short, Feedback and Voting Based Protocol and Feedback Based Protocol outperform the mediated single text negotiation protocol in terms of the product of the utilities of the agents on the agreements. However, when the deadline is long, the mediated single text negotiation protocol with Annealers performs slightly better than our protocols. This stems from the fact that Feedback and Voting Based Protocol completes the negotiation much earlier than that protocol, and the Annealer allows the protocol to search more of the space. When time is crucial and it is important to conclude the negotiation as soon as possible, it is reasonable to employ Feedback and Voting Based Protocol.

Chalamish and Kraus present an automated mediator for bilateral negotiations in which agents share their preferences with the mediator [3]. In that study, the mediator monitors the negotiation and suggests possible acceptable agreements for both participants when it is necessary to speed up the negotiation. By contrast, the protocols presented in this paper support multilateral negotiation, where there are more than two negotiating agents. Also, agents do not share their preferences with the mediator because of privacy concerns; instead, the mediator tries to learn their preference ordering in our study.

Hemaissia et al. propose a multilateral protocol for cooperative negotiation domains, particularly crisis management systems [5]. In that study, the preferences are elicited by a multi-criteria decision aid tool, which allows the user to represent preferences involving interdependencies between issues, while in our study we elicit the preferences by means of additive utility functions and assume there is no preferential interdependency between issues. According to their protocol, each agent specifies its general constraints before the negotiation so that the mediator agent can propose realistic offers that satisfy those constraints. Accordingly, the mediator generates an offer that has not been proposed before and asks the other agents for their opinion. Each agent evaluates the offer and sends feedback to the mediator. Next time, the mediator generates its offer by considering this feedback. The feedback states whether the agent accepts or rejects the bid and includes a recommendation to improve the bid or a specification of the criteria that should not be changed. In our study, the feedback is simpler, so that agents do not need any tool to generate it, whereas in that study the agents use a multi-criteria decision aid tool to evaluate offers and to generate recommendations during the negotiation.

Lopez-Carmona et al. present a multiparty negotiation protocol that takes into account the cases where no unanimous agreement exists. According to the proposed protocol, a mediator agent chooses an initial contract randomly and accordingly proposes a mesh, a set of contracts, from the initial contract. Each agent privately informs the mediator about its preferences on these contracts. The mediator agent aggregates the individual preferences on each contract by using the Ordered Weighted Averaging (OWA) operator and finds the preferred contract with respect to the aggregated preferences. By applying a search method, the mediator agent decides whether to continue the negotiation by generating a new mesh or to complete the negotiation with the current preferred contract. While in that study the agents share their preferences on a set of bids with the mediator, in our case the agents do not need to share their preferences; instead, they give feedback about the current bid, such as “it is better than the previous offer”. In our study, one of the challenges is to model the agents’ preferences based on the given feedback.

As future work, we are planning to investigate the effect of the domain size and the number of negotiating agents on the performance of the protocols. Furthermore, the current preference model does not consider that each agent may have a different weight for the same issue. It would be interesting to improve the model in such a way that it can handle such cases.

Acknowledgements This research is supported by the Dutch Technology Foundation STW, applied science division of NWO and the Technology Program of the Ministry of Economic Affairs; the Pocket Negotiator project with grant number VICI-project 08075 and the New Governance Models for Next Generation Infrastructures project with NGI grant number 04.17. We would like to thank Mark Klein for his help with mediated single text negotiation, and also Maaike Harbers and Wietske Visser for their valuable comments.

References

1. Aydogan, R., Yolum, P.: Effective negotiation with partial preference information. In: Proceedings of the Ninth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), pp. 1605–1606 (2010)

2. Aydogan, R., Baarslag, T., Hindriks, K., Jonker, C.M., Yolum, P.: Heuristic-based approaches for CP-nets in negotiation. In: Proceedings of the Fourth International Workshop on Agent-based Complex Automated Negotiations (ACAN 2011), Taipei (2011)

3. Chalamish, M., Kraus, S.: An automated mediator for multi-issue bilateral negotiations. Auton. Agents Multi-Agent Syst. 24(3), 536–564 (2012)

4. Endriss, U.: Monotonic concession protocols for multilateral negotiation. In: Proceedings of the Fifth International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 392–399 (2006)

5. Hemaissia, M., Seghrouchni, A.E., Labreuche, C., Mattioli, J.: A multilateral multi-issue negotiation protocol. In: Proceedings of the Sixth International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 939–946 (2007)

6. Klein, M., Faratin, P., Sayama, H., Bar-Yam, Y.: Protocols for negotiating complex contracts. IEEE Intell. Syst. 18, 32–38 (2003)

7. Lin, R., Kraus, S., Baarslag, T., Tykhonov, D., Hindriks, K., Jonker, C.M.: Genius: an integrated environment for supporting the design of generic automated negotiators. Comput. Intell. (2012). http://onlinelibrary.wiley.com/doi/101111/j.1467-8640201200463x/full


Chapter 4
Decoupling Negotiating Agents to Explore the Space of Negotiation Strategies

Tim Baarslag, Koen Hindriks, Mark Hendrikx, Alexander Dirkzwager, and Catholijn Jonker

Abstract Every year, automated negotiation agents are improving on various domains. However, given a set of negotiation agents, current methods allow us to determine which strategy is best in terms of utility, but not so much the reason for its success. In order to study the performance of the individual elements of a negotiation strategy, we introduce an architecture that distinguishes three components which together constitute a negotiation strategy: the bidding strategy, the opponent model, and the acceptance condition. Our contribution to the field of bilateral negotiation is threefold: first, we show that existing state-of-the-art agents are compatible with this architecture; second, as an application of our architecture, we systematically explore the space of possible strategies by recombining different strategy components; finally, we briefly review how the BOA architecture has recently been applied to evaluate the performance of strategy components and create novel negotiation strategies that outperform the state of the art.

Keywords Acceptance condition • Automated bilateral negotiation • Bidding strategy • BOA architecture • Component-based • Opponent model

This is an extension of research presented at The Fifth International Workshop on Agent-based Complex Automated Negotiations (ACAN 2012).

T. Baarslag (✉) • K. Hindriks • M. Hendrikx • A. Dirkzwager • C. Jonker
Interactive Intelligence Group, Delft University of Technology, Mekelweg 4, Delft, The Netherlands



4.1 Introduction

In recent years, many new automated negotiation agents have been developed in the search for an effective, generic automated negotiator. There is now a large body of negotiation strategies available, and with the emergence of the International Automated Negotiating Agents Competition (ANAC) [4, 6], new strategies are generated on a yearly basis.

While methods exist to determine the best negotiation agent given a set of agents [4, 6], we still do not know which type of agent is most effective in general, and especially why. It is impossible to exhaustively search the large (in fact, infinite) space of negotiation strategies; therefore, there is a need for a systematic way of searching this space for effective candidates.

Many of the sophisticated agent strategies that currently exist are composed of a fixed set of modules. Generally, a distinction can be made between three different modules: one module that decides whether the opponent’s bid is acceptable; one that decides what set of bids could be proposed next; and finally, one that tries to guess the opponent’s preferences and takes this into account when selecting an offer to send out. The negotiation strategy is a result of the complex interaction between these components, of which the individual performance may vary significantly. For instance, an agent may contain a module that predicts the opponent’s preferences very well, but utility-wise, the agent may still perform badly because it concedes far too quickly.

This entails that overall performance measures, such as average utility obtained in a tournament, make it hard to pinpoint which components of an agent work well. To date, no efficient method exists to identify to which of the components the success of a negotiating agent can be attributed. Finding such a method would allow the development of better negotiation strategies, resulting in better agreements; the idea being that well-performing components together will constitute a well-performing agent.

To tackle this problem, we propose to analyze three components of the agent design separately. We show that most of the currently existing negotiating agents can be fitted into the so-called BOA architecture by putting together three main components in a particular way; namely: a Bidding strategy, an Opponent model, and an Acceptance condition. We support this claim by re-implementing, among others, the ANAC agents to fit into our architecture. Furthermore, we show that the BOA agents are equivalent to their original counterparts.
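To make the decomposition concrete, the following sketch shows one way the three components could be wired together. It is not the BOA framework's actual API: every class and method name here (`BOAAgent`, `candidate_bids`, `estimated_utility`, `accepts`, and the toy components) is hypothetical, and the issue values merely borrow names from the party domain for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Protocol, Tuple

Bid = Dict[str, str]


class BiddingStrategy(Protocol):
    def candidate_bids(self, time: float) -> List[Bid]: ...


class OpponentModel(Protocol):
    def update(self, opponent_bid: Bid) -> None: ...
    def estimated_utility(self, bid: Bid) -> float: ...


class AcceptanceCondition(Protocol):
    def accepts(self, opponent_bid: Bid, planned_bid: Bid, time: float) -> bool: ...


@dataclass
class BOAAgent:
    bidding: BiddingStrategy
    opponent_model: OpponentModel
    acceptance: AcceptanceCondition

    def respond(self, opponent_bid: Bid, time: float) -> Tuple[str, Bid]:
        # O: learn from the incoming bid.
        self.opponent_model.update(opponent_bid)
        # B: propose a set of bids; O picks the one the opponent likes most.
        candidates = self.bidding.candidate_bids(time)
        planned = max(candidates, key=self.opponent_model.estimated_utility)
        # A: decide whether to accept the opponent's bid instead.
        if self.acceptance.accepts(opponent_bid, planned, time):
            return "accept", opponent_bid
        return "offer", planned


class FixedBids:
    """Toy bidding strategy: always proposes the same candidate set."""
    def __init__(self, bids: List[Bid]):
        self.bids = bids

    def candidate_bids(self, time: float) -> List[Bid]:
        return self.bids


class FrequencyModel:
    """Toy opponent model: values the opponent offers often score higher."""
    def __init__(self):
        self.counts: Dict[Tuple[str, str], int] = {}

    def update(self, opponent_bid: Bid) -> None:
        for item in opponent_bid.items():
            self.counts[item] = self.counts.get(item, 0) + 1

    def estimated_utility(self, bid: Bid) -> float:
        return float(sum(self.counts.get(item, 0) for item in bid.items()))


class UtilityThreshold:
    """Toy acceptance condition: accept bids worth at least a threshold."""
    def __init__(self, utility: Callable[[Bid], float], threshold: float):
        self.utility, self.threshold = utility, threshold

    def accepts(self, opponent_bid: Bid, planned_bid: Bid, time: float) -> bool:
        return self.utility(opponent_bid) >= self.threshold


def my_utility(bid: Bid) -> float:
    return 1.0 if bid["music"] == "DJ" else 0.4


agent = BOAAgent(
    FixedBids([{"music": "DJ", "invitations": "plain"},
               {"music": "DJ", "invitations": "photo"}]),
    FrequencyModel(),
    UtilityThreshold(my_utility, 0.8),
)
```

Because each component sits behind its own small interface, any one of them can be swapped without touching the other two, which is the property that makes recombining components possible.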

The advantages of fitting agents into the BOA architecture are threefold: first, it allows the study of the behavior and performance of the individual components; second, it allows us to systematically explore the space of possible negotiation strategies; third, the identification of unique interacting components simplifies the creation of new negotiation strategies.

Finally, we demonstrate the value of our architecture by assembling, from already existing components, new negotiating agents that perform better than the agents from which they are created. This shows that by recombining the best performing components, the BOA architecture can yield better performing agents.


The remainder of this paper is organized as follows. Section 4.2 discusses the work related to ours. In Sect. 4.3, the BOA agent architecture is introduced, and we outline a research agenda on how to employ it. Section 4.4 provides evidence that many of the currently existing agents fit into the BOA architecture, and discusses challenges in decoupling existing negotiation strategies. Section 4.5 shows how the BOA architecture has been applied in education and research. Finally, in Sect. 4.6 we discuss lessons learned and provide directions for future work.

4.2 Related Work

Since this paper introduces a component-based architecture, we have surveyed literature that investigates and evaluates such components. There are three categories of related work: literature detailing the architecture of a negotiating agent’s strategy; work that discusses and compares the performance of components of a negotiation strategy; and finally, literature that explores and combines a set of negotiation strategies to find an optimal strategy.

4.2.1 Architecture of Negotiation Strategies

To our knowledge, there is little work in the literature describing, at a similar level of detail as our work, the generic components of a negotiation strategy architecture. For example, Bartolini et al. [10] and Dumas et al. [16] treat the negotiation strategy as a singular component. However, there are some notable exceptions.

Jonker et al. [27] present an agent architecture for multi-attribute negotiation, where each component represents a specific process within the behavior of the agent, e.g.: attribute evaluation, bid utility determination, utility planning, and attribute planning. There are some similarities between the two architectures; for example, the utility planning and attribute planning components correspond to the bidding strategy component in our architecture. In contrast to our work, however, Jonker et al. focus on tactics for finding a counter offer and do not discuss acceptance conditions.

Ashri et al. [2] introduce a general architecture for negotiation agents, discussing components that resemble our architecture; components such as a proposal evaluator and response generator resemble an acceptance condition and bidding strategy, respectively. However, the negotiation strategy is described from a BDI-agent perspective (in terms of motivation and mental attitudes).

Hindriks et al. [25] introduce an architecture for negotiation agents in combination with a negotiation system architecture. Parts of the agent architecture correspond to our architecture presented below, but their focus is primarily on how the agent framework can be integrated into a larger system.


4.2.2 Components of Negotiation Strategy

Evaluation of the performance of components is important to gain understanding of the performance of a negotiation strategy, and to find new, better strategies.

The notion of an opponent model as a component of a negotiation strategy has been discussed by various authors in different forms, including models that estimate the reservation value [41], the (partial) preference profile [23], the opponent’s acceptance of offers [33], and the opponent’s next move [13]. To our knowledge, there is limited work in which the performance of different opponent models is compared. Two examples are the work by Papaioannou et al. [34], who evaluate a set of techniques that predict the opponent’s strategy in terms of resulting performance gain, as well as computational complexity; and Baarslag et al. [5, 7], who compare the performance and accuracy of preference modeling techniques. The BOA architecture focuses on opponent models which estimate the (partial) preference profile, because most existing available implementations fit in this category; however, in principle, our architecture can accommodate the other types of opponent models as well.

Regarding acceptance conditions, the performance of a set of acceptance strategies that depend on parameters such as time and utility thresholds has been analyzed in [8].

Although we are not the first to identify the BOA components in a negotiation strategy, our approach seems to be unique in the sense that we vary all of these components at the same time, thereby creating new negotiation strategies, and improving the state of the art in doing so.

4.2.3 Negotiation Strategy Space Exploration

Various authors have aimed to explore the automated negotiation strategy space by combining a set of negotiation strategies.

Faratin et al. [18] analyze the performance of pure negotiation tactics on single issue domains in a bilateral negotiation setting. The decision function of the pure tactic is then treated as a component around which the full strategy is built. While they discuss how tactics can be linearly combined, the performance of the combined tactics is not analyzed.

Matos et al. [32] employ a set of baseline negotiation strategies that are time dependent, resource dependent, and behavior dependent [18], all with varying parameters. The negotiation strategies are encoded as chromosomes and combined linearly, after which they are utilized by a genetic algorithm to analyze the effectiveness of the strategies. The fitness of an agent is its score in a negotiation competition. This approach analyzes acceptance criteria that only specify a utility interval of acceptable values, and hence do not take time into account; furthermore, the agents do not employ explicit opponent modeling.


Eymann [17] also uses genetic algorithms with more complex negotiating strategies, evolving six parameters that influence the bidding strategy. The genetic algorithm uses the current negotiation strategy of the agent and the opponent strategy with the highest average income to create a new strategy, similar to other genetic algorithm approaches (see Beam and Segev [11] for a discussion of genetic algorithms in automated negotiation). The genetic algorithm approach mainly treats the negotiation strategy optimization as a search problem in which the parameters of a small set of strategies are tuned by a genetic algorithm. We analyze a more complex space of newly developed negotiation strategies in our approach, as our pool of surveyed negotiation strategies consists of strategies introduced in the ANAC competition [4, 6], as well as the strategies discussed by Faratin et al. [18]. Furthermore, each strategy consists of components that can have parameters themselves.

Finally, Ilany and Gal [26] take the approach of selecting the best strategy from a predefined set of agents, based on the characteristics of a domain. The difference with our work is that they combine whole strategies, whereas the BOA architecture combines the components of strategies. Our contribution is to define and implement an architecture that makes it easy to vary all main components of a negotiating agent.

4.3 The BOA Agent Architecture

In the last decade, many different negotiation strategies have been introduced in the pursuit of a versatile and effective automated negotiator (see related work in Sect. 4.2). Current work often focuses on optimizing the negotiation strategy as a whole. We propose to direct our attention to a component-based approach, especially now that we have access to a large repository of mutually comparable negotiation strategies due to ANAC. This approach has several advantages:

1. Given measures for the effectiveness of the individual components of a negotiation strategy, we are able to pinpoint the most promising components, which gives insight into the reasons for success of the strategy;

2. Focusing on the most effective components helps to systematically search the space of negotiation strategies by recombining them into new strategies.

We make a distinction between two types of components in the sections below: elements that are part of the agent’s environment, and components that are part of the agent itself.

4.3.1 Negotiation Environment

We employ the same negotiation environment as in [4, 6, 31]; that is, we consider bilateral automated negotiations, where the interaction between the two negotiating parties is regulated by the alternating-offers protocol [35]. The agents negotiate over a set of issues, as defined by the negotiation domain, which holds the information of possible bids, negotiation constraints, and the discount factor. The negotiation happens in real time, and the agents are required to reach an agreement (i.e., one of them has to accept) before the deadline is reached. The timing of acceptance is particularly important because the utility may be discounted; that is, the utility of an agreement may decrease over time.

Fig. 4.1 The BOA architecture negotiation flow

In addition to the domain, both parties also have privately known preferences described by their preference profile. While the domain is common knowledge, the preference profile of each player is private information. This means that each player only has access to its own utility function, and is unaware of the opponent’s preferences. The player can attempt to learn these preferences during the negotiation encounter by analyzing the bidding history, using an opponent modeling technique.

4.3.2 The BOA Agent

Based on a survey of literature and the implementations of currently existing negotiation agents, we identified three main components of a general negotiation strategy: a bidding strategy, possibly an opponent model, and an acceptance condition (BOA). The elements of a BOA agent are visualized in Fig. 4.1. In order to fit an agent into the BOA architecture, it should be possible to distinguish these components in the agent design, with no dependencies between them. An exposition of the agents we considered is given in the next section, which will further motivate the choices made below.

1. Bidding strategy. A bidding strategy is a mapping from a negotiation trace to a bid. The bidding strategy determines the appropriate concessions to be made, depending on factors such as the opponent’s negotiation trace, a target threshold, time, discount factor, etc. The bidding strategy can consult the opponent model by passing one or multiple bids to see how they compare within the estimated opponent’s utility space.
   Input: opponent utility of bids, negotiation trace.
   Output: provisional upcoming bid.

2. Opponent model. An opponent model is a learning technique that constructs a model of the opponent’s preferences. In our approach, the opponent model should be able to estimate the opponent’s utility of any given bid.
   Input: set of possible bids, negotiation trace.
   Output: estimated opponent utility of a set of bids.

3. Acceptance condition. The acceptance condition determines whether the bid that the opponent presents is acceptable.
   Input: provisional upcoming bid, negotiation trace.
   Output: send accept, or send out the upcoming bid.

The components interact in the following way (the full process is visualized in Fig. 4.1): when receiving the opponent’s bid, the BOA agent first updates the bidding history and opponent model to make sure that up-to-date data is used, maximizing the information known about the environment and opponent.

Given the opponent bid, the bidding strategy determines the counter offer by first generating a set of bids with a similar preference for the agent. The bidding strategy uses the opponent model (if present) to select a bid from this set by taking the opponent’s utility into account.

Finally, the acceptance condition decides whether the opponent’s action should be accepted. If the opponent’s bid is not accepted by the acceptance condition, then the bid generated by the bidding strategy is offered instead. At first glance, it may seem counter-intuitive to make this decision at the end of the agent’s deliberation cycle. Clearly, deciding upon acceptance at the beginning would have the advantage of not wasting resources on generating an offer that might never be sent out. However, generating an offer first allows us to employ acceptance conditions that depend on the utility of the counter bid that is ready to be sent out. This method is widely used in existing agents [8]. Such acceptance mechanisms can make a more informed decision by postponing their decision on accepting until the last step; therefore, and given our aim to incorporate as many agent designs as possible, we adopt this approach in our architecture.
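The deliberation cycle described above can be illustrated with a minimal Python sketch. It is not the GENIUS implementation: the class name `BOAAgent`, the `receive` method, the `update` method expected on the opponent model, and the toy utilities are all hypothetical names chosen for illustration. The sketch only shows the order of the three steps: update the history and opponent model, generate a provisional bid, and decide on acceptance last.

```python
ACCEPT = "ACCEPT"  # hypothetical marker for the accept action


class BOAAgent:
    """Toy BOA agent; all names are illustrative and not the GENIUS API."""

    def __init__(self, utility, bidding_strategy, acceptance_condition,
                 opponent_model=None):
        self.utility = utility                      # own utility function
        self.bidding_strategy = bidding_strategy    # (trace, model) -> bid
        self.acceptance_condition = acceptance_condition  # (u_opp, u_next) -> bool
        self.opponent_model = opponent_model        # optional component
        self.trace = []                             # opponent bids received so far

    def receive(self, opponent_bid):
        # Step 1: update the bidding history and the opponent model first.
        self.trace.append(opponent_bid)
        if self.opponent_model is not None:
            self.opponent_model.update(opponent_bid)
        # Step 2: the bidding strategy proposes a provisional counter offer.
        next_bid = self.bidding_strategy(self.trace, self.opponent_model)
        # Step 3: the acceptance condition decides last, so that it can compare
        # the opponent's bid with the counter offer that is ready to be sent.
        if self.acceptance_condition(self.utility(opponent_bid),
                                     self.utility(next_bid)):
            return ACCEPT
        return next_bid


# Toy usage: three bids with made-up utilities for the agent.
utilities = {"a": 0.9, "b": 0.6, "c": 0.3}
agent = BOAAgent(
    utility=utilities.get,
    bidding_strategy=lambda trace, model: "b",          # always plans to offer "b"
    acceptance_condition=lambda u_opp, u_next: u_opp >= u_next,
)
first = agent.receive("c")    # 0.3 < 0.6, so the agent counters with "b"
second = agent.receive("a")   # 0.9 >= 0.6, so the agent accepts
```

Because the acceptance condition is an ordinary argument, it can be swapped without touching the bidding strategy, which is the property the architecture relies on.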

4.3.3 Employing the BOA Architecture

We have implemented the BOA architecture as an extension of the GENIUS framework [31]. GENIUS stands for Generic Environment for Negotiation with Intelligent multi-purpose Usage Simulation, and is a negotiation platform that implements an open architecture supporting heterogeneous agent negotiation. The framework was developed as a research tool to facilitate the design of negotiation strategies and to aid in the evaluation of negotiation algorithms. It provides a flexible and easy to use environment for implementing agents and negotiation strategies as well as running negotiations. GENIUS can further aid the development of negotiation agents by acting as an analytical toolbox, providing a variety of tools to analyze the negotiation agents’ performance, based on the outcome and dynamics of the negotiation. The BOA architecture has been integrated seamlessly into the GENIUS framework, offering the user the ability to create and apply newly developed components using a graphical user interface as depicted in Fig. 4.2. From the perspective of GENIUS, a negotiation agent is identical to a BOA agent, and therefore both types of agents can participate in the same tournament.

Fig. 4.2 The BOA architecture GUI

The framework enables us to follow at least two approaches: first of all, it allows us to independently analyze the components of every negotiation strategy that fits into our architecture. For example, by re-implementing the ANAC agents in the BOA architecture, it becomes possible to compare the accuracy of all ANAC opponent models, and to pinpoint the best opponent model among them. Following this approach, we are able to identify categories of opponent models that outperform others [5, 7]; naturally, this helps to build better agents in the future.

Secondly, we can proceed to mix different BOA components; for example, we can replace the opponent model of the runner-up of ANAC by a different opponent model and then examine whether this makes a difference in placement. Such a procedure enables us to assess the reasons for an agent’s success, and makes it possible to systematically search for an effective automated negotiator.

The first part of the approach gives insight into which components are best in isolation; the second part gives us understanding of their influence on the agent as a whole. At the same time, both approaches raise some key theoretical questions, such as:

1. Can the BOA components be identified in all, or at least most, current negotiating agents?

2. How do we measure the performance of the components? Can a single best component be identified, or does this strongly depend on the other components?

3. If the individual components perform better than others (with respect to some performance measure), does combining them in an agent also improve the agent’s performance?

In this work we do not aim to fully answer all of the above questions; instead, we outline a research agenda, and introduce the BOA architecture as a tool that can be used towards answering these questions.


Nonetheless, in the next section, we will provide empirical support for an affirmative answer to the first theoretical question: indeed, in many cases the components of the BOA architecture can be identified in current agents, and we will also provide reasons for when this is not the case.

The answer to the second question depends on the component under consideration: for an opponent model, it is straightforward to measure its effectiveness [5, 7, 24]: the closer the opponent model is to the actual profile of the opponent, the better it is. The performance of the other two components of the BOA architecture is better measured in terms of utility obtained in negotiation (as has been done for acceptance strategies in [8]), as there seems to be no clear alternative method to define the effectiveness of the acceptance condition or bidding strategy in isolation. In any case, the BOA architecture can be used as a research tool to help answer such theoretical questions.

Regarding the third question: suppose we take the best performing bidding strategy, equip it with the most faithful opponent model, and combine this with the most effective acceptance condition; it would seem reasonable to assume this combination results in an effective negotiator. We plan to elaborate on this conjecture in future work (see also Sect. 4.6); however, Sect. 4.5 will already provide a first step towards this goal by recombining components of ANAC agents to create more effective agents than the original versions.

4.4 Decoupling Existing Agents

In this section we provide empirical evidence that many of the currently existing agents can be decoupled by separating the components of a set of state-of-the-art agents. This section serves three goals: first, we discuss how existing agents can be decoupled into a BOA agent; second, we argue that the BOA architecture design is appropriate, as most agents will turn out to fit in our architecture; third, we discuss and apply a method to determine if the sum of the components (the BOA agent) is equal in behavior to the original agent.

4.4.1 Identifying the Components

In this section we identify the components of 21 negotiating agents, taken from the ANAC competition of 2010 [4], 2011 [9] and 2012. We selected these agents as they represent the current state of the art in automated negotiation, having been implemented by various negotiation experts.

Since the agents were not designed with decoupling in mind, all agents had to be re-implemented to be supported by the BOA architecture. Our decoupling methodology was to adapt an agent’s algorithm to enable it to switch its components, without changing the agent’s functionality. A method call to specific functionality, such as code specifying when to accept, was replaced by a more generic call to the acceptance mechanism, which can then be swapped at will. The contract of the generic calls is defined by the expected input and output of every component, as outlined in Sect. 4.3.2.

The first step in decoupling an agent is to determine which components can be identified. For example, in the ANAC 2010 agent FSEGA [36], an acceptance condition, a bidding strategy, and an opponent model can all be identified. The acceptance condition combines simple, utility-based criteria (called ACconst and ACprev in [8]), and can be easily decoupled in our architecture. The opponent model is a variant of the Bayesian opponent model [5, 7, 23], which is used to optimize the opponent utility of a bid. Since this usage is consistent with our architecture (i.e., the opponent model provides opponent utility information), the model can be replaced by a call to the generic opponent model interface. The final step is to change the bidding strategy to use the generic opponent model and acceptance conditions instead of its own specific implementation. In addition to this, the opponent model and acceptance condition need to be altered to allow the other bidding strategies to use them. Other agents can be decoupled using a similar process.

Unfortunately, some agent implementations contain slight dependencies between different components. These dependencies needed to be resolved to separate the design into singular components. For example, the acceptance condition and bidding strategy of the ANAC 2011 agent The Negotiator¹ rely on a shared target utility. In such cases, the agent can be decoupled by introducing Shared Agent State (SAS) classes. A SAS class avoids code duplication, and thus performance loss, by sharing the code between the components. One of the components uses the SAS to calculate the values of the required parameters and saves the results, while the other component simply asks for the saved results instead of repeating the calculation.
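As a rough illustration of the Shared Agent State idea, the following Python sketch lets two components share one computed target utility. The class name `SharedAgentState`, the method `target_utility`, and the linear target function are assumptions made for this example; they are not taken from the GENIUS code base.

```python
class SharedAgentState:
    """Hypothetical SAS: computes a shared value once and saves the result."""

    def __init__(self, compute_target):
        self._compute_target = compute_target  # the shared calculation
        self._saved = {}                       # time -> saved target utility

    def target_utility(self, time):
        # The first caller triggers the calculation; later callers reuse it.
        if time not in self._saved:
            self._saved[time] = self._compute_target(time)
        return self._saved[time]


calls = []  # records how often the calculation actually runs


def expensive_target(time):
    # Stand-in for an agent-specific target utility calculation (assumption).
    calls.append(time)
    return 1.0 - 0.5 * time


sas = SharedAgentState(expensive_target)
bid_target = sas.target_utility(0.4)     # bidding strategy calculates and saves
accept_target = sas.target_utility(0.4)  # acceptance condition reuses the result
```

Both components hold a reference to the same `sas` object, so neither needs to know about the other, and the calculation runs only once per point in time.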

Table 4.1 provides an overview of all agents that we re-implemented in our architecture, and more specifically, which components we were able to decouple. In fact, we were able to decouple all ANAC 2010, and most ANAC 2011 and ANAC 2012 agents.

There were two agents (ValueModelAgent [21] and Meta-Agent [26]) that were not decoupled for practical reasons, even though it is theoretically possible. The ValueModelAgent was not decoupled because there were unusually strong dependencies between its components. Decoupling the strategy would result in computationally heavy components when trying to combine them with other components, making them impractical to use. The ANAC 2012 Meta-Agent chooses an offer among 17 agents from the ANAC 2011 qualifying round. This agent was not decoupled because it requires the decoupling of all 17 agents, of which only 8 optimized versions entered the finals.

The CUHKAgent, like ValueModelAgent, is heavily coupled with multiple variables that are shared between the bidding strategy and acceptance condition. This makes it very hard to decouple and can make components unusable in combination with other components (e.g. variables might not properly be set). However, since CUHKAgent was placed first in the ANAC 2012 competition, we decided to decouple its bidding strategy, allowing it to work with other acceptance conditions and opponent models.

Table 4.1 Overview of the BOA components found in every agent

Competition   Agent                       B   O   A
ANAC 2010     FSEGA [36]                  X   X   X
ANAC 2010     Agent K [29]                X   ∅   X
ANAC 2010     Agent Smith [37]            X   X   X
ANAC 2010     IAMcrazyHaggler [39]        X   ∅   X
ANAC 2010     IAMhaggler [39]             X   X   X
ANAC 2010     Nozomi                      X   ∅   X
ANAC 2010     Yushu [1]                   X   ∅   X
ANAC 2011     Agent K2 [30]               X   ∅   X
ANAC 2011     BRAMAgent [20]              X   –   X
ANAC 2011     Gahboninho [12]             X   –   X
ANAC 2011     HardHeaded [38]             X   X   X
ANAC 2011     IAMhaggler2011 [40]         X   ∅   X
ANAC 2011     Nice Tit for Tat [9]        X   X   X
ANAC 2011     The Negotiator [15]         X   ∅   X
ANAC 2012     AgentLG                     X   ∅   X
ANAC 2012     AgentMR                     X   ∅   X
ANAC 2012     BRAMAgent2                  X   –   X
ANAC 2012     CUHKAgent [22]              X   –   –
ANAC 2012     IAMhaggler2012              X   ∅   X
ANAC 2012     OMAC Agent [14]             X   ∅   X
ANAC 2012     The Negotiator Reloaded     X   X   X

X: original has component, which can be decoupled. ∅: original has no such component, but it can be added. –: no support for such a component

¹ Descriptions of all ANAC 2011 agents can be found in [6].

Four additional agents were only partially decoupled: AgentLG, BRAMAgent, BRAMAgent2, and Gahboninho. As is evident from Table 4.1, the only obstacle in decoupling these agents fully is their usage of the opponent model, as it can be employed in many different ways. Some agents, such as Nice Tit for Tat, attempt to estimate the Nash point on the Pareto frontier. Other common applications include: ranking a set of bids according to the opponent utility, reciprocating in opponent utility, and extrapolating opponent utility. The generic opponent model interface needs to sufficiently accommodate such requirements from the bidding strategy to make interchangeability possible. For this reason we require the opponent model interface to be able to produce the estimated opponent utility of an arbitrary negotiation outcome.

With regard to the opponent model, there are three groups of agents: first, there are agents such as FSEGA [36], which use an opponent model that can be freely interchanged; second, there are agents such as the ANAC 2010 winner Agent K [28], which do not have an opponent model themselves, but can be extended to use one. Such agents typically employ a bidding strategy that first decides upon a specific target utility range, and then picks a random bid within that range. These agents can easily be fitted with an opponent model instead, by passing the utility range through the opponent model before sending out the bid. Lastly, there are agents, for example Gahboninho and BRAMAgent, that use a similarity heuristic which is not compatible with our architecture, as their opponent models do not yield enough information to compute the opponent utility of bids. For these types of agents, we consider the opponent model part of the bidding strategy. AgentLG also uses an opponent model which is not compatible with our BOA architecture; however, it has been adapted to be able to use other opponent models.

When decoupling the agents, we can distinguish different classes within each component, except for the bidding strategy component, which varies greatly between different agents. For instance, there are only three main types of opponent models being used: Bayesian models, Frequency models, and Value models. Bayesian models are an implementation of a (scalable) model of the opponent preferences that is updated using Bayesian learning [23, 41]. The main characteristic of frequency-based models is that they track the frequency of occurrence of issues and values in the opponent’s bids and use this information to estimate the opponent’s preferences. Value models take this approach a step further and solely focus on the frequency of the issue values. In practice, Bayesian models are computationally intensive, whereas frequency and value models are relatively light-weight.
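To make the idea of a light-weight value model concrete, the following Python sketch counts how often each issue value appears in the opponent’s bids and scores a bid by those counts. It is a simplified illustration under our own assumptions (equal issue weights, bids represented as dictionaries, and the hypothetical names `ValueFrequencyModel`, `update`, and `estimate_utility`); it does not reproduce any specific ANAC agent’s model.

```python
from collections import Counter, defaultdict


class ValueFrequencyModel:
    """Simplified value model: scores bids by how often their values were offered."""

    def __init__(self):
        self.counts = defaultdict(Counter)  # issue -> value -> number of offers

    def update(self, bid):
        # A bid is assumed to be a dict mapping each issue to the offered value.
        for issue, value in bid.items():
            self.counts[issue][value] += 1

    def estimate_utility(self, bid):
        # Per issue: frequency of the bid's value relative to the most frequent
        # value of that issue. Issues are weighted equally (an assumption).
        scores = []
        for issue, value in bid.items():
            seen = self.counts[issue]
            most_frequent = max(seen.values(), default=0)
            scores.append(seen[value] / most_frequent if most_frequent else 0.0)
        return sum(scores) / len(scores) if scores else 0.0


model = ValueFrequencyModel()
for observed in ({"price": "high", "color": "red"},
                 {"price": "high", "color": "blue"},
                 {"price": "low", "color": "red"}):
    model.update(observed)

preferred = model.estimate_utility({"price": "high", "color": "red"})
unusual = model.estimate_utility({"price": "low", "color": "blue"})
```

Since `estimate_utility` accepts an arbitrary bid, a model of this shape satisfies the interface requirement stated earlier: it can produce an estimated opponent utility for any negotiation outcome.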

Similar to the opponent models, most agents use variations and combinations of a small set of acceptance conditions. Specifically, many agents use simple thresholds for deciding when to accept (called ACconst in [8]) and linear functions that depend on the utility of the bid under consideration (ACnext(α, β) [8]).
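The two families of acceptance conditions mentioned above can be sketched as follows. The sketch assumes the usual reading of these conditions: ACconst accepts when the opponent’s bid reaches a fixed utility threshold, and ACnext(α, β) accepts when α times the utility of the opponent’s bid plus β is at least the utility of the agent’s own upcoming bid. The function names `ac_const` and `ac_next` are hypothetical; see [8] for the authoritative definitions.

```python
def ac_const(threshold):
    """Accept when the opponent's bid reaches a fixed utility threshold."""
    return lambda u_opp, u_next: u_opp >= threshold


def ac_next(alpha=1.0, beta=0.0):
    """Accept when alpha * u_opp + beta is at least the utility of our next bid."""
    return lambda u_opp, u_next: alpha * u_opp + beta >= u_next


# u_opp: our utility of the opponent's bid; u_next: our utility of the counter bid.
u_opp, u_next = 0.70, 0.75
strict = ac_next()(u_opp, u_next)             # 0.70 >= 0.75 does not hold
lenient = ac_next(1.0, 0.1)(u_opp, u_next)    # 0.70 + 0.1 >= 0.75 holds
threshold_ok = ac_const(0.8)(0.85, u_next)    # 0.85 >= 0.8 holds
```

Both conditions share one call signature, so a bidding strategy can be paired with either of them without modification.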


4.4.2 Testing Equivalence of BOA Agents

A BOA agent should behave identically to the agent from which its components are derived. Equivalence can be verified in two ways: first, given the same negotiation environment and the same state, both agents should behave in exactly identical ways; second, the performance in a real-time negotiation of both agents should be similar.

4.4.2.1 Identical Behavior Test

Two deterministic agents can be considered equivalent if they perform the same action given the same negotiation trace. There are two main problems in determining equivalence: first, most agents are non-deterministic, as they behave randomly in certain circumstances; for example, when picking from a set of bids of similar utility; second, the default protocol in GENIUS uses real time [31], which is highly influenced by CPU performance. This means that in practice, two runs of the same negotiation are never exactly equivalent.

To be able to run an equivalence test despite agents choosing actions at random, we fixed the seeds of the random functions of the agents. The challenge of working in real time was dealt with by changing the real-time deadline to a maximum number of rounds. Since time does not pass within a round, CPU performance does not play a role.
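The two measures above (fixed random seeds and a rounds-based deadline) can be mimicked in a small Python sketch. It is only an illustration of the testing idea: the function names `play` and `identical_behavior`, the toy agents, the seed, and the round limit of 100 are all assumptions made for this example and are unrelated to the GENIUS test harness.

```python
import random


def play(make_agent, opponent_utilities, seed, max_rounds):
    # A fresh, seeded random generator makes the agent's random choices repeatable.
    agent = make_agent(random.Random(seed))
    # A rounds-based deadline: only the first max_rounds opponent bids are played.
    return [agent(u) for u in opponent_utilities[:max_rounds]]


def identical_behavior(make_original, make_decoupled, opponent_utilities,
                       seed=0, max_rounds=100):
    # Equivalent agents must produce the same action in every round.
    return (play(make_original, opponent_utilities, seed, max_rounds)
            == play(make_decoupled, opponent_utilities, seed, max_rounds))


def make_original(rng):
    def act(u_opp):
        next_bid = rng.choice(["bid_a", "bid_b"])  # random pick among similar bids
        return "accept" if u_opp >= 0.8 else next_bid
    return act


def make_decoupled(rng):
    bidding = lambda: rng.choice(["bid_a", "bid_b"])   # B component
    accepts = lambda u_opp: u_opp >= 0.8               # A component
    def act(u_opp):
        next_bid = bidding()                           # bid first, accept last
        return "accept" if accepts(u_opp) else next_bid
    return act


def make_other(rng):
    return lambda u_opp: "bid_c"                       # deliberately different agent


opponent_utilities = [0.1, 0.3, 0.9, 0.2]
same = identical_behavior(make_original, make_decoupled, opponent_utilities)
different = identical_behavior(make_original, make_other, opponent_utilities)
```

Note that both toy agents draw from the random generator in the same order. If one of them tested acceptance before generating a bid, the random streams could diverge after an acceptance, which is the kind of ordering effect a move-by-move comparison is meant to expose.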

All agents were evaluated on the ANAC 2011 domains (see [6] for a domain analysis). The ANAC 2011 domains vary widely in characteristics: the number of issues ranges from 1 to 8, the size from 3 to 390,625 possible outcomes, and the discount from none (1.0) to strong (0.424). Some ANAC 2010 agents, specifically Agent Smith and Yushu, were not designed for large domains and were therefore run on a subset of these domains.

The opponent strategies used in the identical behavior test should satisfy two properties: first, the opponent strategy should be deterministic; second, the opponent strategy should not be the first to accept, to avoid masking errors in the agent’s acceptance condition. Given these two criteria, we used the standard time-dependent tactics [18, 19] for the opponent bidding strategy. Specifically, we use Hardliner (e = 0), Linear Conceder (e = 1), and Conceder (e = 2). In addition, we use the Offer Decreasing agent, which offers the set of all possible bids in decreasing order of utility.
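For readers unfamiliar with time-dependent tactics, the following Python sketch shows how the parameter e shapes the target utility over normalized time t in [0, 1]. It assumes the common formulation f(t) = k + (1 - k) * t^(1/e) with target utility p_min + (p_max - p_min) * (1 - f(t)), and treats e = 0 as the limiting case in which the agent never concedes. The function name and the parameters `k`, `p_min`, and `p_max` are illustrative assumptions that do not appear in the text above.

```python
def target_utility(t, e, k=0.0, p_min=0.0, p_max=1.0):
    """Target utility of a time-dependent tactic at normalized time t in [0, 1]."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    # e == 0 is handled as the Hardliner limit: no concession over time.
    f = k if e == 0 else k + (1.0 - k) * t ** (1.0 / e)
    return p_min + (p_max - p_min) * (1.0 - f)


# Target utilities of the three opponents at a quarter of the negotiation time.
hardliner = target_utility(0.25, e=0)        # stays at the maximum utility
linear = target_utility(0.25, e=1)           # concedes proportionally to time
conceder = target_utility(0.25, e=2)         # concedes quickly early on
```

All three tactics are deterministic functions of time, which is why they satisfy the first of the two criteria listed above.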

All original and BOA agents were evaluated against these four opponents, using both preference profiles defined on all eight ANAC 2011 domains. Both strategies were run in parallel, making sure that the moves made by both agents were equivalent at each moment. The results indicated that all BOA agents were exactly identical to their original counterparts except for AgentMR and AgentLG. Neither of these agents behaves identically to its BOA counterpart because of the order in which the components are called: their implementation requires that they first test whether the opponent’s bid is acceptable, and then determine the bid to offer. As discussed above, this is exactly the opposite of what the BOA agent does.

Table 4.2 ANAC 2011 reference results of the original agents using our hardware (n = 10)

Agent              Amsterdam trip  Camera  Car     Energy  Grocery  Company acquisition  Laptop  Nice or die  Mean utility
HardHeaded         0.891           0.818*  0.961*  0.664   0.725    0.747                0.683   0.571*       0.757*
Gahboninho         0.912*          0.659   0.928   0.681*  0.667    0.744                0.726   0.571*       0.736
Agent K2           0.759           0.719   0.922   0.467   0.705    0.777                0.703   0.429        0.685
IAMhaggler 2011    0.769           0.724   0.873   0.522   0.725    0.814*               0.749*  0.300        0.685
BRAMAgent          0.793           0.737   0.815   0.420   0.724    0.744                0.661   0.571*       0.683
The Negotiator     0.792           0.744   0.913   0.524   0.716    0.748                0.674   0.320        0.679
Nice Tit for Tat   0.733           0.765   0.796   0.508   0.759    0.767                0.660   0.420        0.676
Value Model Agent  0.839           0.778   0.935   0.012   0.767*   0.762                0.661   0.137        0.611

Best results per column are marked with an asterisk (*)

4.4.2.2 Similar Performance Test

Two agents can perform the same action given the same input, but may still achieve different results because of differences in their real-time performance. When decoupling agents, there is a trade-off between the performance and interchangeability of components. For example, most agents record only a partial negotiation history, while some acceptance strategies require the full history of the agent and/or its opponent. In such cases, the agent can be constrained to be incompatible with these acceptance strategies, or generalized to work with the full set of available acceptance strategies. We typically elected the most universal approach, even when this negatively influenced performance. We will demonstrate that while there is some performance loss when decoupling existing agents, it does not significantly impact the negotiation outcome.

The performance of the BOA agents was tested by letting them participate in the ANAC 2011 tournament (using the same setup, cf. [6]). The decoupled ANAC 2011 agents replaced the original agents, resulting in a tournament with eight participants. For the other BOA agents this was not possible, as their original counterparts did not participate in the ANAC 2011 competition. Therefore, for each of these agents we ran a modified tournament in which we added the original agent to the pool of ANAC 2011 agents, resulting in a tournament with nine participants. Next, we repeated this process for the BOA agents and evaluated the similarity of the results.

For our experimental setup we used computers that were slower than the IRIDIS high-performance computing cluster that was used to run ANAC 2011. As we were therefore unable to reproduce exactly the same data, we first recreated our own ANAC 2011 tournament data as depicted in Table 4.2, which is used as our baseline to benchmark the decoupled agents. The difference in performance caused small changes compared to the official ANAC 2011 ranking, as Agent K2 moved up from 5th to 3rd place.


Table 4.3 Differences in overall utility and time of agreement between the original agents and their decoupled version

Agent                  Diff. Time Agr.  SD Time Agr.  Diff. Utility  SD Utility
Agent K [29]            0.001           0.003          0.006         0.006
Agent Smith [37]        0.010           0.010          0.004         0.006
FSEGA [36]              0.001           0.004          0             0.003
IAMcrazyHaggler [39]   -0.0044          0.012          0.003         0.013
IAMhaggler [39]         0.003           0.015          0.002         0.011
Nozomi                  0.003           0.009          0.004         0.008
Yushu [1]               0.002           0.004          0.002         0.005
Agent K2 [30]           0.002           0.009          0.001         0.005
BRAMAgent [20]          0.004           0.011          0             0.006
Gahboninho [12]         0.001           0.008          0.006         0.005
HardHeaded [38]        -0.003           0.003         -0.009         0.004
IAMhaggler2011 [40]    -0.010           0.013         -0.002         0.003
Nice Tit for Tat [9]    0.006           0.010         -0.008         0.005
The Negotiator [15]     0               0.002          0             0.004
BRAMAgent2              0.002           0.011         -0.015         0.012
IAMhaggler2012         -0.005           0.006         -0.013         0.003
OMAC Agent [14]         0.003           0.003          0.012         0.015

Positive difference means the BOA agent performed slightly better

Table 4.3 provides an overview of the results. We evaluated the performance in terms of the difference in overall utility as well as the difference in time of agreement between the original and the BOA agents. The table does not list the agents that were not decoupled, and we also omitted The Negotiator Reloaded from the test set, as this agent was already submitted as a fully decoupled BOA agent.

From the results, we can conclude that the variation between the original and the BOA version is minimal; the majority of the standard deviations for both the difference in overall utility and time of agreement are close to zero. The largest difference between the original and decoupled agents with regard to the average time of agreement is 0.010 (Agent Smith); and for the average utility the largest difference is 0.015 (BRAMAgent2). Hence, in all cases the BOA agents and their original counterparts show comparable performance.

4.5 Applications of the BOA Architecture

The BOA architecture has already been widely applied since it was first released. Since its implementation in 2011, the BOA architecture has been used in the ANAC competitions that followed. In ANAC 2012, the BOA agent The Negotiator Reloaded reached the finals, finished third overall, and received the award for best performing agent in non-discounted domains. In ANAC 2013, two agents that used the BOA architecture reached the finals. The agent Inox finished fourth, and The Fawkes agent won the 2013 competition.


The BOA architecture has also found its way into the classroom. At academic institutes such as Bar-Ilan University, Ben-Gurion University of the Negev, Maastricht University, and Delft University of Technology, GENIUS and the BOA architecture have been integrated into artificial intelligence courses, where part of the syllabus covers automated negotiation and the creation of negotiation strategies.²

The BOA framework offers the students an easier and more structured way to develop a negotiation strategy, and encourages them to think more critically about the components they design themselves, which in turn helps them understand the inner workings of a negotiation strategy.

The BOA framework also allows us to search the large space of negotiation strategies [5, 7]. Section 4.5.1 describes techniques integrated in the BOA framework that aid in this search by scaling down the negotiation strategy space. Section 4.5.2 describes an application of this technique, where we employ the BOA framework to improve upon existing ANAC strategies.

4.5.1 Scaling the Negotiation Space

Suppose that two negotiating BOA agents A and B have identical bidding mechanisms and the same opponent modeling technique, so that only their acceptance criteria differ. Furthermore, suppose agent A accepted in the middle of the negotiation, while agent B accepted somewhere towards the end. The agents accepted at a different time during the negotiation, but their bidding behavior will be identical up to the point of the first acceptance. The only difference between the complete traces is that the trace of agent A is cut off in the middle of the negotiation.

In the BOA architecture we exploit this property by running all acceptance conditions in parallel while we record when each acceptance condition accepts. This drastically reduces the number of different component combinations that need to be run, as any number of acceptance conditions can be investigated during one negotiation session. We refer to this approach as multi-acceptance criteria (MAC). Note that a similar technique cannot be applied for the bidding strategy and the opponent model, as both components directly influence the negotiation trace.

In addition, a large number of acceptance conditions varying only in their parameter value can be tested during the same negotiation thread. This technique can then be used to easily optimize a parameter of a single acceptance condition. Note that this approach assumes that checking additional acceptance conditions does not introduce a large computational overhead. In practice we found that the computational overhead was less than 5%, even when more than 50 variants of acceptance conditions were used at the same time.
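A minimal Python sketch of the MAC idea is given below. It replays one shared bidding trace and records, for every acceptance condition, the first moment at which it would have accepted. The function name `run_with_mac`, the tuple layout of the session, and the example conditions are assumptions made for illustration; the sketch is not the mechanism implemented in GENIUS.

```python
def run_with_mac(session, acceptance_conditions):
    """Return, per condition, the (time, utility) of its first acceptance or None.

    session: iterable of (time, u_opp, u_next) tuples from ONE negotiation,
             where u_opp is our utility of the opponent's bid at that time and
             u_next is our utility of the counter bid we would send.
    """
    first_accept = {name: None for name in acceptance_conditions}
    for time, u_opp, u_next in session:
        for name, accepts in acceptance_conditions.items():
            # Each condition is frozen at its first acceptance; the shared
            # bidding trace continues for the conditions that have not accepted.
            if first_accept[name] is None and accepts(time, u_opp, u_next):
                first_accept[name] = (time, u_opp)
        if all(result is not None for result in first_accept.values()):
            break  # every condition has accepted; nothing left to record
    return first_accept


session = [(0.1, 0.30, 0.90), (0.5, 0.60, 0.80), (0.9, 0.75, 0.70)]
conditions = {
    "const_0.5": lambda t, u_opp, u_next: u_opp >= 0.5,
    "next": lambda t, u_opp, u_next: u_opp >= u_next,
    "const_0.95": lambda t, u_opp, u_next: u_opp >= 0.95,
}
outcomes = run_with_mac(session, conditions)
```

One session thus yields the outcome of three different agents that share a bidding strategy and opponent model, which is the reduction in search effort described above.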

² Educational material for the BOA architecture can be freely downloaded from ii.tudelft.nl/genius/#Education.



Table 4.4 All acceptance conditions that were used in the experiment to search the negotiation strategy space

Acceptance condition        Range               Increments
ACcombi(T, MAX^W)           T ∈ [0.95, 0.99]    0.01
ACnext(α, β)                α ∈ [1.0, 1.05]     0.05
                            β ∈ [0.0, 0.1]      0.05
ACopt.stop                  –                   –
ACAgentLG                   –                   –
ACOMAC                      –                   –
ACTheNegotiatorReloaded     –                   –

4.5.2 Improving the State of the Art

Using the scaling methods discussed in the previous section, we give a practical application of the BOA architecture to show how it can be employed to explore the negotiation strategy space. To do so, we considered the original bidding strategy of every ANAC agent, and attempted to find a better accompanying opponent model and acceptance condition.

4.5.2.1 Searching the Negotiation Space

We used the following combinations of BOA components:

(B) For the bidding strategies, we used all ANAC agents that we were able to successfully decouple (see Table 4.1).

(O) For our opponent model set, we selected the best Bayesian model (IAMhaggler Bayesian Model [39]), the best frequency model (Smith Frequency Model [37]), and the best value model (CUHKAgent Value Model [22]) as identified in [7].

(A) All acceptance conditions of the top four agents of ANAC 2012 were used, except for CUHKAgent as it could not be decoupled. In addition, we used a set of baseline acceptance criteria, such as $AC_{combi}(T, MAX^{W})$ [8], and an optimal stopping acceptance condition $AC_{opt.stop}$ based on Gaussian process strategy prediction as discussed in [3]. Table 4.4 provides an overview of all 15 tested acceptance conditions.
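The count of 15 acceptance conditions follows from the parameter ranges and increments in Table 4.4. The short sketch below enumerates them; the `grid` helper and the tuple representation are hypothetical and serve only to make the arithmetic explicit.

```python
from itertools import product


def grid(start: int, stop: int, step: int) -> list:
    """Inclusive parameter grid, given in integer hundredths to avoid float drift."""
    return [value / 100 for value in range(start, stop + 1, step)]


# AC_combi(T, MAX^W): T in [0.95, 0.99], increments of 0.01 -> 5 variants
combi = [("AC_combi", (t,)) for t in grid(95, 99, 1)]

# AC_next(alpha, beta): alpha in [1.0, 1.05] and beta in [0.0, 0.1],
# both with increments of 0.05 -> 2 * 3 = 6 variants
nxt = [("AC_next", pair) for pair in product(grid(100, 105, 5), grid(0, 10, 5))]

# Parameter-free acceptance conditions -> 4 variants
fixed = [
    (name, ())
    for name in ("AC_opt.stop", "AC_AgentLG", "AC_OMAC", "AC_TheNegotiatorReloaded")
]

all_conditions = combi + nxt + fixed
print(len(all_conditions))  # 15
```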

For each bidding strategy, we ran a tournament on a subset of the ANAC 2012 domains against the eight ANAC 2012 agents. Note that even if MAC is applied, the space to be explored can still be impractically large. This is already problematic for a limited number of domains and agents. To illustrate, ANAC 2011 consists of 448 negotiation sessions [6], each of which may last 3 min. In the worst case, it requires 22 h to run a single tournament, and almost 4 weeks to run it 28 times, as we did for the similarity test discussed in Sect. 4.4.2.2.
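The worst-case figures can be checked with a few lines of arithmetic, assuming the sessions run one after another and each uses its full 3 min:

```python
sessions = 448           # negotiation sessions in ANAC 2011
minutes_per_session = 3  # worst case: every session runs until the deadline
repetitions = 28         # runs used for the similarity test

hours_per_tournament = sessions * minutes_per_session / 60
days_for_all_runs = repetitions * hours_per_tournament / 24

print(round(hours_per_tournament, 1))  # 22.4
print(round(days_for_all_runs, 1))     # 26.1
```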

We opted to use a representative subset of the domains to improve scalability. The following domains were used (size in parentheses): Barter (80), IS BT Acquisition (384), Barbecue (1,440), Phone (1,600), Energy (small) (15,625), and Supermarket (112,896). Since the ANAC 2010 agents are not compatible with discounts and reservation values, these were removed from the domains. To further improve scalability, a rounds-based protocol was used with a deadline of 3000 rounds, and we used the scalability optimization techniques discussed in Sect. 4.5.1. The complete tournament was repeated five times to improve the reliability of the results.

4.5.2.2 Experimental Results

From the 19 ANAC agents considered in this work, we were able to considerably improve 16, as depicted in Table 4.5. This table shows the optimal acceptance condition and opponent model for each agent, as well as their scores in the tournament. Due to scalability issues, some agents were only run on the four smallest domains instead of all six domains. Therefore, we show the results for these four domains, as well as for all domains.

Besides the utility gain, the overview also indicates the agent's ranking before and after the optimization of the components. As is evident from the results, most agents were significantly improved by swapping their components with the optimized versions. To illustrate: IAMcrazyHaggler's ranking improves from twelfth place to fourth when it employs IAMhaggler's opponent model and optimal stopping as its acceptance mechanism.

The only agents we were not able to improve are Yushu, The Negotiator and BRAMAgent2. There are two main reasons for this. The first reason is that some of these agents do not use an opponent model at all, or their bidding technique does not benefit much from one. The second reason is that these agents already employ acceptance criteria that perform well, or have an acceptance strategy that is tightly coupled with their bidding strategy.

An interesting pattern in the results is that nearly all agents were improved by using the acceptance condition $AC_{opt.stop}$. For the opponent model, the IAMhaggler Bayesian Model is often best, although the results indicate that the differences between the opponent models are minimal; that is, a better acceptance strategy often results in a larger gain than an improved opponent model.

All in all, the results demonstrate that the BOA architecture not only assists in exploring the negotiation strategy space and in strongly improving existing agents, but also helps to identify which components of an agent are decisive for its performance.

Table 4.5 Results of the optimized BOA agents, when tested on both n = 6 domains and n = 4 domains

Agent             OM      AC                     $Diff_6$  $Rank_6$ pre  $Rank_6$ post  $Diff_4$  $Rank_4$ pre  $Rank_4$ post
CUHKAgent         –       $AC_{next}(1, 0)$      0.001     1             1              0.081     1             1
Gahboninho        –       $AC_{AgentLG}$         0.006     2             2              0.007     2             2
TheNeg.Rel.       CUHK    $AC_{opt.stop}$        0.037     3             2              0.026     3             2
OMACAgent         CUHK    $AC_{AgentLG}$         0.014     4             4              0.005     5             5
AgentK2           IAH     $AC_{opt.stop}$        0.042     5             4              0.042     6             4
AgentK            CUHK    $AC_{opt.stop}$        0.047     6             4              0.044     7             5
IAMhaggler2011    IAH     $AC_{opt.stop}$        0.022     7             7              0.008     9             9
IAMhaggler2012    Smith   $AC_{opt.stop}$        0.059     8             4              0.077     11            5
HardHeaded        IAH     $AC_{opt.stop}$        0.134     9             3              0.133     13            2
BRAMAgent         –       $AC_{opt.stop}$        0.036     10            7              0.037     10            8
Nozomi            Smith   $AC_{opt.stop}$        0.160     11            4              0.155     14            4
IAMcrazyHaggler   IAH     $AC_{opt.stop}$        0.190     12            4              0.186     16            5
FSEGA             CUHK    $AC_{opt.stop}$        –         –             –              0.027     12            8
AgentSmith        IAH     $AC_{OMAC}$            –         –             –              0.103     15            9
IAMhaggler        IAH     $AC_{opt.stop}$        –         –             –              0.072     8             4
Nice Tit for Tat  IAH     $AC_{opt.stop}$        –         –             –              0.025     4             2

The $Diff_n$ column indicates the utility gain of the agent when coupled with the optimal components listed in the OM and AC columns. $Rank_n$ pre indicates the rank of the original agent, while $Rank_n$ post gives its ranking after optimization.

4.6 Conclusion and Future Work

This paper introduces an architecture that distinguishes the bidding strategy, the opponent model, and the acceptance condition in negotiation agents, and recombines these components to systematically explore the space of automated negotiation strategies. The main idea behind the BOA architecture is that we can identify several components in a negotiating agent, all of which can be optimized individually. Our motivation in the end is to create a proficient negotiating agent by combining the best components.

We have shown that many of the existing negotiation strategies can be re-fitted into our architecture. We identified and classified the key components in them, and we have demonstrated that the original agents and their decoupled versions have identical behavior and similar performance. Finally, we discussed several applications of the BOA architecture, one of which was to recombine different components of the ANAC agents, and we have demonstrated that this significantly improved their performance.

One obvious direction of future research is to look at any of the BOA components in isolation. After identifying the best performing components, we can turn our attention to the question of whether combining effective components leads to better overall results, and whether an optimally performing agent can be created by taking the best of every component. Another interesting question then is which of the BOA components turns out to be most important with regard to the overall performance of an agent. Our architecture allows us to make these questions precise and provides a tool for answering them.

Another possible improvement is to extend the focus of the current work on preference profile modeling techniques to a larger class of opponent modeling techniques, such as strategy prediction. Also, an agent is currently equipped with a single component during the entire negotiation session. It would be interesting to run multiple BOA components in parallel, and use recommendation systems to elect the best component at any given time.

Acknowledgements This research is supported by the Dutch Technology Foundation STW, applied science division of NWO and the Technology Program of the Ministry of Economic Affairs. It is part of the Pocket Negotiator project with grant number VICI-project 08075.

References

1. An, B., Lesser, V.: Yushu: a heuristic-based agent for automated negotiating competition. In: Ito, T., Zhang, M., Robu, V., Fatima, S., Matsuo, T. (eds.) New Trends in Agent-based Complex Automated Negotiations, Series of Studies in Computational Intelligence, pp. 145–149. Springer, Berlin (2012)

2. Ashri, R., Rahwan, I., Luck, M.: Architectures for negotiating agents. In: Proceedings of the 3rd Central and Eastern European conference on Multi-agent Systems, pp. 136–146. Springer, Berlin (2003)

3. Baarslag, T., Hindriks, K.V.: Accepting optimally in automated negotiation with incomplete information. In: Proceedings of the 2013 International Conference on Autonomous Agents and Multi-agent Systems (AAMAS '13), pp. 715–722. International Foundation for Autonomous Agents and Multiagent Systems, Richland (2013)


4. Baarslag, T., Hindriks, K., Jonker, C.M., Kraus, S., Lin, R.: The first automated negotiating agents competition (ANAC 2010). In: Ito, T., Zhang, M., Robu, V., Fatima, S., Matsuo, T. (eds.) New Trends in Agent-based Complex Automated Negotiations, Series of Studies in Computational Intelligence, pp. 113–135. Springer, Berlin (2012)

5. Baarslag, T., Hendrikx, M., Hindriks, K., Jonker, C.: Measuring the performance of online opponent models in automated bilateral negotiation. In: Thielscher, M., Zhang, D. (eds.) AI 2012: Advances in Artificial Intelligence. Volume 7691 of Lecture Notes in Computer Science, pp. 1–14. Springer, Berlin (2012)

6. Baarslag, T., Fujita, K., Gerding, E.H., Hindriks, K., Ito, T., Jennings, N.R., Jonker, C., Kraus, S., Lin, R., Robu, V., Williams, C.R.: Evaluating practical negotiating agents: results and analysis of the 2011 international competition. Artif. Intell. 198, 73–103 (2013)

7. Baarslag, T., Hendrikx, M., Hindriks, K., Jonker, C.: Predicting the performance of opponent models in automated negotiation. In: 2013 IEEE/WIC/ACM International Conference on Intelligent Agent Technology, 2, pp. 59–66 (2013)

8. Baarslag, T., Hindriks, K., Jonker, C.: Acceptance conditions in automated negotiation. In: Ito, T., Zhang, M., Robu, V., Matsuo, T. (eds.) Complex Automated Negotiations: Theories, Models, and Software Competitions. Volume 435 of Studies in Computational Intelligence, pp. 95–111. Springer, Berlin (2013)

9. Baarslag, T., Hindriks, K., Jonker, C.: A tit for tat negotiation strategy for real-time bilateral negotiations. In: Ito, T., Zhang, M., Robu, V., Matsuo, T. (eds.) Complex Automated Negotiations: Theories, Models, and Software Competitions. Volume 435 of Studies in Computational Intelligence, pp. 229–233. Springer, Berlin (2013)

10. Bartolini, C., Preist, C., Jennings, N.: A generic software framework for automated negotiation. In: First International Conference on Autonomous Agent and Multi-Agent Systems, Citeseer (2002)

11. Beam, C., Segev, A.: Automated negotiations: a survey of the state of the art. Wirtschaftsinformatik 39(3), 263–268 (1997)

12. Ben Adar, M., Sofy, N., Elimelech, A.: Gahboninho: strategy for balancing pressure and compromise in automated negotiation. In: Ito, T., Zhang, M., Robu, V., Matsuo, T. (eds.) Complex Automated Negotiations: Theories, Models, and Software Competitions. Volume 435 of Studies in Computational Intelligence, pp. 205–208. Springer, Berlin (2013)

13. Carbonneau, R., Kersten, G., Vahidov, R.: Predicting opponent's moves in electronic negotiations using neural networks. Expert Syst. Appl. 34(2), 1266–1273 (2008)

14. Chen, S., Weiss, G.: An efficient and adaptive approach to negotiation in complex environments. In: Proceedings of the 20th European Conference on Artificial Intelligence, pp. 228–233 (2012)

15. Dirkzwager, A., Hendrikx, M., Ruiter, J.: The negotiator: a dynamic strategy for bilateral negotiations with time-based discounts. In: Ito, T., Zhang, M., Robu, V., Matsuo, T. (eds.) Complex Automated Negotiations: Theories, Models, and Software Competitions. Volume 435 of Studies in Computational Intelligence, pp. 217–221. Springer, Berlin (2013)

16. Dumas, M., Governatori, G., Ter Hofstede, A., Oaks, P.: A formal approach to negotiating agents development. Electron. Commer. Res. Appl. 1(2), 193–207 (2002)

17. Eymann, T.: Co-evolution of bargaining strategies in a decentralized multi-agent system. In: AAAI Fall 2001 Symposium on Negotiation Methods for Autonomous Cooperative Systems, pp. 126–134 (2001)

18. Faratin, P., Sierra, C., Jennings, N.R.: Negotiation decision functions for autonomous agents. Rob. Auton. Syst. 24(3–4), 159–182 (1998). Multi-Agent Rationality

19. Fatima, S.S., Wooldridge, M., Jennings, N.R.: Multi-issue negotiation under time constraints. In: AAMAS '02: Proceedings of the First International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 143–150. ACM, New York (2002)

20. Fishel, R., Bercovitch, M., Gal, Y.: BRAM agent. In: Ito, T., Zhang, M., Robu, V., Matsuo, T. (eds.) Complex Automated Negotiations: Theories, Models, and Software Competitions. Volume 435 of Studies in Computational Intelligence, pp. 213–216. Springer, Berlin (2013)


21. Frieder, A., Miller, G.: Value model agent: a novel preference profiler for negotiation with agents. In: Ito, T., Zhang, M., Robu, V., Matsuo, T. (eds.) Complex Automated Negotiations: Theories, Models, and Software Competitions. Volume 435 of Studies in Computational Intelligence, pp. 199–203. Springer, Berlin (2013)

22. Hao, J., Leung, H.F.: Abines: an adaptive bilateral negotiating strategy over multiple items. In: Proceedings of the 2012 IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technology - WI-IAT '12, vol. 02, pp. 95–102. IEEE Computer Society, Washington, DC (2012)

23. Hindriks, K.V., Tykhonov, D.: Opponent modelling in automated multi-issue negotiation using Bayesian learning. In: Proceedings of the 7th International Joint Conference on Autonomous Agents And Multiagent Systems - AAMAS '08, vol. 1, pp. 331–338. International Foundation for Autonomous Agents and Multiagent Systems, Richland (2008)

24. Hindriks, K.V., Tykhonov, D.: Towards a quality assessment method for learning preference profiles in negotiation. In: Ketter, W., Poutré, H., Sadeh, N., Shehory, O., Walsh, W. (eds.) Agent-Mediated Electronic Commerce and Trading Agent Design and Analysis. Volume 44 of Lecture Notes in Business Information Processing, pp. 46–59. Springer, Berlin (2010)

25. Hindriks, K.V., Jonker, C., Tykhonov, D.: Towards an open negotiation architecture for heterogeneous agents. In: Klusch, M., Pechoucek, M., Polleres, A. (eds.) Cooperative Information Agents XII. Volume 5180 of Lecture Notes in Computer Science, pp. 264–279. Springer, Berlin (2008)

26. Ilany, L., Gal, Y.: Algorithm selection in bilateral negotiation (accepted). In: Proceedings of The Sixth International Workshop on Agent-based Complex Automated Negotiations (ACAN 2013) (2013)

27. Jonker, C., Robu, V., Treur, J.: An agent architecture for multi-attribute negotiation using incomplete preference information. Auton. Agent Multi-Agent Syst. 15, 221–252 (2007)

28. Kawaguchi, S., Fujita, K., Ito, T.: Compromising strategy based on estimated maximum utility for automated negotiation agents competition (ANAC-10). In: Mehrotra, K., Mohan, C., Oh, J., Varshney, P., Ali, M. (eds.) Modern Approaches in Applied Intelligence. Volume 6704 of Lecture Notes in Computer Science, pp. 501–510. Springer, Berlin (2011)

29. Kawaguchi, S., Fujita, K., Ito, T.: Agentk: compromising strategy based on estimated maximum utility for automated negotiating agents. In: Ito, T., Zhang, M., Robu, V., Fatima, S., Matsuo, T. (eds.) New Trends in Agent-Based Complex Automated Negotiations. Volume 383 of Studies in Computational Intelligence, pp. 137–144. Springer, Berlin (2012)

30. Kawaguchi, S., Fujita, K., Ito, T.: Agentk2: compromising strategy based on estimated maximum utility for automated negotiating agents. In: Ito, T., Zhang, M., Robu, V., Matsuo, T. (eds.) Complex Automated Negotiations: Theories, Models, and Software Competitions. Volume 435 of Studies in Computational Intelligence, pp. 235–241. Springer, Berlin (2013)

31. Lin, R., Kraus, S., Baarslag, T., Tykhonov, D., Hindriks, K., Jonker, C.M.: Genius: an integrated environment for supporting the design of generic automated negotiators. Comput. Intell. http://dx.doi.org/10.1111/j.1467-8640.2012.00463.x (2012)

32. Matos, N., Sierra, C., Jennings, N.: Determining successful negotiation strategies: an evolutionary approach. In: Proceedings International Conference on Multi Agent Systems, pp. 182–189 (1998)

33. Oshrat, Y., Lin, R., Kraus, S.: Facing the challenge of human-agent negotiations via effective general opponent modeling. In: Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems, vol. 1, pp. 377–384. International Foundation for Autonomous Agents and Multiagent Systems, Richland (2009)

34. Papaioannou, I., Roussaki, I., Anagnostou, M.: Multi-modal opponent behaviour prognosis in e-negotiations. In: Proceedings of the 11th International Conference on Artificial Neural Networks Conference on Advances in Computational Intelligence, vol. Part I, pp. 113–123. Springer, Berlin (2011)

35. Rubinstein, A.: Perfect equilibrium in a bargaining model. Econometrica 50(1), 97–109 (1982)



36. Serban, L.D., Silaghi, G.C., Litan, C.M.: Agent FSEGA - time constrained reasoning model for bilateral multi-issue negotiations. In: Ito, T., Zhang, M., Robu, V., Fatima, S., Matsuo, T. (eds.) New Trends in Agent-based Complex Automated Negotiations, Series of Studies in Computational Intelligence, pp. 159–165. Springer, Berlin (2012)

37. van Galen Last, N.: Agent smith: opponent model estimation in bilateral multi-issue negotiation. In: Ito, T., Zhang, M., Robu, V., Fatima, S., Matsuo, T. (eds.) New Trends in Agent-based Complex Automated Negotiations, Series of Studies in Computational Intelligence, pp. 167–174. Springer, Berlin (2012)

38. van Krimpen, T., Looije, D., Hajizadeh, S.: Hardheaded. In: Ito, T., Zhang, M., Robu, V., Matsuo, T. (eds.) Complex Automated Negotiations: Theories, Models, and Software Competitions. Volume 435 of Studies in Computational Intelligence, pp. 223–227. Springer, Berlin (2013)

39. Williams, C.R., Robu, V., Gerding, E.H., Jennings, N.R.: Iamhaggler: a negotiation agent for complex environments. In: Ito, T., Zhang, M., Robu, V., Fatima, S., Matsuo, T. (eds.) New Trends in Agent-based Complex Automated Negotiations, Series of Studies in Computational Intelligence, pp. 151–158. Springer, Berlin (2012)

40. Williams, C.R., Robu, V., Gerding, E.H., Jennings, N.R.: Iamhaggler2011: a gaussian process regression based negotiation agent. In: Ito, T., Zhang, M., Robu, V., Matsuo, T. (eds.) Complex Automated Negotiations: Theories, Models, and Software Competitions. Volume 435 of Studies in Computational Intelligence, pp. 209–212. Springer, Berlin (2013)

41. Zeng, D., Sycara, K.: Bayesian learning in negotiation. Int. J. Hum. Comput. Syst. 48, 125–141 (1998)

Chapter 5
A Dynamic, Optimal Approach for Multi-Issue Negotiation Under Time Constraints

Fenghui Ren, Minjie Zhang, and Quan Bai

Abstract Multi-issue negotiation can lead negotiators to bi-beneficial outcomes which are not reachable in single-issue negotiation. In a multi-issue negotiation, a negotiator's preference has a significant impact on the negotiation result. Most existing multi-issue negotiation strategies are based on the assumption that a negotiator will keep its predefined preference fixed throughout a negotiation, and that the negotiator's concern for the negotiated issues will not change for any reason. Very little work has been done to consider a situation in which a negotiator may modify its preference during a negotiation. The motivation of this paper is to introduce a novel optimal bilateral multi-issue negotiation approach to handle the situation where a negotiator may modify its preference dynamically during a negotiation, and to lead the negotiation result to a bi-beneficial outcome. In order to do so, an agent behavior prediction method, an agent preference prediction method, and two optimal offer generation methods are proposed. Experimental results indicate good performance of all proposed methods, and a significant improvement is achieved in all negotiators' utilities.

Keywords Algebraic analysis • Geometric analysis • Multi-issue negotiation • Preference • Regression analysis

F. Ren (✉) • M. Zhang
School of Computer Science and Software Engineering, University of Wollongong, Wollongong, Australia

Q. Bai
School of Computing and Mathematical Sciences, Auckland University of Technology, Auckland, New Zealand



5.1 Introduction

Multi-issue negotiation is an active research direction in the field of multi-agent systems and distributed artificial intelligence. The literature [1–3] reports significant achievements in this area. In [4], Fatima et al. pointed out that the procedure of multi-issue negotiation plays a critical role in determining negotiation results. In general, there are three main procedures in multi-issue negotiation [5]: the package deal procedure, the simultaneous procedure and the sequential procedure. In the package deal procedure, all issues are bundled and discussed together; in the simultaneous procedure, all issues are discussed simultaneously but independently of each other; and in the sequential procedure, all issues are discussed one after another. Considering time complexity and optimality, the package deal procedure is highly encouraged since it can outperform the other two procedures in most situations. In this paper, we focus our attention on the package deal procedure in multi-issue negotiations.

The most significant feature of multi-issue negotiation using the package deal procedure is that it can always lead the negotiation results to bi-beneficial negotiation outcomes (if this is applicable), i.e. both negotiation participants can increase their utilities from the outcome. Because a bi-beneficial negotiation outcome is not reachable in single-issue negotiation [3, 6], multi-issue negotiation becomes important and valuable in practice. Many researchers have paid attention to the optimal negotiation outcome in multi-issue negotiation and some approaches have been successfully developed [2, 4, 5]. However, most existing approaches mainly focus on static negotiation environments, in which negotiators predefine their preferences on negotiated issues and do not modify their preferences throughout the negotiation. After studying and analyzing people's real behaviours in traditional markets on multiple-issue bargaining, we noticed that people usually modify their preferences during a negotiation when the negotiation environment changes. Also, in an electronic marketplace, negotiators usually modify their preferences directly after the market information is updated. In order to successfully lead a negotiation result to a bi-beneficial outcome in an open and dynamic negotiation environment, an optimal approach for multi-issue negotiation under time constraints is proposed in this paper.

The proposed approach contains three major steps, which are (1) opponent historical offers regression, (2) opponent preference estimation, and (3) optimal offer generation. In the first step, one or more quadratic regression functions are generated to optimally fit an opponent's historical offers. The major difference between our regression method and other machine-learning-based methods [7–10] or Bayesian-based methods [10–14] is that our method does not require an additional training process or any domain knowledge on the negotiation issues, but only uses the historical offers of the current negotiation. Therefore, because of its ease of use and flexibility, the proposed regression method is very suitable for estimating opponents' negotiation behaviours in a dynamic environment. In the second step, an opponent's preference on all negotiated issues is predicted based on the regression functions estimated in the first step. The preference estimation method in this step is based on a simple assumption that the opponent normally gives more concession on its low-concern issues and less concession on its high-concern issues. By analyzing differences between the opponent's concessions on all negotiated issues, the opponent's preference can be estimated. In the third step, based on the estimated preference, an optimal offer will be generated (if it is applicable) by employing two proposed methods, the geometric method and the algebraic method, to benefit all negotiation participants.

The rest of this paper is organized as follows. Section 5.2 proposes a historical-offer regression method for multi-issue negotiation; Sect. 5.3 introduces a method to estimate a negotiator's preference; Sect. 5.4 introduces two methods to dynamically generate bi-beneficial offers; Sect. 5.5 demonstrates experiments using the proposed methods and discusses the experimental results; and Sect. 5.6 concludes this paper.

5.2 Historical-Offer Regression

In this section, a historical-offer regression method for multi-issue negotiation is introduced. It extends our previous work on agent behaviour prediction in single-issue negotiation [15].

5.2.1 Simple Behaviours Regression

For negotiations in open and dynamic environments, agents may employ different negotiation strategies according to their own expectations. A negotiation strategy specifies the sequence of actions that the negotiation participant plans to make during a negotiation. When a negotiation environment changes, agents may also modify their own negotiation strategies in order to maximize their utilities. In [2], four commonly used time-dependent negotiation strategies are introduced: Boulware, Conceder, Linear and Sit and Wait.

In Fig. 5.1, we illustrate the four possible negotiation strategies. Let the x-axis indicate the negotiation time and the y-axis indicate the concession that an agent makes. During a negotiation, an agent can employ different negotiation strategies to make concessions by considering the negotiation time. Details of the four general negotiation strategies are as follows:

• Boulware: the rate of change in the slope is increasing, corresponding to smaller concessions in the early stages but larger concessions in the later stages.

• Conceder: the rate of change in the slope is decreasing, corresponding to larger concessions in the early stages but smaller concessions in the later stages.

• Linear: the rate of change in the slope is zero, corresponding to making constant concessions.

• Sit and Wait: the rate of change in the slope and the slope itself are always zero, corresponding to not making any concession but just waiting for the opponent's concession.

Fig. 5.1 Four common negotiation strategies (x-axis: time; y-axis: concession; curves: Boulware, Linear, Conceder, Sit and Wait)

Algorithm 1: Multiple regression algorithm

Input: Historical utility set $\hat{U} = \{\hat{u}_t \mid t = 1 \ldots T\}$, where all $\hat{u}_t$ have been normalized to $[0, 1]$. Threshold $\theta \in [0, 1]$.
Output: Multiple regression function set $R = \{R_j(t) \mid j = 1 \ldots J\}$. Each regression function indicates a kind of behaviour performed by the agent in a certain period, and is in the form of $R_j(t) = a_j \times t^2 + b_j \times t + c_j$, $t \in [t_j^{min}, t_j^{max}]$.
Initialization: Initialize the sets $U^{*}$ and $R$ to $\emptyset$.
for each utility $\hat{u}_t$ in the set $\hat{U}$ do
    $U^{*} \leftarrow U^{*} \cup \{\hat{u}_t\}$
    if the size of $U^{*}$ is smaller than two then
        go to the next iteration
    end if
    generate the quadratic regression function $R(t)^{*}$ by using the set $U^{*}$ and the regression approach introduced in Sect. 5.2.1
    initialize $avg$ to 0
    for each utility $\hat{u}_t$ in the set $U^{*}$ do
        $avg \leftarrow avg + |\hat{u}_t - u_t|_1$
    end for
    $avg \leftarrow avg / sizeof(U^{*})$
    if $avg > \theta$ then
        reset the set $U^{*}$ to $\emptyset$
        $R \leftarrow R \cup \{R(t)^{*}\}$
    end if
end for
return the set $R$


Table 5.1 The relationship between negotiation strategies and coefficients

Strategy name    Coefficient                    Function
Boulware         $a > 0$                        $R(t) = a \times t^2 + b \times t + c$
Conceder         $a < 0$                        $R(t) = a \times t^2 + b \times t + c$
Linear           $a = 0$ and $b \neq 0$         $R(t) = b \times t + c$
Sit and wait     $a = b = 0$ and $c \neq 0$     $R(t) = c$
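The mapping in Table 5.1 can be sketched as a small helper. The function name `classify_strategy` and the numerical tolerance `tol` are assumptions introduced here, since fitted coefficients are rarely exactly zero in practice.

```python
def classify_strategy(a: float, b: float, c: float, tol: float = 1e-9) -> str:
    """Map fitted coefficients of R(t) = a*t^2 + b*t + c to a strategy name.

    Follows the rows of Table 5.1; `tol` (an assumption) decides when a
    coefficient is treated as zero.
    """
    if a > tol:
        return "Boulware"
    if a < -tol:
        return "Conceder"
    if abs(b) > tol:
        return "Linear"
    if abs(c) > tol:
        return "Sit and Wait"
    return "undefined"  # a = b = c = 0 is not covered by Table 5.1


print(classify_strategy(5.0, 0.0, 0.0))   # Boulware
print(classify_strategy(0.0, 0.0, 0.7))   # Sit and Wait
```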

Since the curves of an agent's possible behaviours illustrated in Fig. 5.1 are monotonic, we introduce a quadratic regression function to predict an agent's behaviours as follows:

$$R(t) = a \times t^2 + b \times t + c \tag{5.1}$$

where $R(t)$ indicates the utility that an agent gains at the $t$-th negotiation round, and $\tau$ is the negotiation deadline ($0 \le t \le \tau$). Parameters $a$, $b$ and $c$ are coefficients of the function $R(t)$, and are independent of $t$. The proposed quadratic regression function can simulate the four common negotiation strategies mentioned above by assigning different values to the coefficients, as shown in Table 5.1. Because the aim of this regression approach is to use an opponent's historical offers to generate a regression function $R(t)$ that fits those offers optimally, we must ensure that the differences between the regression results and the real offers in all negotiation rounds are minimized. Let the set $\hat{U} = \{\hat{u}_{t'} \mid t' \in [1, t]\}$ be the opponent's real historical offers in the previous $t$ rounds, and let $R(t)$ be the regression function on the set $\hat{U}$. It is assumed that all distances $\varepsilon(t')$ between the real offers $\hat{u}_{t'}$ and the regression results $u_{t'}$ (where $u_{t'}$ is the value of $R(t)$ at $t = t'$) in all negotiation rounds obey the normal distribution. Let $\varepsilon(t') = \hat{u}_{t'} - u_{t'}$; then the joint probability density function for all $\varepsilon(t')$ in the previous $t$ negotiation rounds is:

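$$L_{\varepsilon}(t) = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^{t} \exp\left\{-\frac{1}{2\sigma^{2}} \sum_{t'=0}^{t} \left[\hat{u}_{t'} - u_{t'}\right]^{2}\right\} \tag{5.2}$$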

where $L_{\varepsilon}(t)$ indicates the joint probability that all offers $\hat{u}_{t'}$ occur. Because each $\hat{u}_{t'}$ comes from the historical record, $L_{\varepsilon}(t)$ should take its maximum value. To maximize $L_{\varepsilon}(t)$, the sum $\sum_{t'=0}^{t} [\hat{u}_{t'} - u_{t'}]^{2}$ must achieve its minimum value. Let

$$Q(a, b, c) = \sum_{t'=0}^{t} \left[\hat{u}_{t'} - a t'^{2} - b t' - c\right]^{2} \tag{5.3}$$
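The step from the likelihood in Eq. (5.2) to the sum of squares in Eq. (5.3) can be made explicit by taking logarithms. This is the standard maximum-likelihood argument for normally distributed errors, written here in the notation of the surrounding text:

```latex
\ln L_{\varepsilon}(t)
  = -t \ln\!\left(\sigma\sqrt{2\pi}\right)
    - \frac{1}{2\sigma^{2}} \sum_{t'=0}^{t} \left[\hat{u}_{t'} - u_{t'}\right]^{2}
  = -t \ln\!\left(\sigma\sqrt{2\pi}\right) - \frac{1}{2\sigma^{2}}\, Q(a, b, c)
```

The first term does not depend on $a$, $b$ or $c$, and $1/(2\sigma^{2})$ is positive, so maximizing the likelihood over the coefficients is equivalent to minimizing $Q(a, b, c)$.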

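Then we calculate the partial derivatives of $Q(a, b, c)$ with respect to $a$, $b$ and $c$, respectively, and set them equal to zero:

$$\left\{\begin{array}{l}
\dfrac{\partial Q}{\partial a} = -2 \sum_{t'=0}^{t} \left(\hat{u}_{t'} - a t'^{2} - b t' - c\right) t'^{2} = 0 \\[2ex]
\dfrac{\partial Q}{\partial b} = -2 \sum_{t'=0}^{t} \left(\hat{u}_{t'} - a t'^{2} - b t' - c\right) t' = 0 \\[2ex]
\dfrac{\partial Q}{\partial c} = -2 \sum_{t'=0}^{t} \left(\hat{u}_{t'} - a t'^{2} - b t' - c\right) = 0
\end{array}\right. \tag{5.4}$$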

Then the above equations can be simplified to:

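$$\left\{\begin{array}{l}
\left(\sum_{t'=0}^{t} t'^{4}\right) a + \left(\sum_{t'=0}^{t} t'^{3}\right) b + \left(\sum_{t'=0}^{t} t'^{2}\right) c = \sum_{t'=0}^{t} t'^{2} \hat{u}_{t'} \\[1ex]
\left(\sum_{t'=0}^{t} t'^{3}\right) a + \left(\sum_{t'=0}^{t} t'^{2}\right) b + \left(\sum_{t'=0}^{t} t'\right) c = \sum_{t'=0}^{t} t' \hat{u}_{t'} \\[1ex]
\left(\sum_{t'=0}^{t} t'^{2}\right) a + \left(\sum_{t'=0}^{t} t'\right) b + t c = \sum_{t'=0}^{t} \hat{u}_{t'}
\end{array}\right. \tag{5.5}$$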

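Let $C_u$, $C_a$, $C_b$ and $C_c$ be the coefficient matrices for Eq. (5.5),

$$C_u = \begin{pmatrix}
\sum_{t'=0}^{t} t'^{4} & \sum_{t'=0}^{t} t'^{3} & \sum_{t'=0}^{t} t'^{2} \\
\sum_{t'=0}^{t} t'^{3} & \sum_{t'=0}^{t} t'^{2} & \sum_{t'=0}^{t} t' \\
\sum_{t'=0}^{t} t'^{2} & \sum_{t'=0}^{t} t' & t
\end{pmatrix}, \quad
C_a = \begin{pmatrix}
\sum_{t'=0}^{t} t'^{2} \hat{u}_{t'} & \sum_{t'=0}^{t} t'^{3} & \sum_{t'=0}^{t} t'^{2} \\
\sum_{t'=0}^{t} t' \hat{u}_{t'} & \sum_{t'=0}^{t} t'^{2} & \sum_{t'=0}^{t} t' \\
\sum_{t'=0}^{t} \hat{u}_{t'} & \sum_{t'=0}^{t} t' & t
\end{pmatrix} \tag{5.6}$$

$$C_b = \begin{pmatrix}
\sum_{t'=0}^{t} t'^{4} & \sum_{t'=0}^{t} t'^{2} \hat{u}_{t'} & \sum_{t'=0}^{t} t'^{2} \\
\sum_{t'=0}^{t} t'^{3} & \sum_{t'=0}^{t} t' \hat{u}_{t'} & \sum_{t'=0}^{t} t' \\
\sum_{t'=0}^{t} t'^{2} & \sum_{t'=0}^{t} \hat{u}_{t'} & t
\end{pmatrix}, \quad
C_c = \begin{pmatrix}
\sum_{t'=0}^{t} t'^{4} & \sum_{t'=0}^{t} t'^{3} & \sum_{t'=0}^{t} t'^{2} \hat{u}_{t'} \\
\sum_{t'=0}^{t} t'^{3} & \sum_{t'=0}^{t} t'^{2} & \sum_{t'=0}^{t} t' \hat{u}_{t'} \\
\sum_{t'=0}^{t} t'^{2} & \sum_{t'=0}^{t} t' & \sum_{t'=0}^{t} \hat{u}_{t'}
\end{pmatrix} \tag{5.7}$$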

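Because $\det C_u \neq 0$, the parameters $a$, $b$ and $c$ have a unique solution, which by Cramer's rule is:

$$a = \frac{\det C_a}{\det C_u}, \quad b = \frac{\det C_b}{\det C_u}, \quad c = \frac{\det C_c}{\det C_u} \tag{5.8}$$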

Then, by employing these parameters, we can find a particular function $R(t)$ to represent the opponent's historical offers in the previous $t$ rounds. Furthermore, we can also predict the opponent's possible offer in the next round simply by replacing $t$ with $t + 1$.

For example, assume a seller's initial price for selling a car is $5,000, and the reservation price is $3,000. If the seller employs the Boulware negotiation strategy, and sets its negotiation deadline to the 20th negotiation round, then the seller will generate offers $5,000, $4,995, $4,980, $4,955 and $4,920 in the first five rounds. By adopting the regression method introduced in this subsection, a buyer agent can estimate that the seller's possible price for the 6th round is $4,875. It can be seen that when agents perform common negotiation strategies, the proposed regression analysis approach can successfully estimate their behaviours.
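The car example can be reproduced with a minimal sketch of the normal equations (5.5) solved by Cramer's rule as in Eq. (5.8). The helper names `det3`, `fit_quadratic` and `predict` are hypothetical, and two choices are assumptions of this sketch: the five observed rounds are indexed 0 to 4, and the constant-term entry of the coefficient matrix is taken as the number of observations. Exact rational arithmetic is used so that the result is free of rounding error.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def fit_quadratic(times, offers):
    """Least-squares fit of R(t) = a*t^2 + b*t + c via Cramer's rule."""
    ts = [Fraction(t) for t in times]
    us = [Fraction(u) for u in offers]
    n = len(ts)  # assumption: number of observations in the constant-term entry
    s1, s2 = sum(ts), sum(t**2 for t in ts)
    s3, s4 = sum(t**3 for t in ts), sum(t**4 for t in ts)
    r0 = sum(us)
    r1 = sum(t * u for t, u in zip(ts, us))
    r2 = sum(t**2 * u for t, u in zip(ts, us))
    c_u = [[s4, s3, s2], [s3, s2, s1], [s2, s1, n]]
    c_a = [[r2, s3, s2], [r1, s2, s1], [r0, s1, n]]
    c_b = [[s4, r2, s2], [s3, r1, s1], [s2, r0, n]]
    c_c = [[s4, s3, r2], [s3, s2, r1], [s2, s1, r0]]
    d = det3(c_u)
    if d == 0:
        raise ValueError("need at least three distinct time points")
    return det3(c_a) / d, det3(c_b) / d, det3(c_c) / d


def predict(coeffs, t):
    """Evaluate the fitted quadratic at round t."""
    a, b, c = coeffs
    return a * t**2 + b * t + c


prices = [5000, 4995, 4980, 4955, 4920]       # first five seller offers
coeffs = fit_quadratic(range(5), prices)       # rounds indexed 0..4
print(float(predict(coeffs, 5)))               # 4875.0
```

Because the five offers lie exactly on the curve 5000 - 5t^2, the fit recovers a = -5, b = 0 and c = 5000, and the prediction for the next round matches the $4,875 quoted in the text. Here the fit is applied to prices rather than concessions, so the sign of the leading coefficient is reversed relative to Table 5.1.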

5.2.2 Complex Behaviours Prediction

Fig. 5.2 An example of complex agent behaviour

In the previous subsection, we introduced four common negotiation strategies and proposed a quadratic regression method to estimate agent behaviours. However, in reality, agents may perform complex behaviours that go beyond these simple behaviours. For example, agents may perform the Boulware negotiation strategy when the negotiation environment is disadvantageous to themselves, and shift to the Conceder strategy when the environment improves. In Fig. 5.2, we illustrate an example of a complex agent behaviour. Obviously, such a complex behaviour cannot be represented by a single quadratic regression function. In order to solve this issue, we introduce a multiple regression method to represent complex agent behaviours. The pseudocode of the multiple regression algorithm is displayed in Algorithm 1. The input of this algorithm is a negotiator’s historical offers, and the output is a series of quadratic regression functions that optimally fit the negotiator’s offers. The basic procedure of the multiple regression algorithm is as follows.

Step 1 Initialize both the regression data set $U^{*}$ and the regression function set $R$ to the empty set $\emptyset$;

Step 2 If the input set $\hat{U}$ is empty, then terminate the algorithm and output the set $R$. Otherwise, move forward to Step 3;

Step 3 Move a historical offer from the set $\hat{U}$ to the set $U^{*}$ according to the time series. If the size of the set $U^{*}$ is smaller than two, then repeat Step 3. Otherwise, move forward to Step 4;

Step 4 Generate a temporal regression function $R(t)^{*}$ by using the data set $U^{*}$ and the regression method introduced in Sect. 5.2.1;

Step 5 Calculate the average distance between the historical offers in set $U^{*}$ and the regression results. If the average distance is smaller than a predefined value, then add the temporal function $R(t)^{*}$ to the output set $R$, empty the set $U^{*}$, and move back to Step 2. Otherwise, move back to Step 2 directly;
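The idea behind these steps can be sketched in code. The block below is a greedy variant written for illustration and is not a transcription of Algorithm 1: it keeps extending a segment while the average fitting error stays within the threshold, closes the segment when the next offer would exceed it, and starts the next segment from the shared boundary point. All function names are hypothetical, offers are assumed to arrive as `(t, value)` pairs with distinct rounds, and a straight line is used when a segment holds only two points.

```python
def solve_linear(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [list(map(float, row)) + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_poly(points):
    """Least-squares fit of degree min(2, len(points) - 1).

    Returns [c0, c1, c2] for the polynomial c0 + c1*t + c2*t^2.
    """
    size = min(2, len(points) - 1) + 1
    matrix = [[sum(t ** (i + j) for t, _ in points) for j in range(size)]
              for i in range(size)]
    rhs = [sum((t ** i) * u for t, u in points) for i in range(size)]
    return solve_linear(matrix, rhs) + [0.0] * (3 - size)


def mean_abs_error(coeffs, points):
    """Average distance between the offers and the fitted values."""
    return sum(abs(coeffs[0] + coeffs[1] * t + coeffs[2] * t * t - u)
               for t, u in points) / len(points)


def segment_offers(points, threshold):
    """Greedy piecewise fit; returns a list of (t_start, t_end, coeffs)."""
    segments, current = [], []
    for point in points:
        candidate = current + [point]
        if len(candidate) < 2:
            current = candidate
        elif mean_abs_error(fit_poly(candidate), candidate) <= threshold:
            current = candidate
        else:
            # Close the segment before the offer that broke the fit and
            # start the next one from the shared boundary point.
            segments.append((current[0][0], current[-1][0], fit_poly(current)))
            current = [current[-1], point]
    if len(current) >= 2:
        segments.append((current[0][0], current[-1][0], fit_poly(current)))
    return segments


# Synthetic offers: t^2 up to t = 4, then a straight line 28 - 3t.
history = [(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), (5, 13), (6, 10), (7, 7), (8, 4)]
print([(start, end) for start, end, _ in segment_offers(history, 0.05)])  # [(0, 4), (4, 8)]
```

Sharing the boundary point between neighbouring segments mirrors the touching domains of the kind listed in Table 5.2; whether Algorithm 1 handles boundaries in exactly this way is an assumption of the sketch.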

In Fig. 5.3, we illustrate a regression result obtained by applying the multiple regression algorithm to the example displayed in Fig. 5.2. It can be seen that all regression functions fit the agent’s complex behaviour very well. The regression functions generated by the multiple regression algorithm are listed in Table 5.2. This example demonstrates that, by employing the multiple regression algorithm, complex agent behaviours can also be represented by quadratic regression functions.

Fig. 5.3 An example of multiple regression

Table 5.2 Multiple quadratic regression functions

| Index | Regression function | Domain |
|---|---|---|
| 1 | $R(t) = -0.085 \cdot t^2 + 0.59 \cdot t - 0.004$ | $t \in [1, 4]$ |
| 2 | $R(t) = 0.037 \cdot t^2 - 0.789 \cdot t + 3.573$ | $t \in [4, 11]$ |
| 3 | $R(t) = -0.023 \cdot t^2 + 0.87 \cdot t - 7.097$ | $t \in [11, 17]$ |
| 4 | $R(t) = 0.024 \cdot t^2 - 1.093 \cdot t + 12.536$ | $t \in [17, 24]$ |
| 5 | $R(t) = -0.026 \cdot t^2 + 1.551 \cdot t - 21.76$ | $t \in [24, 30]$ |
| 6 | $R(t) = 0.017 \cdot t^2 - 1.176 \cdot t + 21.045$ | $t \in [30, 36]$ |
| 7 | $R(t) = -0.01 \cdot t^2 + 0.869 \cdot t - 17.745$ | $t \in [36, 42]$ |
| 8 | $R(t) = 0.004 \cdot t^2 - 0.365 \cdot t + 9.551$ | $t \in [42, 47]$ |
| 9 | $R(t) = 0.004 \cdot t^2 - 0.355 \cdot t + 8.396$ | $t \in [47, 50]$ |
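A set of piecewise functions like the one in Table 5.2 can be evaluated by looking up the segment whose domain contains the round of interest. The sketch below copies the coefficients from Table 5.2; the name `evaluate` and the rule of using the first matching segment at a shared boundary are assumptions made for illustration.

```python
# Rows of Table 5.2 as (a, b, c, t_min, t_max) for R(t) = a*t^2 + b*t + c.
TABLE_5_2 = [
    (-0.085, 0.59, -0.004, 1, 4),
    (0.037, -0.789, 3.573, 4, 11),
    (-0.023, 0.87, -7.097, 11, 17),
    (0.024, -1.093, 12.536, 17, 24),
    (-0.026, 1.551, -21.76, 24, 30),
    (0.017, -1.176, 21.045, 30, 36),
    (-0.01, 0.869, -17.745, 36, 42),
    (0.004, -0.365, 9.551, 42, 47),
    (0.004, -0.355, 8.396, 47, 50),
]


def evaluate(t, table=TABLE_5_2):
    """Return R(t) from the first segment whose domain contains t."""
    for a, b, c, t_min, t_max in table:
        if t_min <= t <= t_max:
            return a * t * t + b * t + c
    raise ValueError("t lies outside every fitted domain")


print(evaluate(2))    # uses row 1
print(evaluate(50))   # uses row 9
```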

5.3 Preference Prediction

In this section, we introduce an approach to predict an agent’s preference in bilateral multi-issue negotiation. A negotiation preference indicates the level of emphasis an agent places on each negotiated issue when more than one negotiation issue is considered. Usually, a negotiation preference is represented linearly as a series of weight values [4], and each weight value indicates the agent’s concern about a particular issue. The greater (lower) the weight value assigned to an issue, the more (less) concern is paid to that issue. Let $W_t = \{w^m_t \mid m \in [1, M]\}$ ($\sum_{m=1}^{M} w^m_t = 1$) be an agent’s preference on all negotiated issues, where $w^m_t$ is the agent’s concern about the $m$th issue, and $M$ is the total number of issues in a multi-issue negotiation. In order to estimate the agent’s preference, firstly for each single issue $m$ ($m \in [1, M]$), we adopt the multiple quadratic regression algorithm introduced in the previous section to generate a set of regression functions $R^m = \{R^m_j(t) \mid j = 1 \ldots J\}$ to specify the agent’s negotiation behaviours. Each $R^m_j(t)$ is in the form of Eq. (5.9).

$$
R^m_j(t) = a^m_j \cdot t^2 + b^m_j \cdot t + c^m_j, \quad t \in [t^{\min}_{m,j}, t^{\max}_{m,j}]
\tag{5.9}
$$

Now negotiators may have different preferences on issues and/or may change their preferences when the negotiation environment changes. Considering the real-world situation, we make the following assumption.

A negotiator gives concessions to issues in a multi-issue negotiation based on its preference. The more/less significant an issue, the less/more concession will be given on the issue, and vice versa.

For example, in a two-issue negotiation scenario, a buyer and a seller bargain over a car’s price and warranty. If the buyer is more concerned about the price, then he/she will give little concession on the price, but may make a large concession on the warranty. If the seller considers both the price and the warranty equally, then he/she will make similar concessions on the two negotiated issues. Based on the above assumption, we can inspect an agent’s historical offers on each issue and then predict the agent’s preference. Firstly, an agent’s modification on issue $m$ at round $t$ is represented by the derivative of the regression function $R^m_j(t)$, namely $c^m_t$ ($c^m_t \in C^m$). $c^m_t$ is defined as follows.

$$
c^m_t = \frac{\partial R^m_j(t)}{\partial t} = 2 a^m_j \cdot t + b^m_j
\tag{5.10}
$$

Note that the greater $c^m_t$ is, the less significant issue $m$ is considered by an agent at round $t$, and the more concession the agent would like to give on the issue at round $t$. Let $w^m_t$ ($w^m_t \in W_t$) be the agent’s concern about issue $m$ at round $t$; $w^m_t$ is calculated as follows.

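$$
w^m_t = \frac{1 / c^m_t}{\sum_{n=1}^{M} 1 / c^n_t} = \frac{\prod_{n=1, n \neq m}^{M} c^n_t}{\sum_{n=1}^{M} \left( \prod_{p=1, p \neq n}^{M} c^p_t \right)}
\tag{5.11}
$$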

Then by calculating the agent’s concerns about all negotiated issues, i.e. $W_t$, the agent’s preference at round $t$ is estimated.
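Equations (5.10) and (5.11) translate directly into a few lines of code. The following is a minimal sketch with hypothetical function names; it follows the formulas literally, so it rejects a zero concession rate and does not address rates of mixed sign, which the text above does not discuss.

```python
def concession_rate(a, b, t):
    """Eq. (5.10): derivative of R(t) = a*t^2 + b*t + c at round t."""
    return 2.0 * a * t + b


def estimate_preference(rates):
    """Eq. (5.11): weights proportional to 1 / c_t^m, normalised to sum to 1."""
    if any(rate == 0 for rate in rates):
        raise ValueError("Eq. (5.11) is undefined for a zero concession rate")
    inverses = [1.0 / rate for rate in rates]
    total = sum(inverses)
    if total == 0:
        raise ValueError("inverse rates cancel out; weights are undefined")
    return [inv / total for inv in inverses]


# Two issues: a small concession rate on issue 1, a larger one on issue 2.
weights = estimate_preference([0.1, 0.4])
print(weights)  # issue 1 receives the larger weight
```

In this illustration the issue with the smaller concession rate receives the larger weight (about 0.8 against 0.2), which matches the assumption that an agent concedes less on the issues it cares about more.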


5.4 Optimal Offer Generation

In this section, we introduce methods to generate an optimal offer in bilateral multi-issue negotiation by employing the predicted preference in the previous section.

Before introducing the proposed methods, we first define some notations. Let Agents $p$ and $q$ be the two negotiators. For one agent (either Agent $p$ or $q$), we assume that it already knows its own negotiation strategy, utility function and preference at any particular negotiation round $t$, namely $\lambda_t$, $U(t)$ and $W_t = \{w^m_t \mid m = 1 \ldots M\}$ ($\sum_{m=1}^{M} w^m_t = 1$), respectively. Secondly, by employing the prediction method introduced in Sect. 5.3, the agent can estimate its opponent’s preference at round $t$, namely $W^o_t = \{w^{o,m}_t \mid m = 1 \ldots M\}$ ($\sum_{m=1}^{M} w^{o,m}_t = 1$).

Furthermore, according to the “pie splitting” theory [16], if the whole utility of an item is 1 and one negotiator claims $u$ ($u \in [0, 1]$) out of 1, then the other negotiator’s utility is $1 - u$. In multi-issue negotiation, the situation on each single issue can be treated similarly to the “pie splitting” game, i.e. if an agent claims utility $u^m_t$ for issue $m$ at round $t$, then its opponent can only get utility $(1 - u^m_t)$.

Let set $U_t = \{u^m_t \mid m = 1 \ldots M,\ u^m_t \in [0, 1]\}$ be an agent’s utilities on all issues at round $t$ according to its utility function $U(t)$. Normally, for any negotiation round $t$, $u^m_t = U(t)$ and $\sum_{m=1}^{M} u^m_t \cdot w^m_t = U(t)$. Let set $U^*_t = \{u^{m*}_t \mid m = 1 \ldots M,\ u^{m*}_t \in [0, 1]\}$ be the agent’s utilities on all issues at round $t$ when adopting the optimal offer. Then the purpose of this paper is to find the set $U^*_t$ that benefits all negotiation participants, i.e. maximizing the agent’s utility while minimizing the opponent’s loss as much as possible. Let Inequality (5.12) indicate this requirement for the optimal offer $U^*_t$ as follows.

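$$
\begin{cases}
\sum_{m=1}^{M} u^{m*}_t \cdot w^m_t \ge \sum_{m=1}^{M} u^m_t \cdot w^m_t & (a) \\
\sum_{m=1}^{M} (1 - u^{m*}_t) \cdot w^{o,m}_t \ge \sum_{m=1}^{M} (1 - u^m_t) \cdot w^{o,m}_t & (b)
\end{cases}
\tag{5.12}
$$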

Inequality (5.12) indicates that the optimal offer $U^*_t$ should provide at least as much utility to both negotiation participants as the agent’s original offer $U_t$. In order to solve this problem, we first transform Inequality (5.12) into Inequality (5.13), and then introduce two methods, a geometric method and an algebraic method, in the following two subsections.

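$$
\begin{cases}
\sum_{m=1}^{M} u^{m*}_t \cdot w^m_t - \sum_{m=1}^{M} u^m_t \cdot w^m_t \ge 0 & (a) \\
\sum_{m=1}^{M} u^{m*}_t \cdot w^{o,m}_t - \sum_{m=1}^{M} u^m_t \cdot w^{o,m}_t \le 0 & (b)
\end{cases}
\tag{5.13}
$$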
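Whether a candidate offer meets both conditions of Inequality (5.13) can be checked numerically. The sketch below is illustrative: the function names and the sample utilities and weights are made up, and the opponent’s weights are assumed to be already estimated.

```python
def weighted_utility(utilities, weights):
    """Weighted sum of per-issue utilities."""
    return sum(u * w for u, w in zip(utilities, weights))


def satisfies_5_13(u_star, u, w_own, w_opp):
    """True if u_star is no worse than u for both sides, per Inequality (5.13)."""
    own_gain = weighted_utility(u_star, w_own) - weighted_utility(u, w_own)
    opp_change = weighted_utility(u_star, w_opp) - weighted_utility(u, w_opp)
    return own_gain >= 0 and opp_change <= 0


original = [0.6, 0.6]      # the agent's original claim on each issue
w_own = [0.7, 0.3]         # the agent cares mostly about issue 1
w_opp = [0.5, 0.5]         # the opponent weighs both issues equally

# Claim more on issue 1 and less on issue 2: both sides gain.
print(satisfies_5_13([0.7, 0.4], original, w_own, w_opp))  # True
# Claim more on issue 1 without giving anything back: the opponent loses.
print(satisfies_5_13([0.7, 0.6], original, w_own, w_opp))  # False
```

The first candidate works only because the two preference vectors differ, which is the situation the geometric and algebraic methods exploit.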

5.4.1 A Geometric Method

Fig. 5.4 Lines A and B have an intersection. (a) $B_y > A_y$ and $B_x > A_x$, (b) $B_y > A_y$ and $B_x < A_x$, (c) $B_y < A_y$ and $B_x > A_x$, and (d) $B_y < A_y$ and $B_x < A_x$

In this subsection, we introduce a geometric method to calculate the solution for Inequality (5.13), and try to equally increase both negotiators’ utilities. In order to simplify the discussion, we set the number of negotiation issues to two ($M = 2$). Then Inequality (5.13) can be rewritten as follows:

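$$
\begin{cases}
w^1_t \cdot u^{1*}_t + w^2_t \cdot u^{2*}_t - u_t \ge 0 & (a) \\
w^{o,1}_t \cdot u^{1*}_t + w^{o,2}_t \cdot u^{2*}_t - (1 - u^o_t) \le 0 & (b)
\end{cases}
\tag{5.14}
$$

where

$$
\begin{cases}
u_t = w^1_t \cdot u^1_t + w^2_t \cdot u^2_t \\
u^o_t = w^{o,1}_t \cdot (1 - u^1_t) + w^{o,2}_t \cdot (1 - u^2_t)
\end{cases}
\tag{5.15}
$$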

Let the x-axis indicate the negotiator’s utility on issue 1, and the y-axis the negotiator’s utility on issue 2; then some possible situations of Inequality (5.14) are illustrated in Fig. 5.4. Let Line $A$ be the line given by $w^1_t \cdot u^{1*}_t + w^2_t \cdot u^{2*}_t - u_t = 0$ and Line $B$ be the line given by $w^{o,1}_t \cdot u^{1*}_t + w^{o,2}_t \cdot u^{2*}_t - (1 - u^o_t) = 0$. Then Line $A$ indicates an agent’s utility function at negotiation round $t$, and Line $B$ indicates the opponent’s utility function.


Let Point $P$ (if it exists) be the intersection between Line $A$ and Line $B$; then Point $P$ is a solution that satisfies Inequality (5.14). However, because negotiators may have different preferences, it is possible to find other points that increase both negotiators’ utilities together.

According to the geometric meaning of Inequality (5.14), a point located above Line $A$ will enlarge an agent’s utility, and a point located below Line $B$ will enlarge the opponent’s utility. If we can find a point, namely Point $O$, which is located above Line $A$ as well as below Line $B$, then both negotiators’ utilities can be increased at the same time. The distance between Point $O$ and Line $A$/$B$ indicates the increment on the agent’s/opponent’s utility. Theoretically, more than one such point may be found. In order to equally and maximally enlarge both negotiators’ utilities, we consider three possible cases, which are (1) Line $A$ and Line $B$ are not parallel, (2) Line $A$ and Line $B$ are parallel, and (3) Line $A$ and Line $B$ are identical. We will discuss the existence of Point $O$ and the approach to calculate Point $O$ (if it exists) in each case.

Before we start the discussion on the three possible cases, we first define some useful points. Let Point $A_x$ be the intersection between Line $A$ and the x-axis, and Point $A_y$ be the intersection between Line $A$ and the y-axis. Let Point $B_x$ be the intersection between Line $B$ and the x-axis, and Point $B_y$ be the intersection between Line $B$ and the y-axis. Then Points $A_x$, $A_y$, $B_x$, $B_y$ and $P$ are defined as follows.

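$$
\begin{cases}
A_x = \left( \dfrac{u_t}{w^1_t},\ 0 \right), \quad
A_y = \left( 0,\ \dfrac{u_t}{w^2_t} \right), \quad
B_x = \left( \dfrac{1 - u^o_t}{w^{o,1}_t},\ 0 \right), \\[8pt]
B_y = \left( 0,\ \dfrac{1 - u^o_t}{w^{o,2}_t} \right), \quad
P = \left( \dfrac{u_t w^{o,2}_t - w^2_t (1 - u^o_t)}{w^1_t w^{o,2}_t - w^2_t w^{o,1}_t},\ \dfrac{w^1_t (1 - u^o_t) - u_t w^{o,1}_t}{w^1_t w^{o,2}_t - w^2_t w^{o,1}_t} \right).
\end{cases}
\tag{5.16}
$$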
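The points in Eq. (5.16) can be computed as follows. This is a sketch with a hypothetical function name; it assumes all four weights are nonzero and returns `None` for $P$ when the denominator vanishes, i.e. when the two lines are parallel or identical. The sample values are made up.

```python
def line_points(w1, w2, u, wo1, wo2, uo):
    """Intercepts of Lines A and B and their intersection P, per Eq. (5.16)."""
    ax = (u / w1, 0.0)
    ay = (0.0, u / w2)
    bx = ((1.0 - uo) / wo1, 0.0)
    by = (0.0, (1.0 - uo) / wo2)
    det = w1 * wo2 - w2 * wo1
    if det == 0:
        p = None  # parallel or identical lines have no single intersection
    else:
        p = ((u * wo2 - w2 * (1.0 - uo)) / det,
             (w1 * (1.0 - uo) - u * wo1) / det)
    return ax, ay, bx, by, p


# Original offer (0.6, 0.6) with own weights (0.7, 0.3) and opponent weights
# (0.5, 0.5): u_t = 0.6 and u_t^o = 0.5 * 0.4 + 0.5 * 0.4 = 0.4 by Eq. (5.15).
ax, ay, bx, by, p = line_points(0.7, 0.3, 0.6, 0.5, 0.5, 0.4)
print(p)  # close to (0.6, 0.6), the original offer
```

In this example $P$ coincides with the original offer, which is expected because both lines in Inequality (5.14) pass through $(u^1_t, u^2_t)$.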

5.4.1.1 Line A and Line B Are not Parallel

If Line $A$ and Line $B$ are not parallel, in order to equally and maximally increase both negotiators’ utilities, we propose that Point $O$ is the center of the inscribed circle of triangle $P$-$A_y$-$B_y$ (if $A_y < B_y$) or triangle $P$-$A_x$-$B_x$ (if $A_x < B_x$). The reason behind this proposal is that the center of the inscribed circle has the maximal and equal distance to the three edges of the triangle, which indicates that both negotiators’ utilities can be equally maximized. In Fig. 5.4, we illustrate four possible cases in which Line $A$ and Line $B$ intersect. For the cases illustrated in Fig. 5.4a–c, if the three vertices of the triangle are located at point $(x_a, y_a)$, point $(x_b, y_b)$ and point $(x_c, y_c)$, and the opposite sides of the triangle have lengths $a$, $b$ and $c$, then the incenter is at point $(o_x, o_y)$, which can be calculated by Eq. (5.17). For the case illustrated in Fig. 5.4d, if Point $P$ is out of the first quadrant and $A_y > A_x$, then, because Line $A$ is located above Line $B$ in the first quadrant, it is not possible to find a Point $O$ that increases both negotiators’ utilities together.

Fig. 5.5 Lines A and B do not have an intersection. (a) $B_y > A_y$ and $B_x > A_x$ and (b) $B_y < A_y$ and $B_x < A_x$

In general, if Line $A$ and Line $B$ have an intersection, it means that both negotiators have (1) different preferences on the negotiation issues, and (2) different acceptance areas on the negotiation outcome. So Line $A$ and Line $B$ will have different slopes and different intersections with the x-axis and y-axis. For example, in a two-issue negotiation related to a car’s price and warranty, a buyer’s initial offer on the two issues is ($4,000, 5 years), the reservation offer is ($5,500, 2 years) and the preference on the two issues is (0.7, 0.3). On the other hand, a seller’s initial offer is ($6,000, 3 years), the reservation offer is ($5,000, 4 years), and the seller’s preference is (0.5, 0.5). Because the seller and the buyer differ in both their preferences and their acceptance areas on the two negotiated issues, Point $O$ exists and can be determined by employing Eq. (5.17).

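$$
\begin{cases}
o_x = \dfrac{a x_a + b x_b + c x_c}{a + b + c}, \\[8pt]
o_y = \dfrac{a y_a + b y_b + c y_c}{a + b + c}
\end{cases}
\tag{5.17}
$$

where

$$
\begin{cases}
a = \sqrt{(x_b - x_c)^2 + (y_b - y_c)^2}, \\
b = \sqrt{(x_c - x_a)^2 + (y_c - y_a)^2}, \\
c = \sqrt{(x_a - x_b)^2 + (y_a - y_b)^2}.
\end{cases}
\tag{5.18}
$$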
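Equations (5.17) and (5.18) are the standard incenter formula and can be checked with a small sketch. The function name is hypothetical and the 3-4-5 triangle is chosen only because its incenter is easy to verify by hand.

```python
import math


def incenter(pa, pb, pc):
    """Incenter of the triangle pa-pb-pc, following Eqs. (5.17) and (5.18)."""
    a = math.hypot(pb[0] - pc[0], pb[1] - pc[1])  # side opposite pa
    b = math.hypot(pc[0] - pa[0], pc[1] - pa[1])  # side opposite pb
    c = math.hypot(pa[0] - pb[0], pa[1] - pb[1])  # side opposite pc
    perimeter = a + b + c
    if perimeter == 0:
        raise ValueError("degenerate triangle: all vertices coincide")
    ox = (a * pa[0] + b * pb[0] + c * pc[0]) / perimeter
    oy = (a * pa[1] + b * pb[1] + c * pc[1]) / perimeter
    return ox, oy


print(incenter((0, 0), (4, 0), (0, 3)))  # (1.0, 1.0) for the 3-4-5 right triangle
```

In the negotiation setting, the three vertices would be $P$ together with either $A_y$ and $B_y$ or $A_x$ and $B_x$, as described in Sect. 5.4.1.1.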

5.4.1.2 Line A and Line B Are Parallel

If Line $A$ and Line $B$ do not intersect, then Line $A$ may be located below or above Line $B$. If Line $A$ is located below Line $B$, according to Fig. 5.5a, any point on the middle line (Line $C$) between Line $A$ and Line $B$ can be the optimal point. However, in order to decrease the impact caused by inaccurately estimating the opponent’s utility function and preference, we set the middle point of Line $C$ as Point $O(o_x, o_y)$ as follows.


$$
\begin{cases}
o_x = \dfrac{A_{xx} + B_{xx}}{4}, \\[8pt]
o_y = \dfrac{A_{yy} + B_{yy}}{4}.
\end{cases}
\tag{5.19}
$$

where $A_{xx}$ and $B_{xx}$ are the x-axis values of Points $A_x$ and $B_x$, and $A_{yy}$ and $B_{yy}$ are the y-axis values of Points $A_y$ and $B_y$.
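For parallel lines, Eq. (5.19) reduces to averaging the intercepts and halving the result. The sketch below uses a hypothetical function name and made-up intercepts to show that the resulting point lies on the middle line.

```python
def parallel_case_point(axx, ayy, bxx, byy):
    """Point O for parallel Lines A and B, following Eq. (5.19)."""
    return (axx + bxx) / 4.0, (ayy + byy) / 4.0


# Line A: x + y = 0.4 and Line B: x + y = 0.8, so Line C is x + y = 0.6.
ox, oy = parallel_case_point(0.4, 0.4, 0.8, 0.8)
print(ox + oy)  # approximately 0.6, i.e. the point lies on Line C
```

The middle line $C$ has intercepts $(A_{xx} + B_{xx})/2$ and $(A_{yy} + B_{yy})/2$, and the midpoint of the segment between those two intercepts gives the division by four in Eq. (5.19).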

For the case illustrated in Fig. 5.5b, because Line $A$ is located above Line $B$, it is impossible to find a point located above Line $A$ as well as below Line $B$, so Point $O$ does not exist. In general, if Line $A$ and Line $B$ are parallel, this means that both negotiation participants have a similar preference on the negotiated issues but different acceptance areas on the negotiation outcome. Let us use the example in Sect. 5.4.1.1 again. If the seller modifies its preference to (0.7, 0.3), which is the same as the buyer’s preference, then Line $A$ and Line $B$ will be parallel. Nevertheless, because the negotiators have different acceptance areas on the negotiation outcome, the optimal offer still exists. However, as shown in Fig. 5.5b, if the negotiators’ acceptance areas do not intersect, an optimal offer does not exist.

5.4.1.3 Line A and Line B Are Identical

If Line $A$ and Line $B$ are identical, it is impossible to find a Point $O$ that benefits all negotiators. When Line $A$ and Line $B$ are identical, this indicates that both negotiation participants have a similar preference, and also the same acceptance area on the negotiation outcome. Taking the example we used before, if the seller modifies its initial offer to ($5,500, 2 years), its reservation offer to ($4,000, 5 years), and its preference to (0.7, 0.3), the seller and buyer will have the same preference and acceptance area on the negotiation outcome, and an offer that benefits both negotiators at the same time does not exist.

In summary, for the cases discussed above, if Point $O$ exists, the optimal offer for an agent at round $t$ is generated by setting $u^{1*}_t$ to the x-axis value of Point $O$ and $u^{2*}_t$ to the y-axis value of Point $O$. However, if Point $O$ does not exist, we can only adopt the original offer, i.e. $u^1_t = u^2_t = U(t)$.

5.4.2 An Algebraic Method

The geometric method introduced in the previous subsection might be hard to implement when the number of negotiated issues is greater than three. In this subsection, we introduce an algebraic method to calculate the optimal offer.

The principle of the algebraic method is as follows. Generally, it is proposed that an agent’s optimal offer at round $t$ should satisfy two requirements. (1) The optimal offer should maximize the agent’s profit as much as possible. This requirement is based on the consideration that all negotiators in a bargaining situation are selfish and seek maximal benefits [17]. (2) The optimal offer should minimize the opponent’s loss. This requirement is based on the observation that the opponent will not accept an offer which damages its benefit too much. In order to make an optimal offer more acceptable to the opponent, and so increase the negotiation efficiency, the second requirement should also be satisfied as much as possible.

In order to find the optimal offer by using an algebraic method, we employ Lagrange multipliers. The method of Lagrange multipliers maximizes a function $f(x)$ subject to a constraint $g(x) = c$, where $x$ denotes the variables and $c$ is a constant.

Let Eq. (5.20)(a) be an agent’s increment on utility after employing the optimal offer $U^*_t$ at round $t$; then Eq. (5.20)(a) is the function that the agent tries to maximize. Let Eq. (5.20)(b) be the opponent’s loss on utility if it accepts the optimal offer $U^*_t$; then Eq. (5.20)(b) is the constraint that the agent tries to satisfy. We use variable $c$ to indicate the amount of possible loss for the opponent. In order to minimize the opponent’s loss, we set $c$’s default value to 0. If an optimal offer does not exist under such a constraint, the constraint will be relaxed according to a predefined criterion. For example, $c$ can be enlarged gradually in steps of 0.1 until an optimal offer is achieved or until $c = 1$. Based on the above consideration, we let Eq. (5.20)(a) be the $f(x)$ function in a Lagrangian and Eq. (5.20)(b) be the $g(x)$ function in the Lagrangian. Then, in order to find the optimal offer $U^*_t$, the Lagrangian is defined in Eq. (5.21).

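$$
\begin{cases}
f(U^*_t) = \sum_{m=1}^{M} u^{m*}_t \cdot w^m_t - \sum_{m=1}^{M} u^m_t \cdot w^m_t & (a) \\
f^o(U^*_t) = \sum_{m=1}^{M} u^{m*}_t \cdot w^{o,m}_t - \sum_{m=1}^{M} u^m_t \cdot w^{o,m}_t & (b)
\end{cases}
\tag{5.20}
$$

where $U^*_t = \{u^{m*}_t \mid m = 1 \ldots M\}$.

$$
\Lambda(U^*_t, \lambda) = f(U^*_t) + \lambda \cdot (f^o(U^*_t) - c)
\tag{5.21}
$$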

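Firstly, by setting the partial derivatives of $\Lambda(U^*_t, \lambda)$ with respect to each variable in set $U^*_t$ and to $\lambda$ to zero, we get Eq. (5.22) as follows:

$$
\begin{cases}
\dfrac{\partial \Lambda(U^*_t, \lambda)}{\partial u^{1*}_t} = \dfrac{\partial f(U^*_t)}{\partial u^{1*}_t} + \lambda \cdot \dfrac{\partial f^o(U^*_t)}{\partial u^{1*}_t} = 0 \\
\quad \vdots \\
\dfrac{\partial \Lambda(U^*_t, \lambda)}{\partial u^{m*}_t} = \dfrac{\partial f(U^*_t)}{\partial u^{m*}_t} + \lambda \cdot \dfrac{\partial f^o(U^*_t)}{\partial u^{m*}_t} = 0 \\
\quad \vdots \\
\dfrac{\partial \Lambda(U^*_t, \lambda)}{\partial u^{M*}_t} = \dfrac{\partial f(U^*_t)}{\partial u^{M*}_t} + \lambda \cdot \dfrac{\partial f^o(U^*_t)}{\partial u^{M*}_t} = 0 \\
\dfrac{\partial \Lambda(U^*_t, \lambda)}{\partial \lambda} = f^o(U^*_t) - c = 0
\end{cases}
\tag{5.22}
$$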

Equation (5.22) contains $M + 1$ variables and $M + 1$ equations. By solving Eq. (5.22), we may get three possible results:


1. No solution. Then we can enlarge the opponent’s loss (the value of $c$) according to a predefined criterion and recalculate the optimal offer;

2. A single solution $U^*_t = \{u^{m*}_t \mid m = 1 \ldots M\}$, where $u^{m*}_t$ indicates the offer on the $m$th issue. Then $U^*_t$ is the optimal solution;

3. Multiple solutions, namely set $MU^*_t = \{U^*_{t,i}\}$. Then the optimal solution $U^*_t$ can be found as follows:

$$
\forall U^*_{t,i} \in MU^*_t,\ \exists U^*_t \in MU^*_t \Rightarrow f(U^*_t) \ge f(U^*_{t,i})
\tag{5.23}
$$
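The conditions in Eq. (5.22) can be written out explicitly, because $f$ and $f^o$ in Eq. (5.20) are linear in every $u^{m*}_t$. The expansion below is a reader’s sketch derived only from Eqs. (5.20) and (5.21); it is not an additional result from the source.

$$
\frac{\partial \Lambda}{\partial u^{m*}_t} = w^m_t + \lambda \cdot w^{o,m}_t = 0 \quad (m = 1 \ldots M), \qquad
\frac{\partial \Lambda}{\partial \lambda} = \sum_{m=1}^{M} \left( u^{m*}_t - u^m_t \right) \cdot w^{o,m}_t - c = 0
$$

Under this reading, the first $M$ conditions involve only the weights and $\lambda$, so they can hold together only when the ratio $w^m_t / w^{o,m}_t$ is the same for every issue $m$; in other situations the system falls under result 1 above. The bounds $u^{m*}_t \in [0, 1]$ are not part of Eq. (5.21) and would need to be enforced separately.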

5.5 Experiment

In this section, we examine our proposed bilateral multi-issue negotiation approach by comparing it with the NDF negotiation approach [2].


All experiments are performed on a DELL OPTIPLEX GX620 machine. In order to simplify the experiments, we employ three agents (one seller and two buyers) in a two-issue negotiation. Because the proposed geometric optimization method can easily handle two-issue negotiation and can illustrate the negotiation outcome with a 2-D graph, we adopt the geometric optimization method in the experiments. Of course, agents can choose either the geometric method or the algebraic method, according to their specifications and applications. The correctness of the algebraic method was proved mathematically in Sect. 5.4.2.

The experimental setup is described as follows. A seller agent seller1 wants to sell a car and also provides a warranty for a number of years, and both buyer agents buyer1 and buyer2 want to purchase the car from seller1. In order to make the negotiation results more reliable, the buyer agents and the seller agent adopt different initial offers, reservation offers, preferences, negotiation strategies and deadlines. Also, in order to make the negotiation results more comparable, the two buyer agents (i.e. buyer1 and buyer2) adopt the same negotiation parameters during the negotiation. All negotiators’ parameters are listed in Table 5.3. More specifically, the buyer’s initial price is randomly selected within [$1,600, $2,400], and the initial warranty is randomly selected within [4 years, 6 years]. The buyer’s reservation price is randomly selected within [$2,400, $3,600], and the buyer’s reservation warranty is randomly selected within [2.4 years, 3.6 years]. In order to ensure that both negotiation participants have an agreement zone, the seller’s initial offer is set to the buyer’s reservation offer, and the seller’s reservation offer is set to the buyer’s initial offer. All negotiators’ negotiation strategies are randomly selected among the Boulware, Conceder and Linear negotiation strategies. The negotiation deadlines are randomly selected within [16, 24] rounds, and all agents’ preferences are also generated randomly.

Table 5.3 Negotiation parameters

| Agent name | Initial offer: Price | Initial offer: Warranty | Reserved offer: Price | Reserved offer: Warranty | Preference: Price | Preference: Warranty |
|---|---|---|---|---|---|---|
| Buyer 1 and 2 | [$1,600, $2,400] | [4 y, 6 y] | [$2,400, $3,600] | [2.4 y, 3.6 y] | [0, 1] | [0, 1] |
| Seller 1 | [$2,400, $3,600] | [2.4 y, 3.6 y] | [$1,600, $2,400] | [4 y, 6 y] | [0, 1] | [0, 1] |

| Agent name | Strategy | Deadline | PMP |
|---|---|---|---|
| Buyer 1 and 2 | [0, 2] | [16, 24] | 0 % |
| Seller 1 | [0, 2] | [16, 24] | [0 %, 100 %] |

During the negotiation, all agents employ Rubinstein’s alternating offering protocol [16] as the negotiation protocol and the package deal procedure as the negotiation procedure. All agents keep their negotiation parameters private. In order to simulate an open and dynamic negotiation environment, seller1 will randomly modify its preference. The probability that seller1 will modify its preference is indicated by the factor Probability Modify Preference (PMP). The value of PMP is randomly selected within [0%, 100%]. If PMP = 0%, seller1 will not modify its preference at all. If PMP = 50%, seller1 modifies its preference with probability one half. If PMP = 100%, seller1 will definitely modify its preference in each negotiation round. Let $w_p$ indicate seller1’s concern about the price, and $w_w$ indicate seller1’s concern about the warranty; then the new preference is generated randomly by seller1, i.e. $w_p = \mathrm{rand}(0, 1)$ and $w_w = 1 - w_p$. Both seller1 and buyer2 will employ the NDF negotiation model to generate their counter-offers, and buyer1 will employ the proposed geometric method to generate its counter-offers. The NDF negotiation approach is explained briefly as follows.

In the NDF negotiation approach, an agent’s response in each negotiation roundis defined as follows [2]:

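$$
A_s(t, p^t_{\hat{a} \to a}) =
\begin{cases}
\text{Quit} & \text{if } t > \tau_a, \\
\text{Accept} & \text{if } U_a(p^t_{\hat{a} \to a}) \ge U_a(p^{t'}_{a \to \hat{a}}), \\
\text{Offer } p^{t'}_{a \to \hat{a}} \text{ at } t' & \text{otherwise}.
\end{cases}
\tag{5.24}
$$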

where $\tau_a$ is Agent $a$’s deadline, $p^{t'}_{a \to \hat{a}}$ is the counter-offer from Agent $a$ to Opponent $\hat{a}$ at round $t'$, and $U_a(p^t_{\hat{a} \to a})$ indicates Agent $a$’s utility on the given offer $p^t_{\hat{a} \to a}$ from Opponent $\hat{a}$ at round $t$. Agent $a$’s counter-offer $p^{t'}_{a \to \hat{a}}$ and evaluation function $U_a(p^t_{\hat{a} \to a})$ are defined as follows.

$$
p^t_{a \to \hat{a}} = RP_a \times \left( \frac{t}{\tau_a} \right)^{\lambda_a} + IP_a \times \left[ 1 - \left( \frac{t}{\tau_a} \right)^{\lambda_a} \right]
\tag{5.25}
$$

and,

$$
U_a(p^t_{\hat{a} \to a}) = \frac{p^t_{\hat{a} \to a} - RP_a}{IP_a - RP_a}
\tag{5.26}
$$

where $IP_a$ indicates Agent $a$’s initial offer, $RP_a$ indicates Agent $a$’s reservation offer, and $\lambda_a$ indicates Agent $a$’s negotiation strategy. For more detail about the NDF negotiation model, please refer to [18, 19].

Fig. 5.6 Experimental results comparison. (a) Buyer’s utility, (b) Seller’s utility, (c) Round, and (d) Time. In each panel the x-axis is PMP (0% to 100%) and the y-axis is the ratio.
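Equations (5.24) to (5.26) can be sketched in a few functions. The names below are hypothetical, and the choice of planning the counter-offer for round $t + 1$ is an assumption of this sketch, since the text only refers to a later round $t'$.

```python
def ndf_offer(initial, reservation, t, deadline, strategy):
    """Eq. (5.25): time-dependent offer at round t."""
    progress = (t / deadline) ** strategy
    return reservation * progress + initial * (1.0 - progress)


def ndf_utility(offer, initial, reservation):
    """Eq. (5.26): 1 at the initial offer, 0 at the reservation offer."""
    return (offer - reservation) / (initial - reservation)


def ndf_response(t, received, initial, reservation, deadline, strategy):
    """Eq. (5.24), assuming the counter-offer is planned for round t + 1."""
    if t > deadline:
        return ("quit", None)
    counter = ndf_offer(initial, reservation, t + 1, deadline, strategy)
    if ndf_utility(received, initial, reservation) >= ndf_utility(counter, initial, reservation):
        return ("accept", received)
    return ("offer", counter)


# Seller from Sect. 5.2.1: initial $5,000, reservation $3,000, deadline 20.
print([ndf_offer(5000, 3000, t, 20, 2) for t in range(5)])
print(ndf_response(0, 4000, 5000, 3000, 20, 2))
```

With a strategy exponent of 2 and rounds indexed from 0, the first five offers match the Boulware example in Sect. 5.2.1 ($5,000, $4,995, $4,980, $4,955, $4,920) up to floating-point rounding.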

5.5.2 Experimental Results

By repeating the experiments 1,000 times (with parameters selected randomly from the domains displayed in Table 5.3), we summarize the experimental results in Fig. 5.6 to illustrate the improvement achieved by adopting the proposed geometric optimization method. The x-axis indicates the probability that the seller agent modifies its preference in each negotiation round, namely PMP, and the y-axis indicates the ratio between the negotiation outcomes obtained by using the proposed method and by using the NDF approach. In Fig. 5.6a, it can be seen that when the value of PMP shifts from 0 to 100%, buyer1’s increment on utility decreases from 13 to 3%. This experimental result indicates that the more likely an opponent is to modify its preference during a negotiation, the more difficult it is for an agent to find the optimal offer that increases the agent’s utility. That is because when the opponent frequently modifies its preference, it is more difficult for the agent to accurately estimate the opponent’s preference, so the effectiveness of the optimal offer decreases. However, in reality, an opponent may not modify its preference very frequently during a negotiation, hence the increment on utility in real-world cases is not expected to be lower than 9%.

In Fig. 5.6b, the seller’s utilities obtained by negotiating with different buyers are also compared. It can be seen that by negotiating with buyer1, seller1’s utility is improved by at least 20% compared with the negotiation outcome with buyer2. The reason for this performance is that seller1 can decide, based on its own benefit, whether an optimal offer from buyer1 will be accepted. For example, if buyer1 wrongly estimates seller1’s preference at a certain negotiation round and generates an incorrect optimal offer, then seller1 can reject this offer and reply with a counter-offer. However, buyer1 will not always wrongly estimate seller1’s preference and generate an incorrect optimal offer. If buyer1’s estimation is correct at a certain round, then seller1 will accept the optimal offer, so seller1’s utility is improved similarly (by the maximal improvement) for all PMP values.

In Fig. 5.6c, it can be seen that the number of negotiation rounds decreases by 20% on average when the geometric method is employed. However, as shown in Fig. 5.6d, the proposed geometric method needs more computational time to find the optimal offers. The computational time spent on the geometric method is 1.8 times as much as the computational time spent on the NDF approach.

Based on the above experimental results, it can be concluded that by adopting the geometric method, all negotiators’ utilities can be increased compared with the original offer, and the number of negotiation rounds also decreases. However, the proposed approach needs more computational time to calculate the optimal offer than the NDF approach.

5.5.3 Case Study

Because of page limitations, we adopt only one example to demonstrate the proposed negotiation approach, which includes the opponent’s behaviour prediction, the opponent’s preference prediction, and the optimal offer calculation. All negotiation parameters of this example are listed in Table 5.4. During the negotiation, seller1 will modify its preference randomly, and the probability of seller1 modifying its preference in each negotiation round is 30%. In Table 5.5, seller1’s preference in each negotiation round is listed in the column “Seller1’s Preference”.


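Table 5.4 Negotiation parameters for the case study

| Agent name | Initial offer: Price | Initial offer: Warranty | Reserved offer: Price | Reserved offer: Warranty | Preference: Price | Preference: Warranty |
|---|---|---|---|---|---|---|
| Buyer 1 and 2 | $2,090.9 | 5.1 y | $3,001.5 | 3.45 y | 0.66 | 0.34 |
| Seller 1 | $3,591.96 | 3.1 y | $2,260.88 | 5.29 y | Dynamic change | Dynamic change |

| Agent name | Strategy | Deadline | PMP |
|---|---|---|---|
| Buyer 1 and 2 | 1.77 | 22 | 0 % |
| Seller 1 | 1.33 | 21 | 30 % |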

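Table 5.5 Seller1’s preference and buyer1’s estimation

| Negotiation round | Seller1’s preference: Price | Seller1’s preference: Warranty | Buyer1’s estimation: Price | Buyer1’s estimation: Warranty |
|---|---|---|---|---|
| 1–5 | 0.886 | 0.114 | 0.880 | 0.120 |
| 6 | 0.988 | 0.012 | 0.866 | 0.134 |
| 7 | 0.499 | 0.501 | 0.483 | 0.517 |
| 8–10 | 0.166 | 0.834 | 0.158 | 0.842 |
| 11 | 0.320 | 0.680 | 0.300 | 0.700 |
| 12–13 | 0.176 | 0.824 | 0.167 | 0.833 |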

Fig. 5.7 Negotiation between buyer1 and seller1. (a) Buyer1’s estimation, (b) Buyer1’s view, (c) Seller1’s view, and (d) Utilities comparison

Firstly, by adopting the multiple regression method introduced in Sect. 5.2, buyer1 can estimate seller1’s utility functions for each single issue. The threshold of acceptable error for multiple regression is set to 0.05. The regression graph is displayed in Fig. 5.7a, and all multiple regression functions are listed in Table 5.6.


Table 5.6 Buyer1’s regression functions on seller1’s utility function

Regression functions on price

| Index | Regression function | Domain |
|---|---|---|
| 1 | $R(t) = -0.002t^2 - 0.013t + 1.649$ | [1, 5] |
| 2 | $R(t) = -0.11t + 2$ | [5, 6] |
| 3 | $R(t) = 0.258t^2 - 2.278t + 6.390$ | [6, 8] |
| 4 | $R(t) = -0.087t^2 + 1.166t - 0.855$ | [8, 11] |
| 5 | $R(t) = 0.41t - 1.969$ | [11, 12] |

Regression functions on warranty

| Index | Regression function | Domain |
|---|---|---|
| 7 | $R(t) = -0.017t^2 - 0.097t + 1.23$ | [1, 5] |
| 8 | $R(t) = -0.712t + 3.414$ | [5, 6] |
| 9 | $R(t) = -0.161t^2 + 2.288t - 7.572$ | [6, 8] |
| 10 | $R(t) = -7.893\mathrm{E}{-4}\,t^2 - 0.025t + 0.793$ | [8, 10] |
| 11 | $R(t) = -0.241t + 2.668$ | [10, 11] |
| 12 | $R(t) = 0.082t - 0.562$ | [11, 12] |

Regression functions on overall utility

| Index | Regression function | Domain |
|---|---|---|
| 13 | $R(t) = 0.573t + 1$ | [1, 2] |
| 14 | $R(t) = -0.003t^2 - 0.029t + 1.605$ | [2, 5] |
| 15 | $R(t) = 0.0673t^2 - 0.81t + 3.602$ | [5, 7] |
| 16 | $R(t) = -0.512 \cdot t + 4.235$ | [7, 8] |
| 17 | $R(t) = 0.258t - 1.155$ | [8, 9] |
| 18 | $R(t) = -0.011t^2 + 0.148t + 0.426$ | [9, 12] |

Secondly, by adopting the proposed preference estimation method introduced in Sect. 5.3, buyer1 can estimate seller1’s preference in each negotiation round. The estimation results are listed in Table 5.5 in the column “Buyer1’s Estimation.” It can be seen that the estimated preferences are very close to seller1’s real preferences.

Lastly, by adopting the geometric method introduced in Sect. 5.4, buyer1 can calculate the optimal offer in each negotiation round (if one exists). The detailed negotiation procedure between buyer1 and seller1 is displayed in Fig. 5.7. In Fig. 5.7b, it can be seen that in the 8th, 10th and 12th negotiation rounds, buyer1 modifies its original offers on the car’s price and warranty to increase seller1’s utility. However, the total utility of buyer1 is not damaged.

Contrary to the buyer’s view, it can be seen in Fig. 5.7c that after buyer1 modifies its offers on the two issues, seller1’s utility is improved considerably. In Fig. 5.7d, both buyer1’s and seller1’s utilities in each negotiation round are displayed. It can be seen that, in the 8th, 10th and 12th rounds, seller1’s utility was improved significantly. Finally, when agreement is achieved in the 13th round, buyer1’s utility was 0.706, and seller1’s utility was 0.599. In order to show the improvement achieved by adopting the geometric method, we also illustrate the negotiation process between buyer2 and seller1 in Fig. 5.8. Buyer2’s negotiation parameters are exactly the same as buyer1’s, and seller1’s negotiation parameters are not changed. In Fig. 5.8a, b, it can be seen that buyer2 only employs the NDF approach to generate counter-offers in each negotiation round, and neither negotiator gains any extra benefit during the negotiation. Finally, in Fig. 5.8c, it can be seen that buyer2 failed to enlarge its own and seller1’s utilities. When agreement is achieved in the 16th round, buyer2’s utility is 0.551, and seller1’s utility is 0.389. By comparing these negotiation outcomes with the outcomes obtained by using the geometric method, it can be seen that after adopting the geometric method, the buyer agent’s utility is improved by 28%, and the seller agent’s utility is improved by 54%. The number of negotiation rounds is also decreased by 19%. However, the computational time spent by the geometric method is around 60% more than that of the NDF approach.

Fig. 5.8 Negotiation between buyer2 and seller1. (a) Buyer2’s view, (b) Seller1’s view, (c) Utilities comparison, and (d) Buyer1’s optimal offer

In detail, for example, in the 7th round, when seller1’s preference is 0.166 on price and 0.834 on warranty, seller1 sends an offer ($4,896.99, 4.13 years) to both buyers. If buyer1 or buyer2 accepts this offer, then seller1 will get 0.768 utility in total, from utilities of 1.98 on price and 0.527 on warranty. In the 8th round, by employing the NDF negotiation approach, buyer2 rejects seller1’s offer and generates a counter-offer ($2,210.92, 4.84 years), which claims 0.868 utility on all issues for itself. If seller1 accepts this offer, seller1 will get utilities (−0.038, 0.205), i.e. 0.165 in total. In contrast to buyer2, in the 8th round, buyer1 estimates that seller1’s preference is (0.158, 0.842), and sends a counter-offer ($1,325.26, 3.89 years) as a response. If seller1 accepts this offer, buyer1 will get utilities (1.84, 0.27), i.e. 0.868 in total, and seller1 will get utilities (−0.73, 0.64), i.e. 0.413 in total. Without damaging its own utility, buyer1’s offer increases seller1’s utility by 0.248 compared with buyer2’s offer. The calculation procedure for buyer1 at the 8th negotiation round by using the geometric method is illustrated in Fig. 5.8d.
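The totals quoted in this case study are consistent with a weighted sum of the per-issue utilities under seller1’s real preference (0.166, 0.834). The short check below is only a sketch: the weighted-sum aggregation is an assumption inferred from the reported figures, and `weighted_utility` is a hypothetical helper name rather than part of the original method.

```python
def weighted_utility(preference, utilities):
    """Weighted sum of per-issue utilities (assumed aggregation rule)."""
    return sum(w * u for w, u in zip(preference, utilities))


# Seller1's real preference on (price, warranty), as reported for the 7th round.
seller1_preference = (0.166, 0.834)

# Seller1's per-issue utilities for its own offer and for the two counter-offers.
own_offer = weighted_utility(seller1_preference, (1.98, 0.527))
buyer2_offer = weighted_utility(seller1_preference, (-0.038, 0.205))
buyer1_offer = weighted_utility(seller1_preference, (-0.73, 0.64))

# Totals rounded to three decimals, as in the text.
print(round(own_offer, 3), round(buyer2_offer, 3), round(buyer1_offer, 3))

# Gain for seller1 when buyer1's counter-offer replaces buyer2's.
print(round(buyer1_offer - buyer2_offer, 3))
```

Rounded to three decimals, the values agree with the totals and the 0.248 gain quoted above.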


In this section, we illustrated experimental results obtained by applying the proposed geometric method. From both the statistical results and the individual case study, we can confidently say that the proposed bilateral optimal multi-issue negotiation approach successfully improves both negotiators’ utilities and decreases the number of negotiation rounds. However, it needs more computational time to perform sophisticated calculations.

5.6 Conclusion

In this paper, we proposed an approach for bilateral multi-issue negotiations in open and dynamic environments. Firstly, a multiple regression method is introduced to capture an opponent’s negotiation behaviour based only on the opponent’s historical offers. Secondly, a preference estimation method is introduced to predict the opponent’s preference dynamically during a negotiation. Thirdly, two methods are introduced to find the optimal offer dynamically. The geometric method can illustrate the optimal offer details directly by a 2D graph, but its complexity increases as the number of issues becomes greater than three. By contrast, the algebraic method easily handles large numbers of issues, but the selection of the constraint value will impact the negotiation result. To sum up, a negotiator should choose a suitable method according to its own specifications and requirements in a negotiation. Finally, the geometric method is evaluated by a series of experiments. Both the statistical results and a case study demonstrate that the proposed method improves both negotiators’ utilities.

References

1. Bosse, T., Jonker, C., Treur, J.: Experiments in human multi-issue negotiation: analysis and support. In: Proceedings of 3rd International Joint Conference on Autonomous Agents and Multi-Agent Systems (AAMAS04), pp. 671–678 (2004)

2. Fatima, S., Wooldridge, M., Jennings, N.: An agenda-based framework for multi-issue negotiation. Artif. Intell. 152(1), 1–45 (2004)

3. Lai, G., Li, C., Sycara, K.: A General Model for Pareto Optimal Multi-Attribute Negotiations, Rational, Robust, and Secure Negotiations in Multi-Agent Systems, Studies in Computational Intelligence, 89, 2008, pp. 59–80 (2006)

4. Fatima, S., Wooldridge, M., Jennings, N.: Optimal negotiation of multiple issues in incomplete information settings. In: Proceedings of 3rd International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS04), pp. 1080–1087 (2004)

5. Fatima, S., Wooldridge, M., Jennings, N.: Multi-issue negotiation with deadlines. J. Artif. Intell. Res. (JAIR) 27, 381–417 (2006)

6. Lai, G., Sycara, K., Li, C.: A pareto optimal model for automated multi-attribute negotiations. In: Proceedings of 6th International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS07), pp. 1040–1042 (2007)

7. Chajewska, U., Koller, D., Ormoneit, D.: Learning an agent’s utility function by observing behavior. In: Proceedings of 18th International Conference on Machine Learning (ICML01), pp. 35–42 (2001)

8. Coehoorn, R., Jennings, N.: Learning an opponent’s preferences to make effective multi-issue negotiation trade-offs. In: Proceedings of 6th International Conference on Electronic Commerce, ICEC 2004, pp. 59–68 (2004)

9. Gal, Y., Pfeffer, A.: Predicting people’s bidding behavior in negotiation. In: Proceedings of 5th International Conference on Autonomous Agents and Multiagent Systems (AAMAS06), pp. 370–376 (2006)

10. Zeng, D., Sycara, K.: Bayesian learning in negotiation. Int. J. Human-Comp. Stud. 48(1), 125–141 (1998)

11. Zeng, D., Sycara, K.: Benefits of learning in negotiation. In: Proceedings of 14th National Conference on Artificial Intelligence and 9th Innovative Applications of Artificial Intelligence Conference (AAAI/IAAI97), pp. 36–42 (1997)

12. Narayanan, V., Jennings, N.: An adaptive bilateral negotiation model for e-commerce settings. In: Proceedings of the Seventh IEEE International Conference on E-Commerce Technology, pp. 34–41 (2005)

13. Narayanan, V., Jennings, N.: Learning to negotiate optimally in non-stationary environments. In: Cooperative Information Agents X. Lecture Notes in Computer Science, vol. 4149, pp. 288–300. Springer, Berlin (2006)

14. Brzostowski, J., Kowalczyk, R.: Predicting partner’s behaviour in agent negotiation. In: Proceedings of 5th International Conference on Autonomous Agents and Multiagent Systems (AAMAS06), pp. 355–361 (2006)

15. Ren, F., Zhang, M.: Prediction of partners’ behaviors in agent negotiation under open and dynamic environment. In: Proceedings of 2007 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology Workshops, pp. 379–382 (2007)

16. Rubinstein, A.: Perfect equilibrium in a bargaining model. Econometrica 50(1), 97–109 (1982)

17. Kraus, S.: Strategic Negotiation in Multiagent Environments. MIT, Cambridge, MA (2001)

18. Faratin, P., Sierra, C., Jennings, N.: Negotiation decision functions for autonomous agents. J. Robot. Auton. Syst. 24(3–4), 159–182 (1998)

19. Li, C., Sycara, K., Giampapa, J.: Dynamic outside options in alternating-offers negotiations. In: Proceedings of 38th Annual Hawaii International Conference on System Sciences (HICSS05), pp. 1–10 (2005)

Chapter 6
On Dynamic Negotiation Strategy for Concurrent Negotiation over Distinct Objects

Khalid Mansour and Ryszard Kowalczyk

Abstract This paper addresses the problem of generating counteroffers by a buyer agent negotiating with multiple seller agents concurrently over multiple distinct negotiation objects. Each object has one provider and is characterized by multiple issues, i.e., attributes. Most previous works address negotiation strategies for simpler situations where an agent negotiates with multiple opponents for the purpose of securing an agreement over a single object with either a single negotiation issue or multiple negotiation issues. We propose a novel dynamic negotiation strategy that works in a more complicated negotiation scenario. The strategy involves adaptation of both the initially generated counteroffers and the issues’ counteroffers weight matrix during negotiation. The proposed dynamic strategy takes into consideration the behaviors of the current opponents, in terms of their recent concessions, to fine-tune the negotiation strategy of the agent in real time. The initial experimental results show that the proposed mechanism is effective in terms of both the agreement and the utility rates when compared with a static strategy.

Keywords Automated negotiation • Coordination • Dynamic strategy

This work was initially presented at The Fifth International Workshop on Agent-Based Complex Automated Negotiations (ACAN2012)/AAMAS.

K. Mansour • R. Kowalczyk
Faculty of Information & Communication Technologies, Swinburne University of Technology, Melbourne, VIC, Australia
e-mail: [email protected]; [email protected]



6.1 Introduction

Automated negotiation is a main interaction mechanism in multi-agent systems where autonomous agents exchange offers and counteroffers [1]. Automation of negotiation is a multi-disciplinary research domain which borrows from other fields such as economics, artificial intelligence and game theory [2]. For more than a decade, the automated negotiation domain has gained considerable momentum and wide interest from researchers [3–5].

We extend our previous work [6, 7] by considering the one-to-many negotiation case where a buyer agent seeks to procure more than one distinct object, given that each object has multiple negotiation issues (attributes) and one provider. This negotiation scenario applies in a monopolistic market.

Many possible application domains can be represented by one-to-many negotiation, such as the supply chain domain, task allocation, order fulfillment problems and e-commerce [8], in which buyers frequently seek to procure multiple distinct objects, e.g., raw materials. In the cloud computing context, providers and requesters of resources can use automated negotiation to negotiate resource leasing contracts [3].

Since each negotiation object in the described scenario is characterized by multiple negotiation issues, the buyer agent needs to reach an agreement over each issue of an object for the outcome to count as an agreement over that object. In other words, there are agreements at the issue level and agreements at the object level. For our experimental evaluation, we consider agreements at the object level, which implicitly covers the agreements at the issue level.

A dynamic negotiation strategy is a strategy that adapts its parameters during negotiation to reflect the dynamicity of the opponents’ behaviors or some other negotiation-related environmental factors. To this end, we propose a dynamic counteroffer strategy (DCS) that manages multi-bilateral concurrent negotiations. The DCS involves adaptation of both the initially generated counteroffers and the issues’ counteroffers weight matrix during negotiation. The adaptation process depends on the opponents’ behaviors (level of cooperation) in the current negotiation encounter in terms of their recent concessions. The proposed strategy aims to generate counteroffers that provide more utility for the buyer agent than the initially generated counteroffers. In addition, the DCS adapts the issues’ counteroffers weight matrix by exchanging the weights of different issues in the matrix to help improve the agreement rate. We evaluate our strategy by conducting experiments that compare it with a static strategy, in which the buyer agent does not change the negotiation parameters during negotiation. The initial results show that our strategy is more effective and robust than the static strategy.

The rest of the paper is organized as follows. Section 6.2 reviews the related work. Section 6.2.1 presents the negotiation model while Sect. 6.2.2 discusses the coordination approach. Section 6.3 presents the experimental settings and discusses the results. Finally, Sect. 6.4 concludes the paper and outlines the future work.


6.2 Related Work

The one-to-many negotiation is an alternative mechanism to the single-sided auction protocol. The one-to-many negotiation approach offers more flexibility for both buyers and sellers in terms of expressing their preferences through the exchange of offers and counteroffers [3, 9–11].

One of the first explicit architectures for the one-to-many negotiation was presented in [12]. The architecture describes a buyer agent with sub-negotiators and a coordinator, where each sub-negotiator can negotiate with one seller agent. The study proposes four different coordination strategies: (1) the desperate strategy, in which the buyer agent accepts the first acceptable offer and quits negotiations with all other sellers; (2) the patient strategy, where the sub-negotiators can make temporary agreements with the seller agents during negotiation and the buyer agent holds on to these temporary agreements until all the sub-negotiators reach agreements or until the deadline is reached, and then selects the agreement with the highest utility; (3) the optimized patient strategy, which is similar to the patient strategy except that it does not accept a new agreement with less utility than the highest accepted one; and (4) the manipulation strategy, in which the coordinator changes the negotiation strategies of its sub-negotiators during negotiation. The second and third coordination strategies work under the assumption that the buyer agent has the privilege of conducting temporary agreements and pays no penalty for reneging on any temporary agreement. We consider the fourth approach, where the buyer agent changes its negotiation strategy during negotiation, and address the problem of generating counteroffers to be dispatched by the buyer’s delegates in each negotiation round. The buyer agent’s delegates can also be designed to generate counteroffers and adapt their strategy according to the behaviors of their opponents. However, that approach is left for future work.

Several studies were published to address some aspects of one-to-many negotiation [7, 10, 12]. Most published works focus on the situation where agents negotiate over a single continuous issue (e.g., price) for the purpose of securing one agreement, while this paper investigates the problem of one-to-many negotiation over objects with multiple issues for the purpose of securing multiple agreements. Our work is similar to some existing work [11, 13] in terms of choosing the coordination approach that changes the negotiation strategy during negotiation. For example, a decision making technique for changing the negotiation strategies during negotiation depending on information from previous negotiations regarding the agreement rates and the utility rates is proposed in [11]. Our approach investigates the multi-object/multi-issue negotiation domain and is based on the progress of the current negotiation counterparts; it does not rely on information from previous negotiations.

Some heuristic methods were proposed to estimate the expected utility in both synchronized multi-threaded negotiations and dynamic multi-threaded negotiations [14]. The synchronized multi-threaded negotiations model considers the existing outside options for each single thread, while the dynamic multi-threaded negotiations model also considers the uncertain outside options that might arrive in the future. In both cases, the methods assume knowledge of the probability distribution of the reservation prices of the opponents. In many cases, this kind of information is not available. Other works in the literature consider negotiation over multiple issues [15, 16], but their focus is mainly on bilateral encounters over a single object with multiple issues.

Some other works in the literature [6, 17–19] investigate the scenario where an agent negotiates concurrently with several sellers over one object characterized by several issues. The focus of the studies varies: for example, [17] investigates a situation where a seller agent negotiates with several buyers, while [18] investigates the possible negotiation protocol(s) that approximate the Pareto-efficient best seller outcome as defined by the Vickrey auction mechanism. Our work is different since we are targeting a different negotiation scenario.

Negotiation over multiple objects/multiple issues is rarely addressed in the literature. We address that and propose a dynamic coordination strategy by changing some negotiation strategy parameters during negotiation depending on the changes in the concessions offered by the current opponents.

Different types of weight sets or matrices have been used for different purposes in the automated negotiation context. For example, to calculate the weighted average utility of a set of offers for a group of issues with a linear relationship, weights were assigned to each issue to reflect the relative importance of that issue in the issue set, e.g., [20]. The utility weight vector is typically preassigned at the start of negotiation and usually remains static during negotiation. In another study [4], weights are used to mix different tactics (e.g., Boulware, linear, tit-for-tat, etc.) to generate an offer. For a dynamic negotiation strategy, the weights used to mix different negotiation tactics can be changed according to the similarity degree between the last offers from the opponents. The work in [4] considers bilateral negotiation and demonstrates a simple example of changing the tactics’ weights during negotiation.

Our approach uses the issues’ counteroffers weight matrix to distribute the counteroffer values amongst the negotiating delegates in each negotiation round, given that the weights are adapted according to the behaviors of the current opponents.

Since we are investigating the problem of procuring multiple objects, the combinatorial auction is an alternative method for procuring multiple objects [21]. In our setting, we assume that an agent is seeking certain types and numbers of objects to fulfill its demand, and at the same time we assume there are no complementarities (i.e., objects do not substitute for each other) between the different objects under negotiation. An advantage of adopting the negotiation approach is that exchanging offers and counteroffers between agents allows for more flexibility in searching for the appropriate characteristics of certain objects, by explicitly asking for specific values for the issues of each object. In addition, automated negotiation can decrease the time needed to procure objects when compared to the combinatorial auction, and there is no need for an auctioneer when negotiation is used.


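Fig. 6.1 One-to-many negotiation. [Figure: a buyer agent consisting of a coordinator and delegates d1, d2, ..., dn; seller agents S1, S2, ..., Sn; objects O1, O2, ..., Om; annotations m = n and |J| > 1.]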

Our work demonstrates that an agent is able to take advantage of negotiating with multiple opponents concurrently over multiple objects with multiple issues, taking into consideration the different behaviors of the current opponents.

6.2.1 Negotiation Model

We consider a buyer agent and a set of seller agents $S = \{s_1, s_2, \ldots, s_n\}$, see Fig. 6.1. The buyer agent negotiates concurrently with the seller set $S$. We assume that the seller agents are independent in their actions, i.e., they do not exchange information. The buyer agent has a set of delegate negotiators $D = \{d_1, d_2, \ldots, d_n\}$. The buyer agent creates and destroys delegate negotiators during negotiation in response to the number of seller agents that enter or leave the negotiation. Each delegate $d_i$ negotiates with a seller $s_i$. The possible negotiation issues over which $D$ and $S$ negotiate are included in the set $J = \{j_1, j_2, \ldots, j_g\}$, and each issue $j_i \in J$ must be an issue of negotiation for at least one negotiation pair, i.e., $(d_i, s_i)$.

To make our negotiation framework more comprehensive, we introduce the negotiation object set ($O$). A negotiation object is any item over which agents are interested in negotiating. A negotiation object represents either a physical item (e.g., a printed book) or a non-physical item (e.g., a web service). The set of objects is $O = \{o_1, o_2, \ldots, o_m\}$, where $m$ is the number of objects in the current negotiation encounter. Each object $o_i$ in the set $O$ represents an object of negotiation. The idea is illustrated in Fig. 6.1.

We assume that each negotiation delegate is responsible for negotiating over one object. Many delegates can negotiate over the same object, but a delegate cannot negotiate over more than one object concurrently, see function $f_d$ in Eq. (6.1).

Table 6.1 Issues’ counteroffers weight matrix (W)

            Price   Delivery_time   Response_time   Reliability
Service A   0.27    0.33            0               0.49
Service B   0.24    0.35            0.35            0.51
Service C   0.26    0               0.33            0
Service D   0.23    0.32            0.32            0

In our model, each negotiation delegate is mapped into an object, a deadline $t_{max} \in \mathbb{N}^{+}$ and an offer generation strategy $\theta \in \Theta$. Each object is mapped into a negotiation issue set ($J_l \in 2^J$). Finally, each issue is mapped into a set of constraints, e.g., the reservation interval ($[min, max]$), the counteroffer distribution weight, etc. The number and types of constraints vary. Equation (6.1) shows the formal representation of the three functions (i.e., $f_d$, $f_o$, $f_j$).

$$
\begin{aligned}
f_d &: D \to (O \times \mathbb{N}^{+} \times \Theta) \\
f_o &: O \to 2^J \\
f_j &: J \to ([min, max] \times \ldots)
\end{aligned}
\tag{6.1}
$$

In each negotiation round, the buyer agent may need to execute one or more of the functions $f_d$, $f_o$, $f_j$ in Eq. (6.1) to reflect some changes in the environment. At the start of a negotiation process, all the functions in Eq. (6.1) are executed. For example, using $f_d$, a delegate $d_i$ can be assigned a currency converter web service as a negotiation object, 30 negotiation rounds as $t_{max}$ and a time-dependent counteroffer generation tactic. For the currency converter web service object, the price and response time can be assigned as negotiation issues using $f_o$. Finally, for the price and response time issues, reservation values are assigned using $f_j$. Similar assignments can be made for the rest of the delegates, objects and issues. At the start of a new negotiation round, the three functions can be executed again for any delegate, object or issue depending on the dynamicity of the negotiation; for example, the mechanism of counteroffer generation or any of its parameters can be changed. The arrival of a new outside option causes the creation of a new delegate and the execution of the relevant assignments.
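The three mappings can be pictured as simple lookup tables. The following is a minimal sketch of the currency converter example, assuming dictionaries as the representation; the class name `DelegateAssignment`, the identifiers and the reservation intervals are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class DelegateAssignment:
    """Image of a delegate under f_d: (object, deadline, offer generation strategy)."""
    obj: str
    t_max: int
    tactic: str


# f_d: delegate -> (object, t_max, strategy)
f_d = {"d1": DelegateAssignment("currency_converter", 30, "time_dependent")}

# f_o: object -> subset of the issue set J
f_o = {"currency_converter": {"price", "response_time"}}

# f_j: issue -> constraints (the interval values are made up for this sketch)
f_j = {
    "price": {"reservation": (10.0, 40.0)},
    "response_time": {"reservation": (0.1, 2.0)},
}


def issues_for(delegate: str) -> set:
    """Issues a delegate negotiates over, obtained by composing f_d and f_o."""
    return f_o[f_d[delegate].obj]


def add_delegate(name: str, obj: str, t_max: int, tactic: str) -> None:
    """A new outside option arrives: create a delegate and its assignment."""
    f_d[name] = DelegateAssignment(obj, t_max, tactic)


add_delegate("d2", "currency_converter", 30, "time_dependent")
print(sorted(issues_for("d2")))
```

Re-executing a mapping in a later round then amounts to overwriting the corresponding entry, for instance replacing the `tactic` field of a delegate.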

We propose a matrix data structure to represent information related to some negotiation variables. In our model, we use the issues’ counteroffers weight matrix (W) to store the weights of the counteroffers for each issue, see Table 6.1. In any negotiation round, the buyer agent calculates a global counteroffer value ($cv_{j_i}$) for every issue ($j_i \in J$) and divides the calculated counteroffer values amongst the negotiation delegates responsible for negotiating over objects that have the issue $j_i$ in their issue set. We multiply an issue column $i$ by $cv_{j_i}$ to compute the actual counteroffer values that should be allocated to the delegates responsible for negotiating over the issue in column $i$. For example, if the buyer agent decides to allocate $145 as a global value for the price issue in the current negotiation round, i.e., $cv_{price} = 145$, then the counteroffer price values assigned to each service in the current negotiation round according to Table 6.1 are $39.15 for service A, $34.80 for service B, $37.70 for service C and $33.35 for service D.
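The price example can be reproduced with a few lines of code. This is a sketch that assumes the matrix is stored as a nested dictionary keyed by object and issue; the function name `distribute` is hypothetical.

```python
# Issues' counteroffers weight matrix W, transcribed from Table 6.1.
W = {
    "Service A": {"price": 0.27, "delivery_time": 0.33, "response_time": 0.0, "reliability": 0.49},
    "Service B": {"price": 0.24, "delivery_time": 0.35, "response_time": 0.35, "reliability": 0.51},
    "Service C": {"price": 0.26, "delivery_time": 0.0, "response_time": 0.33, "reliability": 0.0},
    "Service D": {"price": 0.23, "delivery_time": 0.32, "response_time": 0.32, "reliability": 0.0},
}


def distribute(weights, issue, cv):
    """Split the global counteroffer value cv of one issue over the objects.

    Objects with a zero weight do not negotiate over the issue and are skipped.
    """
    return {
        obj: round(row[issue] * cv, 2)
        for obj, row in weights.items()
        if row[issue] > 0
    }


price_shares = distribute(W, "price", 145)
for service, value in price_shares.items():
    print(service, value)
```

Calling `distribute(W, "reliability", cv)` returns shares for services A and B only, since the other two rows hold a zero weight in that column.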


A zero entry in any cell of the matrix W means that the issue of that column is not an element of the issue set of the object in that row. For example, the reliability issue is not a negotiation issue for service C in Table 6.1.

In each negotiation round, the matrix W at time $t - 1$ may differ from the matrix W at time $t$ in terms of the values of its cells; hence the matrix is dynamic rather than static.

For a given W of size ($a_1 \times a_2$),

$$
\sum_{j=1}^{a_2} W_{i,j} \neq 1, \qquad \sum_{i=1}^{a_1} W_{i,j} = 1 \tag{6.2}
$$

Equation (6.2) states that the total weight of each column in the matrix W equals 1, while the total weight of a row is in general not equal to 1. The total weight of a row might equal 1 by chance only, and it is irrelevant to the counteroffer distribution calculations.
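The column constraint is easy to check mechanically. The sketch below assumes W is stored as a list of rows in the order of Table 6.1; the helper names and the tolerance are illustrative choices rather than part of the model.

```python
import math

# Rows: services A-D; columns: price, delivery_time, response_time, reliability.
W = [
    [0.27, 0.33, 0.00, 0.49],
    [0.24, 0.35, 0.35, 0.51],
    [0.26, 0.00, 0.33, 0.00],
    [0.23, 0.32, 0.32, 0.00],
]


def column_sums(matrix):
    """Total weight per issue (column)."""
    return [sum(col) for col in zip(*matrix)]


def row_sums(matrix):
    """Total weight per object (row); not constrained by Eq. (6.2)."""
    return [sum(row) for row in matrix]


def is_valid_weight_matrix(matrix, tol=1e-9):
    """True if every column sums to 1 within a small floating-point tolerance."""
    return all(math.isclose(s, 1.0, abs_tol=tol) for s in column_sums(matrix))


print(is_valid_weight_matrix(W))
print([round(s, 2) for s in row_sums(W)])
```

A check of this kind is useful after the weights are reordered or modified, because both operations are expected to preserve the column totals.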

Agents use the alternating offers protocol [22], in which agents exchange offers and counteroffers in each negotiation round. Each agent has a deadline $t^a_{max}$ by which the agent must accept an offer or withdraw from negotiation. In addition, each agent has a reservation value for each negotiation issue. The reservation value of an issue is the minimum/maximum acceptable value for that issue during negotiation. Negotiation deadlines, reservation values and utility structures are considered private information for each agent.

6.2.2 Coordination Approach

During multi-bilateral concurrent negotiation, the buyer agent needs to coordinate its actions against its opponents in each negotiation round so as to achieve the goal of the negotiation process in terms of reaching valuable agreements. Coordinating the buyer’s actions in that context means managing the buyer’s negotiation strategy during negotiation.

Formally, let $\Omega^a$ be the negotiation strategy of an agent $a$; then $\Omega^a = \langle IV^a, RV^a, T^a, \Theta^a \rangle$, where $IV^a$, $RV^a$, $T^a$ and $\Theta^a$ stand for the initial offer value(s), the reservation value(s), the deadline(s) and the set of offer generation strategies of agent $a$, respectively.

Our representation of an agent’s strategy $\Omega^a$ is similar to its representation in [23]. The difference is that the fourth component of the strategy in [23] represents the $\beta$ value in the time-dependent tactics [4], while the fourth component in our representation ($\Theta^a$) is more general and indicates any possible offer generation method (e.g., trade-off, time-dependent, behavior-dependent, etc.) and its associated parameters.

Any change to one or more components of $\Omega^a$ during negotiation means a change in agent $a$’s negotiation strategy. Our focus in this paper is on the last element of $\Omega^a$, i.e., $\Theta^a$. A change in $\Theta^a$ implies any change in the type of offer generation mechanism (e.g., from time-dependent to tit-for-tat) and/or a change to any parameter that affects the amount of the calculated offer/counteroffer values (e.g., the $\beta$ value in time-dependent tactics) or the counteroffer share amongst the common issues of different objects, e.g., a change in the matrix W.

• Definition 1. A common negotiation issue is an issue $j_i \in J$ such that at least two subsets $J_k, J_l \in 2^J$ exist where $j_i \in J_k \cap J_l$.

In other words, a common issue is an issue that is common amongst multiple objects. For example, multiple services can have the price issue as a common issue.
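Definition 1 translates directly into a counting rule over the issue sets of the objects. In the sketch below the issue sets of services A to D follow the non-zero entries of Table 6.1, while "Service E" and its "availability" issue are hypothetical additions used only to show an issue that is not common; `common_issues` is an illustrative name.

```python
from collections import Counter

# Issue set of each object (the image of f_o).
issue_sets = {
    "Service A": {"price", "delivery_time", "reliability"},
    "Service B": {"price", "delivery_time", "response_time", "reliability"},
    "Service C": {"price", "response_time"},
    "Service D": {"price", "delivery_time", "response_time"},
    "Service E": {"price", "availability"},  # hypothetical object, not in Table 6.1
}


def common_issues(sets_by_object):
    """Issues that appear in the issue sets of at least two objects."""
    counts = Counter(issue for issues in sets_by_object.values() for issue in issues)
    return {issue for issue, n in counts.items() if n >= 2}


print(sorted(common_issues(issue_sets)))
```

Only the issues returned by `common_issues` have a weight column whose entries can be redistributed between delegates.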

Managing the values of the generated counteroffers and reordering or modifying the weights in the columns of the matrix W are our interest here. Managing the values of the generated counteroffers aims to minimize the amount of the offered concessions in order to increase the utility of a possible agreement. On the other hand, reordering or modifying the weights in the matrix W aims to help delegates negotiating with tough opponents to reach an agreement.

Different opponents commonly behave differently on different issues, and the DCS benefits from this fact. The buyer agent calculates the counteroffers for each issue in each negotiation round, and the counteroffer values are divided amongst the negotiation objects that share common issues according to the weights of those issues in the matrix W. We therefore propose to reorder or modify the weights in the weight vectors (i.e., the columns of the matrix W) to reflect the relative behaviors (level of cooperation) of the opponents.

We assume that the initial issues’ counteroffers weight matrix (W) is populated from domain knowledge or from previous negotiation encounters. When we adopt the matrix reordering approach, we assume that the common issues are comparable in their valuation. By comparable, we mean that the real values of a common issue do not differ significantly across objects. For example, in the case of the price issue for web services, the prices of all web services under negotiation should be similar, say in the range from $80 to $100. This assumption is realistic in some real-life scenarios. When the weights in the matrix are modified instead of reordered, that assumption can be dropped.

To this end, our coordination approach considers the different behaviors of the seller agents (in terms of their recent concessions) on the common issues in each negotiation round as a dynamic variable for controlling the reordering of the weights in the matrix W or the modification of their values.

An agreement is accepted for a certain object if there are acceptable agreements over all issues related to that object. If there is more than one object under negotiation, then an agreement per object is necessary for the buyer agent to have one global agreement.

The vector of issue values offered by an agent $s$ to an agent (or a buyer’s delegate) $d$ at time $t$ is denoted by $x^t_{s \to d}$. The particular value for an issue $j_i$ offered by an agent $s$ to an agent $d$ at time $t$ is represented by $x^t_{s \to d}[j_i]$. The buyer agent calculates its utility from an agreement over an issue according to Eq. (6.3).


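$$
u^d(x_{s \to d}[j_i]) =
\begin{cases}
\dfrac{x_{s \to d}[j_i] - RV^d_{j_i}}{IV^d_{j_i} - RV^d_{j_i}}, & \text{if } IV^d_{j_i} > RV^d_{j_i} \\[2ex]
\dfrac{RV^d_{j_i} - x_{s \to d}[j_i]}{RV^d_{j_i} - IV^d_{j_i}}, & \text{if } IV^d_{j_i} < RV^d_{j_i}
\end{cases}
\tag{6.3}
$$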

where $u^d(x_{s \to d}[j_i])$ stands for the buyer’s utility from having an agreement over an issue $j_i$ of a certain object at time $t$, and $RV^d_{j_i}$ and $IV^d_{j_i}$ stand for the buyer’s reservation value and the buyer’s initial value for the issue $j_i$, respectively. The next step is to find the weighted average utility of each object. The final utility is calculated by taking the average utility of all objects, since we assume that all objects have the same weight, i.e., the same importance.
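The utility computation can be sketched in three small steps: per-issue utility as in Eq. (6.3), a weighted average per object, and a plain average over objects. The function names, the per-issue utility weights and the numeric values in the usage lines are hypothetical, and normalising by the sum of the weights is an assumption about how the weighted average is formed.

```python
import math


def issue_utility(x, iv, rv):
    """Buyer's utility for value x of one issue, following Eq. (6.3)."""
    if iv > rv:   # buyer prefers larger values (e.g., reliability)
        return (x - rv) / (iv - rv)
    if iv < rv:   # buyer prefers smaller values (e.g., price)
        return (rv - x) / (rv - iv)
    raise ValueError("initial and reservation values must differ")


def object_utility(agreement, iv, rv, weights):
    """Weighted average of the issue utilities of one object."""
    total_weight = sum(weights[j] for j in agreement)
    weighted = sum(weights[j] * issue_utility(agreement[j], iv[j], rv[j]) for j in agreement)
    return weighted / total_weight


def overall_utility(object_utilities):
    """Plain average over objects, since all objects are equally important."""
    return sum(object_utilities) / len(object_utilities)


# Hypothetical buyer parameters for one service.
iv = {"price": 60.0, "reliability": 0.99}
rv = {"price": 100.0, "reliability": 0.90}
weights = {"price": 0.6, "reliability": 0.4}
agreement = {"price": 80.0, "reliability": 0.945}

u_obj = object_utility(agreement, iv, rv, weights)
print(round(u_obj, 3), round(overall_utility([u_obj, 0.7]), 3))
```

An agreement at the initial value yields a utility of 1 and an agreement at the reservation value yields 0, in both branches of Eq. (6.3).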

Algorithm 1 summarizes the main steps of the proposed dynamic negotiation strategy, the dynamic counteroffer strategy (DCS). The genCounteroffer() procedure uses a certain method to generate the initial counteroffer values for the issues. For example, an agent may choose an offer generation technique such as a time-dependent tactic or tit-for-tat. In our experiments, the buyer agent uses the time-dependent tactics to generate its initial counteroffers.

Algorithm 1: DCS
    while ($t \le t_{max}$) and (no agreement) do
        genCounteroffer()
        adaptCounteroffer()
        Min_Max_Swap()
    end while

For managing the counteroffers initially generated by the buyer agent, Algorithm 2 (adaptCounteroffer()) summarizes the main steps for manipulating a counteroffer value for an issue. The algorithm requires the vectors of the first-order differences of the offered concessions from both the seller agents ($C^S_{j_i}$) and the buyer agent’s delegates ($C^D_{j_i}$) on a common issue over the previous two negotiation rounds. Our heuristic computes the difference between them ($C_{j_i}$) and, if its sum is positive, subtracts that sum from the originally generated counteroffer for that issue (see Algorithm 2). The steps are repeated for all common issues. For example, let us take the price issue (considered a common issue) to explain the idea. Suppose we have two seller agents whose offers in the last two rounds are $of^{t-3}_s = \{21, 25\}$ and $of^{t-1}_s = \{18, 21\}$, and the delegates’ counteroffers are $of^{t-4}_d = \{10, 10\}$ and $of^{t-2}_d = \{13, 12\}$. Then the first-order differences are $C^S_{j_i} = of^{t-3}_s - of^{t-1}_s = \{3, 4\}$ and $C^D_{j_i} = of^{t-2}_d - of^{t-4}_d = \{3, 2\}$ for the seller agents and the buyer’s delegates, respectively. The difference between the two first-order differences is $C_{j_i} = C^S_{j_i} - C^D_{j_i} = \{0, 2\}$. The total positive difference over the price issue is 2, so we deduct 2 from the amount of the initially generated counteroffer.
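The price example can be traced in code. This is a sketch under two stated assumptions: the issue is one where the buyer prefers lower values (so seller concessions are decreases and delegate concessions are increases), and the initially generated global counteroffer is 28, a made-up value. The function name and signature are illustrative.

```python
def adapt_counteroffer(initial, seller_prev, seller_last, delegate_prev, delegate_last):
    """Reduce a counteroffer when sellers conceded more than the delegates did.

    Returns the adapted counteroffer and the per-opponent difference vector C.
    """
    # First-order differences of the concessions over the last two rounds.
    c_sellers = [prev - last for prev, last in zip(seller_prev, seller_last)]
    c_delegates = [last - prev for prev, last in zip(delegate_prev, delegate_last)]

    # Per-opponent difference between seller and delegate concessions.
    c = [cs - cd for cs, cd in zip(c_sellers, c_delegates)]

    total = sum(c)
    adapted = initial - total if total > 0 else initial
    return adapted, c


adapted, c = adapt_counteroffer(
    initial=28,              # hypothetical initially generated counteroffer
    seller_prev=[21, 25],    # sellers' offers two rounds ago
    seller_last=[18, 21],    # sellers' most recent offers
    delegate_prev=[10, 10],  # delegates' earlier counteroffers
    delegate_last=[13, 12],  # delegates' most recent counteroffers
)
print(adapted, c)  # 26 [0, 2]
```

When the delegates have conceded at least as much as the sellers, the sum of C is not positive and the initial counteroffer is returned unchanged.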

To help the buyer agent’s delegates facing tough opponents over certain common issues, the Min_Max_Swap() method is used to reorder the weights in the matrix W, see Algorithm 3. As mentioned before, under the assumption that the real values of the common issues of different objects are comparable, we can reorder the weights in the matrix W; otherwise we need to modify the weights in the matrix.

For example, assume that the current weight vector for the price common issue is $w_p = \{0.27, 0.26, 0.24, 0.23\}$ and that $C^t_{ji} = \{4, 2, 1, -1\}$. We note that the most difficult opponent is the seller agent $s_4$, who negotiates with the delegate $d_4$, and the most favorable one is $s_1$, who negotiates with the delegate $d_1$. The second pair in the comparison consists of the opponents $s_2$ and $s_3$, which are favorable and unfavorable respectively. Given the current situation, we apply the Min_Max_Swap() algorithm to reorder the weights in the price vector so as to provide more concessions to the unfavorable seller agents and fewer concessions to the favorable seller agents. In other words, some resources are moved from the delegate who is negotiating with the most favorable seller agent to the delegate who is facing the most unfavorable one.

Algorithm 2: adaptCounteroffer()
Require: $C^S_{ji}$
Require: $C^D_{ji}$
1: $count\_off_{ji}$ = genCounteroffer()
2: $C_{ji} = (C^S_{ji} - C^D_{ji})$
3: if sum($C_{ji}$) > 0 then
4:   $count\_off_{ji} = count\_off_{ji}$ - sum($C_{ji}$)
5: end if
6: return $count\_off_{ji}$
7: return $C_{ji}$

Algorithm 3: Min_Max_Swap()
Require: $C_{ji}$
Require: $w_{ji}$
1: for (k = 1 to Length($w_{ji}$) Div 2) do
2:   $w^k_{ji}$ = swap($w_{ji}$, $C_{ji}$)
3:   $w_{ji} = w^k_{ji}$
4: end for
5: return $w_{ji}$

Table 6.2 shows the current status in terms of the current $C_{ji}$ and the corresponding weights of the common issues and opponents. In the first iteration, the weights corresponding to the sellers $s_1$ and $s_4$ are exchanged and the result is $w'_p = \{0.23, 0.26, 0.24, 0.27\}$. In the second iteration, the weights corresponding to the sellers $s_2$ and $s_3$ are exchanged, resulting in $w''_p = \{0.23, 0.24, 0.26, 0.27\}$. The weight vector $w''_p$ is used to distribute the counteroffer value in the next negotiation round. The same steps are repeated for all other common issues.


Table 6.2 Weights reordering example

| S | $C_{ji}$ | $w_p$ (current status) | $w_p$ (first iteration) | $w_p$ (second iteration) |
|---|---|---|---|---|
| $s_1$ | 4 | 0.27 | 0.23 | 0.23 |
| $s_2$ | 2 | 0.26 | 0.26 | 0.24 |
| $s_3$ | 1 | 0.24 | 0.24 | 0.26 |
| $s_4$ | -1 | 0.23 | 0.27 | 0.27 |
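The reordering in Table 6.2 can be reproduced with a short Python sketch of Algorithm 3. It assumes that iteration k pairs the seller with the k-th largest $C_{ji}$ with the seller with the k-th smallest one and swaps their weights unconditionally; the function name `min_max_swap` is a hypothetical choice.

```python
def min_max_swap(weights, gaps):
    """Swap weights between the k-th most and k-th least favorable sellers."""
    w = list(weights)
    # Seller indices ordered from most favorable (largest C_ji) to least favorable.
    order = sorted(range(len(gaps)), key=lambda i: gaps[i], reverse=True)
    for k in range(len(w) // 2):
        favorable, unfavorable = order[k], order[-1 - k]
        w[favorable], w[unfavorable] = w[unfavorable], w[favorable]
    return w


w_p = [0.27, 0.26, 0.24, 0.23]
c_ji = [4, 2, 1, -1]
reordered = min_max_swap(w_p, c_ji)  # [0.23, 0.24, 0.26, 0.27]
```

Because the weights are only permuted, their sum is preserved, which is why reordering is safe only when the issue values of the different objects are comparable.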

The algorithm Min_Max_Swap() is executed iff the positions of the maximum and minimum values in $C^t_{ji}$ are different from their positions in $C^{t-1}_{ji}$. For example, if $C^t_{ji} = \{4, 2, 1, -1\}$ and $C^{t-1}_{ji} = \{3, 2, 1, 0\}$ then the weights corresponding to the seller agents $s_1$ and $s_4$ are not swapped.

If the real value of a common issue differs significantly from one object to another, such as the price of a flight and the price of taking a taxi between two nearby places in a city, then we need to change the weights taking into consideration the originally populated matrix. For example, if the original weights of two common issues are 0.6 and 0.2, then it is impractical just to swap the two weights. A different approach is needed to keep the balance between the counteroffers of the same issue (e.g., price) of different objects that have large differences in their reservation values. However, that is left as future work.

6.3 Experiments

To evaluate our proposed dynamic strategy, DCS, we use the exploratory studies evaluation method [24] and propose several hypotheses which will be either supported or negated by the experimental results.

The dependent variables in our experiments are the utility of an accepted offer and the number of agreements. The independent variables are the negotiation deadlines, the offer generation strategies and the concession convexity parameter.

In all our experiments, we consider all objects as one bundle for counting the agreements, i.e., the agreements are connected and an agreement over each object is necessary to have one global agreement.

Each hypothesis is tested with 1,000 experiment repetitions. The results are averaged and the Mann–Whitney test [25] is used to ensure that the differences between the results are statistically significant at the 95 % confidence level.


Fig. 6.2 Concession patterns: α(t) plotted against time (0 to 20 rounds) for several β values, including β = 10, 5, 2, 1, 0.5 and 0.2

6.3.1 Settings

The negotiation settings are described as follows.

1. Time-dependent tactics: Each seller agent selects a random β value for the concession function $\alpha(t) = (\min(t, t^a_{max}) / t^a_{max})^{1/\beta}$ [4] from the interval [0.05, 10] and a random deadline from the interval [10, 50]. Figure 6.2 shows the concession curve patterns for different β values. The deadline in Fig. 6.2 is 20 rounds.

2. Tit-for-tat tactic: We use the random absolute tit-for-tat [4], with δ = 1 and R(M) = 0. When a seller agent uses the mixed strategy, it selects a random value from the interval [0.1, 0.9] to determine the mixing weight between the time-dependent and behavior-dependent tactics.

3. Unless stated to the contrary, the two buyer agents (the one that uses the DCS and the one that uses the SS) and the seller agents select their deadlines from the same interval. In all cases, the two buyer agents use the same deadline.

4. At the start of each negotiation encounter, a β value is selected randomly from the interval [0.05, 1] and assigned to the two buyer agents, except when testing hypothesis number 3, in which certain β values are assigned differently to the two agent types.
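The time-dependent concession function and the seller sampling of setting 1 can be sketched in Python as follows. The names `concession` and `sample_seller` are hypothetical, and treating deadlines as integer round counts is an assumption.

```python
import random


def concession(t, t_max, beta):
    """alpha(t) = (min(t, t_max) / t_max) ** (1 / beta)."""
    return (min(t, t_max) / t_max) ** (1.0 / beta)


def sample_seller(rng):
    """Draw a seller's beta from [0.05, 10] and deadline from [10, 50]."""
    beta = rng.uniform(0.05, 10.0)
    deadline = rng.randint(10, 50)  # integer rounds assumed
    return beta, deadline


rng = random.Random(0)
beta, deadline = sample_seller(rng)

# With beta = 1 the concession is linear in time and saturates at the deadline.
curve = [concession(t, 20, 1.0) for t in (0, 10, 20, 25)]  # [0.0, 0.5, 1.0, 1.0]
```

Values of β below 1 keep α(t) low until close to the deadline, while values above 1 concede early, which is the family of curves Fig. 6.2 refers to.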

6.3.2 Hypotheses

Hypothesis 1. The length of the deadline is an irrelevant factor for the DCS mechanism to outperform the static strategy when the seller agents use the time-dependent tactics to generate their offers.

Hypothesis 1 states that the DCS strategy outperforms the static strategy under various negotiation deadlines when the seller agents use the time-dependent tactics to generate their offers.

Hypothesis 2. The length of the deadline is an irrelevant factor for the DCS mechanism to outperform the static strategy when the seller agents use the mixed strategy (mixing of time-dependent and behavior-dependent) to generate their offers.

Hypothesis 2 states that the DCS strategy outperforms the static strategy under various negotiation deadlines when the seller agents use the mixed strategy to generate their offers.

Hypothesis 3. The concession convexity degree is an irrelevant factor for the DCS mechanism to outperform the static strategy.

Hypothesis 3 states that the DCS outperforms the SS under different concession curve convexities.

6.3.3 Results and Discussions

This section shows the experimental results for the above hypotheses and discusses the results. For the experiments regarding hypotheses 1 and 3, we use the time-dependent tactics for all agents, while we use a mixed strategy for the seller agents in the experiments testing hypothesis 2.

Hypothesis 1. Figure 6.3 shows that the DCS mechanism outperforms the static strategy (SS) under all negotiation deadlines. Figure 6.3a shows that the DCS mechanism outperforms the SS in terms of utility gain, while Fig. 6.3b shows that the average agreements of the DCS mechanism are better than those of the SS. We note that both the utility rate and the agreement rate of both strategies are lower when the buyer's deadlines are near the interval limits. The reason is that the randomly selected sellers' deadlines are likely to differ greatly from the buyer's deadline when the buyer's deadline is near a limit of the deadline interval (i.e., near 10 or near 50), and that negatively affects the number of agreements between the buyer agent and the seller agents. A lower number of agreements results in a lower utility rate.

Fig. 6.3 Results for testing hypothesis number 1. (a) Utility rate. (b) Agreement rate (average utility and average agreements of SS and DCS plotted against deadlines from 10 to 50)


[Fig. 6.4: (a) average utility and (b) average agreements of SS and DCS plotted against deadlines from 10 to 50]

[Fig. 6.5: (a) average utility and (b) average agreements of SS and DCS plotted against β from 0.5 to 2.0]

Hypothesis 2. Figure 6.4 shows the experimental results. The results show that under various buyer's deadlines, and when seller agents use the mixed strategy, the performance of the DCS strategy is better than that of the static strategy SS in terms of both the utility rate (see Fig. 6.4a) and the agreement rate (see Fig. 6.4b). We also note here that both strategies perform worse when the deadlines of the buyer agents are near the deadline interval limits, for the reason stated under hypothesis 1.

Hypothesis 3. Figure 6.5 shows that the DCS strategy outperforms the static strategy (SS) in both the utility rate (see Fig. 6.5a) and the agreement rate (see Fig. 6.5b) for all the β values shown in the figure. When the β value is small, both the number of agreements and the total utility are negatively affected, as shown in Fig. 6.5a, b. The reason is that the buyer agent does not concede enough when using low β values (see Fig. 6.2), which results in low agreement numbers and consequently a low utility rate.


6.4 Conclusions and Future Work

This paper investigates the negotiation scenario where a buyer agent negotiates with multiple independent seller agents over multiple distinct negotiation objects. Each object has multiple negotiation issues and a single provider.

We propose a novel dynamic counteroffer strategy (DCS) that adapts both the initially generated counteroffers and the issues' counteroffers weight matrix during negotiation, as a response to the behaviors of the opponents on the common issues in terms of their recent concessions.

The DCS involves two main steps: first, adapt the initially generated counteroffers and, second, exchange the weights in the issues' counteroffers weight matrix. Finally, the buyer agent distributes the adapted counteroffers among its delegates using the modified issues' counteroffers weight matrix. We compared our strategy with a static strategy using the utility rate and the agreement rate as the performance criteria.

The initial results show that our proposed dynamic strategy is more effective and at the same time more robust than the static strategy.

We need to extend our work and conduct more experiments that involve different concession curves and/or different tit-for-tat strategies. In addition, we need to investigate the situation of modifying the weights in the issues' counteroffers weight matrix rather than reordering them. Comparing the DCS with other non-static strategies such as the Bayesian learning strategy is also important.

Since we investigated the situation where a buyer agent has one provider per distinct object, we also plan to study the situation where the buyer agent aims to procure multiple distinct negotiation objects, given that each object has multiple providers.

Finally, since each object has multiple issues and there exists a possibility that agents have divergent preferences over issues, there is a potential for using the trade-off mechanism since it can improve the social welfare of the agents.

References

1. Lomuscio, A., Wooldridge, M., Jennings, N.R.: A classification scheme for negotiation in electronic commerce. Group Decis. Negot. 12, 31–56 (2003)

2. Jennings, N.R., Faratin, P., Lomuscio, A.R., Parsons, S., Wooldridge, M., Sierra, C.: Automated negotiation: prospects, methods and challenges. Group Decis. Negot. 10, 199–215 (2001)

3. An, B., Lesser, V., Irwin, D., Zink, M.: Automated negotiation with decommitment for dynamic resource allocation in cloud computing. In: 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2010), Toronto, pp. 981–988 (2010)

4. Faratin, P.: Automated service negotiation between autonomous computational agents. Ph.D. thesis, University of London (2000)

5. Fatima, S., Wooldridge, M., Jennings, N.R.: Optimal negotiation strategies for agents with incomplete information. In: Meyer, J.-J., Tambe, M. (eds.) Intelligent Agent Series VIII: Proceedings of the 8th International Workshop on Agent Theories, Architectures, and Languages (ATAL 2001). Volume 2333 of LNCS, pp. 53–68. Springer, Berlin (2001)

6. Mansour, K., Kowalczyk, R.: A meta-strategy for coordinating of one-to-many negotiation over multiple issues. In: Wang, Y., Li, T. (eds.) Foundations of Intelligent Systems, Shanghai, pp. 343–353. Springer, Berlin (2012)

7. Mansour, K., Kowalczyk, R., Vo, B.Q.: Real-time coordination of concurrent multiple bilateral negotiations under time constraints. LNAI 6464, 385–394 (2010)

8. Wong, T.N., Fang, F.: A multi-agent protocol for multilateral negotiations in supply chain management. Int. J. Prod. Res. 48(1), 271–299 (2010)

9. An, B., Sim, K.M., Miao, C.Y., Shen, Z.Q.: Decision making of negotiation agents using Markov chains. Multiagent and Grid Syst. 4, 5–23 (2008)

10. Nguyen, T., Jennings, N.: Managing commitments in multiple concurrent negotiations. Electron. Commerce Res. Appl. 4(4), 362–376 (2005)

11. Nguyen, T.D., Jennings, N.R.: Coordinating multiple concurrent negotiations. In: The Third International Joint Conference on Autonomous Agents and Multi Agent Systems, New York, USA, pp. 1062–1069 (2004)

12. Rahwan, I., Kowalczyk, R., Pham, H.H.: Intelligent agents for automated one-to-many e-commerce negotiation. In: Twenty-Fifth Australian Computer Science Conference, Melbourne, Australia, pp. 197–204 (2002)

13. Nguyen, T.D., Jennings, N.R.: Concurrent bi-lateral negotiation in agent systems. In: Proceedings of the Fourth DEXA Workshop on E-Negotiations (2003)

14. Cuihong, L., Giampapa, J., Sycara, K.: Bilateral negotiation decisions with uncertain dynamic outside options. IEEE Trans. Syst. Man Cybern. C Appl. Rev. 36(1), 31–44 (2006)

15. Faratin, P., Sierra, C., Jennings, N.R.: Using similarity criteria to make issue trade-offs in automated negotiations. Artif. Intell. 142(2), 205–237 (2002)

16. Ros, R., Sierra, C.: A negotiation meta strategy combining trade-off and concession moves. Auton. Agents Multi-Agent Syst. 12(2), 163–181 (2006)

17. Gerding, E., Somefun, D., La Poutré, J.: Multi-attribute bilateral bargaining in a one-to-many setting. Proc. of the AMEC VI Workshop, New York, USA, 3435, 129–142 (2004)

18. Hindriks, K.V., Tykhonov, D., Weerdt, M.M.: Qualitative one-to-many multi-issue negotiation: approximating the QVA. Group Decis. Negot. 21(1), 49–77 (2010)

19. Ng, S., Sulaiman, M., Selamat, M.: Intelligent negotiation agents in electronic commerce applications. J. Artif. Intell. 2(1), 29–39 (2009)

20. An, B.O.: Automated negotiation for complex multi-agent resource allocation. Ph.D. thesis, University of Massachusetts, Amherst (2011)

21. de Vries, S., Vohra, R.V.: Combinatorial auctions: a survey. INFORMS J. Comput. 15(3), 284–309 (2003)

22. Osborne, M., Rubinstein, A.: A Course in Game Theory. MIT Press, Cambridge (1994)

23. Fatima, S.: An agenda-based framework for multi-issue negotiation. Artif. Intell. 152(1), 1–45 (2004)

24. Cohen, P.: Empirical Methods for Artificial Intelligence. MIT Press, Cambridge (1995)

25. Mann, H., Whitney, D.: On a test of whether one of two random variables is stochastically larger than the other. Ann. Math. Stat. 18, 50–60 (1947)

Chapter 7
Reducing the Complexity of Negotiations Over Interdependent Issues

Raiye Hailu and Takayuki Ito

Abstract We consider automating negotiations over a matter that has multiple issues, where each issue can take any one of multiple possible values. We propose a rule for use during the evaluation of contracts that reduces the number of possible bids from agents and hence increases the number of agents that can participate in the negotiation. We assume that each constraint corresponds to one evaluation criterion. The rule states that when evaluating contracts by a criterion, only contracts that satisfied the previous criteria are considered. This is common practice in real life: humans, when evaluating possible options, often reduce the possibilities that have to be evaluated at each step by eliminating those that did not satisfy the previous criteria. We show how to use the rule by adapting a negotiation scenario from the literature. The negotiation is between an employer and a candidate employee. We also explore using monetary values as weights for the constraints of agents.

Keywords Multi agent systems • Negotiation • Non linear utility spaces

R. Hailu
Department of Computer Science and Engineering, Nagoya Institute of Technology, Nagoya, Aichi, Japan
e-mail: [email protected]

T. Ito
School of Techno-Business Administration, Nagoya Institute of Technology, Nagoya, Aichi, Japan
e-mail: [email protected]



7.1 Multiple Interdependent Issues

For the sake of automation we abstract the matter over which the negotiation is conducted as follows. We assume the negotiation matter can be represented by one or more issues. Each issue can have multiple possible values. Hence, each combination of issue values obtained by assigning a value to each issue is a possible contract for the negotiation. We view negotiation as the process of selecting the optimal contract from among these possible contracts.

We assume the optimal contract to be the one that maximizes social welfare, in other words, the contract with the highest total utility. The total utility of a contract is the sum of the utility values of all agents for the contract. Some researchers have also considered other optimality measures such as fairness, but currently we focus only on maximizing the total utility.

Present day network infrastructure (LANs and WANs) makes communication between agents simple. However, representing all possible contracts for a negotiation, including the utility value of each agent for each contract, so that the selection of the optimal contract can be automated, and locating the optimal contract efficiently, are challenges yet to be fully solved. This is because the issues are interdependent. When the issues are independent, agents can negotiate over the issues one by one and still reach an optimal contract. As described in [7], in this kind of negotiation the main focus is on what strategy an agent uses to maximize its utility while still arriving at an agreement, that is, on its conceding method. This is the main theme of competitions like ANAC [1], where agents use the bid exchanging protocol. But when the issues are interdependent, the computational complexity of preference elicitation and identification of the optimal contract becomes the main research focus.

7.2 Grouping Contracts and Bidding Based Deal Identification

The idea proposed by Ito and Klein [6] is to group similar contracts when assigning utility values. That is, rather than dealing with each contract one by one, intervals of the issue values are used, and each such group of contracts is called a constraint. Agents create their utility space by creating many such constraints. It is possible for constraints to overlap. The utility of contracts in the overlap region is the sum of the utilities of the constraints that overlap. Figure 7.1 shows such a utility space.
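A minimal Python sketch of this utility model is given below. A constraint is represented as one inclusive interval per issue plus a utility value; the two constraints and their utilities are hypothetical and serve only to illustrate the overlap rule.

```python
def contains(intervals, contract):
    """True if every issue value lies inside the constraint's interval."""
    return all(lo <= value <= hi for (lo, hi), value in zip(intervals, contract))


def utility(contract, constraints):
    """Sum the utilities of all constraints that the contract satisfies."""
    return sum(u for intervals, u in constraints if contains(intervals, contract))


# Two hypothetical overlapping constraints over two issues.
constraints = [
    (((2, 6), (3, 8)), 50),
    (((5, 9), (1, 4)), 30),
]

inside_both = utility((5, 3), constraints)   # 80: the contract lies in the overlap
inside_first = utility((3, 5), constraints)  # 50
outside = utility((0, 0), constraints)       # 0
```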

The bidding based deal identification algorithm [6] was proposed to avoid matching the entire utility spaces of agents to locate the optimal contract, but it faces certain limitations. Agents randomly sample their utility space and adjust these samples by simulated annealing to generate their bids. Then each agent submits its bids to a mediator agent, who exhaustively matches the bids to find those that intersect. The intersection with the maximum total utility is selected to be the deal.

Fig. 7.1 Utility space

The problem is that the computational cost of exhaustive matching increases exponentially ($NoOfBids^{NoOfAgents}$). One may address the problem by limiting the number of bids from agents, but this has a drawback: not only does it affect the optimality of the contracts identified, it can also make the negotiations fail. When the deal identification is not able to identify any contract, the negotiation is said to have failed [6].
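The exhaustive matching step can be sketched as follows, assuming each bid is a region (one inclusive interval per issue) with a utility. The mediator tries every combination of one bid per agent, so the number of combinations is the product of the agents' bid counts. The function names and the example bids are hypothetical.

```python
from itertools import product


def intersect(a, b):
    """Intersection of two regions, or None if they do not overlap."""
    box = tuple((max(lo1, lo2), min(hi1, hi2))
                for (lo1, hi1), (lo2, hi2) in zip(a, b))
    return box if all(lo <= hi for lo, hi in box) else None


def find_deal(bids_per_agent):
    """Try every combination of one bid per agent; keep the best intersection."""
    best, combos = None, 0
    for combo in product(*bids_per_agent):
        combos += 1
        region = combo[0][0]
        for box, _ in combo[1:]:
            region = intersect(region, box)
            if region is None:
                break
        if region is None:
            continue
        total = sum(u for _, u in combo)
        if best is None or total > best[1]:
            best = (region, total)
    return best, combos


agent_a = [(((0, 4), (0, 4)), 40), (((5, 9), (5, 9)), 70)]
agent_b = [(((3, 6), (3, 6)), 60), (((8, 9), (0, 2)), 90)]
best, combos = find_deal([agent_a, agent_b])
# best == (((5, 6), (5, 6)), 130); combos == 4 (2 bids for each of 2 agents)
```

With 7 agents submitting 5 bids each, as in the experiments reported later in the chapter, the same loop would visit 5 ** 7 = 78125 combinations.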

Some researchers have proposed negotiation protocols to overcome these shortcomings and other weaknesses of the bidding based deal identification algorithm, but a conclusive solution to the problems is yet to be found.

The threshold adjusting algorithm [2] makes agents bid in multiple rounds rather than once. In each round the threshold value, which is the minimum allowable utility value of a bid, is lowered. The bidding stops at the round in which a deal is found. This has the advantage of limiting the amount of private information revealed to a third party. Hattori and Ito [4] reduce failure rates by iteratively narrowing down the region of the contract space from which the agents generate their bids. Measures that reduce the high failure rates that arise when agents use narrow constraints were discussed in [8]. In [3] an algorithm that exploits agents' sensitivity to identify the optimal contract correctly and efficiently was proposed.


Fig. 7.2 Advantage of the subset rule

7.3 Subset Rule

The rule is that each newly defined constraint should be a subset of the constraint defined before it. This means that the second constraint can only contain some (possibly all) of the contracts in the first constraint, the third constraint can only contain some (possibly all) of the contracts in the second constraint, and so on (see Fig. 7.2b). Intuitively this means that each constraint corresponds to a criterion that the users use to evaluate the contracts. The first constraint (the widest constraint) is the minimum criterion that the contracts acceptable to the user should satisfy. The second constraint is the second criterion that the contracts should satisfy. It is possible that other contracts that do not satisfy the first criterion satisfy the second one, but as a principle the user does not consider contracts that did not satisfy previous constraints. The advantage of the rule is that it decreases the number of possible bids, as can be seen by comparing Fig. 7.2a, b. Another advantage of the rule is that for every new constraint added, the number of contracts that have to be evaluated decreases.
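A check of the subset rule for interval-based constraints might look like the sketch below. Constraints are given in the order they were defined, each as one inclusive interval per issue; the function names and the two example chains are hypothetical.

```python
def is_subset(inner, outer):
    """True if every interval of `inner` lies within the matching interval of `outer`."""
    return all(o_lo <= i_lo and i_hi <= o_hi
               for (i_lo, i_hi), (o_lo, o_hi) in zip(inner, outer))


def satisfies_subset_rule(constraints):
    """Each constraint must be a subset of the one defined just before it."""
    return all(is_subset(later, earlier)
               for earlier, later in zip(constraints, constraints[1:]))


nested = [((2, 9), (2, 9)), ((3, 8), (3, 8)), ((4, 7), (4, 7)), ((5, 6), (5, 6))]
broken = [((2, 9), (2, 9)), ((1, 5), (3, 8))]  # second constraint leaves the first

ok = satisfies_subset_rule(nested)   # True
bad = satisfies_subset_rule(broken)  # False
```

Checking consecutive pairs is sufficient because the subset relation is transitive.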

7.4 Experimental Evaluation

We evaluated the bid reduction obtained when applying the subset rule. We used the constraint generation method of [6], modified so that some constraints satisfy the subset rule, and observed a significant bid reduction. Moreover, we were able to conduct negotiations with high optimality by using just a few bids from each agent.


7.4.1 Experiment Settings

The constraint generation methods compared were Random generation (Ran) and Subset rule based generation. In both cases, for a negotiation with I issues, each agent defines 4 × I constraints. Each issue has 10 possible values represented by the numbers 0–9.

One example of a constraint in a 3-issue negotiation is (C: [4,7][3,6][0,9]). Each interval corresponds to one issue. This constraint contains all contracts that have the values 4–7 for Issue 1, the values 3–6 for Issue 2 and the values 0–9 for Issue 3. This constraint is said to have a width of 4, because the first and second intervals each contain four of the issue values. In the experiments a constraint is defined so that all intervals have an equal width, with the exception of intervals defined over the entire issue range, like the third interval in the example constraint. Moreover, this constraint is said to be a 2-Issue constraint, because we can check whether a contract belongs to the constraint by using only its values for Issue 1 and Issue 2. Intuitively, this means the constraint is a function of only the first two issues. Similarly, one could define 1-Issue constraints and 3-Issue constraints.
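The width and the k-Issue classification of the example constraint can be computed with a small Python sketch. The helper names are hypothetical, and an interval is treated as unrestricted exactly when it covers the whole range 0–9.

```python
from math import prod

FULL_RANGE = (0, 9)  # every issue takes the values 0-9 in the experiments


def issue_count(constraint):
    """Number of issues the constraint actually restricts (the k in 'k-Issue')."""
    return sum(1 for interval in constraint if interval != FULL_RANGE)


def width(constraint):
    """Common width of the restricting intervals; raises if they differ."""
    widths = {hi - lo + 1 for lo, hi in constraint if (lo, hi) != FULL_RANGE}
    if len(widths) != 1:
        raise ValueError("restricting intervals must share one width")
    return widths.pop()


def contract_count(constraint):
    """Number of contracts covered by the constraint."""
    return prod(hi - lo + 1 for lo, hi in constraint)


example = ((4, 7), (3, 6), (0, 9))
k = issue_count(example)     # 2, so this is a 2-Issue constraint
w = width(example)           # 4
n = contract_count(example)  # 4 * 4 * 10 = 160
```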

In the experiments, the utility for a constraint is randomly chosen from the multiples of 10 up to a maximum of 100. The two constraint generation methods differ in how they position the constraints and the width they assign to them.

7.4.1.1 Ran

This is the constraint generation method used in [6]. As mentioned above, for a negotiation with I issues each agent defines 4 × I constraints. These comprise four 1-Issue constraints, four 2-Issue constraints, four 3-Issue constraints, ..., four I-Issue constraints. The width of each constraint is chosen randomly from the values 1–6. The constraints are positioned randomly.

7.4.1.2 Subset Rule Based

Unlike the Ran method, all the 4 × I constraints are I-Issue constraints. There are I groups of constraints. Each group contains four constraints that satisfy the subset rule. Two types of groups were used in the experiments: 8to2s and 6to1s.

8to2s

In this setting the base (first), the second, the third and the last constraint have widths of 8, 6, 4 and 2 respectively. The following is an example of a group in a negotiation over two issues: C1: [2,9][2,9], C2: [3,8][3,8], C3: [4,7][4,7] and C4: [5,6][5,6].


Fig. 7.3 No. of bids

6to1s

In this setting the base (first), the second, the third and the last constraint have widths of 6, 4, 2 and 1 respectively. In this case a group covers a relatively smaller area of the utility space than in the 8to2s case. This means there are more possible positions at which to place a group in the utility space, which in turn means the agents' utility spaces will be more dissimilar than in the 8to2s case. The following is an example of a group in a negotiation over two issues: C1: [1,6][1,6], C2: [2,5][2,5], C3: [3,4][3,4] and C4: [4,4][4,4].
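Both example groups can be generated by shrinking the base interval step by step. The sketch below is one way to do this, assuming that every issue uses the same interval and that each narrower interval is centered inside the previous one (keeping the upper bound when the width drops by an odd amount); `nested_group` is a hypothetical name.

```python
def nested_group(base_lo, widths, issues):
    """Build nested constraints with the given widths, starting at base_lo."""
    group = []
    lo, prev_width = base_lo, widths[0]
    for w in widths:
        # Move the lower bound inward by half of the width reduction (rounded up).
        lo += (prev_width - w + 1) // 2
        group.append(tuple((lo, lo + w - 1) for _ in range(issues)))
        prev_width = w
    return group


group_8to2 = nested_group(2, (8, 6, 4, 2), issues=2)
# [((2, 9), (2, 9)), ((3, 8), (3, 8)), ((4, 7), (4, 7)), ((5, 6), (5, 6))]
group_6to1 = nested_group(1, (6, 4, 2, 1), issues=2)
# [((1, 6), (1, 6)), ((2, 5), (2, 5)), ((3, 4), (3, 4)), ((4, 4), (4, 4))]
```

Choosing `base_lo` at random among the positions that keep the base interval inside 0–9 would give 3 possible placements per issue for 8to2s and 5 for 6to1s, which illustrates why 6to1s groups can be placed in more positions.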


Figure 7.3 shows the number of bids generated when the Ran and Subset rule (8to2s and 6to1s) constraint generation methods were used. As can be seen, the number of bids generated when the rule is applied is significantly lower than in the Random case. For bid generation, the procedure described in Sect. 7.2 was used. For adjusting random samples, simulated annealing (SA) with an initial temperature of 10 was used.

Figure 7.4 shows the optimality of the contract the mediator identified for the two types of constraint generation methods. In the negotiations there were 7 agents. Each was allowed to submit only 5 bids.

Generally, an optimality greater than 0.8 was obtained for negotiations between agents who applied the Subset rule. But it was not possible to locate any deal contracts for negotiations between agents that used Ran. Five bids per agent is simply not enough to locate any deal, let alone an optimal deal.

7.5 Case Study

7.5.1 Applying the Subset Rule

We evaluate the effect of the rule by adapting a negotiation scenario described in [5]. The negotiation is between an employer (E) and a candidate employee (C). They negotiate over two issues: how many days the employee is going to work ($W_d$) and the number of days of child care provided by the employer ($C_e$). Working days can be from 1 to 5, $W_d$: [1..5]. The number of child care days can be between 0 and 2, $C_e$: [0..2]. We will observe the difference between the resulting utility spaces when the constraints are not made to satisfy the subset rule and when they are. The candidate's utility space is used for discussion.

Fig. 7.4 Experimental results

The candidate has promised his/her partner to look after their child for 2 of the five working days. This promise can be fulfilled by working fewer than 5 days, by having the employer provide child care, or by a combination of the two. Hence $C_c \geq 2$, where $C_c = 5 - W_d + C_e$ is the number of child care days the candidate manages to provide. The constraints corresponding to this condition are shown in (7.1).

$W_d$: [1..3], $C_e$: [0..2]
$W_d$: [4..4], $C_e$: [1..2]
$W_d$: [5..5], $C_e$: [2..2]    (7.1)

Next, the candidate prefers to work many days a week. For example, working for 5 days is preferred to working for just 1 day. To define a constraint for this condition, we divide the contracts into two groups: those with $W_d > 3$ and those with $W_d \leq 3$. We assume that contracts with more than three working days satisfy the condition of working many days. The constraint corresponding to this condition is shown in (7.2).

$W_d$: [4..5], $C_e$: [0..2]    (7.2)


Fig. 7.5 With and without applying subset rule

The last condition is that the candidate prefers the child care to be provided by the employer. That is, contracts with $C_e = 2$ are preferred to contracts with $C_e = 0$. The constraint is shown in (7.3).

$W_d$: [1..5], $C_e$: [2..2]    (7.3)

Applying the subset rule means that each new constraint keeps only the part that intersects the previous constraint. That is, contracts in (7.2) that do not belong to (7.1) are dropped. The same is done for (7.3).

7.5.2 Desirable and Undesirable Effects

The effect of applying the subset rule can be seen by comparing Fig. 7.5a, b. While its effect around region (D) is desirable, its effect on the region around (U) is not that useful, or is even erroneous. In region (D), contracts that should have zero utility do have zero utility, unlike the case when the rule is not applied. In region (U), the rule might have unnecessarily reduced the utility of contracts.
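The two effects can be illustrated with a small Python sketch of the candidate's three criteria. It assumes unit weights (each satisfied criterion adds 1), which is a simplification made only for illustration, and the two contracts picked below are examples of each effect rather than a statement of where regions (D) and (U) lie in Fig. 7.5.

```python
def criterion_1(wd, ce):
    """(7.1): at least 2 child care days, with C_c = 5 - W_d + C_e."""
    return 5 - wd + ce >= 2


def criterion_2(wd, ce):
    """(7.2): more than three working days."""
    return wd >= 4


def criterion_3(wd, ce):
    """(7.3): two days of child care provided by the employer."""
    return ce == 2


CRITERIA = (criterion_1, criterion_2, criterion_3)


def score(wd, ce, subset_rule):
    """Count satisfied criteria; with the rule, stop at the first failure."""
    total = 0
    for criterion in CRITERIA:
        if criterion(wd, ce):
            total += 1
        elif subset_rule:
            break
    return total


# (W_d=5, C_e=0) breaks the child care promise but still scores without the rule.
desirable = (score(5, 0, subset_rule=False), score(5, 0, subset_rule=True))    # (1, 0)
# (W_d=2, C_e=2) keeps the promise, yet the rule discards its third criterion.
undesirable = (score(2, 2, subset_rule=False), score(2, 2, subset_rule=True))  # (2, 1)
```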

7.5.3 Exploring the Use of Monetary Values as Weights For Constraints

When attempting to apply the negotiation framework discussed so far the firstproblem we encounter is on what values to use as weights for constraints of agents.Here we will try to use monetary values as weights to constraints.

Assume that the expected salary for working 1 day is $100 and that the estimated cost of child care for 0, 1 and 2 days is $0, $20 and $25 respectively. Then, roughly, the weight for a constraint is the difference between the expected monetary gain and the incurred cost of the contracts satisfying the constraint. But before proceeding we have to solve two problems.

Fig. 7.6 Monetary weights

The first is that, since a constraint might be satisfied by many contracts, we cannot find a single value that correctly represents the monetary gain of all of them. As a result we have chosen to use the value of the contract with the minimum monetary gain. Hence the weight of constraint 1 (at least 2 days of child care) is chosen to be $100. The second problem is that when using money the weights of constraints may not be independent. For example, normally we would choose the weight of constraint 2 (working more days) to be $400. But since it overlaps with constraint 1, it would, for example, give a utility of $500 for the contract (4,3), which is an overestimation. Using a weight of $300 gives a result that conforms to our first choice of using the value of the contract with the minimum monetary gain.

Again, for constraint 3 (prefers child care to be provided by the employer) one might be inclined to consider the monetary gain from the working days of the contracts satisfying it. But as it overlaps with the previous two constraints, it suffices to use $25 as the weight of the constraint. The value $25 is the money "saved" by the candidate by not providing child care himself. Figure 7.6 shows the final total weight (utility) of the three distinct regions in the utility space.
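Under the subset rule the three constraints are nested, so the total weight of each region is a running sum of the chosen weights. The short sketch below works this out; that the resulting totals are the three values plotted in Fig. 7.6 is an assumption, since the figure itself is not reproduced here.

```python
from itertools import accumulate

# Weights chosen in the text for constraints 1, 2 and 3 (in dollars).
weights = [100, 300, 25]

# Region k contains the contracts that satisfy constraints 1..k.
region_totals = list(accumulate(weights))  # [100, 400, 425]
```

The second total, $400, equals the salary for four working days, which is the minimum monetary gain among contracts with more than three working days.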

7.6 Conclusion and Future Works

We proposed a rule for use during the grouping of contracts (defining constraints) that reduces the number of possible bids from agents and hence increases the number of agents that can participate in the negotiation. The rule simulates what humans commonly do when evaluating possible options. That is, when evaluating possible options, we often reduce the possibilities that have to be evaluated at each step by eliminating those that did not satisfy the previous criteria. The experimental evaluations show that applying the rule can greatly reduce the number of bids from agents. This reduction means that the negotiation system can support a larger number of agents.

The reason why this rule works can be understood by noticing that in largecontract spaces agents are highly unlikely to have local maximums (bids) at thesame regions. For example in 100 contracts contract space the probability thattwo agents pick the same contract is about zero (1/100). This gets worse as thenumber agents and the contract space grows. Therefore, the constraint generationmechanism should guide agents in a way that they will attain local maxima atsimilar locations. The subset rule does exactly that. But it does it while still keepingthe individuality of agents as only the locations of the local maxima are similar(probabilistically) but the exact utility of the this local maxima is entirely dependenton the agent. In the experiments random values were used for each constraint weightvalue.

One possible concern that needs to be addressed is how to make sure agents follow the subset rule when defining their constraints. We are currently starting to develop a system to support such negotiations. In this system, the mediator is responsible not only for identifying the deal contract but also for designing the user interface (UI) that negotiators use to define their constraints. Through that UI, the mediator can validate the constraints to check whether the subset rule and other domain-specific rules are being followed.

The subset rule significantly reduced the number of bids, but this alone does not solve the problem completely. The computational cost of exhaustive matching still rises exponentially with the number of agents. We plan to look for ways to solve this problem.

References

1. Baarslag, T., Jonker, C.M.: The First Automated Negotiating Agents Competition (ANAC 2010). New Trends in Agent-based Complex Automated Negotiations, Series of Studies in Computational Intelligence (2010)

2. Fujita, T., Hattori, M.: An approach to implementing a threshold adjusting mechanism in very complex negotiations: a preliminary result. KICSS, pp. 185–192 (2007)

3. Hailu, T.: Efficient Deal Identification For the Constraints Based Utility Space Model. The AAMAS Workshop on Agent-based Complex Automated Negotiations (2011)

4. Hattori, M., Ito, T.: Using iterative narrowing to enable multi-party negotiations with multiple interdependent issues. AAMAS, pp. 1043–1045 (2007)

5. Hindriks, C., Dmytro, T.: Eliminating interdependencies between issues for multi-issue negotiation. CIA, pp. 301–316 (2006)

6. Ito, T., Klein, M.: Multi-issue negotiation protocol for agents exploring nonlinear utility spaces. IJCAI, pp. 1347–1352 (2007)

7. Klein, P., Sayama, Y.: Negotiating Complex Contracts. MIT Sloan Research Paper No. 4196 (2007)

8. Marsa-Maestre, M., Velsaco, E.: Effective bidding and deal identification for negotiations in highly nonlinear scenarios. AAMAS, pp. 1057–1064 (2009)

Chapter 8
Evaluation of the Reputation Network Using Realistic Distance Between Facebook Data

Takanobu Otsuka, Takuya Yoshimura and Takayuki Ito

Abstract In recent years, SNS services such as Facebook, Google+, and Twitter have become very popular. In such services, many sources of information are posted and shared, although user rankings are hardly considered. In this paper, we consider applying evaluation techniques for web pages, such as HITS and PageRank, to SNS user evaluation, and propose an algorithm that uses the real distance between users. Our evaluation technique considers various parameters, including user distance, favorites, and the number of friends in SNSs. We propose a new reputation network to measure the reliability of SNS information.

Keywords Evolutionary computation • Knowledge representation • Network simulation and modelling • Reputation network

8.1 Introduction

T. Otsuka (✉) • T. Ito
Center for Green Computing, Nagoya Institute of Technology, Showa-ku, Nagoya 466-8555, Japan
e-mail: [email protected]; [email protected]

T. Yoshimura
Master of Information Engineering, Nagoya Institute of Technology, Showa-ku, Nagoya 466-8555, Japan
e-mail: [email protected]

In SNSs, much information that is not useful is spread, such as false rumors and spam. Malicious information is also spread through application integration within SNSs, of which Facebook is representative. The theft of users' private information continues to increase. Other examples include users who use stolen account information to send spam to others. Therefore, we must learn how to rank users in order to verify information. In many current services, harmful information is identified through verification based on information that users report to spam analysis software or to human reviewers. However, since this approach cannot keep up with the growing number of users, not all harmful information can be eliminated. In addition, with these techniques, users who have made many useful contributions are ranked highly, which helps to exclude harmful material from higher-ranked users and to check each user's posts. However, a ranking technique that uses the actual distance between pieces of information is hard for a malicious user to manipulate, as long as geo-location cannot be manipulated from the outside. Some research on reputation networks has evaluated user reliability. Social Tie computes the social depth from the community to which a user belongs [3, 13]. Another method computes the relation among users as a trust network by a VCG mechanism [18]. Some algorithms rank users by their relationships with friends and their affiliated communities. We consider whether treating the actual distance among users as a parameter can contribute to the accuracy of a user's rank. When ranking a user with a technique employed to rank web pages, we can apply it by replacing the link elements used in web page ranking as follows.

• When the information posted by a user is shared (share) = output link (authorities)
• When the information posted by others is shared (reshare) = input link (hubs)

The names of the share/reshare functions in each SNS service are:

• Share button in the lower part of a Facebook post
• Share button in the lower part of a Google+ post
• Retweet of posted information on Twitter

The remainder of this paper is organized as follows. In Sect. 8.2, we present an overview of reputation mechanisms and define a simple scoring mechanism and its problems. In Sect. 8.3, we present the reputation network using distance. In Sect. 8.4, we describe the parameter setup items. In Sect. 8.5, we present our current experimental results, including the correlations between simple scores and Distance-HITS and Distance-PageRank. In Sect. 8.6, we discuss the usefulness of this study for SNS services. Finally, we summarize our paper and describe future work in Sect. 8.7.

8.2 Related Work

8.2.1 Reputation Mechanism

User evaluation is used in various forms in auctions. Generally, the reputation mechanisms used in online auctions and shops such as eBay and Yahoo! Auctions are simple scoring mechanisms, where buyers and sellers evaluate each other with numerical ratings that are then totaled. The problem with simple scoring mechanisms is described in the next passage. Reputation mechanisms are widely studied in multi-agent systems, computer science, game theory, and biology, and many studies can be found in the field of multi-agent systems in particular. A broad survey of reputation mechanisms and a clear hierarchical classification are given in [9, 10]. First, a reputation mechanism is classified into two types: individual and group. Individual types are classified as direct or indirect. Direct types are classified as either observed or encounter-derived. An indirect reputation mechanism is classified as prior-derived, group-derived, or propagated. See [9] for details. Most reputation mechanisms of online auctions or shops are classified as individual, direct, and observed, or as individual, direct, and encounter-derived. Online auctions can also build an indirect reputation mechanism, to which a propagated reputation mechanism can actually be applied. Previous studies [15–17] built indirect reputation mechanisms of the propagated type, in which reputation information is handed from agent to agent. Moreover, a previous work [16] argued that a reliable reputation mechanism can be built once an incentive mechanism is established under which rational agents return honest feedback. A feature of these reputation mechanism studies is that they construct a virtual society of agents and build and analyze reputation mechanisms within it. In contrast, our reputation mechanism deals with networks based on an actual network.

A mechanism that ranks web pages can also be called a reputation mechanism for web pages. Google uses PageRank [2], the most famous ranking algorithm, which relies on the link structure between web pages. HITS [1, 8] is a link-analysis algorithm that determines values for each page. In HITS page evaluation, the information of the nodes that constitute the network is propagated over the entire network through links. HITS has the concepts of a good authority, a page that receives links from many pages, and a good hub, a page that collects links to good pages. Problems with HITS have been pointed out and many improved methods have been proposed. HITS and PageRank are propagated reputation systems that distribute the features of a link or a page; both direct and indirect types are found. Details are described in Sect. 8.3. PageRank, HITS, and the ANT proposed in this paper are all calculated and evaluated on graphs that consist of nodes and links. TrustRank [4] is a method in which good pages are judged and evaluated by human inspection beforehand; it is also used to discover spam.

Internet auctions have also been researched from various viewpoints. As mentioned above, much research on internet auctions has addressed fraudulent practices and fraud identification. Typical examples include research that extracts characteristic patterns through community extraction [11], concentrating on the evaluation time in an auction, and the work of Pandit et al. [12], which applies probabilistic reasoning to business connections to identify fraud. Other work, on the system implementation side, offers auction support through cooperation between two or more events. Some research has analyzed user reliance in internet auctions.


Fig. 8.1 Relationship between users (diagram of user i and user j, with links labeled "Shared post by user j (authorities)" and "Shared post to user j (hubs)")

8.2.2 Ranking Techniques for Web Pages

Hypertext Induced Topic Search (HITS) and PageRank measure the reliability of web pages in web search, for example on search sites such as Yahoo! and Google. Their evaluations depend mainly on the link relations between pages and are based on simple scoring. It is difficult to guarantee reliability in SNS user evaluations using link relations alone.

8.2.3 HITS Algorithm

In this section, we describe the fundamental HITS algorithm. HITS, which was invented in 1998 by Kleinberg and others, uses the hyperlink structure to compute scores relevant to web pages, as does PageRank. However, there is an important difference between HITS and PageRank. Whereas PageRank produces one popularity score for each page, HITS produces two, treating each web page as both an authority and a hub. An authority is a page with many input links, and a hub is a page with many output links. The definition of a good page is mutually reinforcing: a good authority is pointed to by good hubs, and a good hub points to good authorities. Problems with HITS have been pointed out [10], and improved methods have been suggested [6, 14]. Translated into SNS elements, we get the following scenario. When the information of user i is shared by user j, it counts toward the authority of user i, and when the information of user j is shared by user i, it counts toward the hub of user i.

Figure 8.1 shows the relationship between users. We apply this relationship to HITS as follows:

$$x_i^{(k)} = \sum_{j:\, e_{ji} \in E} y_j^{(k-1)} \quad \text{and} \quad y_i^{(k)} = \sum_{j:\, e_{ij} \in E} x_j^{(k)} \tag{8.1}$$
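To make Eq. (8.1) concrete, the following minimal Python sketch iterates the two updates on a small share graph. It assumes that x is the authority score and y the hub score, that an edge (j, i) means user j shared a post of user i, and that scores are rescaled to sum to one after every iteration. The function name `hits`, the rescaling choice, and the example edges are illustrative assumptions and are not taken from the chapter.

```python
def _normalize(values):
    """Rescale scores so they sum to one (left unchanged if all are zero)."""
    total = sum(values)
    return list(values) if total == 0 else [v / total for v in values]


def hits(edges, num_users, iterations=50):
    """Iterate Eq. (8.1) on a directed graph.

    edges: list of (j, i) pairs, read as "user j shared a post of user i".
    Returns (x, y): authority scores x and hub scores y, indexed by user.
    """
    y = [1.0] * num_users  # y^(0): initial hub scores
    x = [0.0] * num_users
    for _ in range(iterations):
        # x_i^(k) = sum of y_j^(k-1) over all edges j -> i
        new_x = [0.0] * num_users
        for j, i in edges:
            new_x[i] += y[j]
        # y_i^(k) = sum of x_j^(k) over all edges i -> j
        new_y = [0.0] * num_users
        for i, j in edges:
            new_y[i] += new_x[j]
        x, y = _normalize(new_x), _normalize(new_y)
    return x, y


if __name__ == "__main__":
    # Hypothetical example: users 0 and 1 both share a post of user 2.
    authority, hub = hits([(0, 2), (1, 2)], num_users=3)
    print("authority:", authority)
    print("hub:", hub)
```

In this toy graph, user 2 collects all of the authority score, while users 0 and 1 split the hub score equally.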

8.2.4 PageRank

Google judges the importance of all pages based on recursive relations among all the linked pages. PageRank defines a high-quality page as one that is linked from many high-quality pages. PageRank uses a simple summation formula [5, 7]. The idea originates from the analysis of citation structure among articles in academic journals.


Fig. 8.2 Directed graph (six nodes, numbered 1 to 6)

For example, the PageRank of page $P_i$, written $r(P_i)$, is the sum of the PageRanks of all pages that point to $P_i$, each divided by its number of output links. $B_{P_i}$ is the set of pages that link to $P_i$ (back links), and $|P_j|$ is the number of output links from page $P_j$. The values $r(P_j)$, the PageRanks of the pages linking to $P_i$, are unknown in advance, but we solve for them using an iterative method. Suppose at first that all pages have the same PageRank, $1/n$, where $n$ is the number of pages in the web index. Then we calculate $r(P_i)$ for each page $P_i$ of the index repeatedly. The calculation formula is as follows:

$$r_{k+1}(P_i) = \sum_{P_j \in B_{P_i}} \frac{r_k(P_j)}{|P_j|} \tag{8.2}$$

This procedure starts with $r_0(P_i) = 1/n$ for all pages $P_i$ and is repeated until the PageRank scores converge to stable values. For example, a small web of six pages forms a directed graph such as the one in Fig. 8.2.
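The iteration of Eq. (8.2) can be sketched in a few lines of Python. The edges of Fig. 8.2 are not recoverable from the text, so the six-page graph below is a hypothetical example chosen so that every page has at least one output link; the function name `pagerank` is likewise an illustrative choice.

```python
def pagerank(out_links, iterations=100):
    """Iterate Eq. (8.2): r_{k+1}(P_i) = sum over back links of r_k(P_j) / |P_j|.

    out_links: dict mapping each page to the list of pages it links to.
    Every page is assumed to have at least one output link, so that the
    total score is preserved from one iteration to the next.
    """
    n = len(out_links)
    r = {page: 1.0 / n for page in out_links}  # r_0(P_i) = 1/n
    for _ in range(iterations):
        nxt = {page: 0.0 for page in out_links}
        for pj, targets in out_links.items():
            share = r[pj] / len(targets)  # r_k(P_j) / |P_j|
            for pi in targets:
                nxt[pi] += share
        r = nxt
    return r


if __name__ == "__main__":
    # Hypothetical six-page web (not the graph of Fig. 8.2).
    web = {1: [2, 3], 2: [3], 3: [1, 4], 4: [5, 6], 5: [6], 6: [1, 4]}
    for page, score in sorted(pagerank(web).items()):
        print(page, round(score, 4))
```

Pages without output links would leak score under this plain iteration; handling them requires an adjustment that Eq. (8.2) does not include.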

8.3 Proposal of a Reputation Network Using Distance Between Users

8.3.1 Concept of Realistic Distance Between Users

The real distance between users is the actual distance (in kilometers) between users who exchanged information, computed from the geographical tags attached to the residence posted on each user's SNS profile. Two patterns determine the distance between users:

• Real distance of residences
• Real distance of information posted by users

In this paper, we perform the final user evaluation as follows.

1. Computing the realistic distance of the information posted by users by reverse geocoding
2. Computing the information evaluation
3. Ranking information by evaluation values
4. Ranking users who have received many high evaluations
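Step 1 relies on a distance in kilometers between two geotagged locations. The chapter does not state which distance formula is used, so the sketch below assumes the great-circle (haversine) distance on a spherical Earth and assumes that the geotags have already been resolved to latitude and longitude in degrees. The function name and the sample coordinates are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius, spherical approximation


def real_distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to guard against rounding slightly above 1.0.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


if __name__ == "__main__":
    # Approximate, illustrative coordinates for Nagoya and Tokyo.
    print(round(real_distance_km(35.1815, 136.9066, 35.6762, 139.6503), 1), "km")
```

The same function can serve both patterns above, distance between residences and distance between posts, since each reduces to a pair of coordinates.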


Fig. 8.3 Real distance between users (diagram of users i, j, and k, with the authority and hub links between user i and each of the other users annotated with the real distance)

We considered a final user evaluation in which the rank of the information is evaluated. When a user posts about travel or a destination with distance information, that distance information is evaluated. It is possible to meet a friend when traveling or on business trips. In such cases the importance of the information falls, since users who are usually far apart become temporarily close. Because a distance calculated from a user's place of residence cannot handle such cases, we measure the distance of the posted information.

8.3.2 Distance-HITS

HITS has some problems, as described in Sect. 8.2.3. In addition to the simple user relations of HITS, our algorithm considers the real distance. Users whose real distance is small are highly likely to actually know each other, so we assume that much trivial information is shared between them. We therefore rate active information exchanges according to the distance between users. Figure 8.3 shows an example of the real distance between users.

In this case, the real distance between users i and k is compared with that between users i and j. For users i and j, we must consider the weight of the information based on the distance, because the real distance is large. We therefore inserted the realistic distance between users, d, into the HITS algorithm and built Distance-HITS, which is expressed as follows:

$$x_i^{(k)} = \sum_{j:\, e_{ji} \in E} d\, y_j^{(k-1)} \quad \text{and} \quad y_i^{(k)} = \sum_{j:\, e_{ij} \in E} d\, x_j^{(k)} \tag{8.3}$$

We added the link weight (share/reshare of information) and the real distance between users. This allows user evaluations to be measured more accurately than with the conventional method.
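A minimal Python sketch of Eq. (8.3) follows. The equation writes the distance as a single symbol d; the sketch assumes that d is specific to each edge, that is, the real distance in kilometers between the two users joined by that share, and that scores are rescaled to sum to one after each iteration. These choices, the function name `distance_hits`, and the example data are assumptions for illustration only.

```python
def _normalize(values):
    """Rescale scores so they sum to one (left unchanged if all are zero)."""
    total = sum(values)
    return list(values) if total == 0 else [v / total for v in values]


def distance_hits(shares, num_users, iterations=50):
    """Iterate Eq. (8.3), weighting every link by a per-edge distance d.

    shares: list of (sharer, author, d_km) triples, read as
            "sharer shared a post of author, who is d_km kilometers away".
    Returns (x, y): authority scores x and hub scores y, indexed by user.
    """
    y = [1.0] * num_users  # initial hub scores
    x = [0.0] * num_users
    for _ in range(iterations):
        # x_i = sum of d * y_j over edges j -> i
        new_x = [0.0] * num_users
        for sharer, author, d_km in shares:
            new_x[author] += d_km * y[sharer]
        # y_i = sum of d * x_j over edges i -> j
        new_y = [0.0] * num_users
        for sharer, author, d_km in shares:
            new_y[sharer] += d_km * new_x[author]
        x, y = _normalize(new_x), _normalize(new_y)
    return x, y


if __name__ == "__main__":
    # Hypothetical data: user 0 shares a post of user 2 from 1 km away,
    # user 1 shares a post of user 3 from 100 km away.
    authority, hub = distance_hits([(0, 2, 1.0), (1, 3, 100.0)], num_users=4)
    print("authority:", authority)
    print("hub:", hub)
```

Because d multiplies each contribution, a share across a large distance counts for more than a nearby share, so user 3 ends up with the higher authority score in this example.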


8.3.3 Distance-PageRank

Treating the rank of a web page as a user's evaluation, we added the realistic distance information between users with the following formula:

$$r_{k+1}(P_i) = \sum_{P_j \in B_{P_i}} \left( \frac{r_k(P_j)}{|P_j|} + \alpha\, d(P_i, P_j) \right) \tag{8.4}$$

We again added the link weight (share/reshare of information) and the real distance between users. This allows the user evaluation to be measured more accurately than with the conventional method. We also optimize the algorithm by inserting elements for other SNS parameters.
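Eq. (8.4) can be sketched in Python as follows. The chapter does not give a value for the coefficient α or say how scores are kept bounded; since the added distance term increases the total score at every step, the sketch rescales the scores to sum to one after each iteration. That rescaling, the default α, the function name, and the example graph are all assumptions made for illustration.

```python
def distance_pagerank(edges, nodes, alpha=0.001, iterations=100):
    """Iterate Eq. (8.4) with per-iteration rescaling (an added assumption).

    edges: list of (pj, pi, d_km) triples, read as "P_j links to P_i and
           the real distance between the two users is d_km kilometers".
    nodes: iterable of all user identifiers.
    """
    nodes = list(nodes)
    out_degree = {p: 0 for p in nodes}
    for pj, _pi, _d in edges:
        out_degree[pj] += 1  # |P_j|: number of output links of P_j

    r = {p: 1.0 / len(nodes) for p in nodes}  # r_0(P_i) = 1/n
    for _ in range(iterations):
        nxt = {p: 0.0 for p in nodes}
        for pj, pi, d_km in edges:
            # r_k(P_j) / |P_j| + alpha * d(P_i, P_j)
            nxt[pi] += r[pj] / out_degree[pj] + alpha * d_km
        total = sum(nxt.values())
        if total == 0:
            return nxt
        r = {p: v / total for p, v in nxt.items()}
    return r


if __name__ == "__main__":
    # Hypothetical graph: user 0 links to users 1 and 2; user 2 is far away.
    links = [(0, 1, 1.0), (0, 2, 500.0), (1, 0, 1.0), (2, 0, 1.0)]
    print(distance_pagerank(links, nodes=[0, 1, 2]))
```

With α set to zero the update reduces to Eq. (8.2), so users 1 and 2 score identically; with a positive α the more distant user 2 ranks above user 1.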

8.4 Parameter Setup Items

Various parameters besides the distance between users exist in SNSs, including the number of favorites, the number of friends, and affiliated communities. We compute the optimal parameters with these items. The concept of a favorite differs slightly among SNS services. The names of the favorite buttons in each service are:

• Facebook: Like! button
• Google+: +1 button
• Twitter: favorite button

Facebook and Google+ distribute a piece of information to others whenever a user who likes it pushes the corresponding button. On Twitter, it is possible to view favorite Tweets in lists, although such information is not distributed. Favorites on Facebook and Google+ resemble a link structure, but they cannot be written. This is a weak parameter compared with the share button.

The number of friends is the most important element for SNSs. In all services, information is not displayed on a user's feed unless the poster is registered as a friend or the feed is subscribed to. However, the number of friends is seldom related to the importance of information and is not proportional to it. The number of friends can be increased recklessly, but it also reflects evaluation by partners. We therefore consider the number of friends as a parameter.


8.5 Experimental Results

We examine the evaluation produced by each algorithm using Facebook data from the personal page of one of the authors.

• 269 users
• They have 2,946 edges.
• They have a hub relationship.
• They have real distance information.

Figure 8.4 shows the above relationship.

• User relationships and parameters

We used the following for our experimental environment.

- Computer: Mac OS X 10.7.3, Core i7, 8 GB memory
- Execution environment: Gephi 0.8.1 Beta
- Development environment: NetBeans 7.1.1
- Programming language: Java

• The calculation result with PageRank only

This is the evaluation value calculated on the basis of PageRank. It is computed over the whole network and expressed as a score. User 95, who has the most links, receives the best evaluation. The calculation result is shown in Table 8.1.

• The calculation result with Distance-PageRank

This is the evaluation value calculated on the basis of Distance-PageRank. It is computed over the whole network as an eigenvector, based on the real distance along the edges. User 95, who has the most links, receives the best evaluation. The calculation result is shown in Table 8.2.

• The calculation result with HITS only

This is the evaluation value calculated with HITS only. It is computed over the network and yields the Auth and Hubs results. The calculation result is shown in Table 8.3.

• The calculation result with Distance-HITS

This is the evaluation value calculated with Distance-HITS. It is computed from HITS and the real distance and yields the Auth and Hubs results. The calculation result is shown in Table 8.4.

In this section, we performed comparative experiments between the user evaluation technique based on the conventional link structure and the technique with real distance inserted as a parameter. We found the following.


Fig. 8.4 Facebook relationship


Table 8.1 Calculation result: only PageRank

Ranking   User       Eigenvector
1         User 95    0.00064942
2         User 103   0.0063789
3         User 26    0.00583163
4         User 210   0.00580426
5         User 264   0.00578450
6         User 86    0.005479789
7         User 62    0.005221683

Table 8.2 Calculation result: Distance-PageRank

Ranking   User       Eigenvector
1         User 65    0.00649422
2         User 103   0.00637898
3         User 26    0.00583163
4         User 210   0.00580426
5         User 264   0.00578450
6         User 86    0.005479789
7         User 62    0.005221683

Table 8.3 Calculation result: only HITS

Ranking   User       Eigenvector-Auth   Ranking   User      Eigenvector-Hubs
1         User 95    0.0137066          1         User 45   0.0781229
2         User 210   0.0126096          2         User 28   0.0726257
3         User 62    0.0115131          3         User 39   0.0726257
4         User 86    0.0115131          4         User 26   0.06703911
5         User 155   0.0115131          5         User 41   0.061452
6         User 103   0.0109649          6         User 40   0.044692
7         User 60    0.0104166          7         User 46   0.044692

Table 8.4 Calculation result: Distance-HITS

Ranking   User       Eigenvector-Auth   Ranking   User      Eigenvector-Hubs
1         User 228   0.0228978          1         User 45   0.1075951
2         User 203   0.0215434          2         User 39   0.0940763
3         User 210   0.0210951          3         User 28   0.0871878
4         User 188   0.0178784          4         User 26   0.0799221
5         User 142   0.0168541          5         User 41   0.0683884
6         User 155   0.0166097          6         User 47   0.065609
7         User 160   0.0162235          7         User 40   0.062452

• The user's link structure remains the most important parameter.
• Inserting the actual distance yields an evaluation technique that differs from one based on link structure alone.
• Making real distance a parameter makes it difficult to manipulate the ranking with simple techniques.


We think that higher-precision user evaluation is attained compared with the algorithm using the conventional link structure.

8.6 Discussion

Through an evaluation experiment based on actual Facebook data, we proposed a user evaluation technique that does not rely on link structure alone. It reflects not only the evaluation of the link structure but also the real distance between pieces of information. An evaluation based only on linking or being linked can be raised intentionally by an automated script. However, since we think it is hard to falsify the distance between pieces of information that carry a GPS location, our method can be regarded as a highly accurate evaluation technique. In the actual example, the evaluation of a relation in which a post is shared by a user separated by a large distance is higher than one based only on counts of share/reshare information. From now on, we should also incorporate parameters peculiar to SNSs, such as favorites and the number of friends. Since bookmarks can be added very freely through integration with external sites, especially on Facebook and Google+, we think they should be given a low weight as a parameter. Therefore, we aim to perform a comprehensive evaluation after attaching suitable weights. As for the number of followers, we think it is necessary to take into account not the number itself but the percentage of followers relative to the total number of friends.

8.7 Conclusion

In this paper, we used existing web page evaluation techniques to capture the network structure of user relations on Facebook. We think that link-based rankings can be falsified intentionally by a script. However, a ranking technique that uses the actual distance between pieces of information is hard for malicious users to manipulate, as long as geo-location cannot be manipulated from the outside. As the number of users of SNS services increases, it becomes necessary to exclude malicious users, and damage caused by applications with embedded viruses is increasingly reported. Therefore, we think that guaranteeing the soundness of user evaluation with the technique proposed in this paper will be taken increasingly seriously. In future work, we will further enrich the evaluation technique with various parameters peculiar to SNSs and evaluate it using actual data.


References

1. Bharat, K., Henzinger, M.R.: Improved algorithms for topic distillation in a hyperlinked environment. In: Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 104–111 (1998)

2. Brin, S., Page, L.: The anatomy of a large-scale hypertextual web search engine. WWW7/Comput. Netw. 30(1–7), 107–117 (1998)

3. Gilbert, E., Karahalios, K.: Predicting tie strength with social media. In: Proceedings of the 27th International Conference on Human Factors in Computing Systems (2009)

4. Gyongyi, Z., Garcia-Molina, H., Pedersen, J.: Combating web spam with TrustRank. In: Proceedings of the Thirtieth International Conference on Very Large Data Bases, pp. 576–587. VLDB Endowment (2004)

5. Haveliwala, T.H.: Efficient Computation of PageRank. Stanford Technical Report (1999)

6. Kleinberg, J.: Authoritative sources in a hyperlinked environment. J. ACM 46(5) (1999)

7. Langville, A.N., Meyer, C.D.: Google's PageRank and Beyond: The Science of Search Engine Rankings. Princeton University Press, Princeton (2006)

8. Li, L., Shang, Y., Zhang, W.: Improvement of HITS-based algorithms on web documents. In: Proceedings of WWW2002, pp. 527–535 (2002)

9. Mui, L.: Notions of reputation in multi-agents systems: A review. PhD thesis, Massachusetts Institute of Technology (2003)

10. Mui, L., Halberstadt, A., Mohtashemi, M.: Notions of reputation in multi-agent systems: A review. In: Proceedings of the 1st International Joint Conference on Autonomous Agents and Multi-Agent Systems (AAMAS 2002), pp. 280–287 (2002)

11. Pandit, S., Horng Chau, D., Wang, S., Faloutsos, C.: Netprobe: A fast and scalable system for fraud detection in online auction networks. In: Proceedings of the 16th International Conference on World Wide Web (WWW'07), pp. 124–132 (2007)

12. Pandit, S., Chau, D.H., Wang, S., Faloutsos, C.: Netprobe: A fast and scalable system for fraud detection in online auction networks. In: Proceedings of the 16th International Conference on World Wide Web (WWW'07), pp. 201–210 (2007)

13. Pujol, J.M., Sangüesa, R., Delgado, J.: Extracting reputation in multi agent systems by means of social network topology. In: Proceedings of the First International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 467–474 (2002)

14. Resnick, P., Zeckhauser, R.: Trust among strangers in internet transactions: Empirical analysis of eBay's reputation system. Econ. Internet E Commerce 11, 127–157 (2002)

15. Schillo, M., Funk, P., Rovatsos, M.: Using trust for detecting deceitful agents in artificial societies. Applied Artificial Intelligence 14(8), 825–848 (2000)

16. Sabater, J., Sierra, C.: Reputation and social network analysis in multi-agent systems. In: Proceedings of the First International Joint Conference on Autonomous Agents and Multiagent Systems, pp. 475–482 (2002)

17. Yu, B., Singh, M.P.: An evidential model of distributed reputation management. In: Proceedings of the 1st International Joint Conference on Autonomous Agents and Multi-Agent Systems (AAMAS 2002), pp. 294–301 (2002)

18. Zhang, H., Law, E., Miller, R.C., Gajos, K.Z., Parkes, D.C., Horvitz, E.: Human computation tasks with global constraints: A case study. In: Proceedings of the ACM Conference on Human Factors in Computing (2012)

Part II
Automated Negotiating Agents Competition

Chapter 9
An Overview of the Results and Insights from the Third Automated Negotiating Agents Competition (ANAC2012)

Colin R. Williams, Valentin Robu, Enrico H. Gerding, and Nicholas R. Jennings

Abstract The third Automated Negotiating Agents Competition (ANAC 2012) was held at the 11th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2012, Valencia, Spain). ANAC is an international competition that aims to encourage research into bilateral, multi-issue negotiation by providing a platform in which strategies developed independently by different research teams can be tried and compared against each other in a real-time competition. In the 2012 edition, we received 17 entries from 9 different universities worldwide, out of which 8 were selected for the final round. This chapter aims to provide a broad description of the competition set-up (especially highlighting the changes from previous editions), the preference domains and the strategies submitted, as well as the results from both the qualifying and final rounds.

Keywords AI competitions • Automated negotiation • Multi-agent systems

9.1 Introduction

Negotiation is a key process for reaching mutually beneficial agreements between self-interested parties. Automated negotiation has been at the forefront of research interests in the multi-agent systems and AI communities, and over time a variety of strategies have been proposed [5, 6, 10].

C.R. Williams (✉) • V. Robu • E.H. Gerding • N.R. Jennings
Electronics and Computer Science, University of Southampton, UK
e-mail: [email protected]; [email protected]; [email protected]; [email protected]

However, due to differences between the negotiation models considered and the implementation platforms used, it has often proven difficult to compare the performance of different strategies directly. The aim of the international negotiating agents competition is to fill this gap and provide a platform in which independently developed strategies can be tested, compared and evaluated against each other. ANAC has been running since 2010 and, in this period, it has provided the negotiation community with a standardised test platform, as well as a collection of strategies, benchmarks, and analysis tools for bilateral, multi-issue negotiation.

In this chapter, we aim to provide a broad overview of our experience in running the 2012 edition of ANAC. The chapter is organised as follows. In Sect. 9.2 we provide a short overview of the competition set-up, highlighting especially the new features that were introduced at ANAC 2012. In addition, we provide a description of the preference domains used in running the competition (as in the previous year, each entrant who submitted a strategy was also asked to submit a preference domain to be used in the competition). Then, in Sect. 9.3, we present, and briefly comment on, the competition results. The chapter is concluded by a discussion of the potential for future work and extensions of ANAC in future editions (Sect. 9.4).

9.2 Set-Up of the Competition

As in previous editions of the competition [2, 3], the aim of ANAC 2012 is to test strategies for automated bilateral negotiation, using an alternating-offers protocol. In each negotiation, offers are exchanged in real time, with a deadline for reaching agreements set at 3 min. The real-time feature means that the number of offers that can be exchanged within a certain time period varies and depends on the time required by the agents to compute each offer. The preferences of each agent are described by a multi-issue, linearly additive utility function. We refer to the joint set of utility functions of the two parties as a preference domain. Moreover, a discount factor was used in about half of the domains, so that the value of an agreement decreased over time.
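As an illustration of a linearly additive utility function with discounting, consider the following minimal Python sketch. It assumes that issue weights sum to one, that each issue value has an evaluation in [0, 1], and that discounting multiplies the utility by the discount factor raised to the normalised time t in [0, 1]. The exact discounting formula is not given in this section, so that form, the function names, and the three-issue profile below are assumptions for illustration.

```python
def additive_utility(bid, weights, evaluations):
    """Linearly additive utility: weighted sum of per-issue evaluations.

    bid:         dict issue -> chosen value
    weights:     dict issue -> weight (assumed to sum to one)
    evaluations: dict issue -> dict value -> evaluation in [0, 1]
    """
    return sum(weights[issue] * evaluations[issue][value]
               for issue, value in bid.items())


def discounted(utility, discount_factor, t):
    """Utility at normalised time t in [0, 1] (assumed form: u * delta**t)."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return utility * discount_factor ** t


if __name__ == "__main__":
    # Hypothetical profile with three issues of three values each.
    weights = {"brand": 0.5, "disk": 0.3, "monitor": 0.2}
    evaluations = {
        "brand": {"A": 1.0, "B": 0.5, "C": 0.0},
        "disk": {"small": 0.0, "medium": 0.6, "large": 1.0},
        "monitor": {"small": 0.0, "medium": 0.5, "large": 1.0},
    }
    bid = {"brand": "A", "disk": "medium", "monitor": "large"}
    u = additive_utility(bid, weights, evaluations)
    print(u, discounted(u, discount_factor=0.5, t=1.0))
```

Because the function is additive, each issue contributes independently, which is what distinguishes these domains from the nonlinear utility spaces discussed elsewhere in this volume.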

In this setting, the challenge for an agent is to negotiate without any knowledge of the opponent's preferences and strategy. Although each agent participates in many negotiation sessions, against different opponents, and in a wide variety of negotiation scenarios, agents are not able to learn between negotiations. This means that the negotiation agents only have the opportunity to adapt and learn from the offers they receive within a single negotiation session.

The competition was run on a Java-based software platform, called GENIUS [4], developed for the testing of bilateral negotiation agents. Since the set-up of the competition and the features of GENIUS remained largely the same as in previous editions of the competition, interested readers can consult [2, 3] for a full description. In the remainder of this section, we focus on the new feature introduced in the 2012 edition.


9.2.1 New Feature of the 2012 Competition

The main change implemented as part of ANAC 2012 was the introduction of a reservation value. The reservation value of an agent is the utility of conflict, and is obtained if either the agents fail to reach an agreement by the deadline, or one of the agents terminates the negotiation early. The reservation values can be different for each negotiation scenario, but in each case the value is common to both agents and known to them. An important property is that the reservation value is discounted in the same way that an agreement would be discounted. This makes it rational, in certain circumstances, for an agent to terminate a negotiation early, in order to take the reservation value with a smaller loss due to discounting.
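The early-termination argument can be illustrated with a small Python sketch. It assumes the same multiplicative discounting for agreements and for the reservation value, with normalised time in [0, 1], and compares walking away now with the agreement the agent expects to reach later. The discounting form, the function name, and the numbers are hypothetical, and a real agent would have to estimate the expected agreement itself.

```python
def should_terminate(reservation_value, discount_factor, now,
                     expected_utility, expected_time):
    """Return True if taking the reservation value now beats waiting.

    All times are normalised to [0, 1]; both outcomes are discounted as
    value * discount_factor ** time (an assumed form of discounting).
    """
    walk_away_now = reservation_value * discount_factor ** now
    agree_later = expected_utility * discount_factor ** expected_time
    return walk_away_now > agree_later


if __name__ == "__main__":
    # Strong discounting: leaving early with 0.5 beats agreeing late at 0.7.
    print(should_terminate(0.5, 0.2, now=0.1,
                           expected_utility=0.7, expected_time=0.9))
    # No discounting: waiting for the better agreement is preferable.
    print(should_terminate(0.5, 1.0, now=0.1,
                           expected_utility=0.7, expected_time=0.9))
```

The comparison shows why the property matters: without discounting of the reservation value, or without any discounting at all, there would be no benefit in ending a negotiation before the deadline.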

9.2.2 Negotiation Domains

One of the main elements in any negotiation platform is the negotiation domain, which describes the two negotiating agents' utilities over the different outcomes in the multi-issue negotiation space, as well as a discounting factor and reservation value for both agents. Each agent's preference is also called its profile. Note that, in each domain, the profiles of the two agents are different in terms of the utility functions, but we kept the discounting factor and reservation value the same. In order to eliminate any potential bias on the part of the organisers, as in the previous year, most of the ANAC domains are submitted by the participants themselves. Thus, we asked each team entering the competition to submit, in addition to the Java classes corresponding to their strategy, a negotiation domain.

9.2.2.1 Qualifying Round

In the qualifying round, we used the 17 domains submitted by the participants, plus the Travel domain submitted by one of the teams at ANAC 2010 (the reason for including this domain was, besides having an even number required for building test cases, that we felt more large domains were needed). Therefore a total of 18 domains were used in the qualifying round. Moreover, many of the domains were submitted without discounting factors or reservation values. Therefore, we assigned these values to some of the domains.

Each negotiation was repeated 10 times to obtain statistically significant results. Also, each agent negotiated using each profile in the domain. Therefore, in total, the qualifying round consisted of 52,020 negotiations, which were run using the Iridis compute cluster¹ at the University of Southampton.

¹ The cluster consists of 924 Westmere compute nodes, each with two 6-core processors, as well as 84 Intel Nehalem compute nodes with two 4-core processors.


Table 9.1 Domain characteristics

| Name | Years | Domain size (value) | Domain size (class) | Competitiveness (value) | Competitiveness (class) |
|---|---|---|---|---|---|
| NiceOrDie | 2011, 2012 | 3 | Small | 0.840 | High |
| Fifty fifty | 2012 | 11 | Small | 0.707 | High |
| Laptop | 2011, 2012 | 27 | Small | 0.160 | Low |
| Flight Booking | 2012 | 36 | Small | 0.281 | Medium |
| Rental House | 2012 | 60 | Small | 0.327 | High |
| Barter | 2012 | 80 | Small | 0.492 | High |
| Outfit | 2012 | 128 | Small | 0.198 | Low |
| Itex vs Cypress | 2010, 2012 | 180 | Small | 0.431 | High |
| Housekeeping | 2012 | 384 | Medium | 0.272 | Medium |
| IS BT Acquisition | 2011, 2012 | 384 | Medium | 0.117 | Low |
| Airport Site Selection | 2012 | 420 | Medium | 0.285 | Medium |
| England vs Zimbabwe | 2010, 2012 | 576 | Medium | 0.272 | Medium |
| Barbecue | 2012 | 1,440 | Medium | 0.238 | Medium |
| Grocery | 2011, 2012 | 1,600 | Medium | 0.191 | Low |
| Phone | 2012 | 1,600 | Medium | 0.188 | Low |
| Amsterdam Party | 2011, 2012 | 3,024 | Medium | 0.223 | Medium |
| Fitness | 2012 | 3,520 | Large | 0.275 | Medium |
| Camera | 2012 | 3,600 | Large | 0.218 | Low |
| Music Collection | 2012 | 4,320 | Large | 0.150 | Low |
| ADG | 2011, 2012 | 15,625 | Large | 0.092 | Low |
| Energy (small) | 2012 | 15,625 | Large | 0.430 | High |
| Supermarket | 2012 | 98,784 | Large | 0.347 | High |
| Travel | 2010, 2012 | 188,160 | Large | 0.230 | Medium |
| Energy | 2011, 2012 | 390,625 | Large | 0.525 | High |

9.2.2.2 Final Round

In the final round, we expanded the range of domains as follows. In addition to the 17 scenarios submitted by the participants in the earlier round, we added the Itex vs Cypress and Travel domains from the 2010 competition, and the ADG, Amsterdam Party, Grocery, Laptop and NiceOrDie domains from the 2011 competition,² thereby creating a total of 24 domains. In order to analyse the performance of the negotiation strategies in different domains, we classified each domain according to its size and competitiveness, as described below. Table 9.1 provides the size and competitiveness of the submitted domains.

The size of a domain is given by the number of possible agreement outcomes in the domain. The smallest domains, NiceOrDie (Fig. 9.1b) and Fifty fifty (Fig. 9.1c), each have only a single negotiation issue, with just 3 and 11 possible outcomes respectively. The smallest multi-issue domain is the Laptop domain (Fig. 9.1f),

² The remaining domains from the previous competitions (England vs Zimbabwe, Camera, Energy and IS BT Acquisition) had each already been re-submitted by one of the ANAC 2012 participants.


Fig. 9.1 Outcome spaces of different domains, showing the Pareto frontier (axes: U1 and U2, each ranging from 0 to 1). (a) ADG domain. (b) NiceOrDie domain. (c) Fifty fifty domain. (d) Energy domain. (e) IS BT Acquisition domain. (f) Laptop domain

with 3 issues each taking one of 3 possible values, leading to a total of 27 possible outcomes. At the other extreme, the largest domain is the Energy domain (Fig. 9.1d), with 8 issues each taking one of 5 possible values, leading to a total of 390,625 possible outcomes.


Table 9.2 Allocation of domains with (S)mall, (M)edium and (L)arge outcome spaces to the discounting factors (df) and reservation values (rv)

| | rv = 0.00 | rv = 0.25 | rv = 0.50 |
|---|---|---|---|
| df = 0.50 | S | M | L |
| df = 0.75 | L | S | M |
| df = 1.00 | M | L | S |

The competitiveness of a domain is defined as the minimum Euclidean distance from a point in the utility space to the point which represents maximum satisfaction for both agents (that is, the point at which each agent achieves a utility of 1). The least competitive domain is the ADG domain (Fig. 9.1a), with a competitiveness value of 0.092, in which it is possible to reach an agreement in which both agents receive a utility greater than 0.93. The IS BT Acquisition domain (Fig. 9.1e) also has low competitiveness (0.117 in Table 9.1). At the other end of the scale, the most competitive domain is the NiceOrDie domain (Fig. 9.1b), with a competitiveness value of 0.840, in which agreement can only be reached if one of the agents is willing to concede to a utility of 0.16 or if both agents are willing to concede to a utility of 0.29.
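The definition can be checked against the NiceOrDie figures with a short sketch. The three outcomes below are an assumption reconstructed from the description above (one agent concedes to 0.16 while the other is taken to receive 1.0, or both concede to 0.29); the function name is hypothetical.

```python
import math


def competitiveness(outcomes):
    """Minimum Euclidean distance from any outcome (u1, u2) to the ideal point (1, 1)."""
    return min(math.hypot(1.0 - u1, 1.0 - u2) for u1, u2 in outcomes)


# Assumed NiceOrDie outcome space: the non-conceding agent is taken to get utility 1.0.
nice_or_die = [(1.0, 0.16), (0.16, 1.0), (0.29, 0.29)]
print(round(competitiveness(nice_or_die), 3))  # 0.84
```

Under these assumptions the result agrees with the competitiveness of 0.840 reported for NiceOrDie in Table 9.1: the compromise outcome (0.29, 0.29) lies further from (1, 1) than either one-sided outcome.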

Furthermore, in order to obtain results for a wide range of discounting factor and reservation value parameters, we set three different values for each of the two parameters. Specifically, we used discounting factors $df \in \{1.0, 0.75, 0.5\}$, and reservation values $rv \in \{0.0, 0.25, 0.5\}$. To determine the appropriate combinations of these values, we used the Latin Squares experimental approach, which reduces the number of combinations required while still allowing several parameters to be evaluated. In particular, for each domain, we repeated the negotiations with different values of df and rv, such that all domains were run with all three values of df and with all three values of rv, although not all combinations of those values. The combinations used are given in Table 9.2, which shows all the experimental combinations of the discounting factors, reservation values, and domain sizes. To see how this works, suppose we would like to compare two strategies for medium-sized domains. We then simply take the average results for all experiments with medium-sized domains (i.e. 3 different combinations from Table 9.2), which ensures that all discounting factors and reservation values are covered in equal proportion. Due to the Latin Square approach, the same is true for any other parameter value. This allows for principled statistical analysis of each individual parameter, while reducing the number of experimental combinations required.
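As an illustration, the allocation in Table 9.2 can be encoded and checked programmatically. The sketch below is not part of the competition software; the helper names are hypothetical. It verifies the Latin square property and lists the (df, rv) variants run for a given domain size.

```python
# Table 9.2: rows are discounting factors, columns are reservation values.
DF = [0.50, 0.75, 1.00]
RV = [0.00, 0.25, 0.50]
SQUARE = [
    ["S", "M", "L"],  # df = 0.50
    ["L", "S", "M"],  # df = 0.75
    ["M", "L", "S"],  # df = 1.00
]


def is_latin_square(square):
    """Every symbol appears exactly once in each row and each column."""
    symbols = set(square[0])
    rows_ok = all(set(row) == symbols and len(row) == len(symbols) for row in square)
    cols_ok = all(set(col) == symbols for col in zip(*square))
    return rows_ok and cols_ok and len(square) == len(symbols)


def variants_for(size):
    """All (df, rv) combinations assigned to domains of the given size class."""
    return [(DF[i], RV[j])
            for i, row in enumerate(SQUARE)
            for j, s in enumerate(row) if s == size]


print(is_latin_square(SQUARE))  # True
print(variants_for("M"))        # [(0.5, 0.25), (0.75, 0.5), (1.0, 0.0)]
```

Each size class therefore meets every discounting factor and every reservation value exactly once, using 3 of the 9 possible (df, rv) combinations.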

In order to obtain statistically significant results, we furthermore repeated each combination 10 times. In total, with 8 agents in negotiation with 8 opponents across 24 domains (each with 3 variants), each repeated 10 times, a total of 40,320 individual negotiations were performed.


9.3 Competition Results

A total of 17 agents, from 8 institutions (listed in Table 9.3), were entered into the competition. In common with the previous competition, due to the large number of participants, the competition consisted of a qualifying round, and a final containing the top 8 agents from the qualifying round. The results of the qualifying round are given in Table 9.4 and the results of the final round are given in Table 9.5. The statistical significance of the results was calculated using Welch’s t-test [9] to test for the null hypothesis, given the mean, variance and number of results in our sample. Welch’s t-test is an extension of Student’s t-test [8] for comparing samples in which the variance may differ, as in the results we consider. Using this test, it was found that the agents which finished in 3rd and 4th places had scores that were not statistically significantly different from each other. Therefore both agents were awarded a prize for finishing in joint third place. Differences between all other positions were found to be statistically significant.
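For readers unfamiliar with the test, the following minimal sketch computes Welch’s t statistic and the Welch–Satterthwaite degrees of freedom from exactly the summary statistics mentioned above (mean, variance and sample size). The input numbers are hypothetical and are not the competition data; turning the statistic into a p-value additionally requires the CDF of the t-distribution, which is omitted here.

```python
import math


def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    se1, se2 = var1 / n1, var2 / n2  # squared standard errors of the two means
    t = (mean1 - mean2) / math.sqrt(se1 + se2)
    dof = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t, dof


# Hypothetical summary statistics for two agents with unequal variances.
t, dof = welch_t(0.618, 0.04, 5000, 0.617, 0.05, 5000)
print(f"t = {t:.3f}, dof = {dof:.0f}")
```

When both samples have the same size and variance, the degrees of freedom reduce to n1 + n2 - 2, as in Student’s t-test.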

9.3.1 Qualifying Round

The results of the qualifying round are presented in Table 9.4. The aim of the qualifying round was to narrow the field of strategies down to the top eight for the final. Note that IAMcrazyHaggler2012 (developed by the organisers) completed the qualifying round in a position eligible for the final, but we decided to withdraw it, because our other submission, IAMhaggler2012, completed the qualifying round in a higher position and we felt more diversity in the finalist pool would help improve the competition. It is interesting to note that the strategy which qualified after our withdrawal, AgentLG, achieved second place in the final (cf. Table 9.5).

Table 9.3 Participants in the Automated Negotiating Agents Competition 2012

| Institution | Agent name(s) |
|---|---|
| American University of Sharjah | Agent Z |
| Bar-Ilan University | AgentLG, MYMGAgent, Rumba |
| Ben-Gurion University of the Negev | BRAMAgent2, Meta-Agent |
| Delft University of Technology | AgentX, Dread Pirate Roberts, TheNegotiator Reloaded |
| Maastricht University | OMACagent |
| Nagoya Institute of Technology | AgentI, AgentLinear, AgentMR, AgentMZ, AgentNS |
| Shizuoka University | AgentYTY |
| The Chinese University of Hong Kong | CUHKAgent |
| University of Southampton | IAMcrazyHaggler2012, IAMhaggler2012 |


Table 9.4 Scores achieved in the qualifying round of the Automated Negotiating Agents Competition 2012, including 95% confidence intervals

| Rank | Agent name | Score |
|---|---|---|
| 1–2 | CUHKAgent | 0.597 ± 0.005 |
| 1–2 | OMACagent | 0.590 ± 0.007 |
| 3–5 | TheNegotiator Reloaded | 0.572 ± 0.006 |
| 3–7 | BRAMAgent2 | 0.568 ± 0.005 |
| 3–7 | Meta-Agent | 0.565 ± 0.007 |
| 4–7 | IAMhaggler2012 | 0.564 ± 0.004 |
| 4–8 | AgentMR | 0.563 ± 0.008 |
| 7–9 | IAMcrazyHaggler2012 | 0.556 ± 0.003 |
| 8–10 | AgentLG | 0.550 ± 0.007 |
| 9–11 | AgentLinear | 0.547 ± 0.006 |
| 10–11 | Rumba | 0.542 ± 0.006 |
| 12 | Dread Pirate Roberts | 0.521 ± 0.006 |
| 13–14 | AgentX | 0.469 ± 0.004 |
| 13–14 | AgentI | 0.465 ± 0.006 |
| 15–16 | AgentNS | 0.455 ± 0.006 |
| 15–16 | AgentMZ | 0.447 ± 0.006 |
| 17 | AgentYTY | 0.394 ± 0.003 |

Table 9.5 Scores achieved in the final round of the Automated Negotiating Agents Competition 2012, including 95% confidence intervals

| Rank | Agent name | Score |
|---|---|---|
| 1 | CUHKAgent | 0.626 ± 0.001 |
| 2 | AgentLG | 0.622 ± 0.001 |
| 3–4 | OMACagent | 0.618 ± 0.001 |
| 3–4 | TheNegotiator Reloaded | 0.617 ± 0.001 |
| 5 | BRAMAgent2 | 0.593 ± 0.001 |
| 6 | Meta-Agent | 0.586 ± 0.001 |
| 7 | IAMhaggler2012 | 0.535 ± 0.000 |
| 8 | AgentMR | 0.328 ± 0.001 |

9.3.2 Final Round

The results from the final round of the competition are presented in Table 9.5. The teams behind all eight strategies that qualified for the final were invited to describe their strategies in a short presentation at the ACAN workshop, as well as in short chapters as part of this book. There are several observations that can be made from examining Tables 9.4 and 9.5. First, note that the results are close (the maximum difference between the top 7 strategies in the final is less than 0.1). Nevertheless, all of the ranks assigned are statistically significant (where no statistically significant difference in performance could be found after extensive testing, a range is given, such as in the case of positions 3–4 in Table 9.5). Second, we note that some strategies performed consistently well in both the final and the qualifying round, such as the winner, CUHKAgent, which also finished in first place in the qualifying round, or OMACagent, which came third in the final and second in the qualifying round. Additionally, it is interesting to note that both of these


Table 9.6 Scores achieved in the undiscounted domains of the final round of the Automated Negotiating Agents Competition 2012, including 95% confidence intervals

| Rank | Agent name | Score |
|---|---|---|
| 1 | TheNegotiator Reloaded | 0.742 ± 0.002 |
| 2–3 | CUHKAgent | 0.725 ± 0.002 |
| 2–3 | OMACagent | 0.724 ± 0.002 |
| 4 | AgentLG | 0.717 ± 0.003 |
| 5 | Meta-Agent | 0.657 ± 0.001 |
| 6 | BRAMAgent2 | 0.648 ± 0.002 |
| 7 | IAMhaggler2012 | 0.546 ± 0.001 |
| 8 | AgentMR | 0.264 ± 0.000 |

Table 9.7 Scores achieved in the discounted domains of the final round of the Automated Negotiating Agents Competition 2012, including 95% confidence intervals

| Rank | Agent name | Score |
|---|---|---|
| 1 | CUHKAgent | 0.577 ± 0.001 |
| 2 | AgentLG | 0.574 ± 0.001 |
| 3–4 | OMACagent | 0.566 ± 0.001 |
| 3–4 | BRAMAgent2 | 0.565 ± 0.001 |
| 5 | TheNegotiator Reloaded | 0.555 ± 0.001 |
| 6 | Meta-Agent | 0.551 ± 0.002 |
| 7 | IAMhaggler2012 | 0.530 ± 0.001 |
| 8 | AgentMR | 0.361 ± 0.002 |

strategies came from research teams that had not participated in previous editions of ANAC, which shows that the field is still open to new ideas for building negotiation and learning heuristics.

9.3.3 Results for Specific Domains

In addition to the aggregate results for all domains and preference profiles, we also computed average scores for domains with specific discounting factors. In more detail, we computed separately the average performance of our strategies in undiscounted domains (with $df = 1$) and discounted domains (with $df \in \{0.75, 0.5\}$). Results are shown in Tables 9.6 and 9.7, respectively.

We note that the discounting factor can have a significant influence on the negotiation strategy used by the agent. In undiscounted domains, the best performing agents typically use a “hard headed” strategy, and concede less throughout the negotiation. However, such a strategy may result in agents only reaching agreements just before the 3 min deadline. Thus, in discounted domains, a hard headed strategy does not perform so well, because even if an agent manages to extract a high utility in the agreement reached at the end, this agreement may have a low utility for the agent in practice, due to discounting.

In our results, we see significant differences in performance based on the discounting factor. Thus, the best agent overall, CUHKAgent, comes only second in undiscounted domains. It is, however, the best performing agent in discounted domains, securing the overall lead.


Fig. 9.2 Scores achieved by the agents and their opponents (x-axis: Opponent Score; y-axis: Agent Score; labelled points: IAMhaggler2012 and CUHKAgent). The dashed line shows the points at which the agent and its opponent achieve identical utilities. Space above this line represents the agent beating its opponent, and conversely for the space below the line. The dotted line shows the points with a social welfare equal to that of IAMhaggler2012

9.3.4 Social Welfare Achieved by Each Agent

Finally, another aspect that we considered is the effect that each agent had on the social welfare of the reached agreements, considering not only the utility achieved by the agent itself, but also the utility achieved by its opponent. To explain, in a competition format, negotiating agents are incentivised to “beat” (i.e. achieve a higher utility than) the opponent. However, in many real-life negotiations, the goal is not necessarily to beat the opponent, but to achieve a mutually agreeable outcome. Of course, each agent cares primarily about its own utility, but at a given level of own utility, it also cares that the opponent is satisfied, as much as possible, with the achieved deal. This is because many real negotiations occur between parties (e.g. customers and suppliers) whose goal is not only extracting the maximum utility possible for themselves from the current deal, but also about repeat business etc.

To measure this, in Fig. 9.2 we plot the utility achieved both by the agent itself and its opponent, averaged across all negotiations it participated in. Surprisingly, although CUHKAgent was the agent which achieved the highest utility for itself, IAMhaggler2012 achieves the best balance between own and opponent utility. Otherwise stated, it achieves the best social welfare (i.e. defined here as the sum of both its own and opponent utility). We hypothesise this is due to the fact that this strategy (which we developed) is more adaptive to opponent demands. This makes it discover so-called “win–win” deals better, although in a competition format, it does not necessarily achieve the best score for itself.


9.4 Conclusions and Future Extensions of ANAC

We conclude that the ANAC 2012 competition was successful and it achieved its main goal, which is to compare a range of independently developed negotiation strategies in a realistic, real-time environment. As an immediate extension, we believe it would be interesting to analyse the ANAC 2012 results in more depth, as well as study other characteristics of the strategies submitted, such as their robustness, similar to the analysis presented for ANAC 2011 in [1].

There are several ideas that emerged from the community for extending the ACAN platform in future editions. These include:

• Allowing agents to learn opponent preferences between negotiation threads, not just within a single thread. To our knowledge, this change was already implemented in the 2013 edition of ANAC.

• Modeling one-to-many and many-to-many negotiations, rather than only bilateral ones. Some previous research in this area [11] already uses the GENIUS platform for this purpose, and work on extending it to cover the one-to-many negotiation setting is under way.

• Allowing more complex utility models on the part of the negotiating agents, such as interdependencies between negotiation issues [6, 7]. Work in this area may involve a mediated type of negotiation [6], rather than strictly bilateral exchanges.

• Finally, an important extension is to allow the competition to model negotiations between software agents and human counterparts, rather than only between software agents. This extension will allow the testing of a new set of negotiation techniques [5].

While, at the moment, it is envisaged that these extensions will use the same GENIUS platform, they may lead to different, specialised tracks in future editions of ANAC. To conclude, we believe the ANAC competition (and its extensions) will continue to play a key role in supporting the efforts of the automated negotiation research community to build more complex and realistic negotiation systems.

Acknowledgements The authors acknowledge the use of the IRIDIS High Performance Computing Facility, and associated support services at the University of Southampton, in the completion of this work.

References

1. Baarslag, T., Fujita, K., Gerding, E.H., Hindriks, K.V., Ito, T., Jennings, N.R., Jonker, C.M., Kraus, S., Lin, R., Robu, V., Williams, C.R.: Evaluating practical negotiating agents: Results and analysis of the 2011 international competition. Artif. Intell. 198, 73–103 (2013)

2. Baarslag, T., Hindriks, K., Jonker, C.M., Kraus, S., Lin, R.: The first automated negotiating agents competition (ANAC 2010). In: New Trends in Agent-based Complex Automated Negotiations; Series of Studies in Computational Intelligence, vol. 383, pp. 113–135 (2010)

3. Fujita, K., Ito, T., Baarslag, T., Hindriks, K., Jonker, C.M., Kraus, S., Lin, R.: The second automated negotiating agents competition (ANAC 2011). Complex Automated Negotiations: Theories, Models, and Software Competitions; Series of Studies in Computational Intelligence, vol. 435, pp. 183–197 (2011)

4. Hindriks, K., Jonker, C.M., Kraus, S., Lin, R., Tykhonov, D.: GENIUS: negotiation environment for heterogeneous agents. In: Proc. 8th Int. Joint Conf. on Aut. Agents and Multi-Agent Syst. (AAMAS’09), vol. 2, pp. 1397–1398 (2009)

5. Lin, R., Kraus, S.: Can automated agents proficiently negotiate with humans? Comm. ACM 53(1), 78–88 (2010)

6. Marsa-Maestre, I., Lopez-Carmona, M.A., Velasco, J.R., Ito, T., Klein, M., Fujita, K.: Balancing utility and deal probability for auction-based negotiations in highly nonlinear utility spaces. In: Proc. of 21st Int. Joint Conf. on AI (IJCAI’09), pp. 214–219 (2009)

7. Robu, V., Somefun, D.J.A., Poutré, J.A.L.: Modeling complex multi-issue negotiations using utility graphs. In: Proc. 4th Int. Conf. Aut. Agents and Multi-Agent Syst. (AAMAS’05), pp. 280–287 (2005)

8. Student: The probable error of a mean. Biometrika 6, 1–25 (1908)

9. Welch, B.L.: The generalization of ‘student’s’ problem when several different population variances are involved. Biometrika 34, 28–35 (1947)

10. Williams, C.R., Robu, V., Gerding, E.H., Jennings, N.R.: Using gaussian processes to optimise concession in complex negotiations against unknown opponents. In: Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence (IJCAI’11), vol. 1, pp. 432–438 (2011)

11. Williams, C.R., Robu, V., Gerding, E.H., Jennings, N.R.: Negotiating concurrently with unknown opponents in complex, real-time domains. In: Proc. of 20th European Conf. on AI (ECAI’12), pp. 834–839. IOS Press (2012)

Chapter 10
An Adaptive Negotiation Strategy for Real-Time Bilateral Negotiations

Alexander Dirkzwager and Mark Hendrikx

Abstract Each year the Automated Negotiating Agent Competition (ANAC) introduces an increasingly complex negotiation setting to stimulate the development of negotiation strategies. This year, the competition featured a real-time bilateral negotiation setting with private reservation values and time-based discounts. This work introduces the strategy of one of the top three finalists: The Negotiator Reloaded (TNR). TNR is the first ANAC agent created using the BOA framework, a framework that allows separately developing and optimizing the components of a negotiation strategy. The agent uses a complex strategy that takes the opponent’s behavior and the domain characteristics into account. This work presents the implementation, optimization, and evaluation of the strategy.

Keywords Automated negotiation strategy • Bayesian learning • Domain analysis • Strategy prediction

10.1 Introduction

Last year, the ANAC 2011 competition introduced a negotiation setting in which agents competed in a real-time bilateral negotiation on domains with time-based discounts [1]. This year the setting was extended to feature private reservation values that are discounted over time.

This work introduces the strategy of the third place finalist and the best performing agent on undiscounted domains in the ANAC 2012 competition: The Negotiator Reloaded (TNR). TNR is the first agent based on the BOA framework [2], a framework that allows the bidding strategy, opponent model, and acceptance conditions to be developed separately. The flexibility of this framework allows us to optimize the negotiation strategy using the components of agents introduced in previous ANAC competitions.

A. Dirkzwager (✉) • M. Hendrikx
Delft University of Technology, Interactive Intelligence Group, Mekelweg 4, Delft, The Netherlands
e-mail: [email protected]; [email protected]

The following sections discuss the implementation of the agent. Section 10.2 discusses the negotiation strategy, how it is implemented and optimized using the BOA framework. In Sect. 10.3 a toolkit of quality measures is used to quantify the performance of the negotiation strategy. Finally, Sect. 10.4 provides directions for future research.

10.2 Negotiation Strategy

This section discusses the strategy of The Negotiator Reloaded. Section 10.2.1 briefly describes the BOA framework used to create TNR (for a detailed discussion, see [2]). Next, Sect. 10.2.2 discusses how the BOA framework is used to implement TNR’s components.

10.2.1 Introduction to the BOA Framework

The BOA framework is built upon GENIUS [3] and allows the components of a negotiation strategy to be developed separately. The BOA framework makes a distinction between three types of components: a Bidding strategy, which maps a negotiation trace to a bid; an Opponent model, which is a learning technique used to model the opponent’s preference profile; and finally an Acceptance strategy, which determines whether the opponent’s offer is acceptable. A full negotiation strategy is created by selecting a component for each of the three types. In fact, the full Cartesian product of these components can be evaluated.

There are three main advantages to implementing an agent as a BOA compatible agent: first, each component can be evaluated in isolation; second, a component can be easily switched for an alternative (possibly better) component; and finally, the implementation of separate components simplifies agent creation.

Figure 10.1 provides an overview of how the components interact. When receiving an opponent’s bid, the BOA agent first updates the bidding history and opponent model. Given the opponent’s bid, the bidding strategy generates a set of similarly preferred counter offers. Next, the bidding strategy uses the opponent model to select a bid from this set by taking the opponent’s utility into account. Finally, the acceptance strategy decides whether the opponent’s offer should be accepted. If the opponent’s bid is not accepted, then the bid generated by the bidding strategy is offered instead.

Each component of TNR was implemented separately using the BOA framework. The following section discusses the implementation and optimization of each component in detail.
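The interaction described above can be summarised in a short sketch. It is written in Python for brevity and does not reproduce the Java interfaces of GENIUS or the BOA framework; all class, field and function names are hypothetical, and the opponent-model update step is omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

Bid = str  # hypothetical stand-in for a bid object


@dataclass
class BOAAgent:
    # Bidding strategy: opponent bid history -> similarly preferred candidate counter-offers.
    candidates: Callable[[List[Bid]], List[Bid]]
    # Opponent model: estimated opponent utility of a bid (learning step omitted).
    opponent_utility: Callable[[Bid], float]
    # Acceptance strategy: (opponent's bid, our planned counter-offer) -> accept?
    accepts: Callable[[Bid, Bid], bool]
    history: List[Bid] = field(default_factory=list)

    def receive(self, opponent_bid: Bid) -> Optional[Bid]:
        """Return None to accept the opponent's bid, otherwise the counter-offer."""
        self.history.append(opponent_bid)                  # update bidding history
        options = self.candidates(self.history)            # similarly preferred bids
        counter = max(options, key=self.opponent_utility)  # best one for the opponent
        if self.accepts(opponent_bid, counter):
            return None
        return counter


# Toy utilities, purely illustrative.
own = {"a": 0.9, "b": 0.9, "c": 0.5}
opp = {"a": 0.2, "b": 0.4, "c": 0.8}
agent = BOAAgent(
    candidates=lambda history: ["a", "b"],
    opponent_utility=opp.get,
    accepts=lambda theirs, ours: own[theirs] >= own[ours],
)
print(agent.receive("c"))  # b
```

Because each component is just a pluggable callable in this sketch, swapping one opponent model or acceptance strategy for another leaves the rest of the agent untouched, which is the property the framework exploits.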


Fig. 10.1 Overview of the BOA framework

Fig. 10.2 Overview of bidding strategy of TNR

10.2.2 Implementing the BOA Components

This section discusses the three BOA components of TNR in turn: the bidding strategy, the opponent model, and the acceptance strategy. The discussion of each component consists of a description of its implementation, as well as how the component is optimized using quality measures and the BOA framework.

10.2.2.1 Bidding Strategy

TNR is a BOA agent that takes the opponent’s strategy and domain characteristics into account to optimize its negotiation strategy. The discussion below follows the diagram of the complete negotiation strategy depicted in Fig. 10.2.

First, TNR determines whether the discount is low, medium, or high. Next, the time is divided into a set of windows. At the start of each window, the domain analyzer is used to estimate the Kalai-Smorodinsky point and the strategy analyzer is used to determine whether the opponent is a conceder or a hardliner. Note that these calculations would preferably be done each turn; however, this proved too computationally expensive. The target utility in a specific round is determined using the standard time-dependent decision function [4] depicted in Eq. (10.1). We opted for this decision function as its parameters can be adjusted during the negotiation.

$$P_{\min} + (P_{\max} - P_{\min}) \cdot (1 - F(t)) \quad \text{where} \quad F(t) = k + (1 - k) \cdot t^{1/e} \tag{10.1}$$


The value for the concession rate $e$ is selected from a table that maps the discount type (low, medium, high) and opponent’s strategy type (conceder or hardliner) to a concession rate. While the discount type does not change, the opponent’s behavior is likely to change over time. The maximum concession $P_{\min}$ is set to the estimated Kalai-Smorodinsky point calculated by the domain analyzer. For domains with a discount, $P_{\min}$ is multiplied by the discount to ensure that the agent concedes faster. When the undiscounted reservation value is higher than $P_{\min}$, then $P_{\min}$ is set to the reservation value. As a safeguard, $P_{\min}$ is not allowed to be lower than a predefined constant. The variable $k$ is always 0, and $P_{\max}$ is always 1. The calculated target utility is used by the bid selector to select a bid with a utility as close as possible to the target utility.
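A minimal sketch of this computation follows. The structure mirrors Eq. (10.1) and the rules for $P_{\min}$ given above, but the concession-rate values and the safeguard constant are hypothetical placeholders, since the text does not report them; the sketch also assumes that an undiscounted domain has a discount of 1.0, so that the multiplication leaves $P_{\min}$ unchanged.

```python
def target_utility(t, e, p_min, p_max=1.0, k=0.0):
    """Eq. (10.1): target utility at normalised time t in [0, 1] for concession rate e."""
    f = k + (1.0 - k) * t ** (1.0 / e)
    return p_min + (p_max - p_min) * (1.0 - f)


def minimum_utility(kalai_estimate, discount, reservation_value, floor=0.5):
    """P_min as described in the text; `floor` is a hypothetical safeguard constant."""
    p_min = kalai_estimate * discount      # assumed: discount == 1.0 means no discount
    p_min = max(p_min, reservation_value)  # never aim below the reservation value
    return max(p_min, floor)


# Hypothetical concession-rate table: (discount type, opponent type) -> e.
CONCESSION_RATE = {
    ("low", "conceder"): 0.05, ("low", "hardliner"): 0.10,
    ("medium", "conceder"): 0.10, ("medium", "hardliner"): 0.20,
    ("high", "conceder"): 0.20, ("high", "hardliner"): 0.40,
}

p_min = minimum_utility(kalai_estimate=0.8, discount=1.0, reservation_value=0.0)
e = CONCESSION_RATE[("low", "hardliner")]
for t in (0.0, 0.5, 0.9, 1.0):
    print(t, round(target_utility(t, e, p_min), 3))
```

The target starts at $P_{\max}$ and reaches $P_{\min}$ at the deadline; for $e < 1$ almost all of the concession happens close to the end, and re-reading the table at the start of each window lets the curve change as the opponent classification changes.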

The tactic as discussed above strongly relies on the concession rate table. Since three discount types and two strategy types are distinguished, there are six concession rates to be determined. To do so, we created a variant of the ANAC 2011 competition that excludes the agent ValueModelAgent and the domains Energy and NiceOrDie to decrease computational time. For each discount type we generated a representative set of domains, for example for the type medium discounts we created a set of preference profiles with discounts in the range $(0.4, 0.8]$. Next, we ran the competition multiple times for each discount type to determine the optimum values for the strategy type parameters.

10.2.2.2 Opponent Model

As part of our implementation of the BOA framework, the opponent models of previous ANAC agents were isolated and modified to be compatible with the BOA framework [2]. Since the components now use an identical interface, their quality can be compared using accuracy metrics as discussed by Baarslag et al. [5]. An example of such a measure is the Pearson correlation between the estimated and opponent’s real preference profile. In this work we found the IAMhaggler Bayesian Model introduced by Williams et al. [6] to be the most accurate in estimating the Kalai-Smorodinsky point. The Negotiator Reloaded uses this model as part of its domain analyzer.

The computational resources required by the IAMhaggler Bayesian Model depend strongly on the domain size. Therefore the opponent model is not used in very large domains, in which case the agent estimates the Kalai-Smorodinsky point to be equal to a predefined constant. While the accuracy of the estimation increases at the beginning of the negotiation, later on it actually decreases over time. We believe that this can be attributed to the assumed decision function that more accurately reflects the real decision function at the beginning of the negotiation for most agents. To avoid this decay in accuracy, The Negotiator Reloaded stops updating the opponent model after a predefined amount of time.


Fig. 10.3 Basic acceptance conditions used by TNR

10.2.2.3 Acceptance Strategy

The acceptance strategy of TNR consists of a set of basic acceptance conditions discussed by Baarslag et al. [7]. The flowchart of the acceptance strategy is depicted in Fig. 10.3. As visualized, there are two paths, depending on whether the discount is negligible or not, and six parameters ($\alpha$, $\beta$, $\gamma$, $\delta$, �, �).

$AC_{rv}$ is an acceptance condition that decides to accept when the discounted utility of the bid under consideration for offering is lower than or equal to the reservation value. $AC_{const}$ is an acceptance condition that accepts when the utility of the opponent’s bid is at least equal to a constant �. $AC_{next}$ accepts when a linear transformation of the opponent’s bid utility is better than the utility of the bid under consideration. Finally, the agent uses $AC_{maxw}$ when there is $1 -$ � time left and the utilities of the bids of the agents have not crossed. This acceptance condition compares the offered bid with the maximum bid that has been given in a particular window and will accept if it is higher than the maximum given in the previous window and if it is higher than 0.5.
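Two of these conditions are simple enough to sketch directly. The code below is an illustration only: it assumes that the linear transformation in $AC_{next}$ has the form scale · u + offset, as in the basic acceptance conditions of Baarslag et al., and that "better" includes equality. The function and parameter names are hypothetical, and the flowchart of Fig. 10.3 that combines the conditions is not reproduced.

```python
def ac_const(opponent_bid_utility: float, threshold: float) -> bool:
    """Accept when the opponent's bid is worth at least a fixed threshold."""
    return opponent_bid_utility >= threshold


def ac_next(opponent_bid_utility: float, own_next_bid_utility: float,
            scale: float = 1.0, offset: float = 0.0) -> bool:
    """Accept when scale * u(opponent bid) + offset is at least the utility
    of the bid the agent is about to offer (assumed form of the condition)."""
    return scale * opponent_bid_utility + offset >= own_next_bid_utility


# Hypothetical utilities from the agent's own point of view.
print(ac_next(0.70, 0.80))              # False: the planned bid is clearly better
print(ac_next(0.77, 0.80, scale=1.05))  # True: a 5% tolerance closes the gap
print(ac_const(0.90, 0.85))             # True
```

A scale slightly above 1 makes the agent accept offers that are marginally worse than its own next bid, which avoids rejecting a nearly equivalent deal late in the negotiation.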

The multi-acceptance criteria (MAC) functionality of the BOA framework [2] was used to optimize the acceptance strategy. In short, the MAC can be used to run a large set of acceptance conditions in parallel during the same negotiation thread, assuming that the computational cost of each acceptance condition is minimal. In total 288 acceptance conditions were tested, varying in the usage of the panic phase and the four parameters of the two acceptance conditions $AC_{next}$. The parameters $\alpha = 1.0$, $\beta = 0.0$, $\gamma = 1.05$, $\delta = 0.0$, � $= 0.99$ were found to be optimal.

10.3 Empirical Evaluation

The previous sections introduced the BOA framework, and described how it has been applied to optimize our negotiation agent. To demonstrate that using the BOA framework we were able to optimize TNR, and to analyze the behavior of the agent, this section discusses the results of a modified ANAC 2011 competition. Section 10.3.1 details the setup of this tournament and introduces the selected quality measures. Next, Sect. 10.3.2 evaluates the results.


Table 10.1 Overview of quality measures used to quantify performance

| Quality measure | Description |
|---|---|
| Avg. time of agreement | The average time of agreement of all matches which resulted in agreement |
| Std. time of agreement | Standard deviation of the average time of agreement of each run |
| Avg. discounted utility | Average discounted utility of all matches |
| Std. discounted utility | Standard deviation of the average discounted utility of each run |
| Ratio of agreement | Percentage of matches which resulted in an agreement |
| Avg. Kalai distance | The average Kalai distance of all matches |
| Trajectory analysis | The opponent’s moves can be classified based on their concession [8]. An unfortunate move, for example, is a concession that accidentally results in a lower utility for the agent in comparison to the opponent’s previous bid |

The default alternating offers protocol of GENIUS is used to run a tournament identical to the ANAC 2011 competition, except that ValueModelAgent is excluded and TNR is included, and that the agents compete on variants of the ANAC 2011 domains based on the three discount types, resulting in a total of 24 domains. The complete tournament is run ten times to increase the statistical significance of the results. In a single tournament eight agents compete against all agents except themselves on 24 domains, playing both possible preference profiles. This results in a total of 13,440 matches that were run using a distributed version of GENIUS. The quality measures that were implemented to quantify the agents’ performance are listed in Table 10.1.
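The match count can be reproduced with a few lines. The sketch interprets "playing both possible preference profiles" as each ordered pair of distinct agents meeting once per domain, which is an assumption; the agent names are those listed in Table 10.2.

```python
from itertools import permutations

agents = [
    "The Negotiator Reloaded", "Gahboninho", "HardHeaded", "Nice Tit For Tat",
    "Agent K2", "The Negotiator", "IAMhaggler 2011", "BRAMAgent",
]
domains = 24
repetitions = 10

# Ordered pairs (a, b) with a != b: a plays the first profile and b the second,
# so every pairing is played from both sides.
ordered_pairs = list(permutations(agents, 2))
matches = len(ordered_pairs) * domains * repetitions
print(len(ordered_pairs), matches)  # 56 13440
```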


This section discusses the results of the tournament, which are shown in Table 10.2. Note that, due to space constraints, the standard deviations are not shown, as they are negligible; the ratio of agreement is also omitted, as it is higher than 99% for all agents. This high percentage of agreement illustrates that most ANAC 2011 agents prefer agreement over disagreement, and ultimately give in.

TNR achieves the highest discounted utility, and strongly outperforms the runner-up. With regard to the trajectory measures, the agent makes the fewest concessions, as indicated by its high percentage of silent moves and its low ranking on the percentage of unfortunate moves, fortunate moves, nice moves, and concession moves, which are all types of moves made when the agent tries to make a concession. TNR does not make selfish moves that increase its own utility without conceding, which can be attributed to its usage of the time-dependent strategy.
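To make the move categories concrete, the following sketch classifies a single move from the change in the bidder's own utility and the change in the other party's utility between two consecutive bids. The boundary conditions follow the commonly used step-wise classification from the negotiation-dynamics literature and are an assumption here, as is the tolerance `tol`; the chapter itself only describes the categories informally.

```python
def classify_move(delta_own: float, delta_opp: float, tol: float = 1e-3) -> str:
    """Classify a move by the utility change for the bidder (delta_own)
    and for the other party (delta_opp). Thresholds are assumptions."""
    own_same = abs(delta_own) <= tol
    opp_same = abs(delta_opp) <= tol
    if own_same and opp_same:
        return "silent"        # neither utility changes noticeably
    if own_same and delta_opp > tol:
        return "nice"          # other party gains at no cost to the bidder
    if delta_own > tol and delta_opp > tol:
        return "fortunate"     # both gain
    if delta_own > tol:
        return "selfish"       # bidder gains, other party does not
    if delta_own < -tol and not delta_opp < -tol:
        return "concession"    # bidder gives up utility, other party does not lose
    return "unfortunate"       # other party loses and the bidder does not gain


print(classify_move(-0.05, 0.08))
```

A high share of "silent" moves, as reported for TNR, then corresponds to long runs of consecutive bids with nearly identical utilities for both parties.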


Table 10.2 Overview of the experimental results

Agent                    Avg. time of  Avg. discounted  Avg. unfortunate  Avg. fortunate  Avg. nice  Avg. selfish  Avg. concession  Avg. silent
                         agreement     utility          moves             moves           moves      moves         moves            moves
The Negotiator Reloaded  0.545         0.809            0.033             0.000           0.033      0.000         0.003            0.930
Gahboninho               0.528         0.782            0.027             0.001           0.038      0.002         0.004            0.929
HardHeaded               0.638         0.778            0.111             0.013           0.133      0.052         0.028            0.663
Nice Tit For Tat Agent   0.605         0.767            0.112             0.079           0.066      0.116         0.11             0.512
Agent K2                 0.493         0.755            0.154             0.116           0.069      0.203         0.174            0.284
The Negotiator           0.591         0.751            0.080             0.036           0.071      0.077         0.051            0.685
IAMhaggler 2011          0.377         0.748            0.162             0.120           0.074      0.203         0.178            0.263
BRAMAgent                0.578         0.740            0.115             0.075           0.085      0.148         0.104            0.472

Bold text is used to emphasize the highest value, and underlined the lowest value. All averages are in the range [0, 1]

10.4 Conclusion and Future Work

In this work we discussed the implementation, optimization, and evaluation of a flexible negotiation strategy that outperforms the ANAC 2011 agents on various domains and performs well in the ANAC 2012. The Negotiator Reloaded is the first ANAC agent developed using the BOA framework.

The tournament results of our ANAC 2011 variant competition discussed in Sect. 10.3 indicate a strong performance of TNR on various domains against a range of opponents. In the ANAC 2012 competition, TNR finished third overall and achieved the highest utility on undiscounted domains. The agent finished fifth when only the discounted domains are considered. We believe that this can be attributed to the experimental setup used to optimize the agent: ANAC 2011 agents perform relatively poorly on discounted domains.

For future work, it could be interesting to enable TNR to identify behavior-based strategies. In this case the bidding strategy should be further extended to use an effective counter-strategy. Furthermore, the opponent model now used to estimate the Kalai-Smorodinsky point could also be employed to estimate the best bid to offer to the opponent given a set of similarly preferred bids.

Acknowledgements We would like to thank Tim Baarslag, Koen Hindriks, and Catholijn Jonker for introducing us to the field of bilateral negotiation and reviewing our paper. Furthermore, we thank the Universiteitsfonds Delft and the Interactive Intelligence Group of the Delft University of Technology for sponsoring our trip to the AAMAS 2012.


References

1. Baarslag, T., Fujita, K., Gerding, E.H., Hindriks, K., Ito, T., Jennings, N.R., Jonker, C., Kraus, S., Lin, R., Robu, V., Williams, C.R.: Evaluating practical negotiating agents: results and analysis of the 2011 international competition. Artif. Intell. 198, 73–103 (2013)

2. Baarslag, T., Hindriks, K., Hendrikx, M., Dirkzwager, A., Jonker, C.: Decoupling negotiating agents to explore the space of negotiation strategies. In: Proceedings of the Fifth International Workshop on Agent-based Complex Automated Negotiations (ACAN 2012) (2012)

3. Lin, R., Kraus, S., Baarslag, T., Tykhonov, D., Hindriks, K., Jonker, C.: Genius: an integrated environment for supporting the design of generic automated negotiators. Computational Intelligence, Blackwell Publishing Inc. http://mmi.tudelft.nl/sites/default/files/genius.pdf. doi:10.1111/j.1467-8640.2012.00463.x

4. Faratin, P., Sierra, C., Jennings, N.R.: Negotiation decision functions for autonomous agents. Robot. Auton. Syst. 24(3–4), 159–182 (1998) Multi-Agent Rationality.

5. Baarslag, T., Hendrikx, M., Hindriks, K., Jonker, C.: Measuring the performance of online opponent models in automated bilateral negotiation. In: Thielscher, M., Zhang, D. (eds.) AI 2012: Advances in Artificial Intelligence. Lecture Notes in Computer Science, vol. 7691, pp. 1–14. Springer (2012)

6. Williams, C.R., Robu, V., Gerding, E.H., Jennings, N.R.: Iamhaggler2011: a gaussian process regression based negotiation agent. In: Ito, T., Zhang, M., Robu, V., Matsuo, T. (eds.) Complex Automated Negotiations: Theories, Models, and Software Competitions. Studies in Computational Intelligence, vol. 435, pp. 209–212. Springer, Berlin (2013)

7. Baarslag, T., Hindriks, K., Jonker, C.: Acceptance conditions in automated negotiation. In: Ito, T., Zhang, M., Robu, V., Matsuo, T. (eds.) Complex Automated Negotiations: Theories, Models, and Software Competitions. Studies in Computational Intelligence, vol. 435, pp. 95–111. Springer, Berlin (2013)

8. Bosse, T., Jonker, C.M.: Human vs. computer behaviour in multi-issue negotiation. In: Proceedings of the Rational, Robust, and Secure Negotiation Mechanisms in Multi-Agent Systems. RRS '05, Washington, DC, USA, IEEE Computer Society (2005) 11


Chapter 11
CUHKAgent: An Adaptive Negotiation Strategy for Bilateral Negotiations over Multiple Items

Jianye Hao and Ho-fung Leung

Abstract Automated negotiation techniques can greatly improve the efficiency and quality of human negotiation, and many automated negotiation strategies and mechanisms have been proposed for different negotiation scenarios. To achieve efficient negotiation, there are two major challenges we are usually faced with: how to model and predict the strategy and the preference of the opponent. To this end we propose an adaptive negotiating strategy (CUHKAgent) that predicts the opponent's strategy and preference at a high level and makes informed decisions accordingly.

Keywords Adaption • Negotiation • Reinforcement learning

11.1 Introduction

Negotiation is a commonly used approach to resolve conflicts and reach agreements between different parties in our daily life. Automated negotiation techniques can, to a large extent, reduce human effort, and also help humans reach better negotiation outcomes by compensating for the limited computational abilities of humans when they are faced with complex negotiations.

Until now, a lot of automated negotiation strategies and mechanisms have been proposed in different negotiation scenarios [1–5]. The major difficulty in designing an automated negotiation agent is how to achieve optimal negotiation results given incomplete information on the negotiating partner. The negotiation partner usually keeps its negotiation strategy and its preference as its private information to avoid exploitation. To achieve efficient negotiation, a lot of research efforts have been

J. Hao (✉) • H.-f. Leung
Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong
e-mail: [email protected]; [email protected]



devoted to the following two directions: learning the opponent's negotiation decision function [4, 6] and estimating the opponent's preference profile [3, 7, 8]. Previous work usually assumes that the opponent's strategy or preference profile follows certain predefined patterns which can be accurately modeled as certain classes of mathematical functions. For example, one may assume that the opponent makes negotiation decisions following a certain probability function [2], and the task is to accurately estimate the corresponding coefficients of the probability function based on the negotiation histories. For the utility function, one commonly adopted assumption is that it is additive, i.e., the contribution of each issue to the overall utility is independent. However, in practice, the agents may not strictly follow any function to make decisions, and they may not determine their preferences over different combinations of items following a fixed type of utility function. The consequence is that it may not be feasible to learn the opponent's decision function or utility function, since such functions may not exist at all. Even if the opponent indeed makes decisions following certain forms of mathematical functions, it is highly likely that it has already changed its decision function to another form, which makes what we have learned useless.

Due to the aforementioned issues, we propose that, to negotiate efficiently, an agent should focus on adapting to the opponent's past behaviors in a timely and effective way rather than learning the exact forms of the opponent's decision function or utility function. Considering the high diversity of the available negotiation strategies that an agent can choose, it is usually very difficult (or even impossible) to predict which specific strategy (or combination of strategies) the negotiating partner is using based on this limited information. To effectively cope with different types of opponents, we introduce the concept of the non-exploitation point to adaptively adjust the degree to which an agent exploits its negotiating opponent. The value of the non-exploitation point is determined by the characteristics of the negotiation scenario and the concessive degree of the negotiating partner, which is estimated based on the negotiation history. Besides, to maximize the possibility that the offer our agent proposes will be accepted by its negotiating partner, it can be useful to make predictions on the preference profile of the negotiating partner. Instead of explicitly modeling the negotiation partner's utility function, we propose a reinforcement-learning based approach to determine the optimal proposal for the negotiating partner based on the current negotiation history.

This paper is organized as follows. In Sect. 11.2, we discuss a number of key issues related to negotiation strategy design. Following that, we introduce our negotiation agent CUHKAgent in detail in Sect. 11.3. Finally, we conclude in Sect. 11.4.

11.2 Designing Issues

In this section, we discuss a number of key issues when designing an efficient negotiation strategy.


11.2.1 Learning the Opponent’s Decision Function or Not?

Much effort has been devoted to predicting the opponent's exact decision function in previous work. This is usually based on the assumption that the opponent makes decisions following certain predefined patterns which can be accurately modeled as certain classes of mathematical functions. For example, one may assume that the opponent decides whether to accept an offer following a certain probability function [2], and the task is to accurately estimate the corresponding coefficients of the probability function based on the negotiation histories. Another example is that in [4] the authors propose a way of predicting the opponent's next-round offer based on the assumption that the opponent makes decisions based on a combination of time-dependent and behavior-dependent decision functions. Based on the prediction results, the optimal offer(s) to be proposed to the opponent can be determined by modeling the negotiation as a multi-stage control process and calculating the sequence of optimal controls (offers) accordingly.

However, in practice, this kind of assumption is usually not valid considering the high diversity of the possible strategies that an agent may adopt. An agent can usually exhibit highly dynamic behaviors which cannot be modeled as certain types of mathematical functions. Even if the opponent indeed makes decisions by strictly following certain forms of mathematical functions, it is highly likely that its decision function changes dynamically, which may make what we have learned useless. Instead of predicting the opponent's decision function, an alternative approach is to model the opponent's behavior at a high level based on certain high-level characteristics such as its concession degree, and respond adaptively.

11.2.2 How to Make Concessions to the Opponent?

There are a number of factors to be considered when determining the concession degree to the opponent. The first factor is the amount of negotiation time left. The more negotiation time has passed, the less utility an agent may obtain due to the discounting effect. Therefore we need to carefully balance the possible utility gain from being tough against the utility loss due to the discounting effect. The second factor is the discounting degree. This factor is closely related to the first factor, the negotiation time left. The stronger the discounting is, the more cautious we need to be to avoid possible utility loss due to the discounting effect. The last factor is the concession degree of the opponent. The more concessive the opponent is, the more we can exploit the opponent by being tough, and vice versa.


11.2.3 How to Guess the Opponent’s Preference?

In the current setting of the ANAC competition [9], the agents' preference functions are assumed to be additive, and thus it may be possible for an agent to learn its opponent's preference function through past negotiations. For example, in [3], the authors propose a Bayesian learning based approach to learn the opponent's preference function, i.e., the issue preferences and the issue priorities of the opponent.

However, in general, an agent's preference function can be in any form and may not be known to other agents. Thus it is infeasible for us to learn the exact preference function within limited negotiation time considering the high diversity of possible utility functions that an agent can choose. Instead of learning the exact preference function of the opponent, an alternative approach is to directly learn the relative importance of each proposal (combination of items) of the opponent based on the opponent's past proposals.

11.3 Strategy Description

In this section, we describe the key components of CUHKAgent, which is a specific implementation of the ABiNeS strategy [10]. Before describing the details, we introduce some mathematical notation which will be used in the following descriptions. In each negotiation scenario, both agents negotiate over multiple issues (items), and each item can have a number of different values. Let us denote the set of items as $M$, and the set of values for each item $m_i \in M$ as $V_i$. For each negotiation outcome $\omega$, we use $\omega(m_i)$ to denote the corresponding value of the item $m_i$ in the negotiation outcome $\omega$.

11.3.1 How to Determine the Acceptance Threshold

The determination of the acceptance threshold is the key issue in designing a negotiation strategy. For CUHKAgent, the principle is that it accepts a proposal from its opponent if its utility for this proposal is higher than its current acceptance threshold, and any proposal offered by CUHKAgent should also exceed its acceptance threshold. The value of the acceptance threshold reflects the agent's current concession degree and should be adaptively adjusted based on the opponent's concession degree and the characteristics of the negotiation environment.

We assume that the negotiating partner is self-interested, and that it will accept any proposal when the deadline is approaching ($t \to 1$). Therefore the acceptance threshold of CUHKAgent is always higher than the highest utility it can obtain when $t = 1$. Specifically, at any time $t$, the acceptance threshold $l^t$ of CUHKAgent should not be lower than $u^{max}\delta^{1-t}$, where $u^{max}$ is its maximum utility over the negotiation domain without discounting. Since the negotiating goal is to reach an agreement which maximizes the agent's own utility as much as possible, its negotiating partner should be exploited as much as possible by setting its acceptance threshold as high as possible. On the other hand, due to the discounting effect, the actual utility the agent receives can become extremely low even though its original utility over the mutually agreed negotiation outcome is high, if it takes too long for the agents to reach the agreement. In the worst case the negotiation may end with a break-off and each agent obtains zero utility. Thus we also need to make certain compromises to the negotiating partner, i.e., lower the acceptance threshold, depending on the type of partner we are negotiating with. Therefore, the key problem is how to balance the trade-off between exploiting and making compromises to the negotiating partner. Towards this end, we introduce the adaptive non-exploitation point $\lambda$, which represents the specific time when we should stop exploiting the negotiating partner. This value is adaptively adjusted based on the behavior of the negotiating partner. Specifically, we propose that for any time $t < \lambda$, CUHKAgent always exploits its negotiating partner (agent B) by setting its acceptance threshold to a value higher than $u^{max}\delta^{1-\lambda}$ and approaching this value until time $\lambda$ according to a certain pattern of behavior. After time $\lambda$, its acceptance threshold is set equal to $u^{max}\delta^{1-t}$ for the rest of the negotiation, and any proposal over which its utility is higher than this value will be accepted. Formally, the acceptance threshold $l_A^t$ of CUHKAgent at time $t$ is determined as follows,

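$$
l_A^t =
\begin{cases}
u^{max} - \left(u^{max} - u^{max}\delta^{1-\lambda}\right)\left(\dfrac{t}{\lambda}\right)^{\alpha} & \text{if } t < \lambda \\
u^{max}\delta^{1-t} & \text{otherwise}
\end{cases}
\qquad (11.1)
$$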

where the variable $\alpha$ controls the way the acceptance threshold approaches $u^{max}\delta^{1-t}$ (boulware ($\alpha > 1$), conceder ($\alpha < 1$) or linear ($\alpha = 1$)). One example showing the dynamics of the acceptance threshold over time $t$ for different values of $\lambda$ is given in Fig. 11.1.
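The piecewise rule of Eq. (11.1) can be sketched as a small function. This is only an illustration under stated assumptions: the function and parameter names are not taken from the CUHKAgent source, the non-exploitation point is assumed to lie in (0, 1], and the default values are the ones quoted in the caption of Fig. 11.1.

```python
def acceptance_threshold(t: float, lam: float, u_max: float = 1.0,
                         delta: float = 0.8, alpha: float = 0.5) -> float:
    """Acceptance threshold at normalized time t in [0, 1].

    lam is the non-exploitation point (assumed 0 < lam <= 1), delta the
    discounting factor, alpha the shape parameter of the concession curve.
    """
    if t < lam:
        # Value the threshold approaches at time lam.
        floor_at_lam = u_max * delta ** (1.0 - lam)
        return u_max - (u_max - floor_at_lam) * (t / lam) ** alpha
    # After lam the threshold follows the lower bound u_max * delta^(1 - t).
    return u_max * delta ** (1.0 - t)


for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(t, round(acceptance_threshold(t, lam=0.5), 4))
```

With these values the threshold starts at $u^{max}$, falls to $u^{max}\delta^{1-\lambda}$ at the non-exploitation point, and then follows the lower bound for the remainder of the negotiation.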

The remaining question is how to calculate the value of the non-exploitation point $\lambda$. The value of $\lambda$ is determined by the characteristics of the negotiation scenario (i.e., the discounting factor $\delta$) and the concession degree of the negotiating partner. The smaller the discounting factor $\delta$ is, the less actual utility we will receive as time goes by, which means that we face more risk when we continue exploiting the negotiating partner. Therefore the value of $\lambda$ should decrease as the discounting factor $\delta$ decreases. The concession degree of the negotiating partner is estimated based on its past behaviors. Intuitively, the more new negotiation outcomes the negotiating partner has recently proposed, the more it is willing to make concessions to end the negotiation. Specifically, the negotiation partner's concessive degree $\sigma^t$ is defined as the ratio of new negotiation outcomes it proposed within the most recent finite negotiation history. If we predict that the negotiating partner is becoming more concessive, we can take advantage of this prediction by postponing the time at which we stop exploiting it, i.e., increasing the


[Figure: acceptance threshold (vertical axis, 0.88 to 1) plotted against time (horizontal axis, 0 to 1) for λ = 0.75, λ = 0.6 and λ = 0.5]

Fig. 11.1 The dynamics of the acceptance threshold ($u^{max} = 1$, $\alpha = 0.5$ and $\delta = 0.8$)

value of $\lambda$. Initially the value of $\lambda$ is determined by the discounting factor $\delta$ only, since we do not have any information on the negotiating partner yet. After that, it is adaptively adjusted based on the estimation of the concession degree of the negotiating partner. The overall adjustment rule of $\lambda$ during negotiation is shown in Fig. 11.2.
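The adjustment rule and the concessive-degree estimate can be sketched as follows. This is a sketch under assumptions rather than the agent's actual code: the parameter names mirror the quantities listed in Fig. 11.2, the window length and all numeric values are hypothetical, and a "new" outcome is taken to mean one that has not appeared earlier in the opponent's bid history.

```python
def initial_lambda(lam0: float, delta: float, beta: float) -> float:
    """Non-exploitation point at t = 0, determined by the discounting factor only."""
    return lam0 + (1.0 - lam0) * delta ** beta


def update_lambda(lam: float, sigma_t: float, w: float, gamma: float) -> float:
    """Move lam towards 1 in proportion to the opponent's concessive degree."""
    return lam + w * (1.0 - lam) * sigma_t ** gamma


def concessive_degree(history: list, window: int) -> float:
    """Fraction of the last `window` opponent bids that were not proposed before."""
    start = max(0, len(history) - window)
    seen = set(history[:start])
    new_bids = 0
    for bid in history[start:]:      # bids must be hashable, e.g. tuples
        if bid not in seen:
            new_bids += 1
            seen.add(bid)
    recent = len(history) - start
    return new_bids / recent if recent else 0.0


lam = initial_lambda(lam0=0.1, delta=0.8, beta=1.5)
sigma = concessive_degree(["a", "a", "b", "a", "c"], window=4)
lam = update_lambda(lam, sigma, w=0.3, gamma=2.0)
print(round(sigma, 2), round(lam, 3))
```

Because the update adds a fraction of the remaining distance to 1, the non-exploitation point stays within [0, 1] as long as the weighting factor and the concessive degree do.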

11.3.2 How to Propose Bids to the Opponent

In the previous section, we described how CUHKAgent determines its acceptance threshold. When the proposal offered by its opponent is not satisfactory, it needs to propose a counter offer higher than its current acceptance threshold to its opponent. Given the current acceptance threshold, any negotiation outcome over which CUHKAgent's utility is higher than the acceptance threshold can be a reasonable outcome to propose. To maximize the likelihood that the offer will be accepted by the opponent, we need to predict the negotiation outcome $\omega_{max}$ which maximizes the opponent's utility among the set $C$ of candidate negotiation outcomes.

To obtain $\omega_{max}$, we need to estimate the opponent's private preference based on its past negotiation moves. Different approaches [1, 3, 7, 8] have been proposed to explicitly estimate the negotiating partner's utility function in bilateral negotiation scenarios. To make the estimation feasible with the limited information available, we


Initial values

• $\lambda_0$ - the minimum value of $\lambda$,

• $\beta$ - the controlling variable determining the way the value of $\lambda$ changes with respect to the discounting factor $\delta$, i.e., boulware ($\beta < 1$), conceder ($\beta > 1$) or linear ($\beta = 1$),

• $\sigma^t$ - the estimation of the negotiating partner's concessive degree at time $t$,

• $\gamma$ - the controlling variable determining the way the value of $\lambda$ changes with respect to $\sigma^t$, i.e., boulware ($\gamma < 1$), conceder ($\gamma > 1$) or linear ($\gamma = 1$),

• $w$ - the weighting factor adjusting the relative effect of $\sigma^t$ on the non-exploitation point $\lambda$.

if $t = 0$ then
    $\lambda = \lambda_0 + (1 - \lambda_0)\delta^{\beta}$
end if
if $0 < t \le 1$ then
    $\lambda = \lambda + w(1 - \lambda)(\sigma^t)^{\gamma}$
end if

Fig. 11.2 Adjustment rule of $\lambda$ at time $t$

usually need to put some restrictions on the possible structures that the negotiation partner's utility function can have [3] or assume that the preference profile of the negotiation partner is chosen from a fixed set of profiles [7]. Due to the concerns mentioned in Sect. 11.2, instead of estimating the opponent's utility function directly, here we adopt a more general way to predict the current best negotiation outcome for the opponent based on a model-free reinforcement learning approach. The only assumption we need here is that the negotiating opponent is individually rational and follows some kind of concession-based strategy when proposing bids, which is the most commonly used assumption in both game-theoretic approaches and negotiations [3, 11].

Based on the above assumption, it is natural to assume that the sequence of past negotiation outcomes proposed by the opponent should be in accordance with the decreasing order of its preference over those outcomes. Intuitively, for a value $v_i$ of an item $m_i$, the earlier and the more frequently it appears in the negotiation outcomes of the past history, the more likely it is that it weighs more in contributing to the negotiation partner's overall utility. Therefore, for each value of each item $m_i$ in the negotiation domain, we keep a record of the number of times that it appears in the negotiating partner's past negotiation outcomes and update its value each time a new negotiation outcome $\omega'$ is proposed by the opponent as follows,

$$
n(\omega'(m_i)) = n(\omega'(m_i)) + \gamma^{k} \quad \forall m_i \in M \qquad (11.2)
$$

where $\omega'$ is the most recent negotiation outcome proposed by the opponent, $\gamma$ is the discounting factor reflecting the decreasing speed of the relative importance of the negotiation outcomes as time increases, and $k$ is the number of times that the value $\omega'(m_i)$ of item $m_i$ has appeared in the history.

For each negotiation outcome $\omega$, we define its accumulated frequency $f(\omega)$ as the criterion for evaluating the relative preference of the opponent over it. The value of $f(\omega)$ is determined by the value of $n(\omega(m_i))$ for each item $m_i \in M$ based on the current negotiation history. Formally, for any negotiation outcome $\omega$, its accumulated frequency $f(\omega)$ is calculated as follows,

$$
f(\omega) = \sum_{m_i \in M} n(\omega(m_i)) \qquad (11.3)
$$

The negotiation outcome $\omega_{max}$ is selected based on the $\epsilon$-greedy exploration mechanism. With probability $1 - \epsilon$, it chooses the negotiation outcome with the highest $f$-value from the set $C$ of candidate negotiation outcomes, and with probability $\epsilon$ it chooses one negotiation outcome randomly from $C$.
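The frequency bookkeeping of Eqs. (11.2) and (11.3) together with the greedy-with-exploration selection can be sketched as below. The class and function names are illustrative, bids are represented as plain dictionaries from item to value, the default values of `decay` and `epsilon` are hypothetical, and `k` is taken to count the appearances of a value before the current bid, which is one possible reading of the definition above.

```python
import random
from collections import defaultdict


class FrequencyModel:
    """Discounted per-value counts over the opponent's past bids."""

    def __init__(self, decay: float = 0.9):
        self.decay = decay                 # discounting factor of Eq. (11.2)
        self.n = defaultdict(float)        # (item, value) -> accumulated weight
        self.seen = defaultdict(int)       # (item, value) -> appearances so far

    def update(self, bid: dict) -> None:
        for item, value in bid.items():
            key = (item, value)
            k = self.seen[key]             # assumption: appearances before this bid
            self.n[key] += self.decay ** k
            self.seen[key] += 1

    def score(self, bid: dict) -> float:
        """Accumulated frequency f of a candidate bid, Eq. (11.3)."""
        return sum(self.n.get((item, value), 0.0) for item, value in bid.items())


def choose_bid(model: FrequencyModel, candidates: list,
               epsilon: float = 0.1, rng=random) -> dict:
    """Pick the best-scoring candidate, exploring at random with probability epsilon."""
    if rng.random() < epsilon:
        return rng.choice(candidates)
    return max(candidates, key=model.score)


model = FrequencyModel(decay=0.5)
model.update({"shirt": "blouse", "pants": "jeans"})
model.update({"shirt": "blouse", "pants": "chinos"})
candidates = [{"shirt": "blouse", "pants": "jeans"},
              {"shirt": "sweater", "pants": "chinos"}]
print(choose_bid(model, candidates, epsilon=0.0))
```

In the agent, the candidate set would contain only outcomes whose own utility exceeds the current acceptance threshold, so the selection never undercuts the threshold.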

11.4 Conclusion

In this paper, we propose an adaptive negotiating agent, CUHKAgent, to perform automatic negotiation in bilateral multi-issue negotiation scenarios. We introduce the concept of the non-exploitation point $\lambda$ to adaptively adjust the agent's concession degree to its negotiating opponent, and propose a reinforcement-learning based approach to determine the optimal proposal for the opponent to maximize the possibility that the offer will be accepted by the opponent. As future work, one worthwhile direction is to further refine the estimation of the negotiating partner's concessive degree to exploit the negotiating opponent more effectively, by taking into consideration the magnitude of the utility that the opponent proposes.

References

1. Faratin, P., Sierra, C., Jennings, N.R.: Using similarity criteria to make negotiation trade-offs. Artif. Intell. 142(2), 205–237 (2003)

2. Saha, S., Biswas, A., Sen, S.: Modeling opponent decision in repeated one-shot negotiations. In: AAMAS'05, pp. 397–403 (2005)

3. Hindriks, K., Tykhonov, D.: Opponent modeling in automated multi-issue negotiation using Bayesian learning. In: AAMAS'08, pp. 331–338 (2008)

4. Brzostowski, J., Kowalczyk, R.: Predicting partner's behaviour in agent negotiation. In: AAMAS'06, pp. 355–361 (2006)

5. Hao, J.Y., Leung, H.F.: An efficient negotiation protocol to achieve socially optimal allocation. In: PRIMA'12, pp. 46–60 (2012)

6. Zeng, D., Sycara, K.: Bayesian learning in negotiation. In: AAAI Symposium on Adaptation, Co-evolution and Learning in Multiagent Systems, pp. 99–104 (1996)

7. Zeng, D., Sycara, K.: Bayesian learning in negotiation. Int. J. Hum.-Comput. Stud. 48, 125–141 (1998)

8. Coehoorn, R.M., Jennings, N.R.: Learning an opponent's preferences to make effective multi-issue negotiation trade-offs. In: Proceedings of ICEC'04, ACM Press, pp. 59–68 (2004)

9. Baarslag, T., Fujita, K., Gerding, E.H., Hindriks, K., Ito, T., Jennings, N.R., Jonker, C., Kraus, S., Lin, R., Robu, V., Williams, C.R.: Evaluating practical negotiating agents: results and analysis of the 2011 international competition. Artif. Intell. 198, 73–103 (2013)

10. Hao, J.Y., Leung, H.F.: Abines: an adaptive bilateral negotiating strategy over multiple items. In: Proceedings of IAT'12, vol. 2, pp. 95–102 (2012)

11. Osborne, M.J., Rubinstein, A.: A Course in Game Theory. MIT Press, Cambridge (1994)

Chapter 12
AgentMR: Concession Strategy Based on Heuristic for Automated Negotiating Agents

Shota Morii and Takayuki Ito

Abstract The Automated Negotiating Agents Competition (ANAC2012) was organized. Automated agents can alleviate some of the effort required of people during negotiations and also assist people who are less qualified in the negotiation process. Thus, success in developing an automated agent with negotiation capabilities has great advantages and implications. In this paper, we present the heuristic-based strategy of our agent (Agent MR). We show a method of searching for bids effectively and also discuss how to control concession.

Keywords Automated negotiation competition • Multi-agent system • Multi-issue negotiation

12.1 Introduction

The Third International Automated Negotiating Agents Competition (ANAC2012) was held [1]. At ANAC, researchers proposed agents that had various strategies (e.g. [2]). It is likely that the strategies of an agent can be applied to real-life negotiation problems.

We developed a negotiation agent for ANAC2012 that can negotiate on various negotiation problems [3]. In this paper, we present a method to search efficiently on the various domains, and we discuss how to compute the concessions of the opponent in order to grasp the characteristics of the opponent.

S. Morii (✉) • T. Ito
Department of Computer Science and Engineering, Nagoya Institute of Technology, Gokiso-cho, Showa-ku, Nagoya, Aichi, Japan
e-mail: [email protected]; [email protected]



12.2 An Implementation of Negotiating Agents Based on Heuristic Strategy

12.2.1 Method of Searching for Bid

In the setting of the competition, many domains have many issues. The number of bids grows with the number of issues and the number of elements of each issue. In particular, if the domain has many bids, a simple method like a full search has difficulty in finding a bid that has high utility. Therefore, it is necessary to consider a method for searching the domain efficiently.

We search for bids based on the assumption that bids close to one's own bid have similar utilities. When a bid with high utility is changed in a single issue, we speculate that the resulting bid also has high utility. Table 12.1 gives an example of this search method. In Table 12.1, we find another bid with high utility by changing the element of the issue "Shirts".
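The one-issue-at-a-time search can be sketched as a neighbourhood expansion around a known good bid. This is a minimal sketch and not the Agent MR implementation: the function names, the utility threshold, the expansion order, and the toy additive utility function are all assumptions made for illustration.

```python
def neighbors(bid: dict, domain: dict):
    """Yield every bid that differs from `bid` in exactly one issue."""
    for issue, values in domain.items():
        for value in values:
            if value != bid[issue]:
                changed = dict(bid)
                changed[issue] = value
                yield changed


def search_high_utility_bids(start: dict, domain: dict, utility,
                             min_utility: float, max_bids: int = 100) -> list:
    """Expand outwards from `start`, keeping only bids with utility >= min_utility."""
    found, frontier = [start], [start]
    seen = {tuple(sorted(start.items()))}
    while frontier and len(found) < max_bids:
        current = frontier.pop(0)
        for candidate in neighbors(current, domain):
            key = tuple(sorted(candidate.items()))
            if key in seen:
                continue
            seen.add(key)
            if utility(candidate) >= min_utility:
                found.append(candidate)
                frontier.append(candidate)
    return found[:max_bids]


# Hypothetical two-issue domain with an additive utility function.
domain = {"shirts": ["Blouse", "Sweaters", "T-shirt"],
          "pants": ["Leather pants", "Jeans"]}
weights = {"Blouse": 0.5, "Sweaters": 0.44, "T-shirt": 0.1,
           "Leather pants": 0.5, "Jeans": 0.2}
utility = lambda bid: sum(weights[v] for v in bid.values())

best = {"shirts": "Blouse", "pants": "Leather pants"}
print(search_high_utility_bids(best, domain, utility, min_utility=0.9))
```

Only neighbours of bids that already pass the threshold are expanded, so the number of utility evaluations stays far below that of a full enumeration when high-utility bids are clustered.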

The heuristic-based search method is effective in domains with many issues. It leads to early agreement since the search is completed at an early stage. Moreover, the agent searches its own search space as well as the opponent's space. In the setting of the competition, it is important to use a strategy that can search quickly and effectively, since each negotiation has a time limit. Figure 12.1 is a graphical representation of the average utility and the number of bids found around one's own best bid in the ANAC2012 final rounds (excluding the Energy domain).

Simulation results show that the proposed strategy can search for a bid with high utility. On the other hand, the average utility is low in some domains that have few issues.

12.2.2 Evaluating Characteristics of Opponent

Since an agent's own utility space is not disclosed to the other agent, information that can be used for strategy construction is scarce. Therefore, a compromising strategy needs to be designed based on information such as the details of the domain and the characteristics of the opponent.

The main idea of the strategy is to compute the concessions of the opponent in order to grasp the characteristics of the opponent. Concretely, a concession of the opponent is calculated as follows. Let $\mathcal{D}$ be our set of domains. Agent A negotiates with B on

Table 12.1 Method of searching for bid

Shirts    Pants          Shoes     Accessories  Utility
Blouse    Leather pants  Sneakers  Sunglasses   1.00
Sweaters  Leather pants  Sneakers  Sunglasses   0.94


Fig. 12.1 Result of searching for bids in ANAC2012 final rounds

domain $D \in \mathcal{D}$. If they reach a certain bid $\omega$, the concession degree of B on the utility space of A is defined as in expression (12.1).

$$
\sigma = \frac{U(\omega) - U(\omega_{rivalFirst})}{1 - U(\omega_{rivalFirst})} \qquad (12.1)
$$

$U(\omega)$ is the utility of the bid $\omega$ on the agent's own utility space. $U(\omega_{rivalFirst})$ is the utility of the opponent's first bid on the agent's own utility space.

The value $\sigma$ characterizes the opponent's behavior. In addition to this indicator, we define the agent's own lower limit of concession $U_{myMin}$ on its utility space as follows:

$$
U_{myMin} = U(\omega_{rivalFirst}) + \left(1 - U(\omega_{rivalFirst})\right) \times \tau \qquad (12.2)
$$

$\tau$ is the coefficient for adjusting the concession, and is defined based on the concession degree $\sigma$. By using the lower limit $U_{myMin}$, the agent works at compromising towards the estimated optimal agreement point.

12.2.3 Control of Concession

We concede slowly using the sigmoid-based function. Concretely, our behavior isdecided based on the following expression (12.3).


Fig. 12.2 $U(t)$ when $\alpha$ is changed from 1 to 9

$$
U(t) = 1 - \frac{1}{1 + e^{-\alpha(t - \beta)}} \qquad (12.3)
$$

$U(t)$ is the value calculated by this function at time $t$. $\alpha$ is called the gain of this function, and we use it to adjust the speed of concession. $\beta$ is used to adjust the concession degree at $t$. $\beta$ is defined so that $U(t)$ equals $U_{myMin}$ at the deadline ($t = 1$).

$$
U(1) = U_{myMin} \qquad (12.4)
$$

Therefore, $\beta$ is defined as follows:

$$
\beta = 1 + \frac{1}{\alpha}\log\left(\frac{U_{myMin}}{1 - U_{myMin}}\right) \qquad (12.5)
$$

Figure 12.2 is an example of $U(t)$ when $\alpha$ is changed from 1 to 9. The horizontal axis shows the elapsed negotiation time. The vertical axis indicates the utility value that the agent obtains. The curve $U(t)$ approaches $U_{myMin}$ as time passes.
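Expressions (12.1) to (12.5) can be combined into a short numerical sketch. The function names and the sample values are hypothetical, the adjustment coefficient of expression (12.2) is passed in directly because the chapter does not give its exact definition, and the lower limit is assumed to lie strictly between 0 and 1 so that the logarithm is defined.

```python
import math


def concession_degree(u_bid: float, u_rival_first: float) -> float:
    """Expression (12.1): opponent concession relative to its first bid."""
    return (u_bid - u_rival_first) / (1.0 - u_rival_first)


def lower_limit(u_rival_first: float, coefficient: float) -> float:
    """Expression (12.2): lower limit of the agent's own concession."""
    return u_rival_first + (1.0 - u_rival_first) * coefficient


def beta_for_deadline(alpha: float, u_my_min: float) -> float:
    """Expression (12.5): shift chosen so that U(1) equals the lower limit."""
    return 1.0 + math.log(u_my_min / (1.0 - u_my_min)) / alpha


def target_utility(t: float, alpha: float, beta: float) -> float:
    """Expression (12.3): sigmoid-based concession curve."""
    return 1.0 - 1.0 / (1.0 + math.exp(-alpha * (t - beta)))


u_my_min = lower_limit(u_rival_first=0.4, coefficient=0.5)
for alpha in (1.0, 5.0, 9.0):
    beta = beta_for_deadline(alpha, u_my_min)
    curve = [round(target_utility(t, alpha, beta), 3) for t in (0.0, 0.5, 1.0)]
    print(alpha, curve)
```

For every gain the curve ends at the lower limit at $t = 1$; a larger gain keeps the target utility close to 1 for longer and concedes more abruptly near the deadline.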

12.3 Conclusion

In this paper, we described the basic strategy of Agent MR. We presented details of the strategy, in particular the method of searching for a bid. This heuristic-based strategy can find bids with high utility at an early stage. Moreover, we explained how to concede in order to follow the opponent's behavior. This makes it possible to control our concession degree based on an adequate estimate of the opponent's concession degree.


References

1. The Third International Automated Negotiating Agents Competition (ANAC2012). http://anac2012.ecs.soton.ac.uk/

2. Kawaguchi, S., Fujita, K., Ito, T.: AgentK2: compromising strategy based on estimated maximum utility for automated negotiating agents. In: Complex Automated Negotiations: Theories, Models, and Software Competitions, pp. 235–241. Springer, Berlin (2012)

3. Morii, S., Ito, T.: Development of automated negotiating agents in multi-issue negotiation problem (in Japanese). In: Tokai-Section Joint Conference on Electrical and Related Engineering 2012 (2012)

Chapter 13
OMAC: A Discrete Wavelet Transformation Based Negotiation Agent

Siqi Chen and Gerhard Weiss

Abstract This work describes an automated negotiation agent called OMAC, which was awarded joint third place in the 2012 Automated Negotiating Agent Competition (ANAC 2012). OMAC, standing for “Opponent Modeling and Adaptive Concession,” combines efficient opponent modeling and adaptive concession making. Opponent modeling is achieved through standard wavelet decomposition and cubic smoothing spline; concession making is done by setting the best possible concession rate on the basis of the expected utilities of forthcoming counter-offers.

Keywords Automated multi-issue negotiation • Discrete wavelet transformation • Opponent modeling

13.1 Introduction

Negotiation provides a mechanism for coordinating interaction among computational autonomous agents which represent respective parties of different or even conflicting interests. As automated negotiation can be applied to fields as diverse as electronic commerce and electronic markets, supply chain management, task and service allocation, etc., it has become a core topic of multi-agent systems [6]. This paper introduces a novel negotiation agent called OMAC (“Opponent Modeling and Adaptive Concession”) for complex scenarios, where agents have no useful information about their opponents and, in addition, are under real-time constraints. The negotiation strategy of OMAC integrates two key aspects of successful negotiation: efficient opponent modeling and adaptive concession making. Opponent modeling realized by OMAC aims at predicting the utilities of an opponent's future counter-offers and is achieved through two standard mathematical techniques, namely, wavelet decomposition and cubic smoothing spline. Adaptive concession making is achieved through dynamically adapting the concession rate (i.e., the degree to which an agent is willing to make concessions in its offers) on the basis of the utilities of future counter-offers it expects according to its opponent model.

This is a shortened version of our OMAC description provided in [3].

S. Chen • G. Weiss
Department of Knowledge Engineering, Maastricht University, Maastricht, The Netherlands

The remainder of this paper is structured as follows. Section 13.2 describes the standard negotiation environment underlying our research. Section 13.3 overviews OMAC. Sections 13.4–13.6 describe OMAC in detail. Finally, Sect. 13.7 identifies some important research lines induced by the work.

13.2 Negotiation Environment

We adopt a basic bilateral multi-issue negotiation setting which is widely used in the agents field (e.g., [2, 3]). The negotiation protocol is based on a variant of the alternating offers protocol proposed in [5]. Let $I = \{a, b\}$ be a pair of negotiating agents, $i$ represent a specific agent ($i \in I$), $J$ be the set of issues under negotiation, and $j$ be a particular issue ($j \in \{1, \ldots, n\}$ where $n$ is the number of issues). The goal of $a$ and $b$ is to establish a contract for a product or service. Thereby a contract consists of a package of issues such as price, quality and quantity. Each agent has a lowest expectation for the outcome of a negotiation; this expectation is called reserved utility $u_{res}$. $w_j^i$ ($j \in \{1, \ldots, n\}$) denotes the weighting preference which agent $i$ assigns to issue $j$, where the weights of an agent are normalized (i.e., $\sum_{j=1}^{n} w_j^i = 1$ for each agent $i$). During negotiation agents $a$ and $b$ act in conflictive roles which are specified by their preference profiles. In order to reach an agreement they exchange offers $O$ in each round to express their demands. Thereby an offer is a vector of values, with one value for each issue. The utility of an offer for agent $i$ is obtained by the utility function defined as:

$$U^i(O) = \sum_{j=1}^{n} \left( w_j^i \cdot V_j^i(O_j) \right) \tag{13.1}$$

where $w_j^i$ and $O$ are as defined above and $V_j^i$ is the evaluation function for $i$, mapping every possible value of issue $j$ (i.e., $O_j$) to a real number.

Following Rubinstein's alternating bargaining model [5], each agent makes, in turn, an offer in the form of a contract proposal. Negotiation is time-limited instead of being restricted by a fixed number of exchanged offers; specifically, each negotiator has a hard deadline by which it must have completed or withdrawn from the negotiation. The negotiation deadline of the agents is denoted by $t_{max}$. Under this form of real-time constraint, the number of remaining rounds is not known and the outcome of a negotiation depends crucially on the time sensitivity of the agents' negotiation strategies. This holds, in particular, for discounting domains, that is, domains in which the utility is discounted with time. As usual for discounting domains, we define a so-called discounting factor $\delta$ ($\delta \in [0, 1]$) and use this factor to calculate the discounted utility as follows:

$$D(U, t) = U \cdot \delta^{t} \tag{13.2}$$

where $U$ is the (original) utility and $t$ is the standardized time. As an effect, the longer it takes for agents to come to an agreement, the lower the utility they can achieve.
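As a minimal sketch of Eqs. (13.1) and (13.2), the following Python fragment evaluates an offer with a weighted additive utility and then discounts it. The two-issue domain, its weights and its evaluation tables are hypothetical and serve only to make the example runnable.

```python
from typing import Callable, Dict

def additive_utility(offer: Dict[str, str],
                     weights: Dict[str, float],
                     evaluators: Dict[str, Callable[[str], float]]) -> float:
    """Eq. (13.1): weighted sum of per-issue evaluations (weights sum to 1)."""
    return sum(weights[issue] * evaluators[issue](value)
               for issue, value in offer.items())

def discounted_utility(utility: float, t: float, delta: float) -> float:
    """Eq. (13.2): utility discounted at standardized time t, delta in [0, 1]."""
    return utility * delta ** t

# Hypothetical two-issue domain, for illustration only.
weights = {"price": 0.7, "quality": 0.3}
evaluators = {
    "price": {"low": 1.0, "high": 0.2}.__getitem__,
    "quality": {"basic": 0.4, "premium": 1.0}.__getitem__,
}
offer = {"price": "low", "quality": "basic"}
u = additive_utility(offer, weights, evaluators)   # 0.7 * 1.0 + 0.3 * 0.4
print(u, discounted_utility(u, 0.5, 0.8))
```

The same offer is therefore worth less to the agent the later it is agreed upon, unless $\delta = 1$.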

After receiving an offer $O_{opp}$ from the opponent, an agent decides on acceptance or rejection according to its interpretation $I(t, O_{opp})$ of the current negotiation situation. For instance, this decision can be made in dependence on a certain threshold $Thres^i$: agent $i$ accepts if $U^i(O_{opp}) \ge Thres^i$, and rejects otherwise. As another example, the decision can be based on utility differences. Negotiation continues until one of the negotiating agents accepts or withdraws due to timeout.

13.3 Overview of OMAC

An overview of OMAC is given in Algorithm 5. In more detail, OMAC includes two core stages, opponent modeling and concession rate adaptation, as described in detail in Sects. 13.4 and 13.5, respectively. A third important stage of OMAC, its response mechanism to counter-offers, is described in Sect. 13.6.

13.4 Opponent Modeling

According to OMAC, the aim of opponent modeling realized by a negotiating agent is to estimate the utilities of future counter-offers it will receive from its opponent. This corresponds to lines 3 to 8 in Algorithm 5. Opponent modeling is done through a combination of wavelet analysis and cubic smoothing spline. When receiving a new bid from the opponent at time $t_c$, the agent records the time stamp $t_c$ and the utility $U(O_{opp})$ this bid has according to the agent's utility function. The maximum utilities in consecutive equal time intervals and the corresponding time stamps are used periodically as the basis for predicting the opponent's behavior (lines 5 and 6). The reasons for periodical updating are twofold, as discussed in [2]. Firstly, this reduces the computational complexity so that the agent's response time is kept low. If all observed counter-offers were taken as inputs, the agent might have to deal with thousands of data points in every single session. This computational load would have a clear negative impact on the quality of negotiation in a real-time constraint setting. Secondly, the effect of noise can be reduced.


Algorithm 5: The strategy of OMAC. $t_c$ refers to the current time, $\delta$ the time discounting factor, $\lambda$ the layer of wavelet decomposition, $\psi$ the wavelet function, and $t_{max}$ the deadline of negotiation. $O_{opp}$ is the latest offer of the opponent, and $O_{own}$ the offer to be proposed by OMAC. $\chi$ represents the time series comprised of the maximum utilities over intervals. Let $\nu$ be the smooth component of the $\lambda$-th order wavelet decomposition based on $\psi$, and $\alpha$ the predicted main tendency of $\chi$. $t_l$ is the time at which we perform the prediction process and $u_l$ is the utility of our most recent offer. $u'$ is the target utility at time $t_c$. $R$ is the reserved utility function.

 1: Require: $t_{max}, \delta, \lambda, \psi, R$
 2: while $t_c \le t_{max}$ do
 3:   $O_{opp} \Leftarrow$ receiveMessage();
 4:   recordBids($t_c, O_{opp}$);
 5:   if needUpdate($t_c$) then
 6:     $\chi \Leftarrow$ preprocessData($t_c$)
 7:     $(\alpha, t_l, u_l) \Leftarrow$ predict($\chi, \lambda, \psi$);
 8:   end if
 9:   $u'$ = getTarUtility($t_c, t_l, u_l, \delta, \alpha, R$);
10:   if getOwnUtility($O_{opp}, t_c, \delta$) $\ge u'$ then
11:     accept($O_{opp}$);
12:   else
13:     $O_{own} \Leftarrow$ constructOffer($u'$);
14:     proposeBid($O_{own}$);
15:   end if
16: end while

In multi-issue negotiation a small change in utility of the opponent can result in a large utility change for the negotiator, and this can easily result in a misinterpretation of the opponent's behavior.

Behavior prediction is mainly done by applying discrete wavelet transformation (DWT) to the time series $\chi$; this is captured by line 7. We decided to use DWT because wavelet analysis is known to be an efficient multi-scaling tool for exploring features in data sets. With DWT a signal can be decomposed into two parts, an approximation and a detail part. The former is smooth and reveals the trend of the original signal, and the latter is rough and corresponds to noise (resulting e.g. from seasonal fluctuations). OMAC focuses on the approximation part and intentionally ignores the detail part for three reasons. First, the approximation part represents the trend of the opponent's concession in terms of utility and indicates how the concession of the opponent will develop in the future. Second, it is smooth enough (compared to the original signal, i.e. $\chi$) to allow for quality prediction performance. Third, the detail part contains information which is of little value in a negotiation setting. As we saw in various empirical investigations, the ratio between the main tendency term and the original signal tends to be about 0.98 with a small standard deviation. Precise extension of those detailed components can improve the effectiveness of our model slightly; it is, however, very costly for a medium-range lead time in real-time negotiation.


Given the discrete wavelet function $\psi_{j,k}(t)$ transformed by a mother wavelet $\psi(t)$,

$$\psi_{j,k}(t) = a_0^{-j/2}\, \psi\left(a_0^{-j} t - k b_0\right), \quad j, k \in \mathbb{Z} \tag{13.3}$$

DWT corresponds to a mapping from the signal $f(t)$ to coefficients $C_{j,k}$ which are related to particular scales, where these coefficients are defined as follows:

$$C_{j,k} = \int_{-\infty}^{+\infty} f(t)\, \psi_{j,k}(t)\, dt, \quad j, k \in \mathbb{Z} \tag{13.4}$$

The wavelet $\psi(t)$ is required to be orthogonal; the set $\{\psi_{j,k}(t) \mid j, k \in \mathbb{Z}\}$ is then an orthogonal wavelet basis such that the signal $f(t)$ can be reconstructed.

With recursive application of DWT to the signal $f(t)$, the approximation (low frequency) and detail (high frequency) components are recovered, respectively. For instance, $f$ can first be decomposed into $a_1 + d_1$ and the resulting part $a_1$ can then be decomposed into finer components, that is, $a_1 = a_2 + d_2$, and so on. Based upon this recursive process, the signal can be expressed as $f = a_n + d_n + d_{n-1} + \cdots + d_1$ (further details on wavelets are given in e.g. [4]). The results reported in this paper are achieved through wavelet decomposition using Daubechies wavelets of order 10.
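The additive structure of such a multi-level decomposition can be illustrated with a small pure-Python sketch. To stay dependency-free it uses the Haar wavelet, for which the level-$n$ approximation is simply the mean over blocks of $2^n$ samples, rather than the order-10 Daubechies wavelets used for the reported results. The function names and the sample series are assumptions made for illustration.

```python
from typing import List, Tuple

def block_means(signal: List[float], size: int) -> List[float]:
    """Replace every block of `size` samples by its mean (Haar approximation)."""
    out: List[float] = []
    for start in range(0, len(signal), size):
        block = signal[start:start + size]
        out.extend([sum(block) / len(block)] * len(block))
    return out

def haar_decompose(signal: List[float],
                   levels: int) -> Tuple[List[float], List[List[float]]]:
    """Return (a_levels, [d_1, ..., d_levels]) with signal = a_levels + sum of d_n."""
    previous = list(signal)                 # a_0 is the signal itself
    details: List[List[float]] = []
    for n in range(1, levels + 1):
        approx = block_means(signal, 2 ** n)
        # d_n is what is lost when going from a_(n-1) to a_n
        details.append([p - a for p, a in zip(previous, approx)])
        previous = approx
    return previous, details

# Hypothetical series of interval-wise maximum utilities.
series = [0.60, 0.62, 0.61, 0.65, 0.70, 0.68, 0.74, 0.80]
smooth, details = haar_decompose(series, levels=2)
rebuilt = [s + sum(d[i] for d in details) for i, s in enumerate(smooth)]
print([round(v, 3) for v in smooth])
```

Adding all detail parts back to the final approximation restores the original series, while the approximation alone keeps only the trend.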

We use the following notation:

$$\chi = \nu + \sum_{n=1}^{\lambda} d_n \tag{13.5}$$

where $\nu$ represents the approximation component of $\chi$ and $d_n$ is the $n$-th layer detail part ($n$ is determined by the decomposition level $\lambda$). An example can be found in Fig. 13.1, which shows $\chi$ and its corresponding approximation part $\nu$ along with the estimated upper and lower bounds of $\chi$. The two bounds are represented by $\nu \pm \sigma$, where $\sigma$ is the standard deviation of the ratio between $\nu$ and $\chi$.

In order to forecast the opponent's future behavior, cubic smoothing spline is used to extend the smooth component $\nu$. Cubic spline is widely used as a tool for prediction, see [7]. For equally spaced time series, a cubic spline is a smoothing piecewise function, denoted as the function $\hat{g}(t)$, which minimizes:

$$p \sum_{t=1}^{n} w(t) \left( f(t) - \hat{g}(t) \right)^2 + (1 - p) \int \left( \hat{g}''(u) \right)^2 du \tag{13.6}$$

where $p$ is the smoothing parameter controlling the rate of exchange between the residual error described by the sum of squared residuals and local variation represented by the square integral of the second derivative of $\hat{g}$, and $w$ is the weight vector (for further details, refer to [1]).


Fig. 13.1 Illustrating the opponent's concession (given by $\chi$, the thick solid line) and the corresponding approximation part $\nu$ (the thin solid line) when negotiating with Agent_K2 in the Camera domain (this agent and domain are taken from ANAC 2011). The two dash-dot lines represent the estimated upper and lower bounds of $\chi$. Horizontal axis: percentage time (%); vertical axis: utility.

Figure 13.2 shows the actual and the predicted smooth parts of the opponent's concession at different time points for the opponent “Iamhaggler2011”: as this figure illustrates, cubic spline is able to forecast the given signal very well within a medium range. Since OMAC applies a periodical updating mechanism, it is neither necessary nor wise to forecast globally (i.e., from the current moment to the end point of negotiation), because this would probably bring too much noise into the prediction. OMAC limits the range of forecasting to $\zeta$ intervals and in this way achieves efficiency and noise reduction.

13.5 Adaptive Adjustment of Concession Rate

Fig. 13.2 Illustration of the predictive power of OMAC in two consecutive ranges (horizontal axis: percentage time (%); vertical axis: utility). The dashed line indicates the time point $t_c$ at which the current prediction is made. The plus signs to the left of the dashed line are the actual points of $\nu$ before $t_c$. The crosses to the right of the dashed line show the actual points of $\nu$ after $t_c$. The extended version of $\nu$, i.e. $\alpha$ (the prediction of $\nu$), is shown by the solid line. These results are achieved against the agent Iamhaggler2011 in the domain Amsterdamparty (the agent and domain are taken from ANAC 2011)

Given the extended version of the smooth part, $\alpha$, we now discuss how to use it for adaptively setting the concession rate of our expected utility (see line 9 in Algorithm 5). A possibility is to maximize the expected utility merely according to the predicted opponent move. This is quite straightforward but may not be very effective. Suppose the negotiation partners are “tough” and always avoid making any concession in bargaining. In this case the result of prediction could indicate a very low expectation about the utility offered by the opponent and this, in turn, would result in an adverse concession. In OMAC a simple function $R$, called reserved utility function, is used to realize concession adaptation. This function guarantees the minimum utility at each given time step. This is because the function values are set as the lower bound of our expected utilities. Moreover, in principle it makes concession over time, thereby taking into account the impact of the discounting factor. Specifically, the reserved utility function is given by:

$$R(t) = u_{res} + (1 - t^{1/\beta}) \left( \mathrm{maxUtility}(p) \cdot \delta^{\eta} - u_{res} \right) \tag{13.7}$$

where $u_{res}$ is the minimum utility the agent would accept, $\beta$ is a parameter which has a direct impact on the concession rate, $\mathrm{maxUtility}(p)$ is the function specifying the maximum utility given by the preference profile $p$ of a negotiation domain, and $\eta$ is a parameter called risk factor which reflects the agent's expectation about the maximum utility it can achieve.
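The shape of the reserved utility function can be explored with the short Python sketch below. It rests on one loudly stated assumption: the risk factor (called `eta` here) is read as an exponent of the discounting factor. All parameter values are hypothetical.

```python
def reserved_utility(t: float, u_res: float, beta: float,
                     max_utility: float, delta: float, eta: float) -> float:
    """Eq. (13.7), assuming the risk factor eta is an exponent of delta."""
    return u_res + (1.0 - t ** (1.0 / beta)) * (max_utility * delta ** eta - u_res)

def discounted(utility: float, t: float, delta: float) -> float:
    """Eq. (13.2): D(U, t) = U * delta^t."""
    return utility * delta ** t

# Hypothetical parameters: the curve starts at max_utility * delta^eta
# and falls to the reserved utility u_res at the deadline t = 1.
for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    r = reserved_utility(t, u_res=0.3, beta=0.5, max_utility=1.0, delta=0.8, eta=0.5)
    print(t, round(r, 3), round(discounted(r, t, 0.8), 3))
```

The third printed column is the discounted lower bound $D(R(t), t)$ against which the forecast of the opponent's concession is compared.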

We define the estimated received utility $E_{ru}(t)$, which gives our agent the expectation of the opponent's future concession, as follows:

$$E_{ru}(t) = D\left( \alpha(t) \left( 1 + \mathrm{Stdev}(ratio_{[t_b, t_c]}) \right), t \right), \quad t \in [t_c, t_s] \tag{13.8}$$

where $\mathrm{Stdev}(ratio_{[t_b, t_c]})$ is the standard deviation of the ratio between the smooth part $\nu$ and the original signal $\chi$ from the beginning of negotiation ($t_b$) till now and $t_s$ is the end of $\alpha$.


Suppose the future expectation the agent has obtained from $E_{ru}(t)$ is optimistic, in other words, there exists an interval $\{T \mid T \neq \emptyset,\ T \subseteq [t_c, t_s]\}$, so that

$$E_{ru}(t) \ge D(R(t), t), \quad t \in T \tag{13.9}$$

OMAC then sets the time $\hat{t}$ at which the optimal estimated utility $\hat{u}$ is reached as:

$$\hat{t} = \arg\max_{t \in T} \left( E_{ru}(t) - D(R(t), t) \right) \tag{13.10}$$

and $\hat{u}$ is simply assigned by:

$$\hat{u} = E_{ru}(\hat{t}) \tag{13.11}$$

When the opponent's future concession is estimated to be below the agent's expectations according to $R(t)$ (i.e., there is no such interval $T$ described above), OMAC investigates whether the best possible outcome under that “pessimistic” expectation of opponent concession should be accepted given the threshold $\xi$. This outcome is denoted as $\rho$ and is given by:

$$\rho = \xi^{-1} \cdot \frac{E_{ru}(t_{\rho})}{D(R(t_{\rho}), t_{\rho})}, \quad t_{\rho} \in [t_c, t_s] \tag{13.12}$$

where $\xi$ is the tolerance threshold to accept $E_{ru}(t_{\rho})$ as target utility and $t_{\rho}$ is given by:

$$t_{\rho} = \arg\min_{t \in [t_c, t_s]} \left( |E_{ru}(t) - D(R(t), t)| \right) \tag{13.13}$$

The rationale behind this is that if the agent rejects the “locally optimal” counter-offer, it will probably lose the opportunity to reach a “globally good” agreement (especially in discounting domains). If $\rho > 1$, $\hat{u}$ and $\hat{t}$ are assigned $E_{ru}(t_{\rho})$ and $t_{\rho}$, respectively. Moreover, the agent records the utility and time of its last bid as $u_l$ and $t_l$, respectively. Otherwise, the estimated utility is set to $-1$, meaning it no longer takes effect, and $D(R(t_c), t_c)$ is used to set the target utility $u'$.

When the agent expects to achieve better outcomes (see Eq. (13.9)), the optimal estimated utility $\hat{u}$ is chosen as the target utility for our agent's future bids. Obviously, it is not rational to concede immediately to $\hat{u}$ when $u_l \ge \hat{u}$, nor should it shift to $\hat{u}$ without delay given $u_l < \hat{u}$, especially because the prediction may not be entirely accurate. To simplify the negotiation strategy, OMAC applies linear concession making, and the concession rate is dynamically adjusted to grasp every chance to maximize its profit. Overall, the target utility $u'$ is given as follows:

$$u' = \begin{cases} D(R(t), t) & \text{if } \hat{u} = -1 \\ \hat{u} + (u_l - \hat{u}) \dfrac{t - \hat{t}}{t_l - \hat{t}} & \text{otherwise} \end{cases} \tag{13.14}$$
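The selection of the estimate and the linear concession toward it can be sketched as follows. This is an illustrative reading of Eqs. (13.9)–(13.14) on a discrete time grid, not the authors' code: `eru` and `lower` stand for $E_{ru}(t)$ and $D(R(t), t)$, `xi` for the tolerance threshold, the pessimistic test is taken as the ratio of the two utilities divided by `xi`, and `None` replaces the sentinel value $-1$. All names and sample curves are assumptions.

```python
from typing import Callable, Optional, Sequence, Tuple

Curve = Callable[[float], float]

def choose_estimate(times: Sequence[float], eru: Curve, lower: Curve,
                    xi: float) -> Optional[Tuple[float, float]]:
    """Return (u_hat, t_hat), or None when the estimate does not take effect."""
    gaps = [(eru(t) - lower(t), t) for t in times]
    optimistic = [(g, t) for g, t in gaps if g >= 0]        # Eq. (13.9)
    if optimistic:
        _, t_hat = max(optimistic)                          # Eq. (13.10)
        return eru(t_hat), t_hat                            # Eq. (13.11)
    _, t_rho = min((abs(g), t) for g, t in gaps)            # Eq. (13.13)
    rho = eru(t_rho) / (xi * lower(t_rho))                  # Eq. (13.12), assumed form
    return (eru(t_rho), t_rho) if rho > 1 else None

def target_utility(t: float, estimate: Optional[Tuple[float, float]],
                   u_l: float, t_l: float, lower: Curve) -> float:
    """Eq. (13.14): linear concession from (t_l, u_l) toward (t_hat, u_hat)."""
    if estimate is None:
        return lower(t)
    u_hat, t_hat = estimate                                 # requires t_hat != t_l
    return u_hat + (u_l - u_hat) * (t - t_hat) / (t_l - t_hat)

# Hypothetical forecast and lower bound on a coarse time grid.
times = [0.5, 0.6, 0.7, 0.8]
eru = lambda t: 0.5 + 0.3 * t
lower = lambda t: 0.9 - 0.4 * t
estimate = choose_estimate(times, eru, lower, xi=0.9)
print(estimate, target_utility(0.5, estimate, u_l=0.9, t_l=0.4, lower=lower))
```

By construction the target equals $u_l$ at $t_l$ and reaches $\hat{u}$ at $\hat{t}$, so the concession rate changes whenever a new prediction moves $(\hat{t}, \hat{u})$.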


13.6 Response Mechanism

The response stage corresponds to lines 10 to 15 in Algorithm 5. With the target utility $u'$ known (Eq. 13.14), the agent then needs to examine the counter-offer to see if the utility of that offer, $U(O_{opp})$, is higher than the target utility. If so, it accepts this counter-offer and, with that, terminates the negotiation session. Otherwise, the agent constructs a bid with utility $u'$ to be proposed in the next round.

In multi-issue negotiation, offers with exactly the same utility for one side can have different values for the other party. Moreover, in time-limited negotiation scenarios no explicit limitation is imposed on the number of negotiation rounds and it is possible to generate many offers having a utility close to $u'$. OMAC takes advantage of this and aims at generating many offers in order to explore the space of possible outcomes and to increase the acceptance chance of its own bids. Specifically, offers are constructed in such a way that the agent randomly selects an offer whose utility is in the range $[0.99u', 1.01u']$. If no such solution is found, the latest offer made by the agent is used again in the subsequent round. Moreover, in view of negotiation efficiency, if $u'$ drops below the utility of the best counter-offer according to the agent's utility function, this best counter-offer is proposed by the agent as its next offer. This makes sense because the counter-offer tends to satisfy the expectation of the opponent and is thus likely to be accepted by the opponent.
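The offer construction rules above can be written down as a small Python sketch. It assumes, for illustration only, that the outcome space is small enough to be scanned exhaustively; the function name, the outcome labels and their utilities are hypothetical.

```python
import random
from typing import Callable, Hashable, Optional, Sequence

def construct_offer(candidates: Sequence[Hashable],
                    utility: Callable[[Hashable], float],
                    target: float,
                    last_offer: Optional[Hashable],
                    best_counter: Optional[Hashable],
                    rng: random.Random) -> Optional[Hashable]:
    """Pick the next offer for the target utility `target` (u' in the text)."""
    # If the target has dropped below the best counter-offer, re-propose that offer.
    if best_counter is not None and target < utility(best_counter):
        return best_counter
    # Otherwise choose at random among offers within 1% of the target.
    band = [o for o in candidates if 0.99 * target <= utility(o) <= 1.01 * target]
    if band:
        return rng.choice(band)
    # No offer in the band: repeat the latest own offer.
    return last_offer

# Hypothetical outcome space with precomputed utilities.
utilities = {"A": 0.90, "B": 0.805, "C": 0.795, "D": 0.60}
rng = random.Random(0)
offer = construct_offer(list(utilities), utilities.__getitem__, 0.80, "A", "D", rng)
print(offer)   # either "B" or "C"
```

In a large outcome space the exhaustive scan would have to be replaced by sampling or search, which this sketch does not attempt.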

13.7 Conclusions and Future work

This paper introduced an effective negotiation agent called OMAC for automated negotiation in complex scenarios (bilateral multi-issue, time-constrained, no prior knowledge, low computational load, etc.). This agent, based on its efficient decision-making mechanism, achieved joint third place in ANAC 2012.

We think the experimental results justify investing further research effort into this strategy, and we see several interesting research questions. First, are there opponent modeling techniques which are even more efficient than wavelet decomposition and cubic smoothing spline? Second, are there techniques for concession rate adaptation which are more accurate than the basic technique currently used? And third, can opponent modeling of OMAC, which currently focuses on modeling the opponent's strategies, be extended toward modeling the opponent's preferences as well?

References

1. de Boor, C.: A Practical Guide to Splines. Springer, New York (1978)

2. Chen, S., Ammar, H.B., Tuyls, K., Weiss, G.: Optimizing complex automated negotiation using sparse pseudo-input Gaussian processes. In: Proceedings of the 12th International Joint Conference on Autonomous Agents and Multi-Agent Systems, pp. 707–714. ACM, Saint Paul, Minnesota (2013)

3. Chen, S., Weiss, G.: An efficient and adaptive approach to negotiation in complex environments. In: Proceedings of the 20th European Conference on Artificial Intelligence, pp. 228–233. IOS, Montpellier, France (2012)

4. Daubechies, I.: Ten Lectures on Wavelets. Society for Industrial and Applied Mathematics, Philadelphia, PA (2006)

5. Rubinstein, A.: Perfect equilibrium in a bargaining model. Econometrica 50(1), 97–109 (1982)

6. Weiss, G. (ed.): Multiagent Systems, 2nd edn. MIT, Cambridge, MA (2013)

7. Yousefi, S., Weinreich, I., Reinarz, D.: Wavelet-based prediction of oil prices. Chaos Solitons Fract. 25(2), 265–275 (2005)

Chapter 14
The Simple-Meta Agent

Litan Ilany and Ya’akov (Kobi) Gal

Abstract The Simple-Meta agent uses machine learning to select the negotiation strategy that is predicted to be most successful based on structural features of the domain.

Keywords Algorithm selection • Machine learning • Negotiation

14.1 Introduction

The Simple-Meta agent combines machine learning with known agent strategies for ANAC in order to choose the best existing strategies for different domains. This agent exploits the fact that individual negotiation strategies from the literature vary widely in their best-case performance for different negotiation domains [1]. Our methodology consists of defining a set of features that encapsulate the information about the domain that is available to agents at the onset of negotiation. These features are then used to predict the performance of existing negotiation strategies on a new domain using canonical learning methods including multi-layer neural networks, decision trees, and linear and logistic regression. At run time, we select the negotiation strategy that is predicted to be most successful on a new domain based on its features.

L. Ilany • Y. Gal
Ben-Gurion University of the Negev, Beer-Sheva, Israel



14.2 Definitions

A domain consists of a set of issues $L$. Each issue $l \in L$ can take one of the possible discrete values in the set $V_l$. The domain is common knowledge to the negotiating parties. A proposal $p = (v_1, \ldots, v_{|L|})$ is an assignment of values to all issues in $L$. Let $P$ denote the set of all possible proposals in a domain. A negotiation round involves two participants termed Agent1 and Agent2. Each agent has a profile, which determines its valuation of a proposal and is private information. The profile of Agent1 includes (1) a valuation function $o_1 : V_l \to \mathbb{R}$ mapping a value of issue $l$ to the real numbers; (2) a weight vector for all issues $W_1 = (w_{1,1}, \ldots, w_{1,|L|})$ where $w_{1,l}$ is the weight of issue $l$; (3) a discount factor $\delta_1$; (4) a reservation value $r_1$. (The profile of Agent2 is defined in a similar way.)

In a negotiation round, Agent1 and Agent2 make alternating take-it-or-leave-it offers to each other until a proposal is accepted or a predetermined deadline is reached. Each agent has a role that determines whether the agent makes the first or second offer in the negotiation round. If an agreement is reached for a proposal $p^t$ at time $t$, the utility of Agent1 is

$$u_1(p^t) = \left( \sum_{l \in L} w_{1,l} \cdot o_1(v_l) \right) \cdot \delta_1^{t} \tag{14.1}$$

Otherwise, the utility of Agent1 is $r_1 \cdot \delta_1^{t}$ (and similarly for Agent2). The score of an agent in a negotiation round is simply the utility it achieved in the round.

14.2.1 Constructing Domain Features

We defined three types of features for any domain $d$ in a tournament. The first type corresponds to domain information that is common knowledge, including the following features:

• the number of issues in the domain ($|L|$),
• the average number of values in each issue ($\mathrm{AVG}(\{|V_k| \mid \forall k \in L\})$),
• the number of possible proposals in the domain ($|P| = \prod_{k \in L} |V_k|$).

The second type of features corresponds to an agent's profile, which is private information. We describe these features from the point of view of Agent1 at time 0 (when it is needed to select an agent):

• the discount factor $\delta_1$ and the reservation value $r_1$,
• the standard deviation of weights over all possible issues ($\mathrm{SD}(\{w_{1,k} \mid \forall k \in L\})$),
• the average utility at time $t = 0$ over all possible proposals in the domain ($\mathrm{AVG}(\{u_1(p^0) \mid p^0 \in P\})$),
• the standard deviation of its utility over all possible proposals $P$ ($\mathrm{SD}(\{u_1(p^0) \mid p^0 \in P\})$).

The third type of features corresponds to information that is inferred from the first proposal $p^0$ that Agent1 receives from Agent2. These features include:

• the utility of Agent1 at time 0 from the proposal ($u_1(p^0)$),
• the average utility at time 0 of all proposals that Agent1 prefers to $p^0$, i.e., that give higher utility ($\mathrm{AVG}(\{u_1(q^0) \mid \forall q^0 \text{ s.t. } u_1(q^0) > u_1(p^0)\})$),
• the standard deviation of the utility over all such preferable proposals ($\mathrm{SD}(\{u_1(q^0) \mid \forall q^0 \text{ s.t. } u_1(q^0) > u_1(p^0)\})$).
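The three feature types can be computed directly from a domain description, as in the following Python sketch. It is an illustration under stated assumptions: the dictionary layout of the domain, the feature names, the use of the population standard deviation (`pstdev`) and the toy domain are not taken from the chapter, and all proposals are enumerated explicitly, which is only practical for small domains.

```python
from itertools import product
from statistics import mean, pstdev
from typing import Dict, List, Sequence

def domain_features(values: Dict[str, List[str]],
                    weights: Dict[str, float],
                    valuation: Dict[str, Dict[str, float]],
                    delta: float, reservation: float,
                    first_proposal: Sequence[str]) -> Dict[str, object]:
    issues = list(values)
    proposals = list(product(*(values[i] for i in issues)))     # the set P

    def u0(proposal: Sequence[str]) -> float:
        """Utility at time 0 (Eq. 14.1 with t = 0, so no discounting)."""
        return sum(weights[i] * valuation[i][v] for i, v in zip(issues, proposal))

    utils = [u0(p) for p in proposals]
    u_first = u0(first_proposal)
    better = [u for u in utils if u > u_first]
    return {
        # type 1: common knowledge about the domain
        "num_issues": len(issues),
        "avg_values_per_issue": mean([len(values[i]) for i in issues]),
        "num_proposals": len(proposals),
        # type 2: private profile of Agent1
        "discount": delta,
        "reservation": reservation,
        "sd_weights": pstdev([weights[i] for i in issues]),
        "avg_utility": mean(utils),
        "sd_utility": pstdev(utils),
        # type 3: inferred from the first proposal received
        "first_utility": u_first,
        "avg_better": mean(better) if better else None,
        "sd_better": pstdev(better) if better else None,
    }

# Hypothetical two-issue domain and profile.
features = domain_features(
    values={"price": ["low", "high"], "color": ["red", "blue", "green"]},
    weights={"price": 0.6, "color": 0.4},
    valuation={"price": {"low": 1.0, "high": 0.0},
               "color": {"red": 1.0, "blue": 0.5, "green": 0.0}},
    delta=0.9, reservation=0.2, first_proposal=("high", "red"))
print(features)
```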

14.2.2 The Simple Meta-Agent

Let $s^d_{i,j}$ denote the score obtained by agent $i$ when negotiating with agent $j$ in any domain $d$.¹ Let $s^d_i = \mathrm{AVG}(\{s^d_{i,k} \mid \forall k \in A, k \neq i\})$ denote the average score for agent $i$ that negotiates in domain $d$ over all training agents $A$. Let $s^d = \mathrm{AVG}(\{s^d_j \mid \forall j \in A\})$ denote the average score of all training agents that negotiate with each other in domain $d$. The optimal agent in $A$ for a domain $d$ is associated with the highest average score $s^d_*$ when negotiating with all of the testing agents in $A'$:

$$s^d_* = \max_{i \in A} \left( \mathrm{AVG}(\{s^d_{i,k} \mid k \in A'\}) \right) \tag{14.2}$$

We used canonical supervised learning algorithms to predict the performance of agents given a domain and profile. Performance is measured by the difference between the score of agent $i$ when negotiating with any agent $k$ in domain $d$ and the average score over all negotiations with all agents in the domain ($s^d_{i,k} - s^d$). We used different learning techniques to predict an agent's performance when negotiating in a new domain (adapting standard overfitting avoidance methods for each technique): a regression tree algorithm that selected the tree size for minimizing the cross-validation error [2]; a neural network with a single hidden layer and four hidden nodes, using early stopping after 150 iterations when training; and a linear regression model with a forward-backward selection method for choosing the predictive variables [3].

¹ We assume a one-to-one correspondence between an agent $i \in A$ and its negotiation strategy; we use $i$ to refer to either.

The algorithm used by the simple meta-agent to choose an agent strategy is given in Fig. 14.1 (presented from the point of view of Agent1). We assume knowledge of a set of training domains $D$ and agents $A$. This training data is used to learn (off-line) the models described above. Given a test domain $d \in D'$, the agent first checks whether $d$ is already known ($d \in D$). In this case, the best the meta-agent can do is to select the agent in $A$ that achieved the best performance in $d$ (line 2). Otherwise, the meta-agent will compute the features associated with the domain. These features depend on receiving a proposal from Agent2 (line 5). If the meta-agent is the first proposer, it needs to make a proposal to Agent2. Lacking any information about the profile of Agent2, it makes the proposal that provides it with maximal utility (lines 3–4). In line 6, the meta-agent computes the features associated with domain $d$ and the proposal $p'$ received from Agent2. Finally, in lines 7–8 it predicts the performance of each agent in $A$ on domain $d$, and returns the agent with the highest predicted performance (breaking ties randomly).

Known: Domains $D$, agents $A$
Input: Test domain $d \in D'$
Output: Agent strategy $i^*$
1. If $d \in D$ then
2.   return agent $i^*$ such that $i^* \in \arg\max_{i \in A} s^d_i$
3. If (Agent1.IsProposer) then
4.   make first proposal $p^*_0 \in \arg\max_{p_0} u_1(p_0)$
5. Receive first proposal $p'$ from Agent2
6. Get the feature list $F(d, p')$
7. For each agent $a \in A$
8.   $per^d_a$ = predict performance using $F(d, p')$
9. return agent $i^*$ such that $i^* \in \arg\max_{i \in A} per^d_i$

Fig. 14.1 Simple meta-agent algorithm

Essentially, the algorithm above describes a class of simple meta-agents that depend on which learning method and performance predicting measure is used. The run-time of the algorithm is dominated by the feature selection process, which is polynomial in the size of the bid space in the domain. In practice, this process terminated in less than a second for each domain on a commodity Core i5 computer.
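The selection step of the meta-agent can be sketched in Python as follows. The sketch covers only the decision itself: making the first proposal and computing the feature list are assumed to have been done by the caller. The agent names, scores and the lambda "predictors", which stand in for trained regression models, are hypothetical.

```python
import random
from typing import Callable, Mapping, Sequence

Features = Mapping[str, float]

def select_strategy(domain: str,
                    agents: Sequence[str],
                    known_scores: Mapping[str, Mapping[str, float]],
                    predictors: Mapping[str, Callable[[Features], float]],
                    features: Features,
                    rng: random.Random) -> str:
    """Return the name of the strategy to run in `domain`."""
    if domain in known_scores:                        # known training domain
        scores = known_scores[domain]
        return max(agents, key=lambda a: scores[a])   # best average score
    # Unknown domain: predicted performance of every candidate strategy.
    predicted = {a: predictors[a](features) for a in agents}
    best = max(predicted.values())
    # Highest prediction wins, ties broken at random.
    return rng.choice([a for a in agents if predicted[a] == best])

# Hypothetical agents, training scores and stand-in predictors.
agents = ["AgentX", "AgentY"]
known_scores = {"travel": {"AgentX": 0.61, "AgentY": 0.74}}
predictors = {
    "AgentX": lambda f: 0.5 + 0.1 * f["discount"],
    "AgentY": lambda f: 0.7 - 0.2 * f["discount"],
}
rng = random.Random(0)
print(select_strategy("travel", agents, known_scores, predictors, {"discount": 1.0}, rng))
print(select_strategy("new-domain", agents, known_scores, predictors, {"discount": 1.0}, rng))
```

Swapping the predictors for regression trees, neural networks or linear models yields the different members of the class of meta-agents mentioned above.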

References

1. Lin, R., Kraus, S., Baarslag, T., Tykhonov, D., Hindriks, K.V., Jonker, C.M.: Genius: an integrated environment for supporting the design of generic automated negotiators. Comput. Intell. (2012)

2. Breiman, L., Friedman, J., Stone, C., Olshen, R.: Classification and Regression Trees. Chapman & Hall, New York (1984)

3. Shibata, R.: An optimal selection of regression variables. Biometrika 68(1), 45–54 (1981)

Index

I. Marsa-Maestre et al. (eds.), Novel Insights in Agent-based Complex Automated Negotiation, Studies in Computational Intelligence 535, DOI 10.1007/978-4-431-54758-7, © Springer Japan 2014

A
Acceptance
  condition, 67, 197
  strategy, 45, 167
  threshold, 174
  threshold determination, 175
Adaptive
  concession, 173
  concession-making, 192
  exploitation, 175
  learning, 172
Agent architecture, 63
Agent Based Complex Automated Negotiations, 126
Agent K negotiation strategy, 13
Agent MR, 181
Agent performance evaluation, 168
Agreement quality, 34
Algebraic analysis, 98, 107
Alternating offers protocol, 115
Alternating protocol, 188
ANAC 2011, 73, 74, 163
Analysis
  for competitor strategies, 16
  for conceder strategies, 15
  for matcher strategies, 16
Automated
  bilateral negotiation, 18
  mediator, 57
  multiparty negotiation, 18
  negotiation, 126, 187
Automated negotiating agent competition (ANAC), 126, 181
  negotiation agents, 12
  results, 152
  setup, 152
Average agreements, 121

B
Bargaining, 188
Bell utility function, 34
Bidding based deal identification, 126
Bidding strategy, 66, 164
BOA architecture, 66
  acceptance condition, 67
  advantages, 65
  ANAC agents, 70
  applications, 75
  behavior, 73
  bidding strategy, 66
  components, 69
  decoupling, 69
  dependencies, 69
  equivalence, 73
  Genius, 67
  opponent model, 67
  performance, 74
BOA framework, 164
BOA framework overview, 164, 165
Borda voting strategy, 8
Bundle, 119

C
Characteristics of the opponent, 182
Characteristic value vector, 144
Common design issues, 172
Common issues, 116, 117
Competition results, 152, 157–161
Competition setup, 152
Complex negotiation scenario, 36
Compute concessions, 182
Conceder strategy, 13
Concession
  convexity, 121
  curve patterns, 120
  degree, 183
  degree estimation refinement, 178
  rate, 166
Conflict, 49
Constraints based utility space, 127
Cubic smoothing spline, 191
CUHKAgent, 172
Cumulative distribution of utilities, 37

DDeadline, 113, 120Decision criteria, 128Delegate negotiators, 113Depth heuristic, 47Desperate strategy, 111Details of the domain, 182Discounted domains, 159Discounting factor, 156Discrete wavelet transformation, 190Distance information, 138Distinct object, 110Domain analysis, 165Dynamic coordination strategy, 112Dynamic counteroffer strategy, 117Dynamic multi-threaded negotiations, 111

E
Employer employee negotiation, 130
Estimated received utility, 193
Estimated utilities, 50–51
Experimental setting, 15
Experimental setup, 54–55
Experiment parameters, 36
Experiments and results, 12–13
Explore or exploit, 52
Extensions to ANAC, 161

F
Feature selection, 198
Feedback, 45
Feedback and voting based negotiation protocol, 53
Feedback based negotiation protocol, 44, 45, 51
Final round, 154, 158
First-order differences, 117

G
Generalized pattern search (GPS), 25, 26
Generation strategies, 115
Genius, 10
Geometric analysis, 94–98, 103, 105–107
Geometric method, 105
Global counteroffer value, 114
GPS. See Generalized pattern search (GPS)
Group decision making, 43
Group distance, 34

H
Haggler’s family of negotiation strategies, 12–13
Heuristic, 182
Heuristics for incomplete preferences, 47
Hierarchical clustering, 25, 28
Hill climber, 45
HITS, 138
Human factors, 161
Hypertext Induced Topic Search (IRIDIS), 161

I
Improving flips, 46
Interdependent issues, 126, 161
Intra-team strategy, 6
Issue, 113, 114
Issues counteroffers’ weight matrix, 110, 116
Item frequency update, 177

J
Joint gain search, 24

L
Learning preferences, 45
Linear concession, 194
Linguistic quantifier, 31

M
Matrix data structure, 114
Maximize social welfare, 126
Mediated negotiation, 44–45
Mediation mechanism, 27–34
Min Max Swap algorithm, 118
Modelling contract spaces, 127
Monetary constraint weight, 132

Multi-acceptance criteria (MAC), 76
Multi-issue domain, 154
Multi-issue negotiation, 86, 87, 92–93, 100, 107
Multilateral negotiation, 43
Multilateral negotiation protocol, 51, 54, 57
Multiple-acceptance criteria, 167

N
Nash product, 51
Negotiation
  competition, 152
  domains, 13, 54, 153, 182
  environment, 4
  failure rate, 127
  framework, 113
  object, 110
  performance, 4
  platform, 53–54
  protocol, 127
  quality measures, 168
  setting, 188
  teams, 19
Negotiation strategy, 115, 164
  architecture, 63
  challenge, 172
  components, 63
  effectivity, 62
  modular approach, 62
  space of, 62, 64, 65, 77
Negotiation team definition, 4, 5
Nice tit-for-tat negotiation strategy, 13
Non linear utility space, 127

O
Object, 113
Offer acceptance, 7–8
Offer generation strategy, 113, 114
Offer proposal, 7, 9
One-many negotiation, 161
One-to-many negotiation, 110
Opponent model, 66, 166–167
Opponent model accuracy, 166
Opponent modeling scheme, 189
Opponent preference modeling, 177
Opponent strategy analysis, 164
Opponent utility analysis, 16
Optimistic expectation of opponent concession, 193
Optimized patient, 111
Ordered weighted averaging (OWA), 25, 26, 29, 30
Outcome evaluation, 178
Outcome selection, 178
OWA. See Ordered weighted averaging (OWA)
OWA operators, 29–30

P
Pagerank, 138
Patient strategy, 111
Performance measure, 62
Pessimistic expectation of opponent concession, 194
Possible actions, 188
Preferences, 45, 86, 92, 93, 96, 100–107
  domain, 153
  elicitation, 126
  estimation, 175
  graph, 48
  learning, 195
  modeling, 46
  profiles, 14, 159
Proof-of-concept scenario, 35

Q
Qualifying round, 153, 157
Quantifier-guided aggregation, 29, 30

R
Random counter-offers, 194
Random utility space, 130
Realistic distance between users, 142
Recursive decomposition, 191
Regression analysis, 86–90, 107
Representative strategy, 6–7
Reservation intervals, 114
Reservation value, 115, 156
Results analysis, 158

S
Scalable negotiation protocol, 127
Scoring outcomes, 48
Search, 51
Searching for a bid, 182
Self-interested assumption, 174
Sigmoid function, 183
Simple voting strategy, 6
Simulated annealer, 45

Social welfare, 24, 47, 160
Social welfare optimality, 39
Statistical significance, 156
Strategy, 182
  components, 115
  description, 174
  estimation, 173
Sub super relations, 128
Supervised learning, 199
Synchronized multi-threaded negotiations, 111

T
Target utility calculation, 166
Team member strategy, 7–10
The Negotiator Reloaded (TNR), 163
Three core components of the agent, 189
Time-dependent, 115
Tit-for-tat, 116
Trajectory analysis, 168
Transitivity of preferences, 46

U
Unanimity strategy, 8
Undiscounted domains, 159
User’s geo-location, 147
Using distance information, 142
Utility
  function, 153
  gain, 121
  space, 182
  weight vector, 112

V
Value Of Individual Disapproval (VOID), 33, 39
Voting, 53

W
Weighted average utility, 117
