8/6/2019 emp09-approx-nash
1/38
Algorithms for Computing
Approximate Nash Equilibria
Vangelis Markakis
Athens University of Economics and Business
Outline
Introduction to Games
- The concepts of Nash and ε-Nash equilibrium
Computing approximate Nash equilibria
- A subexponential algorithm for any constant ε > 0
- Polynomial time approximation algorithms
Conclusions
What is Game Theory?
Game Theory aims to help us understand
situations in which decision makers interact
Goals:
Mathematical models for capturing the properties of
such interactions
Prediction (given a model, how should/would a rational agent act?)

Rational agent: when given a choice, the agent always chooses
the option that yields the highest utility
Models of Games
Cooperative or noncooperative
Simultaneous moves or sequential
Finite or infinite
Complete information or incomplete information
In this talk:
Noncooperative games
Simultaneous moves
Finite games
Complete information
Noncooperative Games in Normal Form

The Hawk-Dove game:

                 Column player
  Row player     2, 2     0, 4
                 4, 0    -1, -1
Example 2: The Bach or Stravinsky game (BoS)

    2, 1    0, 0
    0, 0    1, 2
Example 3: A Routing Game

[Figure: two players route traffic from s to t over three parallel links,
with latencies A: 5x, B: 7.5x, C: 10x, where x is the number of players
using the link]
Example 3: A Routing Game

Cost matrix (row player's cost, column player's cost):

           A          B          C
  A     10, 10     5, 7.5     5, 10
  B     7.5, 5     15, 15     7.5, 10
  C     10, 5      10, 7.5    20, 20
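The cost matrix above can be rebuilt mechanically from the edge latencies in the figure. A minimal sketch in Python (the linear-latency model and the coefficients are taken from the figure; each player pays the latency of its own link at the realized load):

```python
# Rebuild the 3x3 cost matrix of the routing game from the latencies
# A: 5x, B: 7.5x, C: 10x, where x is the number of players on the link.
coeff = {"A": 5.0, "B": 7.5, "C": 10.0}
links = ["A", "B", "C"]

def costs(row_link, col_link):
    """Cost pair (row player's cost, column player's cost)."""
    load = lambda link: (row_link == link) + (col_link == link)
    return (coeff[row_link] * load(row_link),
            coeff[col_link] * load(col_link))

matrix = {(r, c): costs(r, c) for r in links for c in links}
print(matrix[("A", "A")])  # (10.0, 10.0): both share link A
print(matrix[("B", "C")])  # (7.5, 10.0): each alone on its own link
```

Since this is a cost (not payoff) matrix, players here want to minimize, not maximize.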
Definitions

2-player game (R, C):
- n available pure strategies for each player
- n x n payoff matrices R, C
- if (i, j) is played, the payoffs are Rij, Cij

Mixed strategy: probability distribution over [n]

Expected payoffs: (x, Ry) for the row player,
(x, Cy) for the column player
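The expected-payoff expressions unfold into a double sum over pure strategy pairs. A short sketch in Python, reusing the Hawk-Dove payoffs from the earlier slide:

```python
# Expected payoff of mixed strategies x, y under payoff matrix M:
# (x, My) = sum_ij x_i * M_ij * y_j
def expected_payoff(M, x, y):
    return sum(x[i] * M[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))

# Hawk-Dove game from the earlier slide
R = [[2, 0], [4, -1]]   # row player's payoffs
C = [[2, 4], [0, -1]]   # column player's payoffs
x = [0.5, 0.5]          # both players mix uniformly
y = [0.5, 0.5]
print(expected_payoff(R, x, y))  # 1.25
print(expected_payoff(C, x, y))  # 1.25
```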
Solution Concept

x*, y* is a Nash equilibrium if no player has a unilateral
incentive to deviate:

(x, Ry*) ≤ (x*, Ry*) for all x
(x*, Cy) ≤ (x*, Cy*) for all y

[Nash, 1951]: Every finite game has a mixed strategy equilibrium.
Proof: Based on Brouwer's fixed point theorem.
(think of it as a steady state)
Solution Concept

x*, y* is a Nash equilibrium if no player has a unilateral
incentive to deviate to a pure strategy:

(xi, Ry*) ≤ (x*, Ry*) for all i
(x*, Cyj) ≤ (x*, Cy*) for all j

It suffices to consider only deviations to pure strategies.
Let xi = (0, 0, ..., 1, 0, ..., 0) be the i-th pure strategy
Example: The Hawk-Dove Game

                 Column player
  Row player     2, 2     0, 4
                 4, 0    -1, -1
Example 2: The Bach or Stravinsky game (BoS)

    2, 1    0, 0
    0, 0    1, 2

3 equilibrium points:
1. (B, B)
2. (S, S)
3. ((2/3, 1/3), (1/3, 2/3))
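A quick numerical check of the third (mixed) equilibrium: against the opponent's equilibrium strategy, each player's two pure strategies yield the same expected payoff, so no unilateral deviation helps. A sketch in Python:

```python
# BoS: verify that ((2/3, 1/3), (1/3, 2/3)) leaves both players indifferent.
R = [[2, 0], [0, 1]]   # row player's payoffs (B = first strategy)
C = [[1, 0], [0, 2]]   # column player's payoffs
x = [2/3, 1/3]
y = [1/3, 2/3]

# expected payoff of each pure strategy against the opponent's mix
row_vs = [sum(R[i][j] * y[j] for j in range(2)) for i in range(2)]
col_vs = [sum(x[i] * C[i][j] for i in range(2)) for j in range(2)]
print(row_vs)  # both pure rows give 2/3 against y
print(col_vs)  # both pure columns give 2/3 against x
```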
Complexity issues

m = 2 players, known algorithms: worst case exponential time
[Kuhn 61, Lemke, Howson 64, Mangasarian 64, Lemke 65]

If NP-hard, then NP = co-NP [Megiddo, Papadimitriou 89]
NP-hard if we add more constraints (e.g. maximize sum of payoffs)
[Gilboa, Zemel 89, Conitzer, Sandholm 03]

Representation problems:
m = 3: there exist games with rational data BUT irrational equilibria
[Nash 51]

PPAD-complete even for m = 2
[Daskalakis, Goldberg, Papadimitriou 06, Chen, Deng, Teng 06]
Poly-time equivalent to: finding approximate fixed points of
continuous maps on convex and compact domains
Approximate Nash Equilibria

Recall definition of Nash eq.:
(x, Ry*) ≤ (x*, Ry*) for all x
(x*, Cy) ≤ (x*, Cy*) for all y

ε-Nash equilibria (incentive to deviate ≤ ε):
(x, Ry*) ≤ (x*, Ry*) + ε for all x
(x*, Cy) ≤ (x*, Cy*) + ε for all y

Normalization: entries of R, C in [0,1]
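The definition translates directly into a checker: the smallest ε for which (x, y) is an ε-Nash equilibrium is the larger of the two players' regrets, i.e. best pure-response payoff minus the payoff actually obtained (pure deviations suffice, as noted earlier). A sketch, with the function name illustrative:

```python
# Smallest eps such that (x, y) is an eps-Nash equilibrium of (R, C),
# with payoffs assumed normalized to [0, 1].
def nash_eps(R, C, x, y):
    n = len(x)
    row_vs = [sum(R[i][j] * y[j] for j in range(n)) for i in range(n)]
    col_vs = [sum(x[i] * C[i][j] for i in range(n)) for j in range(n)]
    row_got = sum(x[i] * row_vs[i] for i in range(n))
    col_got = sum(y[j] * col_vs[j] for j in range(n))
    return max(max(row_vs) - row_got, max(col_vs) - col_got)

# BoS rescaled to [0, 1], at its mixed equilibrium: eps is (numerically) 0
R = [[1, 0], [0, 0.5]]
C = [[0.5, 0], [0, 1]]
print(nash_eps(R, C, [2/3, 1/3], [1/3, 2/3]))
```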
Searching for Approximate Equilibria

[Lipton, M., Mehta 03]: For any ε in (0,1), and for every
k ≥ 9logn/ε², there exists a pair of k-uniform strategies x, y
that form an ε-Nash equilibrium.

Definition: A k-uniform strategy is a strategy where all
probabilities are integer multiples of 1/k
e.g. (3/k, 0, 0, 1/k, 5/k, 0, ..., 6/k)
A Subexponential Algorithm (Quasi-PTAS)

[Lipton, M., Mehta 03]: For any ε in (0,1), and for every
k ≥ 9logn/ε², there exists a pair of k-uniform strategies x, y
that form an ε-Nash equilibrium.

Corollary: We can compute an ε-Nash equilibrium in time n^O(logn/ε²)

Proof: There are n^O(k) pairs of k-uniform strategies to look at.
Verify the ε-equilibrium condition for each pair.
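The brute-force search in the corollary is easy to sketch for toy sizes (the regret helper mirrors the ε-Nash definition; all names are illustrative):

```python
# Enumerate all k-uniform strategy pairs and return the one with the
# smallest incentive to deviate -- n^O(k) pairs, feasible only for toy sizes.
from itertools import combinations_with_replacement

def regret(R, C, x, y):
    n = len(x)
    row_vs = [sum(R[i][j] * y[j] for j in range(n)) for i in range(n)]
    col_vs = [sum(x[i] * C[i][j] for i in range(n)) for j in range(n)]
    return max(max(row_vs) - sum(x[i] * row_vs[i] for i in range(n)),
               max(col_vs) - sum(y[j] * col_vs[j] for j in range(n)))

def k_uniform(n, k):
    # each multiset of k picks from {0..n-1} gives one k-uniform vector
    for picks in combinations_with_replacement(range(n), k):
        v = [0.0] * n
        for i in picks:
            v[i] += 1.0 / k
        yield v

def best_k_uniform_pair(R, C, k):
    n = len(R)
    return min(((x, y) for x in k_uniform(n, k) for y in k_uniform(n, k)),
               key=lambda p: regret(R, C, *p))

# BoS rescaled to [0, 1]: BoS has equilibria that are themselves k-uniform
# (the pure equilibrium (B, B) among them), so the search returns a pair
# with regret 0.
R = [[1, 0], [0, 0.5]]
C = [[0.5, 0], [0, 1]]
x, y = best_k_uniform_pair(R, C, 3)
print(x, y, regret(R, C, x, y))
```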
Proof of Existence

Based on the probabilistic method (sampling)

Let x*, y* be a Nash equilibrium.
- Sample k times from the set of pure strategies of the row player,
  independently, at random, according to x*  →  k-uniform strategy x
- Same for the column player  →  k-uniform strategy y

Suffices to show Pr[x, y form an ε-Nash eq.] > 0
Proof (cont'd)

Enough to consider deviations to pure strategies:
(xi, Ry) ≤ (x, Ry) + ε for all i

(xi, Ry): sum of k random variables with mean (xi, Ry*)

Chernoff-Hoeffding bounds: (xi, Ry) ≈ (xi, Ry*) with high probability

(xi, Ry) ≈ (xi, Ry*) ≤ (x*, Ry*) ≈ (x, Ry)

Finally, when k = Θ(logn/ε²):
Pr[∃ deviation with gain more than ε] < 1
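The sampling step can be simulated directly (a sketch: BoS rescaled to [0,1] and its mixed equilibrium play the role of x*, y*; helper names are illustrative):

```python
# Sample k pure strategies from each player's exact equilibrium strategy;
# the empirical distributions form a k-uniform pair whose incentive to
# deviate shrinks as k grows, as the Chernoff-Hoeffding argument predicts.
import random

def regret(R, C, x, y):
    n = len(x)
    row_vs = [sum(R[i][j] * y[j] for j in range(n)) for i in range(n)]
    col_vs = [sum(x[i] * C[i][j] for i in range(n)) for j in range(n)]
    return max(max(row_vs) - sum(x[i] * row_vs[i] for i in range(n)),
               max(col_vs) - sum(y[j] * col_vs[j] for j in range(n)))

def empirical(dist, k, rng):
    counts = [0] * len(dist)
    for _ in range(k):
        counts[rng.choices(range(len(dist)), weights=dist)[0]] += 1
    return [c / k for c in counts]

rng = random.Random(0)
R = [[1, 0], [0, 0.5]]                     # BoS rescaled to [0, 1]
C = [[0.5, 0], [0, 1]]
x_star, y_star = [2/3, 1/3], [1/3, 2/3]    # exact mixed equilibrium
for k in (10, 100, 1000):
    x, y = empirical(x_star, k, rng), empirical(y_star, k, rng)
    print(k, round(regret(R, C, x, y), 4))
```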
Multi-player Games

For m players, same technique:
- support size: k = O(m² log(m²n)/ε²)
- running time: exp(logn, m, 1/ε)

Previously [Scarf 67]: exp(n, m, log(1/ε))
(fixed point approximation)

[Lipton, M. 04]: exp(n, m) but poly(log(1/ε))
(using algorithms for polynomial equations)
Outline

Introduction to Games
- The concepts of Nash and ε-Nash equilibrium

Computing approximate Nash equilibria
- A subexponential algorithm for any constant ε > 0
- Polynomial time approximation algorithms

Conclusions
Polynomial Time Approximation Algorithms

For ε = 1/2:
- Pick an arbitrary row i
- Let j = best response to i
- Find k = best response to j
- Row player plays i or k with prob. 1/2 each; column player plays j

[Feder, Nazerzadeh, Saberi 07]: For ε < 1/2, we need
support of size at least Ω(logn)
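The three steps above fit in a few lines of Python (a sketch; a regret helper is included so the 1/2 guarantee can be checked, and payoffs are assumed normalized to [0,1]):

```python
import random

# The eps = 1/2 algorithm: pick any row i, let j be the column player's
# best response to i, let k be the row player's best response to j, and
# let the row player mix i and k with probability 1/2 each (column plays j).
def half_approx(R, C, i=0):
    n = len(R)
    j = max(range(n), key=lambda c: C[i][c])   # column's best response to i
    k = max(range(n), key=lambda r: R[r][j])   # row's best response to j
    x = [0.0] * n
    x[i] += 0.5
    x[k] += 0.5
    y = [0.0] * n
    y[j] = 1.0
    return x, y

def regret(R, C, x, y):
    n = len(x)
    row_vs = [sum(R[i][j] * y[j] for j in range(n)) for i in range(n)]
    col_vs = [sum(x[i] * C[i][j] for i in range(n)) for j in range(n)]
    return max(max(row_vs) - sum(x[i] * row_vs[i] for i in range(n)),
               max(col_vs) - sum(y[j] * col_vs[j] for j in range(n)))

rng = random.Random(1)
R = [[rng.random() for _ in range(4)] for _ in range(4)]
C = [[rng.random() for _ in range(4)] for _ in range(4)]
x, y = half_approx(R, C)
print(regret(R, C, x, y) <= 0.5)  # True: the 1/2 guarantee holds
```

Against the pure column j, the row player's regret is at most (R[k][j] - R[i][j])/2 ≤ 1/2, and a symmetric argument bounds the column player's regret.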
P l i l Ti A i ti
Polynomial Time Approximation Algorithms

Daskalakis, Mehta, Papadimitriou (EC 07):
in P for ε = 1 - 1/φ = (3 - √5)/2 ≈ 0.382 (φ = golden ratio)
- based on sampling + Linear Programming
- need to solve a polynomial number of linear programs

Bosse, Byrka, M. (WINE 07): a different LP-based method
1. Algorithm 1: 1 - 1/φ
2. Algorithm 2: 0.364
Running time: need to solve one linear program
Approach

0-sum games: games of the form (R, -R)

Fact: 0-sum games can be solved in polynomial time
(equivalent to linear programming)

- Start with an equilibrium of the 0-sum game (R - C, C - R)
- If incentives to deviate are high, the players take turns and
  adjust their strategies via best response moves

Similar idea used in [Kontogiannis, Spirakis 07]
for a different notion of approximation
Algorithm 1

Parameters: α, δ2 ∈ [0,1]

1. Find an equilibrium x*, y* of the 0-sum game (R - C, C - R)
2. Let g1, g2 be the incentives to deviate for the row and column
   player respectively. Suppose g1 ≥ g2 (w.l.o.g.)
3. If g1 ≤ α, output x*, y*
4. Else: let b1 = best response to y*, b2 = best response to b1
5. Output:
   x = b1
   y = (1 - δ2)y* + δ2·b2

Theorem: Algorithm 1 with α = 1 - 1/φ and δ2 = (1 - g1)/(2 - g1)
achieves a (1 - 1/φ)-approximation
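A minimal end-to-end sketch of Algorithm 1 on 2x2 games, in Python. Two liberties are taken for brevity: the zero-sum step uses a fine grid over mixed strategies instead of the linear program, and only the g1 ≥ g2 case is handled (otherwise swap the players' roles); φ is the golden ratio.

```python
# Algorithm 1 (sketch, 2x2 games): start from a zero-sum equilibrium of
# (R - C, C - R); if the larger incentive to deviate g1 exceeds alpha,
# the row player jumps to its best response b1 and the column player
# shifts weight delta2 toward its best response b2 to b1.
PHI = (1 + 5 ** 0.5) / 2
ALPHA = 1 - 1 / PHI                      # = (3 - sqrt(5))/2 ~ 0.382
PURES = ((1.0, 0.0), (0.0, 1.0))

def payoff(M, x, y):
    return sum(x[i] * M[i][j] * y[j] for i in range(2) for j in range(2))

def zero_sum_eq(D, steps=2000):
    # grid approximation of the maximin/minimax strategies of the
    # zero-sum game with row-player matrix D (stand-in for the LP)
    grid = [(t / steps, 1 - t / steps) for t in range(steps + 1)]
    xs = max(grid, key=lambda x: min(payoff(D, x, e) for e in PURES))
    ys = min(grid, key=lambda y: max(payoff(D, e, y) for e in PURES))
    return xs, ys

def algorithm1(R, C):
    D = [[R[i][j] - C[i][j] for j in range(2)] for i in range(2)]
    xs, ys = zero_sum_eq(D)
    g1 = max(payoff(R, e, ys) for e in PURES) - payoff(R, xs, ys)
    if g1 <= ALPHA:                      # Case 1: output the 0-sum eq.
        return xs, ys
    b1 = max(PURES, key=lambda e: payoff(R, e, ys))
    b2 = max(PURES, key=lambda e: payoff(C, b1, e))
    d2 = (1 - g1) / (2 - g1)
    return b1, tuple((1 - d2) * ys[j] + d2 * b2[j] for j in range(2))

# Hawk-Dove rescaled to [0, 1]: here g1 = 0.2 <= alpha, so the zero-sum
# equilibrium (Hawk, Hawk) is returned directly.
R = [[0.6, 0.2], [1.0, 0.0]]
C = [[0.6, 1.0], [0.2, 0.0]]
print(algorithm1(R, C))
```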
Analysis of Algorithm 1

Why start with an equilibrium of (R - C, C - R)?
Intuition: if the row player profits from a deviation from x*, then
the column player also gains at least as much

Case 1: g1 ≤ α  →  α-approximation

Case 2: g1 > α. Incentive to deviate:
- for the row player: ≤ δ2
- for the column player: ≤ (1 - δ2)(1 - (b1, Cy*))
  ≤ (1 - δ2)(1 - g1) = (1 - g1)/(2 - g1)

→ max{α, (1 - α)/(2 - α)}-approximation
Towards a better algorithm

Recall Algorithm 1:
1. Find an equilibrium x*, y* of the 0-sum game (R - C, C - R)
2. Let g1, g2 be the incentives to deviate for the row and column
   player respectively. Suppose g1 ≥ g2 (w.l.o.g.)
3. If g1 ≤ α, output x*, y*
4. Else: let b1 = best response to y*, b2 = best response to b1
5. Output:
   x = b1
   y = (1 - δ2)y* + δ2·b2
Algorithm 2

1. Find an equilibrium x*, y* of the 0-sum game (R - C, C - R)
2. Let g1, g2 be the incentives to deviate for the row and column
   player respectively. Suppose g1 ≥ g2 (w.l.o.g.)
3. If g1 ∈ [0, 1/3], output x*, y*
4. If g1 ∈ (1/3, β] (β: a threshold parameter):
   - let r1 = best response to y*, x = (1 - δ1)x* + δ1·r1
   - let b2 = best response to x, y = (1 - δ2)y* + δ2·b2
5. If g1 ∈ (β, 1], output:
   x = r1
   y = (1 - δ2)y* + δ2·b2
Analysis of Algorithm 2
(reducing to an optimization question)

- Let h = (x*, Cb2) - (x*, Cy*)
- We set δ2 so as to equalize the players' incentives to deviate

Theorem: The approximation guarantee of Algorithm 2 is ≈ 0.364
Analysis of Algorithm 2 (solution)

Optimization yields the guarantee ≈ 0.364
Analysis: tight example

           0, 0      ε, ε      ε, ε
(R, C) =   ε, ε      0, 1      1, 1/2
           ε, ε      1, 1/2    0, 1
Remarks and Open Problems

Spirakis, Tsaknakis (WINE 07): currently best approximation of 0.339,
yet another LP-based method

Polynomial Time Approximation Scheme (PTAS)?
Yes if:
- rank(R) = O(1) and rank(C) = O(1) [Lipton, M., Mehta 03]
- rank(R+C) = O(1) [Kannan, Theobald 06]

PPAD-complete for ε = 1/n [Chen, Deng, Teng 06]
Other Notions of Approximation

ε-well-supported equilibria: every strategy in the support is an
approximate best response
[Kontogiannis, Spirakis 07]: 0.658-approximation, based also on
solving 0-sum games

Strong approximation: output is geometrically close to an exact
Nash equilibrium
[Etessami, Yannakakis 07]: mostly negative results
Thank You!