Simple Search Methods for Finding a Nash Equilibrium
Ryan Porter, Eugene Nudelman, & Yoav Shoham
Computer Science Department
Stanford University
Finding a Sample Nash Equilibrium
Nash equilibrium (NE)
  Arguably the most important concept in game theory
  One always exists [N51]
Finding a sample NE in a normal form game:
  Considered hard, but unknown whether it is NP-hard
  State of the art among existing algorithms:
    Lemke-Howson [LH64]
    Simplicial Subdivision [VV87]
    Govindan-Wilson [GW03] & [BSK03]
Our algorithms: simple Artificial Intelligence methods that perform well in practice
2-player games
N-player games
Notation
Normal Form Game $G = \langle N, (A_i), (u_i) \rangle$:
  $N = \{1, \ldots, n\}$: set of players
  $A_i$: set of available actions for player $i$
  $u_i : A_1 \times \cdots \times A_n \to \mathbb{R}$
Player $i$ selects a mixed strategy:
  $p_i : A_i \to [0,1]$, s.t. $\sum_{a_i \in A_i} p_i(a_i) = 1$
Utility function extended to take $p = (p_1, \ldots, p_n)$:
  $u_i(p) = \sum_{a \in A} u_i(a) \prod_{j \in N} p_j(a_j)$
A strategy profile $p^*$ is a NE if:
  $\forall i \in N,\ a_i \in A_i:\ u_i(a_i, p^*_{-i}) \le u_i(p^*_i, p^*_{-i})$
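Example (row player's probabilities on the left, column player's on top):
            1/2      1/2
  1/2       1,-1     -1,1
  1/2       -1,1     1,-1
Against these strategies, every action of either player has expected utility 0.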
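The definitions above translate almost directly into code. The following is a minimal sketch, not the authors' implementation: the function names `expected_utility` and `is_nash` and the dictionary representation of $u_i$ are assumptions made for illustration. It evaluates $u_i(p)$ by summing over all pure action profiles and tests the NE condition by trying every pure deviation.

```python
from fractions import Fraction
from itertools import product

def expected_utility(u_i, p):
    """u_i maps a pure action profile (tuple of action indices) to player i's
    payoff; p holds one list of probabilities per player."""
    total = Fraction(0)
    for a in product(*(range(len(p_j)) for p_j in p)):
        prob = Fraction(1)
        for j, a_j in enumerate(a):
            prob *= p[j][a_j]          # product of p_j(a_j) over all players
        total += prob * u_i[a]
    return total

def is_nash(u, p):
    """True if no player gains by deviating to any pure action."""
    for i, u_i in enumerate(u):
        current = expected_utility(u_i, p)
        for a_i in range(len(p[i])):
            pure = [Fraction(int(k == a_i)) for k in range(len(p[i]))]
            deviation = p[:i] + [pure] + p[i + 1:]
            if expected_utility(u_i, deviation) > current:
                return False
    return True

# The 2x2 example: row payoffs 1, -1 / -1, 1; the column player gets the negative.
u_row = {(0, 0): 1, (0, 1): -1, (1, 0): -1, (1, 1): 1}
u_col = {a: -x for a, x in u_row.items()}
half = Fraction(1, 2)
p_star = [[half, half], [half, half]]
print(is_nash([u_row, u_col], p_star))   # True
print(expected_utility(u_row, p_star))   # 0
```

The enumeration is exponential in the number of players, which is acceptable for checking a candidate profile on small games but is not how one would search for an equilibrium.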
A Harder Game
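          c1      c2      c3      c4      c5
  r1      2,3     -1,4    2,4     5,2     1,-1
  r2      2,2     3,0     4,1     -2,4    1,3
  r3      4,6     7,2     2,-2    4,9     2,1
  r4      9,0     -2,6    6,3     7,0     0,5
  r5      3,2     6,1     2,5     5,3     1,0
NE: the row player plays r3, r4, r5 with probabilities 2/11, 4/11, 5/11; the column player plays c2, c3, c4 with probabilities 2/7, 3/7, 2/7; all other actions have probability 0.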
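The equilibrium of this game can be checked by hand or with a few lines of exact arithmetic. The sketch below assumes the assignment of the listed probabilities to actions under which the equilibrium conditions hold for the matrix as printed (row weights 2/11, 4/11, 5/11 on rows 3 to 5; column weights 2/7, 3/7, 2/7 on columns 2 to 4). The helper name `supported_actions_are_best` is hypothetical.

```python
from fractions import Fraction as F

# Payoffs read off the 5x5 game: U1 for the row player, U2 for the column player.
U1 = [[2, -1, 2, 5, 1],
      [2, 3, 4, -2, 1],
      [4, 7, 2, 4, 2],
      [9, -2, 6, 7, 0],
      [3, 6, 2, 5, 1]]
U2 = [[3, 4, 4, 2, -1],
      [2, 0, 1, 4, 3],
      [6, 2, -2, 9, 1],
      [0, 6, 3, 0, 5],
      [2, 1, 5, 3, 0]]

# Assumed assignment of the listed probabilities to actions.
p1 = [F(0), F(0), F(2, 11), F(4, 11), F(5, 11)]  # rows 1..5
p2 = [F(0), F(2, 7), F(3, 7), F(2, 7), F(0)]     # columns 1..5

# Expected utility of each pure action against the opponent's mixed strategy.
row_values = [sum(U1[r][c] * p2[c] for c in range(5)) for r in range(5)]
col_values = [sum(U2[r][c] * p1[r] for r in range(5)) for c in range(5)]

def supported_actions_are_best(values, strategy):
    """Every action played with positive probability attains the maximum."""
    best = max(values)
    return all(v == best for v, prob in zip(values, strategy) if prob > 0)

print([str(v) for v in row_values])  # ['2', '2', '4', '4', '4']
print([str(v) for v in col_values])  # ['2', '3', '3', '3', '2']
print(supported_actions_are_best(row_values, p1),
      supported_actions_are_best(col_values, p2))  # True True
```

Each supported action earns the same value (4 for the row player, 3 for the column player) and every unsupported action earns strictly less, which is exactly the NE condition.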
Searching Over Supports
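Feasibility Problem:
  Input: $S = (S_1, \ldots, S_n)$, where $\forall i \in N,\ S_i \subseteq A_i$
  Find: $p = (p_1, \ldots, p_n)$ and $v = (v_1, \ldots, v_n)$
  Subject to, $\forall i \in N$:
    $\forall a_i \in S_i:\ p_i(a_i) \ge 0$
    $\forall a_i \notin S_i:\ p_i(a_i) = 0$
    $\sum_{a_i \in A_i} p_i(a_i) = 1$
    $\forall a_i \in S_i:\ \sum_{a_{-i} \in A_{-i}} u_i(a_i, a_{-i}) \prod_{j \ne i} p_j(a_j) = v_i$
    $\forall a_i \notin S_i:\ \sum_{a_{-i} \in A_{-i}} u_i(a_i, a_{-i}) \prod_{j \ne i} p_j(a_j) \le v_i$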
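Example (the 5x5 game above, with each action's probability and expected utility at the NE):
          c1      c2      c3      c4      c5      prob.   utility
  r1      2,3     -1,4    2,4     5,2     1,-1    0       2
  r2      2,2     3,0     4,1     -2,4    1,3     0       2
  r3      4,6     7,2     2,-2    4,9     2,1     2/11    4
  r4      9,0     -2,6    6,3     7,0     0,5     4/11    4
  r5      3,2     6,1     2,5     5,3     1,0     5/11    4
  prob.   0       2/7     3/7     2/7     0
  utility 2       3       3       3       2
The supports are $S_1 = \{r3, r4, r5\}$ and $S_2 = \{c2, c3, c4\}$, with $v_1 = 4$ and $v_2 = 3$.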
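For two players the product over opponents collapses to a single factor, so every constraint of the feasibility problem is linear in $(p, v)$ and the problem can be handed to a linear programming solver. As a worked instance, here are the constraints for player 1 (those for player 2 are symmetric, with $p_1$ and $v_2$). The sums run over $S_2$ only because $p_2$ is zero elsewhere.

```latex
\begin{align*}
\forall a_1 \in S_1:\quad
  & \sum_{a_2 \in S_2} u_1(a_1, a_2)\, p_2(a_2) = v_1 \\
\forall a_1 \notin S_1:\quad
  & \sum_{a_2 \in S_2} u_1(a_1, a_2)\, p_2(a_2) \le v_1 \\
  & \sum_{a_2 \in S_2} p_2(a_2) = 1, \qquad
    p_2(a_2) \ge 0 \quad \forall a_2 \in S_2
\end{align*}
```

With three or more players the products $\prod_{j \ne i} p_j(a_j)$ contain at least two unknowns each, so the constraints are no longer linear.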
Features of Algorithm
1) Prefer balanced supports
2) Prefer small supports
   Motivated by existing theoretical results for particular distributions (e.g., [MB02])
3) Separately instantiate supports, and remove conditionally dominated actions:
   An $a_i$ is conditionally dominated, given $R_{-i} \subseteq A_{-i}$, if:
     $\exists a_i' \in A_i,\ \forall a_{-i} \in R_{-i}:\ u_i(a_i, a_{-i}) < u_i(a_i', a_{-i})$
   Especially useful in conjunction with (2)
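The conditional dominance test is a direct transcription of its definition. The sketch below is illustrative only: the function name `conditionally_dominated` and the callable payoff interface `u_i(a_i, a_minus_i)` are assumptions. It is exercised on the row player's payoffs from the 5x5 example game.

```python
def conditionally_dominated(u_i, a_i, own_actions, opp_profiles):
    """True if some other action of player i does strictly better than a_i
    against every opponent profile in opp_profiles (the set R_-i)."""
    return any(
        all(u_i(a_i, o) < u_i(alt, o) for o in opp_profiles)
        for alt in own_actions
        if alt != a_i
    )

# Row player's payoffs in the 5x5 example game (0-based indices).
U1 = [[2, -1, 2, 5, 1],
      [2, 3, 4, -2, 1],
      [4, 7, 2, 4, 2],
      [9, -2, 6, 7, 0],
      [3, 6, 2, 5, 1]]

def u_row(r, c):
    return U1[r][c]

rows = range(5)
# If the column player is restricted to the first column, the fourth row
# (payoff 9) strictly beats every other row.
print([r for r in rows if conditionally_dominated(u_row, r, rows, [0])])
# [0, 1, 2, 4]
# Against columns 2 to 4 no row is conditionally dominated.
print([r for r in rows if conditionally_dominated(u_row, r, rows, [1, 2, 3])])
# []
```

The smaller the opponents' support, the more actions the test removes, which is why the pruning combines well with a preference for small supports.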
Two-Player Algorithm
FOR ALL $x = (x_1, x_2)$, sorted in increasing order of, first, $|x_1 - x_2|$ and, second, $(x_1 + x_2)$
  FOR ALL $S_1 \subseteq A_1$ s.t. $|S_1| = x_1$
    $A_2' \leftarrow \{a_2 \in A_2 : a_2 \text{ not conditionally dominated, given } S_1\}$
    IF $\nexists a_1 \in S_1$ conditionally dominated, given $A_2'$
      FOR ALL $S_2 \subseteq A_2'$ s.t. $|S_2| = x_2$
        IF $\nexists a_1 \in S_1$ conditionally dominated, given $S_2$
          IF Feasibility Problem satisfied for $(S_1, S_2)$
            Return found NE $p$
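The loop structure above can be sketched in Python as follows. This is not the authors' code, and it swaps in a simplified feasibility test: instead of a linear program, it solves the indifference equations exactly and only for equal-sized supports, which suffices for nondegenerate games but can miss equilibria in degenerate ones. All function names (`solve_square`, `indifference_mix`, `dominated`, `find_ne`) are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def solve_square(M, b):
    """Exact Gauss-Jordan elimination; returns None when M is singular."""
    n = len(M)
    A = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(M, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if A[r][col] != 0), None)
        if pivot is None:
            return None
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [x / A[col][col] for x in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                A[r] = [x - A[r][col] * y for x, y in zip(A[r], A[col])]
    return [row[n] for row in A]

def indifference_mix(U, S_own, S_opp):
    """Opponent mix over S_opp (with value v) making all of S_own pay exactly v."""
    if len(S_own) != len(S_opp):
        return None  # simplification: only square systems are attempted
    M = [[U[a][o] for o in S_opp] + [-1] for a in S_own] + [[1] * len(S_opp) + [0]]
    sol = solve_square(M, [0] * len(S_own) + [1])
    if sol is None or any(x < 0 for x in sol[:-1]):
        return None
    mix, v = dict(zip(S_opp, sol[:-1])), sol[-1]
    # No action, inside or outside the support, may earn more than v.
    if any(sum(U[a][o] * mix[o] for o in S_opp) > v for a in range(len(U))):
        return None
    return mix

def dominated(U, a, opp):
    """a is conditionally dominated, given the opponent actions in opp."""
    return any(all(U[a][o] < U[b][o] for o in opp) for b in range(len(U)) if b != a)

def find_ne(U1, U2):
    n1, n2 = len(U1), len(U1[0])
    U2T = [[U2[r][c] for r in range(n1)] for c in range(n2)]  # column player's view
    sizes = sorted(((x1, x2) for x1 in range(1, n1 + 1) for x2 in range(1, n2 + 1)),
                   key=lambda x: (abs(x[0] - x[1]), x[0] + x[1]))
    for x1, x2 in sizes:
        for S1 in combinations(range(n1), x1):
            A2p = [c for c in range(n2) if not dominated(U2T, c, S1)]
            if any(dominated(U1, r, A2p) for r in S1):
                continue
            for S2 in combinations(A2p, x2):
                if any(dominated(U1, r, S2) for r in S1):
                    continue
                q = indifference_mix(U1, S1, S2)   # column player's strategy
                p = indifference_mix(U2T, S2, S1)  # row player's strategy
                if p is not None and q is not None:
                    return p, q
    return None

# The 5x5 example game (0-based indices).
U1 = [[2, -1, 2, 5, 1], [2, 3, 4, -2, 1], [4, 7, 2, 4, 2],
      [9, -2, 6, 7, 0], [3, 6, 2, 5, 1]]
U2 = [[3, 4, 4, 2, -1], [2, 0, 1, 4, 3], [6, 2, -2, 9, 1],
      [0, 6, 3, 0, 5], [2, 1, 5, 3, 0]]
p, q = find_ne(U1, U2)
print({r: str(x) for r, x in p.items()})  # {2: '2/11', 3: '4/11', 4: '5/11'}
print({c: str(x) for c, x in q.items()})  # {1: '2/7', 2: '3/7', 3: '2/7'}
```

Exact rational arithmetic keeps the equality tests reliable on small examples; a practical implementation would instead call an LP solver, which also handles unequal support sizes.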
N-Player Algorithm
Constraint Satisfaction Problem (CSP) for each support size profile $x = (x_1, x_2)$:
  Variables: $S_i$
  Domain: all subsets of $A_i$ of size $x_i$
  Constraint: support profile $S$ is consistent with a NE
2-player algorithm:
  Backtracking, enforcing arc consistency w.r.t. the weaker constraint that there are no conditionally dominated actions in $S$
N-player algorithm:
  Generalizes the 2-player algorithm
  Ordering of size and balance reversed
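The reversed ordering can be made concrete with a small generator of support size profiles. This sketch is an illustration under stated assumptions: the function name `size_profiles` is hypothetical, and imbalance is measured as the largest size minus the smallest, which coincides with $|x_1 - x_2|$ for two players but is an assumption for more players since the slide does not spell it out.

```python
from itertools import product

def size_profiles(action_counts, size_first):
    """All support size profiles, one size per player, in search order.

    size_first=False: balance first, then total size (2-player ordering).
    size_first=True:  total size first, then balance (N-player ordering).
    """
    def total(x):
        return sum(x)

    def imbalance(x):
        return max(x) - min(x)  # assumed balance measure for n > 2

    profiles = product(*(range(1, m + 1) for m in action_counts))
    if size_first:
        return sorted(profiles, key=lambda x: (total(x), imbalance(x)))
    return sorted(profiles, key=lambda x: (imbalance(x), total(x)))

print(size_profiles([3, 3], size_first=False)[:4])
# [(1, 1), (2, 2), (3, 3), (1, 2)]
print(size_profiles([2, 2, 2], size_first=True)[:4])
# [(1, 1, 1), (1, 1, 2), (1, 2, 1), (2, 1, 1)]
```

Under the 2-player ordering every balanced profile is tried before any unbalanced one, whereas under the N-player ordering all profiles of total size 4 precede any of total size 5.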
Experimental Results
Most previous empirical tests only on “random” games:
  Each payoff drawn independently from uniform distribution
GAMUT [NWSL04]
  Based on extensive literature search
  Generates games from a wide variety of distributions
  Available at http://gamut.stanford.edu
D1   Bertrand Oligopoly                                   D2   Bidirectional LEG, Complete Graph
D3   Bidirectional LEG, Random Graph                      D4   Bidirectional LEG, Star Graph
D5   Covariance Game: $\rho = 0.9$                        D6   Covariance Game: $\rho = 0$
D7   Covariance Game: Random $\rho \in [-1/(N-1), 1]$     D8   Dispersion Game
D9   Graphical Game, Random Graph                         D10  Graphical Game, Road Graph
D11  Graphical Game, Star Graph                           D12  Location Game
D13  Minimum Effort Game                                  D14  Polymatrix Game, Random Graph
D15  Polymatrix Game, Road Graph                          D16  Polymatrix Game, Small-World Graph
D17  Random Game                                          D18  Traveler’s Dilemma
D19  Uniform LEG, Complete Graph                          D20  Uniform LEG, Random Graph
D21  Uniform LEG, Star Graph                              D22  War Of Attrition
2-player Games
Tested on 100 2-player, 300-action games for each of 22 distributions
Capped all runs at 1800s
[Figure: running time in seconds (log scale, 0.01 to 10000) for each distribution; series: Algorithm 1, Lemke-Howson]
2-player Games: Scaling
[Figure: running time in seconds (log scale, 1 to 10000) vs. number of actions (400 to 1000); series: Algorithm 1, Lemke-Howson]
2-player Games: Covariance Games
Covariance Games: For each action profile, payoffs of all players drawn from a multivariate normal distribution, with identical covariance between any two players
[Figure: running time in seconds (log scale, 0.01 to 10000) vs. covariance (-1 to 1)]
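To make the definition of a covariance game concrete, the following sketch samples one without any linear algebra library. It is not GAMUT's implementation. It additionally assumes zero mean and unit variance for every payoff, which the slide does not state, and the names `mixing_coefficients` and `covariance_game` are hypothetical. Each payoff is built as $x_i = \alpha z_i + \beta \sum_j z_j$ from independent standard normals $z_j$, with $\alpha$ and $\beta$ chosen so that every pair of players has covariance $\rho$.

```python
import math
import random
from itertools import product

def mixing_coefficients(n, rho):
    """alpha, beta such that x_i = alpha*z_i + beta*sum(z) has unit variance
    and pairwise covariance rho; needs n >= 2 and -1/(n-1) <= rho <= 1."""
    if n < 2 or not -1.0 / (n - 1) <= rho <= 1.0:
        raise ValueError("need n >= 2 and -1/(n-1) <= rho <= 1")
    alpha = math.sqrt(1.0 - rho)
    # max() guards against tiny negative values from rounding at the boundary.
    beta = (math.sqrt(max(0.0, 1.0 + (n - 1) * rho)) - alpha) / n
    return alpha, beta

def covariance_game(action_counts, rho, rng=random):
    """Map each pure action profile to a tuple of payoffs, one per player."""
    n = len(action_counts)
    alpha, beta = mixing_coefficients(n, rho)
    payoffs = {}
    for a in product(*(range(m) for m in action_counts)):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        s = sum(z)
        payoffs[a] = tuple(alpha * z_i + beta * s for z_i in z)
    return payoffs

game = covariance_game([3, 3], rho=-1.0, rng=random.Random(0))
# With two players and rho = -1 the payoffs in every cell cancel out.
print(all(abs(x + y) < 1e-9 for x, y in game.values()))  # True
```

The lower bound $-1/(n-1)$ is the smallest common covariance for which the covariance matrix stays positive semidefinite; with two players it reaches $-1$, which gives a zero-sum game.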
N-player Games
Tested on 100 6-player, 5-action games for each distribution
[Figure: running time in seconds (log scale, 0.001 to 10000) for each distribution; series: Algorithm 2, Simplicial Subdivision, Govindan-Wilson]
N-player Games: Scaling
[Figure, two panels; series: Algorithm 2, Simplicial Subdivision, Govindan-Wilson]
  6-Action, Random Games: running time in seconds (log scale, 0.01 to 10000) vs. number of players (3 to 8)
  5-Player, Random Games: running time in seconds (log scale, 0.1 to 10000) vs. number of actions (3 to 8)
N-player Games: Covariance Games
[Figure: running time in seconds (log scale, 0.01 to 10000) vs. covariance (-0.2 to 1)]
BFS Lemke-Howson
Lemke-Howson algorithm:
  Pivoting method to solve LCP for a 2-player game
  First pivot is an arbitrary selection of $a_1 \in A_1$
  Afterwards, a deterministic path to a NE
Idea: favor “simple” solutions
Breadth-First Search:
  FOR ALL $a_1 \in A_1$
    Initialize Lemke-Howson($a_1$)
  REPEAT
    FOR ALL $a_1 \in A_1$
      Pivot Lemke-Howson($a_1$)
      IF found a NE, THEN return $p$
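The breadth-first scheme is a round-robin over independent pivoting runs. The sketch below shows only that scheduling logic and does not implement Lemke-Howson itself: each run is modelled as a Python generator that yields `None` after a pivot that has not reached an equilibrium and yields a result once it has. The names `bfs_interleave` and `fake_run` are hypothetical, and `fake_run` is a stand-in that simply counts pivots.

```python
def bfs_interleave(runs):
    """Advance every run by one pivot per round; return the first result found."""
    active = list(runs)
    while active:
        still_running = []
        for run in active:
            try:
                result = next(run)   # one pivot of this run
            except StopIteration:
                continue             # run ended without a result
            if result is not None:
                return result
            still_running.append(run)
        active = still_running
    return None

def fake_run(start_action, pivots_needed):
    """Stand-in for one Lemke-Howson run started from start_action."""
    for _ in range(pivots_needed - 1):
        yield None
    yield ("equilibrium reached from start action", start_action)

# Three starting actions whose paths need 5, 2 and 7 pivots respectively.
runs = [fake_run(0, 5), fake_run(1, 2), fake_run(2, 7)]
print(bfs_interleave(runs))  # ('equilibrium reached from start action', 1)
```

If the shortest path needs $k$ pivots, the scheme performs at most $k \cdot |A_1|$ pivots in total, so one short path is enough to finish quickly whichever starting action it belongs to.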
2-player “Random” Games
[Figure: bar chart of running time in seconds (log scale, 0.1 to 1000); series: Algorithm 1, Lemke-Howson, BFS Lemke-Howson; bar labels as extracted: 1.18, 208, 1.18]
2-player Games: Covariance Games
[Figure: running time in seconds (log scale, 0.01 to 10000) vs. covariance (-1 to 1)]
Summary
CSP-based algorithms
  Heuristics:
    Favor balanced and small supports
    Eliminate conditionally dominated strategies
  Perform well in practice
BFS Lemke-Howson
  In preliminary results, performs even better than our 2-player algorithm
Commentary on problem:
  Games that researchers care about tend to have at least one “simple” solution
Future Work
Coming to Gambit
Focus on “Covariance” Games, with low covariance
Other techniques from Artificial Intelligence
  Local Search:
    State: support profile
    Operators: add or delete an action
    Score: based on relaxation of the feasibility problem