Learning in Games
Scott E. Page, Russell Golman & Jenna Bednar
University of Michigan-Ann Arbor
Overview
Analytic and Computational Models
Nash Equilibrium, Stability, Basins of Attraction
Cultural Learning
Agent Models
Domains of Interest
Analytic
Computable
Learning/Adaptation
Diversity
Interactions/Epistasis
Networks/Geography
Equilibrium Science
We can start by looking at the role that learning rules play in equilibrium systems. This will give us some insight into whether they’ll matter in complex systems.
Players
Actions
Cooperate: C Defect: D
Payoffs
        C      D
C      4,4    0,6
D      6,0    2,2
Best Responses
        C      D
C      4,4    0,6
D      6,0    2,2
For both players, D is the best response to C (6 > 4) and to D (2 > 0).
Nash Equilibrium
        C      D
C      4,4    0,6
D      6,0    2,2
Nash equilibrium: (D, D), with payoffs 2,2
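The best-response check behind this equilibrium can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and the payoffs are copied from the slide.

```python
from itertools import product

ACTIONS = ["C", "D"]
# Payoffs from the slide: (row action, column action) -> (row payoff, column payoff).
PAYOFFS = {
    ("C", "C"): (4, 4), ("C", "D"): (0, 6),
    ("D", "C"): (6, 0), ("D", "D"): (2, 2),
}

def best_responses(player, opponent_action):
    """Actions maximizing `player`'s payoff (0 = row, 1 = column) against a fixed opponent action."""
    def payoff(own):
        profile = (own, opponent_action) if player == 0 else (opponent_action, own)
        return PAYOFFS[profile][player]
    top = max(payoff(a) for a in ACTIONS)
    return [a for a in ACTIONS if payoff(a) == top]

def pure_nash_equilibria():
    """Profiles in which each player's action is a best response to the other's."""
    return [
        (r, c) for r, c in product(ACTIONS, ACTIONS)
        if r in best_responses(0, c) and c in best_responses(1, r)
    ]

print(pure_nash_equilibria())  # [('D', 'D')]
```

Defect is a best response to either action, so (D, D) is the only profile that passes the test.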
“Equilibrium” Based Science
Step 1: Set up game
Step 2: Solve for equilibrium
Step 3: Show how equilibrium depends on parameters of model
Step 4: Provide empirical support
Is Equilibrium Enough?
Existence: Equilibrium exists
Stability: Equilibrium is stable
Attainable: Equilibrium is attained by a learning rule.
Examples
Best respond to current state
Better respond
Mimic best
Mimic better
Include portions of best or better
Random with death of the unfit
Stability
Stability can only be defined relative to a learning dynamic. In dynamical systems, we often take that dynamic to be a best response function, but with human actors we need not assume people best respond.
Battle of Sexes Game
        EF     CG
EF     3,1    0,0
CG     0,0    1,3
Three Equilibria
             EF (1/4)   CG (3/4)
EF (3/4)       3,1        0,0
CG (1/4)       0,0        1,3
Unstable Mixed?
        EF (1/4 + e)   CG (3/4 - e)   Player 1's expected payoff
EF          3,1            0,0            3/4 + 3e
CG          0,0            1,3            3/4 - e
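The indifference conditions and the effect of a small tremble can be checked with exact fractions. This is a sketch: the function names and the tremble size e = 1/100 are assumptions made for illustration, and the payoffs are those of the Battle of the Sexes matrix.

```python
from fractions import Fraction

def player1_payoffs(q_ef):
    """Player 1's expected payoffs (EF, CG) when Player 2 plays EF with probability q_ef."""
    return 3 * q_ef, 1 * (1 - q_ef)

def player2_payoffs(p_ef):
    """Player 2's expected payoffs (EF, CG) when Player 1 plays EF with probability p_ef."""
    return 1 * p_ef, 3 * (1 - p_ef)

p_star, q_star = Fraction(3, 4), Fraction(1, 4)
print(player1_payoffs(q_star))   # both entries equal: Player 1 is indifferent
print(player2_payoffs(p_star))   # both entries equal: Player 2 is indifferent

e = Fraction(1, 100)             # hypothetical size of Player 2's tremble toward EF
ef, cg = player1_payoffs(q_star + e)
print(ef - cg)                   # equals 4e, so EF is now strictly better for Player 1
```

Any positive tremble breaks Player 1's indifference in favor of EF, which is why a best-responding Player 1 moves away from the mixed equilibrium.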
Note the Implicit Assumption
Our stability analysis assumed that Player 1 would best respond to Player 2's tremble. However, the learning rule could instead be "go to the mixed strategy equilibrium." If so, Player 1 would sit tight and Player 2 would return to the mixed strategy equilibrium.
Empirical Foundations
We need to have some understanding of how people learn and adapt to say anything about stability.
Classes of Learning Rules
Belief Based Learning Rules: People best respond given their beliefs about how other people play.
Replicator Learning Rules: People replicate successful actions of others.
Stability Results
An extensive literature provides conditions (fairly weak) under which the two learning rules have identical stability properties.
Synopsis: Learning rules do not matter
Basins Question
Do games exist in which best response dynamics and replicator dynamics produce different basins of attraction?
Question: Does learning matter?
Best Response Dynamics
x = mixed strategy of Player 1
y = mixed strategy of Player 2
dx/dt = BR(y) - x
dy/dt = BR(x) - y
Replicator Dynamics
dx_i/dt = x_i (π_i - π_ave)
dy_i/dt = y_i (π_i - π_ave)
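Both dynamics can be integrated numerically. The sketch below applies them to the Battle of the Sexes game from the earlier slides, starting from the mixed equilibrium with a small tremble by Player 2. The Euler step size, the tie-breaking rule, and all function names are assumptions made for illustration.

```python
# x = probability Player 1 plays EF, y = probability Player 2 plays EF.

def gap1(y):
    """Player 1's payoff to EF minus payoff to CG: 3y - (1 - y)."""
    return 4.0 * y - 1.0

def gap2(x):
    """Player 2's payoff to EF minus payoff to CG: x - 3(1 - x)."""
    return 4.0 * x - 3.0

def best_response(gap, current):
    """Pure best response as a probability of EF; on an exact tie, stay put (an assumption)."""
    if gap > 0:
        return 1.0
    if gap < 0:
        return 0.0
    return current

def br_step(x, y, dt):
    """One Euler step of dx/dt = BR(y) - x, dy/dt = BR(x) - y."""
    return (x + dt * (best_response(gap1(y), x) - x),
            y + dt * (best_response(gap2(x), y) - y))

def replicator_step(x, y, dt):
    """One Euler step of the two-action replicator equation dx/dt = x(1 - x)(pi_EF - pi_CG)."""
    return (x + dt * x * (1.0 - x) * gap1(y),
            y + dt * y * (1.0 - y) * gap2(x))

def simulate(step, x, y, dt=0.01, steps=5000):
    """Apply `step` repeatedly and return the final state."""
    for _ in range(steps):
        x, y = step(x, y, dt)
    return x, y

start = (0.75, 0.26)  # mixed equilibrium plus a tremble of 0.01 by Player 2
print("best response:", simulate(br_step, *start))
print("replicator:   ", simulate(replicator_step, *start))
```

From this starting point both rules leave the mixed equilibrium and head toward (EF, EF). A tremble in the other direction (y = 0.24) sends both toward (CG, CG).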
Symmetric Matrix Game
       A     B     C
A     60    60    30
B     30    70    20
C     50    25    25
[Figure: diagrams labeled A, B, and C]
Best Response Basins
       A     B     C
A     60    60    30
B     30    70    20
C     50    25    25
A > B iff 60 p_A + 60 p_B + 30 p_C > 30 p_A + 70 p_B + 20 p_C
A > C iff 60 p_A + 60 p_B + 30 p_C > 50 p_A + 25 p_B + 25 p_C
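The inequalities above compare expected payoffs at a population state (p_A, p_B, p_C). A minimal Python sketch of that comparison follows. The function names and the tie-breaking rule are assumptions, and the payoffs are the row payoffs of the matrix.

```python
ACTIONS = ("A", "B", "C")
# PAYOFF[i][j] is the payoff to action i against action j, as in the matrix above.
PAYOFF = (
    (60, 60, 30),
    (30, 70, 20),
    (50, 25, 25),
)

def expected_payoffs(p):
    """Expected payoff of each action against population shares p = (p_A, p_B, p_C)."""
    return [sum(row[j] * p[j] for j in range(3)) for row in PAYOFF]

def best_response(p):
    """Action with the highest expected payoff; ties go to the earlier action (an assumption)."""
    payoffs = expected_payoffs(p)
    return ACTIONS[payoffs.index(max(payoffs))]

for p in [(1, 0, 0), (0, 1, 0), (0, 0, 1), (0.4, 0.2, 0.4)]:
    print(p, expected_payoffs(p), best_response(p))
```

With these payoffs A pays strictly more than C against every action (60 > 50, 60 > 25, 30 > 25), so the function never returns C. The A versus B comparison reduces to 3 p_A + p_C > p_B.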
Barycentric Coordinates
Vertices: A, B, C
Example points:
.7C + .3B
.5A + .5B
.4A + .2B + .4C
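To plot such points, barycentric weights have to be converted to planar coordinates. The sketch below assumes a triangle with unit side length, A and B on the top edge and C at the bottom. The vertex positions and the function name are illustrative choices, not taken from the slides.

```python
import math

# Assumed layout: A and B on the top edge, C at the bottom, unit side length.
VERTICES = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (0.5, -math.sqrt(3) / 2)}

def to_cartesian(weights):
    """Map barycentric weights such as {"A": 0.5, "B": 0.5} to an (x, y) point."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("barycentric weights must sum to 1")
    x = sum(w * VERTICES[name][0] for name, w in weights.items())
    y = sum(w * VERTICES[name][1] for name, w in weights.items())
    return x, y

for point in ({"C": 0.7, "B": 0.3}, {"A": 0.5, "B": 0.5}, {"A": 0.4, "B": 0.2, "C": 0.4}):
    print(point, to_cartesian(point))
```

Each point is the weighted average of the vertex positions, so .5A + .5B lands at the midpoint of the A-B edge.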
Best Responses
[Figure: simplex diagram with labels A, B, and C]
Stable Equilibria
[Figure: simplex diagram with labels A, B, and C]
Best Response Basins
[Figure: simplex diagram with labels A, B, and C]
Replicator Dynamics Basins
[Figure: simplex with vertices A, B, and C; one label is marked "B?"]
Replicator Dynamics
[Figure: simplex diagram with labels A, B, and C]
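Basin sizes under the two rules can be estimated by simulation. The sketch below uses the symmetric matrix game from the earlier slides. It draws starting points uniformly from the simplex, runs each rule with Euler steps, and records where the population ends up. The sampling scheme, step sizes, stopping threshold, tie-breaking rule, and function names are all assumptions made for illustration.

```python
import random

PAYOFF = ((60, 60, 30), (30, 70, 20), (50, 25, 25))  # rows and columns ordered A, B, C

def payoffs(x):
    """Expected payoff of each action against population shares x."""
    return [sum(row[j] * x[j] for j in range(3)) for row in PAYOFF]

def br_step(x, dt):
    """Euler step of dx/dt = BR(x) - x; ties go to the earlier action (an assumption)."""
    pi = payoffs(x)
    best = pi.index(max(pi))
    return [xi + dt * ((1.0 if i == best else 0.0) - xi) for i, xi in enumerate(x)]

def replicator_step(x, dt):
    """Euler step of dx_i/dt = x_i (pi_i - average payoff)."""
    pi = payoffs(x)
    ave = sum(xi * p for xi, p in zip(x, pi))
    return [xi + dt * xi * (p - ave) for xi, p in zip(x, pi)]

def limit_action(step, x, dt, max_steps=4000):
    """Index of the largest share once A or B exceeds 0.999, or when the steps run out."""
    for _ in range(max_steps):
        if x[0] > 0.999 or x[1] > 0.999:
            break
        x = step(x, dt)
    return x.index(max(x))

def estimate_basins(step, dt, samples=200, seed=0):
    """Share of uniformly drawn starting points that end at A, B, and C."""
    rng = random.Random(seed)
    counts = [0, 0, 0]
    for _ in range(samples):
        g = [rng.expovariate(1.0) for _ in range(3)]  # normalized exponentials are uniform on the simplex
        total = sum(g)
        counts[limit_action(step, [v / total for v in g], dt)] += 1
    return [c / samples for c in counts]

br_basins = estimate_basins(br_step, dt=0.01)
rep_basins = estimate_basins(replicator_step, dt=0.005)
print("best response:", br_basins)
print("replicator:   ", rep_basins)
```

Under best response dynamics in this game, the initial best response stays the best response along the path, and B is the initial best response exactly when p_B > p_A + 1/2. That region covers 1/8 of the simplex, so the estimate of B's best response basin should come out near 0.125. The replicator estimate has no such shortcut and is left to the simulation.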
Recall: Basins Question
Do games exist in which best response dynamics and replicator dynamics produce very different basins of attraction?
Question: Does learning matter?
Conjecture
For any ε > 0, there exists a symmetric matrix game such that the basins of attraction for distinct equilibria under continuous-time best response dynamics and replicator dynamics overlap by less than ε.
Collective Action Game
          SI     Coop    Pred    Naive
SI         2      2       2       2
Coop       1     N+1      1       1
Pred       0      0       0      N^2
Naive      0      0     -N^2      0
Naïve Goes Away
[Figure: simplex with vertices SI, Coop, and Pred]
Basins on Face
[Figure: simplex with vertices SI, Coop, and Pred; areas labeled Coop and SI]
Lots of Predatory
[Figure: simplex with vertices SI, Coop, and Pred; areas labeled Coop and SI]
Best Response
[Figure: diagram with labels Pred, Coop, and SI]
Replicator
[Figure: diagram with labels Pred, Coop, and SI]
The Math
dx_i/dt = x_i (π_i - π_ave)
π_ave = 2 x_S + x_C (1 + N x_C)
dx_C/dt = x_C [(1 + N x_C) - 2 x_S - x_C (1 + N x_C)]
dx_C/dt = x_C [(1 + N x_C)(1 - x_C) - 2 x_S]
Choose N > 1/ε
Assume x_C > ε
Then N x_C > 1, so 1 + N x_C > 2, and x_S ≤ 1 - x_C. Hence, for x_C < 1,
dx_C/dt > x_C [2(1 - x_C) - 2(1 - x_C)] = 0
Therefore, x_C always increases.
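The derivation can be checked numerically on the face without Naive. The sketch below picks ε = 0.1 and N = 11 (so N > 1/ε) and starts from a point with x_C just above ε. These values, the starting point, the Euler step size, and the function names are assumptions made for illustration.

```python
N = 11      # assumed value with N > 1/EPS
EPS = 0.1   # assumed threshold for the Coop share

def replicator_step(x_s, x_c, x_p, dt):
    """Euler step on the face without Naive: payoffs are 2 (SI), 1 + N*x_c (Coop), 0 (Pred)."""
    pi_s, pi_c, pi_p = 2.0, 1.0 + N * x_c, 0.0
    ave = x_s * pi_s + x_c * pi_c + x_p * pi_p
    return (x_s + dt * x_s * (pi_s - ave),
            x_c + dt * x_c * (pi_c - ave),
            x_p + dt * x_p * (pi_p - ave))

def coop_path(x_s, x_c, x_p, dt=0.01, steps=2000):
    """Coop share after each Euler step, starting from (x_s, x_c, x_p)."""
    path = [x_c]
    for _ in range(steps):
        x_s, x_c, x_p = replicator_step(x_s, x_c, x_p, dt)
        path.append(x_c)
    return path

path = coop_path(0.85, 0.11, 0.04)  # x_C = 0.11 > EPS
print(path[0], path[-1])
# Monotone up to floating-point rounding near the Coop vertex.
print(all(b >= a - 1e-12 for a, b in zip(path, path[1:])))
```

Starting with only 11 percent cooperators, the Coop share rises at every step and ends essentially at one, in line with the inequality.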
Theorem: In any symmetric matrix game for which best response and replicator dynamics attain different equilibria with probability one, there exists an action A that is both an initial best response with probability one and not an equilibrium.
Convex Combinations
Suppose the population learns using the following rule
a (Best Response) + (1-a) Replicator
Claim: There exists a class of games in which Best Response, Replicator Dynamics, and any fixed convex combination select distinct equilibria with probability one.
Best Response -> A
Replicator -> B
a BR + (1-a) Rep -> C
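The hybrid rule can be written as a single vector field that weights the two dynamics by a and 1 - a. The sketch below applies it to the symmetric matrix game from the earlier slides, from one starting point. It does not reproduce the claim: in this game and from this start, all three settings of a reach the same equilibrium. The step size, tie-breaking rule, starting point, and function names are assumptions.

```python
PAYOFF = ((60, 60, 30), (30, 70, 20), (50, 25, 25))  # rows and columns ordered A, B, C

def hybrid_step(x, a, dt):
    """Euler step of a*(BR(x) - x) + (1 - a)*x_i*(pi_i - average payoff)."""
    pi = [sum(row[j] * x[j] for j in range(3)) for row in PAYOFF]
    ave = sum(xi * p for xi, p in zip(x, pi))
    best = pi.index(max(pi))  # ties go to the earlier action (an assumption)
    return [
        xi + dt * (a * ((1.0 if i == best else 0.0) - xi) + (1.0 - a) * xi * (p - ave))
        for i, (xi, p) in enumerate(zip(x, pi))
    ]

def run(x, a, dt=0.001, steps=20000):
    """Iterate the hybrid rule from shares x with weight a on best response."""
    for _ in range(steps):
        x = hybrid_step(x, a, dt)
    return x

for a in (0.0, 0.5, 1.0):  # pure replicator, an even mix, pure best response
    print(a, [round(v, 3) for v in run([0.1, 0.8, 0.1], a)])
```

Setting a = 0 recovers replicator dynamics and a = 1 recovers best response dynamics, so the same function can be used to compare where each rule sends a given starting point.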
Learning/Adaptation
Diversity
Interactions/Epistasis
Networks/Geography
Diversity
EWA (Wilcox) and Quantal Response (Golman) learning models are misspecified for heterogeneous agents.
EWA = “hybrid of response and belief based learning”
Interactions
7,7    4,14
14,4   5,5

7,7    3,11
11,3   5,5
GameS Theory
[Figure: four diagrams, each with labels E, O, and M]
Bednar (2008), Bednar and Page (2007)
Learning Spillovers
[Figure: simplex diagram with labels A, B, and C]
Initial Conditions
[Figure: simplex diagram with labels A, B, and C]
Networks/Geography
[Figure: agents arranged in space, each playing A, B, or C; one agent is marked "?"]
ABMs Complement Math