Download ppt - Learning in Games Scott E Page Russell Golman & Jenna Bednar University of Michigan-Ann Arbor

Learning in Games

Scott E PageRussell Golman & Jenna Bednar

University of Michigan-Ann Arbor

Overview

Analytic and Computational Models

Nash Equilibrium– Stability– Basins of Attraction

Cultural Learning

Agent Models

Domains of Interest

Analytic

Computable

Learning/Adaptation

Diversity

Interactions/Epistasis

Networks/Geography

Equilibrium Science

We can start by looking at the role that learning rules play in equilibrium systems. This will give us some insight into whether they’ll matter in complex systems.

Players

QuickTime™ and aTIFF (Uncompressed) decompressor

are needed to see this picture.



Actions





Cooperate: C Defect: D

Payoffs

4,4 0,6

6,0 2,2

C D

C

D

Best Responses

4,4 0,6

6,0 2,2

C D

C

D

Best Responses

4,4 0,6

6,0 2,2

C D

C

D

Nash Equilibrium

4,4 0,6

6,0 2,2

C D

C

D 2,2

“Equilibrium” Based Science

Step 1: Set up gameStep 2: Solve for equilibriumStep 3: Show how equilibrium depends

on parameters of modelStep 4: Provide empirical support

Is Equilibrium Enough?

Existence: Equilibrium exists

Stability: Equilibrium is stable

Attainable: Equilibrium is attained by a learning rule.

Examples

Best respond to current state

Better respond

Mimic best

Mimic better

Include portions of best or better

Random with death of the unfit

Stability

Stability can only be defined relative to a learning dynamic. In dynamical systems, we often take that dynamic to be a best response function, but with human actors we need not assume people best respond.

Battle of Sexes Game

3,1 0,0

0,0 1,3

EF CG

EF

CG

Three Equilibria

3,1 0,0

0,0 1,3

1/4 3/4

3/4

1/4

Unstable Mixed?

3,1 0,0

0,0 1,3

1/4 +e 3/4 - e

EF

CG

3/4 + 3e

3/4 - 3e

Note the Implicit Assumption

Our stability analysis assumed that Player 1 would best respond to Player 2’s tremble. However, the learning rule could be go to the mixed strategy equilibrium. If so, Player 1 would sit tight and Player 2 would return to the mixed strategy equilibrium.

Empirical Foundations

We need to have some understanding of how people learn and adapt to say anything about stability.

Classes of Learning Rules

Belief Based Learning Rules: People best respond given their beliefs about how other people play.

Replicator Learning Rules: People replicate successful actions of others.

Stability Results

An extensive literature provides conditions (fairly week) under which the two learning rules have identical stability property.

Synopsis: Learning rules do not matter

Basins Question

Do games exist in which best response dynamics and replicator dynamics produce different basins of attraction?

Question: Does learning matter?

Best Response Dynamics

x = mixed strategy of Player 1

y = mixed strategy of Player 2

dx/dt = BR(y) - x

dy/dt = BR(x) - y

Replicator Dynamics

dxi/dt = xi( i - ave)

dyi/dt = yi( i - ave)

Symmetric Matrix Game

60 60 30

30 70 20

50 25 25

A

B

C

A B C













A

BC

A

B

B

Best Response Basins

60 60 30

30 70 20

50 25 25

A

B

C

A B C

A > B iff 60pA + 60pB + 30pC > 30pA + 70pB + 20pC

A > C iff 60pA + 60pB + 30pC > 50pA + 25pB + 25pC

Barycentric Coordinates

A B

C

.7C + . 3B

.5A + . 5B

.4A + . 2B + .4C

Best Responses

A

A

B

C

C

BC B

Stable Equilibria

A

A

B

C

C

BC B

Best Response Basins

A

A

B

C

C

BA B

Replicator Dynamics Basins

A B

C

A

C

C

B? B

Replicator Dynamics

A

A

B

C

C

B

ABB

Recall: Basins Question

Do games exist in which best response dynamics and replicator dynamics produce very different basins of attraction?

Question: Does learning matter?

Conjecture

For any > 0, There exists a symmetric matrix game such that the basins of attraction for distinct equilibria under continuous time best response dynamics and replicator dynamics overlap by less than

Collective Action Game

SI Coop Pred Naive

2 2 2 2

1 N+1 1 1

0 0 0 N2

0 0 -N2 0

SI

Coop

Pred

Naive

Naïve Goes Away

SI Coop

Pred

Basins on Face

SI Coop

Pred

Coop

SI

Lots of Predatory

SI Coop

Pred

Coop

SI

Best ResponsePred

Coop

SI

ReplicatorPred

Coop

SI

The Math

dxi/dt = xi( i - ave)

ave = 2xS + xC (1+NxC)

dxc/dt = xc[(1+ NxC- - 2xS - xC (1+NxC)]

dxc/dt = xc[(1+ NxC)(1- xC) - 2xS]

Choose N > 1/

Assume xc >

dxc/dt = xc[(1+ NxC)(1- xC) - 2xS]

dxc/dt > [2(1- ) - 2(1- )] = 0

Therefore, xc always increases.

Collective Action Game

SI Coop Pred Naive

2 2 2 2

1 N+1 1 1

0 0 0 N2

0 0 -N2 0

SI

Coop

Pred

Naive

Theorem: In any symmetric matrix game for which best response and replicator dynamics attain different equilibria with probability one, there exists an action A that is both an initial best response with probability one and not an equilibrium.

Convex Combinations

Suppose the population learns using the following rule

a (Best Response) + (1-a) Replicator

Claim: There exists a class of games in which Best Response, Replicator Dynamics, and any fixed convex combination select distinct equilibria with probability one.

Best Response -> A

Replicator -> B

a BR + (1-a) Rep -> C

Learning/Adaptation

Diversity

Interactions/Epistasis

Networks/Geography

Diversity

EWA (Wilcox) and Quantal Response (Golman) learning models are misspecified for heterogenous agents.

EWA = “hybrid of response and belief based learning”

Interactions







7,7 4,14

14,4 5,5

7,7 3,11

11,3 5,5

GameS Theory

E O

M

E O

M

E O

M

E O

M

Bednar (2008), Bednar and Page (2007)

Learning Spillovers

A

A

B

C

C

BC B

Initial Conditions

A

A

B

C

C

BC B

Networks/Geography

A B B

B A

B C B

?

A

A

B

C

B

B

C

A

C

AC

BB

B

A

A

A

A

B

C

B

B AC BA

C CBB AB

C

A

A

B

C

B

C

A

ABM complement Math