Evolutionary Game Theory
Axel Tidemann
About me
● Postdoc at the computer science department
● My PhD supervisor was Pinar; my focus was AI and learning by imitation using neural networks
Background: game theory
● The study of decision making
● Mathematically model your outcome given the choices of your opponent
● Represented as a payoff matrix
● Classic examples: The Prisoner’s Dilemma, Hawk-Dove game, Tragedy of the commons
Game theory concepts
● Nash equilibrium: a set of strategies in which each agent’s strategy is a best response to the other agents’ strategies
● Pareto optimality: no agent can be made better off by changing the strategies without making another agent worse off
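The two concepts can be checked mechanically on a small payoff matrix. The sketch below is only an illustration: the function names and the Prisoner’s Dilemma payoff values (T=5, R=3, P=1, S=0) are assumptions chosen for the example, not taken from the slides.

```python
from itertools import product


def pure_nash_equilibria(payoffs):
    """Return all pure-strategy Nash equilibria of a two-player game.

    payoffs[(row, col)] = (row player's payoff, column player's payoff).
    """
    rows = sorted({r for r, _ in payoffs})
    cols = sorted({c for _, c in payoffs})
    equilibria = []
    for r, c in product(rows, cols):
        # Neither player can gain by deviating unilaterally.
        row_best = all(payoffs[(r, c)][0] >= payoffs[(k, c)][0] for k in rows)
        col_best = all(payoffs[(r, c)][1] >= payoffs[(r, k)][1] for k in cols)
        if row_best and col_best:
            equilibria.append((r, c))
    return equilibria


def pareto_optimal(payoffs):
    """Return the outcomes that no other outcome Pareto-dominates."""
    outcomes = []
    for cell, (a, b) in payoffs.items():
        dominated = any(
            x >= a and y >= b and (x > a or y > b)
            for x, y in payoffs.values()
        )
        if not dominated:
            outcomes.append(cell)
    return outcomes


# Illustrative Prisoner's Dilemma values (assumed): T=5, R=3, P=1, S=0.
pd = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}

print(pure_nash_equilibria(pd))  # mutual defection is the only equilibrium
print(pareto_optimal(pd))        # mutual defection is not Pareto optimal
```

With these values the only Nash equilibrium is mutual defection, which is also the only outcome that is not Pareto optimal; that tension is what makes the game a dilemma.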
Why evolutionary game theory?
● Standard game theory does not explicitly incorporate time - important for learning
● Game theory is static, EGT is a dynamic theory
● Agents need not be rational
● The notion of fitness is introduced, with the concept of having offspring related to fitness
Dynamic properties of EGT
● Evolution need not be biological evolution, can be seen as cultural evolution (norms, beliefs that change over time)
● Being explicitly dynamic, it is better suited to model biological, economic and social behaviour
Two approaches
1. Evolutionary stability
2. Population dynamics
Evolutionary stability
● Follows the work of Maynard Smith and Price, where games are analyzed by finding an evolutionarily stable strategy (ESS)
● An ESS is a strategy that, once adopted by the population, cannot be invaded by a mutant (i.e. someone with a novel strategy)
Example: Hawk-Dove
● A population of all Doves will be invaded by Hawks, so Dove is not an ESS
● If V > C, strategy Hawk is an ESS
● If C > V, no pure strategy is an ESS (an ESS exists only if strategies are mixed)
        Hawk       Dove
Hawk    (V-C)/2    V, 0
Dove    0, V       V/2

V: payoff
C: cost of injury
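The Hawk-Dove claims can be verified with Maynard Smith’s two ESS conditions: a strategy s resists a mutant t if E(s,s) > E(t,s), or if E(s,s) = E(t,s) and E(s,t) > E(t,t). The sketch below is a simplified check against pure mutants only; the function names and the numeric values of V and C are assumptions for illustration.

```python
def hawk_dove(V, C):
    """Payoff to the row strategy against the column strategy."""
    return {
        "Hawk": {"Hawk": (V - C) / 2, "Dove": V},
        "Dove": {"Hawk": 0, "Dove": V / 2},
    }


def is_pure_ess(payoff, s):
    """Check Maynard Smith's conditions for s against every pure mutant t."""
    for t in payoff:
        if t == s:
            continue
        if payoff[s][s] > payoff[t][s]:
            continue  # s does strictly better against itself than t does
        if payoff[s][s] == payoff[t][s] and payoff[s][t] > payoff[t][t]:
            continue  # tie against s, but s beats t when facing t
        return False
    return True


def mixed_hawk_share(V, C):
    """Share of Hawk play at which Hawk and Dove earn the same (case C > V)."""
    return V / C


for V, C in [(4, 2), (2, 4)]:
    game = hawk_dove(V, C)
    print(V, C, {s: is_pure_ess(game, s) for s in game})

print(mixed_hawk_share(2, 4))  # Hawk is played with probability V/C
```

When C > V neither pure strategy passes the check, and the stable state is the mixture in which Hawk is played with probability V/C.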
Population dynamics
● By assuming the population is large, we keep track of the distribution of each strategy
● The change in strategy frequency is small from generation to generation
● This is expressed as differential equations called replicator dynamics
Example: Prisoner’s Dilemma
Replicator dynamics describe the evolution of the strategies
          Coop    Defect
Coop      R, R    S, T
Defect    T, S    P, P

T > R > P > S

R: reward
T: temptation
S: sucker’s payoff
P: punishment
Replicator dynamics
pc, pd: proportions of cooperate, defect
WC, WD: average fitness of cooperate, defect
W: average fitness of the entire population
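The equations themselves did not survive on this slide. A standard form of the replicator dynamics that matches these symbols is sketched below; the baseline fitness F_0 is an assumed extra symbol, and the payoffs R, S, T, P are those of the Prisoner’s Dilemma matrix.

```latex
\begin{align}
W_C &= F_0 + p_c R + p_d S \\
W_D &= F_0 + p_c T + p_d P \\
W   &= p_c W_C + p_d W_D \\
p_c' &= \frac{p_c W_C}{W}, \qquad p_d' = \frac{p_d W_D}{W} \\
\frac{dp_c}{dt} &= \frac{p_c \,(W_C - W)}{W}, \qquad
\frac{dp_d}{dt} = \frac{p_d \,(W_D - W)}{W}
\end{align}
```

The primed quantities are the proportions in the next generation. The last line follows from p_c' - p_c = p_c (W_C - W) / W when the change per generation is small. Since T > R and P > S, W_D > W_C whenever both strategies are present, so p_c shrinks over time.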
Visual representation
0: defection, 1: cooperation
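A minimal Python sketch of how the population moves along this 0-to-1 line under discrete replicator dynamics, where each strategy’s share is multiplied by its fitness relative to the population average. The payoff values (T=5, R=3, P=1, S=0) and the baseline fitness F0 are illustrative assumptions, not values from the slides.

```python
def replicator_step(p_c, R=3.0, S=0.0, T=5.0, P=1.0, F0=1.0):
    """One generation: return the next proportion of cooperators."""
    p_d = 1.0 - p_c
    w_c = F0 + p_c * R + p_d * S   # average fitness of cooperators
    w_d = F0 + p_c * T + p_d * P   # average fitness of defectors
    w = p_c * w_c + p_d * w_d      # average fitness of the population
    return p_c * w_c / w


def simulate(p_c, generations=200):
    """Return the trajectory of the cooperator proportion."""
    trajectory = [p_c]
    for _ in range(generations):
        p_c = replicator_step(p_c)
        trajectory.append(p_c)
    return trajectory


for start in (0.99, 0.5, 0.01):
    print(start, "->", simulate(start)[-1])
```

Every start strictly between 0 and 1 drifts towards 0 (all defect); the endpoints 0 and 1 are fixed points, but only 0 is stable.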
ESS ~ Nash equilibrium?
● In Prisoner’s Dilemma, ESS and Nash equilibrium are the same
● However, this is often more complex (and therefore more interesting!) when it comes to EGT - for instance if more than two pure strategies exist
Issues
● Selecting Nash equilibrium - if pure strategies are enforced, some games lack solutions altogether
● Interpreting fitness in a cultural evolution
● Is there any explanatory power?
Applications
● EGT has been used to explain many aspects of human behaviour (altruism, public goods game, social learning, language acquisition, to name a few)
● The following slides will present two examples of EGT
Dividing the cake
● Nobody has any claims to the cake
● If you cannot agree on the share, the cake is spoiled
● The obvious solution to us is to split it evenly
● This is one of many Nash equilibria - it all depends on how much each agent asks for
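To see why the even split is only one of many equilibria, the game can be enumerated directly. The sketch below assumes the cake is cut into 10 slices and that each player demands a whole number of slices; both the discretisation and the function names are assumptions for illustration.

```python
def payoff(my_demand, other_demand, cake=10):
    """You get what you asked for if the demands fit; otherwise the cake is spoiled."""
    return my_demand if my_demand + other_demand <= cake else 0


def pure_nash_equilibria(cake=10):
    """List demand pairs where neither player gains by changing their demand."""
    demands = range(cake + 1)
    equilibria = []
    for a in demands:
        for b in demands:
            a_best = all(payoff(a, b, cake) >= payoff(x, b, cake) for x in demands)
            b_best = all(payoff(b, a, cake) >= payoff(y, a, cake) for y in demands)
            if a_best and b_best:
                equilibria.append((a, b))
    return equilibria


print(pure_nash_equilibria())
```

Every pair of demands that exactly uses up the cake is an equilibrium, including very unequal ones such as (9, 1). The list also contains the degenerate case where both demand everything and nobody can improve by changing alone.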
Dividing the cake: unequal share
Why is this so?
● Simulations reveal that when all interactions between players are equally likely, fair division emerges in 62% of the cases (i.e. the outcome is sensitive to the starting position of the simulation)
● However, when spatial correlation is introduced, this changes dramatically
With spatial correlation
correlation coefficient = 0
correlation coefficient = 0.1
correlation coefficient = 0.2
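One common way to model such a correlation coefficient e is to let each strategy meet its own kind with probability p_i + e(1 - p_i) and any other kind j with probability (1 - e) p_j. The sketch below applies this to replicator dynamics with three strategies. The strategy set (demand 4, 5 or 6 slices out of 10), the Euler step size and the function names are all assumptions; it is not necessarily the exact setup behind the figures.

```python
DEMANDS = (4, 5, 6)  # assumed strategy set: slices demanded out of CAKE
CAKE = 10


def payoff(mine, theirs):
    """Demands are honoured only if together they fit the cake."""
    return mine if mine + theirs <= CAKE else 0


def fitness(p, e):
    """Expected payoff of each strategy under correlation e (0 = random mixing)."""
    result = []
    for i, mine in enumerate(DEMANDS):
        total = 0.0
        for j, theirs in enumerate(DEMANDS):
            if i == j:
                meet = p[i] + e * (1.0 - p[i])   # like meets like more often
            else:
                meet = (1.0 - e) * p[j]
            total += meet * payoff(mine, theirs)
        result.append(total)
    return result


def step(p, e, dt=0.1):
    """One Euler step of the continuous replicator dynamics."""
    w = fitness(p, e)
    mean = sum(pi * wi for pi, wi in zip(p, w))
    new = [pi + dt * pi * (wi - mean) for pi, wi in zip(p, w)]
    total = sum(new)
    return [x / total for x in new]  # renormalise against rounding drift


def evolve(p, e=0.0, steps=2000):
    for _ in range(steps):
        p = step(p, e)
    return p


start = [0.6, 0.1, 0.3]  # mostly modest and greedy players, few fair ones
for e in (0.0, 0.1, 0.2):
    print(e, [round(x, 3) for x in evolve(start, e)])
```

Running it from different starting distributions and different values of e shows how the set of starts that end in fair division changes with the correlation.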
How can this be interpreted?
● When you deal with your neighbours, a fairer division emerges - you will have a better grasp of what is fair
● Origin of justice?
● How does this translate to the real world?
Caveat emptor
● It all depends on how well the replicator dynamics actually reproduce social behaviour
● But it does seem enticing to think so...
Current work
● We have a Master’s student who is trying to model the process of deploying aquaculture sites on Frøya
● Aquaculture is big business in Norway (third-largest export after oil and gas)
● However, it has its drawbacks (pollution, aesthetics)
Conflicting interest groups on Frøya
● The fishermen do not want sites on their fishing grounds
● The tourist industry wants to avoid ugly sites that destroy the scenery
● The local population wants jobs
● The local administration wants tax income
● A site necessitates subcontractors of various services and goods
The process
● The local administration (“kommune”) makes a coastal development plan
● This plan includes various activities, including possible aquaculture sites
● The fishermen can then object to the plans, knowing it will destroy their fishing grounds
● However, other fishermen will then know of the fishing spots
The process
● The incentive of the fishermen to tell the truth is then hampered by the knowledge that other fishermen will start fishing there, and by the uncertainty of whether a site will be deployed or not
● There is a long time delay between the plan and any actual development, without certainty that it will be deployed
The process
● Once the local administration has decided, aquaculture industry can apply for sites
● The state gives out licenses based on the applications (the number of licenses depends heavily on environmental factors)
● Once a license is granted, the site is built (timespan ~5 years)
Goals
● We want to build a simulator that uses EGT to study the dynamics of the population
● The agents will learn from generation to generation, refining the simulator
● The simulator will model Frøya itself, to keep the results relevant
Ongoing research
● The outcome will be published as the Master’s thesis of Yngve Svalestuen
● It is our goal that this research will be continued (either as another Master’s thesis or PhD), so if you’re interested, contact us
References
http://plato.stanford.edu/entries/game-evolutionary/
The Master’s thesis of Yngve Svalestuen (when ready)