Nash equilibrium Non-standard preferences Experimental design Results Other projects
Nash Equilibrium in Tullock Contests
Aidas Masiliunas
Aix-Marseille School of Economics
Controversies in Game Theory III, ETH Zurich
2 June, 2016
Rent-seeking (Tullock) contest
Two players compete for a prize (16 ECU) by making costly investments ($x_1, x_2 \le 16$)
Higher investments increase the probability to win the prize
Probability that player i receives the prize: $\frac{x_i}{x_i + x_j}$
Applications:
  Competition for monopoly rents
  Investments in R&D
  Competition for a promotion/bonus
  Political contests
Theory
$E(\pi_i) = \frac{x_i}{x_i + x_j} \cdot 16 + 16 - x_i$
$BR_i(x_j): \; x_i^* = \sqrt{16 x_j} - x_j$
RNNE: $x_i^* = 4$, dominance solvable in three steps.
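The best-response function and the equilibrium value can be recovered from the expected payoff in a few lines. This is a sketch of the standard first-order-condition argument for continuous investments, not a step shown on the slide; the second derivative, $-32 x_j / (x_i + x_j)^3$, is negative, so the condition identifies a maximum.

```latex
\begin{align*}
\frac{\partial E(\pi_i)}{\partial x_i}
  &= \frac{16\, x_j}{(x_i + x_j)^2} - 1 = 0
  && \text{(first-order condition)} \\
(x_i + x_j)^2 &= 16\, x_j
  \quad\Longrightarrow\quad x_i^* = \sqrt{16 x_j} - x_j \\
x^* &= \sqrt{16 x^*} - x^*
  \quad\Longrightarrow\quad 2 x^* = 4 \sqrt{x^*}
  \quad\Longrightarrow\quad x^* = 4
  && \text{(symmetric profile, } x^* > 0\text{)}
\end{align*}
```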
[Figure: "Standard preferences". Best response (vertical axis) plotted against the other player's action ("Other plays", horizontal axis), both on the 1-16 range.]
Explanatory power of Nash equilibrium in experiments
7.04% of choices are exactly Nash
60.19% of choices are strictly dominated
Investments are spread across the whole strategy space
Experience does not help
Less stability compared to auctions
Comparative statics of Nash equilibrium
An alternative to point predictions is comparative statics
Is behaviour sensitive to changes in the Nash prediction?
Players   Nash   Mean investment
2         250    325
3         222    283
4         188    302
5         160    322
9          99    326
Source: Lim, Matros & Turocy, 2014
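The Nash column is consistent with the textbook symmetric equilibrium of an n-player lottery contest, $x^* = V(n-1)/n^2$. The sketch below checks this under the assumption that the prize is V = 1000; that value is not stated on the slide and is chosen only because it reproduces the column. The observed means are copied from the table.

```python
def symmetric_nash_investment(n_players: int, prize: float) -> float:
    """Risk-neutral symmetric equilibrium investment in an n-player lottery contest."""
    return prize * (n_players - 1) / n_players ** 2

PRIZE = 1000  # assumption: not given on the slide, picked because it matches the Nash column
observed_means = {2: 325, 3: 283, 4: 302, 5: 322, 9: 326}  # "Mean investment" column

nash_column = {n: round(symmetric_nash_investment(n, PRIZE)) for n in observed_means}
for n, mean in observed_means.items():
    print(f"n={n}: Nash={nash_column[n]:>3}, mean={mean}, mean/Nash={mean / nash_column[n]:.2f}")
```

The prediction falls by more than half between 2 and 9 players, while the observed mean stays in a narrow band around 300.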
Why should players choose Nash equilibrium?
Interpretation #1: Nash equilibrium is the unique action profile that can be justified by common knowledge of rationality.
Rationality = maximization of expected payoff given some belief.
Rationalizable strategies
x_i               1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16
BR(x_i)           3  4  4  4  4  4  4  3  3  3  2  2  1  1  1  1
BR(BR(x_i))       4  4  4  4  4  4  4  4  4  4  4  4  3  3  3  3
BR(BR(BR(x_i)))   4  4  4  4  4  4  4  4  4  4  4  4  4  4  4  4
Rationality
  Rationalizable: 1, 2, 3, 4
Rationality + belief that the opponent is rational
  Rationalizable: 3, 4
Rationality + belief that the opponent is rational + belief that the opponent believes in my rationality
  Rationalizable: 4
Epistemic definition of Nash equilibrium: common belief in rationality + simple belief hierarchy
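The three rows of the table can be reproduced by applying the best response repeatedly on the integer grid. This is a minimal sketch with two assumptions: investments are restricted to the integers 1..16 (as in the header row), and only point beliefs are considered, so it illustrates the table rather than giving a full rationalizability proof over mixed beliefs.

```python
PRIZE = 16
ACTIONS = range(1, 17)  # assumption: integer investments 1..16, as in the table header

def expected_payoff(x_i, x_j):
    """E(pi_i) = x_i / (x_i + x_j) * 16 + 16 - x_i."""
    return x_i / (x_i + x_j) * PRIZE + PRIZE - x_i

def best_response(x_j):
    """Payoff-maximizing integer investment against a point belief x_j."""
    return max(ACTIONS, key=lambda x_i: expected_payoff(x_i, x_j))

br1 = [best_response(x) for x in ACTIONS]  # BR(x_i)
br2 = [best_response(x) for x in br1]      # BR(BR(x_i))
br3 = [best_response(x) for x in br2]      # BR(BR(BR(x_i)))

# Actions that survive each level of reasoning
levels = [sorted(set(row)) for row in (br1, br2, br3)]
for depth, surviving in enumerate(levels, start=1):
    print(f"after {depth} step(s): {surviving}")
```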
Why should players choose Nash equilibrium?
Nash equilibrium is the unique action profile that cannot be ruled out by common knowledge of rationality.
  1. Players care about expected payoffs
  2. Players have the ability to calculate expected payoffs and identify dominated strategies
  3. Players believe that other players satisfy 1-2, and believe that they believe that they satisfy 1-2...
Nash equilibrium is the rest point of various learning dynamics
  Belief-based learning, e.g. Cournot best-response, fictitious play
    Assumption 3 is not necessary
  Payoff-based learning, e.g. reinforcement learning
    Players must be willing to explore, remember past payoffs, receive accurate feedback.
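As an illustration of the belief-based case, the sketch below runs Cournot best-response dynamics: each player best-responds to the opponent's action from the previous round. It assumes integer investments 1..16 and the risk-neutral expected payoff from the theory slide; the function name `cournot_path` is hypothetical.

```python
PRIZE = 16
ACTIONS = range(1, 17)  # assumption: integer investments 1..16

def expected_payoff(x_i, x_j):
    return x_i / (x_i + x_j) * PRIZE + PRIZE - x_i

def best_response(x_j):
    return max(ACTIONS, key=lambda x_i: expected_payoff(x_i, x_j))

def cournot_path(x1, x2, rounds=5):
    """Both players simultaneously best-respond to the other's previous action."""
    path = [(x1, x2)]
    for _ in range(rounds):
        x1, x2 = best_response(x2), best_response(x1)
        path.append((x1, x2))
    return path

print(cournot_path(16, 1))
```

On this grid every starting pair reaches (4, 4) within three rounds and stays there, without any player reasoning about the other's rationality.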
Which assumptions are violated?
Preference-based explanations: joy of winning
Participants receive non-monetary utility from winning (Parco et al., 2005; Sheremeta, 2011) or lose utility after losing (Delgado et al., 2008).
Sheremeta (2011) elicits joy of winning by implementing a contest where the prize has no value.
[Figure, two panels: "Joy of winning with w = 3" and "Joy of winning with w = 8". Each panel plots the best response against the other player's action ("Other plays").]
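A sketch of how joy of winning shifts the best-response curve. The slides do not spell out the utility function, so the code assumes that winning adds a non-monetary utility w on top of the 16 ECU prize and that investments are integers 1..16. Under that assumption the continuous symmetric equilibrium is (16 + w)/4, i.e. 4.75 for w = 3 and 6 for w = 8.

```python
PRIZE = 16
ACTIONS = range(1, 17)  # assumption: integer investments 1..16

def expected_utility(x_i, x_j, w):
    # assumption: the winner receives the prize plus a non-monetary utility w
    return x_i / (x_i + x_j) * (PRIZE + w) + PRIZE - x_i

def best_response_curve(w):
    """Best response to each opponent action 1..16 for a given joy of winning w."""
    return [max(ACTIONS, key=lambda x_i: expected_utility(x_i, x_j, w)) for x_j in ACTIONS]

for w in (0, 3, 8):  # w = 0 is the standard-preferences benchmark
    print(f"w={w}:", best_response_curve(w))
```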
Preference-based explanations: risk preferences
CRRA utility function: $u(\pi_i) = \frac{\pi_i^{1-\rho}}{1-\rho}$
Risk aversion if ρ = 0.5, risk seeking if ρ = −0.5
[Figure, two panels: "Risk aversion" and "Risk seeking". Each panel plots the best response against the other player's action ("Other plays").]
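The best-response curves under CRRA utility can be sketched numerically. The code assumes that the "+ 16" term in the expected payoff is an endowment, so the realized payoff is 32 − x_i after a win and 16 − x_i after a loss, and that investments are integers 1..16. Setting ρ = 0 recovers the risk-neutral benchmark.

```python
PRIZE = 16
ENDOWMENT = 16          # assumption: the "+ 16" in the expected payoff is an endowment
ACTIONS = range(1, 17)  # assumption: integer investments 1..16

def crra(payoff, rho):
    """u(pi) = pi^(1 - rho) / (1 - rho); not defined for rho = 1."""
    return payoff ** (1 - rho) / (1 - rho)

def expected_utility(x_i, x_j, rho):
    p_win = x_i / (x_i + x_j)
    win = ENDOWMENT + PRIZE - x_i   # realized payoff after winning
    lose = ENDOWMENT - x_i          # realized payoff after losing
    return p_win * crra(win, rho) + (1 - p_win) * crra(lose, rho)

def best_response_curve(rho):
    return [max(ACTIONS, key=lambda x_i: expected_utility(x_i, x_j, rho)) for x_j in ACTIONS]

for rho in (0.5, 0, -0.5):  # risk averse, risk neutral, risk seeking
    print(f"rho={rho}:", best_response_curve(rho))
```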
Preference-based explanations: social preferences
Fehr & Schmidt (1999) inequality aversion:
$u(\pi_i, \pi_j) = \begin{cases} \pi_i - \alpha(\pi_j - \pi_i) & \text{if } \pi_i \le \pi_j \\ \pi_i - \beta(\pi_i - \pi_j) & \text{if } \pi_i > \pi_j \end{cases}$
[Figure: "Fehr and Schmidt (1999) inequality aversion". Best response against the other player's action ("Other plays") for three parameter pairs: a=0, b=0; a=0.5, b=0; a=1, b=0.]
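A sketch of best responses under Fehr-Schmidt utility for the three (a, b) pairs in the figure legend, reading a as α and b as β. Two things are assumptions, not taken from the slides: inequality is evaluated on realized payoffs after the lottery (winner 32 − x, loser 16 − x), and investments are integers 1..16.

```python
PRIZE = 16
ENDOWMENT = 16          # assumption: the "+ 16" in the expected payoff is an endowment
ACTIONS = range(1, 17)  # assumption: integer investments 1..16

def fs_utility(own, other, alpha, beta):
    """Fehr-Schmidt utility: alpha weighs being behind, beta weighs being ahead."""
    return own - alpha * max(other - own, 0) - beta * max(own - other, 0)

def expected_utility(x_i, x_j, alpha, beta):
    p_win = x_i / (x_i + x_j)
    # assumption: inequality is evaluated on realized (post-lottery) payoffs
    u_win = fs_utility(ENDOWMENT + PRIZE - x_i, ENDOWMENT - x_j, alpha, beta)
    u_lose = fs_utility(ENDOWMENT - x_i, ENDOWMENT + PRIZE - x_j, alpha, beta)
    return p_win * u_win + (1 - p_win) * u_lose

def best_response_curve(alpha, beta):
    return [max(ACTIONS, key=lambda x_i: expected_utility(x_i, x_j, alpha, beta))
            for x_j in ACTIONS]

for a, b in [(0, 0), (0.5, 0), (1, 0)]:  # parameter pairs from the figure legend
    print(f"a={a}, b={b}:", best_response_curve(a, b))
```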
All preferences from Sheremeta (2015)
"Behavioral Variation in Tullock Contests", joint with F. Mengel and Ph. Reiss
Deviations from NE could be a result of bounded rationality
  Players optimize given the feedback in previous rounds.
  Noisy feedback prevents players from discovering optimal actions
Research questions:
  Can we identify whether deviations from NE are a result of bounded rationality or of preferences?
  Is behavioral variability lower and choices closer to theoretical predictions when feedback is more informative?
How informative is the feedback that players observe?
Reinforcement learning converges to NE as t → ∞
In experiments players rely on small samples of experience
Suppose that players always choose the action that yielded the highest average payoff in the past.
Feedback depends on the other player's choices and lottery outcomes
Treatment 1: eliminate lottery allocation
Treatment 2: eliminate variability of opponent’s choices
Treatment 3: eliminate both
How easy is it to learn in different treatments?
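Estimate the likelihood that action 4 will yield a higher average payoff than action 6.
[Figure: percentage of iterations in which Π(4) > Π(6), by memory length (0 to 50). Series: shared prize, fixed actions; shared prize, changing actions; lottery, fixed actions; lottery, changing actions. In the final build of the slide the ● series (shared prize, fixed actions) lies at 100% for every memory length.]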
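The estimate can be sketched as a Monte Carlo simulation: draw `memory` payoffs for action 4 and for action 6, compare the two averages, and repeat. The opponent model is an assumption, since the slides do not state it: a fixed opponent always plays 4, and a changing opponent draws uniformly from 1..16. The names `share_4_beats_6` and `average_payoff` are hypothetical, and `memory` must be at least 1.

```python
import random

PRIZE = 16
ACTIONS = list(range(1, 17))  # assumption: integer investments 1..16

def payoff(x_i, x_j, lottery, rng):
    share = x_i / (x_i + x_j)
    if lottery:                           # winner-take-all draw
        won = rng.random() < share
        return PRIZE * won + PRIZE - x_i
    return PRIZE * share + PRIZE - x_i    # shared prize: the expected value is paid out

def average_payoff(action, memory, lottery, fixed, rng):
    total = 0.0
    for _ in range(memory):
        # assumption: a fixed opponent always plays 4, a changing one is uniform on 1..16
        x_j = 4 if fixed else rng.choice(ACTIONS)
        total += payoff(action, x_j, lottery, rng)
    return total / memory

def share_4_beats_6(memory, lottery, fixed, iterations=2000, seed=0):
    """Fraction of iterations in which action 4 has the higher average payoff."""
    rng = random.Random(seed)
    wins = sum(
        average_payoff(4, memory, lottery, fixed, rng)
        > average_payoff(6, memory, lottery, fixed, rng)
        for _ in range(iterations)
    )
    return wins / iterations

for lottery in (False, True):
    for fixed in (True, False):
        label = f"{'lottery' if lottery else 'shared prize'}, {'fixed' if fixed else 'changing'} actions"
        print(label, [share_4_beats_6(m, lottery, fixed) for m in (1, 10, 50)])
```

With a shared prize and a fixed opponent the comparison is deterministic, so action 4 is identified as better from the first observation; the other three settings add noise from the lottery, from the opponent, or from both.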
Procedure
40 rounds, divided into 4 blocks of 10 rounds
Each block divided into experimentation phase (rounds 1-5)and incentivized phase (rounds 6-10)
Block     Non-incentivized rounds   Incentivized rounds
Block 1   1-5                       6-10
Block 2   11-15                     16-20
Block 3   21-25                     26-30
Block 4   31-35                     36-40
One round from each block randomly chosen for payment
Incentivized numeracy test at the end of the experiment
Average earnings 15.15 euro, duration 60 minutes
Explanatory power of Nash equilibrium
                  Changing actions     Fixed actions
                  Lottery   EV         Lottery   EV
P(x = NE)         7.04%     13.33%     -         -
P(x = BR)         -         -          22.50%    65.23%
P(|x − NE| ≤ 1)   25.74%    32.78%     -         -
P(|x − BR| ≤ 1)   -         -          47.95%    83.64%
P(x > 4)          60.19%    62.78%     51.36%    16.14%
Absolute value of deviation from equilibrium is significantly different between the EV/Fixed treatment and the other three treatments, but not in other comparisons.
Behavioral variation
Is the distribution of choices more concentrated? (not necessarily around NE)
Entropy measures the stochastic variation of a random variable (0 = one strategy always chosen, 4 = all strategies chosen with equal frequency):
$H = -\sum_{i=1}^{16} p_i \log(p_i)$
            Changing actions     Fixed actions
            Lottery   EV         Lottery   EV
Entropy     3.22      2.79       2.45      1.50
Std. Dev.   3.28      2.56       3.15      1.16
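A short sketch of the entropy measure. The stated range (0 to 4 with 16 strategies) implies a base-2 logarithm, since log2(16) = 4, so the code uses `log2`; the function name `choice_entropy` and the example data are hypothetical.

```python
from collections import Counter
from math import log2

def choice_entropy(choices):
    """Shannon entropy (in bits) of the empirical distribution of chosen actions."""
    counts = Counter(choices)
    n = len(choices)
    # Actions that were never chosen contribute nothing to the sum
    return sum(-(c / n) * log2(c / n) for c in counts.values())

print(choice_entropy([4] * 10))         # one strategy always chosen
print(choice_entropy(range(1, 17)))     # all 16 strategies equally frequent
print(choice_entropy([3, 4, 4, 4, 5]))  # hypothetical concentrated sample
```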
Best-response curves in Fixed treatments
Stability of choices and convergence
Changing strategies between rounds in experimentation and incentivized rounds.
Replacing humans by computers
Playing against a computer player is different from playing against a human player: no social preferences, lower joy of winning (?)
Additional treatment replacing computers by human players.
All effects replicate if Fixed/EV treatment is replaced by this treatment.
                  Changing actions     Fixed actions
                  Lottery   EV         Lottery   EV       EV-Human
P(x = NE)         7.04%     13.33%     -         -        -
P(x = BR)         -         -          22.50%    65.23%   50.42%
P(|x − NE| ≤ 1)   25.74%    32.78%     -         -        -
P(|x − BR| ≤ 1)   -         -          47.95%    83.64%   74.58%
P(x > 4)          60.19%    62.78%     51.36%    16.14%   23.33%
Entropy           3.22      2.79       2.45      1.50     1.13
Std. Dev.         3.28      2.56       3.15      1.16     0.91
Strategic uncertainty vs stability
Matching players to computers has two effects:
  The action of the other party is stable over time, hence it is easier to learn.
  Players face no strategic uncertainty, hence it is easier to optimize
Is stability of choices necessary in addition to the removal of strategic uncertainty?
  Design: computer plays actions from the baseline contest, players know these actions.
                  Changing actions     Changing but known   Fixed actions
                  Lottery   EV         Lottery   EV         Lottery   EV
P(a = NE)         7.04%     13.33%     -         -          -         -
P(a = BR)         -         -          7.59%     25.37%     22.50%    65.23%
P(|a − NE| ≤ 1)   25.74%    32.78%     -         -          -         -
P(|a − BR| ≤ 1)   -         -          25.00%    51.85%     47.95%    83.64%
P(a > 4)          60.19%    62.78%     62.96%    47.04%     51.36%    16.14%
Strategic uncertainty vs stability
Contests with forgone payoff information
Conclusion from the first paper: when feedback is more informative about the quality of actions, players make better choices.
Can we improve the quality of feedback without changing the nature of the game?
Hypothesis: more information and higher quality of information increases the rate of learning
Design: 10 rounds of standard contest, 20 rounds of contest with foregone payoff information, 10 rounds of standard contest
"Contests with foregone payoff information"
Hypotheses: reinforcement learning simulation
[Figure: percentage of iterations in which Π(2) > Π(4), by memory length (0 to 50). Series: same actions, same random numbers; different actions, same random numbers; same actions, different random numbers; different actions, different random numbers.]
Results: average investments
Results: dominated strategies
Payoff based learning, joint with H. Nax
Calculating expected values is very complicated
Convergence is much higher when players can use a payoff table/calculator and with neutral framing
[Figure: investment (axis label "invest", 0 to 800) plotted over 1 to 20 on the horizontal axis.]
Summary
Nash equilibrium has very low explanatory power in Tullock contests
Explanatory power is much higher when actions have direct payoff consequences
Providing additional feedback about foregone payoffs does not improve the explanatory power
Paying the expected payoffs does not improve learning, unless players know these payoffs.