Game Theory -- Lecture 3
Patrick Loiseau, EURECOM, Fall 2016
Lecture 2 recap
• Defined Pareto optimality
  – Coordination games
• Studied games with continuous action space
  – Always have a Nash equilibrium under some conditions
  – Cournot duopoly example
→ Can we always find a Nash equilibrium for all games?
→ How?
Outline
1. Mixed strategies
  – Best response and Nash equilibrium
2. Mixed strategies Nash equilibrium computation
3. Interpretations of mixed strategies
Example: installing checkpoints
• Two roads: the Police choose which one to check, the Terrorists choose which one to take

                    Terrorist
                  R1        R2
Police   R1     1, -1    -1, 1
         R2    -1, 1      1, -1

(payoffs listed as Police, Terrorist)

• Can you find a Nash equilibrium?
→ Players must randomize
Matching pennies
• Similar examples:
  – Checkpoint placement
  – Intrusion detection
  – Penalty kick
  – Tennis game
• Need to be unpredictable

                     Player 2
                  heads     tails
Player 1  heads   1, -1    -1, 1
          tails  -1, 1      1, -1
Pure strategies / Mixed strategies
• Game $(N, (A_i)_{i \in N}, (u_i)_{i \in N})$
• $A_i$: set of actions of player $i$ (what we called $S_i$ before)
• Action = pure strategy
• Mixed strategy: distribution over pure strategies, $s_i \in S_i = \Delta(A_i)$
  – Includes pure strategies as a special case
  – Support: $\operatorname{supp} s_i = \{a_i \in A_i : s_i(a_i) > 0\}$
• Strategy profile: $s = (s_1, \dots, s_n) \in S = S_1 \times \dots \times S_n$
Matching pennies: payoffs
• What is Player 1's payoff if Player 2 plays $s_2 = (1/4, 3/4)$ and he plays:
  – Heads?
  – Tails?
  – $s_1 = (1/2, 1/2)$?
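The three questions above can be checked numerically. The following is a minimal sketch, assuming the matching pennies matrix given earlier (Player 1 gets +1 when the coins match, -1 otherwise); the names `U1` and `expected_payoff_p1` are illustrative and not from the slides.

```python
from fractions import Fraction as F

# Player 1's payoffs in matching pennies: U1[row][col],
# rows = Player 1's action, cols = Player 2's action (0 = heads, 1 = tails).
U1 = [[F(1), F(-1)],
      [F(-1), F(1)]]

def expected_payoff_p1(s1, s2):
    """Expected payoff of Player 1 when both players mix independently."""
    return sum(s1[i] * s2[j] * U1[i][j] for i in range(2) for j in range(2))

s2 = (F(1, 4), F(3, 4))
heads = (F(1), F(0))            # a pure strategy is a degenerate mixed strategy
tails = (F(0), F(1))
uniform = (F(1, 2), F(1, 2))

u_heads = expected_payoff_p1(heads, s2)
u_tails = expected_payoff_p1(tails, s2)
u_mix = expected_payoff_p1(uniform, s2)
print(u_heads, u_tails, u_mix)  # -1/2 1/2 0
```

Exact fractions are used so that the later indifference comparisons are not affected by floating-point rounding.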
Payoffs in mixed strategies: general formula
• Game $(N, (A_i)_{i \in N}, (u_i)_{i \in N})$, let $A = \times_{i \in N} A_i$
• If players follow a mixed-strategy profile $s$, the expected payoff of player $i$ is:

$$u_i(s) = \sum_{a \in A} u_i(a) \Pr(a \mid s), \quad \text{where } \Pr(a \mid s) = \prod_{j \in N} s_j(a_j)$$

• $a$: pure strategy (or action) profile
• $\Pr(a \mid s)$: probability of seeing $a$ given the mixed strategy profile $s$
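The general formula translates almost literally into code. This is a sketch for any finite number of players, under the assumption that actions are indexed 0, 1, 2, ... and that a payoff is given as a function of the action profile; the function name `expected_payoff` and this representation are choices made here for illustration.

```python
from fractions import Fraction as F
from itertools import product

def expected_payoff(u_i, s):
    """u_i(s) = sum over action profiles a of u_i(a) * Pr(a | s).

    u_i: function mapping an action profile (tuple of action indices) to a payoff.
    s:   list of mixed strategies, one per player; s[j][k] is the probability
         that player j plays their k-th action.
    """
    total = F(0)
    # Enumerate every pure action profile a in A = A_1 x ... x A_n.
    for a in product(*(range(len(s_j)) for s_j in s)):
        prob = F(1)
        for j, a_j in enumerate(a):
            prob *= s[j][a_j]   # players mix independently: Pr(a|s) = prod_j s_j(a_j)
        total += u_i(a) * prob
    return total

# Matching pennies: Player 1 wins (+1) when the two coins match, loses (-1) otherwise.
def u1(a):
    return 1 if a[0] == a[1] else -1

s = [(F(1, 2), F(1, 2)), (F(1, 4), F(3, 4))]
value = expected_payoff(u1, s)
print(value)  # 0
```

The number of profiles grows exponentially with the number of players, so this direct enumeration is only practical for small games.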
Matching pennies: payoffs check
• What are the payoffs of Player 1 and Player 2 if $s = ((1/2, 1/2), (1/4, 3/4))$?
• Does that look like it could be a Nash equilibrium?
Best response
• The definition for mixed strategies is unchanged!

Definition: Best Response
Player $i$'s strategy $\hat{s}_i$ is a BR to strategy $s_{-i}$ of the other players if:
$u_i(\hat{s}_i, s_{-i}) \ge u_i(s'_i, s_{-i})$ for all $s'_i$ in $S_i$

• $BR_i(s_{-i})$: set of best responses of $i$ to $s_{-i}$
Matching pennies: best response
• What is the best response of Player 1 to $s_2 = (1/4, 3/4)$?
• For all $s_1$, $u_1(s_1, s_2)$ lies between $u_1(\text{heads}, s_2)$ and $u_1(\text{tails}, s_2)$ (the weighted average lies between the pure strategies' expected payoffs)
→ Best response is tails!
Important property
• If a mixed strategy is a best response, then each of the pure strategies in the mix must be a best response
⇒ They must yield the same expected payoff

Proposition: For any (mixed) strategy $s_{-i}$, if $s_i \in BR_i(s_{-i})$, then $a_i \in BR_i(s_{-i})$ for all $a_i$ such that $s_i(a_i) > 0$.
In particular, $u_i(a_i, s_{-i})$ is the same for all $a_i$ such that $s_i(a_i) > 0$.
Wordy proof
• Suppose it were not true. Then there must be at least one pure strategy $a_i$ that is assigned positive probability by my best-response mix and that yields a lower expected payoff against $s_{-i}$
• If there is more than one, focus on the one that yields the lowest expected payoff. Suppose I drop that (low-yield) pure strategy from my mix, assigning the weight I used to give it to one of the other (higher-yield) strategies in the mix
• This must raise my expected payoff
• But then the original mixed strategy cannot have been a best response: it does not do as well as the new mixed strategy
• This is a contradiction
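The same argument can be condensed into one inequality, using the fact that the expected payoff is linear in player $i$'s own mixing probabilities. This is a sketch only; the symbol $v^*$ for the best payoff attainable against $s_{-i}$ is notation introduced here and does not appear in the slides.

```latex
% v^* denotes the best payoff player i can get against s_{-i} (notation assumed here).
\[
  u_i(s_i, s_{-i})
    = \sum_{a_i \in A_i} s_i(a_i)\, u_i(a_i, s_{-i})
    \le \max_{a_i \in A_i} u_i(a_i, s_{-i}) =: v^*
\]
% The sum is a weighted average with weights s_i(a_i) that add up to 1, so it
% equals v^* only if u_i(a_i, s_{-i}) = v^* for every a_i with s_i(a_i) > 0.
% A best response must reach v^* (some pure strategy does), hence every a_i in
% supp s_i yields v^* and is itself a best response.
```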
Matching pennies again
• What is the best response of Player 1 to $s_2 = (1/4, 3/4)$?
• What is the best response of Player 1 to $s_2 = (1/2, 1/2)$?
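Both questions reduce to comparing the expected payoffs of Player 1's two pure strategies. The sketch below, assuming the matching pennies payoffs from the earlier slides, lists the pure best responses; the helper names are hypothetical.

```python
from fractions import Fraction as F

U1 = [[F(1), F(-1)], [F(-1), F(1)]]   # Player 1's payoffs, rows/cols = heads, tails
ACTIONS = ["heads", "tails"]

def pure_payoffs_p1(s2):
    """Expected payoff of each pure strategy of Player 1 against the mix s2."""
    return [sum(U1[i][j] * s2[j] for j in range(2)) for i in range(2)]

def pure_best_responses_p1(s2):
    """Pure strategies of Player 1 that reach the maximal expected payoff."""
    payoffs = pure_payoffs_p1(s2)
    best = max(payoffs)
    return [ACTIONS[i] for i, v in enumerate(payoffs) if v == best]

br_skewed = pure_best_responses_p1((F(1, 4), F(3, 4)))
br_uniform = pure_best_responses_p1((F(1, 2), F(1, 2)))
print(br_skewed)   # ['tails']
print(br_uniform)  # ['heads', 'tails']
```

When both pure strategies are best responses, as against $(1/2, 1/2)$, every mixture of them is a best response too, since its payoff is a weighted average of two equal numbers.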
Nash equilibrium definition
• Same definition as for pure strategies!
  – But here the strategies $s_i^*$ are mixed strategies

Definition: Nash Equilibrium
A strategy profile $(s_1^*, s_2^*, \dots, s_N^*)$ is a Nash Equilibrium (NE) if, for each $i$, her choice $s_i^*$ is a best response to the other players' choices $s_{-i}^*$
Matching pennies again
• Nash equilibrium: $((1/2, 1/2), (1/2, 1/2))$
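The claimed equilibrium can be verified mechanically. Because a mixed strategy's payoff is a weighted average of pure-strategy payoffs, it is enough to check that no player gains by deviating to a pure strategy. A minimal sketch for two-player games follows; `is_nash` and the other names are illustrative, not from the slides.

```python
from fractions import Fraction as F

# Matching pennies as a bimatrix game: U1[i][j], U2[i][j] (0 = heads, 1 = tails).
U1 = [[F(1), F(-1)], [F(-1), F(1)]]
U2 = [[-x for x in row] for row in U1]   # zero-sum: Player 2 gets the opposite

def payoff(U, s1, s2):
    """Expected payoff under payoff matrix U when players mix with s1 and s2."""
    return sum(s1[i] * s2[j] * U[i][j]
               for i in range(len(s1)) for j in range(len(s2)))

def pure(k, n):
    """Pure strategy k written as a degenerate distribution over n actions."""
    return tuple(F(1) if i == k else F(0) for i in range(n))

def is_nash(U1, U2, s1, s2):
    """True if neither player gains by deviating to any pure strategy."""
    v1, v2 = payoff(U1, s1, s2), payoff(U2, s1, s2)
    n1, n2 = len(s1), len(s2)
    ok1 = all(payoff(U1, pure(i, n1), s2) <= v1 for i in range(n1))
    ok2 = all(payoff(U2, s1, pure(j, n2)) <= v2 for j in range(n2))
    return ok1 and ok2

half = (F(1, 2), F(1, 2))
ne_uniform = is_nash(U1, U2, half, half)
ne_skewed = is_nash(U1, U2, half, (F(1, 4), F(3, 4)))
print(ne_uniform, ne_skewed)  # True False
```

The second call fails because, against $(1/4, 3/4)$, Player 1 does strictly better by always playing tails.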
Nash equilibrium existence theorem

Theorem: Nash (1951)
Every finite game has a Nash equilibrium.

• In mixed strategies!
  – Not true in pure strategies
• Finite game: finite set of players and finite action set for all players
  – Both are necessary!
• Proof: reduction to Kakutani's fixed-point theorem
Computation of mixed strategy NE
• Hard if the support is not known
• If you can guess the support, it becomes very easy, using the property shown earlier:

Proposition: For any (mixed) strategy $s_{-i}$, if $s_i \in BR_i(s_{-i})$, then $a_i \in BR_i(s_{-i})$ for all $a_i$ such that $s_i(a_i) > 0$.
In particular, $u_i(a_i, s_{-i})$ is the same for all $a_i$ such that $s_i(a_i) > 0$ (i.e., $a_i$ in the support of $s_i$).
Example: battle of the sexes
• We have seen that (O, O) and (S, S) are NE
• Is there any other NE (in mixed strategies)?
  – Let's try to find a NE with support {O, S} for each player

                      Player 2
                   Opera    Soccer
Player 1  Opera    2, 1      0, 0
          Soccer   0, 0      1, 2
Example: battle of the sexes (2)
• Let $s_2 = (p, 1-p)$
• If $s_1$ is a BR with support {O, S}, then Player 1 must be indifferent between O and S: $u_1(O, s_2) = 2p$ and $u_1(S, s_2) = 1 - p$, so $2p = 1 - p$
→ $p = 1/3$
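Example: battle of the sexes (3)
• Similarly, let $s_1 = (q, 1-q)$
• If $s_2$ is a BR with support {O, S}, then Player 2 must be indifferent between O and S: $u_2(s_1, O) = q$ and $u_2(s_1, S) = 2(1-q)$, so $q = 2(1-q)$
→ $q = 2/3$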
Example: battle of the sexes (4)
• Conclusion: $((2/3, 1/3), (1/3, 2/3))$ is a NE
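The two indifference conditions used in this example can be solved in closed form for any 2x2 game. The following sketch assumes payoffs are given as two integer matrices `U1[i][j]` and `U2[i][j]` (row player, column player); the function name `fully_mixed_2x2` is hypothetical.

```python
from fractions import Fraction as F

def fully_mixed_2x2(U1, U2):
    """Solve the two indifference conditions of a 2x2 game.

    Returns (s1, s2) with full support, or None if no such solution exists.
    U1[i][j], U2[i][j]: payoffs when Player 1 plays row i and Player 2 column j.
    """
    # Player 2 plays column 0 with probability p so that Player 1 is indifferent:
    #   p*U1[0][0] + (1-p)*U1[0][1] = p*U1[1][0] + (1-p)*U1[1][1]
    den_p = U1[0][0] - U1[0][1] - U1[1][0] + U1[1][1]
    # Player 1 plays row 0 with probability q so that Player 2 is indifferent:
    #   q*U2[0][0] + (1-q)*U2[1][0] = q*U2[0][1] + (1-q)*U2[1][1]
    den_q = U2[0][0] - U2[1][0] - U2[0][1] + U2[1][1]
    if den_p == 0 or den_q == 0:
        return None
    p = F(U1[1][1] - U1[0][1], den_p)
    q = F(U2[1][1] - U2[1][0], den_q)
    if not (0 < p < 1 and 0 < q < 1):
        return None   # the indifference point is not a valid full-support mix
    return (q, 1 - q), (p, 1 - p)

# Battle of the sexes: action 0 = Opera, action 1 = Soccer.
U1_BOS = [[2, 0], [0, 1]]
U2_BOS = [[1, 0], [0, 2]]
bos = fully_mixed_2x2(U1_BOS, U2_BOS)   # s1 = (2/3, 1/3), s2 = (1/3, 2/3)
```

Note that `p` is computed from Player 1's payoffs only and `q` from Player 2's payoffs only: each player's mix is pinned down by the requirement that the other player be indifferent.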
Example: prisoner's dilemma
• We know that (D, D) is a NE
• Can we find a NE with support {C, D} for each player?
• A NE in strictly dominant strategies is unique!

                      Prisoner 2
                     D         C
Prisoner 1   D    -5, -5     0, -6
             C    -6, 0     -2, -2
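To see why no equilibrium with support {C, D} exists, one can check that D strictly dominates C, so Prisoner 1 can never be indifferent between them. A small sketch, assuming the payoff matrix above (by symmetry the same holds for Prisoner 2); the function names are illustrative.

```python
from fractions import Fraction as F

# Prisoner 1's payoffs: rows = Prisoner 1 plays D, C; columns = Prisoner 2 plays D, C.
U1 = [[-5, 0],
      [-6, -2]]

def strictly_dominates(U, r, r_other):
    """True if row r pays the row player strictly more than r_other in every column."""
    return all(U[r][j] > U[r_other][j] for j in range(len(U[r])))

def payoff_gap(p):
    """u1(D, s2) - u1(C, s2) when Prisoner 2 plays D with probability p."""
    u_d = p * U1[0][0] + (1 - p) * U1[0][1]
    u_c = p * U1[1][0] + (1 - p) * U1[1][1]
    return u_d - u_c

d_dominates_c = strictly_dominates(U1, 0, 1)
# The gap equals 2 - p, which stays positive for every p in [0, 1].
gaps = [payoff_gap(F(k, 4)) for k in range(5)]
print(d_dominates_c, min(gaps))  # True 1
```

Since the gap never reaches zero, the indifference condition has no solution in $[0, 1]$ and (D, D) remains the only equilibrium.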
General methods to compute Nash equilibrium
• If you know the support, write the equations translating indifference between strategies in the support (works for any number of actions!)
• Otherwise:
  – The Lemke-Howson algorithm (1964)
  – Support enumeration method (Porter et al. 2004)
    • Smart heuristic search through all sets of support
• Exponential-time worst-case complexity
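As an illustration of the support-based approach, here is a plain brute-force support enumeration for two-player games. It is a sketch under several assumptions: it only tries supports of equal size (which is enough for nondegenerate games), it does not use the heuristic ordering of Porter et al., and it is not the Lemke-Howson algorithm. All function names are hypothetical.

```python
from fractions import Fraction as F
from itertools import combinations

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination; None if A is singular."""
    n = len(A)
    M = [[F(v) for v in row] + [F(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        pivot_val = M[col][col]
        M[col] = [v / pivot_val for v in M[col]]
        for r in range(n):
            factor = M[r][col]
            if r != col and factor != 0:
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def mix_on_support(U, rows, cols):
    """Mix over `cols` that makes the row player of U indifferent across `rows`.

    Returns (probabilities on cols, common payoff), or None if impossible.
    """
    k = len(cols)
    # Unknowns: y_j for j in cols, then the common payoff v.
    # Equations: sum_j U[i][j]*y_j - v = 0 for i in rows, and sum_j y_j = 1.
    A = [[U[i][j] for j in cols] + [-1] for i in rows] + [[1] * k + [0]]
    b = [0] * k + [1]
    sol = solve(A, b)
    if sol is None or any(y <= 0 for y in sol[:k]):
        return None
    return sol[:k], sol[k]

def support_enumeration(U1, U2):
    """List the Nash equilibria found by trying every pair of equal-size supports."""
    m, n = len(U1), len(U1[0])
    U2T = [[U2[i][j] for i in range(m)] for j in range(n)]   # column player's view
    equilibria = []
    for k in range(1, min(m, n) + 1):
        for I in combinations(range(m), k):
            for J in combinations(range(n), k):
                res_y = mix_on_support(U1, I, J)     # Player 2's mix on J
                res_x = mix_on_support(U2T, J, I)    # Player 1's mix on I
                if res_y is None or res_x is None:
                    continue
                (y_vals, v), (x_vals, w) = res_y, res_x
                x, y = [F(0)] * m, [F(0)] * n
                for i, xi in zip(I, x_vals):
                    x[i] = xi
                for j, yj in zip(J, y_vals):
                    y[j] = yj
                # No action (inside or outside the support) may do strictly better.
                if any(sum(U1[i][j] * y[j] for j in range(n)) > v for i in range(m)):
                    continue
                if any(sum(U2[i][j] * x[i] for i in range(m)) > w for j in range(n)):
                    continue
                equilibria.append((tuple(x), tuple(y)))
    return equilibria

# Battle of the sexes: finds (O, O), (S, S) and the mixed equilibrium.
bos_equilibria = support_enumeration([[2, 0], [0, 1]], [[1, 0], [0, 2]])
print(len(bos_equilibria))  # 3
```

The triple loop over supports is what makes the worst case exponential in the number of actions.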
Complexity of finding Nash equilibrium
• Is it NP-complete?
  – No, we know there is a solution
  – But many derived problems are (e.g., does there exist a strictly Pareto optimal Nash equilibrium?)
• PPAD ("Polynomial Parity Arguments on Directed graphs") [Papadimitriou 1994]
• Theorem: Computing a Nash equilibrium is PPAD-complete [Chen, Deng 2006]
Complexity of finding Nash equilibrium (2)
[Figure: diagram of the complexity classes P, PPAD, NP, NP-complete, and NP-hard]
Mixed strategies interpretations
• Players randomize
• Belief of others' actions (that make you indifferent)
• Empirical frequency of play in repeated interactions
• Fraction of a population
  – Let's see an example of this one
The Income Tax Game (1)
• Assume simultaneous move game
• Is there a pure strategy NE?
• Find mixed strategy NE

                          Taxpayer
                     Honest (q)   Cheat (1-q)
Auditor   A (p)        2, 0         4, -10
          N (1-p)      4, 0         0, 4
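The Income Tax Game: NE computation
• Mixed strategies NE:

$$\left.\begin{aligned} E[U_1(A, (q, 1-q))] &= 2q + 4(1-q) \\ E[U_1(N, (q, 1-q))] &= 4q + 0(1-q) \end{aligned}\right\} \Rightarrow 2q = 4(1-q) \Rightarrow q = \frac{2}{3}$$

Look at the taxpayer's payoffs to find the auditor's mixing:

$$\left.\begin{aligned} E[U_2((p, 1-p), H)] &= 0 \\ E[U_2((p, 1-p), C)] &= -10p + 4(1-p) \end{aligned}\right\} \Rightarrow 4 = 14p \Rightarrow p = \frac{2}{7}$$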
The Income Tax Game: mixed strategy interpretation
• From the auditor's point of view, he/she is going to audit a single taxpayer 2/7 of the time
  ⇒ This is an actual randomization (which is applied by law)
• From the taxpayer's perspective, he/she is going to be honest 2/3 of the time
  ⇒ In reality, this implies that 2/3 of the population is going to pay taxes honestly, i.e., this is the fraction of a large population that pays taxes
The Income Tax Game (6)
• What could be done if a policy maker (e.g., the government) would like to increase the proportion of honest taxpayers?
• One idea could be, for example, to "prevent" fraud by increasing the number of years a taxpayer would spend in jail if found guilty
The Income Tax Game: Trying to make people pay
• How to make people pay their taxes?
• One idea: increase penalty for cheating
• What is the new equilibrium?

                          Taxpayer
                     Honest (q)   Cheat (1-q)
Auditor   A (p)        2, 0         4, -20
          N (1-p)      4, 0         0, 4
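The Income Tax Game: new NE

$$\left.\begin{aligned} E[U_1(A, (q, 1-q))] &= 2q + 4(1-q) \\ E[U_1(N, (q, 1-q))] &= 4q + 0(1-q) \end{aligned}\right\} \Rightarrow 2q = 4(1-q) \Rightarrow q = \frac{2}{3}$$

$$\left.\begin{aligned} E[U_2((p, 1-p), H)] &= 0 \\ E[U_2((p, 1-p), C)] &= -20p + 4(1-p) \end{aligned}\right\} \Rightarrow 4 = 24p \Rightarrow p = \frac{1}{6} < \frac{2}{7}$$

• The proportion of honest taxpayers didn't change!
  – What determines the equilibrium mix for the column player is the row player's payoffs
• The probability of audit decreased
  – Still good, audits are expensive
• To make people pay tax: change auditor's payoff
  – Make audits cheaper, more profitable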
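The comparison between the two penalty levels can be reproduced with the indifference formulas for a 2x2 game. This is a sketch assuming the payoff matrices of the two income tax slides; `mixed_ne_2x2` and `taxpayer` are names made up for this example.

```python
from fractions import Fraction as F

def mixed_ne_2x2(U1, U2):
    """Fully mixed NE of a 2x2 game from the two indifference conditions.

    Returns (p, q): p = probability of row 0, q = probability of column 0.
    Assumes such an equilibrium exists (nonzero denominators, 0 < p, q < 1).
    """
    # q makes the ROW player indifferent, so it depends only on U1.
    q = F(U1[1][1] - U1[0][1], U1[0][0] - U1[0][1] - U1[1][0] + U1[1][1])
    # p makes the COLUMN player indifferent, so it depends only on U2.
    p = F(U2[1][1] - U2[1][0], U2[0][0] - U2[1][0] - U2[0][1] + U2[1][1])
    return p, q

# Auditor's payoffs: rows = A (audit), N (no audit); columns = Honest, Cheat.
AUDITOR = [[2, 4], [4, 0]]

def taxpayer(penalty):
    """Taxpayer's payoffs when a cheater who is audited receives -penalty."""
    return [[0, -penalty], [0, 4]]

p_old, q_old = mixed_ne_2x2(AUDITOR, taxpayer(10))
p_new, q_new = mixed_ne_2x2(AUDITOR, taxpayer(20))
print(p_old, q_old)   # 2/7 2/3
print(p_new, q_new)   # 1/6 2/3
```

Changing the penalty only touches the taxpayer's matrix, which enters the formula for `p` but not for `q`; that is why the share of honest taxpayers stays at 2/3.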
Important remark
• Row player's NE mix determined by column player's payoff and vice versa
• Neutralize the opponent (make him indifferent)
• In some sense the opposite of optimization (my choice is independent of my own payoff)
The penalty kick game
• 2 players: kicker and goalkeeper
• 2 actions each
  – Kicker: kick left, kick right
  – Goalkeeper: jump left, jump right
• Payoff: probability to score for the kicker, probability to stop it for the goalkeeper
• Scoring probabilities (in %):

                  Goalkeeper
                  L        R
Kicker    L     58.30    94.97
          R     92.91    69.92
The penalty kick game: results
• Ignacio Palacios-Huerta. Professionals Play Minimax. Review of Economic Studies (2003).
• Result (frequencies in %):

                  Goal L   Goal R   Kicker L   Kicker R
NE prediction     41.99    58.01    38.54      61.46
Observed freq.    42.31    57.69    39.98      60.02

• For a given kicker, his strategy is also serially independent
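The "NE prediction" row can be recomputed from the scoring probabilities with the same indifference reasoning as before. A sketch, assuming the game is zero-sum (the goalkeeper's payoff is the probability that the kicker does not score) and using the scoring matrix from the previous slide; variable names are illustrative.

```python
# Scoring probabilities (in %): rows = kicker L/R, columns = goalkeeper L/R.
score = [[58.30, 94.97],
         [92.91, 69.92]]

den = score[0][0] - score[0][1] - score[1][0] + score[1][1]

# Goalkeeper jumps left with probability g so that the kicker is indifferent:
#   g*score[0][0] + (1-g)*score[0][1] = g*score[1][0] + (1-g)*score[1][1]
g = (score[1][1] - score[0][1]) / den

# Kicker kicks left with probability k so that the goalkeeper is indifferent,
# i.e. the scoring probability is the same whichever side the goalkeeper picks:
#   k*score[0][0] + (1-k)*score[1][0] = k*score[0][1] + (1-k)*score[1][1]
k = (score[1][1] - score[1][0]) / den

print(f"goalkeeper L: {100 * g:.2f}%, kicker L: {100 * k:.2f}%")
```

Up to rounding, the two values match the predicted frequencies for "Goal L" and "Kicker L" in the table above.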
Summary
• Mixed strategies: distribution over actions
  – A Nash equilibrium in mixed strategies always exists for finite games
  – Computation is easy if the support is known
    • All pure strategies in the support of a best response are also best responses
    • Makes the other player indifferent in his support
  – Computation is hard if the support is not known
  – Several interpretations depending on the game at stake