71
Decision Making in Robots and Autonomous Agents Introduc2on to Game Theory; Stochas2c & Bayesian Games Subramanian Ramamoorthy School of Informa2cs 4 February, 2014

Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Decision  Making  in Robots and Autonomous Agents

 Introduc2on  to  Game  Theory;  Stochas2c  &  Bayesian  Games  

Subramanian  Ramamoorthy  School  of  Informa2cs  

 4  February,  2014  

Page 2: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Robots  O(en  Face  Strategic  Adversaries  

4/02/2014   2  

Key issue we seek to model: Misaligned/conflicting interest

Page 3: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

On  Self-­‐Interest  

 What  does  it  mean  to  say  that  agents  are  self-­‐interested?    

•  It  does  not  necessarily  mean  that  they  want  to  cause  harm  to  each  other,  or  even  that  they  care  only  about  themselves.  

•  Instead,  it  means  that  each  agent  has  his  own  descripHon  of  which  states  of  the  world  he  likes—which  can  include  good  things  happening  to  other  agents        —and  that  he  acts  in  an  a.empt  to  bring  about  these  states  of  the  world  (beLer  term:  inter-­‐dependent  decision  making)  

4/02/2014   3  

Page 4: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Basic  Constructs  of  Game  (Extensive  Form)  

•  Move:  A  point  of  decision  for  the  player,  defined  by  a  set  of  acHons  that  could  be  chosen  (e.g.,  I  am  holding  King-­‐10-­‐2).  In  an  abstract  form:  

 •  Choice:  The  parHcular  acHon  that  has  actually  been  chosen  

(play  King)  •  Play:  A  sequence  of  choices,  one  following  another,  unHl  the  

game  terminates  (King  -­‐>  10  -­‐>  2)  

4/02/2014   4  

The move looks identical in a different game involving, say, passing – calling- betting

Page 5: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Abstract  RelaHon  Between  Moves  

4/02/2014   5  

Number refers to which player has to make a choice

Page 6: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Game  Trees  

•  The  previous  picture  is  o(en  referred  to  as  a  game  tree,  a  lisHng  of  moves  and  consequences  

•  It  is  a  tree  in  the  mathemaHcal  sense  

•  Strictly  speaking,  many  games  should  have  a  game  graph  (not  just  a  tree)    –  Why?  

•  However,  the  convenHon  is  to  treat  the  play  (history  of  moves)  as  unique  even  if  it  revisits  a  specific  state  

4/02/2014   6  

Page 7: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

InformaHon  Sets  

•  What  can  each  player  know  when  he  makes  a  choice  at  a  move?  

•  We  are  not  asking  about  their  way  of  playing  –  just,  what  is  the  most  they  could  possibly  know  without  violaHng  the  rules  of  the  game?  –  e.g.,  think  of  card  games  with  chance  moves,  or  where  a  player  picks  a  

card  and  places  it  face  down  on  the  table  

•  Rules  of  the  game  specify  which  moves  are  indisHnguishable  •  Two  requirements  for  informaHon  set:  

–  Moves  must  be  assigned  to  same  player  –  Moves  must  have  same  number  of  alternaHves  

4/02/2014   7  

Page 8: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

InformaHon  Sets  -­‐  Pictorially  

4/02/2014   8  

Page 9: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

SpecificaHon  of  an  Extensive  Form  Game  

•  A  finite  tree  with  a  disHnguished  node  •  A  parHHon  of  the  nodes  of  the  tree  into  n+1  sets  (specifying  who  

takes  the  move)  •  A  probability  distribuHon  over  the  branches  of  chance  moves  •  A  refinement  of  the  player  parHHon  into  informaHon  sets  

(characterizes,  for  each  player,  the  ambiguity  of  locaHon  of  the  game  tree  of  each  of  his  moves)  

•  An  idenHficaHon  of  corresponding  branches  for  each  of  the  moves  in  each  informaHon  set  

•  A  set  of  outcomes  and  an  assignment  of  outcomes  to  each  endpoint  of  the  tree  

4/02/2014   9  

Page 10: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

How  Do  Players  Actually  Choose?  

•  Each  player  has  a  linear  uHlity  funcHon  Mi  over  outcomes  •  Each  player  is  fully  aware  of  the  rules,  and  will  maximize  

expected  uHlity  

•  Pure  strategy:  prescripHon  of  decision  for  each  situaHon  •  For  any  fixed  strategy,  given  the  rules  of  the  game,  the  game  

tree  can  be  evaluated  directly  to  yield  a  value  •  If  there  are  chance  moves,  selecHon  of  strategies  of  players  

defines  a  distribuHon  over  plays  and  the  payoff  is  expected  value  w.r.t.  this  distribuHon  

4/02/2014   10  

Page 11: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

A  Different  Simple  Model  of  a  Game  

•  Two  decision  makers  –  Robot  (has  an  acHon  space:  a)  –  Adversary  (has  an  acHon  space:  θ)  

•  Cost  or  payoff  (to  use  the  term  common  in  game  theory)  depends  on  acHons  of  both  decision  makers:    R(a,  θ)  –  denote  as  a  matrix  corresponding  to  product  space  

4/02/2014   11  

This is the normal form – simultaneous choice over moves

A

Page 12: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

RepresenHng  Payoffs  

 In  a  general,  bi-­‐matrix,  normal  form  game:  

             The  combined  acHons  (a1, a2, …, an)  form  an          ac#on  profile  a  Є  A

4/02/2014   12  

Action sets of players Payoff function:

a.k.a. utility u2(a)

Page 13: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Example:  Rock-­‐Paper-­‐Scissors  

•  Famous  children’s  game  •  Two  players;  Each  player  simultaneously  picks  an  acHon  which  

is  evaluated  as  follows,  –  Rock  beats  Scissors  –  Scissors  beats  Paper  –  Paper  beats  Rock  

4/02/2014   13  

Page 14: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

TCP  Game  

•  Imagine  there  are  only  two  internet  users:  you  and  me  •  Internet  traffic  is  governed  by  TCP  protocol,  one  feature  of  

which  is  the  backoff  mechanism:  when  network  is  congested  then  backoff  and  reduce  transmission  rates  for  a  while  

•  Imagine  that  there  are  two  implementaHons:  C  (correct,  does  what  is  intended)  and  D  (defecHve)  

•  If  you  both  adopt  C,  packet  delay  is  1  ms;  if  you  both  adopt  D,  packet  delay  is  3  ms  

•  If  one  adopts  C  but  other  adopts  D  then  D  user  gets  no  delay  and  C  user  suffers  4  ms  delay  

4/02/2014   14  

Page 15: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

TCP  Game  in  Normal  Form  

4/02/2014   15  

Note that this is another way of writing a bi-matrix game: First number represents payoff of row player and second number is payoff for column player

Page 16: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Some  Famous  Matrix  Examples  -­‐  What  are  they  Capturing?  

•  Prisoner’s  Dilemma:  Cooperate  or  Defect  (same  as  TCP  game)  

•  Bach  or  Stravinsky  (von  Neumann  called  it  BaLle  of  the  Sexes)  

 

•  Matching  Pennies:  Try  to  get  the  same  outcome,  Heads/Tails  

4/02/2014   16  

Page 17: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Different  CategorizaHon:  Common  Payoff  

     A  common-­‐payoff  game  is  a  game  in  which  for  all  ac>on  profiles  a  ∈  A1  ×·∙  ·∙  ·∙×  An  and  any  pair  of  agents  i,  j  ,  it  is  the  case  that  ui(a)  =  uj(a)  

4/02/2014   17  

Pure coordination: e.g., driving on a side of the road

Page 18: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Different  CategorizaHon:  Constant  Sum  

     A  two-­‐player  normal-­‐form  game  is  constant-­‐sum  if  there  exists  a  constant  c  such  that  for  each  strategy  profile  a  ∈  A1  ×  A2  it  is  the  case  that  u1(a)  +  u2(a)  =  c  

4/02/2014   18  

Pure competition: One player wants to coordinate Other player does not!

Page 19: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

What  Can  Players  Do?  

4/02/2014   19  

Page 20: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Strategies  

4/02/2014   20  

Expected utility

Page 21: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

SoluHon  Concepts  

 Many  ways  of  describing  what  one  ought  to  do:  –  Dominance  – Minimax  –  Pareto  Efficiency  –  Nash  Equilibria  –  Correlated  Equilibria  

 Remember  that  in  the  end  game  theory  aspires  to  predict    behaviour  given  specificaHon  of  the  game.  

 Norma>vely,  a  soluHon  concept  is  a  ra>onale  for  behaviour  

4/02/2014   21  

Page 22: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Concept:  Dominance  

4/02/2014   22  

Page 23: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Concept:  Iterated  Dominance  

4/02/2014   23  

Page 24: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Concept:  Minimax  

4/02/2014   24  

Page 25: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Minimax  

4/02/2014   25  

Page 26: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

CompuHng  Minimax:  Linear  Programming  

4/02/2014   26  We will look at this LP formulation in Tutorial 1, along with MDPs.

Page 27: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Pick-­‐a-­‐Hand  •  There  are  two  players:  chooser  (player  I)  &  hider  (player  II)    

•  The  hider  has  two  gold  coins  in  his  back  pocket.  At  the  beginning  of  a  turn,  he  puts  his  hands  behind  his  back  and  either  takes  out  one  coin  and  holds  it  in  his  le(  hand,  or  takes  out  both  and  holds  them  in  his  right  hand.    

•  The  chooser  picks  a  hand  and  wins  any  coins  the  hider  has  hidden  there.    

•  She  may  get  nothing  (if  the  hand  is  empty),  or  she  might  win  one  coin,  or  two.  

4/02/2014   27  

Page 28: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Pick-­‐a-­‐Hand,  Normal  Form:  

•  Hider  could  minimize  losses  by  placing  1  coin  in  le(  hand,  most  he  can  lose  is  1  

•  If  chooser  can  figure  out  hider’s  plan,  he  will  surely  lose  that  1  

•  If  hider  thinks  chooser  might  strategise,  he  has  incenHve  to  play  R2,  …  

•  All  hider  can  guarantee  is  max  loss  of  1  coin  

4/02/2014   28  

•  Similarly,  chooser  might  try  to  maximise  gain,  picking  R  

•  However,  if  hider  strategizes,  chooser  ends  up  with  zero  

•  So,  chooser  can’t  actually  guarantee  winning  anything    

Page 29: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Pick-­‐a-­‐Hand,  with  Mixed  Strategies  

•  Suppose  that  chooser  decides  to  choose  R  with  probability  p  and  L  with  probability  1  −  p  

•  If  hider  were  to  play  pure  strategy  R2  his  expected  loss  would  be  2p  

•  If  he  were  to  play  L1,  expected  loss  is  1  −  p  

•  Chooser  maximizes  her  gains  by  choosing  p  so  as  to  maximize  min{2p,  1  −  p}  

•  Thus,  by  choosing  R  with  probability  1/3  and  L  with  probability  2/3,  chooser  assures  expected  payoff  of  2/3,  regardless  of  whether  hider  knows  her  strategy  

4/02/2014   29  

Page 30: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Mixed  Strategy  for  the  Hider  

•  Hider  will  play  R2  with  some  probability  q  and  L1  with  probability  1−q  

•  The  payoff  for  chooser  is  2q  if  she  picks  R,  and  1  −  q  if  she  picks  L  

•  If  she  knows  q,  she  will  choose  the  strategy  corresponding  to  the  maximum  of  the  two  values.  

•  If  hider,  knows  chooser’s  plan,  he  will  choose  q  =  1/3  to  minimize  this  maximum,  guaranteeing  that  his  expected  payout  is  2/3  (because  2/3  =  2q  =  1  −  q)  

•  Chooser  can  assure  expected  gain  of  2/3,  hider  can  assure  an  expected  loss  of  no  more  than  2/3,  regardless  of  what  either  knows  of  the  other’s  strategy.  

4/02/2014   30  

Page 31: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Safety  Value  as  IncenHve  

•  Clearly,  without  some  extra  incenHve,  it  is  not  in  hider’s  interest  to  play  Pick-­‐a-­‐hand  because  he  can  only  lose  by  playing.    

•  Thus,  we  can  imagine  that  chooser  pays  hider  to  enHce  him  into  joining  the  game.    

•  2/3  is  the  maximum  amount  that  chooser  should  pay  him  in  order  to  gain  his  parHcipaHon.  

4/02/2014   31  

Page 32: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Another  Game  

Mixed  strategies:  •  Suppose  player  I  plays  T  

with  probability  p  and  B  with  probability  1−p  

•  Player  II  plays  L  with  probability  q  and  R  with  probability  1  −  q  

•  For  player  I,  expected  payoff  is  2(1  −  q)  for  playing  pure  strategy  T;  4q  +  1  for  playing  pure  strategy  B.    

•  If  she  knows  q,  she’ll  pick  the  strategy  corresponding  to  max{2(1  −  q),  4q  +  1}  

•  Player  II  can  choose  q  =  1/6  so  as  to  minimize  this  maximum,  and  expected  amount  player  II  will  pay  player  I  is  5/3.  

4/02/2014   32  

Page 33: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

The  Game  Analysed  Graphically  

 If  pl.  I  knows  q,  she’ll  pick  strategy  based  on  max{2(1  −  q),  4q  +  1}.  Player  II  can  choose  q  =  1/6  so  as  to  minimize  this  maximum.  Expected  amount  player  II  will  pay  player  I  is  5/3.  

 

 For  pl.  II,  expected  loss  is  5(1  −  p)  if  he  plays  pure  strategy  L  and  1+p  if  he  plays  pure  strategy  R;  he  will  aim  to  minimize  this  expected  payout.  In  order  to  maximize  this  minimum,  player  I  will  choose  p  =  2/3,  yielding  expected  gain  5/3.  

4/02/2014   33  

Page 34: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

General  FormulaHon  

 The  mixed  strategies  can  be  represented  as  follows,  where  xi  are  pure  strategies  of  player  I  and  yi  are  pure  strategies  of  player  II  

4/02/2014   34  

Page 35: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

CondiHon  for  OpHmality  

•  Mixed  strategy                                  is  opHmal  for  player  I  if:  

•  Similarly,  mixed  strategy                                  is  opHmal  for  player  II  if:  

4/02/2014   35  

Page 36: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

A  General  Result  

4/02/2014   36  

Page 37: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Equilibrium  as  a  Saddle  Point  

4/02/2014   37  

Page 38: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Concept:  Pareto  Efficiency  

4/02/2014   38  

Page 39: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Pareto  Efficiency  

4/02/2014   39  

Page 40: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Concept:  Nash  Equilibrium  

4/02/2014   40  

Page 41: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Nash  Equilibrium  

4/02/2014   41  

Page 42: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Nash  Equilibrium  -­‐  Example  

4/02/2014   42  

Page 43: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Nash  Equilibrium  -­‐  Example  

4/02/2014   43  

Page 44: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Concept:  Correlated  Equilibrium  

4/02/2014   44  

Page 45: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Correlated  Equilibrium  

4/02/2014   45  

Page 46: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Correlated  Equilibrium  

4/02/2014   46  

Page 47: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

StochasHc  Games  

4/02/2014   47  

Page 48: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

StochasHc  Game  -­‐  Setup  

4/02/2014   48  

Page 49: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

StochasHc  Game  -­‐  Policies  

4/02/2014   49  

Page 50: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

StochasHc  Game  -­‐  Example  

4/02/2014   50  

Page 51: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

StochasHc  Game  -­‐  Remarks  

4/02/2014   51  

Page 52: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Nash  Equilibria  –  StochasHc  Game  

4/02/2014   52  

Page 53: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Nash  Equilibria  –  StochasHc  Game  

4/02/2014   53  

Page 54: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Incomplete  InformaHon  

•  So  far,  we  assumed  that  everything  relevant  about  the  game  being  played  is  common  knowledge  to  all  the  players:  –  the  number  of  players  –  the  acHons  available  to  each  –  the  payoff  vector  associated  with  each  acHon  vector  

•  True  even  for  imperfect-­‐informaHon  games  –  The  actual  moves  aren’t  common  knowledge,  but  the  game  is  

•  Next,  consider  games  of  incomplete  (not  imperfect)  informa#on  –  Players  are  uncertain  about  the  game  being  played  

4/02/2014   54  

Page 55: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Bayesian  Games  

•  We  know  the  set  G  of  all  possible  games,  but  do  not  know  anything  about  which  game  in  G  –  Enough  informaHon  to  put  a  probability  distribuHon  over  games  

•  A  Bayesian  Game  is  a  class  of  games  G  that  saHsfies  two  fundamental  condiHons  

•  Condi#on  1:  –  The  games  in  G  have  the  same  number  of  agents,  and  the  same  strategy  space  (set  of  possible  strategies)  for  each  agent.  The  only  difference  is  in  the  payoffs  of  the  strategies.  

•  This  condiHon  isn’t  very  restricHve  –  Other  types  of  uncertainty  can  be  reduced  to  the  above,  by  reformulaHng  the  problem  

4/02/2014   55  

Page 56: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

An  Example  •  Suppose  we  do  not  know  whether  player  2  only  has  strategies  L  and  

R,  or  also  an  addiHonal  strategy  C:  

 •  If  player  2  does  not  have  strategy  C,  this  is  equivalent  to  having  a  

strategy  C  that  is  strictly  dominated  by  other  strategies:  –  Nash  equilibria  for  G1'  are  the  same  as  for  G1  

•  Problem  is  reduced  to  whether  C’s  payoffs  are  those  of  G1'  or  G2  4/02/2014   56  

Page 57: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Bayesian  Games  

Condi#on  2  (common  prior):  –  The  probability  distribuHon  over  the  games  in  G  is  common  

knowledge  (i.e.,  known  to  all  the  agents)  •  So  a  Bayesian  game  defines    

–  the  uncertainHes  of  agents  about  the  game  being  played,    –  what  each  agent  believes  the  other  agents  believe  about  the  game  

being  played  •  The  beliefs  of  the  different  agents  are  posterior  probabiliHes  

–  Combine  the  common  prior  distribuHon  with  individual  “private  signals”  (what  is  “revealed”  to  the  individual  players)  

•  The  common-­‐prior  assumpHon  rules  out  whole  families  of  games  –  But  it  greatly  simplifies  the  theory,  so  most  work  in  game  theory  uses  it  

4/02/2014   57  

Page 58: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

The  Bayesian  Game  Model  

A  Bayesian  game  consists  of  •  a  set  of  games  that  differ  only  in  their  payoffs  •  a  common  (known  to  all  players)  prior  distribuHon  over  them  •  for  each  agent,  a  parHHon  structure  (set  of  informaHon  sets)  

over  the  games  

4/02/2014   58  

Page 59: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Bayesian  Game:  InformaHon  Sets  Defn.  

A  Bayesian  game  is  a  4-­‐tuple  (N,G,P,I)  

•  N  is  a  set  of  agents  •  G  is  a  set  of  N-­‐agent  games  •  For  every  agent  i,  every  game  in  

G  has  the  same  strategy  space  •  P  is  a  common  prior  over  G  

–  common:  common  knowledge  (known  to  all  the  agents)  

–  prior:  probability  before  learning  any  addiHonal  informaHon  

•  I = (I1, …, IN)  is  a  tuple  of  parHHons  of  G,  one  for  each  agent  (informaHon  sets)  

4/02/2014   59  

Page 60: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Example  

•  Suppose  the  randomly  chosen  game  is  MP  

•  Agent  1’s  informaHon  set  is  I1,1 –  1  knows  it  is  MP  or  PD  –  1  can  infer  posterior  

probabili2es  for  each  

•  Agent  2’s  informaHon  set  is  I2,1

4/02/2014   60  

Page 61: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Another  InterpretaHon:  Extensive  Form  

4/02/2014   61  

Page 62: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Epistemic  Types  

•  We  can  assume  the  only  thing  players  are  uncertain  about  is  the  game’s  uHlity  funcHon  

•  Thus  we  can  define  uncertainty  directly  over  a  game’s  uHlity  funcHon  

•  All  this  is  common  knowledge;  each  agent  knows  its  own  type  4/02/2014   62  

Page 63: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Types  

 An  agent’s  type  consists  of  all  the  informaHon  it  has  that  isn’t  common  knowledge,  e.g.,  

•  The  agent’s  actual  payoff  funcHon  •  The  agent’s  beliefs  about  other  agents’  payoffs,  •  The  agent’s  beliefs  about  their  beliefs  about  his  own  payoff  •  Any  other  higher-­‐order  beliefs  

4/02/2014   63  

Page 64: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Strategies  

Similar  to  what  we  had  in  imperfect-­‐informaHon  games:  •  A  pure  strategy  for  player  i  maps  each  of  i’s  types  to  an  acHon  

–  what  i  would  play  if  i  had  that  type  •  A  mixed  strategy  si  is  a  probability  distribuHon  over  pure  strategies  

si(ai|θj) = Pr[i plays action aj | i’s type is  θj] •  Many  kinds  of  expected  uHlity:  ex  post,  ex  interim,  and  ex  

ante  –  Depend  on  what  we  know  about  the  players’  types

4/02/2014   64  

Page 65: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Expected  UHlity  

4/02/2014   65  

Page 66: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Bayes-­‐Nash  Equilibrium  

4/02/2014   66  

Page 67: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

CompuHng  Bayes-­‐Nash  Equilibria  

•  The  idea  is  to  construct  a  payoff  matrix  for  the  enHre  Bayesian  game,  and  find  equilibria  on  that  matrix  

 Write  each  of  the  pure  strategies  as  a  list  of  acHons,  one  for  each  type:  

4/02/2014   67  

Page 68: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

CompuHng  Bayes-­‐Nash  Equilibria  

Compute  ex  ante  expected  uHlity  for  each  pure-­‐strategy  profile:  

4/02/2014   68  

Page 69: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

CompuHng  Bayes-­‐Nash  Equilibria  

4/02/2014   69  

Page 70: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

CompuHng  Bayes  Nash  Equilibria  

4/02/2014   70  and so on…

Page 71: Decision(Making(Basic&Constructs&of&Game&(Extensive&Form)& • Move:&A&pointof&decision&for&the&player,&defined&by&asetof& acHons&thatcould&be&chosen&(e.g.,&Iam&holding&King;10;2

Acknowledgements  

Many  slides  are  adapted  from  the  following  sources:    •  Tutorial  at  IJCAI  2003  by  Prof  Peter  Stone,  University  of  Texas  •  Y.  Peres,  Game  Theory,  Alive  (Lecture  Notes)  •  Game  Theory  lectures  by  Prof.  Dana  Nau,  University  of  

Maryland  

4/02/2014   71