Upload
others
View
15
Download
0
Embed Size (px)
Citation preview
Algorithmic Game Theory
Nguyen Kim Thang
LIAFA, 18/2/09
Game Theory + Algorithms
Game Theory + Algorithms Entities in society, each with its own information and
interests, behave in rational manners.
Game Theory + Algorithms Entities in society, each with its own information and
interests, behave in rational manners.
Game theory is a deep theory studying such interactions (in economics, political science, ... etc).
Game Theory + Algorithms Entities in society, each with its own information and
interests, behave in rational manners.
Theoretical computer science studies optimization problems, seeks to optimum, efficient computing, impossibility results, ... etc
Game theory is a deep theory studying such interactions (in economics, political science, ... etc).
Algorithmic Game Theory
Research field on the interface of game theory and theoretical computer science (mostly algorithms)
The field has phenomenally exploded with many branches: computing Nash equilibrium, mechanism design, inefficiency of equilibria, ... etc
Formulating novel goals and problems, fresh looks on different issues (inspired by Internet, ...).
Outline
Existence and inefficiency of pure Nash equilibrium
Online Algorithmic Mechanism Design
Scheduling Games in the Dark
Online Auction with single-minded customers
Nash Equilibrium
Nash Equilibrium Equilibrium: strategy profile that is resilient to
deviation of individual player.
Nash Equilibrium Equilibrium: strategy profile that is resilient to
deviation of individual player.
Mixed equilibrium Pure equilibrium
Nash Equilibrium Equilibrium: strategy profile that is resilient to
deviation of individual player.
Mixed equilibrium Pure equilibriumchoose a distribution
over strategies
Nash Equilibrium Equilibrium: strategy profile that is resilient to
deviation of individual player.
Mixed equilibrium Pure equilibriumchoose a distribution
over strategiesdeterministically choose a strategy
Nash Equilibrium Equilibrium: strategy profile that is resilient to
deviation of individual player.
Mixed equilibrium Pure equilibrium
always exists (by Nash)
choose a distribution over strategies
deterministically choose a strategy
Nash Equilibrium Equilibrium: strategy profile that is resilient to
deviation of individual player.
Mixed equilibrium Pure equilibrium
always exists (by Nash)
choose a distribution over strategies
deterministically choose a strategy
does not necessarily exist
Nash Equilibrium Equilibrium: strategy profile that is resilient to
deviation of individual player.
Mixed equilibrium Pure equilibrium
always exists (by Nash)
choose a distribution over strategies
deterministically choose a strategy
does not necessarily exist
Potential games: admit a function such that if a player change her strategy to get a better utility then the function strictly decreases.
Inefficiency of equilibria
the worst Nashequilibrium price of anarchy
(PoA) = worst NE/OPT
price of stability (PoS) = best NE/OPT
the best Nashequilibrium
OPT
Good equilibria ?social objective function
Scheduling Game
Scheduling Game jobs (players) and machines: a job chooses a
machine to execute. The processing time of job on machine is
n mi
j pij
Scheduling Game jobs (players) and machines: a job chooses a
machine to execute. The processing time of job on machine is
n m
The cost of a job is its completion time.
ij pij
ci i
Scheduling Game jobs (players) and machines: a job chooses a
machine to execute. The processing time of job on machine is
n m
The cost of a job is its completion time.
ij pij
The social cost is the makespan, i.e.
ci i
maxi
ci
Scheduling Game jobs (players) and machines: a job chooses a
machine to execute. The processing time of job on machine is
Each machine specifies a policy how jobs assigned to the machine are to be scheduled.
n m
The cost of a job is its completion time.
ij pij
The social cost is the makespan, i.e.
ci i
maxi
ci
Scheduling Game jobs (players) and machines: a job chooses a
machine to execute. The processing time of job on machine is
Each machine specifies a policy how jobs assigned to the machine are to be scheduled.
Eg: Shortest Processing Time First (SPT)
n m
The cost of a job is its completion time.
ij pij
The social cost is the makespan, i.e.
ci i
machine 1machine 2machine 3
maxi
ci
Scheduling Game jobs (players) and machines: a job chooses a
machine to execute. The processing time of job on machine is
Each machine specifies a policy how jobs assigned to the machine are to be scheduled.
Eg: Shortest Processing Time First (SPT)
n m
The cost of a job is its completion time.
ij pij
The social cost is the makespan, i.e.
ci i
machine 1machine 2machine 3
maxi
ci
Non-clairvoyant policies Typically, a policy depends on the processing time of jobs
assigned to the machine.
Non-clairvoyant policies Typically, a policy depends on the processing time of jobs
assigned to the machine.
What about policies that do not require this knowledge? Incomplete information games Private information of jobs Jobs cannot influence on their completion time
by misreporting their processing time
Non-clairvoyant policies Typically, a policy depends on the processing time of jobs
assigned to the machine.
What about policies that do not require this knowledge? Incomplete information games Private information of jobs Jobs cannot influence on their completion time
by misreporting their processing time
Non-clairvoyant policies
existenceNash equilibrium
small PoA
Natural policies RANDOM: schedules jobs in a random order.
EQUI: schedules jobs in parallel, assigning each job an equal fraction of the processor.
ci = pij +12
!
i!:!(i!)=j,i! !=i
pi!j
In the strategy profile , is assigned to : ! i j
Natural policies RANDOM: schedules jobs in a random order.
EQUI: schedules jobs in parallel, assigning each job an equal fraction of the processor.
ci = pij +12
!
i!:!(i!)=j,i! !=i
pi!j
In the strategy profile , is assigned to : ! i j
AB
C
C
D
pA = 1pB = 1
pC = 2pD = 3
0 4 6 7
Natural policies RANDOM: schedules jobs in a random order.
EQUI: schedules jobs in parallel, assigning each job an equal fraction of the processor.
ci = pij +12
!
i!:!(i!)=j,i! !=i
pi!j
In the strategy profile , is assigned to : ! i j
ci = p1j + . . . + pi!1,j + (k ! i + 1)pij
If there are jobs on machine s.t: p1j ! . . . ! pkjk j
Models
Def of models:
Def: A job is balanced if i max pij/ min pij ! 2
Identical machines: for some length
Uniform machines: for some speed
Unrelated machines: arbitrary
pij = pi !j
pij = pi/sj
pij
sj
pi
Existence of equilibrium
The game under EQUI policy is a potential game.
Theorem:
The game under RANDOM policy is a potential game for 2 unrelated machines but it is not for more than 3 machines. For uniform machines, balanced jobs, there always exists equilibrium.
RANDOM, uniform machines Def: A job is unhappy if it can decrease its cost by
changing the strategy (other players’ strategies are fixed)
RANDOM, uniform machines Def: A job is unhappy if it can decrease its cost by
changing the strategy (other players’ strategies are fixed)
Def: a best move of a job is the strategy which minimizes the cost of the job (while other players’ strategies are fixed)
RANDOM, uniform machines Def: A job is unhappy if it can decrease its cost by
changing the strategy (other players’ strategies are fixed)
Def: a best move of a job is the strategy which minimizes the cost of the job (while other players’ strategies are fixed)
Jobs have length Machines have speed
p1 ! p2 ! . . . ! pn
s1 ! s2 ! . . . ! sm
pij = pi/sj
RANDOM, uniform machines
Lemma: Consider a job making a best move from to . If there is a new unhappy job with index greater than , then
Def: A job is unhappy if it can decrease its cost by changing the strategy (other players’ strategies are fixed)
Def: a best move of a job is the strategy which minimizes the cost of the job (while other players’ strategies are fixed)
Jobs have length Machines have speed
p1 ! p2 ! . . . ! pn
s1 ! s2 ! . . . ! sm
pij = pi/sj
sa > sb
i
i
ab
Potential function Dynamic: among all unhappy jobs, let the one with the
greatest index make a best move.
Potential function Dynamic: among all unhappy jobs, let the one with the
greatest index make a best move.
For any strategy profile , let be the unhappy job with greatest index.
! t
f!(i) =
!1 if 1 ! i ! t,
0 otherwise.1 0t
Potential function Dynamic: among all unhappy jobs, let the one with the
greatest index make a best move.
For any strategy profile , let be the unhappy job with greatest index.
! t
f!(i) =
!1 if 1 ! i ! t,
0 otherwise.
lex. decreases
1 0t
!(!) = (f!(1), s!(1), . . . , f!(n), s!(n))
Potential function Dynamic: among all unhappy jobs, let the one with the
greatest index make a best move.
For any strategy profile , let be the unhappy job with greatest index.
! t
f!(i) =
!1 if 1 ! i ! t,
0 otherwise.
If t! < t lex. decreases
!(!) = (1, s!(1), . . . , 1, s!(t!), 1, s!(t!+1), . . . , 1, s!(t), . . .)!(!!) = (1, s!(1), . . . , 1, s!(t!), 0, s!(t!+1), . . . , 0, s!!(t), . . .)
1 0t
!(!) = (f!(1), s!(1), . . . , f!(n), s!(n))
Potential function Dynamic: among all unhappy jobs, let the one with the
greatest index make a best move.
For any strategy profile , let be the unhappy job with greatest index.
! t
f!(i) =
!1 if 1 ! i ! t,
0 otherwise.
If lex. decreases
t! > t
!(!) = (1, s!(1), . . . , 1, s!(t), . . . , 0, s!(t!), . . .)!(!!) = (1, s!(1), . . . , 1, s!!(t), . . . , 0, s!(t!), . . .)
1 0t
!(!) = (f!(1), s!(1), . . . , f!(n), s!(n))
Potential function Dynamic: among all unhappy jobs, let the one with the
greatest index make a best move.
For any strategy profile , let be the unhappy job with greatest index.
! t
f!(i) =
!1 if 1 ! i ! t,
0 otherwise.
If lex. decreases
t! > t
!(!) = (1, s!(1), . . . , 1, s!(t), . . . , 0, s!(t!), . . .)!(!!) = (1, s!(1), . . . , 1, s!!(t), . . . , 0, s!(t!), . . .)
by Lemma: s!(t) > s!!(t)
1 0t
!(!) = (f!(1), s!(1), . . . , f!(n), s!(n))
Inefficiency
Theorem: For unrelated machines, the PoA of policy EQUI is at most 2m – interestingly, that matches the best clairvoyant policy.
PoA is not increased when processing times are unknown to the machines.
the worstNE
PoA
OPT
Mechanism Design
Goal: self-interested behavior yields desired outcomes.
Define the game
Online Auction A company produces one perishable item per time unit
(items have to be immediately delivered to customers, e.g. electricity, ice-cream, ...)
Opt. prob: maximize the welfare over all satisfied customers.
Single-minded customers arrive online: a customer arrives at , pays if he receives items before deadline , otherwise he pays nothing. !
i
wi
wi ki
di
ri
Online Auction A company produces one perishable item per time unit
(items have to be immediately delivered to customers, e.g. electricity, ice-cream, ...)
Opt. prob: maximize the welfare over all satisfied customers.
Single-minded customers arrive online: a customer arrives at , pays if he receives items before deadline , otherwise he pays nothing. !
i
wi
Mechanism design: are private Customers may misreport
their value. They bid
wi ki
di
ri
wi
bi
Mechanism Design (MD)Mechanism:
receives all bids
Mechanism Design (MD)
allocation algorithm: determine the set of satisfied customers
Mechanism: receives all bids
Mechanism Design (MD)
allocation algorithm: determine the set of satisfied customers
payment algorithm: determine how much a customer has to pay
Mechanism: receives all bids
Mechanism Design (MD)
allocation algorithm: determine the set of satisfied customers
payment algorithm: determine how much a customer has to pay
Mechanism: receives all bids
ui =
!wi ! pi if satisfied,
0 otherwise.
Mechanism Design (MD)
allocation algorithm: determine the set of satisfied customers
payment algorithm: determine how much a customer has to pay
Mechanism: receives all bids
ui =
!wi ! pi if satisfied,
0 otherwise.
Goal: self-interested behavior yields truthfulness, bi = wi
Mechanism Design (MD)
allocation algorithm: determine the set of satisfied customers
payment algorithm: determine how much a customer has to pay
Mechanism: receives all bids
Mechanism Design (MD)
allocation algorithm: determine the set of satisfied customers
payment algorithm: determine how much a customer has to pay
monotone: a winner still win if he
raises his bid
Mechanism: receives all bids
Mechanism Design (MD)
allocation algorithm: determine the set of satisfied customers
payment algorithm: determine how much a customer has to pay
monotone: a winner still win if he
raises his bid
Mechanism: receives all bids
critical payment:the smallest bid that a winner needs to bid in
order to win.
Truthful MD
Theorem: for single-parameter domain, a mechanism is truthful iff its allocation algo is monotone and it uses the critical payment scheme.
Our problem: design a monotone algorithm
verify whether the critical payment scheme can be computed efficiently.
Online Algorithm
Maximizing the welfare is hard.
Def: an online algorithm is -competitive if for any instance , the outcome
!
i
wi
ALGI c · ALG(I) ! OPT (I)
c
Online Algorithm
Maximizing the welfare is hard.
Def: an online algorithm is -competitive if for any instance , the outcome
!
i
wi
ALGI c · ALG(I) ! OPT (I)
Theorem: if all then there exists a 7-competitive truthful mechanism.
ki = k
c
Algorithm The CONSERVATIVE algo:
if there is no currently running job, serve the pending one with highest value
still schedule the current customer except there is a new one with value at least 2 that of the current customer
Algorithm The CONSERVATIVE algo:
Proof:
if there is no currently running job, serve the pending one with highest value
still schedule the current customer except there is a new one with value at least 2 that of the current customer
the algorithm is monotone 7-competitive by a charging scheme
i 7wi
OPTCONSERwj
j
Proof (sketch)
CONSER
OPT
Proof (sketch)
(j, k) (j, b) (j, 1)
CONSER
OPT
wj/k wj/k wj/k
? ??
Proof (sketch)
(j, k) (j, b) (j, 1)
CONSER
OPT
Proof (sketch)
(j, k) (j, b) (j, 1)
CONSER
OPT
(j, 1)
type 1: if is completed by CONSERj
Proof (sketch)
(j, k) (j, b) (j, 1)
CONSER
OPT
type 1: if is completed by CONSERj
Proof (sketch)
(j, k) (j, b) (j, 1)
CONSER
OPT
type 1: if is completed by CONSERj
type 2: if 2wi > wj
(i, a) (i0, 1)
Proof (sketch)
(j, k) (j, b) (j, 1)
CONSER
OPT
type 1: if is completed by CONSERj
type 2: if 2wi > wj
Proof (sketch)
(j, k) (j, b) (j, 1)
CONSER
OPT
type 1: if is completed by CONSERj
type 2: if 2wi > wj
(i!, a!) (i0, 1)
type 3: otherwise , is not pending.
last timestill pending
j
2wi! > wj 2wi0/k > wj/k
2wi ! wj j
then
(i, a)
Proof (sketch)
Observation: receives at most charges of type 3.
(j, b)
CONSER
OPT
(i!, a!) (i, a)(i0, 1)
last timestill pending
j
(i0, 1) k
! k
Summing up all charges, we get 7-competitive.
General case
Theorem: if all then there exists a -competitive truthful mechanism. This mechanism is optimal.
O(k/ log k)ki ! k
Proof: more elaborated but the idea is similar.
Conclusion
Motivation through two problems.
Inspired by Game Theory, using technique of Computer Science
Inspired by Computer Science, using technique of Game Theory
theoretically beautiful
real problems, practical importance.