
Max Weight Learning Algorithms with Application to Scheduling in Unknown Environments


Page 1: Max Weight Learning Algorithms with Application to Scheduling in Unknown Environments

Max Weight Learning Algorithms with Application to Scheduling in Unknown Environments

Michael J. Neely, University of Southern California
http://www-rcf.usc.edu/~mjneely

Information Theory and Applications Workshop (ITA), UCSD, Feb. 2009

*Sponsored in part by the DARPA IT-MANET Program, NSF OCE-0520324, NSF Career CCF-0747525

Pr(success_1, …, success_n) = ??

Page 2: Max Weight Learning Algorithms with Application to Scheduling in Unknown Environments

• Slotted system: slots t in {0, 1, 2, …}

• Network queues: Q(t) = (Q_1(t), …, Q_L(t))

• 2-stage control decision every slot t:

  1) Stage 1 decision: k(t) in {1, 2, …, K}.
     Reveals a random vector w(t) (i.i.d. given k(t)); w(t) has an unknown distribution F_k(w).

  2) Stage 2 decision: I(t) in I (a possibly infinite set).
     Affects queue rates: A(k(t), w(t), I(t)), μ(k(t), w(t), I(t)).
     Incurs a "penalty vector" x(t) = x(k(t), w(t), I(t)).

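To make the model concrete, here is a minimal Python sketch of one slot, assuming the standard queueing update Q_l(t+1) = max[Q_l(t) - μ_l(t), 0] + A_l(t); the maps `A`, `mu`, and `x` are hypothetical stand-ins for the arrival, service, and penalty functions above, not code from the talk.

```python
def slot_update(Q, k, w, I, A, mu, x):
    """One slot of the two-stage model (a sketch, not the paper's code).
    k: stage-1 choice, w: revealed random vector, I: stage-2 choice.
    A, mu, x map (k, w, I) to arrival, service, and penalty vectors."""
    arrivals = A(k, w, I)
    services = mu(k, w, I)
    penalties = x(k, w, I)
    # Standard queue update: serve first, then admit new arrivals.
    Q_next = [max(q - s, 0.0) + a for q, s, a in zip(Q, services, arrivals)]
    return Q_next, penalties
```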

Page 3: Max Weight Learning Algorithms with Application to Scheduling in Unknown Environments

Stage 1: k(t) in {1, …, K}. Reveals random w(t).
Stage 2: I(t) in I. Incurs penalties x(k(t), w(t), I(t)); also affects queue dynamics A(k(t), w(t), I(t)), μ(k(t), w(t), I(t)).

Goal: Choose the stage 1 and stage 2 decisions over time so that the time-average penalty vector x̄ solves the stochastic convex program reconstructed below.

f(x), h_n(x) are general convex functions of multiple variables.
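The optimization problem itself was an image on the slide and did not survive extraction. The following LaTeX reconstruction is inferred from the virtual-queue updates on Page 8 (which penalize h_n(g(t)) - b_n), so the constants b_n and the explicit stability constraint are assumptions rather than a verbatim copy:

```latex
\begin{aligned}
\text{Minimize:}   \quad & f(\overline{x}) \\
\text{Subject to:} \quad & h_n(\overline{x}) \le b_n, \quad n = 1, \dots, N, \\
                         & \text{all queues } Q_\ell(t) \text{ stable,}
\end{aligned}
\qquad \text{where } \overline{x} \triangleq \lim_{t \to \infty} \frac{1}{t}
\sum_{\tau=0}^{t-1} \mathbb{E}\{x(\tau)\}.
```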

Page 4: Max Weight Learning Algorithms with Application to Scheduling in Unknown Environments

Motivating Example 1: Min Power Scheduling with Channel Measurement Costs

[Figure: L queues with arrival processes A_1(t), …, A_L(t) and channel states S_1(t), …, S_L(t)]

If the channel states are known every slot, we can schedule without knowing the channel statistics or the arrival rates! (EECA: Neely 2005, 2006; Georgiadis, Neely, Tassiulas F&T 2006)

Minimize average power, subject to stability.

Page 5: Max Weight Learning Algorithms with Application to Scheduling in Unknown Environments

Motivating Example 1: Min Power Scheduling with Channel Measurement Costs

[Figure: same L-queue diagram as above]

If there is a "cost" to measuring, we make a 2-stage decision:
Stage 1: Measure or not? (Measuring reveals the channels w(t).)
Stage 2: Transmit over a known channel, or over a blind channel?

Existing solutions (Li and Neely 2007; Gopalan, Caramanis, Shakkottai 2007) require a priori knowledge of the full joint channel state distribution! (2^L entries, or even 1024^L?)

Minimize average power, subject to stability.

Page 6: Max Weight Learning Algorithms with Application to Scheduling in Unknown Environments

Motivating Example 2: Diversity Backpressure Routing (DIVBAR)

[Figure: a node broadcasts a packet to neighbors 1, 2, 3; some receptions are in error]

Networking with lossy channels and multi-receiver diversity:
DIVBAR Stage 1: Choose a commodity and transmit.
DIVBAR Stage 2: Get success feedback, choose the next hop.

If there is a single commodity (no stage 1 decision), we do not need the success probabilities! If there are two or more commodities, we need the full joint success probability distribution over all neighbors!

[Neely, Urgaonkar 2006, 2008]

Page 7: Max Weight Learning Algorithms with Application to Scheduling in Unknown Environments

Stage 1: k(t) in {1, …, K}. Reveals random w(t).
Stage 2: I(t) in I. Incurs penalties x(k(t), w(t), I(t)); also affects queue dynamics A(k(t), w(t), I(t)), μ(k(t), w(t), I(t)).

Goal: the stochastic convex program from Page 3.

Equivalent to: a transformed problem in which g(t) is an auxiliary vector that acts as a proxy for x(t). A reconstruction follows.
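The "Equivalent to:" formula was also lost in extraction. Based on the standard auxiliary-variable transformation in this framework, it plausibly reads as follows (overlines denote time averages):

```latex
\begin{aligned}
\text{Minimize:}   \quad & \overline{f(g)} \\
\text{Subject to:} \quad & \overline{h_n(g)} \le b_n \quad \forall n, \\
                         & \overline{g}_m = \overline{x}_m \quad \forall m, \\
                         & \text{all queues stable.}
\end{aligned}
```

Jensen's inequality, applied to the convex f and h_n, is what makes solving this proxy problem also solve the original one.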

Page 8: Max Weight Learning Algorithms with Application to Scheduling in Unknown Environments

Stage 1: k(t) in {1, …, K}. Reveals random w(t).
Stage 2: I(t) in I. Incurs penalties x(k(t), w(t), I(t)); also affects queue dynamics A(k(t), w(t), I(t)), μ(k(t), w(t), I(t)).

Equivalent goal: the transformed problem from Page 7.

Technique: Form a virtual queue for each constraint.

U_n(t) queue (input h_n(g(t)), service b_n):   U_n(t+1) = max[U_n(t) + h_n(g(t)) - b_n, 0]

Z_m(t) queue (input x_m(t), service g_m(t)):   Z_m(t+1) = Z_m(t) - g_m(t) + x_m(t)   (possibly negative)
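A minimal Python sketch of these two updates (the constraint functions `h` and constants `b` come from the problem statement; the names are illustrative):

```python
def update_virtual_queues(U, Z, g, x, h, b):
    """One-slot virtual-queue update.
    U_n enforces the time-average constraint h_n <= b_n (stays >= 0);
    Z_m enforces (avg g_m) = (avg x_m) and may go negative."""
    hg = h(g)  # vector (h_1(g), ..., h_N(g))
    U_next = [max(U[n] + hg[n] - b[n], 0.0) for n in range(len(U))]
    Z_next = [Z[m] - g[m] + x[m] for m in range(len(Z))]
    return U_next, Z_next
```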

Page 9: Max Weight Learning Algorithms with Application to Scheduling in Unknown Environments

Use the stochastic Lyapunov optimization technique: [Neely 2003], [Georgiadis, Neely, Tassiulas F&T 2006]

Define: Θ(t) = the combined queue state = [Q(t), Z(t), U(t)]
Define: L(Θ(t)) = (1/2)[sum of all squared queue sizes]
Define: D(Θ(t)) = E{L(Θ(t+1)) - L(Θ(t)) | Θ(t)}

Schedule using the modified "max-weight" rule: every slot t, observe the queue states and make a 2-stage decision to minimize the "drift plus penalty":

Minimize: D(Θ(t)) + V f(g(t))

where V is a constant control parameter that affects proximity to optimality (with a corresponding delay tradeoff).
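In LaTeX, the definitions above read (a direct transcription of the slide's prose, with Θ(t) denoting the combined queue state):

```latex
L(\Theta(t)) = \tfrac{1}{2}\Big( \sum_{\ell} Q_\ell(t)^2 + \sum_{m} Z_m(t)^2
             + \sum_{n} U_n(t)^2 \Big),
\qquad
D(\Theta(t)) = \mathbb{E}\big\{ L(\Theta(t+1)) - L(\Theta(t)) \,\big|\, \Theta(t) \big\}.
```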

Page 10: Max Weight Learning Algorithms with Application to Scheduling in Unknown Environments

How to (try to) minimize: D(Θ(t)) + V f(g(t))

The proxy variables g(t) appear separably, and their terms can be minimized without knowing the system stochastics! Their subproblem (reconstructed from the virtual-queue updates on Page 8) is:

Minimize: V f(g(t)) + Σ_n U_n(t) h_n(g(t)) - Σ_m Z_m(t) g_m(t)

Subject to: g(t) in a bounded feasible region

[Z_m(t) and U_n(t) are known queue backlogs for slot t]
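Since this g(t) subproblem is a deterministic convex program with known coefficients, it can in principle be handed to any convex solver. A hedged sketch using SciPy, assuming a box-shaped feasible region (the actual region is not specified on the slide, and all function choices below are invented for illustration):

```python
import numpy as np
from scipy.optimize import minimize

def solve_proxy(U, Z, V, f, h, bounds, g0):
    """Pick g(t) by minimizing V*f(g) + sum_n U_n*h_n(g) - sum_m Z_m*g_m
    over a box; f and h are the convex functions of the problem statement."""
    U, Z = np.asarray(U), np.asarray(Z)
    objective = lambda g: V * f(g) + U.dot(h(g)) - Z.dot(g)
    return minimize(objective, g0, bounds=bounds).x

# Toy usage (every function here is a made-up example):
g = solve_proxy(U=[1.0], Z=[0.5, -0.2], V=10.0,
                f=lambda g: float(np.sum(g**2)),
                h=lambda g: np.array([float(np.sum(g))]),
                bounds=[(0, 1), (0, 1)], g0=np.array([0.5, 0.5]))
```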

Page 11: Max Weight Learning Algorithms with Application to Scheduling in Unknown Environments

Minimizing the remaining terms of D(Θ(t)) + V f(g(t)): the terms that involve the stage-1 and stage-2 decisions k(t) and I(t). These weight the arrivals, services, and penalties by the current queue backlogs, and their conditional expectation depends on the unknown distributions F_k(w).

Page 12: Max Weight Learning Algorithms with Application to Scheduling in Unknown Environments

Solution: Define g^(mw)(t), I^(mw)(t), k^(mw)(t) as the ideal max-weight decisions (minimizing the drift expression).

Define e_k(t):
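The formula for e_k(t) was lost in extraction. Given that stage 2 minimizes Y_{k(t)}(w(t), I, Θ) and that the stage-1 score must average over the not-yet-revealed w, a plausible reconstruction is:

```latex
e_k(t) \;=\; \mathbb{E}\Big\{ \min_{I \in \mathcal{I}}
             Y_k\big(w, I, \Theta(t)\big) \;\Big|\; \Theta(t) \Big\},
\qquad w \sim F_k .
```

This expectation is exactly what cannot be computed when F_k is unknown, which is the point of the slides that follow.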

k^(mw)(t) = argmin_{k in {1, …, K}} e_k(t)   (Stage 1)

I^(mw)(t) = argmin_{I in I} Y_{k(t)}(w(t), I, Θ(t))   (Stage 2)

g^(mw)(t) = the solution to the proxy problem

Then the drift-plus-penalty bound is minimized. But e_k(t) is an expectation over the unknown distribution F_k(w): how can we compute it?

Page 13: Max Weight Learning Algorithms with Application to Scheduling in Unknown Environments

Approximation Theorem (related to Neely 2003, G-N-T F&T 2006):

If the actual decisions come within additive errors ε_Q and ε_Z of the ideal max-weight decisions, where σ is related to the slackness of the constraints, then:

- All constraints are satisfied.
- Average queue sizes < [B + C + c_0 V] / min[ε_max - ε_Q, σ - ε_Z]
- The penalty satisfies: f(x̄) ≤ f*_optimal + O(max[ε_Q, ε_Z]) + (B + C)/V

Page 14: Max Weight Learning Algorithms with Application to Scheduling in Unknown Environments

It all hinges on our approximation of e_k(t):

Declare a "type k exploration event" independently with probability q > 0 (small). We must use k(t) = k at such events.

{w_1^(k)(t), …, w_W^(k)(t)} = samples over the past W type-k exploration events.

Approach 1:
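The Approach 1 formula is missing from the extraction; the natural empirical version of e_k(t), pairing the W stored samples with the current backlogs Θ(t), would be (an inference, not a verbatim copy of the slide):

```latex
\hat{e}_k(t) \;=\; \frac{1}{W} \sum_{i=1}^{W} \min_{I \in \mathcal{I}}
                   Y_k\big(w_i^{(k)}(t),\, I,\, \Theta(t)\big).
```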

Page 15: Max Weight Learning Algorithms with Application to Scheduling in Unknown Environments

It all hinges on our approximation of e_k(t):

Declare a "type k exploration event" independently with probability q > 0 (small). We must use k(t) = k at such events.

{w_1^(k)(t), …, w_W^(k)(t)} = samples over the past W type-k exploration events.
{Θ_1^(k)(t), …, Θ_W^(k)(t)} = queue backlogs at these sample times.

Approach 2:
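A Python sketch of the Approach 2 estimate, assuming (per the two lists above) that each stored sample is paired with the backlog recorded at its exploration event; `Y`, `I_set`, and the container layout are illustrative assumptions:

```python
def estimate_e(k, window, Y, I_set):
    """Approach 2 empirical estimate of e_k(t).
    window: list of (w, theta) pairs from the past W type-k exploration
    events, where theta is the queue state recorded at that event."""
    scores = [min(Y(k, w, I, theta) for I in I_set) for (w, theta) in window]
    return sum(scores) / len(scores)
```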

Page 16: Max Weight Learning Algorithms with Application to Scheduling in Unknown Environments

Analysis (Approach 2):

Subtleties:
1) An "inspection paradox" issue requires using the samples taken at exploration events, so that {w_1^(k)(t), …, w_W^(k)(t)} are i.i.d.
2) Even so, {w_1^(k)(t), …, w_W^(k)(t)} are correlated with the queue backlogs at time t, so we cannot directly apply the Law of Large Numbers!

Page 17: Max Weight Learning Algorithms with Application to Scheduling in Unknown Environments

Analysis (Approach 2):

Use a "delayed queue" analysis:

[Figure: timeline from t_start to t marking the sample times w_1(t), w_2(t), w_3(t), …, w_W(t); the queue backlogs change by at most a constant over this interval, so the LLN can be applied.]

Page 18: Max Weight Learning Algorithms with Application to Scheduling in Unknown Environments

Max-Weight Learning Algorithm (Approach 2):
(No knowledge of the probability distributions is required!)

- Run random exploration events (probability q).
- Choose the stage-1 decision k(t) = argmin_{k in {1, …, K}} e_k(t), using the Approach 2 estimate of e_k(t).
- Use I^(mw)(t) for the stage-2 decision: I^(mw)(t) = argmin_{I in I} Y_{k(t)}(w(t), I, Θ(t)).
- Use g^(mw)(t) for the proxy variables.
- Update the virtual queues and the moving averages. (An end-to-end sketch follows this list.)
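For concreteness, here is an end-to-end toy sketch of the loop under stated assumptions: the system maps (`draw_w`, `Y`, `update`), the two-queue dynamics, the single exploration coin, and the handling of unexplored options are all invented stand-ins, not the slides' definitions. The virtual-queue and proxy updates from Pages 8 and 10 are folded into `update` for brevity.

```python
import random
from collections import deque

class ToySystem:
    """Two queues, two stage-1 options, two stage-2 actions; all dynamics
    invented purely to make the sketch executable."""
    I_set = (0, 1)

    def initial_state(self):
        return (0.0, 0.0)

    def draw_w(self, k):
        # Random vector whose distribution the controller never sees.
        return (random.random(), random.random())

    def Y(self, k, w, I, Q):
        # Toy max-weight score: backlog-weighted service deficit for
        # serving queue I, plus a small cost for the "measure" option k=1.
        return Q[I] * (0.5 - w[I]) + 0.1 * k

    def update(self, Q, k, w, I):
        # Serve queue I at rate w[I]; both queues get mean-0.3 arrivals.
        arrivals = (random.uniform(0, 0.6), random.uniform(0, 0.6))
        return tuple(max(Q[l] - (w[I] if l == I else 0.0), 0.0) + arrivals[l]
                     for l in range(2))

def run_mwl(env, T=10_000, K=2, q=0.05, W=20):
    """Max-Weight Learning loop (Approach 2), as sketched on this slide."""
    Q = env.initial_state()
    windows = [deque(maxlen=W) for _ in range(K)]  # past W exploration samples per k

    def e_hat(k):
        if not windows[k]:
            return float("-inf")  # unexplored option: try it first (a design choice)
        return sum(min(env.Y(k, w, I, th) for I in env.I_set)
                   for (w, th) in windows[k]) / len(windows[k])

    for t in range(T):
        # Exploration: one coin, then a uniformly random type (the slides
        # instead declare a per-type event; simplified here).
        explore = random.random() < q
        k = random.randrange(K) if explore else min(range(K), key=e_hat)  # Stage 1
        w = env.draw_w(k)                                    # Stage 1 reveals w(t)
        I = min(env.I_set, key=lambda a: env.Y(k, w, a, Q))  # Stage 2
        if explore:
            windows[k].append((w, Q))  # store sample + backlog at the sample time
        Q = env.update(Q, k, w, I)     # queue (and, in general, virtual-queue) update
    return Q

print(run_mwl(ToySystem()))
```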

Page 19: Max Weight Learning Algorithms with Application to Scheduling in Unknown Environments

Theorem (Fixed W, V): With window size W we have:

- All constraints are satisfied.
- Average queue sizes < [B + C + c_0 V] / min[ε_max - ε_Q, σ - ε_Z]
- The penalty satisfies: f(x̄) ≤ f*_q + O(1/sqrt{W}) + (B + C)/V

where f*_q is the optimal value subject to the random exploration events of probability q.

Page 20: Max Weight Learning Algorithms with Application to Scheduling in Unknown Environments

Concluding Theorem (Variable W, V): Let 0 < b_1 < b_2 < 1 and define V(t) = (t+1)^{b_1}, W(t) = (t+1)^{b_2}.

Then under the Max-Weight Learning Algorithm:
- All constraints are satisfied.
- All queues are mean rate stable*: lim_{t -> inf} E{Σ_l Q_l(t)}/t = 0.
- The average penalty achieves exact optimality (subject to the random exploration events): f(x̄) = f*_q.

*Mean rate stability does not imply finite average congestion and delay. In fact, average congestion and delay are necessarily infinite when exact optimality is reached.
