Designing Autonomic Wireless Multi-hop Networks for Delay ...medianetlab.ee.ucla.edu/data/slides/Defense_presentation_shiang.pdf · Designing Autonomic Wireless Multi-hop Networks

1

Designing Autonomic Wireless Multi-hop Networks for Delay-Sensitive Applications

Peter Hsien-Po Shiang
Advisor: Prof. Mihaela van der Schaar

Electrical Engineering, UCLA

2

Delay-sensitive applications are booming!

Examples of delay-sensitive applications:
• Video telephony
• Video conferencing
• Live audio
• Live video
• Surveillance
• Vehicular communications
• Battlefield sensing
• Games

1) Hard delay constraints
2) Prioritized multimedia traffic (graceful degradation desired)

3

Overall goal

Building efficient multi-hop networks for delay-sensitive applications

– Autonomic decision making framework
• Gather local information
• Learn
• Make decisions and interact

Autonomic node = Agent

[Figure: autonomic node (agent) with an information gathering phase, a learning phase, and a decision making phase. Labels: channel condition, application layer traffic requirements, network environment, data, control info., info. exchange.]

4

Autonomic network scenarios

Network scenario: Multimedia transmission over wireless mesh network
– Dynamics: source traffic, channel condition
– Local information: source rate, transmission rate, packet error rate

Network scenario: Distributed resource management over cognitive radio networks
– Dynamics: resource availability (spectrum holes)
– Local information: primary users' loading, other secondary users' actions

Network scenario: Power control over ad hoc mobile networks
– Dynamics: interference coupling among transmitter-receiver pairs
– Local information: SINR

[Figures: a wireless mesh network with sources S1, S2, relays r1 to r5, and destinations D1, D2; a cognitive radio network with primary users $PU_1, \dots, PU_4$, frequency channels $F_1, \dots, F_4$, and secondary users $SU_1, SU_2, \dots, SU_N$; an ad hoc network for power control.]

5

Overview

I. Multimedia transmission over mesh networks
II. Exploiting information over space – information horizon
III. Exploiting information over time – learning
IV. Conclusions

[Figure: agent loop. 1. Information gathering phase; 2. Learning phase; 3. Decision phase. Labels: multimedia characteristics, control info., data packets.]

6

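Focus: multiple multimedia applications over multi-hop wireless networks

Nodes: $\mathcal{M} = \{m\}$
Edges: $\mathcal{E} = \{l = (m, m')\}$
Applications: $\mathcal{V} = \{V_i\}$ (e.g. $V_1$, $V_2$)
Classes: $V_i = \{C_k,\ k = 1, \dots, K_i\}$
Actions: $\mathbf{A} = [A_m,\ \forall m \in \mathcal{M}]$
Utility: $U_i(\mathbf{A})$
Goal: maximizing overall efficiency

$$\max_{\mathbf{A}} \sum_{V_i \in \mathcal{V}} U_i(\mathbf{A})$$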

7

Limitations of prior work (1/2)

Centralized optimization for multimedia transmission

• Wu and Chou (2005)

• Setton, Yoo, Zhu, Goldsmith, and Girod. (2005)

• Jurca, Frossard (2007)

• Andreopoulos, Mastronarde, and Van der Schaar (2006,2007)

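$$\max_{\mathbf{A}} \sum_{V_i \in \mathcal{V}} U_i(R_i(\mathbf{A})) \quad \text{s.t.}\ \mathbf{R}(\mathbf{A}) \subseteq \mathcal{CR}$$

$R_i(\mathbf{A})$: function of resource, e.g. throughput
$\mathbf{R}(\mathbf{A}) \subseteq \mathcal{CR}$: capacity constraint

                        Traditional sol.    Proposed sol.
Decision maker          Centralized         Distributed
Complexity              High                Low
Adaptation ability      Slow                Fast
Information overhead    High                Low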

8

Limitations of prior decentralized work (2/2)

Flow-based optimized routing using queue information feedback

• Awerbuch and Leighton (1994)

• Neely, Modiano, Rohrs (2005)

• Gupta, Javidi (2007)

• Gupta, Lin, Srikant (2007)

Flow-based optimized routing using link state information

• Wei, Zakhor (2002)

• Draves, Padhye, Zill (2004)

                       Traditional sol.    Proposed sol.
Resource allocation    Predetermined       Online adaptation
Application model      Flow-based          Packet-based
Delay constraint       Implicit            Explicit

9

Challenges

• Heterogeneous characteristics of delay-sensitive applications: different priorities, hard delay deadlines, and loss tolerance
• Time-varying transmission environment: dynamic network conditions
• Informationally decentralized environment: cost of information gathering
• Coupling among agents' actions and utilities

10

Required solution features for multimedia transmission

• Fully distributed optimization that determines the actions $A_m$ at each node, e.g. how to relay
• Dynamic adaptation to the changing network/source conditions at each node and to the coupling between nodes' actions
• A move from rate-constrained, flow-based optimization to explicit delay-constrained, packet-based optimization

11

Delay-sensitive application quality model

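Quality at the sources:

$$Q^{trans} = \sum_{V_i \in \mathcal{V}} \sum_{k=1}^{K_i} \lambda_k R_k$$

Quality at the destinations:

$$Q^{receive}(\mathbf{A}) = \sum_{V_i \in \mathcal{V}} \sum_{k=1}^{K_i} \lambda_k R_k \left(1 - P_k(\mathbf{A})\right)$$

[Figure: multimedia packets within a GOP transmission time are grouped into classes $C_1, \dots, C_4$; class $C_k$ is labeled with $(\lambda_k, D_k, L_k)$. Source nodes $V_1$, $V_2$ send through relay nodes $m_1, \dots, m_4$.]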

12

Cross-layer transmission strategy (action $A_m$)

Layer          Decision
Application    scheduling
Network        relay selection
Data Link      retransmission limit
Physical       MCS selection

The priority queuing model provides a unified framework to analyze:
• Packet scheduling
• Relay selection, which impacts the packet arrival process
• ARQ, which can be modeled as a geometric service time distribution
• MCS, which provides different physical transmission rates and packet error rates

Per-packet decisions with explicit delay constraints

13

Elementary structure (2-hop case)

• V users with distinct sources and destinations.
• M intermediate nodes.
• Information feedback from all the nodes of the next hop – fully distributed.

14

Decisions at the PHY and MAC layer

We show that, in the delay-constrained optimization, it is optimal to transmit the most important packet first with an infinite retransmission limit:

$$\gamma^{MAX}_{k,m_h,m_{h+1}} \to \infty$$

Knowing the next relay, the optimal modulation and coding scheme is:

$$\hat{\theta}_{k,m_h,m_{h+1}} = \arg\max_{\theta} T^{eff}_{k,m_h,m_{h+1}}, \qquad T^{eff}_{k,m_h,m_{h+1}} = \left(1 - p_{k,m_h,m_{h+1}}(\theta, L_k)\right) T_{m_h,m_{h+1}}(\theta)$$

$$\gamma^{MAX}_{k,m_h,m_{h+1}} = \frac{T^{eff}_{k,m_h,m_{h+1}} \left(D_k - Delay^{PASS}_{k,m_h}\right)}{L_k} - 1$$

Cascade the elementary structure to a multi-hop network

15

Overlay multi-hop network structure

• Directed acyclic multi-hop network
• Can be applied to any physical network as an overlay network

[Figure: overlay network carrying classes $C_1, \dots, C_K$, with information feedback between hops.]

16

Decentralized optimization

• Advantages:
– No predetermined rate allocation
– Low complexity
– Fully distributed solution
– Fast adaptation to network changes

Centralized cross-layer optimization:

$$\mathbf{A}^{opt} = \arg\max_{\mathbf{A}} \sum_{k=1}^{K} \lambda_k R_k \left(1 - P_k(\mathbf{A})\right) \quad \text{s.t.}\ Delay_k(\mathbf{A}) \le D_k,\ k = 1, \dots, K \quad \text{(delay constraints)}$$

Decentralized optimization:

$$A^{opt}_{k,m_h}(\mathcal{L}_{m_h}) = \arg\min_{A_{k,m_h}} E[Delay_{k,m_h}(A_{k,m_h}, \mathcal{L}_{m_h})] \quad \text{s.t.}\ Delay_{k,m_h}(A_{k,m_h}, \mathcal{L}_{m_h}) \le D_k - Delay^{PASS}_{k,m_h}$$

• Delay evaluation: $E[Delay_{k,m_h}]$

17

Route selection at the network layer

Our method: a generalization of the Bellman-Ford routing algorithm

Local information: transmission rates, packet error rates, expected delays

Information to the previous hop, computed from the information from the next hop:

$$E[Delay_{k,m_h}] = E[W_{k,m_h}] + \sum_{m_{h+1}=1}^{M_{h+1}} \beta_{k,h+1,m_{h+1}} E[Delay_{k,m_{h+1}}]$$

Relay selecting parameter:

$$\beta_{k,h,m_h} = \frac{Coeff_k}{1 + \kappa E[Delay_{k,m_h}]^{\gamma}}, \qquad Coeff_k = \left( \sum_{m_h \text{ that feedback}} \frac{1}{1 + \kappa E[Delay_{k,m_h}]^{\gamma}} \right)^{-1}$$

Property:
• Automatically avoids the congestion region
• Multi-path routing

Proposed solution: self-learning algorithm [H. Shiang JSAC 2007]
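As a minimal sketch of the relay selecting parameter on this slide, the snippet below maps the expected delays fed back by next-hop relays to selection probabilities and then evaluates the node's own expected delay. It assumes the reading β = Coeff / (1 + κ E[Delay]^γ), with Coeff normalizing the probabilities to one; the function names, the sample delays, and the values of κ and γ are illustrative assumptions, not taken from the slides.

```python
def relay_selection_probs(expected_delays, kappa=1.0, gamma=1.0):
    """Map each next-hop relay's fed-back expected delay to a selection probability."""
    # Unnormalized weight 1 / (1 + kappa * E[Delay]^gamma): lower delay, larger weight.
    weights = [1.0 / (1.0 + kappa * d ** gamma) for d in expected_delays]
    coeff = 1.0 / sum(weights)  # normalizer so the probabilities sum to one
    return [coeff * w for w in weights]


def expected_delay(queue_wait, betas, next_hop_delays):
    """E[Delay] at this node: own queue wait plus the beta-weighted next-hop delays."""
    return queue_wait + sum(b * d for b, d in zip(betas, next_hop_delays))


# Hypothetical feedback (seconds) from three next-hop relays.
feedback = [0.02, 0.05, 0.20]
betas = relay_selection_probs(feedback, kappa=100.0, gamma=1.0)
delay_here = expected_delay(0.01, betas, feedback)
```

Because congested relays report larger expected delays, they receive smaller probabilities, which is one way to read the "automatically avoids the congestion region" property; spreading probability over several relays gives multi-path routing.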

18

Convergence of the route selection

Proposition 1: The self-learning policy over an H-hop overlay network will converge to a steady state

Key idea:
• Get information feedback $E[Delay_{k,m_{h+1}}]$
• Learn the relay selecting prob. $\beta_{k,h+1,m_{h+1}}$
• Priority queuing analysis for $E[W_{k,m_h}]$
• Evaluate $E[Delay_{k,m_h}]$

$$E[Delay_{k,m_h}] = E[W_{k,m_h}] + \sum_{m_{h+1}=1}^{M_{h+1}} \beta_{k,h+1,m_{h+1}} E[Delay_{k,m_{h+1}}]$$

19

Advantages of using a queuing model

Advantages:
• Fast adaptation to the dynamics
• Sophisticated models at different layers

[Figure: analysis pipeline. Video stream statistics and relay selection feed the input rate analysis ($\eta_{k,h,m_h}$); retransmission and TXOP (MAC layer) and modulation (PHY) feed the service time analysis ($X_{k,h,m_h}$); both feed the priority queuing analysis, which yields delay and packet loss.]

20

Average queue waiting time analysis

Average queue waiting time (M/G/1 preemptive-repeat model):

$$E[W_{k,h,m_h}] = \frac{\sum_{i=1}^{k} \eta_{i,h,m_h} E[X^2_{i,h,m_h}]}{2\left(1 - \sum_{i=1}^{k-1} \eta_{i,h,m_h} E[X_{i,h,m_h}]\right)\left(1 - \sum_{i=1}^{k} \eta_{i,h,m_h} E[X_{i,h,m_h}]\right)}$$

Interference affects the average service time $E[X_{i,h,m_h}]$.

Approximation of packet loss rate at a relay:

$$P_{k,h,m_h} = \text{Prob}\left[W_{k,h,m_h} > D_k - \sum_{j=0}^{h-1} E[W_{k,j}]\right] \approx \left(\sum_{i=1}^{K} \eta_{i,h,m_h} E[X_{i,h,m_h}]\right) \exp\left(-\frac{\left(D_k - \sum_{j=0}^{h-1} E[W_{k,j}]\right) \sum_{i=1}^{K} \eta_{i,h,m_h} E[X_{i,h,m_h}]}{E[W_{k,h,m_h}]}\right)$$

where

$$E[W_{k,j}] = \sum_{m_j=1}^{M_j} \beta_{k,j,m_j} E[W_{k,j,m_j}]$$
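The average waiting time of a priority class can be evaluated numerically with a few lines of code. The sketch below assumes the standard preemptive-priority M/G/1 form (second-moment sum over classes 1..k, divided by twice the product of the two idle fractions); the function name and the sample rates and service moments are illustrative assumptions, with class index 0 being the highest priority.

```python
def mean_waiting_times(rates, mean_service, second_moment_service):
    """Mean queue waiting time per priority class (index 0 = highest priority).

    rates[i]                 : arrival rate eta_i of class i
    mean_service[i]          : E[X_i]
    second_moment_service[i] : E[X_i^2]
    """
    waits = []
    for k in range(len(rates)):
        # Residual-service term: sum over classes up to and including k.
        residual = sum(rates[i] * second_moment_service[i] for i in range(k + 1))
        # Load from strictly higher-priority classes, then including class k.
        rho_above = sum(rates[i] * mean_service[i] for i in range(k))
        rho_upto = rho_above + rates[k] * mean_service[k]
        if rho_upto >= 1.0:
            waits.append(float("inf"))  # class k is overloaded, the queue is unstable
        else:
            waits.append(residual / (2.0 * (1.0 - rho_above) * (1.0 - rho_upto)))
    return waits


# Hypothetical two-class example with exponential service (E[X] = 1, E[X^2] = 2).
waits = mean_waiting_times([0.2, 0.2], [1.0, 1.0], [2.0, 2.0])
```

With a single class the expression reduces to the Pollaczek-Khinchine mean waiting time, which is a quick sanity check on any implementation.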

21

Results of the elementary structure

PSNR (dB)      Simulation                         Analytical
Tm (Mbps)      0.6     0.5     0.4     0.3      0.6     0.5     0.4     0.3
Coastguard     35.59   35.10   34.29   32.26    35.61   35.34   33.93   32.49
Mobile         33.05   32.00   31.41   29.34    33.12   31.74   30.34   30.15

[Figure: 2-hop elementary structure with users v1, v2 and intermediate nodes m1 to m4; link rates are 3Tm, 4Tm, and 5Tm.]

Analytical Result

22

Results for a 6-hop network

PSNR (dB)      Simulation                         Analytical
Tm (Mbps)      0.6     0.5     0.4     0.3      0.6     0.5     0.4     0.3
Coastguard     35.58   33.88   33.56   31.86    35.61   33.93   33.92   32.48
Mobile         32.85   31.35   30.21   28.39    33.12   31.74   30.34   28.20

Analytical Result

23

Comparisons with state-of-the-art routing solutions

[Figure: mesh network with sources S1, S2, relays r1 to r5, and destinations D1, D2, showing physical connections and overlay connections; link rates range from 2.5Tm to 5.1Tm.]

Y-PSNR (dB)                Tm = 0.6 (Mbps)              Tm = 0.3 (Mbps)
Simulated method           “Coastguard”   “Mobile”      “Coastguard”   “Mobile”
Self-learning policy       35.61          33.10         33.27          30.42
MDTMR [Wei, Zakhor 2004]   35.58          32.85         31.86          28.39
AODV [Perkins 1999]        34.32          31.37         30.67          24.98

24

Overview




The need for information feedback
• Decentralized decision making
• Timely adaptation
• Inter-user collaboration

A similar concept can be found in distance vector routing protocols, e.g. AODV [Perkins 1999], DSDV [Perkins 1994]

• Information horizon = 1 hop?

25

Larger information horizon

[Figure: nodes n1 to n7 arranged over hop1, hop2, hop3, shown for information horizon $h^{horizon} = 1$ and $h^{horizon} = 2$. Video data for classes $C_1$ to $C_4$ (with TX strategies) flows forward; information feedback and TX strategies from multiple agents flow back over $h^{horizon}$ hops.]

Advantages:
• More accurate delay estimation
• Faster adaptation

26

Information horizon tradeoff

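A larger information horizon $h^{horizon}$ gives:
• better decisions of $\mathbf{A}(h^{horizon})$, so that $P_k(\mathbf{A}(h^{horizon})) \downarrow$
• larger time overhead per packet, $E[X_{k,m_h}(h^{horizon})] \uparrow$, so that $P_k(\mathbf{A}(h^{horizon})) \uparrow$

$$\mathbf{A}^{opt} = \arg\max_{\mathbf{A}} \sum_{k=1}^{K} \lambda_k R_k \left(1 - P_k(\mathbf{A}(h^{horizon}))\right)$$

Example: risk-aware scheduling $\pi_{m_h}$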

27

Risk-aware scheduling – definition of “risk”

Three categories of the queued packets:
• “Dropped” packets: $Delay^{PASS}_{k,m_h} > D_k$
• “Almost dropped” packets: $Delay^{PASS}_{k,m_h} + E[Delay_{k,m_h}] > D_k$
• “Seldom dropped” packets: $Delay^{PASS}_{k,m_h} + E[Delay_{k,m_h}] \le D_k$

Definition of risk:

$$Risk_{k,m_h}(Time^{I}, h^{horizon}) = \begin{cases} \text{Prob}\left(W_{k,m_h} + Time^{I} > E[D^{rem}_{k,m_h}]\right), & \text{if } E[D^{rem}_{k,m_h}] > 0 \text{ (seldom-dropped packets)} \\ 0, & \text{if } E[D^{rem}_{k,m_h}] \le 0 \text{ (almost-dropped packets)} \end{cases}$$

$$\approx \begin{cases} \left(\sum_{i=1}^{k} \eta_{i,m_h} E[X_{i,m_h}]\right) \exp\left(-\frac{\sum_{i=1}^{k} \eta_{i,m_h} E[X_{i,m_h}]}{E[W_{k,m_h}]} \times \left(E[D^{rem}_{k,m_h}] - Time^{I}\right)\right), & \text{if } E[D^{rem}_{k,m_h}] > 0 \\ 0, & \text{if } E[D^{rem}_{k,m_h}] \le 0 \end{cases}$$

where

$$E[D^{rem}_{k,m_h}] = D_k - Delay^{PASS}_{k,m_h} - E[Delay_{k,m_h}]$$

28

Illustrative example

Class $C_2$ should be sent before class $C_1$ during $Time^{I}$, since it is more “risky”.

User 1: Mobile, deadline: 500 ms, expected remaining delay $E[D^{rem}_{1,m}]$
User 2: Coastguard, deadline: 300 ms, expected remaining delay $E[D^{rem}_{2,m}]$

29

Problem formulation

Priority-based packet scheduling:

$$\pi^{PRI}_{m_h} = \arg\max_{\pi} \sum_{k=1}^{K} \lambda_k \times N_{k,m_h}(\pi, Time^{I})$$

Risk-aware packet scheduling:

$$\pi^{IFDS}_{m_h}(h^{horizon}) = \arg\max_{\pi} \sum_{k=1}^{K} \lambda_k \times Risk_{k,m_h}(Time^{I}, h^{horizon}) \times N_{k,m_h}(\pi, Time^{I})$$

Both are subject to $\pi = (\pi_1, \dots, \pi_l, \dots, \pi_L)$, with $\pi_l = drop$ if $l \in C_k$ and $Delay^{PASS}_{l,m_h} \ge D_k$.

$N_{k,m_h}(\pi, Time^{I})$: number of class $C_k$ packets sent. The risk-aware scheduler weights each class by $\lambda_k \times Risk_{k,m_h}$ instead of using only $\lambda_k$.

30

Optimal information horizon

[Figure: 6-hop network (Hop1 to Hop6) with sources S1, S2, relays n1 to n13, and destinations D1, D2; link rates range from 3Tm to 10Tm. Video: Mobile, deadline = 500 ms. Video: Coastguard, deadline = 300 ms.]

Analytical average PSNR (dB) for various information horizons:

Tm          Priority, h=1    Risk, h=2    Risk, h=3    Risk, h=4
300 Kbps    29.61            30.1         29.63        29.59
400 Kbps    30.75            30.80        31.1         30.85
500 Kbps    31.50            31.55        32.0         31.75

31

Overview




32

Given limited information feedback, can an agent do better?

Remarks:
• Interaction with other agents
• Local information is changing over time
• Current actions may influence future local information

Answer: Yes!
Solution: learn the changing environment and make foresighted decisions

[Figure: agent loop. The agent gathers local information (e.g. input rate, SINR, etc.) from the wireless network (other agents), evaluates utility, and determines a transmission action, which has future influence on the network.]

33

Foresighted decision making

Key ideas:
• An agent does not have to wait for something to actually happen and then react!
• The anticipation is not only over space, but also over time
• Markov decision process

[Figure: agent loop. The agent gathers local information (e.g. input rate, SINR, etc.; symbols $\eta_{k,m_h}$, $x_{m_h,m_{h+1}}$) from the wireless network (other agents) to form its state, uses the priority queuing model $E[W_{k,m_h}(s_{m_h}, A_{m_h})]$ and the state transition prob. $p(s'_{m_h} \mid s_{m_h}, A_{m_h})$ for future utility evaluation, and selects an action that has future influence on the network.]

34

Markov Decision Process (MDP)

• Tuple: $\langle \mathcal{S}, \mathcal{A}, p, R, \gamma \rangle$
– state: $s \in \mathcal{S}$
– action: $A \in \mathcal{A}$
– transition probability: $p(s' \mid s, A)$
– immediate reward: $R(s, A)$
– discount factor: $\gamma$, where $0 < \gamma < 1$

• Goal: maximize the discounted sum of future rewards

$$\sum_{t=0}^{\infty} \gamma^t R(s_t, A_t \mid s_0)$$

Why discounted?

35

MDP Solution

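Policy: $\pi: \mathcal{S} \to \mathcal{A}$

Optimal state-value function (Bellman equation):

$$V^*(s) = \max_{A \in \mathcal{A}} \left\{ R(s, A) + \gamma \sum_{s' \in \mathcal{S}} p(s' \mid s, A) V^*(s') \right\}$$

The first term is the immediate reward; the second term is the discounted expected future reward.

Optimal policy:

$$\pi^*(s) = \arg\max_{A \in \mathcal{A}} \left\{ R(s, A) + \gamma \sum_{s' \in \mathcal{S}} p(s' \mid s, A) V^*(s') \right\}$$

Off-line solution: value iteration

$$V^{t+1}(s) = \max_{A \in \mathcal{A}} \left\{ R(s, A) + \gamma \sum_{s' \in \mathcal{S}} p(s' \mid s, A) V^{t}(s') \right\}$$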
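The off-line value-iteration recursion on this slide can be sketched as follows. This is a generic textbook version, not the thesis implementation; the two-state "good"/"bad" toy MDP, its rewards, and all function names are assumptions made for illustration.

```python
def q_value(s, a, V, reward, trans, gamma):
    """Immediate reward plus discounted expected future value."""
    return reward[s][a] + gamma * sum(p * V[s2] for s2, p in trans[s][a].items())


def value_iteration(states, actions, reward, trans, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Iterate the Bellman backup until the value function stops changing."""
    V = {s: 0.0 for s in states}
    for _ in range(max_iter):
        V_new = {s: max(q_value(s, a, V, reward, trans, gamma) for a in actions)
                 for s in states}
        diff = max(abs(V_new[s] - V[s]) for s in states)
        V = V_new
        if diff < tol:
            break
    # Greedy policy with respect to the converged value function.
    policy = {s: max(actions, key=lambda a: q_value(s, a, V, reward, trans, gamma))
              for s in states}
    return V, policy


# Toy MDP (hypothetical): staying in the "good" state pays 1 per step.
states = ["good", "bad"]
actions = ["stay", "switch"]
reward = {"good": {"stay": 1.0, "switch": 0.0}, "bad": {"stay": 0.0, "switch": 0.0}}
trans = {
    "good": {"stay": {"good": 1.0}, "switch": {"bad": 1.0}},
    "bad": {"stay": {"bad": 1.0}, "switch": {"good": 1.0}},
}
V, policy = value_iteration(states, actions, reward, trans, gamma=0.9)
```

In the toy example the agent in the "bad" state switches even though the immediate reward is zero, because the discounted future reward is larger; this is the foresighted behavior the slides contrast with myopic decisions.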

36

Reinforcement learning

Applied when the dynamics are partially known or unknown

Model-free reinforcement learning, e.g. Q-learning [Watkins 1992], TD-learning [Sutton 1988]
• Cannot take advantage of the queuing model
• Converges slowly

Model-based reinforcement learning
• Priority queuing model (M/G/1 preemptive-repeat model)
• Maximum likelihood state transition probability $p(s' \mid s, A)$

$$Q^*(s, A) = R(s, A) + \gamma \sum_{s' \in \mathcal{S}} p(s' \mid s, A) V^*(s')$$

$$\hat{T}^t_{ss'}(A) = \frac{n^t_{ss'}(A)}{n^t_s(A)}, \qquad n^t_s(A) = \sum_{s' \in \mathcal{S}} n^t_{ss'}(A)$$
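The maximum likelihood transition estimate is a ratio of visit counts, which can be kept incrementally. A minimal sketch follows; the class name, the method names, and the sample state and action labels are hypothetical.

```python
from collections import defaultdict


class TransitionModel:
    """Count-based maximum likelihood estimate of p(s' | s, A)."""

    def __init__(self):
        # counts[(s, a)][s_next] = number of observed transitions s --a--> s_next
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, s, a, s_next):
        """Record one observed transition."""
        self.counts[(s, a)][s_next] += 1

    def prob(self, s, a, s_next):
        """Estimate n_ss'(A) / n_s(A); returns 0.0 if (s, a) was never visited."""
        row = self.counts.get((s, a))
        if not row:
            return 0.0
        total = sum(row.values())
        return row.get(s_next, 0) / total


# Hypothetical observations: from state "low" under action "relay1".
model = TransitionModel()
for nxt in ["low", "high", "low"]:
    model.observe("low", "relay1", nxt)
p_low = model.prob("low", "relay1", "low")
```

The estimated probabilities can be plugged into a Bellman backup in place of the true transition probabilities, which is what distinguishes the model-based approach from model-free Q-learning.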

37

Coupling among agents in the multi-hop network

[Figure: agent $m_h$ receives the information feedforward $F^f_{h-1}$ and the information feedback $F^b_{h+1}$.]

Expected delay evaluation:

$$E[Delay_{k,m_h}] = E[W_{k,m_h}] + E[Delay_{k,m_{h+1}}]$$

Condition to drop a certain priority class:

$$Delay^{PASS}_{k,h-1} + E[W_{k,m_h}] > D_k$$

Required local information:

$$s_{m_h} = \mathcal{L}_{m_h} = \{F^b_{h+1}, F^f_{h-1}\}$$

Information feedback:

$$F^{b,t}_{h+1}(m_{h+1}) = E[Delay^{t}_{k,m_{h+1}}]$$

38

Proposed transmission policy

Transmission policy update:

$$\mu^t_{kh}(s_{m_h}) = \arg\min_{A_{m_h} \in \mathcal{A}_{m_h}} \left\{ E[W_{k,m_h}(s_{m_h}, A_{m_h})] + F^{b,t}_{h+1}(A_{m_h}) + \gamma \sum_{s'_{m_h}} T^t_{s_{m_h} s'_{m_h}}(A_{m_h}) V^{t-1}(s'_{m_h}, F^{b,t}_{h+1}) \right\}$$

The objective combines the current delay and the (discounted) future delay.

Information exchange update:

$$F^{b,t}_{h}(m_h) = F^{b,t}_{h+1}(m_{h+1}) + E[W_{k,m_h}(s^t_{m_h}, \mu^t_{kh}(s^t_{m_h}))]$$

$$F^{f,t}_{h}(m_h) = Delay^{PASS,t}_{k,h-1} + E[W_{k,m_h}(s^t_{m_h}, \mu^t_{kh}(s^t_{m_h}))]$$

39

Proposed distributed MDP

Step 1: Gather local information
Step 2: Evaluate state transition prob. and queuing delays
Step 3: Update transmission policy
Step 4: Exchange information

Feedback-modified Bellman equation:

$$\mu^t_{kh}(s_{m_h}) = \arg\min_{A_{m_h} \in \mathcal{A}_{m_h}} \left\{ E[W_{k,m_h}(s_{m_h}, A_{m_h})] + F^{b,t}_{h+1}(A_{m_h}) + \gamma \sum_{s'_{m_h}} \hat{T}^t_{s_{m_h} s'_{m_h}}(A_{m_h}) V^{t-1}(s'_{m_h}, F^{b,t}_{h+1}) \right\}$$

[Figure: decision process of agents $m_h \in \mathcal{M}_h$. Local information ($F^f_{h-1}$, $F^b_{h+1}$) defines the state $s_{m_h}$ with Markovian state transition; the distributed MDP performs future utility evaluation and outputs the action $\mu_{kh}(s_{m_h})$; the agent passes on $F^f_h$ and $F^b_h$.]

40

Convergence of proposed distributed MDP

Proposition 2: The transmission policy $[\mu^t_{kh},\ h = 1, \dots, H]$ of the distributed MDP will converge if and only if the priority class $C_k$ is not dropped in the network.

[Figure: distributed MDP blocks for agents $m_{h-1} \in \mathcal{M}_{h-1}$ and $m_h \in \mathcal{M}_h$ (last hop), each with local information, state, future utility evaluation, and action, linked by feedback $F^b$ and feedforward $F^f$; both blocks are marked “converge”.]

41

Comparison with traditional routing solutions

Existing routing solutions
– Throughput optimal [Tassiulas 1996]
– Flow-based optimized routing using queue size backpressure [Neely and Modiano 2006]
– Throughput and delay optimized opportunistic routing [Gupta and Javidi 2007]
– Low complexity distributed joint scheduling-routing algorithms [Gupta, Lin, Srikant 2007]
– Selfish routing based on congestion information [Roughgarden 2002]
– Network utility maximization framework (NUM) [Kelly 1998][Xu 2008]

Traditional routing
– Decision making: myopic decision making
– Required knowledge: a priori known environment (e.g. given capacity region)

Proposed autonomic multi-hop routing
– Decision making: foresighted decision making
– Required knowledge: online learning based on local information

42

Simulation results

[Figure: four panels plotting E[Delay1], E[Delay2], E[Delay3], and E[Delay4] (vertical axis 0 to 40) against time (sec), 0 to 120, for model-based learning, self-learning, and Q-learning. Delay deadline: 1 sec. Annotations mark packet loss for classes $C_1, C_3$ and $C_2, C_4$.]

43

Multi-agent interactive learning solutions -required observations and information

Model-free learning
– Q-learning [Watkins 1992]
– TD-learning [Sutton 1988]
– Reinforcement learning

Model-based learning
– Fictitious play [Brown 1951][Shapley 1996]
– Model-based reinforcement learning [Singh 1995][Ok 1998]

[Figure: learning methods arranged by information overhead. Reinforcement learning and model-based reinforcement learning use observation and information about the agent itself; fictitious play uses observation and information about all the other agents.]

44

Autonomic network scenarios

Network scenario: Multimedia transmission over wireless mesh network
– Dynamics: source traffic, channel condition
– Local information: source rate, transmission rate, packet error rate
– Suitable learning: model-based reinforcement learning

Network scenario: Distributed resource management over cognitive radio networks
– Dynamics: resource availability (spectrum holes)
– Local information: primary users' loading, other secondary users' actions
– Suitable learning: fictitious play

Network scenario: Power control over ad hoc mobile networks
– Dynamics: interference coupling among agents
– Local information: SINR
– Suitable learning: reinforcement learning

45

Conclusions

Decentralized decision making is not enough!

Proposed a new networking paradigm, where autonomic agents can self-configure and optimize the applications' performance by
• adapting their cross-layer transmission strategies
• proactively acquiring information by trading off information overheads vs. performance gains
• interactively learning
• making foresighted decisions (across time and across hops)

46

Broader impact and future direction

Vision: the foresighted decision making and interactive learning approaches
• Managing any decentralized system with both information and delay constraints
• Decentralized management by autonomic “cognitive” agents

Future direction
• Hierarchies of cognitive agents
• Coalition of agents
• Solutions for malicious behavior prevention

47

Summary of main contributions

Cross-layer design for multimedia streaming
• Multi-user video streaming over multi-hop wireless networks [JSAC 2007, Asilomar 2006, IIH-MSP 2006]
• Risk-aware scheduling [TMM 2007, VCIP 2008]

Dynamic resource management in cognitive radio networks
• Queuing-based channel selection for multimedia transmission [TMM 2008, ICIP 2008]
• Joint route/channel selection in multi-hop cognitive radio networks [TVT 2008, DySPAN 2008]

Learning in games
• Adaptive learning in power control game [TVT 2009]
• Predictive channel selection [ICC 2008]
• Learning in conjecture-based channel selection game [TNet submitted, Gamenets 2009]
• Model-based reinforcement learning for distributed MDP [under preparation]

Other
• Routing decision for surveillance network under information constraint [TCSVT submitted]

48

Journal publications

Accepted
• Hsien-Po Shiang, Mihaela van der Schaar, “Multi-user Video Streaming over Multi-hop Wireless Networks: A Distributed, Cross-layer Approach Based on Priority Queuing,” IEEE Journal of Selected Areas in Communications, vol. 25, no. 4, pp. 770-785, May 2007.

• Hsien-Po Shiang , Mihaela van der Schaar, “Informationally Decentralized Video Streaming over Multi-hop Wireless Networks,” IEEE Transactions on Multimedia, vol. 9, no. 6, pp. 1299-1313, Oct 2007.

• Hsien-Po Shiang , Mihaela van der Schaar, “Queuing-Based Dynamic Channel Selection for Heterogeneous Multimedia Applications over Cognitive Radio Networks,” IEEE Transactions on Multimedia, vol. 10, no. 5, pp. 896-909, Aug. 2008.

• Hsien-Po Shiang , Mihaela van der Schaar, “Distributed Resource Management in Multi-hop Cognitive Radio Networks for Delay Sensitive Transmission,” IEEE Transactions on Vehicular Technology, vol. 52, no.2, pp. 941-953, Feb 2009.

• Hsien-Po Shiang , Mihaela van der Schaar, “Feedback-Driven Interactive Learning in Dynamic Wireless Resource Management for Delay Sensitive Users,” IEEE Transactions on Vehicular Technology, accepted, to appear.

Submitted
• Hsien-Po Shiang, Mihaela van der Schaar, “Conjecture-Based Channel Selection for Autonomous Delay-Sensitive Users in Multi-Channel Wireless Networks,” submitted to IEEE Transactions on Networking.

• Hsien-Po Shiang , Mihaela van der Schaar, “Information-Constrained Resource Allocation in Multi-Camera Wireless Surveillance Networks,” submitted to IEEE Transactions on Circuits and Systems for Video Technology.

49

Conference papers
• Hsien-Po Shiang, Mihaela van der Schaar, “Delay-Sensitive Resource Management in Multi-hop Cognitive Radio Networks,” in IEEE Dynamic Spectrum Access Networks (DySPAN 2008), Oct. 2008.

• Hsien-Po Shiang , Mihaela van der Schaar, “Dynamic Channel Selection for Multi-user Video Streaming over Cognitive Radio Networks," in Proc. Int. Conf. On Image Processing. (ICIP 2008) Oct. 2008.

• Hsien-Po Shiang , Wenchi Tu, Mihaela van der Schaar, “Dynamic Resource Allocation of Delay Sensitive Users Using Interactive Learning over Multi-carrier Networks," in Proc. Int. Conf. Commun. (ICC 2008) May 2008.

• Hsien-Po Shiang , Mihaela van der Schaar, “Risk-aware scheduling for multi-user video streaming over wireless multi-hop networks,” in IS&T/SPIE Visual Communications and Image Processing (VCIP 2008), San Jose, Jan 2008.

• Hsien-Po Shiang , Mihaela van der Schaar, “Multi-user Video Streaming over Multi-hop Wireless Networks: A Cross-layer Priority Queuing Approach,” in IEEE Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2006), pp. 255-258, Dec 2006.

• Hsien-Po Shiang , D. Krishnaswamy, and Mihaela van der Schaar, “Quality-aware Video Streaming over Wireless Mesh Networks with Optimal Dynamic Routing and Time Allocation,” in Proceedings of the 40th Asilomar Conference on Signals, Systems, and Computers, Oct 2006.

• D. Krishnaswamy, H.-P. Shiang, J. Vicente, W. S. Conner, S. Rungta, W. Chan and K. Miao, “A Cross-Layer Cross-Overlay Architecture for Proactive Adaptive Processing in Mesh Networks,” in 2nd IEEE Workshop on Wireless Mesh Networks (WiMesh 2006), Sep 2006.

50

References (1/2)

[WCZ05] Y. Wu, P. A. Chou, Q. Zhang, K. Jain, W. Zhu, S.Y. Kung, "Network Planning in Wireless Ad Hoc Networks: A Cross-Layer Approach", IEEE Journal on Selected Areas in Communications, vol. 23, no. 1, pp. 136-150, Jan. 2005.

[SYZ05] E. Setton, T. Yoo, X. Zhu, A. Goldsmith, and B. Girod, “Cross-layer design of Ad hoc Networks for real-time video streaming,” IEEE Wireless Communications Mag., pp. 59-65, Aug 2005.

[JF07] D. Jurca, P. Frossard, “Packet Selection and Scheduling for Multipath video streaming,” IEEE Transactions on Multimedia, vol. 9, no. 2, Apr. 2007.

[AMV06] Y. Andreopoulos, N. Mastronarde, and M. van der Schaar, “Cross-layer Optimized video Streaming over wireless multi-hop Mesh Networks,” IEEE Journal on Selected Areas in Communications, vol. 24, no. 11, Nov 2006, pp. 2104-2115.

[PR99] C. E. Perkins, E. M. Royer, “Ad hoc on-demand distance vector routing,” in Proceedings of the 2nd IEEE Workshop on Mobile Computing Systems and Applications, pp. 90-100, Feb 1999.

[PB94] C. E. Perkins, P. Bhagwat, “Highly Dynamic Destination-Sequenced Distance-Vector Routing (DSDV) for Mobile Computers,” ACM SIGCOMM Computer Communication Review, vol. 24, no. 4, pp. 234-244, Oct. 1994.

[WZ02] W. Wei, and A. Zakhor, “Multipath unicast and multicast video communication over wireless ad hoc networks,” Proc. Int. Conf. Broadband Networks, Broadnets, pp. 496-505, 2002.

[DPZ04] R. Draves, J. Padhye, and B. Zill, “Routing in multi-radio, multi-hop wireless mesh networks,”in Proc. ACM Internat. Conf. on Mob. Computing and Networking (MOBICOM), 2004, pp. 114-128.

[AL94] B. Awerbuch and T. Leighton, “Improved Approximation Algorithms for the Multi-commodity Flow Problem and Local Competitive Routing in Dynamic Networks,” Proc. 26th ACM Symposium on Theory of Computing, May 1994.

[NMR05] M. J. Neely, E. Modiano, and C. E. Rohrs, “Dynamic Power Allocation and Routing for Time-Varying Wireless Networks”, IEEE Journal on Selected Areas in Communications, vol. 23. no1, Jan 2005. pp. 89-103.

[GJ07] P. Gupta and T. Javidi, "Towards Throughput and Delay-Optimal Routing for Wireless Ad-Hoc Networks,'' Asilomar Conference on Signals, Systems and Computers, Nov. 2007.

51

References (2/2)

[WD92] C. J. C. H. Watkins, P. Dayan, “Q-learning”, Machine Learning, vol. 8, no. 3-4, pp. 279-292, May 1992.

[Sut88] R. S. Sutton, ”Learning to predict by the method of temporal differences,” Machine Learning, vol. 3, no. 1, pp. 9-44, Aug. 1988.

[TO98] P. Tadepalli and D. Ok, "Model-based average reward reinforcement learning", Artificial Intelligence, Volume 100, Issues 1-2, January 1998, Pages 177-224.

[BBS95] A. G. Barto, S. J. Bradtke and S. P. Singh, "Learning to act using real-time dynamic programming", Artificial Intelligence, Volume 72, Issues 1-2, January 1995, Pages 81-138.

52

Sub-flows separation of Coastguard and Mobile video sequences

[Figure: the two video sequences separated into eight sub-flows, labeled $\lambda_1, \dots, \lambda_8$.]

53

[Figure: an agent takes local information $\mathcal{L}_{m_h}$, performs delay evaluation, and outputs a transmission action $A_{m_h}$.]

54

Delay-Sensitive Multimedia Applications

• Heterogeneous dependencies
– Delay deadlines
– Time-varying complexity

• Loss tolerant / adaptable

[Figure: dependency diagrams for sequential dependencies, typical hybrid coder dependencies (MPEG-2, H.264/AVC), and scalable coding dependencies [Chou, 2006]; (c) decoding complexity (Silent sequence): complexity profile over time for decoding four layers, Silent.CIF at 1.5 Mb/s, normalized processor ticks (0 to 8000) against time (0 to 9 sec).]

55

Multimedia transmission over wireless mesh networks

[Figure: wireless mesh network with sources S1, S2, relays r1 to r5, and destinations D1, D2.]

56

Resource management in cognitive radio networks

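[Figure: cognitive radio network with primary users $PU_1, \dots, PU_4$, frequency channels $F_1, \dots, F_4$, and secondary users $SU_1, SU_2, \dots, SU_N$.]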

57

Power control in ad hoc networks

[Figure: ad hoc network topology.]

58

Example: distributed channel/route selection in multi-hop cognitive radio networks[TVT Shiang 2008]

[Figure: 15 nodes (numbered 1 to 15) placed on a grid with both axes running from 20 to 140; applications $V_1$ and $V_2$ run from source nodes $n^s_1$, $n^s_2$ to destination nodes $n^d_1$, $n^d_2$.]

• App: 2 video streams
• Users: 15 secondary users (nodes)
• Actions: channel/route selection, 2 frequency channels
• Utilities: reduce packet loss rate of delay-sensitive applications
• Transmission range: 40 meters
• Primary user around nodes 11, 12
• Adopt fictitious play

59

Fictitious Play

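– Goal: learn the other agents' policies
– Count the empirical frequency of the other agents' actions
– Probabilistic behaviors

$$\mathbf{S}^t_u = [S^t_u(A_u), \text{ for } A_u \in \mathcal{A}_u], \qquad S^t_u(A_u) = \frac{r^t_u(A_u)}{\sum_{A'_u \in \mathcal{A}_u} r^t_u(A'_u)}$$

$$\pi^t_v: \ A^t_v = \arg\max_{A_v \in \mathcal{A}_v} E[U^t_v(A_v, B^{t-1}_{\pi_{-v}})]$$

$$B^t_{\pi_{-v}}: \ \mathbf{S}^t_{-v} = \{\mathbf{S}^t_u,\ u \in \Omega_v\}$$

[Figure: user $v$ combines private information $\Lambda_v$ and local information $\mathcal{L}^t_v$ with the belief $B^t_{\pi_{-v}}$ produced by fictitious play, then evaluates and maximizes $E[U^t_v]$ to obtain $\pi^t_v$; the other users $-v$ act with $\pi^{t-1}_{-v}$ and $\mathcal{L}^{t-1}_{-v}$ through the wireless network environment.]

Should an agent monitor all the other agents?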
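The two ingredients of fictitious play, counting the empirical frequency of another agent's actions and best-responding to that belief, can be sketched as below. The example assumes a single opponent and a hypothetical two-channel selection game in which an agent succeeds only when it picks a different channel; the function names and the utility function are illustrative, not taken from the slides.

```python
from collections import Counter


def empirical_strategy(action_history, action_set):
    """Belief about another agent: empirical frequency of each of its actions."""
    counts = Counter(action_history)
    total = sum(counts[a] for a in action_set)
    if total == 0:
        # No observations yet: fall back to a uniform belief.
        return {a: 1.0 / len(action_set) for a in action_set}
    return {a: counts[a] / total for a in action_set}


def best_response(my_actions, belief, utility):
    """Pick the action that maximizes expected utility under the belief."""
    return max(my_actions,
               key=lambda a: sum(p * utility(a, b) for b, p in belief.items()))


def utility(mine, other):
    # Hypothetical utility: 1 if the two agents avoid each other's channel.
    return 1.0 if mine != other else 0.0


channels = ["ch1", "ch2"]
belief = empirical_strategy(["ch1", "ch1", "ch2", "ch1"], channels)
choice = best_response(channels, belief, utility)
```

With several other agents, the belief would hold one empirical strategy per monitored agent, which is where the cost of monitoring everyone raised on this slide comes from.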

60

Information cell

Benefit of acquiring more information
• Build more accurate belief
• Avoid “information mismatch problem”

Cost of gathering information

[Figure: (a) and (b) show node $n_2$ with neighbors $n_1$, $n_3$, $n_4$, $n_5$, $n_6$ and node $m_1$, the interference range of $n_2$, and its information horizon. Each neighbor $n_j$ inside the horizon reports $\{A_{n_j}, \mathcal{I}_k(n_j), E[d_k(n_j)]\}$: $n_1$, $n_3$, $n_4$, $n_5$, $n_6$ in (a), and only $n_1$, $n_3$, $n_4$ in (b). A timeline splits each interval $t_I(\nu)$ into decision making, of duration $d(\mathcal{L}_I(h_n))$, and packet transmission.]

( ) ( ) ( )( ( )) ( )[( 1)( ) ]d I AI nd h N h K U U U= − + +L

61

Adaptive fictitious play in cognitive radio networks

Adaptive fictitious play adapts the information cell that limits the neighbors with which information is exchanged

[Figure: block diagram at node $n$. Blocks: adaptive fictitious play; minimum-delay route/channel selection. Labels: primary users $Z_n$, available resource, secondary users in horizon $S_{-n}(h)$, $A_{-n}(h)$, $\hat{A}_n(k)$, $A^{opt}_n$, horizon $h$.]

62

Results of the two applications V1 and V2

[Figure: two panels (V1 and V2) plotting packet loss rate (0 to 1) against average transmission rate T(e,f) in Mbps (2 to 10) for AODV, AODV/LB, DCS, AFP horizon 1, and AFP horizon 2. Primary users' loading ~ 0. Annotations: random channel selection, myopic channel selection, learn from fewer neighbors, learn from more neighbors.]

63

Results regarding the impact of primary users

[Figure: two panels plotting packet loss rate against primary user time fraction ρ (0 to 0.8) for AFP horizon 1, AFP horizon 2, and AFP horizon 3; the V1 panel spans 0.4 to 0.8 and the V2 panel spans 0.1 to 0.35. Primary users around nodes 11, 12; T = 5 Mbps. Annotation: information cost.]
