DeepHop on Edge: Hop-by-hop Routing by Distributed Learning with Semantic Attention
Bo He¹, Jingyu Wang¹, Qi Qi¹, Haifeng Sun¹, Zirui Zhuang¹, Cong Liu², Jianxin Liao¹
¹Beijing University of Posts & Telecommunications  ²China Mobile Research Institute


Page 1: DeepHop on Edge: Hop-by-hop Routing by Distributed

DeepHop on Edge: Hop-by-hop Routing by Distributed Learning with Semantic Attention

Bo He¹, Jingyu Wang¹, Qi Qi¹, Haifeng Sun¹, Zirui Zhuang¹, Cong Liu², Jianxin Liao¹

¹Beijing University of Posts & Telecommunications  ²China Mobile Research Institute

Page 2: DeepHop on Edge: Hop-by-hop Routing by Distributed

State Key Laboratory of Networking and Switching Technology

Outline

• Routing in Edge Networks
• Challenges
• Our DeepHop Framework
• Experimental results
• Conclusions

Page 3: DeepHop on Edge: Hop-by-hop Routing by Distributed


Outline


• Routing in Edge Networks

Page 4: DeepHop on Edge: Hop-by-hop Routing by Distributed


Routing in Edge Networks


[Figure: two edge nodes v1 and v2 connected by a link (v1, v2) over WiFi/LTE/Ethernet; each node maintains a packet queue, and each link is characterized by delay, bandwidth, and packet loss rate]

Page 5: DeepHop on Edge: Hop-by-hop Routing by Distributed


Outline


• Routing in Edge Networks
• Challenges

Page 6: DeepHop on Edge: Hop-by-hop Routing by Distributed


Challenges


It is hard for traditional full-path routing to handle unexpected traffic stress on distributed edge nodes

Traffic fluctuations cause some elements of the network state to vary greatly, so heuristic centralized routing can hardly capture the dynamic transitions of the whole network state

Network state elements differ in their significance for routing, but their implications are hard to distinguish and to exploit properly in the traffic forwarding process

Page 7: DeepHop on Edge: Hop-by-hop Routing by Distributed


Outline


• Routing in Edge Networks
• Challenges
• Our DeepHop Framework

Page 8: DeepHop on Edge: Hop-by-hop Routing by Distributed


Challenges


It is hard for traditional full-path routing to handle unexpected traffic stress on distributed edge nodes

Traffic fluctuations cause some elements of the network state to vary greatly, so heuristic centralized routing can hardly capture the global dynamic network state transitions

Network state elements differ in their significance for routing, but their implications are hard to distinguish and to exploit properly in the traffic forwarding process

Hop-by-hop approach

Multi-Agent DRL method

Self-attention mechanism

The MAPOKTR algorithm

Contributions of DeepHop Framework

Page 9: DeepHop on Edge: Hop-by-hop Routing by Distributed


Our DeepHop Framework


Hop-by-hop routing mechanism of DeepHop

Objective 1: minimizing the ratio of discarded packets

Objective 2: minimizing the average slowdown of each traffic packet
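As a hedged illustration of the two objectives, the metrics can be computed from per-packet records roughly as follows; the record fields `discarded`, `actual_delay`, and `ideal_delay` are hypothetical names, and slowdown is assumed here to be actual delivery time divided by ideal delivery time:

```python
def task_unfinished_ratio(packets):
    """Objective 1: fraction of packets discarded before reaching their destination."""
    discarded = sum(1 for p in packets if p["discarded"])
    return discarded / len(packets)

def average_slowdown(packets):
    """Objective 2: mean slowdown (actual delay / ideal delay) over delivered packets."""
    delivered = [p for p in packets if not p["discarded"]]
    return sum(p["actual_delay"] / p["ideal_delay"] for p in delivered) / len(delivered)

packets = [
    {"discarded": False, "actual_delay": 12.0, "ideal_delay": 10.0},
    {"discarded": False, "actual_delay": 30.0, "ideal_delay": 10.0},
    {"discarded": True,  "actual_delay": None, "ideal_delay": 10.0},
]
tur = task_unfinished_ratio(packets)   # 1/3 of packets were discarded
slow = average_slowdown(packets)       # (1.2 + 3.0) / 2 = 2.1
```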

Page 10: DeepHop on Edge: Hop-by-hop Routing by Distributed


Our DeepHop Framework


Multi-Agent Deep Reinforcement Learning (MADRL) model

State Space:

Action Space:

Reward Function:
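The slide gives the state, action, and reward as formulas that are not reproduced in this text. As a hedged illustration only, a per-node agent interface might be structured as follows; the class name `HopAgent`, the state encoding, and the reward weights are assumptions, not the paper's definitions:

```python
import random

class HopAgent:
    """Illustrative per-node DRL agent for hop-by-hop routing (hypothetical)."""

    def __init__(self, node_id, neighbors):
        self.node_id = node_id
        self.neighbors = neighbors  # action space: candidate next-hop nodes

    def observe(self, packet, link_stats):
        # State: destination node index, local link measurements, packet priority
        return (packet["dst"],
                tuple(link_stats[n] for n in self.neighbors),
                packet["priority"])

    def act(self, state):
        # Placeholder: a trained policy network would choose the next hop here
        return random.choice(self.neighbors)

def hop_reward(delivered, hop_delay, drop_penalty=10.0):
    # Assumed shape only: penalize per-hop delay, penalize drops heavily
    return -hop_delay - (0.0 if delivered else drop_penalty)
```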

Page 11: DeepHop on Edge: Hop-by-hop Routing by Distributed


Our DeepHop Framework


The multi-agent deep reinforcement learning algorithm MAPOKTR: Multi-Agent Policy Optimization using Kronecker-Factored Trust Region

Loss Function:

the ratio of probability distributions:

the point probability distance:
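The loss formulas themselves are not reproduced in this text, so the sketch below is only a guess at the general shape: a PPO-style clipped surrogate built from the ratio of probability distributions, plus an assumed penalty on the point probability distance between new and old policies. The constants `eps` and `beta` are illustrative, and this is not the paper's exact MAPOKTR loss (nor does it include the Kronecker-factored trust-region machinery):

```python
import numpy as np

def surrogate_loss(pi_new, pi_old, advantages, eps=0.2, beta=1.0):
    """Hypothetical surrogate loss: clipped policy-ratio term minus a
    point-probability-distance penalty (signs chosen so lower is better)."""
    ratio = pi_new / pi_old                         # ratio of probability distributions
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps)
    ppo_term = np.minimum(ratio * advantages, clipped * advantages)
    point_dist = np.abs(pi_new - pi_old)            # point probability distance
    return float(-(ppo_term - beta * point_dist).mean())

# toy check: probabilities of the taken actions under the new/old policies
pi_new = np.array([0.50, 0.30])
pi_old = np.array([0.40, 0.35])
advantages = np.array([1.0, -1.0])
loss = surrogate_loss(pi_new, pi_old, advantages)
```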

Page 12: DeepHop on Edge: Hop-by-hop Routing by Distributed


Our DeepHop Framework


The sub-layer structure of DRL agents

State Space:

O1 = O^T_{τ,j}: the destination node index

O2 = O^N_{τ}: the network state (links and nodes)

O3 = O^P_{τ,j}: the transmission priority
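The sub-layer idea can be sketched numerically: each observation part is first embedded by its own small layer, and the embeddings are concatenated before the shared policy/value layers. All layer sizes, weights, and the `dense` helper below are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, w, b):
    # One fully-connected layer with ReLU activation
    return np.maximum(x @ w + b, 0.0)

# Per-part sub-layer weights (shapes are assumptions)
w1, b1 = rng.normal(size=(4, 8)), np.zeros(8)   # destination one-hot -> 8
w2, b2 = rng.normal(size=(6, 8)), np.zeros(8)   # link/node state -> 8
w3, b3 = rng.normal(size=(2, 8)), np.zeros(8)   # priority one-hot -> 8

o1 = np.eye(4)[2]             # O1: destination node index (one-hot)
o2 = rng.uniform(size=6)      # O2: network state (links and nodes)
o3 = np.array([1.0, 0.0])     # O3: transmission priority (one-hot)

# Each part gets its own sub-layer; the results are concatenated
h = np.concatenate([dense(o1, w1, b1), dense(o2, w2, b2), dense(o3, w3, b3)])
# `h` (length 24) would feed the shared layers of the agent
```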

Page 13: DeepHop on Edge: Hop-by-hop Routing by Distributed


Our DeepHop Framework


The Semantic Attention Mechanism (SAM)
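The slide does not spell out the SAM equations in this text, so the following is only a minimal scaled dot-product self-attention sketch showing how attention can weight state-element embeddings by their significance for routing; the dimensions and weight matrices are illustrative:

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    """Scaled dot-product self-attention over a set of element embeddings."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Softmax over elements: each row weights all elements for one query
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(1)
x = rng.normal(size=(3, 8))        # e.g. embeddings of the O1, O2, O3 parts
wq, wk, wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(x, wq, wk, wv)   # attended embeddings, shape (3, 8)
```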

Page 14: DeepHop on Edge: Hop-by-hop Routing by Distributed


Outline


• Routing in Edge Networks
• Challenges
• Our DeepHop Framework
• Experimental results

Page 15: DeepHop on Edge: Hop-by-hop Routing by Distributed


Experimental results


The experimental environments

[Figure: the two network topologies, Net1 and Net2]

the platform: a Dell desktop (CPU: Intel i7-8700 3.2GHz; Memory: 32GB DDR4 2666MHz; OS: 64-bit Ubuntu 16.04 LTS)

the network topology graphs: two real network topologies, Net1 (12 nodes) and Net2 (20 nodes)

the traffic data: an open real traffic dataset including 8 classes of traffic

Page 16: DeepHop on Edge: Hop-by-hop Routing by Distributed


Experimental results


Performance of hop-by-hop approach

the MADRL-based DeepHop handles routing tasks under all degrees of congestion in edge networks better than the heuristic CRF protocol

DeepHop performs more stably than CRF at lower degrees of congestion, according to the results on Net1

Task Unfinished Ratio (TUR): the ratio of packets discarded before reaching their destination nodes

Speed of injection: leads to different degrees of congestion

Coexistent Routing and Flooding (CRF): a state-of-the-art heuristic routing protocol for edge networks

Page 17: DeepHop on Edge: Hop-by-hop Routing by Distributed


Experimental results


Performance of Different MADRL algorithms

MAACKTR algorithm: Multi-Agent Actor Critic using Kronecker-Factored Trust Region

MADDPG algorithm: Multi-Agent Deep Deterministic Policy Gradient

[Figures: results at injection speeds of 3000, 4000, 5000, and 6000 in Net1 and in Net2]

Page 18: DeepHop on Edge: Hop-by-hop Routing by Distributed


Experimental results


Performance of different neural network structures in DRL agents

the structure of separate sub-layers makes the neural network understand the semantics effectively

the attention mechanism accelerates the convergence of policies

MADRL-SAM: neural networks with sub-layers and a self-attention mechanism

MADRL-SM: neural networks with sub-layers

MADRL: neural networks with only fully-connected layers

[Figures: results at injection speeds of 3000, 4000, 5000, and 6000 in Net1]

Page 19: DeepHop on Edge: Hop-by-hop Routing by Distributed


Experimental results


Performance of different reward functions

all elements of the reward function are shown to have a positive effect when the agents learn to handle the routing tasks

[Figures: results at injection speeds of 3000, 4000, 5000, and 6000 in Net1]

Reward:

Reward1:

Reward2:

Reward3:

Page 20: DeepHop on Edge: Hop-by-hop Routing by Distributed


Experimental results


Performance of DeepHop in highly-dynamic networks

DeepHop can cope with time-varying network congestion

To generate time-varying network congestion, the number of injected packets is randomly selected from 3,000 to 6,000 per second in the experiments

DeepHop has good robustness and is suitable for real edge network environments

Page 21: DeepHop on Edge: Hop-by-hop Routing by Distributed


Outline


• Routing in Edge Networks
• Challenges
• Our DeepHop Framework
• Experimental results
• Conclusions

Page 22: DeepHop on Edge: Hop-by-hop Routing by Distributed


Conclusions


DeepHop utilizes multi-agent deep reinforcement learning to determine hop-by-hop routes for traffic packets and deploys the agents on edge nodes based on MEC technology.

To learn the semantics of complicated state elements, DeepHop designs a self-attention mechanism that helps the DRL agents learn better policies faster.

DeepHop adopts a novel MADRL algorithm, MAPOKTR, which introduces a point probability distance into its surrogate loss function to keep the policy monotonically improving.

In future work, we will further explore communication between the agents on different nodes and address the retraining of the MADRL model when the topology of the edge network changes significantly.

Page 23: DeepHop on Edge: Hop-by-hop Routing by Distributed

Thanks!