Distributed Inference Between Mobile Edge Devices and the Cloud
Sandeep Chinchali*, Jenya Pergament*, Eyal Cidon*, Marco Pavone, Sachin Katti
Can robot perception tasks be done in the cloud?
• Automated Sensing from Video/LIDAR
• Compute-intensive Deep Neural Nets (DNNs)
• Can resource-constrained robots scalably use “the cloud”?
Credit: Alexander Kazeka, https://www.youtube.com/watch?v=1j_3fh34E44
[Figure: a Mobile Robot takes Sensory Input; its Offload Logic chooses between Local Compute (Robot Model) and Offload Compute over a Limited Network (uplink-limited) to the Cloud (Cloud Model; Image, Map Databases).]
Query the cloud for better accuracy?
Latency vs. Accuracy vs. Power …
Outline
Learning-Based Approach to Cloud Offloading in Robotics. Sandeep Chinchali, Apoorva Sharma, James Harrison, Amine Elhafsi, Daniel Kang, Jenya Pergament, Eyal Cidon, Sachin Katti, Marco Pavone [accepted to Robotics: Science and Systems (RSS) 2019]
1. Accuracy vs Compute-Efficiency Trade-offs of DNNs
2. Network Costs of Streaming Video/LIDAR
3. A learning-based approach to Cloud Offloading
4. Simulation and Hardware Experiments
Accuracy of Robot and Cloud DNNs
[Figure: accuracy of the Cloud Model and the Robot Model]
If embedded AI gets better, will I still need the cloud?
Cloud is still useful to:
1. Pool video from multiple robots
2. Access large map, image databases
3. Query models trained on more/newer data
“Cloud”: could even be a bigger on-board model
Jetson TX2 GPU (~$480), Google Edge TPU (~$150), Jetson Nano (~$99)
Model: SSD MobileNet-v2 (300x300)
Platform                                      Throughput
Raspberry Pi 3                                1 FPS
Raspberry Pi 3 + Intel Neural Compute Stick   11 FPS
Jetson Nano                                   39 FPS
Edge TPU                                      48 FPS
Source: https://devblogs.nvidia.com/jetson-nano-ai-computing/
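To make the throughput figures easier to compare, the sketch below converts each FPS value quoted above into an average per-frame time and checks it against a camera frame rate. The 30 FPS camera rate and the helper names `frame_time_ms` and `keeps_up` are illustrative assumptions, not part of the slides.

```python
# Throughput figures as quoted above for SSD MobileNet-v2 (300x300).
FPS = {
    "Raspberry Pi 3": 1,
    "Raspberry Pi 3 + Intel Neural Compute Stick": 11,
    "Jetson Nano": 39,
    "Edge TPU": 48,
}


def frame_time_ms(fps: float) -> float:
    """Average time per frame in milliseconds, assuming one frame at a time."""
    return 1000.0 / fps


def keeps_up(fps: float, camera_fps: float) -> bool:
    """True if the device can process every frame of a camera at camera_fps."""
    return fps >= camera_fps


if __name__ == "__main__":
    CAMERA_FPS = 30  # assumed camera rate, for illustration only
    for device, fps in FPS.items():
        print(f"{device}: {frame_time_ms(fps):.1f} ms/frame, "
              f"keeps up with {CAMERA_FPS} FPS: {keeps_up(fps, CAMERA_FPS)}")
```

Under that assumed camera rate, only the two fastest platforms would process every frame locally.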
Uplink-limited
Network Costs of Cloud Communication
1. Congested Wireless Links
2. High Bandwidth: Designed for Human, Not Robot Perception
J. Emmons, S. Fouladi, G. Ananthanarayanan, S. Venkataraman, S. Savarese, K. Winstein, “Cracking Open the DNN blackbox”
Our Network Congestion Experiments
“ROS Ate My Network Bandwidth!” (ROS User Forums)
~70 Mbps
Cloud Offloading as a Decision Problem
[Figure: Robot Confidence against Robot Correct, marking Cloud Queries and Wasted Queries]
Contending goals
• Maximize Accuracy
• Minimize latency
• Limited Network
Share
Optimal Control
Limited Cloud Queries
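As a point of reference for the decision problem, here is a minimal sketch of a hand-written baseline: query the cloud only when the robot model's confidence is low and a query budget remains. This is a simple heuristic, not the learned policy the talk proposes; `ThresholdOffloader`, `confidence_threshold`, and `query_budget` are hypothetical names and values.

```python
from dataclasses import dataclass


@dataclass
class ThresholdOffloader:
    """Heuristic baseline: offload when the robot is unsure and budget remains."""
    confidence_threshold: float  # hypothetical tuning knob
    query_budget: int            # hypothetical cap on cloud queries

    def should_offload(self, robot_confidence: float) -> bool:
        if self.query_budget <= 0:
            return False  # limited cloud queries: nothing left to spend
        if robot_confidence >= self.confidence_threshold:
            return False  # robot is confident enough to answer locally
        self.query_budget -= 1
        return True


def run(confidences, threshold=0.8, budget=2):
    """Apply the baseline to a sequence of robot confidences."""
    policy = ThresholdOffloader(threshold, budget)
    return [policy.should_offload(c) for c in confidences]


if __name__ == "__main__":
    print(run([0.95, 0.4, 0.6, 0.3]))
```

A query issued for a frame the robot would have labeled correctly anyway is a wasted query, which is one reason a fixed threshold can spend the budget poorly.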
RL Approach to Cloud Offloading
[Figure: DNN, Edge, Cloud]
Reinforcement Learning (RL)
Goal: Maximize the total reward
[Figure: Agent and Environment loop. The Agent observes state $s_t$, takes action $a_t$, and receives reward $r_t$.]
Adapted from Pensieve (SIGCOMM ’17, Mao et al.)
Exploration vs. Exploitation Tradeoff
Exploit: On-board Robot Model
Explore: Utility of Cloud by learning
[Figure: the robot offloading diagram. Labels: State, Robot, Limited Network, Offload, Cloud, Cloud Model, Predict, Past Predictions, Reward; three groups of actions, one of which reuses Past Predictions.]
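The exploration versus exploitation tradeoff can be illustrated with a two-action epsilon-greedy learner: mostly exploit the action with the best estimated reward, and occasionally explore the other one to learn its utility. This is a generic textbook sketch, not the talk's RL agent; the accuracies (60% and 95%) and the network cost (0.2) are made-up numbers.

```python
import random


def epsilon_greedy(reward_fns, steps=2000, epsilon=0.1, seed=0):
    """Return (estimated mean reward, pull count) per action."""
    rng = random.Random(seed)
    counts = [0] * len(reward_fns)
    values = [0.0] * len(reward_fns)  # running mean reward per action
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(reward_fns))  # explore
        else:
            action = max(range(len(reward_fns)), key=lambda a: values[a])  # exploit
        reward = reward_fns[action](rng)
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]
    return values, counts


def robot_reward(rng):
    """Hypothetical on-board model: right 60% of the time, no network cost."""
    return 1.0 if rng.random() < 0.6 else 0.0


def cloud_reward(rng):
    """Hypothetical cloud model: right 95% of the time, minus a 0.2 network cost."""
    return (1.0 if rng.random() < 0.95 else 0.0) - 0.2


if __name__ == "__main__":
    values, counts = epsilon_greedy([robot_reward, cloud_reward])
    print("estimated rewards:", values)
    print("times chosen:", counts)
```

Raising the assumed network cost makes the on-board model the better choice, so the same learner would settle on exploiting it instead.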
The Robot Offloading MDP
• Action Space
• State Space
• Reward
[Figure, shown on each of these slides: the robot offloading diagram. Labels: State, Robot, Limited Network, Offload, Cloud, Cloud Model, Past Predictions, Reward; three groups of actions, one of which reuses Past Predictions.]
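A toy version of the MDP's three ingredients (state, action, reward) is sketched below. The action names, the costs in `ACTION_COST`, the state passed to the policy, and the reward shape (accuracy minus action cost) are all assumptions made for illustration; the slides do not give the exact formulation.

```python
from enum import Enum


class Action(Enum):
    PAST_PREDICTION = "reuse a past prediction"
    ROBOT_MODEL = "run the on-board model"
    CLOUD_MODEL = "offload to the cloud model"


# Hypothetical per-action costs (compute, latency, network).
ACTION_COST = {
    Action.PAST_PREDICTION: 0.0,
    Action.ROBOT_MODEL: 0.1,
    Action.CLOUD_MODEL: 0.5,
}


def reward(prediction, truth, action):
    """Assumed reward: 1 for a correct prediction, minus the action's cost."""
    accuracy = 1.0 if prediction == truth else 0.0
    return accuracy - ACTION_COST[action]


def rollout(policy, frames):
    """Total reward of a policy over frames of (truth, robot_pred, cloud_pred)."""
    total, last_prediction, since_cloud = 0.0, None, 0
    for truth, robot_pred, cloud_pred in frames:
        # Assumed state: the last prediction and steps since the last cloud query.
        action = policy(last_prediction, since_cloud)
        if action is Action.PAST_PREDICTION:
            prediction = last_prediction
        elif action is Action.ROBOT_MODEL:
            prediction = robot_pred
        else:
            prediction = cloud_pred
        since_cloud = 0 if action is Action.CLOUD_MODEL else since_cloud + 1
        total += reward(prediction, truth, action)
        last_prediction = prediction
    return total


def always_robot(last_prediction, since_cloud):
    return Action.ROBOT_MODEL


def always_cloud(last_prediction, since_cloud):
    return Action.CLOUD_MODEL


def cloud_then_reuse(last_prediction, since_cloud):
    """Query the cloud once, then keep reusing the past prediction."""
    return Action.CLOUD_MODEL if last_prediction is None else Action.PAST_PREDICTION


FRAMES = [("A", "A", "A"), ("A", "B", "A"), ("B", "A", "B")]  # made-up episode

if __name__ == "__main__":
    for policy in (always_robot, always_cloud, cloud_then_reuse):
        print(policy.__name__, round(rollout(policy, FRAMES), 2))
```

On this made-up episode, querying the cloud once and then reusing its answer earns the same total as querying on every frame, which hints at why a learned policy can query sparingly.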
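[Figure: face-recognition experiment. The Robot Model (FaceNet embedding followed by an SVM Classifier) labels Face A with 90% confidence and may Query Cloud; the timeline is marked with a Coherence Time.]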
RL beats benchmark offloading policies
• > 2.6x reward of benchmarks
• RL: 70% of oracle reward
• All-Robot: today’s de facto
RL queries the cloud intelligently but sparingly
Hardware Experiments on Live Video + Embedded Compute Platform
RL for Cloud Offloading in Robotics
• Compute model size and sensory data will grow
• Judicious use of Cloud in Robotics
• RL: General Two-Stage Decision Problem
[Figure: a Mobile Robot takes Sensory Input; its Offload Logic chooses between Local Compute (Robot Model) and Offload Compute over a Limited Network to the Cloud (Cloud Model; Image, Map Databases).]
Query the cloud for better accuracy?
Latency vs. Accuracy vs. Power …
Thanks! Please See Sandeep, Eyal, Jenya
Emmons et al., “Neural Networks Are Networks Too”
Uplink-limited
Sensor Representation for Machine Perception
1. Human Eye -> High Bandwidth
2. All-edge/All-cloud restrictive
Can we send fewer, relevant bits for the same accuracy?
Google Edge TPU ($150), Nvidia Jetson Nano ($99), TX2 ($600)
Future Directions
Emmons et al., “Neural Networks Are Networks Too”
Uplink-limited
Network Costs of Cloud Communication
1. Congested Wireless Links
2. High Bandwidth: Designed for Human Perception
System Architecture
[Figure: DNN, Edge, Cloud]
Should we split Vision DNNs between edge/cloud?
[Figure: an off-the-shelf DNN split at Layer 5. The Edge turns Pixels into Intermediates; Google's side produces the prediction (Predict).]
Google model version   Split at Layer
v1                     5
v2                     10
Do not split rapidly-evolving DNNs!
NeuroSurgeon, ASPLOS ’17
Idea: Keep Vision DNNs Intact
[Figure: Pixels enter an Encoder at the Edge; a Decoder feeds an off-the-shelf DNN (Google, FB) used as a Black-Box w/ API to Predict.]
Benefit: Extends beyond video or DNNs (e.g. robotic map-making)
Learning-based Approach
[Figure: DNN, Edge, Cloud]
System Architecture
[Figure: at the Edge, Video is encoded into Coded Features; a Decoder produces a Pixel Estimate for an Off-the-Shelf DNN to Predict; a Feedback Reward is used in training.]
Many Open Questions
• Machines (DNNs) will watch most future video
• Research Avenues:
  • Small-scale RL simulations [HotNets ’18]
  • Practical systems prototype [Under review]
  • Active Learning to query the cloud [Under review]
  • Deep RL with Real Vision DNNs – next!
Simplified Systems Prototype
[Figure: DNN, Edge, Cloud]
1. Active Edge Encoders
Dynamically Encode Task-Relevant Content
Modify Sub-Image Resolution, Crop Regions, “Machine” features, …
[Figure: an Edge Device turns Video into Coded Features, guided by Feature Feedback; Code 1 from Camera 1 and Code 2 from Camera 2 feed a DNN.]
2. Centralized Active Decoder
Estimate Edge Scenes, “Fill-in” Missing Pixels w/ memory
[Figure: Pixels -> DNN -> Predict, alongside Codes -> State-ful Decoder -> Pixel Estimates -> DNN -> Predict.]
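The "fill-in with memory" idea can be sketched as a decoder that remembers its latest estimate of every pixel and overwrites only the pixels that arrive in the current code. This is a deliberately naive stand-in (a learned decoder would presumably do more than copy); the class name `StatefulDecoder` and the code format, a dict from `(row, col)` to pixel value, are assumptions.

```python
class StatefulDecoder:
    """Keeps the latest estimate of each pixel; updates only what is received."""

    def __init__(self, height, width, fill=0):
        # Memory of the scene, initialised to a placeholder value.
        self.estimate = [[fill] * width for _ in range(height)]

    def decode(self, code):
        """code: dict mapping (row, col) -> pixel value for the pixels sent."""
        for (row, col), value in code.items():
            self.estimate[row][col] = value
        # Return a copy so callers cannot mutate the decoder's memory.
        return [row[:] for row in self.estimate]


if __name__ == "__main__":
    decoder = StatefulDecoder(height=2, width=2)
    # First frame: the edge sends every pixel.
    print(decoder.decode({(0, 0): 5, (0, 1): 6, (1, 0): 7, (1, 1): 8}))
    # Second frame: only one pixel changed, so only one pixel is sent.
    print(decoder.decode({(0, 1): 9}))
```

Sending only the changed pixel in the second frame is what saves uplink bandwidth in this toy setting.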
3. Feature Feedback from the Cloud
What content matters?
Content Priorities, Camera Angle, …
[Figure: Feature Feedback sent to the Edge Device]
Mobile Offloading for Vision
AI Offloader:
• New Content?
• BW Sufficient?
• Edge Correct?
[Figure: Audio and Video reach MobileNet on the Edge. Don’t Offload: Low Latency Result from the Edge. Offload: spend Bandwidth for an Accurate Result from the Cloud Model.]
1.2-2.1x accuracy of all-edge, 60-90% BW savings compared to all-cloud
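One way to read the three AI Offloader questions is as a gating rule, sketched below. How the questions are actually combined is not stated on the slide, so the AND-style rule here is an assumption, and in practice "Edge Correct?" would have to be estimated rather than known; `should_offload` and `route_frame` are hypothetical names.

```python
def should_offload(new_content: bool, bandwidth_sufficient: bool,
                   edge_likely_correct: bool) -> bool:
    """Assumed rule: offload new frames the link can carry when the edge is likely wrong."""
    return new_content and bandwidth_sufficient and not edge_likely_correct


def route_frame(new_content: bool, bandwidth_sufficient: bool,
                edge_likely_correct: bool) -> str:
    """Return which model answers: 'cloud' (accurate) or 'edge' (low latency)."""
    if should_offload(new_content, bandwidth_sufficient, edge_likely_correct):
        return "cloud"
    return "edge"


if __name__ == "__main__":
    print(route_frame(True, True, False))   # new, link is fine, edge unsure
    print(route_frame(True, False, False))  # link too congested to offload
    print(route_frame(False, True, False))  # nothing new in the frame
```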
Results: Mobile Offloading for Vision
1. Trade-off Accuracy for BW Savings
2. Adapt to edge model accuracy
Results (normalized to all-cloud):
1. 60-90% BW savings
2. 80-90% accuracy of oracle
3. 1.2-2.1x accuracy of all-edge
[Figure: Accuracy with Edge MobileNet v1, v2]
Insight: Bandwidth and Task-Aware Delivery
1. Human Eye -> High Bandwidth
2. All-edge/All-cloud restrictive
3. Use Off-the-Shelf DNNs
[Figure: Feature Extractor/Filter, Decoder / Estimator, Black-box]
Problem Insights
1. Human Eye -> High Bandwidth
2. All-edge/All-cloud restrictive
3. Use Off-the-Shelf DNNs
Proposal: Bandwidth and Task-Aware Video Delivery
Machine Perception
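Deep-dive into components
[Figure: Edge Device and Data Center linked by a Wireless Network, with Feature Feedback flowing to the Edge Device.]
1. Distributed Edge Encoders
[Equation: each edge device's code at a given time step is an encoder function of three per-device inputs.]
2. Centralized Active Decoder
[Figure: a Decoder in the Data Center feeding a DNN with Pretrain and Predict stages.]
[Equations: the Predict output is a function of the decoder's estimate; the estimate is a decoder function of its previous value and two further inputs.]
3. Feature Feedback from the Cloud
[Figure: Feature Feedback sent to the Edge Device; DNN with Pretrain and Predict stages.]
[Equation: an Active Decoder update that takes the previous time step's value and two further inputs.]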