Deep Vision
Better object recognition with less power
Purpose
We aim to implement a real-time deep neural network image recognition and detection algorithm running on an embedded GPU device.
Drones send a real-time video stream to the mobile GPU device, which performs real-time object recognition.
Participating in the 2018 DAC Contest
Sponsored by and in collaboration with Prof. Xie and the SEAL Lab
Team Members
Jenny Zeng, Charlie Xu, Chenghao Jiang, Terry Xie
2018 DAC Contest
Features an embedded-system implementation of neural-network-based object detection for drones
Provides hardware (Nvidia Jetson TX2) and a training dataset
Evaluation
○ 20 FPS minimum
○ Accuracy of detection
○ Power consumption
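As a rough illustration of the 20 FPS requirement, throughput can be measured by timing a detector over a batch of frames. This is a minimal sketch; `detect` is a hypothetical stand-in for the real network inference call:

```python
import time

def measure_fps(detect, frames):
    """Run a detector over a list of frames and return average frames per second."""
    start = time.perf_counter()
    for frame in frames:
        detect(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed

# Hypothetical no-op detector; real code would run the trained network here.
frames = [None] * 100
fps = measure_fps(lambda frame: None, frames)
assert fps > 0
```

In the contest setting, the frames would come from the drone's video stream and `detect` would run the TensorRT-optimized network on the TX2.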
Hardware - Nvidia Jetson TX2
GPU: NVIDIA Pascal™, 256 CUDA cores
CPU: HMP Dual Denver 2 / 2 MB L2 + Quad ARM® A57 / 2 MB L2
Memory: 8 GB 128-bit LPDDR4, 1866 MHz, 59.7 GB/s
Power: External 19 V AC adapter, 7.5 W typical / 15 W max
Software Support
● Ubuntu 16.04 LTS
● JetPack 3.1 SDK
○ Deep Learning: TensorRT, cuDNN, NVIDIA DIGITS™ Workflow
○ Computer Vision: NVIDIA VisionWorks, OpenCV
○ GPU Compute: NVIDIA CUDA, CUDA Libraries
○ Multimedia: ISP Support, Camera imaging, Video CODEC
                   Nvidia Jetson TX2                Nvidia GTX 1080 Founders Edition
CUDA Cores         256                              2560
GPU Frequency      854 MHz Max-Q / 1122 MHz Max-P   1607 MHz Base / 1733 MHz Boost
Memory             8 GB LPDDR4, shared with CPU     8 GB GDDR5X, 10 Gbps
                   (no dedicated graphics memory)
Power Consumption  7.5 W typical / 15 W max         180 W max
Performance        1.5 TFLOPS                       9 TFLOPS
Software - Algorithm
Detects and tracks people and objects in video captured by drones.
Why not conventional tracking?
Deep learning
We are implementing our design based on the state-of-the-art YOLO algorithm
Deep Learning basic idea
Dataset Modification
Each original image is augmented with: counter-clockwise and clockwise rotations, 15% and 25% crops, a mirrored copy, high- and low-brightness variants, and no-object (background) examples.
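A minimal sketch of such augmentations using NumPy array operations (the exact pipeline is not shown in the slides, and crops are omitted here for brevity):

```python
import numpy as np

def augment(img):
    """Yield simple variants of an H x W x 3 image array for dataset expansion."""
    yield img[:, ::-1, :]                               # mirrored image
    yield np.clip(img * 1.5, 0, 255).astype(img.dtype)  # high brightness
    yield np.clip(img * 0.5, 0, 255).astype(img.dtype)  # low brightness
    yield np.rot90(img)                                 # counter-clockwise rotate
    yield np.rot90(img, k=-1)                           # clockwise rotate

img = np.full((4, 6, 3), 100, dtype=np.uint8)
variants = list(augment(img))
assert variants[0].shape == img.shape
assert variants[3].shape == (6, 4, 3)  # rotation swaps height and width
```

For detection training, the bounding-box labels must be transformed alongside each image (e.g. mirrored x-coordinates for the mirrored copy).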
Algorithm: YOLO (You Only Look Once)
● Object detection with a single Convolutional Neural Network (CNN)
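The single-network idea can be illustrated by the size of YOLO's output tensor: the image is divided into an S x S grid, and each cell predicts B boxes (x, y, w, h, confidence) plus C class probabilities. A small sketch using the published YOLOv1 configuration (YOLOv2's anchor-based layout differs):

```python
def yolo_output_size(S, B, C):
    """Each of the S*S grid cells predicts B boxes of 5 values
    (x, y, w, h, confidence) plus C conditional class probabilities."""
    return S * S * (B * 5 + C)

# YOLOv1 configuration: 7x7 grid, 2 boxes per cell, 20 PASCAL VOC classes.
assert yolo_output_size(7, 2, 20) == 1470
```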
Algorithm: YOLOv2
● Fast and Accurate
Model            mAP    FLOPs      FPS
YOLOv2 608x608   48.1   62.94 Bn   40
Tiny YOLO        23.7   5.41 Bn    244

*mAP stands for mean average precision
*Table evaluated on the COCO dataset
*FLOPs: billions of floating-point operations per forward pass
Training with YOLO
● Train for classification: pretrain with the ImageNet 1000-class dataset
● Train for detection: convert the model to perform detection
Learning runs through many iterations, which takes days.
YOLO Architecture
Inference
Speed: FPS (frames per second)
Accuracy: IoU (Intersection over Union)
Inference runs through a single pass, which takes minutes.
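IoU can be computed directly from box coordinates; a minimal sketch for axis-aligned boxes given as (x1, y1, x2, y2):

```python
def iou(a, b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# Two 10x10 boxes overlapping in a 5x10 strip: inter = 50, union = 150.
assert abs(iou((0, 0, 10, 10), (5, 0, 15, 10)) - 1 / 3) < 1e-9
```

Averaging this value over all predicted/ground-truth box pairs on the test set gives the average IoU reported in the results below.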
Inference Results
Accuracy (Average IoU) & FPS:
1st Submission: 37.27% & 11.9 FPS
2nd Submission: 34.09% & 29 FPS
Final Submission: 72% (measured on test set) & 20 FPS
PARROT BEBOP 2
Camera 14 megapixels with fish-eye lens
Video Resolution
1080p (30 fps) recording / 480p (30 fps) streaming
Battery life 25 minutes flying time (with 2700 mAh battery)
Storage 8 GB flash storage system
Connectivity Wi-Fi 802.11a/b/g/n/ac
Signal range 300 m
SDK ARDroneSDK3: connect to the drone and receive the H.264 video stream
Block Diagram
PARROT BEBOP 2 DRONE
Monitor
HMP Dual Denver + Quad ARM A57 CPU
NVIDIA Pascal GPU
Ethernet Module
Drone Controller
ARDroneSDK3
Trained Neural Networks
USB to Ethernet Adapter
WiFi
Nvidia Jetson TX2
Competition Result
Ranked 13th out of 53 groups in the April submission
Improved accuracy from 34% to 72% for the May submission
Final rank yet to be determined
Demo Video
Collaborators / Mentors / Sponsor
Prof. Yuan Xie (UCSB SEAL LAB)
Prof. Yogananda Isukapalli (ECE 189)
Caio Motta (ECE 189)
Dr. Lei Deng (UCSB SEAL LAB)
Yiming Gan (Master's Student)
NVIDIA (SPONSOR)
DJI (SPONSOR)
Questions?