21
Deep Vision Better object recognition with less power

Deep Vision - Electrical and Computer Engineeringyoga/capstone/media/DeepVision... · 2020-03-09 · Deep Vision Better object recognition with less power. ... Computer Vision: NVIDIA

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Deep Vision - Electrical and Computer Engineeringyoga/capstone/media/DeepVision... · 2020-03-09 · Deep Vision Better object recognition with less power. ... Computer Vision: NVIDIA

Deep VisionBetter object recognition with less power

Page 2: Deep Vision - Electrical and Computer Engineeringyoga/capstone/media/DeepVision... · 2020-03-09 · Deep Vision Better object recognition with less power. ... Computer Vision: NVIDIA

PurposeWe aim to implement a real time Deep Neural Network Image Recognition and Detection algorithm running on an embedded GPU device

Drones send real time video stream to the mobile GPU device and achieve real time recognition on objects

Participating in 2018 DAC Contest

Sponsored by and collaborated with Prof. Xie and SEAL LAB

Page 3: Deep Vision - Electrical and Computer Engineeringyoga/capstone/media/DeepVision... · 2020-03-09 · Deep Vision Better object recognition with less power. ... Computer Vision: NVIDIA

Jenny Zeng

Team Members

Charlie Xu Chenghao JiangTerry Xie

Page 4: Deep Vision - Electrical and Computer Engineeringyoga/capstone/media/DeepVision... · 2020-03-09 · Deep Vision Better object recognition with less power. ... Computer Vision: NVIDIA

2018 DAC Contest

Features embedded system implementation of neural network based object detection for drones

Provide hardware (Nvidia Jetson Tx2) and training dataset

Evaluation○ 20 FPS minimal○ Accuracy of detection○ Power consumption

Page 5: Deep Vision - Electrical and Computer Engineeringyoga/capstone/media/DeepVision... · 2020-03-09 · Deep Vision Better object recognition with less power. ... Computer Vision: NVIDIA

Hardware - Nvidia Jetson TX2

GPU NVIDIA Pascal™, 256 CUDA cores

CPU HMP Dual Denver 2/2 MB L2 + Quad ARM® A57/2 MB L2

Memory 8 GB 128 bit LPDDR4 1866 MHz59.7 GB/s

Power External 19V AC Adapter 7.5 Watt Typical / 15 Watt Max

Software Support

● Ubuntu 16.04 LTS● Jetpack 3.1 SDK

○ Deep Learning: TensorRT, cuDNN, NVIDIA DIGITS™ Workflow

○ Computer Vision: NVIDIA VisionWorks, OpenCV○ GPU Compute: NVIDIA CUDA, CUDA Libraries○ Multimedia: ISP Support, Camera imaging, Video CODEC

Page 6: Deep Vision - Electrical and Computer Engineeringyoga/capstone/media/DeepVision... · 2020-03-09 · Deep Vision Better object recognition with less power. ... Computer Vision: NVIDIA

GPU Nvidia Jetson TX2 Nvidia GTX 1080 Founders

Edition

Cuda Cores 256 2560

GPU Frequency 854 MHz Max-Q / 1122 MHz Max-P 1607 MHz Base / 1733 MHz Boost

Memory No dedicated Graphics Memory, 8GB LPDDR4 shared with CPU

8 GB GDDR5X 10Gbps

Power Consumption 7.5 Watt Typical / 15 Watt Max

180 W Max

Performance 1.5 TFLOPS 9 TFLOPS

Page 7: Deep Vision - Electrical and Computer Engineeringyoga/capstone/media/DeepVision... · 2020-03-09 · Deep Vision Better object recognition with less power. ... Computer Vision: NVIDIA

Software - AlgorithmDetects and tracks people and objects in video captured by drones.

Why not conventional tracking?

Deep learning

We are implementing our design based on the state-of-the-art YOLO algorithm

Page 8: Deep Vision - Electrical and Computer Engineeringyoga/capstone/media/DeepVision... · 2020-03-09 · Deep Vision Better object recognition with less power. ... Computer Vision: NVIDIA

Deep Learning basic idea

Page 9: Deep Vision - Electrical and Computer Engineeringyoga/capstone/media/DeepVision... · 2020-03-09 · Deep Vision Better object recognition with less power. ... Computer Vision: NVIDIA

Original ImageCounter-clockwise rotate Clockwise rotate

15% Crop 25% Crop Mirrored Image

High BrightnessLow brightness No object

Dataset Modification

Page 10: Deep Vision - Electrical and Computer Engineeringyoga/capstone/media/DeepVision... · 2020-03-09 · Deep Vision Better object recognition with less power. ... Computer Vision: NVIDIA

Algorithm: YOLO (You Only Look Once)

● Object detection with a single Convolutional Neural Network (CNN)

Page 11: Deep Vision - Electrical and Computer Engineeringyoga/capstone/media/DeepVision... · 2020-03-09 · Deep Vision Better object recognition with less power. ... Computer Vision: NVIDIA

Algorithm: YOLOv2

● Fast and Accurate

Model mAP FLOPS FPS

YOLOv2 608x608 48.1 62.94 Bn 40

Tiny YOLO 23.7 5.41 Bn 244

*mAP stands for mean average precision

*Table evaluated on COCO dataset

*FLOPs: Floating Points Operations per second

Page 12: Deep Vision - Electrical and Computer Engineeringyoga/capstone/media/DeepVision... · 2020-03-09 · Deep Vision Better object recognition with less power. ... Computer Vision: NVIDIA

Training with YOLO● Train for classification: Pretrain with ImageNet 1000-class dataset ● Train for detection: Convert the model to perform detection

Learning through multiple iteration,which takes days

Page 13: Deep Vision - Electrical and Computer Engineeringyoga/capstone/media/DeepVision... · 2020-03-09 · Deep Vision Better object recognition with less power. ... Computer Vision: NVIDIA

YOLO Architecture

Page 14: Deep Vision - Electrical and Computer Engineeringyoga/capstone/media/DeepVision... · 2020-03-09 · Deep Vision Better object recognition with less power. ... Computer Vision: NVIDIA

InferenceSpeed: FPS (Frames per Second) Accuracy: IoU (Intersection over Union)

Inferencing through one iteration, which takes minutes

Page 15: Deep Vision - Electrical and Computer Engineeringyoga/capstone/media/DeepVision... · 2020-03-09 · Deep Vision Better object recognition with less power. ... Computer Vision: NVIDIA

Inference Results

Accuracy (Average IoU) & FPS:

1st Submission: 37.27% & 11.9 FPS

2nd Submission: 34.09% & 29 FPS

Final Submission:72% (measured on test set) & 20 FPS

Page 16: Deep Vision - Electrical and Computer Engineeringyoga/capstone/media/DeepVision... · 2020-03-09 · Deep Vision Better object recognition with less power. ... Computer Vision: NVIDIA

PARROT BEBOP 2

Camera 14 megapixels with fish-eye lens

Video Resolution

1080p (30 fps) recording / 480p (30 fps) streaming

Battery life 25 minutes flying time (with 2700 mAh battery)

Storage 8 GB flash storage system

Connectivity Wi-Fi 802.11a/b/g/n/ac

Signal range 300 m

ARDroneSDK Connect and get H264 video stream

Page 17: Deep Vision - Electrical and Computer Engineeringyoga/capstone/media/DeepVision... · 2020-03-09 · Deep Vision Better object recognition with less power. ... Computer Vision: NVIDIA

Block DiagramPARROT BEBOP

2 DRONE

Monitor

HMP Dual Denver + Quad ARM A57 CPU

NVIDIA Pascal GPU

Ethernet Module

Drone Controller

ARDroneSDK3

Trained Neural Networks

USB to Ethernet Adapter

WiFi

Nvidia Jetson TX2

Page 18: Deep Vision - Electrical and Computer Engineeringyoga/capstone/media/DeepVision... · 2020-03-09 · Deep Vision Better object recognition with less power. ... Computer Vision: NVIDIA

Competition Result

Ranked 13 out of total 53 groups for April

Improved accuracy from 34% to 72% for the May submission

Final rank yet to be determined

Page 19: Deep Vision - Electrical and Computer Engineeringyoga/capstone/media/DeepVision... · 2020-03-09 · Deep Vision Better object recognition with less power. ... Computer Vision: NVIDIA

Demo Video

Page 20: Deep Vision - Electrical and Computer Engineeringyoga/capstone/media/DeepVision... · 2020-03-09 · Deep Vision Better object recognition with less power. ... Computer Vision: NVIDIA

Collaborators / Mentors / Sponsor

Prof. Yuan Xie (UCSB SEAL LAB)

Prof. Yogananda Isukapalli (ECE 189)

Caio Motta (ECE 189)

Dr. Lei Deng (UCSB SEAL LAB)

Yiming Gan (Master Student)

NVIDIA (SPONSOR)

DJI (SPONSOR)

Page 21: Deep Vision - Electrical and Computer Engineeringyoga/capstone/media/DeepVision... · 2020-03-09 · Deep Vision Better object recognition with less power. ... Computer Vision: NVIDIA

Questions?