Harnessing Heterogeneous Computing for Cloud- and Mobile-based Visual Search
Dr. Ren Wu
Distinguished Scientist, Baidu
Baidu
Every day
5b+ queries
500m+ users
100m+ mobile users
100m+ photos
…
Google == American Baidu !?
Deep Learning Applications
• Speech recognition
• Image recognition
• Optical character recognition (OCR)
• Language translation
• Web search
• Computational Ads (CTR)
• …
Progress of Deep Learning at Baidu
• Big improvement on speech & image recognition (2013)
• Speech: error rate reduced by 25%
• OCR: error rate reduced by 30%
• Face: LFW benchmark, 94% correct
• DNN CTR for search ads launched on May 20, 2013, serving billions of search queries every day, with substantial improvement
Deep Learning vs. Human Brain
pixels
edges
object parts (combinations of edges)
object models
Deep Architecture in the Brain
Retina
Area V1
Area V2
Area V4
pixels
Edge detectors
Primitive shape detectors
Higher level visual abstractions
Slide credit: Andrew Ng
Voice
Text
Image
User
Neural Network, DNN, and AI
Why does it work now when it did not before?
What has changed?
Big Data
High Performance Computing
Big Data @ Baidu
• >2000PB Storage
• 10-100PB/day Processing
• 100b-1000b Webpages
• 100b-1000b Index
• 1b-10b/day Update
• 100TB~1PB/day Log
Heterogeneous Computing
1993 world #1 Thinking Machines CM-5/1024
131 GFlops
2013 Samsung Galaxy Note 3 smartphone
(Qualcomm Snapdragon 800)
129 GFlops
2000 world #1 ASCI White (IBM RS/6000 SP)
6 MW power, 106 tons
12.3 TFlops
2013 Two Mac Pro workstations
(dual AMD GPUs each)
14 TFlops
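The comparison above is simple arithmetic on the quoted peak figures. The sketch below only restates the slide's numbers and computes the ratios. The dictionary keys and the `ratio` helper are illustrative names, not anything from the talk, and the figures are taken as quoted rather than independently verified.

```python
# Peak-performance figures as quoted on the slide, converted to GFLOPS.
# Keys are illustrative labels chosen for this sketch.
PEAK_GFLOPS = {
    "CM-5/1024 (1993)": 131.0,
    "Galaxy Note 3 (2013)": 129.0,
    "ASCI White (2000)": 12_300.0,
    "Two Mac Pros (2013)": 14_000.0,
}


def ratio(newer: str, older: str) -> float:
    """Return the peak of `newer` divided by the peak of `older`."""
    return PEAK_GFLOPS[newer] / PEAK_GFLOPS[older]


phone_vs_cm5 = ratio("Galaxy Note 3 (2013)", "CM-5/1024 (1993)")
macpro_vs_asci = ratio("Two Mac Pros (2013)", "ASCI White (2000)")

print(f"Phone vs. 1993 #1 supercomputer: {phone_vs_cm5:.2f}x")
print(f"Two workstations vs. 2000 #1 supercomputer: {macpro_vs_asci:.2f}x")
```

On these figures, a 2013 phone reaches roughly the peak of the 1993 top system, and two 2013 workstations slightly exceed the 2000 top system.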
Deep Learning at Scale
Voice, Text
Image
User
DNN for Speech
10k hours of voice data
10b training samples
Months on a GPU cluster
High Performance Computing
Datasets
• Image recognition: 100 million
• OCR: 100 million
• Speech: 10 billion
• CTR: 100 billion
Projected training data to
grow 10x each year
Training time:
Weeks to Months
on GPU clusters
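To make the 10x-per-year projection concrete, here is a minimal sketch that compounds the dataset sizes listed above. It assumes the growth factor stays constant and applies equally to every dataset, which the slide does not state. `projected_size` is a hypothetical helper written for this illustration.

```python
# Training-set sizes as listed on the slide (number of samples).
DATASETS = {
    "Image recognition": 100_000_000,
    "OCR": 100_000_000,
    "Speech": 10_000_000_000,
    "CTR": 100_000_000_000,
}

# Assumption: the projected 10x yearly growth holds uniformly and indefinitely.
GROWTH_PER_YEAR = 10


def projected_size(current: int, years: int) -> int:
    """Compound the current sample count by the yearly growth factor."""
    return current * GROWTH_PER_YEAR ** years


for name, size in DATASETS.items():
    print(f"{name}: {size:,} now, {projected_size(size, 2):,} in two years")
```

Under that assumption, every dataset grows a hundredfold in two years, which is why training time on a fixed cluster becomes the bottleneck.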
Big data + Deep learning + HPC
= Intelligence
Artificial Intelligence
Big data + Deep learning + High performance computing =
Intelligence
Omnipotent
DNN on Mobile Phones
Samsung Galaxy Note 3
AT&T version
SAMSUNG-SM-N900A
Snapdragon 800
Android 4.3
No connectivity needed!
Image Recognition
Jen-Hsun Huang’s Keynote
Rob Fergus’s talk yesterday
Image processed at cloud
(data center)
VS.
World’s First Mobile DNN App
• Image recognition on a mobile device
• Real time, no connectivity needed
• Directly from the video stream: what you point at is what you get
• Everything is done within the device
• OpenCL based, highly optimized
• Large deep neural network models
• Thousands of objects: flowers, dogs, bags, etc.
• Unleashes the full potential of the device hardware
• Smartphones now, wearables and IoT devices tomorrow
DNNs Everywhere
• Supercomputers: 1000s of GPUs
• Datacenters: 100k-1m servers
• Tablets, smartphones: 700m (in China)
• Wearable devices, IoTs: Billions?
Supercomputer used for training
Trained DNNs then deployed to data centers (cloud),
smartphones, and even wearables and IoTs
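The split between training and deployment works because inference needs only a forward pass through fixed weights. The sketch below is a toy illustration of that idea in plain Python, not the implementation from the talk, which is described as OpenCL based. The weights, labels, and function names are all hypothetical.

```python
import math


def dense(weights, bias, x):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(v):
    """Rectified linear activation applied element-wise."""
    return [max(0.0, x) for x in v]


def softmax(v):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    m = max(v)
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]


def predict(layers, x):
    """Forward pass: ReLU after every layer except the last, then softmax."""
    for i, (weights, bias) in enumerate(layers):
        x = dense(weights, bias, x)
        if i < len(layers) - 1:
            x = relu(x)
    return softmax(x)


# Hypothetical toy weights standing in for a model trained elsewhere.
LAYERS = [
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),
    ([[2.0, 0.0], [0.0, 2.0]], [0.0, 0.0]),
]
LABELS = ["flower", "dog"]  # hypothetical class names

probs = predict(LAYERS, [0.9, 0.1])
label = LABELS[probs.index(max(probs))]
print(label, probs)
```

Because `predict` touches nothing but the stored weights and the input, the same trained model can in principle run in a data center or on a phone. Only the speed of the arithmetic differs.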
OpenCL-based Open Ecosystem
• Diverse industry participation, from cell phones to supercomputers
o Processor vendors, system OEMs, middleware vendors, application developers.
• OpenCL is the industry standard embraced by many companies.
Third party names are the property of their owners. * Courtesy of Simon McIntosh-Smith and Tom Deakin