Snapdragon NPE Overview
Mark Charlebois
Director, Engineering
Qualcomm Technologies, Inc.
Linaro Connect, March 2018, Hong Kong
Snapdragon Neural Processing Engine
TensorFlow, the TensorFlow logo and any related marks are trademarks of Google Inc
Software-accelerated runtime for the execution of deep neural networks on device
Available at: developer.qualcomm.com
Efficient execution on Snapdragon
Model framework/network support
Fixed- and floating-point optimizations
Supports Caffe2, CNTK, MxNet
New optimizations for networks
What’s new?
Developer Tools
Caffe2
Elements of Snapdragon NPE SDK
API
• C++ library in binary form and header files
• Java library for Android integration
• C++ and Python API support for interacting with DLC
DLC
• Snapdragon NPE DNN model format
• Network is a collection of connected layers
• DNN models are stored in DLC files
Tools
• Model converters to create Snapdragon NPE compatible DNN models from popular training framework formats
• Optimization and debugging support tools
Support Assets
• Development host (x86 Ubuntu 14.04)
• User and reference documentation
• Tutorials and examples
• Benchmarking
Snapdragon NPE SDK
Snapdragon NPE SW Diagram
[Diagram: software stack, listed top to bottom]
• SDK Productivity Components: 3rd Party Apps, Benchmarking, Tutorial, Samples, User & Reference Docs
• SDK API, User Defined Layers (UDL) API, UDL Plugin
• DNN Model Conversion Tools: Caffe/2 -> DLC Float, TensorFlow -> DLC Float, Caffe/2 -> DLC fixed, TensorFlow -> DLC fixed
• Core Runtime: DL Container, Model loader, Runtime Engine, Model Debug, Profiling, Logging, Network Debug Tools
• Compute Networks: CPU, GPU, DSP
• OS Drivers: libOpenCL.so, libsnpe_[a,c]dsp*.so, libsnpe_dsp_*skel.so
• OS: Android & Linux (x86_64, Armv7, Armv7hf, AArch64), QuRT
• HW: CPU, Adreno GPU, Hexagon DSP
Snapdragon NPE SDK
• SDK can be downloaded from Qualcomm Developer Network
◦ https://developer.qualcomm.com/software/snapdragon-neural-processing-engine-ai
The NPE SDK supports Qualcomm® Snapdragon™ 845, 820, 835, 625, 626, 650, 652, 653,
660, 630, 636, and 450 as well as the Qualcomm® Snapdragon™ 820Am automotive platform
and Qualcomm Snapdragon™ Flight. For Qualcomm® Adreno™ GPU support, libOpenCL.so
must be present on device.
Toolchains:
• Android (armv7, aarch64) - GCC and Clang toolchains
• Linux (armv7, armv7hf, aarch64, x86_64*) - GCC on ARM, Clang on x86_64
* CPU only
NPE SDK Developer Tools
• snpe-net-run
• snpe-caffe-to-dlc
• snpe-caffe2-to-dlc
• snpe-tensorflow-to-dlc
• snpe-onnx-to-dlc*
• snpe-diagview
• snpe-dlc-info
• snpe-dlc-quantize
• snpe_bench.py
*Coming soon
Using the Snapdragon NPE
Snapdragon NPE Workflow
GoogleNet
Inception
SSD
Alexnet
ResNet
MobileNet
SqueezeNet
Faster-RCNN
Caffe2
User Defined Layer (UDL) – enables prototyping of layers not yet supported
Input Image Formatting
◦ In the Snapdragon NPE, images must be presented as a tensor of shape (height x width x channel), where
channel is the fastest-changing dimension.
◦ See $SNPE_ROOT/models/alexnet/scripts/create_alexnet_raws.py in the SDK
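[Figure: Caffe format (NCHW) versus SNPE format (NHWC). Labels in the figure: Image, Mean, Convert. In the SNPE layout the channel values of each pixel are stored together: (1,1) (1,1) (1,1), (1,2) (1,2) (1,2), ..., (H,W).]
For the current Snapdragon NPE SDK release, N=1. Batch support is coming in a future release.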
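The layout change described on this slide can be sketched as a plain index remapping. This is a minimal illustration, not code from the SDK: the function name `chwToHwc` and the use of `std::vector<float>` are assumptions made here, and N is fixed at 1 as noted above. The SDK's own reference is the create_alexnet_raws.py script mentioned on the slide.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the SDK): reorder one image from
// channel-major (C x H x W, the Caffe layout) to channel-minor
// (H x W x C, where channel is the fastest-changing dimension).
std::vector<float> chwToHwc(const std::vector<float>& chw,
                            std::size_t channels,
                            std::size_t height,
                            std::size_t width) {
    assert(chw.size() == channels * height * width);
    std::vector<float> hwc(chw.size());
    for (std::size_t c = 0; c < channels; ++c) {
        for (std::size_t h = 0; h < height; ++h) {
            for (std::size_t w = 0; w < width; ++w) {
                const std::size_t src = (c * height + h) * width + w;    // CHW index
                const std::size_t dst = (h * width + w) * channels + c;  // HWC index
                hwc[dst] = chw[src];
            }
        }
    }
    return hwc;
}
```

For a 1 x 2 image with two channels, the input {1, 2, 10, 20} (channel 0 first, then channel 1) becomes {1, 10, 2, 20}: both channel values of pixel (1,1), then both of pixel (1,2).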
Quantized vs Non-Quantized Models
• Non-quantized DLC files use 32-bit floating-point representations of network parameters.
• Quantized DLC files use 8-bit fixed-point representations of network parameters and are smaller.
[Figure: a .dlc file and a quantized .dlc file feeding the CPU/GPU runtime (32-bit) and the DSP runtime (8-bit).]
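To make the size difference concrete, here is a minimal sketch of one common way to map 32-bit floats to 8-bit values: a linear min/max mapping. It is a generic illustration and is not claimed to be the exact scheme snpe-dlc-quantize uses; the names `QuantizedTensor`, `quantize8`, and `dequantize8` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container: 8-bit codes plus the two floats needed to decode them.
struct QuantizedTensor {
    std::vector<std::uint8_t> values;
    float minValue;
    float stepSize;
};

// Map each weight linearly onto the 256 codes spanning [min, max].
QuantizedTensor quantize8(const std::vector<float>& weights) {
    assert(!weights.empty());
    const auto mm = std::minmax_element(weights.begin(), weights.end());
    const float lo = *mm.first;
    const float hi = *mm.second;
    // Guard against a zero-width range (all weights equal).
    const float step = (hi > lo) ? (hi - lo) / 255.0f : 1.0f;
    QuantizedTensor q{{}, lo, step};
    q.values.reserve(weights.size());
    for (float w : weights) {
        const float code = std::round((w - lo) / step);
        q.values.push_back(static_cast<std::uint8_t>(
            std::min(255.0f, std::max(0.0f, code))));
    }
    return q;
}

// Recover an approximation of the original weight at index i.
float dequantize8(const QuantizedTensor& q, std::size_t i) {
    return q.minValue + q.stepSize * static_cast<float>(q.values[i]);
}
```

Each parameter shrinks from four bytes to one, at the cost of a rounding error of at most about half a step per value.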
Making a Snapdragon NPE Enabled Application: App setup
bool useUserSuppliedBuffers = false;
// Set the runtime: use the GPU if it is available, otherwise the CPU
static zdl::DlSystem::Runtime_t runtime =
    zdl::SNPE::SNPEFactory::isRuntimeAvailable(zdl::DlSystem::Runtime_t::GPU) ?
    zdl::DlSystem::Runtime_t::GPU : zdl::DlSystem::Runtime_t::CPU;
// Load DLC Container
std::unique_ptr<zdl::DlContainer::IDlContainer> container =
    zdl::DlContainer::IDlContainer::open(dlcPath);
// Build SNPE instance (the builder takes a raw pointer to the container)
zdl::SNPE::SNPEBuilder snpeBuilder(container.get());
std::unique_ptr<zdl::SNPE::SNPE> snpe = snpeBuilder.setOutputLayers({})
    .setRuntimeProcessor(runtime)
    .setUdlBundle(udlBundle)
    .setUseUserSuppliedBuffers(useUserSuppliedBuffers)
    .build();
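The runtime choice in the setup code tries the GPU and falls back to the CPU. The same idea extends to a longer preference list, sketched below without the SDK so it builds on its own. The `Runtime` enum is a hypothetical stand-in for zdl::DlSystem::Runtime_t, and `chooseRuntime` is a name invented for this illustration.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for zdl::DlSystem::Runtime_t, so the sketch
// compiles without the SDK headers.
enum class Runtime { CPU, GPU, DSP };

// Return the first runtime in `preferred` that `isAvailable` accepts.
// Falls back to the CPU when none of the preferred runtimes is available.
Runtime chooseRuntime(const std::vector<Runtime>& preferred,
                      const std::function<bool(Runtime)>& isAvailable) {
    for (Runtime r : preferred) {
        if (isAvailable(r)) {
            return r;
        }
    }
    return Runtime::CPU;
}
```

In an application, the predicate would presumably wrap the SDK's isRuntimeAvailable check shown in the setup code; here it is left as a parameter so the selection logic can be exercised in isolation.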
Making a Snapdragon NPE Enabled Application: Running the network (ITensor)
// Load the inputs
std::unique_ptr<zdl::DlSystem::ITensor> inputTensor = loadInputTensor(snpe, fileLine); // See SDK docs
static zdl::DlSystem::TensorMap outputTensorMap;
// Run the network (snpe is a unique_ptr, so call through ->)
snpe->execute(inputTensor.get(), outputTensorMap);
zdl::DlSystem::StringList tensorNames = outputTensorMap.getTensorNames();
// Access the results
std::for_each(tensorNames.begin(), tensorNames.end(), [&](const char* name) {
    auto tensorPtr = outputTensorMap.getTensor(name);
    for (auto it = tensorPtr->cbegin(); it != tensorPtr->cend(); ++it) {
        float f = *it;
        ...
    }
});
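The loop above visits every float in an output tensor and leaves the processing step elided. As one possible example of such a step, assuming the network is a classifier whose output is a vector of per-class scores (an assumption, not something the slide states), the index of the highest score can be found as follows. The helper name `argmax` and the copy into a `std::vector<float>` are choices made for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical post-processing helper: index of the largest score.
// Assumes the scores were already copied out of the output tensor.
std::size_t argmax(const std::vector<float>& scores) {
    assert(!scores.empty());
    const auto best = std::max_element(scores.begin(), scores.end());
    return static_cast<std::size_t>(std::distance(scores.begin(), best));
}
```

For scores {0.1, 0.7, 0.2} the function returns index 1.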
Making a Snapdragon NPE Enabled Application: Running the network (UserBuffer)
// Load the Inputs
loadInputUserBuffer(applicationInputBuffers, snpe, fileLine); // See SDK Docs
// Run the Network (snpe is a unique_ptr, so call through ->)
snpe->execute(inputMap, outputMap);
const zdl::DlSystem::StringList& outputBufferNames = outputMap.getUserBufferNames();
// Access the results
std::for_each(outputBufferNames.begin(), outputBufferNames.end(), [&](const char* name)
{
    // The application-owned byte buffer registered under this output name
    const auto& buffer = applicationOutputBuffers.at(name);
    for (size_t i = 0; i + sizeof(float) <= buffer.size(); i += sizeof(float)) {
        const float* f = reinterpret_cast<const float*>(&buffer[i]);
        ...
    }
});
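The UserBuffer path hands results back as raw bytes that the application reinterprets as floats. A minimal sketch of that decoding step is shown below, using `std::memcpy` instead of a pointer cast so that it does not depend on the byte buffer's alignment. It assumes the buffer is a `std::vector<std::uint8_t>` holding native-endian 32-bit floats; both the type and the helper name `bytesToFloats` are assumptions of this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: decode a byte buffer of packed 32-bit floats.
// Assumes native byte order and a size that is a multiple of sizeof(float).
std::vector<float> bytesToFloats(const std::vector<std::uint8_t>& bytes) {
    assert(bytes.size() % sizeof(float) == 0);
    std::vector<float> out(bytes.size() / sizeof(float));
    if (!out.empty()) {
        std::memcpy(out.data(), bytes.data(), bytes.size());
    }
    return out;
}
```

Copying costs one extra pass over the data; when the buffer is known to be suitably aligned, reading in place as the slide does avoids that copy.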
For more information, visit us at:
www.qualcomm.com & www.qualcomm.com/blog
Thank you!
Nothing in these materials is an offer to sell any of the
components or devices referenced herein.
©2018 Qualcomm Technologies, Inc. and/or its affiliated
companies. All Rights Reserved.
Qualcomm is a trademark of Qualcomm Incorporated,
registered in the United States and other countries. Other
products and brand names may be trademarks or registered
trademarks of their respective owners.
References in this presentation to “Qualcomm” may mean Qualcomm
Incorporated, Qualcomm Technologies, Inc., and/or other subsidiaries
or business units within the Qualcomm corporate structure, as
applicable. Qualcomm Incorporated includes Qualcomm’s licensing
business, QTL, and the vast majority of its patent portfolio. Qualcomm
Technologies, Inc., a wholly-owned subsidiary of Qualcomm
Incorporated, operates, along with its subsidiaries, substantially all of
Qualcomm’s engineering, research and development functions, and
substantially all of its product and services businesses, including its
semiconductor business, QCT.