Upload
others
View
7
Download
0
Embed Size (px)
Citation preview
November 2016
Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Optimization Notice2
Machine Learning Software Challenges
Machine learning open source frameworks and libraries are often not well-optimized for newer Intel-based systems
Frameworks can be difficult to configure and use
Need to target heterogeneous hardware from model training in the datacenter to deployment on endpoint systems Datacenter Endpoint
Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Optimization Notice3
strategy provides the
foundation for success using AI
Machine Learning: Your Path to Deeper Insight
Intel® Math Kernel Library (Intel® MKL &
MKL-DNN)
Intel® Data Analytics Acceleration Library
(Intel® DAAL)
Trusted Analytics Platform
+Network
+Memory
+StorageDatacenter Endpoint
Solutionsfor reference across industries
Tools/Platformsto accelerate deployment
Optimized Frameworksto simplify development
Libraries/Languagesfeaturing optimized building blocks
Hardware Technologyportfolio that is broad and cross-
compatible
Driving increasing innovation and competitive advantage across industries
Intel® Deep Learning SDK for Training &
Deployment
Intel® Distribution for Python*
Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Optimization Notice4
• Overview of Intel® Math Kernel Library (Intel® MKL)
• Neural network primitives in Intel MKL 2017 for deep learning
• Introduction to open source Intel MKL-DNN
• Intel® Data Analytics Acceleration Library (Intel® DAAL)
• Example using Intel DAAL: handwritten digit recognition
• Introduction to Intel® Deep Learning SDK
Agenda
Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Optimization Notice6
Highly optimized threaded math routines
Performance, Performance, Performance!
Industry’s leading math library
Widely used in science, engineering, data processing
Tuned for Intel® processors – current and next generation
Intel® Math Kernel Library (Intel® MKL) Introduction
More math library users depend on MKL than any other library
EDC North America Development Survey
2016, Volume I
Be multiprocessor aware
• Cross-Platform Support
• Be vectorised , threaded, and distributed multiprocessor aware
Intel Engineering Algorithm Experts
Optimized code path dispatch auto
Intel®CompatibleProcessors
Past Intel®Processors
Current Intel®
Processors
Future Intel®
Processors
Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Optimization Notice
Intel MKL unleashes the performance benefits of Intel architectures
0
50
100
150
200
64 80 96 104 112 120 128 144 160 176 192 200 208 224 240 256 384Pe
rfo
rma
nce
(G
Flo
ps)
Matrix size (M = 10000, N = 6000, K = 64,80,96, …, 384)
Intel® Core™ Processor i7-4770K
Intel MKL - 1 thread Intel MKL - 2 threads Intel MKL - 4 threadsATLAS - 1 thread ATLAS - 2 threads ATLAS - 4 threads
0
500
1000
1500
256 300 450 800 1000 1500 2000 3000 4000 5000 6000 7000 8000
Pe
rfo
rma
nce
(G
Flo
ps)
Matrix size (M = N)
Intel® Xeon® Processor E5-2699 v3
Intel MKL - 1 thread Intel MKL - 18 threads Intel MKL - 36 threadsATLAS - 1 thread ATLAS - 18 threads ATLAS - 36 threads
Configuration Info - Versions: Intel® Math Kernel Library (Intel® MKL) 11.3, ATLAS* 3.10.2; Hardware: Intel® Xeon® Processor E5-2699v3, 2 Eighteen-core CPUs (45MB LLC, 2.3GHz), 64GB of RAM; Intel® Core™ Processor i7-4770K, Quad-core CPU (8MB LLC, 3.5GHz), 8GB of RAM; Operating System: RHEL 6.4 GA x86_64;
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. * Other brands and names are the property of their respective owners. Benchmark Source: Intel Corporation
Optimization Notice: Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804 .
DGEMM Performance Boost by using Intel® MKL vs. ATLAS*
7
Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Optimization Notice8
0
200
400
600
800
1000
1200
1400
Pe
rfo
rma
nce
(G
Flo
ps)
Matrix size (M = N)
Intel MKL - 1 thread Intel MKL - 22 threads
Intel MKL - 44 threads
DGEMM PerformanceOn Intel® Xeon® Processor E5-2699 v4
Configuration Info - Versions: Intel® Math Kernel Library (Intel® MKL) 2017; Hardware:Hardware: Intel® Xeon® Processor E5-2699 v4, 2 Twenty-two-core CPU (55MB smart cache, 2.2GHz), 64GB of RAM; Intel® Xeon Phi™ Processor 7250, 68 cores (34MB L2 cache, 1.4GHz), 96 GB of DDR4 RAM and 16 GB MCDRAM; Operating System: RHEL 7.2 GA x86_64;
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. * Other brands and names are the property of their respective owners. Benchmark Source: Intel Corporation
Optimization Notice: Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804 .
0
500
1000
1500
2000
2500
Pe
rfo
rma
nce
(G
Flo
ps)
Matrix size (M = N)
Intel MKL - 16 threads Intel MKL - 34 threads
Intel MKL - 68 threads
DGEMM PerformanceOn Intel® Xeon Phi™ Processor 7250
Intel MKL 2017 Performance
Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Optimization Notice9
Components of Intel MKL 2017
Linear Algebra
• BLAS• LAPACK• ScaLAPACK• Sparse BLAS• Sparse Solvers• Iterative • PARDISO*• Cluster Sparse
Solver
Fast Fourier Transforms
• Multidimensional• FFTW interfaces• Cluster FFT
Vector Math
• Trigonometric• Hyperbolic • Exponential• Log• Power• Root• Vector RNGs
Summary Statistics
• Kurtosis• Variation
coefficient• Order statistics• Min/max• Variance-
covariance
And More…
• Splines• Interpolation• Trust Region• Fast Poisson
Solver
Deep Neural Networks
• Convolution• Pooling• Normalization• ReLU• Softmax
New
Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Optimization Notice
Xeon Xeon Phi FPGA
Deep Learning FrameworksCaffe
Intel® MKL-DNN
Intel® Math Kernel Library
(Intel® MKL)
* GEMM matrix multiply building blocks are binary
Intel® Math Kernel Library and Intel® MKL-DNN for Deep Learning Framework Optimization
Intel® MKL Intel® MKL-DNN
DNN primitives + wide variety of other math
functions
DNN primitives
C DNN APIs C/C++ DNN APIs
Binary distribution Open source DNN code*
Free community license. Premium support
available as part of Parallel Studio XE
Apache 2.0 license
Broad usage DNN primitives; not specific to
individual frameworks
Multiple variants of DNN primitives as required for framework integrations
Quarterly update releasesRapid development ahead of Intel MKL
releases
Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Optimization Notice11
Deep Neural Network (DNN) Primitives in Intel MKL 2017
Operations Algorithms
Activation ReLU
Normalization batch, local response
Pooling max, min, average
Convolutional fully connected, direct 3D batched convolution
Inner product forward/backward propagation of inner product computation
Data manipulation layout conversion, split, concat, sum, scale
Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Optimization Notice12
DNN Primitives in Intel® MKL Highlights
A plain C API to be used in the existing DNN frameworks
Brings IA-optimized performance to popular image recognition topologies:
– AlexNet, Visual Geometry Group (VGG), GoogleNet, and ResNet
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance . *Other names and brands may be property of othersConfigurations: • 2 socket system with Intel® Xeon® Processor E5-2699 v4 (22 Cores, 2.2 GHz,), 128 GB memory, Red Hat* Enterprise Linux 6.7, BVLC Caffe, Intel Optimized Caffe framework, Intel® MKL 11.3.3, Intel® MKL 2017• Intel® Xeon Phi™ Processor 7250 (68 Cores, 1.4 GHz, 16GB MCDRAM), 128 GB memory, Red Hat* Enterprise Linux 6.7, Intel® Optimized Caffe framework, Intel® MKL 2017All numbers measured without taking data manipulation into account.
Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Optimization Notice
Intel® Data Analytics Acceleration Library(Intel® DAAL)
An industry leading Intel® Architecture based data analytics acceleration library of fundamental algorithms covering all machine learning stages.
(De-)CompressionOutlier DetectionNormalization
PCAStatistical momentsVariance matrixPp-QR, SVD, CholeskyAprioriSorting
Ridge linear regressionNaïve BayesSVMClassifier boosting
KmeansEM GMM
Collaborative filtering
Neural Networks
Pre-processing Transformation Analysis Modeling Decision Making
Sci
en
tifi
c/E
ng
ine
eri
ng
We
b/S
oci
al
Bu
sin
ess
Validation
14
Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Optimization Notice15
Big data frameworks: Hadoop, Spark, Cassandra, etc.
SQ
L
sto
res
No
SQ
L
sto
res
In-m
em
ory
sto
res
Co
nn
ecto
rs
Data
mining
Recommendation
engines
Customer
behavior
modeling
BI
analytics
Real time
analytics
Big data analyticsCurrent common practice
• Limited performance
• Many layers of dependencies
• Low ROI on HW investment
• Run on state-of-art hardware
• Built with a patchwork of math libs
• Under-exploiting hardware performance features
Data sources
Finance
Social
media
Marketing
IoT
Mfg
…
Problem Statement
Spark* MLLib
Breeze
Netlib-Java JVM
JNI Netlib BLAS
Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Optimization Notice16
Big data frameworks: Hadoop, Spark, Cassandra, etc.
SQ
L
sto
res
No
SQ
L
sto
res
In-m
em
ory
sto
res
Co
nn
ecto
rs
Data
mining
Recommendation
engines
Customer
behavior
modeling
BI
analytics
Real time
analytics
Big data analyticsDesired practice • Optimized
performance• Simpler
development & deployment
• High ROI on HW investment
• Run on state-of-art hardware
• Single library to cover all stages of data analytics
• Fully optimized for underlying hardware
Data sources
Finance
Social
media
Marketing
IoT
Mfg
…
Desired Solution
Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Optimization Notice17
Big data frameworks: Hadoop, Spark, Cassandra, etc.
SQ
L
sto
res
No
SQ
L
sto
res
In-m
em
ory
sto
res
Co
nn
ecto
rs
Data
mining
Recommendation
engines
Customer
behavior
modeling
BI
analytics
Real time
analyticsData sources
Finance
Social
media
Marketing
IoT
Mfg
…
Where Intel DAAL Fits?
Intel® Data Analytics Acceleration Library
• Building end-to-end data applications• C++, Java, and Python APIs
Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Optimization Notice
Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Optimization Notice
Intel DAAL Components
Data Processing Algorithms
Optimized analytics building blocks for all data analysis stages, from data
acquisition to data mining and machine learning
Data Modeling
Data structures for model representation, and operations to
derive model-based predictions and conclusions
Data Management
Interfaces for data representation and access. Connectors to a variety of data sources and data formats, such HDFS,
SQL, CSV, KDB+, and user-defined data source/format
Data Sources Numeric Tables
Compression / Decompression
Serialization / Deserialization
Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Optimization Notice
Algorithms Batch Distributed Online
Descriptive statisticsLow order moments √ √ √
Quantiles/sorting √
Statistical relationships Correlation / Variance-Covariance √ √ √
(Cosine, Correlation) distance matrices √
Matrix decomposition
SVD √ √ √
Cholesky √
QR √ √ √
Regression Linear/ridge regression √ √ √
Classification
Multinomial Naïve Bayes √ √ √
SVM (two-class and multi-class) √
Boosting (Ada, Brown, Logit) √
Unsupervised learning
Association rules mining (Apriori) √
Anomaly detection (uni-/multi-variate) √
PCA √ √ √
KMeans √ √
EM for GMM √
Recommender systems ALS √ √
Deep learningFully connected, convolution,normalization, activation layers, model, NN, optimization solvers,
√
20
Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Optimization Notice21
Processing modes in Intel DAAL
Distributed Processing
Online Processing
D1D2D3
R = F(R1,…,Rk)
Si+1 = T(Si,Di)Ri+1 = F(Si+1)
R1
Rk
D1
D2
Dk
R2 R
Si,Ri
Batch Processing
D1Dk-
1
Dk…
Append
R = F(D1,…,Dk)
Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Optimization Notice22
Key Features in Intel DAAL 2017
• Neural Networks API
• Python API (a.k.a. PyDAAL)
– Easy installation through Anaconda
– Intel® Distribution for Python 2017
• Open source project on GitHub
Fork me on GitHub: https://github.com/01org/daal
Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Optimization Notice23
Handwritten Digit Recognition
Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Optimization Notice24
Handwritten Digit Recognition
Training multi-class SVM for 10 digits recognition.
3,823 pre-processed training data.
– available at http://archive.ics.uci.edu/ml/datasets/Optical+Recognition+of+Handwritten+Digits
99.6% accuracy with 1,797 test data from the same data provider.Confusion matrix: 177.000 0.000 0.000 0.000 1.000 0.000 0.000 0.000 0.000 0.000 0.000 181.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 0.000 0.000 2.000 173.000 0.000 0.000 0.000 0.000 1.000 1.000 0.000 0.000 0.000 0.000 176.000 0.000 1.000 0.000 0.000 3.000 3.000 0.000 1.000 0.000 0.000 179.000 0.000 0.000 0.000 1.000 0.000 0.000 0.000 0.000 0.000 0.000 180.000 0.000 0.000 0.000 2.000 0.000 0.000 0.000 0.000 0.000 0.000 180.000 0.000 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 170.000 1.000 8.000 0.000 3.000 0.000 0.000 0.000 0.000 0.000 0.000 166.000 5.000 0.000 0.000 0.000 2.000 0.000 1.000 0.000 0.000 2.000 175.000 Average accuracy: 0.996 Error rate: 0.004 Micro precision: 0.978 Micro recall: 0.978 Micro F-score: 0.978 Macro precision: 0.978 Macro recall: 0.978 Macro F-score: 0.978
Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Optimization Notice
Training Handwritten Digitsvoid trainModel() { /* Initialize FileDataSource<CSVFeatureManager> to retrieve input data from .csv file */ FileDataSource<CSVFeatureManager> trainDataSource(trainDatasetFileName, DataSource::doAllocateNumericTable, DataSource::doDictionaryFromContext); /* Load data from the data files */ trainDataSource.loadDataBlock(nTrainObservations); /* Create algorithm object for multi-class SVM training */ multi_class_classifier::training::Batch<> algorithm; algorithm.parameter.nClasses = nClasses; algorithm.parameter.training = training; /* Pass training dataset and dependent values to the algorithm */ algorithm.input.set(classifier::training::data,trainDataSource.getNumericTable()); /* Build multi-class SVM model */ algorithm.compute(); /* Retrieve algorithm results */ trainingResult = algorithm.getResult(); /* Serialize the learned model into a disk file */ ModelFileWriter writer("./model"); writer.serializeToFile(trainingResult->get(classifier::training::model)); }
25
Create a numeric table
Create an alg. Obj.
Set input and parameters
Compute
Serialize the learned model
Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Optimization Notice
Handwritten Digit Prediction
26
void testDigit() { /* Initialize FileDataSource<CSVFeatureManager> to retrieve the test data from .csv file */ FileDataSource<CSVFeatureManager> testDataSource(testDatasetFileName, DataSource::doAllocateNumericTable, DataSource::doDictionaryFromContext); testDataSource.loadDataBlock(1); /* Create algorithm object for prediction of multi-class SVM values */ multi_class_classifier::prediction::Batch<> algorithm; algorithm.parameter.prediction = prediction; /* Deserialize a model from a disk file */ ModelFileReader reader("./model"); services::SharedPtr<multi_class_classifier::Model> pModel(new multi_class_classifier::Model()); reader.deserializeFromFile(pModel); /* Pass testing dataset and trained model to the algorithm */ algorithm.input.set(classifier::prediction::data, testDataSource.getNumericTable()); algorithm.input.set(classifier::prediction::model, pModel); /* Predict multi-class SVM values */ algorithm.compute(); /* Retrieve algorithm results */ predictionResult = algorithm.getResult(); /* Retrieve predicted labels */ predictedLabels = predictionResult->get(classifier::prediction::prediction); }
Deserializelearned model
Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Optimization Notice
SVM Performance Boosts Using Intel® DAAL vs. Scikit-learnon Intel® CPU
27
Configuration Info - Versions: Intel® Data Analytics Acceleration Library 2016 U2, scikit-learn 0.16.1; Hardware: Intel Xeon E5-2680 v3 @ 2.50GHz, 24 cores, 30 MB L3 cache per CPU, 256 GB RAM; Operating System: Red Hat Enterprise Linux Server release 6.6, 64-bit.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. * Other brands and names are the property of their respective owners. Benchmark Source: Intel Corporation
Optimization Notice: Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804 .
0
2
4
6
8
10
12
Training Prediction
Sp
ee
du
p
11.25X
6.77X
Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Optimization Notice28
Intel® DAAL+ Intel® MKL = Complementary Big Data Libraries Solution
Intel MKL Intel DAAL
C and Fortran APIPrimitive level
Python, Java & C++ API High-level
Processing of homogeneous data in single or double precision
Processing heterogeneous data (mix of integers and floating point), internal conversions are hidden in the library
Type of intermediate computations is defined by type of input data (in some library domains higher precision can be used)
Type of intermediate computations can be configured independently of the type of input data
Most of MKL supports batch computation modeonly
3 computation modes: Batch, streaming and distributed
Cluster functionality uses MPI internally Developer chooses communication method for distributed computation (e.g. Spark, MPI, etc.) Code samples provided.
30
Intel Deep Learning Software Stack
*Other names and brands may be claimed as property of others.
Xeon Xeon Phi FPGA
Intel libraries as path to bring optimized ML/DL frameworks to Intel hardware
Intel MKL extracts max Intel hardware performance and provide common interface to all Intel accelerators
Intel DAAL provides optimized building blocks for machine learning applications on Intel platforms
Intel Optimized Frameworks
Intel® Deep Learning SDK – free tools to accelerate design, training and deployment of deep networks
Intel® Deep Learning SDK
Intel® Data Analytics Acceleration
Library (Intel® DAAL)
Intel® Math Kernal Library
(Intel® MKL)
Intel optimized frameworks – single-node and multi-node CPU optimizations of the most widely used deep learning frameworks
Caffe ** * *
*
31
Intel® Deep Learning SDK
Deep Learning Training Tool Deep Learning Deployment Tool
• IMPORT Trained Model (trained on Intel or 3rd Party HW)
• COMPRESS Model for Inference on Target Intel HW
• GENERATE faster inference with HW-Specific runtime
• INTEGRATE with System SW / Application Stack & TUNE
• EVALUATE Results and ITERATE
• INSTALL / SELECT IA-Optimized Frameworks
• PREPARE / CREATE Dataset with Ground-truth
• DESIGN / TRAIN Model(s) with IA-Opt. Hyper-Parameters
• MONITOR Training Progress across Candidate Models
• EVALUATE Results and ITERATE
configure_nn(fpga/cve,…)
allocate_buffer(…)
fpga_conv(input,output);
fpga_conv(…);
mkl_SoftMax(…);
mkl_SoftMax(…);
…
Technical preview available at software.intel.com/deep-learning-sdk
Increased Productivity
Faster Time-to-market for training and inference,
Improve model accuracy, Reduce total cost of
ownership
Maximum Performance
Optimized performance for training and inference on
Intel® Architecture
Intel® Deep Learning SDK BenefitsAccelerate Your Deep Learning Solution
“Plug & Train/Deploy”
Simplify installation & preparation of deep learning models using popular deep
learning frameworks on Intel hardware
32
Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Optimization Notice33
Summary
• Intel MKL and Intel DAAL provide high performance and optimized building blocks for data analytics and machine learning algorithms on Intel platforms.
• Deep learning applications can benefit from DNN primitives in Intel MKL 2017 and Neural Networks API in Intel DAAL 2017
• Intel Deep Learning SDK offers tools to facilitate develop, train, and deploy deep learning solutions.
Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Optimization Notice34
Resources
Intel machine learning strategy
https://software.intel.com/en-us/machine-learning
Intel MKL available in Intel® Parallel Studio XE and Intel® System Studio
https://software.intel.com/en-us/intel-mkl-support
Intel DAAL available as standalone, in Intel® Parallel Studio XE, as well as open source
https://software.intel.com/en-us/intel-daal-support
https://github.com/01org/daal
Free community License for Intel libraries
https://registrationcenter.intel.com/en/forms/?productid=2558&licensetype=2
Caffe* enabled by Intel MKL
https://github.com/intelcaffe/caffe
Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Optimization Notice
Legal Disclaimer & Optimization Notice
INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS”. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO THIS INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance ofthat product when combined with other products.
Copyright © 2016, Intel Corporation. All rights reserved. Intel, Pentium, Xeon, Xeon Phi, Core, VTune, Cilk, and the Intel logo are trademarks of Intel Corporation in the U.S. and other countries.
Optimization Notice
Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804
3535