Choosing a Deep Learning Library
There are a lot of them
JesseBrizzi{.com,@gmail.com,@curalate.com}
Who am I/What do I do?
● Research Engineer
○ Focus on computer vision and machine learning
○ CS background
● Work on the Image Intelligence Team @Curalate
○ E-Commerce SaaS
○ Platform that enables brands to find image-based social media content to repurpose for
e-commerce.
○ The Image Intelligence Team owns the entire pipeline, from researching new ML applications through
training and development to getting them into production.
○ Intelligent Product Tagging - technology that analyzes an image and uses machine
learning to identify specific products depicted within that image.
What’s a Neural Net?
● FCN - Fully Connected Network
○ Multilayer perceptron/fundamental neural net where each neuron is connected to all neurons in
the previous layer of the network.
● CNN - Convolutional Neural Network
○ Neural net that uses convolutional layers, heavily used in Computer Vision applications.
● RNN - Recurrent Neural Network
○ Neural net that feeds its output back into itself to process the next input, heavily used in
Natural Language Processing applications.
● LSTM - Long Short-Term Memory Recurrent Neural Net
○ Fancy RNNs with additional gates that control what information is passed on to the next step.
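To make the difference between a fully connected layer and a recurrent net concrete, here is a minimal pure-Python sketch. The function names (`dense`, `rnn`), the tanh activation, and all weights are assumptions chosen for illustration; none of this comes from a particular library.

```python
import math

def dense(inputs, weights, biases):
    """Fully connected layer: every output neuron sees every input."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def rnn(sequence, w_in, w_rec, bias, state=0.0):
    """Single-unit recurrent net: the previous output is fed back in
    alongside each new input."""
    outputs = []
    for x in sequence:
        state = math.tanh(w_in * x + w_rec * state + bias)
        outputs.append(state)
    return outputs

# Two inputs feeding three neurons (illustrative weights).
hidden = dense([1.0, -1.0], [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]], [0.0, 0.0, 0.0])
# The same weights are reused at every step of the sequence.
states = rnn([1.0, 0.0, 0.0], w_in=1.0, w_rec=0.5, bias=0.0)
```

The recurrent version keeps producing non-zero outputs after the input goes to zero, because the earlier state is carried forward.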
Important Factors
● Academia vs Industry
○ Who is the target audience?
● Community support
○ Pretrained models?
○ Research paper repos?
○ How googleable are bugs and issues?
● Development speed/barriers to entry
○ Abstractions of low-level concepts
○ Documentation quality
○ Supported programming languages
○ Ability to scale
Important Factors
● Codebase Quality
○ Is the code actively maintained?
● Performance
○ Benchmarks (oldish) https://arxiv.org/pdf/1608.07249.pdf
○ Performance does not scale very well on CPUs. 16-core CPUs are only slightly better than 4-
or 8-core CPUs.
○ GPUs perform much better than many-core CPUs.
○ Scalability across multiple GPUs
○ Performance is also affected by the design of configuration files/implementation paradigm.
Important Factors
● Train to Production pipeline
○ Support for a fast-to-prototype language (Python, R) and deployment in your production
language (Java/Scala, C++, JS, whatever).
○ Training locally, if you have the hardware, vs training on pre-prepared, simplified cloud
services.
○ Ability to run on different platforms ranging from mobile phones to massive server farms
○ Ability to transfer your work to other libraries
Imperative vs Symbolic paradigms
● Dynamic Computation Graphing (Imperative Programming)
○ Graphs are built at run time, which lets you use standard language statements.
○ The system generates the graph structure at run time.
○ Useful when the graph structure needs to change at run time.
○ Makes debugging easy.
● Imperative programs tend to be more flexible
○ It’s easier to use native language features.
○ The graph can follow your program’s logical control flow.
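A rough sketch of what "the graph is built at run time" can mean: each arithmetic operation records itself as it executes, so an ordinary Python loop determines the graph's shape. The `Value` class and `model` function are hypothetical, written only for this illustration; real libraries work on tensors and are far more involved.

```python
class Value:
    """A scalar that records the operations applied to it (define-by-run)."""

    def __init__(self, data, parents=(), grad_fns=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents      # Values this one was computed from
        self._grad_fns = grad_fns    # local derivative rule for each parent

    def __add__(self, other):
        return Value(self.data + other.data, (self, other),
                     (lambda g: g, lambda g: g))

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other),
                     (lambda g: g * other.data, lambda g: g * self.data))

    def backward(self):
        # Walk the recorded graph in reverse topological order.
        order, seen = [], set()

        def visit(v):
            if id(v) not in seen:
                seen.add(id(v))
                for p in v._parents:
                    visit(p)
                order.append(v)

        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            for parent, fn in zip(v._parents, v._grad_fns):
                parent.grad += fn(v.grad)


def model(x, w, steps):
    # Ordinary Python control flow decides the graph's depth at run time.
    out = x
    for _ in range(steps):
        out = out * w
    return out


w = Value(2.0)
y = model(Value(3.0), w, steps=3)   # records the graph for 3 * w**3
y.backward()                        # w.grad now holds d(3 * w**3)/dw at w = 2
```

Calling `model` with a different `steps` value records a different graph, with no separate compilation step in between.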
Imperative vs Symbolic paradigms
● Symbolic Programs Tend to be More Efficient
○ Both in terms of memory and speed.
○ Can safely reuse memory for in-place computation.
○ Can also apply operation-folding optimizations.
● Static Computation Graphing (Symbolic Paradigm)
○ Define the computation graph once, execute the graph many times.
○ Can optimize the graph up front.
○ Good for fixed-size nets (feed-forward, CNN)
● Easier to manage in terms of loading and resources
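The "define once, optimize, execute many times" idea can be sketched in a few lines of pure Python. The tuple-based graph encoding and the helper names (`fold_constants`, `count_ops`, `run`) are assumptions made up for this example; constant folding stands in for the broader class of graph optimizations a symbolic library might apply.

```python
import operator

OPS = {"add": operator.add, "mul": operator.mul}

# A graph node is ("const", value), ("input", name) or (op, left, right).
def fold_constants(node):
    """One-time optimization pass: pre-compute subgraphs made only of constants."""
    if node[0] in ("const", "input"):
        return node
    op, left, right = node[0], fold_constants(node[1]), fold_constants(node[2])
    if left[0] == "const" and right[0] == "const":
        return ("const", OPS[op](left[1], right[1]))
    return (op, left, right)

def count_ops(node):
    """Number of operations that must be executed on every run."""
    if node[0] in ("const", "input"):
        return 0
    return 1 + count_ops(node[1]) + count_ops(node[2])

def run(node, feed):
    """Execute a graph with concrete input values."""
    if node[0] == "const":
        return node[1]
    if node[0] == "input":
        return feed[node[1]]
    return OPS[node[0]](run(node[1], feed), run(node[2], feed))

# Define once: y = x * (2 * 3) + (1 + 1)
graph = ("add",
         ("mul", ("input", "x"), ("mul", ("const", 2), ("const", 3))),
         ("add", ("const", 1), ("const", 1)))
optimized = fold_constants(graph)   # equivalent to y = x * 6 + 2

# Execute many times with different inputs.
results = [run(optimized, {"x": x}) for x in (0, 1, 5)]
```

Because the whole graph is known before any data arrives, the optimization cost is paid once and every later run executes fewer operations.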
Libraries That People Should Know About
Caffe
● IMO the first mainstream production-ready lib.
○ High-performance and well-tested C++ codebase.
● One of the first, and largest, model zoos.
● Large community of open source research projects.
● Able to train a net from your data without writing any code.
● Good for feedforward networks, image processing, and for fine-tuning
pretrained nets
● Main advantage was being first to market.
● Can convert models to almost any other relevant lib.
UC Berkeley
Watches: 2,241 | Stars: 27,296 | Forks: 16,454
Avg Issue Resolution: 3 Days | Open issues: 13%
Symbolic Paradigm
Research Citations (2014): 10,159
Model zoo
Caffe
● Has bad design choices inherited from its original use case:
conventional CNN applications.
● Not good for recurrent networks
● Does not support automatic differentiation
● Very verbose in layer and network definitions
○ The graph is treated as a collection of layers, as opposed to
nodes of single tensor operations
Keras
● A library that sits on top of other DL libs and provides a single, easy-to-use, high-level interface.
● Very modular, minimal, readable, object-oriented code.
● Great for beginners, with great documentation
● Lacking in optimizations
● Supported backends
○ TensorFlow, Theano, CNTK, MXNet
● Can export your trained models into the backend’s format.
● Fork included in TensorFlow’s Python library.
● Not as customizable
Watches: 1,982 | Stars: 38,796 | Forks: 14,799
Avg Issue Resolution: 23 Days | Open issues: 24%
Symbolic Paradigm
Model zoo
TensorFlow
● Currently the most popular option.
○ Largest active community
○ More open source projects and models.
● Google’s attempt to build a single framework for
everything deep learning related.
○ Built with massive distributed computing in mind (powers G-apps).
○ Has mobile capabilities in the form of TensorFlow Mobile and
TensorFlow Lite.
● TensorBoard is amazing for debugging and training.
● TensorFlow Serving for prod deployments (python)
● A lot of documentation (official and 3rd party)
Watches: 8,606 | Stars: 121,864 | Forks: 72,545
Avg Issue Resolution: 8 Days | Open issues: 16%
Symbolic/Dynamic Paradigm
Research Citations (2016): 6,233
Model zoo
CNN Example Code (Keras R)
CNN Example Code (Keras Py)
CNN Example Code
TensorFlow
● Deep Google Cloud integration.
● Pretty low level (Keras and Sonnet help solve this)
● Most things outside of the core C/Python library are “experimental”
○ None of the APIs outside of the Python API are covered by
the API stability promises.
● Biggest issue with the library is performance.
○ TensorFlow is slower and more of a resource hog
compared to the other libraries.
○ Other libs can perform twice as fast on typical deep net tasks.
○ Avoid for performant RNN or LSTM networks.
○ Worst at scaling efficiency.
Torch/PyTorch
● Torch was one of the original academic-focused libs.
● Many maintainers went to work at
Facebook and created PyTorch.
● They use the same underlying C lib.
○ Provide similar performance.
● They differ in
○ Interface (Lua vs Python)
○ Auto diff capabilities
○ Paradigms
Torch: DeepMind, NYU, IDIAP
Watches: 665 | Stars: 8,218 | Forks: 2,340
Avg Issue Resolution: 69 Days | Open issues: 34%
Symbolic Paradigm
Research Citations: 1,246
Model zoo
PyTorch
Watches: 1,197 | Stars: 25,450 | Forks: 6,044
Avg Issue Resolution: 6 Days | Open issues: 24%
Symbolic/Dynamic Paradigm
Research Citations: 879
Model zoo
CNN Example Code
PyTorch
● PyTorch was made with the goal of fixing or modernizing Torch.
● Hybrid front end for switching between paradigms.
● PyTorch also has its own visualization dashboard called Visdom.
● Probably best avoided if you want to deploy into production.
○ Facebook maintains a separate lib targeted at developers,
Caffe2.
○ Changes are under way to make PyTorch production-ready.
○ Caffe2 was recently merged into PyTorch
● Researchers tend to prefer PyTorch over TensorFlow
○ Makes prototyping easy
MXNet
● Newer and growing option.
● Largest officially supported API selection.
○ High compatibility and consistency.
● Direct competitor to TensorFlow across all applications.
○ It can run on everything from a web browser or a mobile
phone to a massive distributed server farm.
○ Amazon has found that you can get up to 85% scaling
efficiency with MXNet.
● Has its own serving framework and deep integration with AWS.
● Also has its own TensorBoard forks.
Apache, Amazon
Watches: 1,180 | Stars: 16,450 | Forks: 5,889
Avg Issue Resolution: 40 Days | Open issues: 13%
Symbolic/Dynamic Paradigm
Research Citations: 712
Model zoo
CNN Example Python Code
CNN Example Code (Gluon)
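As a rough guide to reading the 85% scaling-efficiency figure quoted for MXNet, the sketch below assumes the usual definition: achieved speedup divided by the number of GPUs. The function name and the throughput numbers are hypothetical, chosen only so the arithmetic lands on 85%; they are not measurements.

```python
def scaling_efficiency(single_gpu_throughput, multi_gpu_throughput, num_gpus):
    """Fraction of ideal linear speedup actually achieved."""
    speedup = multi_gpu_throughput / single_gpu_throughput
    return speedup / num_gpus

# Hypothetical numbers: 100 images/s on one GPU, 1,360 images/s on 16 GPUs.
# Speedup is 13.6x out of an ideal 16x.
eff = scaling_efficiency(100.0, 1360.0, 16)
```

Under this definition, perfect linear scaling gives 1.0, and anything lost to communication or synchronization between devices pulls the value down.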
MXNet Gluon
● Collaboration between AWS and Microsoft.
● Provides a clear, concise, and simple API for deep learning.
○ Full set of plug-and-play neural network building blocks.
■ Predefined layers, optimizers, and initializers
○ Built-in model zoo.
● Hybridization is awesome
○ Hybrid Symbolic/Dynamic graph functionality.
○ Offers benefits of both.
○ Can make Gluon 3x faster than PyTorch
● Great documentation for absolute beginners.
MXNet
● The non-Python APIs are lacking in certain aspects.
○ The documentation can be weak.
○ Stability issues at full production scale.
● Community is growing, but is still small
○ Never the first library used for open source projects
CNTK
● Microsoft Cognitive Toolkit was originally created by MSR Speech
researchers
○ Now it has expanded to all types of deep learning applications.
● Used in Skype, Xbox, Cortana, anything “Azure”
● Focus on NLP with unbeatable RNN/LSTM performance
● Supports distributed training like TensorFlow
● Only library with first class support for the Windows ecosystem.
○ No support for OSX
○ Simple Azure deployment
○ .NET language support
Microsoft
Watches: 1,388 | Stars: 15,850 | Forks: 4,217
Avg Issue Resolution: 28 Days | Open issues: 15%
Symbolic/Dynamic Paradigm
Research Citations: 140
Model zoo
CNN Example Code
CNN Example Code (Keras)
CNTK
● Average model zoo size/quality
● Good documentation consistent with other Microsoft products
● Unconventional open source license history.
● Small community
● Used the least in research
ONNX
● https://onnx.ai/
● Open Neural Network Exchange Format
● Created in collaboration between AWS, Facebook and Microsoft
● Library and format for converting trained neural net models
between libraries
● Provides a standardized ONNX model format.
Performance Comparisons Summary
● Benchmarks (oldish, 2017) https://arxiv.org/pdf/1608.07249.pdf
○ Compares CNTK, Torch, Caffe, MXNet, TensorFlow
○ CPU to multi-GPU performance on synthetic/real data across various deep
learning architectures (CNN, FCN, RNN, LSTM...).
● Single GPU
○ Caffe, CNTK and Torch perform better than MXNet and TensorFlow on FCNs.
○ MXNet is outstanding on CNNs, especially larger networks, while Caffe and
CNTK also achieve good performance on smaller CNNs.
○ On RNNs or LSTMs, CNTK obtains excellent time efficiency, up to 5-10x better than the rest.
Performance Comparisons Summary
● Multiple GPUs
○ MXNet and Torch scale the best and TensorFlow scales the worst.
○ CNTK scales better on FCNs specifically.
● Library-specific optimizations
○ CNTK allows trading off GPU memory for better computing efficiency.
○ MXNet can enable model auto-tuning using the NVIDIA cuDNN library.
● Overall the performance of TensorFlow is lacking compared to the other tools.
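Since the benchmark linked above is dated, it may be worth timing candidate libraries on your own workload. Below is a minimal sketch of a per-mini-batch timing harness using only the standard library. `time_per_batch` and `fake_step` are hypothetical names, and `fake_step` is a stand-in for a real forward/backward pass.

```python
import statistics
import time

def time_per_batch(step_fn, batches, warmup=2):
    """Median wall-clock seconds per mini-batch for a training-step callable."""
    for batch in batches[:warmup]:
        step_fn(batch)                    # warm-up runs are not timed
    timings = []
    for batch in batches[warmup:]:
        start = time.perf_counter()
        step_fn(batch)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)

def fake_step(batch):
    # Stand-in for one forward/backward pass; swap in a real library call.
    return sum(x * x for x in batch)

batches = [list(range(1000)) for _ in range(12)]
seconds = time_per_batch(fake_step, batches)
```

With a GPU library, the step function would also need to wait for the device to finish before the clock is read, because GPU work is typically launched asynchronously.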
Other Libraries to take note of...
Theano
● University of Montreal
● Research Citations - 290
● Development has ended, may it rest in peace ⚰
● Makes you do a lot of things from scratch, which leads to more verbose code.
● Single GPU support
● Numerous open-source deep learning libraries have been built on top of Theano,
including Keras, Lasagne and Blocks
● CNN Example Code (Keras) or CNN Example Code (Lasagne)
● No real reason to use over TensorFlow unless you are working with old code.
Caffe 2
● Facebook
● CNN Example Code
● Merged into the PyTorch codebase.
● Caffe2 targets supporting production applications with a focus on mobile.
● Caffe2 is built to excel at large scale deployments.
○ Caffe2 is built to utilize both multiple GPUs on a single host and multiple hosts with GPUs.
● Caffe2 improves on Caffe in several directions:
○ first-class support for large-scale distributed training
○ mobile deployment
○ new hardware support (in addition to CPU and CUDA)
○ flexibility for future directions such as quantized computation
○ stress tested by the vast scale of Facebook applications
Fast.ai
● fastai
● The library is based on research into deep learning best practices.
● Built on top of PyTorch
● Free, online, yearly updated courses in deep learning
○ Can even take it in person in SF
● Quickest at integrating new research examples
● Great for beginners getting into research.
Watches 555 Star 12,306 Forks 4,479 Median Issue Resolution 8 HOURS Open Issues 1%
CoreML
● Apple
● Closed source
● Not a full DL library (you cannot use it to train models at the moment); mainly focused on
deploying pretrained models to iOS and macOS devices
○ If you need to train your own model, you will need to use one of the above libraries
○ Model converters available for Keras, Caffe, Scikit-learn, libSVM, XGBoost, MXNet, and
TensorFlow
Deep Learning Toolbox
● https://www.mathworks.com/products/deep-learning.html
● A MATLAB toolbox implementing CNNs and LSTMs.
● GPU support and cloud GPU on AWS with MATLAB Distributed Computing Server
● Create, edit, visualize, and analyze deep learning networks with interactive apps.
● Visualize network topologies, training progress, and activations of the learned features in a
deep learning network.
● Import models from Caffe/TensorFlow-Keras/ONNX
● Not open source
○ $500 annual license
○ $1250 perpetual license
Deeplearning4j
● Skymind
● Keras Support (Python API)
● Written with Java and the JVM in mind
● Focus on enterprise scale
● Great Documentation
● DL4J takes advantage of the latest distributed computing frameworks including Hadoop and
Apache Spark to accelerate training. On multi-GPUs, it is equal to Caffe in performance.
● Can import models from Tensorflow
Watches 835 Star 10,431 Forks 4,602 Median Issue Resolution 6 days Open Issues 20%
Chainer
● Preferred Networks
● Research Citations(2015) - 207
● CNN Example Code
● Dynamic computation graph
● Used by IBM, Intel
● Japanese and English Community
Watches 328 Star 4,626 Forks 1,228 Median Issue Resolution 44 days Open Issues 11%
Darknet
● https://github.com/pjreddie/darknet
● Very small open source effort with a laid-back dev group.
○ Emojis and jokes everywhere.
○ Seems more of an exercise by the developers.
● Not useful for production environments.
● Maintainer wrote my favorite research paper.
Watches 786 Star 11,980 Forks 6,770 Median Issue Resolution 26 days Open Issues 76%
Sonnet
● DeepMind
● Google DeepMind
○ One of the biggest names in industry research
○ AlphaGo, AlphaStar
● Built on TensorFlow; makes NN construction and
training easy and extensible.
Watches 475 Star 7,362 Forks 1,011 Median Issue Resolution 14 days Open Issues 14%
Knet.jl
● https://github.com/denizyuret/Knet.jl
● The Koç University deep learning framework, implemented in Julia
● Supports GPU operation, automatic differentiation, and dynamic computational graphs
● Model code can use the full power and expressivity of Julia.
● CNN Example Code
Watches 75 Star 833 Forks 149 Median Issue Resolution 9 days Open Issues 17%
Paddle
● Baidu
● PArallel Distributed Deep LEarning
● Chinese documentation with an English translation.
● Originally developed by Baidu scientists and engineers for the purpose of applying deep
learning to many products at Baidu.
● Really only worth using if you are in the Chinese market/ecosystem.
Watches 649 Star 8,224 Forks 2,232 Median Issue Resolution 14 days Open Issues 18%
ConvNetJS
● Stanford
● Train Neural Networks entirely in your browser.
● Start training a net now!
● Great for visualizing the full network and training process.
● Mainly used for demonstrating and teaching deep learning on the web
○ See Stanford’s CS231n
Watches 645 Star 9,563 Forks 1,891 Median Issue Resolution 59 days Open Issues 69%
Neon
● Intel
● Written with Intel Nervana MKL accelerated hardware in mind (Xeon and Phi processors)
● Intel's reference deep learning framework committed to best performance on all hardware.
● One of the fastest libraries
● One of the first half precision floating point enabled libraries.
Watches 366 Star 3,730 Forks 830 Median Issue Resolution 25 days Open Issues 17%
DyNet
● Carnegie Mellon University
● Dynamic computation graph
● Small user community
Watches 200 Star 2,688 Forks 626 Median Issue Resolution 7 days Open Issues 12%
TLDR
● Choose TensorFlow or MXNet-Gluon for Industry/Production Environments
○ TensorFlow if you prioritize community support and documentation, MXNet if you need
performance
● PyTorch if you are doing research/developing new models/layers.
● Keras if you are new and want to get started quickly.
● Fast.ai + PyTorch if you are here to learn.
● CNTK if you ❤ Windows/Visual Studio/.NET or want to do high performance NLP
● CoreML for deploying things to Apple devices
● Deeplearning4j if you really like to keep things in the JVM.
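The recommendations above can be encoded as a small lookup, which is one way to keep the decision explicit in a team wiki or script. This is only a sketch: the function name, the goal keywords, and the fallback strings are all hypothetical, and the mapping merely restates the summary above.

```python
def suggest_library(goal, priority=None):
    """Map a goal (and optional priority) to the summary's recommendation."""
    if goal == "production":
        if priority == "community":
            return "TensorFlow"
        if priority == "performance":
            return "MXNet-Gluon"
        return "TensorFlow or MXNet-Gluon"
    table = {
        "research": "PyTorch",
        "beginner": "Keras",
        "learning": "Fast.ai + PyTorch",
        "windows": "CNTK",
        "nlp": "CNTK",
        "apple": "CoreML",
        "jvm": "Deeplearning4j",
    }
    # Goals the summary does not cover get an explicit non-answer.
    return table.get(goal, "no recommendation in this summary")

choice = suggest_library("production", priority="performance")
```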