
Josh Patterson, Advisor, Skymind – Deep learning for Industry at MLconf ATL 2016


Page 1

skymind.io | deeplearning.org | gitter.im/deeplearning4j

DL4J and DataVec

Building Production Class Deep Learning Workflows for the Enterprise

Josh Patterson / Director Field Org

MLConf 2016 / Atlanta, GA

Page 2

Josh Patterson

Director Field Engineering / Skymind

Co-Author: O’Reilly’s “Deep Learning: A Practitioner’s Approach”

Past:

Self-Organizing Mesh Networks / Meta-Heuristics Research

Smartgrid work / TVA + NERC

Principal Field Architect / Cloudera

Page 3

Topics

• Deep Learning in Production for the Enterprise

• DL4J and DataVec

• Example Workflow: Modeling Sensor Data with RNNs

Page 4

Deep Learning in Production

Page 5

Defining Deep Learning

Higher neuron counts than in previous generation neural networks

Different and evolved ways to connect layers inside neural networks

More computing power to train

Automated Feature Learning

“machines that learn to represent the world”

Page 6

Quick Usage Guide

• If I have Timeseries or Audio Input: Use a Recurrent Neural Network

• If I have Image input: Use a Convolutional Neural Network

• If I have Video input: Use a hybrid Convolutional + Recurrent Architecture!

Page 7

The Challenge of the Fortune 500

Take a business problem and translate it into a productizable solution

• Get data together

• Understand modeling, pull together expertise

Get the right data workflow / infra architecture to productionize the application

• Security

• Integration

Page 8

“Google is living a few years in the future and sending the rest of us messages”

-- Doug Cutting in 2013

However

Most organizations are not built like Google

(and Jeff Dean does not work at your company…)

Anyone building Next-Gen infrastructure has to consider these things

Page 9

Production Considerations

• Security – even though I can build a model, will IT let me run it?

• Data Warehouse Integration – can I easily run this in the existing IT footprint?

• Speedup – once I need to go faster, how hard is it to speed up modeling?

Page 10

DL4J and DataVec

Page 11

DL4J and DataVec

• DL4J – Apache License 2.0 JVM Platform for Enterprise Deep Learning

• DataVec - a tool for machine learning ETL (Extract, Transform, Load) operations.

• Both run natively on Spark, with CPU or GPU backends

• DL4J Suite certified on CDH5, HDP2.4, and upcoming IBM IOP platform.

Page 12

ND4J: The Need for Speed

JavaCPP
• Auto-generates JNI bindings for C++
• Allows for easy maintenance and deployment of C++ binaries in Java

CPU Backends
• OpenMP (multithreading within native operations)
• OpenBLAS or MKL (BLAS operations)
• SIMD extensions

GPU Backends
• DL4J supports CUDA 7.5 (+cuBLAS) at the moment, and will support 8.0 as soon as it comes out
• Leverages cuDNN as well

https://github.com/deeplearning4j/dl4j-benchmark

Page 13

Prepping Data is Time Consuming

http://www.forbes.com/sites/gilpress/2016/03/23/data-preparation-most-time-consuming-least-enjoyable-data-science-task-survey-says/#633ea7f67f75

Page 14

Preparing Data for Modeling is Hard

Page 15

DL4J Workflow Toolchain

ETL (DataVec)

Vectorization (DataVec)

Modeling (DL4J)

Evaluation (Arbiter)

Execution Platforms: Spark/Hadoop, Single Machine

ND4J - Linear Algebra Runtime: CPU, GPU

Page 16

Modeling Sensor Data with RNNs and DL4J

Page 17

NERC Sensor Data Collection

openPDC PMU Data Collection circa 2009

• 120 Sensors

• 30 samples/second

• 4.3B Samples/day

• Housed in Hadoop
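A rough back-of-the-envelope check of the ingest rate these figures imply, as a sketch only. The slide does not say how many values each sensor reports per sample, so the first calculation assumes a single value per sensor per sample; that assumption and the class and method names are illustrative, not from the talk. The slide's 4.3B samples/day total works out to a much higher per-second rate, which would be consistent with each sensor reporting several measurements per sample.

```java
/** Back-of-the-envelope throughput arithmetic (hypothetical helper, not from the talk). */
final class SampleRateSketch {
    static final long SECONDS_PER_DAY = 24L * 60 * 60;

    /** Samples per day for a number of streams, each sampled at a fixed rate. */
    static long samplesPerDay(long streams, long samplesPerSecond) {
        return streams * samplesPerSecond * SECONDS_PER_DAY;
    }

    /** Sustained per-second rate needed to reach a given daily total. */
    static double perSecond(double samplesPerDay) {
        return samplesPerDay / SECONDS_PER_DAY;
    }

    public static void main(String[] args) {
        // Assumption: one value per sensor per sample.
        System.out.println("One value per sensor: " + samplesPerDay(120, 30) + " samples/day");
        // Rate implied by the slide's stated daily total.
        System.out.printf("Stated total: %.0f samples/second%n", perSecond(4.3e9));
    }
}
```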

Page 18

Classifying UCI Sensor Data: Trends

A – Downward Trend

B – Cyclic

C – Normal

D – Upward Shift

E – Upward Trend

F – Downward Shift

Page 19

Loading and Transforming Timeseries Data with DataVec

SequenceRecordReader trainFeatures = new CSVSequenceRecordReader();
trainFeatures.initialize(new NumberedFileInputSplit(featuresDirTrain.getAbsolutePath() + "/%d.csv", 0, 449));
SequenceRecordReader trainLabels = new CSVSequenceRecordReader();
trainLabels.initialize(new NumberedFileInputSplit(labelsDirTrain.getAbsolutePath() + "/%d.csv", 0, 449));

int minibatch = 10;
int numLabelClasses = 6;
DataSetIterator trainData = new SequenceRecordReaderDataSetIterator(trainFeatures, trainLabels, minibatch, numLabelClasses,
    false, SequenceRecordReaderDataSetIterator.AlignmentMode.ALIGN_END);

//Normalize the training data
DataNormalization normalizer = new NormalizerStandardize();
normalizer.fit(trainData); //Collect training data statistics

trainData.reset();
trainData.setPreProcessor(normalizer); //Use previously collected statistics to normalize on-the-fly
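The fit-then-apply pattern used by the normalizer above can be sketched in plain Java. This is an illustration of standardization (subtract the mean, divide by the standard deviation) for a single feature, not DL4J's implementation: the class name is hypothetical, and the use of the population standard deviation and the zero-variance guard are assumptions.

```java
import java.util.Arrays;

/** Minimal fit-then-transform standardizer for one feature (hypothetical, not DL4J code). */
final class StandardizeSketch {
    private double mean;
    private double std;

    /** Collect statistics over every value in the training sequences. */
    void fit(double[][] sequences) {
        double sum = 0.0;
        long count = 0;
        for (double[] seq : sequences) {
            for (double v : seq) {
                sum += v;
                count++;
            }
        }
        mean = sum / count;

        double squared = 0.0;
        for (double[] seq : sequences) {
            for (double v : seq) {
                squared += (v - mean) * (v - mean);
            }
        }
        std = Math.sqrt(squared / count); // population standard deviation (assumption)
        if (std == 0.0) {
            std = 1.0; // assumption: leave constant data unscaled instead of dividing by zero
        }
    }

    /** Apply the previously collected statistics to one sequence. */
    double[] transform(double[] seq) {
        double[] out = new double[seq.length];
        for (int i = 0; i < seq.length; i++) {
            out[i] = (seq[i] - mean) / std;
        }
        return out;
    }

    double mean() { return mean; }

    double std() { return std; }

    public static void main(String[] args) {
        double[][] train = { {2.0, 4.0}, {6.0, 8.0} };
        StandardizeSketch normalizer = new StandardizeSketch();
        normalizer.fit(train);                       // collect statistics once
        for (double[] seq : train) {                 // then normalize each sequence with them
            System.out.println(Arrays.toString(normalizer.transform(seq)));
        }
    }
}
```

The statistics come from the training data only and are then reused unchanged, which is why the slide fits the normalizer before attaching it as a preprocessor.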

Page 20

Configuring a Recurrent Neural Network with DL4J

MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
    .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT).iterations(1)
    .updater(Updater.NESTEROVS).momentum(0.9).learningRate(0.005)
    .gradientNormalization(GradientNormalization.ClipElementWiseAbsoluteValue)
    .gradientNormalizationThreshold(0.5)
    .list()
    .layer(0, new GravesLSTM.Builder().activation("tanh").nIn(1).nOut(10).build())
    .layer(1, new RnnOutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
        .activation("softmax").nIn(10).nOut(numLabelClasses).build())
    .pretrain(false).backprop(true).build();

MultiLayerNetwork net = new MultiLayerNetwork(conf);
net.init();
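The configuration above clips gradients element-wise with a threshold of 0.5, a common safeguard against exploding gradients in recurrent networks. A minimal sketch of that idea in plain Java, assuming the setting means "clamp each gradient value to [-threshold, threshold]"; the class and method names are hypothetical and this is not DL4J's code.

```java
import java.util.Arrays;

/** Element-wise gradient clipping (hypothetical illustration, not DL4J code). */
final class GradientClipSketch {

    /** Clamp every element of the gradient to [-threshold, threshold], in place. */
    static void clipElementWise(double[] gradient, double threshold) {
        for (int i = 0; i < gradient.length; i++) {
            gradient[i] = Math.max(-threshold, Math.min(threshold, gradient[i]));
        }
    }

    public static void main(String[] args) {
        double[] gradient = {-2.0, -0.3, 0.0, 0.4, 7.5};
        clipElementWise(gradient, 0.5);
        // prints [-0.5, -0.3, 0.0, 0.4, 0.5]
        System.out.println(Arrays.toString(gradient));
    }
}
```

Values already inside the range pass through unchanged, so only the outliers are affected.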

Page 21

Train the Network on Local Machine

int nEpochs = 40;
String str = "Test set evaluation at epoch %d: Accuracy = %.2f, F1 = %.2f";

for (int i = 0; i < nEpochs; i++) {
    net.fit(trainData);

    //Evaluate on the test set:
    Evaluation evaluation = net.evaluate(testData);
    System.out.println(String.format(str, i, evaluation.accuracy(), evaluation.f1()));

    testData.reset();
    trainData.reset();
}
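The loop above reports accuracy and F1 after each epoch. As a sketch of what those two numbers measure, the following plain-Java code computes accuracy and a macro-averaged F1 from integer class labels. The macro averaging is an assumption (the slide does not say how the library averages F1 across the six classes), and the class and method names are hypothetical.

```java
/** Accuracy and macro-averaged F1 from class labels (hypothetical, not DL4J's Evaluation). */
final class EvalSketch {

    /** Fraction of predictions that match the actual label. */
    static double accuracy(int[] actual, int[] predicted) {
        checkSameLength(actual, predicted);
        int correct = 0;
        for (int i = 0; i < actual.length; i++) {
            if (actual[i] == predicted[i]) {
                correct++;
            }
        }
        return (double) correct / actual.length;
    }

    /** Per-class F1 = 2TP / (2TP + FP + FN), averaged with equal weight per class. */
    static double macroF1(int[] actual, int[] predicted, int numClasses) {
        checkSameLength(actual, predicted);
        int[] tp = new int[numClasses];
        int[] fp = new int[numClasses];
        int[] fn = new int[numClasses];
        for (int i = 0; i < actual.length; i++) {
            if (actual[i] == predicted[i]) {
                tp[actual[i]]++;
            } else {
                fp[predicted[i]]++; // predicted this class, but it was wrong
                fn[actual[i]]++;    // missed the true class
            }
        }
        double sum = 0.0;
        for (int c = 0; c < numClasses; c++) {
            int denominator = 2 * tp[c] + fp[c] + fn[c];
            // Assumption: a class that never appears and is never predicted scores 0.
            sum += denominator == 0 ? 0.0 : (2.0 * tp[c]) / denominator;
        }
        return sum / numClasses;
    }

    private static void checkSameLength(int[] a, int[] b) {
        if (a.length != b.length || a.length == 0) {
            throw new IllegalArgumentException("label arrays must be non-empty and equal length");
        }
    }

    public static void main(String[] args) {
        int[] actual    = {0, 0, 1, 1, 2, 2};
        int[] predicted = {0, 0, 1, 2, 2, 2};
        System.out.printf("Accuracy = %.2f, F1 = %.2f%n",
                accuracy(actual, predicted), macroF1(actual, predicted, 3));
    }
}
```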

Page 22

Train the Network on Spark

TrainingMaster tm = new ParameterAveragingTrainingMaster(true, executors_count, 1, batchSizePerWorker, 1, 0);

//Create Spark multi layer network from configuration
SparkDl4jMultiLayer sparkNetwork = new SparkDl4jMultiLayer(sc, net, tm);

int nEpochs = 40;
String str = "Test set evaluation at epoch %d: Accuracy = %.2f, F1 = %.2f";

for (int i = 0; i < nEpochs; i++) {
    sparkNetwork.fit(trainDataRDD);

    //Evaluate on the test set:
    Evaluation evaluation = net.evaluate(testData);
    System.out.println(String.format(str, i, evaluation.accuracy(), evaluation.f1()));

    testData.reset();
    trainData.reset();
}
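The training master used above is based on parameter averaging: each worker trains its own copy of the network on a slice of the data, and the copies' parameters are then averaged into one model that is sent back out. The averaging step itself can be sketched in plain Java as below. The class and method names are hypothetical, parameters are treated as flat arrays for simplicity, and scheduling, serialization and updater state are left out entirely.

```java
import java.util.Arrays;

/** The averaging step of parameter-averaging training (hypothetical, not DL4J code). */
final class ParameterAveragingSketch {

    /** Element-wise mean of the flattened parameter vectors returned by each worker. */
    static double[] average(double[][] workerParams) {
        if (workerParams.length == 0) {
            throw new IllegalArgumentException("need at least one worker");
        }
        int size = workerParams[0].length;
        double[] mean = new double[size];
        for (double[] params : workerParams) {
            if (params.length != size) {
                throw new IllegalArgumentException("all workers must share one parameter layout");
            }
            for (int i = 0; i < size; i++) {
                mean[i] += params[i];
            }
        }
        for (int i = 0; i < size; i++) {
            mean[i] /= workerParams.length;
        }
        return mean;
    }

    public static void main(String[] args) {
        // Two workers, each holding a copy of a two-parameter model after local training.
        double[][] workers = { {1.0, 2.0}, {3.0, 6.0} };
        // prints [2.0, 4.0]
        System.out.println(Arrays.toString(average(workers)));
    }
}
```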

Page 23

Thank you!

Please visit skymind.io/learn for more information

OR

Visit us at booth P33