H2O World - Python Pipelines - Spencer Aiello

To Production and Beyond

Spencer Aiello

The Problem

• Goal:
  o Move from prototype to production

• Road block:
  o Prototyping Environment Cages Your:
    • Feature preprocessing
    • Models
    • Ideas

• Even if your code is beautiful, you cannot drag-n-drop it into a new environment.

• Translation may be difficult; humans make mistakes.

A Solution

H2O gives you wings:

• Export Preprocessing

• Export Models

H2OAssembly

o Build Rich Feature Preprocessing Assembly Lines
  • Clean, reduce, and expand datasets by composing any of the 100s of primitives available in H2O
  • Build hygienic processing assembly lines that can be applied to new batches of data
  • Export your feature preprocessing steps as a plain old Java object (POJO) and apply them to streaming tuples
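
The assembly-line idea above can be sketched in plain Python. This is only an illustration of the concept, not H2O's API: the `AssemblyLine` class, the step helpers, and the column names (`purpose`, `amount`, `income`) are all hypothetical. It shows one ordered list of steps that reduces, cleans, and expands a dataset, and that is reused unchanged on a new batch and on a single streamed row.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
Step = Tuple[str, Callable[[Row], Row]]


class AssemblyLine:
    """Hypothetical stand-in for a preprocessing assembly (not H2O's API)."""

    def __init__(self, steps: List[Step]) -> None:
        self.steps = steps

    def apply_row(self, row: Row) -> Row:
        out = dict(row)  # copy so the caller's row is left untouched
        for _name, fn in self.steps:
            out = fn(out)
        return out

    def apply_batch(self, rows: List[Row]) -> List[Row]:
        return [self.apply_row(r) for r in rows]


def select(columns: List[str]) -> Callable[[Row], Row]:
    """Reduce: keep only the named columns."""
    def step(row: Row) -> Row:
        return {c: row[c] for c in columns}
    return step


def strip_text(column: str) -> Callable[[Row], Row]:
    """Clean: trim whitespace and lowercase a text column."""
    def step(row: Row) -> Row:
        new = dict(row)
        new[column] = str(row[column]).strip().lower()
        return new
    return step


def add_ratio(num: str, den: str, out: str) -> Callable[[Row], Row]:
    """Expand: derive a new column as the ratio of two existing ones."""
    def step(row: Row) -> Row:
        new = dict(row)
        new[out] = float(row[num]) / float(row[den])
        return new
    return step


# Column names below are made up for the illustration.
line = AssemblyLine([
    ("reduce", select(["purpose", "amount", "income"])),
    ("clean", strip_text("purpose")),
    ("expand", add_ratio("amount", "income", "amount_to_income")),
])

# The same steps serve a batch of rows...
batch = [{"id": 1, "purpose": "  Car ", "amount": 5000, "income": 50000}]
processed = line.apply_batch(batch)

# ...and a single tuple arriving from a stream.
streamed = line.apply_row(
    {"id": 2, "purpose": "HOME", "amount": 20000, "income": 80000}
)
```

Because the steps are defined once and applied identically everywhere, there is no hand translation between the prototype and the production path, which is the failure mode the earlier slides describe.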

H2OAssembly

H2OAssembly

Python

Java

Live Demo

• Lending Club Data: Predict Interest Rate
  o Four-part dataset of loan data
  o 500K rows, 52 columns
  o Preprocess 5 columns within a 16 step assembly
  o Build a simple GBM to predict interest rate
  o Export everything into a Storm topology

Live Demo

Storm Topology
