Caffe + H2O Cyprien Noel


Page 1

Caffe + H2O
Cyprien Noel

Page 2

Context - me
● Distributed systems - trading, air control, neural nets
● Multi-GPU Caffe
● Caffe over InfiniBand in Spark

Now at UCB
● Caffe: Python, help merge forks
● Project: how to generalize work above?
  ○ Help leverage devices, e.g. in H2O
  ○ New distributed Caffe, meta graph

Page 3

Context - industry

Page 4

Example

Page 5

Problem
● DPDK
● Libfabric
● Accelio
● UCX
● PMEM
● GPUDirect
● NVM Express
● HMM
● CAPI
● CCIX
● HSA
● OFED
● More every week...

Page 6

A single abstraction?
● Intra (device bus) vs inter-machine (networks)
  ○ E.g. CUDA copy and sockets
  ○ RDMA blurs local and remote devices
● Communication vs persistence
  ○ Sockets vs files is orthogonal to location
  ○ NVMe allows storage on remote disks
● Ephemeral vs durable
  ○ 3D XPoint & ReRAM are in-between RAM and SSD
  ○ Intel’s pmem exposes device directly as memory

Page 7

Proposal
● An in-memory file system
  ○ Location transparent mmap
  ○ Transactional
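As a rough illustration of the mmap side of this proposal, the sketch below uses Python's standard `mmap` module on an ordinary local temporary file. Two independent mappings of the same file see each other's writes, which is the property a location-transparent version would extend across machines. The helper name `shared_mapping_demo` is hypothetical, and nothing here is transactional or distributed.

```python
import mmap
import os
import tempfile

def shared_mapping_demo(size=4096):
    """Map one file twice and pass bytes from one mapping to the other."""
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, size)  # a file must be non-empty before it can be mapped
        writer = mmap.mmap(fd, size)
        reader = mmap.mmap(fd, size)
        try:
            writer[0:5] = b"hello"    # store through the first mapping
            seen = bytes(reader[0:5])  # load through the second mapping
        finally:
            writer.close()
            reader.close()
    finally:
        os.close(fd)
        os.remove(path)
    return seen
```

Calling `shared_mapping_demo()` returns the bytes written through the first mapping, with no explicit read or write call on the file in between.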

Page 8

Example - GPU kernel on data in storage

Today
● Client reads HDFS path
● HDFS client resolves worker
● Establishes connection
● Server accepts connection
● Authentication, authorization
● File system operation
● Network transfer
● CUDA transfer

BFS
data = mmap("/path")
gpu_kernel(data)
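The two-line version can be approximated on a single machine with standard Python. In this sketch a local temporary file stands in for "/path", and `scale_kernel` is a hypothetical CPU stand-in for `gpu_kernel`; a real system would hand the mapped buffer to a device instead.

```python
import mmap
import os
import struct
import tempfile

def scale_kernel(values, factor):
    """CPU stand-in for gpu_kernel: scale every element in place."""
    for i in range(len(values)):
        values[i] *= factor

def run_on_mapped_file(numbers, factor=2.0):
    """Store doubles in a file, map it, and run the kernel on the mapping."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, struct.pack(f"{len(numbers)}d", *numbers))
        with mmap.mmap(fd, 0) as mapped:       # length 0 maps the whole file
            raw = memoryview(mapped)
            data = raw.cast("d")                # view the bytes as doubles, no copy
            try:
                scale_kernel(data, factor)
                result = data.tolist()
            finally:
                data.release()                  # views must be released before
                raw.release()                   # the mapping can be closed
    finally:
        os.close(fd)
        os.remove(path)
    return result
```

The kernel never sees a file handle or a read call; it only receives a typed buffer, which is the point of the mmap-based interface.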

Page 9

Example - Compute graph in hardware

/app/jpgs/*
/layers/*
/vars/*
// Access DB
/redis
db = redis.open("./redis")

● Everything is a file
  ○ Using mmap, named pipes, Unix sockets
  ○ E.g. input jpgs, weights, activations, counters
● All state and coordination in fs
  ○ Minimal code, e.g. persistent GPU kernels
  ○ Location independent → dynamic placement
  ○ Arbitrary graph splitting, e.g. data & model parallel ML
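To make the "everything is a file" idea concrete, here is a small sketch that lays out a graph's state as a directory tree and keeps a counter in a memory-mapped file. The directory names follow the slide (`jpgs`, `layers`, `vars`), while the file names (`conv1.weights`, `step`), the 8-byte counter layout, and the helper functions are assumptions made for illustration. It runs against a local temporary directory, not a distributed file system.

```python
import mmap
import os
import struct
import tempfile

COUNTER = struct.Struct("<q")  # assumed layout: one little-endian int64

def create_app(root):
    """Create the directory tree and two example state files."""
    for name in ("jpgs", "layers", "vars"):
        os.makedirs(os.path.join(root, name), exist_ok=True)
    with open(os.path.join(root, "layers", "conv1.weights"), "wb") as f:
        f.write(struct.pack("<4f", 0.1, 0.2, 0.3, 0.4))
    with open(os.path.join(root, "vars", "step"), "wb") as f:
        f.write(COUNTER.pack(0))

def increment(path):
    """Bump a counter stored in a file through a memory mapping.

    This read-modify-write is not atomic; it only shows where the state lives.
    """
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), COUNTER.size) as m:
        value = COUNTER.unpack_from(m, 0)[0] + 1
        COUNTER.pack_into(m, 0, value)
        return value

def demo():
    with tempfile.TemporaryDirectory() as root:
        create_app(root)
        step = os.path.join(root, "vars", "step")
        increment(step)
        last = increment(step)
        files = sorted(
            os.path.relpath(os.path.join(d, n), root).replace(os.sep, "/")
            for d, _, names in os.walk(root) for n in names
        )
        return last, files
```

Because all state sits under one root, listing the directory is enough to inspect the application, and any process that can open the path can take part.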

Page 10

Example - Caffe & H2O

● H2O can write to Caffe input layers
  ○ Data placed directly on GPUs
  ○ RDMA atomic ops to count dependencies
● Can form pipelines
  ○ No need for pairwise integrations
  ○ Uniform monitoring, logging, etc.
  ○ Leverage best device for each step
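The dependency-counting idea can be sketched without any special hardware. Below, each producer writes its slice into a shared input buffer and then decrements a pending counter; the consumer step runs only once the counter reaches zero. Plain Python objects and a `threading.Lock` stand in for GPU memory and RDMA atomic operations, and the names `InputLayer`, `write_slice`, and `ready` are hypothetical, not part of Caffe or H2O.

```python
import threading

class InputLayer:
    """Shared input buffer guarded by a count of outstanding writers."""

    def __init__(self, size, producers):
        self.data = [0.0] * size
        self.pending = producers       # dependencies not yet satisfied
        self._lock = threading.Lock()  # stand-in for an atomic decrement

    def write_slice(self, offset, values):
        """Write one producer's slice, then mark its dependency as done."""
        self.data[offset:offset + len(values)] = values
        with self._lock:
            self.pending -= 1

    def ready(self):
        with self._lock:
            return self.pending == 0

def run_pipeline():
    layer = InputLayer(size=4, producers=2)
    threads = [
        threading.Thread(target=layer.write_slice, args=(0, [1.0, 2.0])),
        threading.Thread(target=layer.write_slice, args=(2, [3.0, 4.0])),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # The consumer (a stand-in for a forward pass) runs only when all inputs arrived.
    return sum(layer.data) if layer.ready() else None
```

Neither producer knows about the other or about the consumer; they only share the buffer and the counter, which is what removes the need for pairwise integrations.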

Page 11

Benefits
● Performance
  ○ mmap: lowest possible overhead
  ○ Leverages hardware, e.g. GPUDirect, RDMA, NVMe, atomic ops
● Complexity
  ○ Unified naming, permissioning, distributed state management
  ○ Hierarchical naming & location transparency → HA, placement
● Security
  ○ File permissions familiar & kernel level, other networking disabled
  ○ Mounting folder gives access to well-defined resources / capabilities

Page 12

Prototype
● Single master with metadata
● Distributed mmap (CPU)
● Embedded platform (X1)
● Ethernet, InfiniBand

Page 13

Summary
● Caffe progress - multi-GPU in Python, merge NV work
● Working on new programming model
  ○ “Unix philosophy for modern apps”
  ○ Helps leverage devices, e.g. in H2O
  ○ Simplifies app integration & pipelines
  ○ Distributed version of Caffe is the first use case