Performance Tuning in Production - USENIX...tensorflow-tracing Performance Tuning in Production May...

tensorflow-tracing Performance Tuning in Production

May 2019

Sayed Hadi Hashemi Paul Rausch Benjamin Rabe Kuan-Yen Chou Simeng Liu Volodymyr Kindratenko Roy H Campbell

Performance Changes

Change of ModelHyper parameters: e.g. Batch Size

Performance Changes

StorageNetworkMemory

Performance Changes

StorageNetworkMemory

DriverSoftware StackMisconfiguration

Performance Tuning Developer

Code Probe

Python

Images Credit: Google Brain

Code Probe DAG Probe

Python TensorBoard

Code Probe DAG Probe Whole DAG Runtime Execution

Python TensorBoard Chrome Tracing

Performance Tuning Admin

Application-Level

Pros Effective

Cons Code Modification Advance Planning Could be complicated (e.g. T2T)

Application-Level Resource-Level

Pros Effective

Cons Code Modification Advance Planning Could be complicated (e.g. T2T)

netstat nvidia-smi

NSight dstat

Pros Easy to Use

No Code Modification No Advance Planning

General Availability

Cons Too Coarse

Don’t distinguish different tasks The report time is too small

Data is hard to interpret without context

Challenges Admin

Detect Problems Find the Baseline Detect Anomaly

Root Cause Analysis Runtime Profiling/Tracing without modification/planning Data Exchange

MonkeyPatching Intercepts Framework Calls No need for code modification

Admin Portal Runs at the start of a job Collects Task-Base Profiling to Establish Baseline On Demand Tracing/No need for advanced planning

Tracing File Format Portable format CLI to explore traces

tensorflow-tracing MonkeyPatching

session.runTensorFlowTensorflow

Application

ApplicationMonkeyPatching

tensorflow-tracing

Disabled No interception Only Manage Selected Sessions

ApplicationMonkeyPatching

tensorflow-tracing

Per Application Intercept an application Manage all the sessions

System-wide Intercept the global library Manage all applications

tensorflow-tracing Admin Portal

Separate Different Tasks

tensorflow-tracing Collection

Profile Collect Automatically Low Overhead (≈0%) Establish the Baseline

Trace Collect On Demand High Overhead (≈3%) Root Cause Analysis

tensorflow-tracing Availability

Deploy Campus-wide Deep Learning Cluster In use at NCSA since Fall 2018

Apache-2 Downloaded +4k times from Pip

Quick Start pip install tensorflow-tracer

Source Code https://github.com/xldrx/tensorflow-tracer

Experiences Common Causes

Network Transfer Timing [Hashemi et al, SysML19] Congestion Wrong Network Interface

Storage NFS Exhaustion - Rogue Application Small Reads vs TFRecords

Platform Software Stack Drivers Containers

Device Placement CPU/GPU Locality

TicTac Result

Hashemi et al, SysML19

Tensor2Tensor

tennsor2tensor

Questions

Image Credit: The Neverhood

This work is supported by: National Science Foundation under Grant No. 1725729

Quick Start pip install tensorflow-tracer

Source Code https://github.com/xldrx/tensorflow-tracer

hashemi3@illinois.edu

Performance Tuning in Production - USENIX...tensorflow-tracing Performance Tuning in Production May...

Documents

Homepage Mariani Tuning - Mariani Tuning Engineering

CHANGING ROLES OF LIBRARIES By Margaret Simeng UNIMAS Margaret Simeng

Powered by TCPDF () - Bc Tuning - BC Tuning

FLEXIBLE AUTOMATION IS TUNING YOUR PRODUCTION

Hydrogen production by Tuning the Photonic Band Gap … · Hydrogen production by Tuning the Photonic Band Gap with the ... Riyadh and KAUST, Saudi Arabia ... (UK). d KAUST, Thuwal,

KEKB instrumentation and tuning for crab cavity tuning

MICROSTRIP POST PRODUCTION TUNING BAR …oaktrust.library.tamu.edu/bitstream/handle/1969.1/2337/etd-tamu... · iii ABSTRACT Microstrip Post Production Tuning Bar Error and Compact

Tuning electrospinning parameters for production of 3D-fiber-fleeces

Accessport - COBB Tuning · Standard Tuning vs SD (Speed Density) Tuning Standard Tuning uses MAF (Mass Air Flow) sensor for air flow calculations. COBB Tuning provides Off-The-Shelf

Cryogenic Target Production for Commissioning Tuning ... · PDF fileCryogenic Target Production for Commissioning Tuning Techniques and the First Ignition Tuning ... S. Bhandarkar,

Tuning Derby - Apache DB Projectdb.apache.org/derby/docs/10.8/tuning/tuningderby.pdf · Tuning Derby • Tuning databases and applications

Tipi di tuning: tuning dell’architettura fisica tuning dell’istanza tuning dell’architettura logica tuning applicativo Metodi di tuning: il tuning prevede

Cosine Tuning Minimizes Motor Errorstodorov/papers/Todorov... · 2010. 10. 14. · Here we argue that cosine tuning minimizes expected errors in force production, which makes it a

Jinfang Zhang , Simeng Ren, Hongchen Xia, Wen Jia, Chi ... · Jinfang Zhang, Simeng Ren, Hongchen Xia, Wen Jia, Chi Zhang International Joint Research Center for Photoresponsive

Oracle9iPerformance Tuning - · PDF fileOracle9iPerformance Tuning Student Guide . Volume 1 D11299GC10 Production 1.0 July 2001 D33513. ... 11 Tuning the Oracle Shared Server Objectives

Abstract Quadruple Magnets Tuning Tsyganov Aleksandr, Budker Institute of Nuclear Physics, Russia The tuning of quadruples within the production cycle

Tuning in Russia Tuning of Russia

Performance Tuning in Production - GitHub Pages › assets › slide-decks › optimizing-java.pdf · Performance Tuning in Production James Gough Sadiq Jaffer Kirk Pepperdine Richard

ADF Power Tuning · PDF file— ADF Power Tuning ADF Power Tuning — ADF Power Tuning ADF Power Tuning These disturbances include network unbalance, ﬂ icker, and especially

Overview Tuning Defined Tuning in the US The Tuning Process Benefits of Tuning Why Tuning is Different