Improving Reproducible Deep Learning Workflows with DeepDIVA

M. Alberti 1*, V. Pondenkandath 1*, L. Vögtlin 1, M. Würsch 1,2, R. Ingold 1, M. Liwicki 1,3

* Equal contribution

1 DIVA Group, University of Fribourg, Switzerland
2 IIT, FHNW University of Applied Sciences and Arts Northwestern Switzerland, Switzerland
3 EISLAB Machine Learning, Luleå University of Technology, Sweden

New Improving Reproducible Deep Learning Workflows with DeepDIVA · 2019. 7. 1. · Improving Reproducible Deep Learning Workflows with DeepDIVA M. Alberti 1*, V. Pondenkandath1*,

Reproducibility Crisis: Trust or Verify?

Joelle Pineau, “Reproducible, Reusable, and Robust Reinforcement Learning”, invited talk at NeurIPS 2018, Montreal, Canada

Why Is This a Problem?

No possibility to verify
No possibility to extend
Lots of overhead created
Leads to no trust in scientific results

How To Make Steps Forward?

Ensure reproducibility
    Of your own experiments
    Of other people’s experiments
Promote open-source code
    Make it easy to have “good enough” code
    Enable code trustworthiness

How We Contribute: DeepDIVA

Open-source
Python framework
Built on top of PyTorch
Makes your life easier for:
    Reproducing your own and other people’s experiments
Provides boilerplate code for:
    Common deep learning scenarios
    Handling time-consuming everyday problems
Documentation and tutorial available

Reproducing Your Own Experiments

Short-term, or work in progress

Long-term, or finished work

Short-term Reproducibility Dangers

Kilometres of poor or incomplete log files
Stochasticity in the process

How DeepDIVA Ensures Short-term Reproducibility

Meaningful logging
    Saving all run parameters and command-line arguments
    Providing concise coloured logs
Deterministic runs
    Seeding the pseudo-random number generators of Python, NumPy and PyTorch
    Disabling cuDNN (NVIDIA CUDA Deep Neural Network library) when necessary
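The two ideas on this slide, deterministic runs and saved run parameters, can be sketched in a few lines of Python. This is a minimal illustration and not DeepDIVA's actual code: the function names `seed_everything` and `dump_run_config` and the JSON layout are assumptions made for this example. NumPy and PyTorch are seeded only if they are installed.

```python
import json
import random
import sys


def seed_everything(seed: int, disable_cudnn: bool = False) -> None:
    """Seed every pseudo-random number generator that is available.

    Hypothetical helper for illustration; not part of DeepDIVA's API.
    """
    random.seed(seed)  # Python's built-in generator
    try:
        import numpy as np
        np.random.seed(seed)  # NumPy's global legacy generator
    except ImportError:
        pass  # NumPy not installed: nothing to seed
    try:
        import torch
        torch.manual_seed(seed)  # PyTorch's default generator
        torch.cuda.manual_seed_all(seed)  # all GPUs; silently ignored without CUDA
        if disable_cudnn:
            # cuDNN may select non-deterministic kernels, so switch it off on request
            torch.backends.cudnn.enabled = False
    except ImportError:
        pass  # PyTorch not installed: nothing to seed


def dump_run_config(parameters: dict, path: str) -> None:
    """Store the run parameters and the raw command line in a JSON file."""
    record = {"argv": sys.argv, "parameters": parameters}
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(record, handle, indent=2, sort_keys=True)
```

Calling `seed_everything` with the same seed before two runs makes the seeded generators produce the same sequence in both. Disabling cuDNN usually costs speed, which is presumably why the slide says "when necessary".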

Long-term Reproducibility Dangers

Poor (or non-existent!) use of version control
Bad programming habits that die hard
Silent data modifications

How DeepDIVA Ensures Long-term Reproducibility

Git status
    Linking every run to a specific commit in Git
    Allowing this feature to be disabled for development purposes
Copy code
    Copying the entire running code into the output folder
Data integrity management
    Footprint of the data in a JSON file using SHA-1 hashes
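A data footprint of the kind mentioned above could look roughly like the following sketch. It assumes the footprint is a JSON mapping from relative file path to SHA-1 digest; DeepDIVA's real file layout may differ, and all function names here are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def file_sha1(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so that large images do not need to fit in memory."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dataset_footprint(root: str) -> dict:
    """Map every file below root (by relative POSIX path) to its SHA-1 digest."""
    base = Path(root)
    return {
        path.relative_to(base).as_posix(): file_sha1(path)
        for path in sorted(base.rglob("*"))
        if path.is_file()
    }


def save_footprint(root: str, out_file: str) -> None:
    """Write the footprint as JSON; keep out_file outside root so it is not hashed."""
    with open(out_file, "w", encoding="utf-8") as handle:
        json.dump(dataset_footprint(root), handle, indent=2, sort_keys=True)


def changed_files(root: str, footprint_file: str) -> list:
    """Return the files that were added, removed or modified since the footprint."""
    with open(footprint_file, encoding="utf-8") as handle:
        stored = json.load(handle)
    current = dataset_footprint(root)
    return sorted(
        name
        for name in stored.keys() | current.keys()
        if stored.get(name) != current.get(name)
    )
```

An empty list from `changed_files` means the data on disk still matches the stored footprint. Any other result points at a silent modification.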

Reproducing Other People’s Experiments

Given a paper, try to replicate the results and observations

Reproducing Other People’s Experiments

In order to reproduce an experiment one needs:
    Git repository URL
    Git commit identifier (full SHA)
    List of command-line arguments used
    The data
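The four ingredients listed above fit naturally into one small record that can be published next to a paper. The sketch below is only an illustration under stated assumptions: `ExperimentRecord`, `current_commit`, the directory name `repo` and the `entry_point` argument are hypothetical and not taken from DeepDIVA, and the commit check assumes a SHA-1 based Git repository (40 hexadecimal characters).

```python
import json
import re
import shlex
import subprocess
from dataclasses import asdict, dataclass


def current_commit(repo_dir: str = ".") -> str:
    """Return the full SHA of HEAD, refusing to continue on uncommitted changes."""
    dirty = subprocess.run(
        ["git", "status", "--porcelain"], cwd=repo_dir,
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    if dirty:
        raise RuntimeError("uncommitted changes: the run would not map to a commit")
    return subprocess.run(
        ["git", "rev-parse", "HEAD"], cwd=repo_dir,
        capture_output=True, text=True, check=True,
    ).stdout.strip()


@dataclass
class ExperimentRecord:
    repository_url: str
    commit: str
    arguments: list
    data_location: str

    def __post_init__(self):
        # Abbreviated SHAs can become ambiguous over time, so insist on the full one
        if not re.fullmatch(r"[0-9a-f]{40}", self.commit):
            raise ValueError("expected a full 40-character commit identifier")

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)

    def replay_commands(self, entry_point: str) -> list:
        """Shell commands that check out the exact code and rerun the experiment."""
        return [
            "git clone " + shlex.quote(self.repository_url) + " repo",
            "git -C repo checkout " + shlex.quote(self.commit),
            "cd repo && python " + shlex.quote(entry_point) + " "
            + shlex.join(self.arguments),
        ]
```

A record would typically be built with `commit=current_commit()` at the start of a run and written out with `to_json()`. The data itself still has to be shared separately, for instance together with a footprint file.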

Productivity Out-of-the-box

Making your life easier: do not reinvent the wheel!

“One click away” Deep Learning Scenarios

Prepare Your Data

“When the data is ready, the task is solved”
Download a dataset with a click
    Natural images, medical images, historical documents, …
Split your dataset
    Train, validation and test splits
Analyse the data
    Mean/std and class distributions
Ensure data integrity
    Compare the footprints
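The splitting and analysis steps above can be illustrated with the standard library alone. This is a simplified sketch, not DeepDIVA's implementation: the function names and default split fractions are assumptions, and a real image pipeline would compute mean and standard deviation per colour channel rather than over a flat list of values.

```python
import random
import statistics
from collections import Counter


def split_dataset(samples, train=0.7, val=0.15, seed=0):
    """Shuffle with a fixed seed, then cut into train, validation and test lists.

    Split sizes are rounded down; whatever remains goes to the test split.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # private generator: reproducible split
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )


def class_distribution(labels):
    """Return the relative frequency of every class label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def mean_std(values):
    """Return the mean and the population standard deviation of the values."""
    return statistics.fmean(values), statistics.pstdev(values)
```

Because the shuffle uses its own seeded generator, repeating the call with the same seed returns the same three splits, which keeps later runs comparable.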

Real-time Visualizations

TensorBoard (from TensorFlow)
Confusion Matrix
Feature Visualization
Weight Histograms
Performance Evaluation

Automatic Hyper-Parameter Optimization

Let machine learning find the best values
No expensive grid or random search

Be A Part Of It

Getting Started With DeepDIVA

How To Use It

No setup time
    From source on Ubuntu (or other flavours of Linux)
    Docker image coming soon
Documentation
    Online and in the code
Tutorials
    Learn new features efficiently
Fork it
    Extensive and modular for easy modifications

Make Your Experiment Reproducible

bit.ly/DeepDIVA
