28
Deep Learning with Myia Olivier Breuleux Research Developer, MILA Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain)

Deep Learning with Myia - Nvidia · Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain) The Needs What we need from a language for deep

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Deep Learning with Myia - Nvidia · Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain) The Needs What we need from a language for deep

Deep Learning with Myia

Olivier Breuleux Research Developer, MILA

Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain)

Pascal Lamblin (Google Brain)

Page 2: Deep Learning with Myia - Nvidia · Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain) The Needs What we need from a language for deep

The Needs What we need from a language for deep learning

Autodiff What it is, how it works, what the challenges are

Representation The best representation for our needs

Type system Flexible inference for performance and robustness

2

Page 3: Deep Learning with Myia - Nvidia · Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain) The Needs What we need from a language for deep

The Needs What we need from a language for deep learning

Autodiff What it is, how it works, what the challenges are

Representation The best representation for our needs

Type system Flexible inference for performance and robustness

3

Page 4: Deep Learning with Myia - Nvidia · Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain) The Needs What we need from a language for deep

Deep Learning

4

DL algorithms are increasingly complex

Feedforward (trivial)

Recurrent (loops)

Recursive (recursion)

…?

Page 5: Deep Learning with Myia - Nvidia · Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain) The Needs What we need from a language for deep

Deep Learning

5

DL algorithms are increasingly complex

• More and more language features needed

• Most existing frameworks are limited

• High level abstraction increases productivity • Focus on the algorithm over implementation details

• Effortless abstractions encourage their use

Page 6: Deep Learning with Myia - Nvidia · Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain) The Needs What we need from a language for deep

Needs

Debuggable: Clear errors, inspectable, instrumentable.

Goal: a language adapted to the needs of machine learning, past and future

6

General purpose: Capable of expressing complex control flow.

Differentiable: Should be able to take nth-order derivative of any program.

Portable: Serializable, support multiple hardware.

Fast: Must leverage parallelism and GPU.

Page 7: Deep Learning with Myia - Nvidia · Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain) The Needs What we need from a language for deep

Needs

Debuggable: Type+shape inference, step debugger.

Myia: a language adapted to the needs of machine learning, past and future

7

General purpose: Conditionals, loops, recursion, data structures.

Differentiable: Transformation at the intermediate representation level.

Fast & portable: Choose from various backends such as NNVM/Relay.

Page 8: Deep Learning with Myia - Nvidia · Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain) The Needs What we need from a language for deep

The Needs What we need from a language for deep learning

Autodiff What it is, how it works, what the challenges are

Representation The best representation for our needs

Type system Flexible inference for performance and robustness

8

Page 9: Deep Learning with Myia - Nvidia · Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain) The Needs What we need from a language for deep

Differentiability

How to train a model• Initialize a model’s parameters • Compute some quantity using the parameters • Compute a cost or “loss function” • Update parameters using the gradient of the loss • Rinse and repeat

Gradients• Can be computed exactly and automatically • But: no mainstream language supports this natively • Computational strategies: forward or reverse • Implementation strategies: operator overloading or source transform

9

θ ← θ − λ∂L( f(x; θ), y)

∂θ

θf(x; θ)L( f(x; θ), y)

Page 10: Deep Learning with Myia - Nvidia · Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain) The Needs What we need from a language for deep

Forward vs Reverse

y1 = f(x)

y2 = g(y1)

y3 = h(y2)<latexit sha1_base64="FR8JrMfD1GtSrL9l9jua0DwEKV4=">AAACFXicbZBNS8MwGMfT+TbrW9Wjl+BQtoOjnYJ6EAZePE6xbrCOkmbpFpa+kKRiKfsUXvwqXjyoeBW8+W1Mtx508w+Bf37P85A8fy9mVEjT/NZKC4tLyyvlVX1tfWNzy9jeuRNRwjGxccQi3vGQIIyGxJZUMtKJOUGBx0jbG13m9fY94YJG4a1MY9IL0CCkPsVIKuQaR6lrwcML6FcfatBx9NRt5NdBVfECHOdgqECj5hoVs25OBOeNVZgKKNRyjS+nH+EkIKHEDAnRtcxY9jLEJcWMjHUnESRGeIQGpKtsiAIietlkrTE8UKQP/YirE0o4ob8nMhQIkQae6gyQHIrZWg7/q3UT6Z/1MhrGiSQhnj7kJwzKCOYZwT7lBEuWKoMwp+qvEA8RR1iqJHUVgjW78ryxG/Xzunl9UmneFGmUwR7YB1VggVPQBFegBWyAwSN4Bq/gTXvSXrR37WPaWtKKmV3wR9rnD9xrmc4=</latexit><latexit sha1_base64="FR8JrMfD1GtSrL9l9jua0DwEKV4=">AAACFXicbZBNS8MwGMfT+TbrW9Wjl+BQtoOjnYJ6EAZePE6xbrCOkmbpFpa+kKRiKfsUXvwqXjyoeBW8+W1Mtx508w+Bf37P85A8fy9mVEjT/NZKC4tLyyvlVX1tfWNzy9jeuRNRwjGxccQi3vGQIIyGxJZUMtKJOUGBx0jbG13m9fY94YJG4a1MY9IL0CCkPsVIKuQaR6lrwcML6FcfatBx9NRt5NdBVfECHOdgqECj5hoVs25OBOeNVZgKKNRyjS+nH+EkIKHEDAnRtcxY9jLEJcWMjHUnESRGeIQGpKtsiAIietlkrTE8UKQP/YirE0o4ob8nMhQIkQae6gyQHIrZWg7/q3UT6Z/1MhrGiSQhnj7kJwzKCOYZwT7lBEuWKoMwp+qvEA8RR1iqJHUVgjW78ryxG/Xzunl9UmneFGmUwR7YB1VggVPQBFegBWyAwSN4Bq/gTXvSXrR37WPaWtKKmV3wR9rnD9xrmc4=</latexit><latexit sha1_base64="FR8JrMfD1GtSrL9l9jua0DwEKV4=">AAACFXicbZBNS8MwGMfT+TbrW9Wjl+BQtoOjnYJ6EAZePE6xbrCOkmbpFpa+kKRiKfsUXvwqXjyoeBW8+W1Mtx508w+Bf37P85A8fy9mVEjT/NZKC4tLyyvlVX1tfWNzy9jeuRNRwjGxccQi3vGQIIyGxJZUMtKJOUGBx0jbG13m9fY94YJG4a1MY9IL0CCkPsVIKuQaR6lrwcML6FcfatBx9NRt5NdBVfECHOdgqECj5hoVs25OBOeNVZgKKNRyjS+nH+EkIKHEDAnRtcxY9jLEJcWMjHUnESRGeIQGpKtsiAIietlkrTE8UKQP/YirE0o4ob8nMhQIkQae6gyQHIrZWg7/q3UT6Z/1MhrGiSQhnj7kJwzKCOYZwT7lBEuWKoMwp+qvEA8RR1iqJHUVgjW78ryxG/Xzunl9UmneFGmUwR7YB1VggVPQBFegBWyAwSN4Bq/gTXvSXrR37WPaWtKKmV3wR9rnD9xrmc4=</latexit>

f : Rm ! Rp

g : Rp ! Rq

h : Rq ! Rn<latexit sha1_base64="dO5pQ9/rPJkhK3xmvzvEsxqnp+Q=">AAACbnicdVHNT8IwHO3mF84vxIMHNDYSjScyjIkfJxIvHpE4IWEL6UrHGtpttJ2GLFz9A735P3jxP7ADDgjyS5q8vPd+6eurnzAqlW1/Geba+sbmVmHb2tnd2z8oHpZeZZwKTBwcs1i0fSQJoxFxFFWMtBNBEPcZafmDx1xvvREhaRy9qFFCPI76EQ0oRkpT3eJHAC8foMuRCn0/a467HLqC9kOFhIjf54UEuq7VX3Anq9zD3B0uuIer3DpIxa7ak4HLoDYDFTCbRrf46fZinHISKcyQlJ2anSgvQ0JRzMjYclNJEoQHqE86GkaIE+llk77G8EIzPRjEQp9IwQk7v5EhLuWI+9qZZ5SLWk7+p3VSFdx5GY2SVJEITy8KUgZVDPPyYY8KghUbaYCwoDorxCESCCv9RZYuobb45GXgXFfvq/bzTaXenLVRAGVwDq5ADdyCOngCDeAADL6NklE2Towf89g8Nc+mVtOY7RyBP2Ne/QIBILuo</latexit><latexit sha1_base64="dO5pQ9/rPJkhK3xmvzvEsxqnp+Q=">AAACbnicdVHNT8IwHO3mF84vxIMHNDYSjScyjIkfJxIvHpE4IWEL6UrHGtpttJ2GLFz9A735P3jxP7ADDgjyS5q8vPd+6eurnzAqlW1/Geba+sbmVmHb2tnd2z8oHpZeZZwKTBwcs1i0fSQJoxFxFFWMtBNBEPcZafmDx1xvvREhaRy9qFFCPI76EQ0oRkpT3eJHAC8foMuRCn0/a467HLqC9kOFhIjf54UEuq7VX3Anq9zD3B0uuIer3DpIxa7ak4HLoDYDFTCbRrf46fZinHISKcyQlJ2anSgvQ0JRzMjYclNJEoQHqE86GkaIE+llk77G8EIzPRjEQp9IwQk7v5EhLuWI+9qZZ5SLWk7+p3VSFdx5GY2SVJEITy8KUgZVDPPyYY8KghUbaYCwoDorxCESCCv9RZYuobb45GXgXFfvq/bzTaXenLVRAGVwDq5ADdyCOngCDeAADL6NklE2Towf89g8Nc+mVtOY7RyBP2Ne/QIBILuo</latexit><latexit sha1_base64="dO5pQ9/rPJkhK3xmvzvEsxqnp+Q=">AAACbnicdVHNT8IwHO3mF84vxIMHNDYSjScyjIkfJxIvHpE4IWEL6UrHGtpttJ2GLFz9A735P3jxP7ADDgjyS5q8vPd+6eurnzAqlW1/Geba+sbmVmHb2tnd2z8oHpZeZZwKTBwcs1i0fSQJoxFxFFWMtBNBEPcZafmDx1xvvREhaRy9qFFCPI76EQ0oRkpT3eJHAC8foMuRCn0/a467HLqC9kOFhIjf54UEuq7VX3Anq9zD3B0uuIer3DpIxa7ak4HLoDYDFTCbRrf46fZinHISKcyQlJ2anSgvQ0JRzMjYclNJEoQHqE86GkaIE+llk77G8EIzPRjEQp9IwQk7v5EhLuWI+9qZZ5SLWk7+p3VSFdx5GY2SVJEITy8KUgZVDPPyYY8KghUbaYCwoDorxCESCCv9RZYuobb45GXgXFfvq/bzTaXenLVRAGVwDq5ADdyCOngCDeAADL6NklE2Towf89g8Nc+mVtOY7RyBP2Ne/QIBILuo</latexit><latexit sha1_base64="C39OhB+IczRcjLNINXH29e9lt8M=">AAAB2HicbZDNSgMxFIXv1L86Vq1rN8EiuCpTN+pOcOOygmML7VAymTttaCYzJHeEMvQFXLhRfDB3vo3pz0KtBwIf5yTk3hMXSloKgi+vtrW9s7tX3/cPGv7h0XGz8WTz0ggMRa5y04+5RSU1hiRJYb8wyLNYYS+e3i3y3jMaK3P9SLMCo4yPtUyl4OSs7qjZCtrBUmwTOmtowVqj5ucwyUWZoSahuLWDTlBQVHFDUiic+8PSYsHFlI9x4FDzDG1ULcecs3PnJCzNjTua2NL9+aLimbWzLHY3M04T+zdbmP9lg5LS66iSuigJtVh9lJaKUc4WO7NEGhSkZg64MNLNysSEGy7INeO7Djp/N96E8LJ90w4eAqjDKZzBBXTgCm7hHroQgoAEXuDNm3iv3vuqqpq37uwEfsn7+Aap5IoM</latexit><latexit sha1_base64="KTxKCtjSTZwxigvlCFitzU1Zg10=">AAACY3icdVG9TsMwGHTCXwkFSheGgrCoQExVwsLPhMTCWBChlZqoclwntXCc1HZAVdSVB2TjHVh4A5y2Q0npJ1k63Z31nc9ByqhUtv1lmGvrG5tblW1rp7q7t187qL7IJBOYuDhhiegGSBJGOXEVVYx0U0FQHDDSCV7vC73zRoSkCX9W45T4MYo4DSlGSlP92kcIz2+hFyM1DIL8adKPoSdoNFRIiOR9UUih51lRyZ2uco8K97DkHq1y6yBNu2VPBy4DZw6aYD7tfu3TGyQ4iwlXmCEpe46dKj9HQlHMyMTyMklShF9RRHoachQT6efTvibwTDMDGCZCH67glF28kaNYynEcaGeRUZa1gvxP62UqvPZzytNMEY5ni8KMQZXAonw4oIJgxcYaICyozgrxEAmElf4iS5fglJ+8DNzL1k3LfrRBBTTAKbgADrgCd+ABtIELMPg26kbDODJ+zEPzeNaWacxrq4M/Y578At00ur4=</latexit><latexit sha1_base64="KTxKCtjSTZwxigvlCFitzU1Zg10=">AAACY3icdVG9TsMwGHTCXwkFSheGgrCoQExVwsLPhMTCWBChlZqoclwntXCc1HZAVdSVB2TjHVh4A5y2Q0npJ1k63Z31nc9ByqhUtv1lmGvrG5tblW1rp7q7t187qL7IJBOYuDhhiegGSBJGOXEVVYx0U0FQHDDSCV7vC73zRoSkCX9W45T4MYo4DSlGSlP92kcIz2+hFyM1DIL8adKPoSdoNFRIiOR9UUih51lRyZ2uco8K97DkHq1y6yBNu2VPBy4DZw6aYD7tfu3TGyQ4iwlXmCEpe46dKj9HQlHMyMTyMklShF9RRHoachQT6efTvibwTDMDGCZCH67glF28kaNYynEcaGeRUZa1gvxP62UqvPZzytNMEY5ni8KMQZXAonw4oIJgxcYaICyozgrxEAmElf4iS5fglJ+8DNzL1k3LfrRBBTTAKbgADrgCd+ABtIELMPg26kbDODJ+zEPzeNaWacxrq4M/Y578At00ur4=</latexit><latexit sha1_base64="EkpZKbT9mWbEBvTrFrQGFQwFolA=">AAACbnicdVE7T8MwGHTCq4RXKQNDQVhUoE5VwsJjqsTCWCpKKzVR5bhOatV51HZAVdSVH8jGf2DhH+C0GUpKP8nS6e4++Xx2Y0aFNM0vTd/Y3NreKe0ae/sHh0fl48qriBKOSQdHLOI9FwnCaEg6kkpGejEnKHAZ6brjx0zvvhEuaBS+yGlMnAD5IfUoRlJRg/KHB68foB0gOXLdtD0bBNDm1B9JxHn0vizE0LYNv+CO17knmXtUcE/WuVWQmtkw5wNXgZWDGsinNSh/2sMIJwEJJWZIiL5lxtJJEZcUMzIz7ESQGOEx8klfwRAFRDjpvK8ZvFLMEHoRVyeUcM4ub6QoEGIauMqZZRRFLSP/0/qJ9O6clIZxIkmIFxd5CYMygln5cEg5wZJNFUCYU5UV4hHiCEv1RYYqwSo+eRV0bhr3DfPZrDXbeRslUAWXoA4scAua4Am0QAdg8K1VtKp2pv3op/q5frGw6lq+cwL+jF7/Bf/Ru6Q=</latexit><latexit sha1_base64="dO5pQ9/rPJkhK3xmvzvEsxqnp+Q=">AAACbnicdVHNT8IwHO3mF84vxIMHNDYSjScyjIkfJxIvHpE4IWEL6UrHGtpttJ2GLFz9A735P3jxP7ADDgjyS5q8vPd+6eurnzAqlW1/Geba+sbmVmHb2tnd2z8oHpZeZZwKTBwcs1i0fSQJoxFxFFWMtBNBEPcZafmDx1xvvREhaRy9qFFCPI76EQ0oRkpT3eJHAC8foMuRCn0/a467HLqC9kOFhIjf54UEuq7VX3Anq9zD3B0uuIer3DpIxa7ak4HLoDYDFTCbRrf46fZinHISKcyQlJ2anSgvQ0JRzMjYclNJEoQHqE86GkaIE+llk77G8EIzPRjEQp9IwQk7v5EhLuWI+9qZZ5SLWk7+p3VSFdx5GY2SVJEITy8KUgZVDPPyYY8KghUbaYCwoDorxCESCCv9RZYuobb45GXgXFfvq/bzTaXenLVRAGVwDq5ADdyCOngCDeAADL6NklE2Towf89g8Nc+mVtOY7RyBP2Ne/QIBILuo</latexit><latexit sha1_base64="dO5pQ9/rPJkhK3xmvzvEsxqnp+Q=">AAACbnicdVHNT8IwHO3mF84vxIMHNDYSjScyjIkfJxIvHpE4IWEL6UrHGtpttJ2GLFz9A735P3jxP7ADDgjyS5q8vPd+6eurnzAqlW1/Geba+sbmVmHb2tnd2z8oHpZeZZwKTBwcs1i0fSQJoxFxFFWMtBNBEPcZafmDx1xvvREhaRy9qFFCPI76EQ0oRkpT3eJHAC8foMuRCn0/a467HLqC9kOFhIjf54UEuq7VX3Anq9zD3B0uuIer3DpIxa7ak4HLoDYDFTCbRrf46fZinHISKcyQlJ2anSgvQ0JRzMjYclNJEoQHqE86GkaIE+llk77G8EIzPRjEQp9IwQk7v5EhLuWI+9qZZ5SLWk7+p3VSFdx5GY2SVJEITy8KUgZVDPPyYY8KghUbaYCwoDorxCESCCv9RZYuobb45GXgXFfvq/bzTaXenLVRAGVwDq5ADdyCOngCDeAADL6NklE2Towf89g8Nc+mVtOY7RyBP2Ne/QIBILuo</latexit><latexit sha1_base64="dO5pQ9/rPJkhK3xmvzvEsxqnp+Q=">AAACbnicdVHNT8IwHO3mF84vxIMHNDYSjScyjIkfJxIvHpE4IWEL6UrHGtpttJ2GLFz9A735P3jxP7ADDgjyS5q8vPd+6eurnzAqlW1/Geba+sbmVmHb2tnd2z8oHpZeZZwKTBwcs1i0fSQJoxFxFFWMtBNBEPcZafmDx1xvvREhaRy9qFFCPI76EQ0oRkpT3eJHAC8foMuRCn0/a467HLqC9kOFhIjf54UEuq7VX3Anq9zD3B0uuIer3DpIxa7ak4HLoDYDFTCbRrf46fZinHISKcyQlJ2anSgvQ0JRzMjYclNJEoQHqE86GkaIE+llk77G8EIzPRjEQp9IwQk7v5EhLuWI+9qZZ5SLWk7+p3VSFdx5GY2SVJEITy8KUgZVDPPyYY8KghUbaYCwoDorxCESCCv9RZYuobb45GXgXFfvq/bzTaXenLVRAGVwDq5ADdyCOngCDeAADL6NklE2Towf89g8Nc+mVtOY7RyBP2Ne/QIBILuo</latexit><latexit sha1_base64="dO5pQ9/rPJkhK3xmvzvEsxqnp+Q=">AAACbnicdVHNT8IwHO3mF84vxIMHNDYSjScyjIkfJxIvHpE4IWEL6UrHGtpttJ2GLFz9A735P3jxP7ADDgjyS5q8vPd+6eurnzAqlW1/Geba+sbmVmHb2tnd2z8oHpZeZZwKTBwcs1i0fSQJoxFxFFWMtBNBEPcZafmDx1xvvREhaRy9qFFCPI76EQ0oRkpT3eJHAC8foMuRCn0/a467HLqC9kOFhIjf54UEuq7VX3Anq9zD3B0uuIer3DpIxa7ak4HLoDYDFTCbRrf46fZinHISKcyQlJ2anSgvQ0JRzMjYclNJEoQHqE86GkaIE+llk77G8EIzPRjEQp9IwQk7v5EhLuWI+9qZZ5SLWk7+p3VSFdx5GY2SVJEITy8KUgZVDPPyYY8KghUbaYCwoDorxCESCCv9RZYuobb45GXgXFfvq/bzTaXenLVRAGVwDq5ADdyCOngCDeAADL6NklE2Towf89g8Nc+mVtOY7RyBP2Ne/QIBILuo</latexit><latexit sha1_base64="dO5pQ9/rPJkhK3xmvzvEsxqnp+Q=">AAACbnicdVHNT8IwHO3mF84vxIMHNDYSjScyjIkfJxIvHpE4IWEL6UrHGtpttJ2GLFz9A735P3jxP7ADDgjyS5q8vPd+6eurnzAqlW1/Geba+sbmVmHb2tnd2z8oHpZeZZwKTBwcs1i0fSQJoxFxFFWMtBNBEPcZafmDx1xvvREhaRy9qFFCPI76EQ0oRkpT3eJHAC8foMuRCn0/a467HLqC9kOFhIjf54UEuq7VX3Anq9zD3B0uuIer3DpIxa7ak4HLoDYDFTCbRrf46fZinHISKcyQlJ2anSgvQ0JRzMjYclNJEoQHqE86GkaIE+llk77G8EIzPRjEQp9IwQk7v5EhLuWI+9qZZ5SLWk7+p3VSFdx5GY2SVJEITy8KUgZVDPPyYY8KghUbaYCwoDorxCESCCv9RZYuobb45GXgXFfvq/bzTaXenLVRAGVwDq5ADdyCOngCDeAADL6NklE2Towf89g8Nc+mVtOY7RyBP2Ne/QIBILuo</latexit>

Jh�g�f (x)| {z }n⇥m

= Jh(y2)| {z }n⇥q

Jg(y1)| {z }q⇥p

Jf (x)| {z }p⇥m

<latexit sha1_base64="bA4x/LBIAvRyVfls/ao3G8bvaiM=">AAADSXicpVLNa9RAFJ9k/ajr17YevQwuwvayJKWgHoSCF+mpimsLmyVOJi+boZNJOvMiXYb8fV566s0/wosHFU9O0oj2Qyj4YHg/3sfv/d5jkkoKg0Hw2fMHN27eur12Z3j33v0HD0frG+9NWWsOM17KUh8kzIAUCmYoUMJBpYEViYT95PBVm9//CNqIUr3DVQWLgi2VyARn6ELxuvchqlUKOtGMg40KhnmS2d3Y5jTiQnO67H3WTI43mya2ikYoCjC0aOjLYVTlTGFZ2CgpZWpWhXM2QjjGTpvVkDaTphlePSWfrOKt86xHzX9yLh1n2HEe/easrse5+U/OrN+9+rP7tRnj0TiYBp3RyyDswZj0thePTqO05HUBCrlkxszDoMKFZRoFl9BKNFAxfsiWMHdQMSdoYbuhDX3qIinNSu2eQtpF/+6wrDCtVlfZbmcu5trgVbl5jdnzhRWqqhEUPxuU1ZJiSdt/RVOhgaNcOcC4Fk4r5Tlzd0T3+9ojhBdXvgxmW9MX0+DN9njnbX+NNfKYPCETEpJnZIe8JntkRrj3yfviffO++yf+V/+H//Os1Pf6nkfknA0GvwDrPhlM</latexit><latexit sha1_base64="bA4x/LBIAvRyVfls/ao3G8bvaiM=">AAADSXicpVLNa9RAFJ9k/ajr17YevQwuwvayJKWgHoSCF+mpimsLmyVOJi+boZNJOvMiXYb8fV566s0/wosHFU9O0oj2Qyj4YHg/3sfv/d5jkkoKg0Hw2fMHN27eur12Z3j33v0HD0frG+9NWWsOM17KUh8kzIAUCmYoUMJBpYEViYT95PBVm9//CNqIUr3DVQWLgi2VyARn6ELxuvchqlUKOtGMg40KhnmS2d3Y5jTiQnO67H3WTI43mya2ikYoCjC0aOjLYVTlTGFZ2CgpZWpWhXM2QjjGTpvVkDaTphlePSWfrOKt86xHzX9yLh1n2HEe/easrse5+U/OrN+9+rP7tRnj0TiYBp3RyyDswZj0thePTqO05HUBCrlkxszDoMKFZRoFl9BKNFAxfsiWMHdQMSdoYbuhDX3qIinNSu2eQtpF/+6wrDCtVlfZbmcu5trgVbl5jdnzhRWqqhEUPxuU1ZJiSdt/RVOhgaNcOcC4Fk4r5Tlzd0T3+9ojhBdXvgxmW9MX0+DN9njnbX+NNfKYPCETEpJnZIe8JntkRrj3yfviffO++yf+V/+H//Os1Pf6nkfknA0GvwDrPhlM</latexit><latexit sha1_base64="bA4x/LBIAvRyVfls/ao3G8bvaiM=">AAADSXicpVLNa9RAFJ9k/ajr17YevQwuwvayJKWgHoSCF+mpimsLmyVOJi+boZNJOvMiXYb8fV566s0/wosHFU9O0oj2Qyj4YHg/3sfv/d5jkkoKg0Hw2fMHN27eur12Z3j33v0HD0frG+9NWWsOM17KUh8kzIAUCmYoUMJBpYEViYT95PBVm9//CNqIUr3DVQWLgi2VyARn6ELxuvchqlUKOtGMg40KhnmS2d3Y5jTiQnO67H3WTI43mya2ikYoCjC0aOjLYVTlTGFZ2CgpZWpWhXM2QjjGTpvVkDaTphlePSWfrOKt86xHzX9yLh1n2HEe/easrse5+U/OrN+9+rP7tRnj0TiYBp3RyyDswZj0thePTqO05HUBCrlkxszDoMKFZRoFl9BKNFAxfsiWMHdQMSdoYbuhDX3qIinNSu2eQtpF/+6wrDCtVlfZbmcu5trgVbl5jdnzhRWqqhEUPxuU1ZJiSdt/RVOhgaNcOcC4Fk4r5Tlzd0T3+9ojhBdXvgxmW9MX0+DN9njnbX+NNfKYPCETEpJnZIe8JntkRrj3yfviffO++yf+V/+H//Os1Pf6nkfknA0GvwDrPhlM</latexit>

The derivative of a straight composition of functions is the multiplication of their Jacobians

In what order?

10

Page 11: Deep Learning with Myia - Nvidia · Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain) The Needs What we need from a language for deep

Forward vs Reverse

( Jh(y2)| {z }n⇥q

Jg(y1)| {z }q⇥p| {z }

n⇥p

) Jf (x)| {z }p⇥m

<latexit sha1_base64="JaQOtQ2WAS20Br+08aKQTmorcBM=">AAADGXiclZJLb9QwEMed8Crh0S0cuViskLaXVVIhUW6VuCBOBbG00mYVOc5k16rtpPYEdWXlc3DpV+mFAyCOcOLb4KRb0QcgGMny3zOe34xHzmspLMbxjyC8dv3GzVtrt6M7d+/dXx9sPHhnq8ZwmPBKVmY/Zxak0DBBgRL2awNM5RL28oMXXXzvPRgrKv0WlzXMFJtrUQrO0LuyjSCOXJpXsrBL5TeXIhxhj3UGinbUtlHa6AJMbhgHd+GQKoaLvHSvssVomW1ttm3mNE1RKLD00CfWC6axUv9T4Bdz7plJzzw8Y9ZtdL6EP/4NvflHdDk66sH1GUn9W7MdMcoGw3gc90avimQlhmRlu9ngW1pUvFGgkUtm7TSJa5w5ZlBwCV2PFmrGD9gcpl5q5juaub5qS594T0HLyvilkfbe8xmOKds16292z7OXY53zd7Fpg+X2zAldNwianxYqG0mxot0/oYUwwFEuvWDcCN8r5QvmB4n+N3VDSC4/+aqYbI2fj+PXT4c7b1bTWCOPyGMyIgl5RnbIS7JLJoQHH4KT4FPwOTwOP4Zfwq+nV8NglfOQXLDw+089SAWq</latexit><latexit sha1_base64="JaQOtQ2WAS20Br+08aKQTmorcBM=">AAADGXiclZJLb9QwEMed8Crh0S0cuViskLaXVVIhUW6VuCBOBbG00mYVOc5k16rtpPYEdWXlc3DpV+mFAyCOcOLb4KRb0QcgGMny3zOe34xHzmspLMbxjyC8dv3GzVtrt6M7d+/dXx9sPHhnq8ZwmPBKVmY/Zxak0DBBgRL2awNM5RL28oMXXXzvPRgrKv0WlzXMFJtrUQrO0LuyjSCOXJpXsrBL5TeXIhxhj3UGinbUtlHa6AJMbhgHd+GQKoaLvHSvssVomW1ttm3mNE1RKLD00CfWC6axUv9T4Bdz7plJzzw8Y9ZtdL6EP/4NvflHdDk66sH1GUn9W7MdMcoGw3gc90avimQlhmRlu9ngW1pUvFGgkUtm7TSJa5w5ZlBwCV2PFmrGD9gcpl5q5juaub5qS594T0HLyvilkfbe8xmOKds16292z7OXY53zd7Fpg+X2zAldNwianxYqG0mxot0/oYUwwFEuvWDcCN8r5QvmB4n+N3VDSC4/+aqYbI2fj+PXT4c7b1bTWCOPyGMyIgl5RnbIS7JLJoQHH4KT4FPwOTwOP4Zfwq+nV8NglfOQXLDw+089SAWq</latexit><latexit sha1_base64="JaQOtQ2WAS20Br+08aKQTmorcBM=">AAADGXiclZJLb9QwEMed8Crh0S0cuViskLaXVVIhUW6VuCBOBbG00mYVOc5k16rtpPYEdWXlc3DpV+mFAyCOcOLb4KRb0QcgGMny3zOe34xHzmspLMbxjyC8dv3GzVtrt6M7d+/dXx9sPHhnq8ZwmPBKVmY/Zxak0DBBgRL2awNM5RL28oMXXXzvPRgrKv0WlzXMFJtrUQrO0LuyjSCOXJpXsrBL5TeXIhxhj3UGinbUtlHa6AJMbhgHd+GQKoaLvHSvssVomW1ttm3mNE1RKLD00CfWC6axUv9T4Bdz7plJzzw8Y9ZtdL6EP/4NvflHdDk66sH1GUn9W7MdMcoGw3gc90avimQlhmRlu9ngW1pUvFGgkUtm7TSJa5w5ZlBwCV2PFmrGD9gcpl5q5juaub5qS594T0HLyvilkfbe8xmOKds16292z7OXY53zd7Fpg+X2zAldNwianxYqG0mxot0/oYUwwFEuvWDcCN8r5QvmB4n+N3VDSC4/+aqYbI2fj+PXT4c7b1bTWCOPyGMyIgl5RnbIS7JLJoQHH4KT4FPwOTwOP4Zfwq+nV8NglfOQXLDw+089SAWq</latexit>

Jh(y2)| {z }n⇥q

( Jg(y1)| {z }q⇥p

Jf (x)| {z }p⇥m

| {z }q⇥m

)

<latexit sha1_base64="JFRosXlhA1ZZCM9t94Snf5Ck5Vc=">AAADGHiclZJNb9QwEIad8FXC17YcuViskLaXJamQWm6VuFScCmJppc0qcpzJrlXbSe0J6srK3+DCX+HCARDX3vg3OOlWKi1FMFKUVzOeZ16PnNdSWIzjn0F44+at23fW7kb37j94+GiwvvHeVo3hMOGVrMxhzixIoWGCAiUc1gaYyiUc5EevuvrBBzBWVPodLmuYKTbXohScoU9l68HztF4wjZVyaV7Jwi6V/7kU4QR7ujNQtKO2jdJGF2Bywzi4VDFc5KV7nS1Gy2xrs20zp2mKQoGlx230P6xrwHMPTnrw8Tm49o3/YHbzWrPl6KQn1udE1UYXB6i/O+/AUTYYxuO4D3pVJCsxJKvYzwanaVHxRoFGLpm10ySuceaYQcEldFYt1IwfsTlMvdTMO5m5fmpLn/lMQcvK+E8j7bMXOxxTtjPrT3a3tJdrXfJPtWmD5c7MCV03CJqfDSobSbGi3TOhhTDAUS69YNwI75XyBfP7RP+YuiUkl698VUy2xi/H8ZsXw923q22skSfkKRmRhGyTXbJH9smE8OBj8Dn4GnwLP4Vfwu/hj7OjYbDqeUx+i/D0F0lCBZY=</latexit><latexit sha1_base64="JFRosXlhA1ZZCM9t94Snf5Ck5Vc=">AAADGHiclZJNb9QwEIad8FXC17YcuViskLaXJamQWm6VuFScCmJppc0qcpzJrlXbSe0J6srK3+DCX+HCARDX3vg3OOlWKi1FMFKUVzOeZ16PnNdSWIzjn0F44+at23fW7kb37j94+GiwvvHeVo3hMOGVrMxhzixIoWGCAiUc1gaYyiUc5EevuvrBBzBWVPodLmuYKTbXohScoU9l68HztF4wjZVyaV7Jwi6V/7kU4QR7ujNQtKO2jdJGF2Bywzi4VDFc5KV7nS1Gy2xrs20zp2mKQoGlx230P6xrwHMPTnrw8Tm49o3/YHbzWrPl6KQn1udE1UYXB6i/O+/AUTYYxuO4D3pVJCsxJKvYzwanaVHxRoFGLpm10ySuceaYQcEldFYt1IwfsTlMvdTMO5m5fmpLn/lMQcvK+E8j7bMXOxxTtjPrT3a3tJdrXfJPtWmD5c7MCV03CJqfDSobSbGi3TOhhTDAUS69YNwI75XyBfP7RP+YuiUkl698VUy2xi/H8ZsXw923q22skSfkKRmRhGyTXbJH9smE8OBj8Dn4GnwLP4Vfwu/hj7OjYbDqeUx+i/D0F0lCBZY=</latexit><latexit sha1_base64="JFRosXlhA1ZZCM9t94Snf5Ck5Vc=">AAADGHiclZJNb9QwEIad8FXC17YcuViskLaXJamQWm6VuFScCmJppc0qcpzJrlXbSe0J6srK3+DCX+HCARDX3vg3OOlWKi1FMFKUVzOeZ16PnNdSWIzjn0F44+at23fW7kb37j94+GiwvvHeVo3hMOGVrMxhzixIoWGCAiUc1gaYyiUc5EevuvrBBzBWVPodLmuYKTbXohScoU9l68HztF4wjZVyaV7Jwi6V/7kU4QR7ujNQtKO2jdJGF2Bywzi4VDFc5KV7nS1Gy2xrs20zp2mKQoGlx230P6xrwHMPTnrw8Tm49o3/YHbzWrPl6KQn1udE1UYXB6i/O+/AUTYYxuO4D3pVJCsxJKvYzwanaVHxRoFGLpm10ySuceaYQcEldFYt1IwfsTlMvdTMO5m5fmpLn/lMQcvK+E8j7bMXOxxTtjPrT3a3tJdrXfJPtWmD5c7MCV03CJqfDSobSbGi3TOhhTDAUS69YNwI75XyBfP7RP+YuiUkl698VUy2xi/H8ZsXw923q22skSfkKRmRhGyTXbJH9smE8OBj8Dn4GnwLP4Vfwu/hj7OjYbDqeUx+i/D0F0lCBZY=</latexit>

Forward Reverse

Cost Cost

qpm+ nqm<latexit sha1_base64="PsrxQgC2l/owSl1oC8PH6vvetvM=">AAAB73icbVBNSwMxEJ2tX7V+VT16CRZBEMquFNRbwYvHKq6ttEvJptk2NMluk6xQlv4KLx5UvPp3vPlvTNs9aOuDgcd7M8zMCxPOtHHdb6ewsrq2vlHcLG1t7+zulfcPHnScKkJ9EvNYtUKsKWeS+oYZTluJoliEnDbD4fXUbz5RpVks7804oYHAfckiRrCx0uMoEegMyZHolitu1Z0BLRMvJxXI0eiWvzq9mKSCSkM41rrtuYkJMqwMI5xOSp1U0wSTIe7TtqUSC6qDbHbwBJ1YpYeiWNmSBs3U3xMZFlqPRWg7BTYDvehNxf+8dmqiyyBjMkkNlWS+KEo5MjGafo96TFFi+NgSTBSztyIywAoTYzMq2RC8xZeXiX9evaq6t7VK/S5PowhHcAyn4MEF1OEGGuADAQHP8ApvjnJenHfnY95acPKZQ/gD5/MHZ/6PvA==</latexit><latexit sha1_base64="PsrxQgC2l/owSl1oC8PH6vvetvM=">AAAB73icbVBNSwMxEJ2tX7V+VT16CRZBEMquFNRbwYvHKq6ttEvJptk2NMluk6xQlv4KLx5UvPp3vPlvTNs9aOuDgcd7M8zMCxPOtHHdb6ewsrq2vlHcLG1t7+zulfcPHnScKkJ9EvNYtUKsKWeS+oYZTluJoliEnDbD4fXUbz5RpVks7804oYHAfckiRrCx0uMoEegMyZHolitu1Z0BLRMvJxXI0eiWvzq9mKSCSkM41rrtuYkJMqwMI5xOSp1U0wSTIe7TtqUSC6qDbHbwBJ1YpYeiWNmSBs3U3xMZFlqPRWg7BTYDvehNxf+8dmqiyyBjMkkNlWS+KEo5MjGafo96TFFi+NgSTBSztyIywAoTYzMq2RC8xZeXiX9evaq6t7VK/S5PowhHcAyn4MEF1OEGGuADAQHP8ApvjnJenHfnY95acPKZQ/gD5/MHZ/6PvA==</latexit><latexit sha1_base64="PsrxQgC2l/owSl1oC8PH6vvetvM=">AAAB73icbVBNSwMxEJ2tX7V+VT16CRZBEMquFNRbwYvHKq6ttEvJptk2NMluk6xQlv4KLx5UvPp3vPlvTNs9aOuDgcd7M8zMCxPOtHHdb6ewsrq2vlHcLG1t7+zulfcPHnScKkJ9EvNYtUKsKWeS+oYZTluJoliEnDbD4fXUbz5RpVks7804oYHAfckiRrCx0uMoEegMyZHolitu1Z0BLRMvJxXI0eiWvzq9mKSCSkM41rrtuYkJMqwMI5xOSp1U0wSTIe7TtqUSC6qDbHbwBJ1YpYeiWNmSBs3U3xMZFlqPRWg7BTYDvehNxf+8dmqiyyBjMkkNlWS+KEo5MjGafo96TFFi+NgSTBSztyIywAoTYzMq2RC8xZeXiX9evaq6t7VK/S5PowhHcAyn4MEF1OEGGuADAQHP8ApvjnJenHfnY95acPKZQ/gD5/MHZ/6PvA==</latexit>

nqp+ npm<latexit sha1_base64="kaAj2Joxew7rtxsfYL7H8PSMEWY=">AAAB73icbVBNSwMxEJ2tX7V+VT16CRZBEMpWBPVW8OKximsr7VKyabYNTbIxyQpl6a/w4kHFq3/Hm//GtN2Dtj4YeLw3w8y8SHFmrO9/e4Wl5ZXVteJ6aWNza3unvLt3b5JUExqQhCe6FWFDOZM0sMxy2lKaYhFx2oyGVxO/+US1YYm8syNFQ4H7ksWMYOukB/mo0AmSSnTLFb/qT4EWSS0nFcjR6Ja/Or2EpIJKSzg2pl3zlQ0zrC0jnI5LndRQhckQ92nbUYkFNWE2PXiMjpzSQ3GiXUmLpurviQwLY0Yicp0C24GZ9ybif147tfFFmDGpUkslmS2KU45sgibfox7TlFg+cgQTzdytiAywxsS6jEouhNr8y4skOK1eVv2bs0r9Nk+jCAdwCMdQg3OowzU0IAACAp7hFd487b14797HrLXg5TP78Afe5w9n/o+8</latexit><latexit sha1_base64="kaAj2Joxew7rtxsfYL7H8PSMEWY=">AAAB73icbVBNSwMxEJ2tX7V+VT16CRZBEMpWBPVW8OKximsr7VKyabYNTbIxyQpl6a/w4kHFq3/Hm//GtN2Dtj4YeLw3w8y8SHFmrO9/e4Wl5ZXVteJ6aWNza3unvLt3b5JUExqQhCe6FWFDOZM0sMxy2lKaYhFx2oyGVxO/+US1YYm8syNFQ4H7ksWMYOukB/mo0AmSSnTLFb/qT4EWSS0nFcjR6Ja/Or2EpIJKSzg2pl3zlQ0zrC0jnI5LndRQhckQ92nbUYkFNWE2PXiMjpzSQ3GiXUmLpurviQwLY0Yicp0C24GZ9ybif147tfFFmDGpUkslmS2KU45sgibfox7TlFg+cgQTzdytiAywxsS6jEouhNr8y4skOK1eVv2bs0r9Nk+jCAdwCMdQg3OowzU0IAACAp7hFd487b14797HrLXg5TP78Afe5w9n/o+8</latexit><latexit sha1_base64="kaAj2Joxew7rtxsfYL7H8PSMEWY=">AAAB73icbVBNSwMxEJ2tX7V+VT16CRZBEMpWBPVW8OKximsr7VKyabYNTbIxyQpl6a/w4kHFq3/Hm//GtN2Dtj4YeLw3w8y8SHFmrO9/e4Wl5ZXVteJ6aWNza3unvLt3b5JUExqQhCe6FWFDOZM0sMxy2lKaYhFx2oyGVxO/+US1YYm8syNFQ4H7ksWMYOukB/mo0AmSSnTLFb/qT4EWSS0nFcjR6Ja/Or2EpIJKSzg2pl3zlQ0zrC0jnI5LndRQhckQ92nbUYkFNWE2PXiMjpzSQ3GiXUmLpurviQwLY0Yicp0C24GZ9ybif147tfFFmDGpUkslmS2KU45sgibfox7TlFg+cgQTzdytiAywxsS6jEouhNr8y4skOK1eVv2bs0r9Nk+jCAdwCMdQg3OowzU0IAACAp7hFd487b14797HrLXg5TP78Afe5w9n/o+8</latexit>

= m(qp+ nq)<latexit sha1_base64="Fj7AFZDYpM3/JTcVuDvNC8XBago=">AAACBnicbVBNSwMxEM3Wr1q/qh4FCRahIpRdEdSDUPDisYq1hbaUbDptQ7PZbTIrlqU3L/4VLx5UvPobvPlvTD8OWn0w8HhvJpN5fiSFQdf9clJz8wuLS+nlzMrq2vpGdnPr1oSx5lDmoQx11WcGpFBQRoESqpEGFvgSKn7vYuRX7kAbEaobHETQCFhHibbgDK3UzO6e06SOcI/jpxJfxjAMhvl+RA+p6h80szm34I5B/xJvSnJkilIz+1lvhTwOQCGXzJia50bYSJhGwSUMM/XYQMR4j3WgZqliAZhGMl4+pPtWadF2qG0ppGP150TCAmMGgW87A4ZdM+uNxP+8Wozt00YiVBQjKD5Z1I4lxZCOQqEtoYGjHFjCuBb2r5R3mWYcbXQZG4I3e/JfUj4qnBXcq+Nc8XqaRprskD2SJx45IUVySUqkTDh5IE/khbw6j86z8+a8T1pTznRmm/yC8/ENxgKY2A==</latexit><latexit sha1_base64="Fj7AFZDYpM3/JTcVuDvNC8XBago=">AAACBnicbVBNSwMxEM3Wr1q/qh4FCRahIpRdEdSDUPDisYq1hbaUbDptQ7PZbTIrlqU3L/4VLx5UvPobvPlvTD8OWn0w8HhvJpN5fiSFQdf9clJz8wuLS+nlzMrq2vpGdnPr1oSx5lDmoQx11WcGpFBQRoESqpEGFvgSKn7vYuRX7kAbEaobHETQCFhHibbgDK3UzO6e06SOcI/jpxJfxjAMhvl+RA+p6h80szm34I5B/xJvSnJkilIz+1lvhTwOQCGXzJia50bYSJhGwSUMM/XYQMR4j3WgZqliAZhGMl4+pPtWadF2qG0ppGP150TCAmMGgW87A4ZdM+uNxP+8Wozt00YiVBQjKD5Z1I4lxZCOQqEtoYGjHFjCuBb2r5R3mWYcbXQZG4I3e/JfUj4qnBXcq+Nc8XqaRprskD2SJx45IUVySUqkTDh5IE/khbw6j86z8+a8T1pTznRmm/yC8/ENxgKY2A==</latexit><latexit sha1_base64="Fj7AFZDYpM3/JTcVuDvNC8XBago=">AAACBnicbVBNSwMxEM3Wr1q/qh4FCRahIpRdEdSDUPDisYq1hbaUbDptQ7PZbTIrlqU3L/4VLx5UvPobvPlvTD8OWn0w8HhvJpN5fiSFQdf9clJz8wuLS+nlzMrq2vpGdnPr1oSx5lDmoQx11WcGpFBQRoESqpEGFvgSKn7vYuRX7kAbEaobHETQCFhHibbgDK3UzO6e06SOcI/jpxJfxjAMhvl+RA+p6h80szm34I5B/xJvSnJkilIz+1lvhTwOQCGXzJia50bYSJhGwSUMM/XYQMR4j3WgZqliAZhGMl4+pPtWadF2qG0ppGP150TCAmMGgW87A4ZdM+uNxP+8Wozt00YiVBQjKD5Z1I4lxZCOQqEtoYGjHFjCuBb2r5R3mWYcbXQZG4I3e/JfUj4qnBXcq+Nc8XqaRprskD2SJx45IUVySUqkTDh5IE/khbw6j86z8+a8T1pTznRmm/yC8/ENxgKY2A==</latexit>

= n(qp+ pm)<latexit sha1_base64="qCVs3i7yfWKSyoNHEwdEcOm/FL8=">AAACNXicbZDPSgMxEMaz9V+t/6oevQSLUFHKVhT1IAhePKpYW2hLyWZn29Bsdk1mxbLsA/g0XvVRvHgSr76AB9Pag1Y/GPj4ZiYZfl4shUHXfXFyU9Mzs3P5+cLC4tLySnF17cZEieZQ45GMdMNjBqRQUEOBEhqxBhZ6Eupe/2zYr9+BNiJS1ziIoR2yrhKB4Axt1CmWTmjaQrjH0VOpz3S/qwFUprLybUx3aBxu2ym34o5E/5rq2JTIWBed4mfLj3gSgkIumTHNqhtjO2UaBZeQFVqJgZjxPutC01rFQjDtdHRBRrds4tMg0rYU0lH6cyNloTGD0LOTIcOemewNw/96zQSDo3YqVJwgKP79UZBIihEdkqG+0MBRDqxhXAt7K+U9phlHy6/Q8iGwjCcxpbrrZamFsetWDmy5WcHiqk7C+Wtqe5Xjinu5Xzq9GnPLkw2yScqkSg7JKTknF6RGOHkgj+SJPDtPzqvz5rx/j+ac8c46+SXn4wtE86rd</latexit><latexit sha1_base64="qCVs3i7yfWKSyoNHEwdEcOm/FL8=">AAACNXicbZDPSgMxEMaz9V+t/6oevQSLUFHKVhT1IAhePKpYW2hLyWZn29Bsdk1mxbLsA/g0XvVRvHgSr76AB9Pag1Y/GPj4ZiYZfl4shUHXfXFyU9Mzs3P5+cLC4tLySnF17cZEieZQ45GMdMNjBqRQUEOBEhqxBhZ6Eupe/2zYr9+BNiJS1ziIoR2yrhKB4Axt1CmWTmjaQrjH0VOpz3S/qwFUprLybUx3aBxu2ym34o5E/5rq2JTIWBed4mfLj3gSgkIumTHNqhtjO2UaBZeQFVqJgZjxPutC01rFQjDtdHRBRrds4tMg0rYU0lH6cyNloTGD0LOTIcOemewNw/96zQSDo3YqVJwgKP79UZBIihEdkqG+0MBRDqxhXAt7K+U9phlHy6/Q8iGwjCcxpbrrZamFsetWDmy5WcHiqk7C+Wtqe5Xjinu5Xzq9GnPLkw2yScqkSg7JKTknF6RGOHkgj+SJPDtPzqvz5rx/j+ac8c46+SXn4wtE86rd</latexit><latexit sha1_base64="qCVs3i7yfWKSyoNHEwdEcOm/FL8=">AAACNXicbZDPSgMxEMaz9V+t/6oevQSLUFHKVhT1IAhePKpYW2hLyWZn29Bsdk1mxbLsA/g0XvVRvHgSr76AB9Pag1Y/GPj4ZiYZfl4shUHXfXFyU9Mzs3P5+cLC4tLySnF17cZEieZQ45GMdMNjBqRQUEOBEhqxBhZ6Eupe/2zYr9+BNiJS1ziIoR2yrhKB4Axt1CmWTmjaQrjH0VOpz3S/qwFUprLybUx3aBxu2ym34o5E/5rq2JTIWBed4mfLj3gSgkIumTHNqhtjO2UaBZeQFVqJgZjxPutC01rFQjDtdHRBRrds4tMg0rYU0lH6cyNloTGD0LOTIcOemewNw/96zQSDo3YqVJwgKP79UZBIihEdkqG+0MBRDqxhXAt7K+U9phlHy6/Q8iGwjCcxpbrrZamFsetWDmy5WcHiqk7C+Wtqe5Xjinu5Xzq9GnPLkw2yScqkSg7JKTknF6RGOHkgj+SJPDtPzqvz5rx/j+ac8c46+SXn4wtE86rd</latexit>

11

Page 12: Deep Learning with Myia - Nvidia · Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain) The Needs What we need from a language for deep

Forward vs Reverse

Forward mode is good when there are few inputs.• Easy to implement: dual numbers.

x !✓y1,

dy1dx

◆!

✓y2,

dy2dx

◆!

✓y3,

dy3dx

<latexit sha1_base64="PFsqI4r5+MEFXWPxzrrrIbvPhf4=">AAACnXichZFNSxxBEIZ7JiaazYdrPOTgpZMlYECWmVUx3gRBPIhoyGaFnWXp6amZbbanZ+iuMS7N/AJ/oT/Duwd7Pw7rB6Sg4OWtp+nirbiUwmAQ3Hn+m5W371bX3jc+fPz0eb258eWvKSrNocsLWeirmBmQQkEXBUq4KjWwPJbQi8fH03nvGrQRhfqDkxIGOcuUSAVn6Kxh8/aGRlpkI2RaF/9oJCHF7ckw3KFRqhm3idO1TW7qOfXzVbqzRHf+S+8u0bvL9LDZCtrBrOhLES5EiyzqYth8iJKCVzko5JIZ0w+DEgeWaRRcQt2IKgMl42OWQd9JxXIwAzsLraY/nJPQtNCuFdKZu/zCstyYSR47Mmc4Ms9nU/O1Wb/C9NfAClVWCIrPP0orSbGg0wvQRGjgKCdOMK6F25XyEXOBoLtTI0ogdbecrWMTpseZBlC11VlcWxfGTtDedx3UDRdX+Dycl6LbaR+2g8u91tHvRW5rZIt8J9skJAfkiJySC9IlnNx7Xz3qffOpf+Kf+edz1PcWbzbJk/J7jybizgI=</latexit><latexit sha1_base64="PFsqI4r5+MEFXWPxzrrrIbvPhf4=">AAACnXichZFNSxxBEIZ7JiaazYdrPOTgpZMlYECWmVUx3gRBPIhoyGaFnWXp6amZbbanZ+iuMS7N/AJ/oT/Duwd7Pw7rB6Sg4OWtp+nirbiUwmAQ3Hn+m5W371bX3jc+fPz0eb258eWvKSrNocsLWeirmBmQQkEXBUq4KjWwPJbQi8fH03nvGrQRhfqDkxIGOcuUSAVn6Kxh8/aGRlpkI2RaF/9oJCHF7ckw3KFRqhm3idO1TW7qOfXzVbqzRHf+S+8u0bvL9LDZCtrBrOhLES5EiyzqYth8iJKCVzko5JIZ0w+DEgeWaRRcQt2IKgMl42OWQd9JxXIwAzsLraY/nJPQtNCuFdKZu/zCstyYSR47Mmc4Ms9nU/O1Wb/C9NfAClVWCIrPP0orSbGg0wvQRGjgKCdOMK6F25XyEXOBoLtTI0ogdbecrWMTpseZBlC11VlcWxfGTtDedx3UDRdX+Dycl6LbaR+2g8u91tHvRW5rZIt8J9skJAfkiJySC9IlnNx7Xz3qffOpf+Kf+edz1PcWbzbJk/J7jybizgI=</latexit><latexit sha1_base64="PFsqI4r5+MEFXWPxzrrrIbvPhf4=">AAACnXichZFNSxxBEIZ7JiaazYdrPOTgpZMlYECWmVUx3gRBPIhoyGaFnWXp6amZbbanZ+iuMS7N/AJ/oT/Duwd7Pw7rB6Sg4OWtp+nirbiUwmAQ3Hn+m5W371bX3jc+fPz0eb258eWvKSrNocsLWeirmBmQQkEXBUq4KjWwPJbQi8fH03nvGrQRhfqDkxIGOcuUSAVn6Kxh8/aGRlpkI2RaF/9oJCHF7ckw3KFRqhm3idO1TW7qOfXzVbqzRHf+S+8u0bvL9LDZCtrBrOhLES5EiyzqYth8iJKCVzko5JIZ0w+DEgeWaRRcQt2IKgMl42OWQd9JxXIwAzsLraY/nJPQtNCuFdKZu/zCstyYSR47Mmc4Ms9nU/O1Wb/C9NfAClVWCIrPP0orSbGg0wvQRGjgKCdOMK6F25XyEXOBoLtTI0ogdbecrWMTpseZBlC11VlcWxfGTtDedx3UDRdX+Dycl6LbaR+2g8u91tHvRW5rZIt8J9skJAfkiJySC9IlnNx7Xz3qffOpf+Kf+edz1PcWbzbJk/J7jybizgI=</latexit>

12

Reverse mode is good when there are few outputs.• Hard to implement: execution is reversed.

x ! y1 ! y2 ! y3 ! dy3dy2

! dy3dy1

! dy3dx

<latexit sha1_base64="QdhX18/+A7E+bYo918Uf7uHg8S4=">AAACm3icfVHbSsNAEN3Ee71VxSdBF4vgg5SkKupbQR9EfFAxKjSlbDaTdOlmE3Y3agn5AD/Rr/AHfHB7edAqDsxy5pwZdjgTZJwp7Tjvlj01PTM7N79QWVxaXlmtrq0/qDSXFDya8lQ+BUQBZwI8zTSHp0wCSQIOj0HvfKA/PoNULBX3up9BOyGxYBGjRBuqU317xb5kcVcTKdMX3O+4E3Vjoj78UfuRJLQIDV0O3kb5n+r+o76WnWrNqTvDwL+BOwY1NI6bTvXTD1OaJyA05USplutkul0QqRnlUFb8XEFGaI/E0DJQkARUuxhaVuI9w4Q4SqVJofGQ/T5RkESpfhKYzoTorprUBuRfWivX0Wm7YCLLNQg6+ijKOdYpHviPQyaBat43gFDJzK6YdolxQpsrVfwQInPJ4TpFSGQvlgCiLGQclIUx48CpH5t0yoqxy5005zfwGvWzunN7VGvejX2bR1toF+0jF52gJrpEN8hDFH1Ym9a2tWNv2xf2lX09arWt8cwG+hG29wVrqM5p</latexit><latexit sha1_base64="QdhX18/+A7E+bYo918Uf7uHg8S4=">AAACm3icfVHbSsNAEN3Ee71VxSdBF4vgg5SkKupbQR9EfFAxKjSlbDaTdOlmE3Y3agn5AD/Rr/AHfHB7edAqDsxy5pwZdjgTZJwp7Tjvlj01PTM7N79QWVxaXlmtrq0/qDSXFDya8lQ+BUQBZwI8zTSHp0wCSQIOj0HvfKA/PoNULBX3up9BOyGxYBGjRBuqU317xb5kcVcTKdMX3O+4E3Vjoj78UfuRJLQIDV0O3kb5n+r+o76WnWrNqTvDwL+BOwY1NI6bTvXTD1OaJyA05USplutkul0QqRnlUFb8XEFGaI/E0DJQkARUuxhaVuI9w4Q4SqVJofGQ/T5RkESpfhKYzoTorprUBuRfWivX0Wm7YCLLNQg6+ijKOdYpHviPQyaBat43gFDJzK6YdolxQpsrVfwQInPJ4TpFSGQvlgCiLGQclIUx48CpH5t0yoqxy5005zfwGvWzunN7VGvejX2bR1toF+0jF52gJrpEN8hDFH1Ym9a2tWNv2xf2lX09arWt8cwG+hG29wVrqM5p</latexit><latexit sha1_base64="QdhX18/+A7E+bYo918Uf7uHg8S4=">AAACm3icfVHbSsNAEN3Ee71VxSdBF4vgg5SkKupbQR9EfFAxKjSlbDaTdOlmE3Y3agn5AD/Rr/AHfHB7edAqDsxy5pwZdjgTZJwp7Tjvlj01PTM7N79QWVxaXlmtrq0/qDSXFDya8lQ+BUQBZwI8zTSHp0wCSQIOj0HvfKA/PoNULBX3up9BOyGxYBGjRBuqU317xb5kcVcTKdMX3O+4E3Vjoj78UfuRJLQIDV0O3kb5n+r+o76WnWrNqTvDwL+BOwY1NI6bTvXTD1OaJyA05USplutkul0QqRnlUFb8XEFGaI/E0DJQkARUuxhaVuI9w4Q4SqVJofGQ/T5RkESpfhKYzoTorprUBuRfWivX0Wm7YCLLNQg6+ijKOdYpHviPQyaBat43gFDJzK6YdolxQpsrVfwQInPJ4TpFSGQvlgCiLGQclIUx48CpH5t0yoqxy5005zfwGvWzunN7VGvejX2bR1toF+0jF52gJrpEN8hDFH1Ym9a2tWNv2xf2lX09arWt8cwG+hG29wVrqM5p</latexit>

Page 13: Deep Learning with Myia - Nvidia · Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain) The Needs What we need from a language for deep

Forward vs Reverse

Deep learning involves computing the gradient of millions of parameters with respect to a loss.

We need reverse mode.

✓ ✓ � ✏@L@✓

where ✓ = (W1,W2, . . . ,b1,b2, . . .)<latexit sha1_base64="pS/V8h9CqT4Uo4yd/AVrDRTdq+g=">AAACrHicbVFNb9QwEHXCVwkf3cKRi8WKqkjLNqmogANSJS4cOBRE2ErraOU4k421zofsCWVl5Y/wz/gvHHCykSgtI4309GbeePwmbZQ0GIa/PP/W7Tt37+3dDx48fPR4f3Lw5JupWy0gFrWq9UXKDShZQYwSFVw0GniZKlikmw99ffEdtJF19RW3DSQlX1cyl4Kjo1aTnwwLQE4PmYIcudb1JR2pV5RBY6SqK8pyzYVlDdcouaKs5FgIruynrrvCDrKOMhYwhB9oLwvQQDs3ezfw/U6Y5vZosYpmdLE6mVFWZDWaGU17Jv3LvOxWk2k4D4egN0E0gikZ43w1+c2yWrQlVCgUN2YZhQ0mtl9PKOgC1hpouNjwNSwdrHgJJrGDhx194ZiM5rV2WSEd2KsKy0tjtmXqOvtPmOu1nvxfbdli/jaxsmpahErsHspbRbGm/UFoJjUIVFsHuNDS7UpFwZ3f6M4WsAxyd9phHZtxvVlrgKqzep121pkxC+enLsMucHZF1825CeKT+bt5+Pn19OzL6NseeUaekyMSkTfkjHwk5yQmwiPeoXfshf6xH/tLP9m1+t6oeUr+CT//A1m30Bc=</latexit><latexit sha1_base64="pS/V8h9CqT4Uo4yd/AVrDRTdq+g=">AAACrHicbVFNb9QwEHXCVwkf3cKRi8WKqkjLNqmogANSJS4cOBRE2ErraOU4k421zofsCWVl5Y/wz/gvHHCykSgtI4309GbeePwmbZQ0GIa/PP/W7Tt37+3dDx48fPR4f3Lw5JupWy0gFrWq9UXKDShZQYwSFVw0GniZKlikmw99ffEdtJF19RW3DSQlX1cyl4Kjo1aTnwwLQE4PmYIcudb1JR2pV5RBY6SqK8pyzYVlDdcouaKs5FgIruynrrvCDrKOMhYwhB9oLwvQQDs3ezfw/U6Y5vZosYpmdLE6mVFWZDWaGU17Jv3LvOxWk2k4D4egN0E0gikZ43w1+c2yWrQlVCgUN2YZhQ0mtl9PKOgC1hpouNjwNSwdrHgJJrGDhx194ZiM5rV2WSEd2KsKy0tjtmXqOvtPmOu1nvxfbdli/jaxsmpahErsHspbRbGm/UFoJjUIVFsHuNDS7UpFwZ3f6M4WsAxyd9phHZtxvVlrgKqzep121pkxC+enLsMucHZF1825CeKT+bt5+Pn19OzL6NseeUaekyMSkTfkjHwk5yQmwiPeoXfshf6xH/tLP9m1+t6oeUr+CT//A1m30Bc=</latexit><latexit sha1_base64="pS/V8h9CqT4Uo4yd/AVrDRTdq+g=">AAACrHicbVFNb9QwEHXCVwkf3cKRi8WKqkjLNqmogANSJS4cOBRE2ErraOU4k421zofsCWVl5Y/wz/gvHHCykSgtI4309GbeePwmbZQ0GIa/PP/W7Tt37+3dDx48fPR4f3Lw5JupWy0gFrWq9UXKDShZQYwSFVw0GniZKlikmw99ffEdtJF19RW3DSQlX1cyl4Kjo1aTnwwLQE4PmYIcudb1JR2pV5RBY6SqK8pyzYVlDdcouaKs5FgIruynrrvCDrKOMhYwhB9oLwvQQDs3ezfw/U6Y5vZosYpmdLE6mVFWZDWaGU17Jv3LvOxWk2k4D4egN0E0gikZ43w1+c2yWrQlVCgUN2YZhQ0mtl9PKOgC1hpouNjwNSwdrHgJJrGDhx194ZiM5rV2WSEd2KsKy0tjtmXqOvtPmOu1nvxfbdli/jaxsmpahErsHspbRbGm/UFoJjUIVFsHuNDS7UpFwZ3f6M4WsAxyd9phHZtxvVlrgKqzep121pkxC+enLsMucHZF1825CeKT+bt5+Pn19OzL6NseeUaekyMSkTfkjHwk5yQmwiPeoXfshf6xH/tLP9m1+t6oeUr+CT//A1m30Bc=</latexit>

13

Page 14: Deep Learning with Myia - Nvidia · Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain) The Needs What we need from a language for deep

OO vs SCT: Operator Overloading

def f(x): i = 0 while i < 3: i = i + 1 x = tanh(x) x = x * 10 return x

i = 0 i = i + 1 x = tanh(x) i = i + 1 x = tanh(x) i = i + 1 x = tanh(x) x = x * 10

TraceBackprop

• Overload every operation to log itself on a tape. • At the end, we walk the tape backward. • “Define-by-run”, “Dynamic graph” • Easy to implement, but lots of overhead

• Discourages composing small & cheap operations

Program Tape

14

Page 15: Deep Learning with Myia - Nvidia · Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain) The Needs What we need from a language for deep

OO vs SCT: Source Code Transformation

• Transform a function that computes a value into a new function that computes the derivative.

• Operate on source code or intermediate representation • Applies the chain rule to code

• Standard language optimizations apply: can eliminate overhead

• Easier to apply to functional languages • Reverse mode AD interacts badly with mutation and side effects • Requires deep analysis and optimization to remove dead code

15

def bprop_pow(x, y, out, dout): dx = dout * y * x ** (y - 1) dy = dout * out * log(x) return dx, dy

What if we don’t need dy?

Page 16: Deep Learning with Myia - Nvidia · Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain) The Needs What we need from a language for deep

The Needs What we need from a language for deep learning

Autodiff What it is, how it works, what the challenges are

Representation The best representation for our needs

Type system Flexible inference for performance and robustness

16

Page 17: Deep Learning with Myia - Nvidia · Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain) The Needs What we need from a language for deep

About syntax

17

Myia is an intermediate representation• High level • No syntax of its own • Multiple languages may target it

Python frontend• Why? Most used language in DL • Productive for research and prototyping • Translate functional subset to Myia

• Control flow: if, while, for, def, lambda • Data: lists, tuples, arrays, @dataclass • Not supported: mutation, side effects, eval

• One issue: translate dynamically typed code

def fact(x): if x <= 1: return 1 else: return x * fact(x - 1)

Output

Operation

Input

Constant

Free var.

Page 18: Deep Learning with Myia - Nvidia · Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain) The Needs What we need from a language for deep

Needs

18

Requirements for our representation• Powerful enough to represent recursion • Minimal • Easy to parallelize • Easy to optimize • Easy to extend

Solutions• Functional (ANF) • Graph-based • Typed

Page 19: Deep Learning with Myia - Nvidia · Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain) The Needs What we need from a language for deep

Why functional programming?

19

Easier to transform• Referential transparency: same expression, same result Easier to think about • No side effects Easier to optimize• Order of operations can be changed • Parallelizable • Common subexpression elimination easy Type system is easier • No side effects Easier for automatic differentiation

Page 20: Deep Learning with Myia - Nvidia · Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain) The Needs What we need from a language for deep

Why graphs?

20

Easy to parallelize• Only data flow relationships

Easy to optimize• Direct use-def pointers (no names) • Dead code elimination is trivial • Inlining is easy

Output

Operation

Input

Page 21: Deep Learning with Myia - Nvidia · Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain) The Needs What we need from a language for deep

Why static typing?

Guarantees• Correctness of the user’s program • Type correctness of code transforms (autodiff)

Performance• No runtime type checking = better performance • Leverage shape information for optimization

User experience• Prevent errors late in process

Page 22: Deep Learning with Myia - Nvidia · Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain) The Needs What we need from a language for deep

The Needs What we need from a language for deep learning

Autodiff What it is, how it works, what the challenges are

Representation The best representation for our needs

Type system Flexible inference for performance and robustness

22

Page 23: Deep Learning with Myia - Nvidia · Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain) The Needs What we need from a language for deep

Myia’s Types

Scalars: Int/UInt/Float<8/16/32/64>, Bool Tuples: Tuple<T1, T2, …> • Heterogeneously typed, static length Lists: List<T> • Homogeneously typed, dynamic length, fast append Arrays: Array<T, Shape<D1, D2, …">> • Homogeneously typed, shape part of the type Functions: Function<Args<TIn1, TIn2, ""...>, TOut>

Struct types are reduced to tuples in pre-processing.

23

Page 24: Deep Learning with Myia - Nvidia · Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain) The Needs What we need from a language for deep

Why inference?

Annotations are annoying• Polymorphic types are awkward to express • Function types are awkward to express • Impede rapid prototyping • Duck typing is more natural • This is why people like Python

Type/shape inference • Infer from the input types from entry point • Implicit polymorphism • Feels dynamic • Functions are re-compiled when they are given new input types

Page 25: Deep Learning with Myia - Nvidia · Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain) The Needs What we need from a language for deep

Myia’s inference pipeline

1. Transform inputs into abstract inputs• Represent type and shape — no concrete values • More types: structs, polymorphic functions

2. Run abstract interpreter on abstract inputs• Bounded input signatures for each function • Recursive functions become fixed points

3. Specialize functions to their possible signatures• If function called with int, make int version, etc. • Higher order uses require signature uniqueness

4. Update or re-run inference after optimizations or AD

Page 26: Deep Learning with Myia - Nvidia · Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain) The Needs What we need from a language for deep

Error reporting

Abstract inferrer shows compile-time tracebacks for type/shape errors.

Page 27: Deep Learning with Myia - Nvidia · Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain) The Needs What we need from a language for deep

Debugging

Tracking correspondence to source code• Through parsing • Through optimization • Through automatic differentiation • Through macros/code generation

Debugging tools we need • Custom debugger for step by step execution • Watching variables and gradients • Breakpoints that trigger during the reverse phase • Profiling and reporting which parts of the code are “hot”

Page 28: Deep Learning with Myia - Nvidia · Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain) Pascal Lamblin (Google Brain) The Needs What we need from a language for deep

In Conclusion: Myia’s focus

General purpose, including recursion

Automatic differentiation• Code transform • Optimizable, higher order gradients

Type and shape inference• Can handle duck typed code

Good debugging facilities• Step debugger, profiling • Gradient debugging

⭐ us on GitHub: https://github.com/mila-iqia/myia