Deep Learning with Myia
Olivier Breuleux Research Developer, MILA
Arnaud Bergeron (MILA) Bart van Merriënboer (MILA, Google Brain)
Pascal Lamblin (Google Brain)
The Needs: What we need from a language for deep learning
Autodiff: What it is, how it works, what the challenges are
Representation: The best representation for our needs
Type system: Flexible inference for performance and robustness
Deep Learning
DL algorithms are increasingly complex
Feedforward (trivial)
Recurrent (loops)
Recursive (recursion)
…?
• More and more language features needed
• Most existing frameworks are limited
• High level abstraction increases productivity
• Focus on the algorithm over implementation details
• Effortless abstractions encourage their use
Needs
Debuggable: Clear errors, inspectable, instrumentable.
Goal: a language adapted to the needs of machine learning, past and future
General purpose: Capable of expressing complex control flow.
Differentiable: Should be able to take nth-order derivative of any program.
Portable: Serializable, support multiple hardware.
Fast: Must leverage parallelism and GPU.
Needs
Debuggable: Type+shape inference, step debugger.
Myia: a language adapted to the needs of machine learning, past and future
General purpose: Conditionals, loops, recursion, data structures.
Differentiable: Transformation at the intermediate representation level.
Fast & portable: Choose from various backends such as NNVM/Relay.
Differentiability
How to train a model
• Initialize a model’s parameters
• Compute some quantity using the parameters
• Compute a cost or “loss function”
• Update parameters using the gradient of the loss
• Rinse and repeat

Gradients
• Can be computed exactly and automatically
• But: no mainstream language supports this natively
• Computational strategies: forward or reverse
• Implementation strategies: operator overloading or source transform
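$$\theta \leftarrow \theta - \lambda \frac{\partial L(f(x; \theta), y)}{\partial \theta}$$

Here $\theta$ are the parameters, $f(x; \theta)$ is the quantity computed from them, and $L(f(x; \theta), y)$ is the loss.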
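The training recipe above can be sketched in a few lines of Python. This is only an illustration: the one-parameter model, the data, the learning rate `lr`, and the hand-derived gradient `dloss_dtheta` are all assumptions made for the example, not part of Myia.

```python
# Minimal sketch: fit y = theta * x by gradient descent.
# Model, data, and learning rate are illustrative assumptions.

def f(x, theta):
    # The quantity computed from the parameters.
    return theta * x

def loss(pred, y):
    # Squared error between prediction and target.
    return (pred - y) ** 2

def dloss_dtheta(x, y, theta):
    # Derivative of (theta * x - y)^2 with respect to theta, derived by hand.
    return 2.0 * (theta * x - y) * x

def train(data, theta=0.0, lr=0.05, steps=200):
    # Repeatedly update theta using the gradient of the loss.
    for _ in range(steps):
        for x, y in data:
            theta = theta - lr * dloss_dtheta(x, y, theta)
    return theta

data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]  # generated by y = 3 * x
theta = train(data)
```

The hand-written `dloss_dtheta` is exactly the part that automatic differentiation is meant to produce for us.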
Forward vs Reverse
$$y_1 = f(x), \quad y_2 = g(y_1), \quad y_3 = h(y_2)$$

$$f : \mathbb{R}^m \to \mathbb{R}^p, \quad g : \mathbb{R}^p \to \mathbb{R}^q, \quad h : \mathbb{R}^q \to \mathbb{R}^n$$

$$\underbrace{J_{h \circ g \circ f}(x)}_{n \times m} = \underbrace{J_h(y_2)}_{n \times q} \, \underbrace{J_g(y_1)}_{q \times p} \, \underbrace{J_f(x)}_{p \times m}$$
The derivative of a straight composition of functions is the multiplication of their Jacobians
In what order?
Forward vs Reverse
Forward

$$\underbrace{J_h(y_2)}_{n \times q} \Big( \underbrace{\underbrace{J_g(y_1)}_{q \times p} \, \underbrace{J_f(x)}_{p \times m}}_{q \times m} \Big)$$

Cost: $qpm + nqm = m(qp + nq)$

Reverse

$$\Big( \underbrace{\underbrace{J_h(y_2)}_{n \times q} \, \underbrace{J_g(y_1)}_{q \times p}}_{n \times p} \Big) \underbrace{J_f(x)}_{p \times m}$$

Cost: $nqp + npm = n(qp + pm)$
Forward vs Reverse
Forward mode is good when there are few inputs.
• Easy to implement: dual numbers.

$$x \to \left(y_1, \frac{dy_1}{dx}\right) \to \left(y_2, \frac{dy_2}{dx}\right) \to \left(y_3, \frac{dy_3}{dx}\right)$$
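Forward mode with dual numbers can be sketched as follows. The `Dual` class, the `tanh` wrapper, and the test function `f` are hypothetical names written for this illustration; only addition, multiplication, and tanh are supported.

```python
import math

class Dual:
    """A value paired with its derivative with respect to the input."""
    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Sum rule: derivatives add.
        return Dual(self.value + other.value, self.deriv + other.deriv)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule.
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)
    __rmul__ = __mul__

def tanh(d):
    # Chain rule: d/dx tanh(u) = (1 - tanh(u)^2) * du/dx.
    t = math.tanh(d.value)
    return Dual(t, (1.0 - t * t) * d.deriv)

def derivative(fn, x):
    # Seed the input with derivative 1 and read the derivative of the output.
    return fn(Dual(x, 1.0)).deriv

def f(x):
    return tanh(x * x) * 10

df = derivative(f, 0.5)
```

Each intermediate value carries its derivative along with it, so one pass yields the derivative with respect to one input. That is why this approach suits functions with few inputs.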
Reverse mode is good when there are few outputs.
• Hard to implement: execution is reversed.

$$x \to y_1 \to y_2 \to y_3 \to \frac{dy_3}{dy_2} \to \frac{dy_3}{dy_1} \to \frac{dy_3}{dx}$$
Forward vs Reverse
Deep learning involves computing the gradient of a scalar loss with respect to millions of parameters: many inputs, a single output.
We need reverse mode.

$$\theta \leftarrow \theta - \epsilon \frac{\partial L}{\partial \theta} \quad \text{where } \theta = (W_1, W_2, \ldots, b_1, b_2, \ldots)$$
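Reverse mode on a chain y1 = f(x), y2 = g(y1), y3 = h(y2) can be written out by hand to show the reversed execution order. The concrete choices below (square, sine, exponential) are assumptions made for the example.

```python
import math

def forward_and_reverse(x):
    # Forward pass: compute and keep every intermediate value.
    y1 = x * x           # f
    y2 = math.sin(y1)    # g
    y3 = math.exp(y2)    # h

    # Reverse pass: start at the output and walk back toward the input.
    dy3_dy2 = y3                        # derivative of exp(y2) w.r.t. y2
    dy3_dy1 = dy3_dy2 * math.cos(y1)    # times derivative of sin(y1)
    dy3_dx = dy3_dy1 * 2.0 * x          # times derivative of x * x
    return y3, dy3_dx

value, grad = forward_and_reverse(0.7)
```

The reverse pass needs the intermediates `y1` and `y3` from the forward pass, which is the reason reverse mode must store or recompute them.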
OO vs SCT: Operator Overloading
Program:

    def f(x):
        i = 0
        while i < 3:
            i = i + 1
            x = tanh(x)
        x = x * 10
        return x

Tape (recorded by tracing the program, then walked backward for backprop):

    i = 0
    i = i + 1
    x = tanh(x)
    i = i + 1
    x = tanh(x)
    i = i + 1
    x = tanh(x)
    x = x * 10

• Overload every operation to log itself on a tape.
• At the end, we walk the tape backward.
• “Define-by-run”, “Dynamic graph”
• Easy to implement, but lots of overhead
• Discourages composing small & cheap operations
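A tape-based implementation of operator overloading might look like the sketch below. `Tape`, `Var`, `tanh`, and `backprop` are hypothetical names for this illustration, and only multiplication and tanh are overloaded. Unlike the tape on the slide, this sketch does not log the plain integer counter `i`, only operations on traced values.

```python
import math

class Tape:
    def __init__(self):
        # Each entry: (output Var, [(input Var, local derivative), ...]).
        self.entries = []

class Var:
    def __init__(self, value, tape):
        self.value = value
        self.tape = tape
        self.grad = 0.0

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other, self.tape)
        out = Var(self.value * other.value, self.tape)
        # Log the operation together with its local derivatives.
        self.tape.entries.append(
            (out, [(self, other.value), (other, self.value)]))
        return out

def tanh(x):
    t = math.tanh(x.value)
    out = Var(t, x.tape)
    x.tape.entries.append((out, [(x, 1.0 - t * t)]))
    return out

def backprop(tape, output):
    # Walk the tape backward, accumulating gradients with the chain rule.
    output.grad = 1.0
    for out, inputs in reversed(tape.entries):
        for var, local in inputs:
            var.grad += out.grad * local

def f(x):
    i = 0
    while i < 3:
        i = i + 1
        x = tanh(x)
    x = x * 10
    return x

tape = Tape()
x = Var(0.5, tape)
y = f(x)
backprop(tape, y)
```

Every arithmetic operation allocates a `Var` and a tape entry, which gives a feel for the overhead that makes small, cheap operations unattractive under this strategy.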
OO vs SCT: Source Code Transformation
• Transform a function that computes a value into a new function that computes the derivative.
• Operate on source code or intermediate representation
• Applies the chain rule to code
• Standard language optimizations apply: can eliminate overhead
• Easier to apply to functional languages
• Reverse mode AD interacts badly with mutation and side effects
• Requires deep analysis and optimization to remove dead code
    def bprop_pow(x, y, out, dout):
        dx = dout * y * x ** (y - 1)
        dy = dout * out * log(x)
        return dx, dy
What if we don’t need dy?
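One way to read the question: if only the gradient with respect to x is requested, the computation of dy (including the call to log) is dead code that an optimizer can remove after inlining. The sketch below contrasts the two versions by hand. The function names `grad_pow_wrt_x` and `grad_pow_wrt_x_optimized` are hypothetical, and the "optimized" form is written manually rather than produced by any real compiler pass.

```python
import math

def bprop_pow(x, y, out, dout):
    # Backpropagator for out = x ** y, as on the slide.
    dx = dout * y * x ** (y - 1)
    dy = dout * out * math.log(x)
    return dx, dy

def grad_pow_wrt_x(x, y):
    # What generated code might look like before optimization:
    # dy is computed and then thrown away.
    out = x ** y
    dx, dy = bprop_pow(x, y, out, 1.0)
    return dx

def grad_pow_wrt_x_optimized(x, y):
    # After inlining and dead code elimination: dy, log(x), and even
    # the forward value `out` are no longer needed.
    return y * x ** (y - 1)

before = grad_pow_wrt_x(2.0, 3.0)
after = grad_pow_wrt_x_optimized(2.0, 3.0)
```

Both versions return the same gradient, but the second does strictly less work. With operator overloading, the unused branch would typically still be executed.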
About syntax
Myia is an intermediate representation
• High level
• No syntax of its own
• Multiple languages may target it

Python frontend
• Why? Most used language in DL
• Productive for research and prototyping
• Translate functional subset to Myia
• Control flow: if, while, for, def, lambda
• Data: lists, tuples, arrays, @dataclass
• Not supported: mutation, side effects, eval
• One issue: translate dynamically typed code

    def fact(x):
        if x <= 1:
            return 1
        else:
            return x * fact(x - 1)
[Figure legend: Output, Operation, Input, Constant, Free variable]
Needs
Requirements for our representation
• Powerful enough to represent recursion
• Minimal
• Easy to parallelize
• Easy to optimize
• Easy to extend

Solutions
• Functional (ANF)
• Graph-based
• Typed
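As a rough illustration of A-normal form (ANF), the factorial function can be rewritten so that every intermediate result is bound to a name and every operation takes only names or constants as arguments. This is a hand-written Python rendering for intuition only; it is not Myia's actual internal representation, and `fact_anf` is a hypothetical name.

```python
def fact_anf(x):
    # Each intermediate value gets its own name; no nested expressions.
    cond = x <= 1
    if cond:
        return 1
    else:
        a = x - 1
        b = fact_anf(a)   # recursion is just another call
        c = x * b
        return c

result = fact_anf(5)
```

Because each name is assigned exactly once, the names map directly onto nodes of a data flow graph, which is what makes the graph-based form natural.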
Why functional programming?
Easier to transform
• Referential transparency: same expression, same result

Easier to think about
• No side effects

Easier to optimize
• Order of operations can be changed
• Parallelizable
• Common subexpression elimination easy

Type system is easier
• No side effects

Easier for automatic differentiation
Why graphs?
Easy to parallelize
• Only data flow relationships

Easy to optimize
• Direct use-def pointers (no names)
• Dead code elimination is trivial
• Inlining is easy
[Figure legend: Output, Operation, Input]
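The claim that dead code elimination is trivial on a graph with direct use-def pointers can be illustrated with a small sketch: anything not reachable from the output is dead. The `Node` class and `live_nodes` function are hypothetical, and the graph encodes the two results of the power backpropagator (dx and dy) purely as an example.

```python
class Node:
    """A graph node holding direct pointers to the nodes it uses."""
    def __init__(self, op, *inputs):
        self.op = op
        self.inputs = inputs

def live_nodes(output):
    """Collect every node reachable from the output through its inputs."""
    seen = set()
    stack = [output]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(node.inputs)
    return seen

# Graph for dx = dout * y * x ** (y - 1) and dy = dout * out * log(x).
x, y, out, dout = (Node("input") for _ in range(4))
one = Node("constant")
dx = Node("mul", Node("mul", dout, y), Node("pow", x, Node("sub", y, one)))
log_x = Node("log", x)
dy = Node("mul", Node("mul", dout, out), log_x)

# If only dx is requested, everything outside this set is dead code.
live = live_nodes(dx)
```

No name resolution or liveness analysis over statements is needed: a single traversal from the output decides what to keep.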
Why static typing?
Guarantees
• Correctness of the user’s program
• Type correctness of code transforms (autodiff)

Performance
• No runtime type checking = better performance
• Leverage shape information for optimization

User experience
• Catch errors early rather than late in the process
Myia’s Types
Scalars: Int/UInt/Float<8/16/32/64>, Bool
Tuples: Tuple<T1, T2, …>
• Heterogeneously typed, static length
Lists: List<T>
• Homogeneously typed, dynamic length, fast append
Arrays: Array<T, Shape<D1, D2, …>>
• Homogeneously typed, shape part of the type
Functions: Function<Args<TIn1, TIn2, …>, TOut>
Struct types are reduced to tuples in pre-processing.
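The reduction of structs to tuples can be pictured with plain Python dataclasses. This is a sketch of the idea only: `Point`, `struct_to_tuple`, and `norm_squared` are hypothetical names, and nothing here reflects how Myia performs the pre-processing internally.

```python
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class Point:
    x: float
    y: float

def struct_to_tuple(obj):
    # Replace named fields by positions, in declaration order.
    return tuple(getattr(obj, f.name) for f in fields(obj))

def norm_squared(p):
    # Code operating on the tuple form: field access becomes indexing.
    return p[0] * p[0] + p[1] * p[1]

as_tuple = struct_to_tuple(Point(3.0, 4.0))
value = norm_squared(as_tuple)
```

After this step, the rest of a pipeline would only need to handle tuples, which keeps the set of core types small.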
Why inference?
Annotations are annoying
• Polymorphic types are awkward to express
• Function types are awkward to express
• Impede rapid prototyping
• Duck typing is more natural
• This is why people like Python

Type/shape inference
• Infer types and shapes from the input types at the entry point
• Implicit polymorphism
• Feels dynamic
• Functions are re-compiled when they are given new input types
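Re-compiling a function whenever it sees new input types amounts to a cache keyed on the type signature. The sketch below shows that idea with a hypothetical `Specializing` decorator. The "compiled" version is just the original Python function standing in for real generated code.

```python
class Specializing:
    """Cache one 'compiled' version of a function per input type signature."""
    def __init__(self, fn):
        self.fn = fn
        self.cache = {}
        self.compile_count = 0

    def __call__(self, *args):
        signature = tuple(type(a) for a in args)
        if signature not in self.cache:
            # A real system would run inference and compilation here.
            self.compile_count += 1
            self.cache[signature] = self.fn
        return self.cache[signature](*args)

@Specializing
def double(x):
    return x + x

# Two calls with int share one version; the float call triggers a second one.
results = [double(2), double(3), double(1.5)]
```

From the user's point of view the function still feels dynamic, while each cached version can be fully typed.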
Myia’s inference pipeline
1. Transform inputs into abstract inputs
• Represent type and shape, with no concrete values
• More types: structs, polymorphic functions

2. Run abstract interpreter on abstract inputs
• Bounded input signatures for each function
• Recursive functions become fixed points

3. Specialize functions to their possible signatures
• If function called with int, make int version, etc.
• Higher order uses require signature uniqueness

4. Update or re-run inference after optimizations or AD
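A toy version of steps 1 and 2 is sketched below: inputs are replaced by abstract values (here just type names), and an interpreter propagates them through an expression without computing any concrete result. The tuple-based expression language and the `infer` function are assumptions invented for this example. Recursion, shapes, and specialization are left out.

```python
def infer(expr, env):
    """Return the abstract type (a type name) of an expression."""
    kind = expr[0]
    if kind == "const":
        # Constants are abstracted to the name of their Python type.
        return type(expr[1]).__name__
    if kind == "var":
        # Variables carry the abstract type given at the entry point.
        return env[expr[1]]
    if kind in ("add", "mul"):
        left = infer(expr[1], env)
        right = infer(expr[2], env)
        if left != right:
            # Reported before anything is executed.
            raise TypeError(f"{kind}: operand types differ ({left} vs {right})")
        return left
    raise ValueError(f"unknown expression kind: {kind}")

# x * x + 1, with the constant 1 being an int.
expr = ("add", ("mul", ("var", "x"), ("var", "x")), ("const", 1))
int_version = infer(expr, {"x": "int"})
```

Calling `infer(expr, {"x": "float"})` raises a `TypeError` in this strict toy system, because the float product is added to an int constant. The error surfaces at inference time rather than at run time.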
Error reporting
Abstract inferrer shows compile-time tracebacks for type/shape errors.
Debugging
Tracking correspondence to source code
• Through parsing
• Through optimization
• Through automatic differentiation
• Through macros/code generation

Debugging tools we need
• Custom debugger for step by step execution
• Watching variables and gradients
• Breakpoints that trigger during the reverse phase
• Profiling and reporting which parts of the code are “hot”
In Conclusion: Myia’s focus
General purpose, including recursion
Automatic differentiation
• Code transform
• Optimizable, higher order gradients

Type and shape inference
• Can handle duck typed code

Good debugging facilities
• Step debugger, profiling
• Gradient debugging
⭐ us on GitHub: https://github.com/mila-iqia/myia