38
Photos placed in horizontal position with even amount of white space between photos and header Photos placed in horizontal position with even amount of white space between photos and header Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia LLC, a wholly owned subsidiary of Honeywell International Inc. for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525. SAND2019-10588 C High performance tensor decomposition on emerging manycore architectures Eric Phipps ([email protected]) and Tammy Kolda, Sandia National Laboratories Albuquerque New Mexico and Livermore California, USA AI and Tensor Factorization for Physical Chemical and Biological Systems September 17-20, 2019

High performance tensor decomposition on emerging

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: High performance tensor decomposition on emerging

Photos placed in horizontal position with even amount

of white spacebetween photos

and header

Photos placed in horizontal position

with even amount of white space

between photos and header

Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia LLC, a wholly owned subsidiary of Honeywell International Inc. for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525. SAND2019-10588 C

High performance tensor decomposition on emerging

manycore architectures

Eric Phipps ([email protected])and Tammy Kolda,Sandia National LaboratoriesAlbuquerque New Mexico and Livermore California, USA

AI and Tensor Factorization for Physical Chemical and Biological Systems

September 17-20, 2019

Page 2: High performance tensor decomposition on emerging

⇡ = + + · · ·+

§ N-way array used to represent multi-relationship data§ E.g., word frequencies in Amazon product rankings:

§ Canonical Polyadic (CP) tensor decomposition§ Approximate tensor as a sum of rank-1 tensors§ Discovers dominant relationships in data

Tensors

user

item

word

X

x(i, j, k) ⇡ m(i, j, k) =RX

l=1

a(i, l)b(j, l)c(k, l)<latexit sha1_base64="VVvAxqfDXMFNesPQ6MQZ+9Grtrw=">AAADF3icbVFNa9RAGJ6NXzV+bfXoZTAUtlKXJBf1UCgI4kWo0rWFJC6TyezuNPMRZybadcgf8eofEQTx6tGf4FX/gLNpKk3XF2Z4eN7n/c4rRrUJw58D79LlK1evbVz3b9y8dfvOcPPuGy1rhckESybVUY40YVSQiaGGkaNKEcRzRg7z8tnKf/ieKE2lODDLimQczQWdUYyMo6bDLM3JnApL3tUt87DxT0Z053in3IYpqiolTyA/I3Zhqms+tWw3at6+hsjxbDsfHbsfj0r3+ykRxblc02EQjsPW4DqIOhCAzvanm4PPaSFxzYkwmCGtkyisTGaRMhQz0vhprUmFcInmJHFQIE50Zts9NHDLMQWcSeWeMLBlz0dYxLVe8twpOTILfdG3Iv/nS2oze5JZKqraEIFPC81qBo2Eq6XCgiqCDVs6gLCirleIF0ghbNzq+1XYXDrBglPcG8YaWn58hIv+hG0nWGWW1BgxVxc+d8OlHd2Lz/k/d879rTSXrFjp/FSQD1hyjtxl0oMmibMks05jg+gskQ3ipmn6ypfrSsU7oTtrdPGI62ASj5+Ow1dxsBd3990A98EDMAIReAz2wAuwDyYAg6/gF/gN/nifvC/eN+/7qdQbdDH3QM+8H38BYi389A==</latexit><latexit sha1_base64="VVvAxqfDXMFNesPQ6MQZ+9Grtrw=">AAADF3icbVFNa9RAGJ6NXzV+bfXoZTAUtlKXJBf1UCgI4kWo0rWFJC6TyezuNPMRZybadcgf8eofEQTx6tGf4FX/gLNpKk3XF2Z4eN7n/c4rRrUJw58D79LlK1evbVz3b9y8dfvOcPPuGy1rhckESybVUY40YVSQiaGGkaNKEcRzRg7z8tnKf/ieKE2lODDLimQczQWdUYyMo6bDLM3JnApL3tUt87DxT0Z053in3IYpqiolTyA/I3Zhqms+tWw3at6+hsjxbDsfHbsfj0r3+ykRxblc02EQjsPW4DqIOhCAzvanm4PPaSFxzYkwmCGtkyisTGaRMhQz0vhprUmFcInmJHFQIE50Zts9NHDLMQWcSeWeMLBlz0dYxLVe8twpOTILfdG3Iv/nS2oze5JZKqraEIFPC81qBo2Eq6XCgiqCDVs6gLCirleIF0ghbNzq+1XYXDrBglPcG8YaWn58hIv+hG0nWGWW1BgxVxc+d8OlHd2Lz/k/d879rTSXrFjp/FSQD1hyjtxl0oMmibMks05jg+gskQ3ipmn6ypfrSsU7oTtrdPGI62ASj5+Ow1dxsBd3990A98EDMAIReAz2wAuwDyYAg6/gF/gN/nifvC/eN+/7qdQbdDH3QM+8H38BYi389A==</latexit><latexit sha1_base64="VVvAxqfDXMFNesPQ6MQZ+9Grtrw=">AAADF3icbVFNa9RAGJ6NXzV+bfXoZTAUtlKXJBf1UCgI4kWo0rWFJC6TyezuNPMRZybadcgf8eofEQTx6tGf4FX/gLNpKk3XF2Z4eN7n/c4rRrUJw58D79LlK1evbVz3b9y8dfvOcPPuGy1rhckESybVUY40YVSQiaGGkaNKEcRzRg7z8tnKf/ieKE2lODDLimQczQWdUYyMo6bDLM3JnApL3tUt87DxT0Z053in3IYpqiolTyA/I3Zhqms+tWw3at6+hsjxbDsfHbsfj0r3+ykRxblc02EQjsPW4DqIOhCAzvanm4PPaSFxzYkwmCGtkyisTGaRMhQz0vhprUmFcInmJHFQIE50Zts9NHDLMQWcSeWeMLBlz0dYxLVe8twpOTILfdG3Iv/nS2oze5JZKqraEIFPC81qBo2Eq6XCgiqCDVs6gLCirleIF0ghbNzq+1XYXDrBglPcG8YaWn58hIv+hG0nWGWW1BgxVxc+d8OlHd2Lz/k/d879rTSXrFjp/FSQD1hyjtxl0oMmibMks05jg+gskQ3ipmn6ypfrSsU7oTtrdPGI62ASj5+Ow1dxsBd3990A98EDMAIReAz2wAuwDyYAg6/gF/gN/nifvC/eN+/7qdQbdDH3QM+8H38BYi389A==</latexit>

a1<latexit sha1_base64="bZLlsxujpIq12ZHqnkAAJa+MPwo=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMqTNbBTEoLIlZrYxjLajzsKLzmjhDMnC9mabq7/iTLNagnJMUGt9iq2BaYKtuLZQUXZEC5iiqxC2ie/macItRLIw1wafcmGHBrEFdxquq7YfX/IMQqUd2GaJdCArQR34lgkrzN9MEeAqCdqargQJPoOc1sJhXD9I0WNedNPauUy1sIi7eRCX+HMGg+vKV1neJpU6g52JhbdawXL5rlesbSQVoYNPrvFpsdMEwdLAnkrb5sdBJXWlPc+14GXctHb588RzVdXYKjv9p7zGOjpsdxtm3ABzYo4OZYY7zkJWUkOZw+77VUShUVBKzpoe4fjRyVOW9cBp1wkziYeaUYF1w1e4m3gB9+JTeUanEteMn5i1uiBW8JFpKSmeVLzfTMfJNGkvzQ9H/xP54bhpmr7yzUWlkZcJ5ULYF2EHZ6ux/ATnw5WEkhq8+VBA7nYiKZ8sg4YXZYfizvDgR+fP+6IzGW+/2I7ejYe70eLy18h98oA8JCPyjOyS12SPTAgjX8gP8pv8GXwefB18G3w/la6uLGLukZ4Nfv4D0/JaqQ==</latexit><latexit sha1_base64="bZLlsxujpIq12ZHqnkAAJa+MPwo=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMqTNbBTEoLIlZrYxjLajzsKLzmjhDMnC9mabq7/iTLNagnJMUGt9iq2BaYKtuLZQUXZEC5iiqxC2ie/macItRLIw1wafcmGHBrEFdxquq7YfX/IMQqUd2GaJdCArQR34lgkrzN9MEeAqCdqargQJPoOc1sJhXD9I0WNedNPauUy1sIi7eRCX+HMGg+vKV1neJpU6g52JhbdawXL5rlesbSQVoYNPrvFpsdMEwdLAnkrb5sdBJXWlPc+14GXctHb588RzVdXYKjv9p7zGOjpsdxtm3ABzYo4OZYY7zkJWUkOZw+77VUShUVBKzpoe4fjRyVOW9cBp1wkziYeaUYF1w1e4m3gB9+JTeUanEteMn5i1uiBW8JFpKSmeVLzfTMfJNGkvzQ9H/xP54bhpmr7yzUWlkZcJ5ULYF2EHZ6ux/ATnw5WEkhq8+VBA7nYiKZ8sg4YXZYfizvDgR+fP+6IzGW+/2I7ejYe70eLy18h98oA8JCPyjOyS12SPTAgjX8gP8pv8GXwefB18G3w/la6uLGLukZ4Nfv4D0/JaqQ==</latexit><latexit sha1_base64="bZLlsxujpIq12ZHqnkAAJa+MPwo=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMqTNbBTEoLIlZrYxjLajzsKLzmjhDMnC9mabq7/iTLNagnJMUGt9iq2BaYKtuLZQUXZEC5iiqxC2ie/macItRLIw1wafcmGHBrEFdxquq7YfX/IMQqUd2GaJdCArQR34lgkrzN9MEeAqCdqargQJPoOc1sJhXD9I0WNedNPauUy1sIi7eRCX+HMGg+vKV1neJpU6g52JhbdawXL5rlesbSQVoYNPrvFpsdMEwdLAnkrb5sdBJXWlPc+14GXctHb588RzVdXYKjv9p7zGOjpsdxtm3ABzYo4OZYY7zkJWUkOZw+77VUShUVBKzpoe4fjRyVOW9cBp1wkziYeaUYF1w1e4m3gB9+JTeUanEteMn5i1uiBW8JFpKSmeVLzfTMfJNGkvzQ9H/xP54bhpmr7yzUWlkZcJ5ULYF2EHZ6ux/ATnw5WEkhq8+VBA7nYiKZ8sg4YXZYfizvDgR+fP+6IzGW+/2I7ejYe70eLy18h98oA8JCPyjOyS12SPTAgjX8gP8pv8GXwefB18G3w/la6uLGLukZ4Nfv4D0/JaqQ==</latexit>

b1<latexit sha1_base64="pNLT3wPMfHXDXOBuWbhoHTRD9yw=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMkyb2SiIQWVLzGxjGG1HnYUXndHCGZKF7c02V3/FmWa1BOWYoNb6FFsD0wRbcW2houyIFjBFVyFsE9/N04RbiGRhrg0+5cIODWIL7jRcV20/vuQZhEo7sM0S6UBWgjrwLRNWmL+ZIsBVErQ1XQkSfAY5rYXDuH6Qose86Ka1c5lqYRF38yAu8ecMBteVr7K8TSp1BjsTC2+1guXyXa9Y20gqQgefXOPTYqcJgqWBPZW2zY+DSupKe55rwcu4ae3y54nnqqqxVXb6T3mNdXTY7jbMuAHmxBwdygx3nIWspIYyh933q4hCo6CUnDU9wvGjk6cs64HTrhNmEg81owLrhq9wN/EC7sWn8oxOJa4ZPzFrdUGs4CPTUlI8qXi/mY6TadJemh+O/ifyw3HTNH3lm4tKIy8TyoWwL8IOzlZj+QnOhysJJTV486GA3O1EUj5ZBg0vyg7FneHBj86f90VnMt5+sR29Gw93o8Xlr5H75AF5SEbkGdklr8kemRBGvpAf5Df5M/g8+Dr4Nvh+Kl1dWcTcIz0b/PwH13Raqg==</latexit><latexit sha1_base64="pNLT3wPMfHXDXOBuWbhoHTRD9yw=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMkyb2SiIQWVLzGxjGG1HnYUXndHCGZKF7c02V3/FmWa1BOWYoNb6FFsD0wRbcW2houyIFjBFVyFsE9/N04RbiGRhrg0+5cIODWIL7jRcV20/vuQZhEo7sM0S6UBWgjrwLRNWmL+ZIsBVErQ1XQkSfAY5rYXDuH6Qose86Ka1c5lqYRF38yAu8ecMBteVr7K8TSp1BjsTC2+1guXyXa9Y20gqQgefXOPTYqcJgqWBPZW2zY+DSupKe55rwcu4ae3y54nnqqqxVXb6T3mNdXTY7jbMuAHmxBwdygx3nIWspIYyh933q4hCo6CUnDU9wvGjk6cs64HTrhNmEg81owLrhq9wN/EC7sWn8oxOJa4ZPzFrdUGs4CPTUlI8qXi/mY6TadJemh+O/ifyw3HTNH3lm4tKIy8TyoWwL8IOzlZj+QnOhysJJTV486GA3O1EUj5ZBg0vyg7FneHBj86f90VnMt5+sR29Gw93o8Xlr5H75AF5SEbkGdklr8kemRBGvpAf5Df5M/g8+Dr4Nvh+Kl1dWcTcIz0b/PwH13Raqg==</latexit><latexit sha1_base64="pNLT3wPMfHXDXOBuWbhoHTRD9yw=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMkyb2SiIQWVLzGxjGG1HnYUXndHCGZKF7c02V3/FmWa1BOWYoNb6FFsD0wRbcW2houyIFjBFVyFsE9/N04RbiGRhrg0+5cIODWIL7jRcV20/vuQZhEo7sM0S6UBWgjrwLRNWmL+ZIsBVErQ1XQkSfAY5rYXDuH6Qose86Ka1c5lqYRF38yAu8ecMBteVr7K8TSp1BjsTC2+1guXyXa9Y20gqQgefXOPTYqcJgqWBPZW2zY+DSupKe55rwcu4ae3y54nnqqqxVXb6T3mNdXTY7jbMuAHmxBwdygx3nIWspIYyh933q4hCo6CUnDU9wvGjk6cs64HTrhNmEg81owLrhq9wN/EC7sWn8oxOJa4ZPzFrdUGs4CPTUlI8qXi/mY6TadJemh+O/ifyw3HTNH3lm4tKIy8TyoWwL8IOzlZj+QnOhysJJTV486GA3O1EUj5ZBg0vyg7FneHBj86f90VnMt5+sR29Gw93o8Xlr5H75AF5SEbkGdklr8kemRBGvpAf5Df5M/g8+Dr4Nvh+Kl1dWcTcIz0b/PwH13Raqg==</latexit>

c1<latexit sha1_base64="57ecMXmzAHC0r+RoaNjdty+lspc=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMmTNbBTEoLIlZrYxjLajzsKLzmjhDMnC9mabq7/iTLNagnJMUGt9iq2BaYKtuLZQUXZEC5iiqxC2ie/macItRLIw1wafcmGHBrEFdxquq7YfX/IMQqUd2GaJdCArQR34lgkrzN9MEeAqCdqargQJPoOc1sJhXD9I0WNedNPauUy1sIi7eRCX+HMGg+vKV1neJpU6g52JhbdawXL5rlesbSQVoYNPrvFpsdMEwdLAnkrb5sdBJXWlPc+14GXctHb588RzVdXYKjv9p7zGOjpsdxtm3ABzYo4OZYY7zkJWUkOZw+77VUShUVBKzpoe4fjRyVOW9cBp1wkziYeaUYF1w1e4m3gB9+JTeUanEteMn5i1uiBW8JFpKSmeVLzfTMfJNGkvzQ9H/xP54bhpmr7yzUWlkZcJ5ULYF2EHZ6ux/ATnw5WEkhq8+VBA7nYiKZ8sg4YXZYfizvDgR+fP+6IzGW+/2I7ejYe70eLy18h98oA8JCPyjOyS12SPTAgjX8gP8pv8GXwefB18G3w/la6uLGLukZ4Nfv4D2vZaqw==</latexit><latexit sha1_base64="57ecMXmzAHC0r+RoaNjdty+lspc=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMmTNbBTEoLIlZrYxjLajzsKLzmjhDMnC9mabq7/iTLNagnJMUGt9iq2BaYKtuLZQUXZEC5iiqxC2ie/macItRLIw1wafcmGHBrEFdxquq7YfX/IMQqUd2GaJdCArQR34lgkrzN9MEeAqCdqargQJPoOc1sJhXD9I0WNedNPauUy1sIi7eRCX+HMGg+vKV1neJpU6g52JhbdawXL5rlesbSQVoYNPrvFpsdMEwdLAnkrb5sdBJXWlPc+14GXctHb588RzVdXYKjv9p7zGOjpsdxtm3ABzYo4OZYY7zkJWUkOZw+77VUShUVBKzpoe4fjRyVOW9cBp1wkziYeaUYF1w1e4m3gB9+JTeUanEteMn5i1uiBW8JFpKSmeVLzfTMfJNGkvzQ9H/xP54bhpmr7yzUWlkZcJ5ULYF2EHZ6ux/ATnw5WEkhq8+VBA7nYiKZ8sg4YXZYfizvDgR+fP+6IzGW+/2I7ejYe70eLy18h98oA8JCPyjOyS12SPTAgjX8gP8pv8GXwefB18G3w/la6uLGLukZ4Nfv4D2vZaqw==</latexit><latexit sha1_base64="57ecMXmzAHC0r+RoaNjdty+lspc=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMmTNbBTEoLIlZrYxjLajzsKLzmjhDMnC9mabq7/iTLNagnJMUGt9iq2BaYKtuLZQUXZEC5iiqxC2ie/macItRLIw1wafcmGHBrEFdxquq7YfX/IMQqUd2GaJdCArQR34lgkrzN9MEeAqCdqargQJPoOc1sJhXD9I0WNedNPauUy1sIi7eRCX+HMGg+vKV1neJpU6g52JhbdawXL5rlesbSQVoYNPrvFpsdMEwdLAnkrb5sdBJXWlPc+14GXctHb588RzVdXYKjv9p7zGOjpsdxtm3ABzYo4OZYY7zkJWUkOZw+77VUShUVBKzpoe4fjRyVOW9cBp1wkziYeaUYF1w1e4m3gB9+JTeUanEteMn5i1uiBW8JFpKSmeVLzfTMfJNGkvzQ9H/xP54bhpmr7yzUWlkZcJ5ULYF2EHZ6ux/ATnw5WEkhq8+VBA7nYiKZ8sg4YXZYfizvDgR+fP+6IzGW+/2I7ejYe70eLy18h98oA8JCPyjOyS12SPTAgjX8gP8pv8GXwefB18G3w/la6uLGLukZ4Nfv4D2vZaqw==</latexit>

a2<latexit sha1_base64="3QN6l7OepZ4BBf/JoX8lQSfg824=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMqTNbBzEoLIlZrYxjLajzsKLzmjhDMnC9mabq7/iTLNagnJMUGt9iq2BaYKtuLZQUXZEC5iiqxC2ie/macItRLIw1wafcmGHBrEFdxquq7YfX/IMQqUd2GaJdCArQR34lgkrzN9MEeAqCdqargQJPoOc1sJhXD9I0WNedNPauUy1sIi7eRCX+HMGg+vKV1neJpU6g52JhbdawXL5rlesbSQVoYNPrvFpsdMEwdLAnkrb5sdBJXWlPc+14GXctHb588RzVdXYKjv9p7zGOjpsdxtm3ABzYo4OZYY7zkJWUkOZw+77VUShUVBKzpoe4fjRyVOW9cBp1wkziYeaUYF1w1e4m3gB9+JTeUanEteMn5i1uiBW8JFpKSmeVLzfTMfJNGkvzQ9H/xP54bhpmr7yzUWlkZcJ5ULYF2EHZ6ux/ATnw5WEkhq8+VBA7nYiKZ8sg4YXZYfizvDgR+fP+6IzGW+/2I7ejYe70eLy18h98oA8JCPyjOyS12SPTAgjX8gP8pv8GXwefB18G3w/la6uLGLukZ4Nfv4D13Faqg==</latexit><latexit sha1_base64="3QN6l7OepZ4BBf/JoX8lQSfg824=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMqTNbBzEoLIlZrYxjLajzsKLzmjhDMnC9mabq7/iTLNagnJMUGt9iq2BaYKtuLZQUXZEC5iiqxC2ie/macItRLIw1wafcmGHBrEFdxquq7YfX/IMQqUd2GaJdCArQR34lgkrzN9MEeAqCdqargQJPoOc1sJhXD9I0WNedNPauUy1sIi7eRCX+HMGg+vKV1neJpU6g52JhbdawXL5rlesbSQVoYNPrvFpsdMEwdLAnkrb5sdBJXWlPc+14GXctHb588RzVdXYKjv9p7zGOjpsdxtm3ABzYo4OZYY7zkJWUkOZw+77VUShUVBKzpoe4fjRyVOW9cBp1wkziYeaUYF1w1e4m3gB9+JTeUanEteMn5i1uiBW8JFpKSmeVLzfTMfJNGkvzQ9H/xP54bhpmr7yzUWlkZcJ5ULYF2EHZ6ux/ATnw5WEkhq8+VBA7nYiKZ8sg4YXZYfizvDgR+fP+6IzGW+/2I7ejYe70eLy18h98oA8JCPyjOyS12SPTAgjX8gP8pv8GXwefB18G3w/la6uLGLukZ4Nfv4D13Faqg==</latexit><latexit sha1_base64="3QN6l7OepZ4BBf/JoX8lQSfg824=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMqTNbBzEoLIlZrYxjLajzsKLzmjhDMnC9mabq7/iTLNagnJMUGt9iq2BaYKtuLZQUXZEC5iiqxC2ie/macItRLIw1wafcmGHBrEFdxquq7YfX/IMQqUd2GaJdCArQR34lgkrzN9MEeAqCdqargQJPoOc1sJhXD9I0WNedNPauUy1sIi7eRCX+HMGg+vKV1neJpU6g52JhbdawXL5rlesbSQVoYNPrvFpsdMEwdLAnkrb5sdBJXWlPc+14GXctHb588RzVdXYKjv9p7zGOjpsdxtm3ABzYo4OZYY7zkJWUkOZw+77VUShUVBKzpoe4fjRyVOW9cBp1wkziYeaUYF1w1e4m3gB9+JTeUanEteMn5i1uiBW8JFpKSmeVLzfTMfJNGkvzQ9H/xP54bhpmr7yzUWlkZcJ5ULYF2EHZ6ux/ATnw5WEkhq8+VBA7nYiKZ8sg4YXZYfizvDgR+fP+6IzGW+/2I7ejYe70eLy18h98oA8JCPyjOyS12SPTAgjX8gP8pv8GXwefB18G3w/la6uLGLukZ4Nfv4D13Faqg==</latexit>

b2<latexit sha1_base64="nr/SoGwqBB6NMtha3atpqhNvlic=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMkyb2TiIQWVLzGxjGG1HnYUXndHCGZKF7c02V3/FmWa1BOWYoNb6FFsD0wRbcW2houyIFjBFVyFsE9/N04RbiGRhrg0+5cIODWIL7jRcV20/vuQZhEo7sM0S6UBWgjrwLRNWmL+ZIsBVErQ1XQkSfAY5rYXDuH6Qose86Ka1c5lqYRF38yAu8ecMBteVr7K8TSp1BjsTC2+1guXyXa9Y20gqQgefXOPTYqcJgqWBPZW2zY+DSupKe55rwcu4ae3y54nnqqqxVXb6T3mNdXTY7jbMuAHmxBwdygx3nIWspIYyh933q4hCo6CUnDU9wvGjk6cs64HTrhNmEg81owLrhq9wN/EC7sWn8oxOJa4ZPzFrdUGs4CPTUlI8qXi/mY6TadJemh+O/ifyw3HTNH3lm4tKIy8TyoWwL8IOzlZj+QnOhysJJTV486GA3O1EUj5ZBg0vyg7FneHBj86f90VnMt5+sR29Gw93o8Xlr5H75AF5SEbkGdklr8kemRBGvpAf5Df5M/g8+Dr4Nvh+Kl1dWcTcIz0b/PwH2vNaqw==</latexit><latexit sha1_base64="nr/SoGwqBB6NMtha3atpqhNvlic=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMkyb2TiIQWVLzGxjGG1HnYUXndHCGZKF7c02V3/FmWa1BOWYoNb6FFsD0wRbcW2houyIFjBFVyFsE9/N04RbiGRhrg0+5cIODWIL7jRcV20/vuQZhEo7sM0S6UBWgjrwLRNWmL+ZIsBVErQ1XQkSfAY5rYXDuH6Qose86Ka1c5lqYRF38yAu8ecMBteVr7K8TSp1BjsTC2+1guXyXa9Y20gqQgefXOPTYqcJgqWBPZW2zY+DSupKe55rwcu4ae3y54nnqqqxVXb6T3mNdXTY7jbMuAHmxBwdygx3nIWspIYyh933q4hCo6CUnDU9wvGjk6cs64HTrhNmEg81owLrhq9wN/EC7sWn8oxOJa4ZPzFrdUGs4CPTUlI8qXi/mY6TadJemh+O/ifyw3HTNH3lm4tKIy8TyoWwL8IOzlZj+QnOhysJJTV486GA3O1EUj5ZBg0vyg7FneHBj86f90VnMt5+sR29Gw93o8Xlr5H75AF5SEbkGdklr8kemRBGvpAf5Df5M/g8+Dr4Nvh+Kl1dWcTcIz0b/PwH2vNaqw==</latexit><latexit sha1_base64="nr/SoGwqBB6NMtha3atpqhNvlic=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMkyb2TiIQWVLzGxjGG1HnYUXndHCGZKF7c02V3/FmWa1BOWYoNb6FFsD0wRbcW2houyIFjBFVyFsE9/N04RbiGRhrg0+5cIODWIL7jRcV20/vuQZhEo7sM0S6UBWgjrwLRNWmL+ZIsBVErQ1XQkSfAY5rYXDuH6Qose86Ka1c5lqYRF38yAu8ecMBteVr7K8TSp1BjsTC2+1guXyXa9Y20gqQgefXOPTYqcJgqWBPZW2zY+DSupKe55rwcu4ae3y54nnqqqxVXb6T3mNdXTY7jbMuAHmxBwdygx3nIWspIYyh933q4hCo6CUnDU9wvGjk6cs64HTrhNmEg81owLrhq9wN/EC7sWn8oxOJa4ZPzFrdUGs4CPTUlI8qXi/mY6TadJemh+O/ifyw3HTNH3lm4tKIy8TyoWwL8IOzlZj+QnOhysJJTV486GA3O1EUj5ZBg0vyg7FneHBj86f90VnMt5+sR29Gw93o8Xlr5H75AF5SEbkGdklr8kemRBGvpAf5Df5M/g8+Dr4Nvh+Kl1dWcTcIz0b/PwH2vNaqw==</latexit>

c2<latexit sha1_base64="folyhAlQ30AvF13yYCLaUbC97y8=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMmTNbBzEoLIlZrYxjLajzsKLzmjhDMnC9mabq7/iTLNagnJMUGt9iq2BaYKtuLZQUXZEC5iiqxC2ie/macItRLIw1wafcmGHBrEFdxquq7YfX/IMQqUd2GaJdCArQR34lgkrzN9MEeAqCdqargQJPoOc1sJhXD9I0WNedNPauUy1sIi7eRCX+HMGg+vKV1neJpU6g52JhbdawXL5rlesbSQVoYNPrvFpsdMEwdLAnkrb5sdBJXWlPc+14GXctHb588RzVdXYKjv9p7zGOjpsdxtm3ABzYo4OZYY7zkJWUkOZw+77VUShUVBKzpoe4fjRyVOW9cBp1wkziYeaUYF1w1e4m3gB9+JTeUanEteMn5i1uiBW8JFpKSmeVLzfTMfJNGkvzQ9H/xP54bhpmr7yzUWlkZcJ5ULYF2EHZ6ux/ATnw5WEkhq8+VBA7nYiKZ8sg4YXZYfizvDgR+fP+6IzGW+/2I7ejYe70eLy18h98oA8JCPyjOyS12SPTAgjX8gP8pv8GXwefB18G3w/la6uLGLukZ4Nfv4D3nVarA==</latexit><latexit sha1_base64="folyhAlQ30AvF13yYCLaUbC97y8=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMmTNbBzEoLIlZrYxjLajzsKLzmjhDMnC9mabq7/iTLNagnJMUGt9iq2BaYKtuLZQUXZEC5iiqxC2ie/macItRLIw1wafcmGHBrEFdxquq7YfX/IMQqUd2GaJdCArQR34lgkrzN9MEeAqCdqargQJPoOc1sJhXD9I0WNedNPauUy1sIi7eRCX+HMGg+vKV1neJpU6g52JhbdawXL5rlesbSQVoYNPrvFpsdMEwdLAnkrb5sdBJXWlPc+14GXctHb588RzVdXYKjv9p7zGOjpsdxtm3ABzYo4OZYY7zkJWUkOZw+77VUShUVBKzpoe4fjRyVOW9cBp1wkziYeaUYF1w1e4m3gB9+JTeUanEteMn5i1uiBW8JFpKSmeVLzfTMfJNGkvzQ9H/xP54bhpmr7yzUWlkZcJ5ULYF2EHZ6ux/ATnw5WEkhq8+VBA7nYiKZ8sg4YXZYfizvDgR+fP+6IzGW+/2I7ejYe70eLy18h98oA8JCPyjOyS12SPTAgjX8gP8pv8GXwefB18G3w/la6uLGLukZ4Nfv4D3nVarA==</latexit><latexit sha1_base64="folyhAlQ30AvF13yYCLaUbC97y8=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMmTNbBzEoLIlZrYxjLajzsKLzmjhDMnC9mabq7/iTLNagnJMUGt9iq2BaYKtuLZQUXZEC5iiqxC2ie/macItRLIw1wafcmGHBrEFdxquq7YfX/IMQqUd2GaJdCArQR34lgkrzN9MEeAqCdqargQJPoOc1sJhXD9I0WNedNPauUy1sIi7eRCX+HMGg+vKV1neJpU6g52JhbdawXL5rlesbSQVoYNPrvFpsdMEwdLAnkrb5sdBJXWlPc+14GXctHb588RzVdXYKjv9p7zGOjpsdxtm3ABzYo4OZYY7zkJWUkOZw+77VUShUVBKzpoe4fjRyVOW9cBp1wkziYeaUYF1w1e4m3gB9+JTeUanEteMn5i1uiBW8JFpKSmeVLzfTMfJNGkvzQ9H/xP54bhpmr7yzUWlkZcJ5ULYF2EHZ6ux/ATnw5WEkhq8+VBA7nYiKZ8sg4YXZYfizvDgR+fP+6IzGW+/2I7ejYe70eLy18h98oA8JCPyjOyS12SPTAgjX8gP8pv8GXwefB18G3w/la6uLGLukZ4Nfv4D3nVarA==</latexit>

aR<latexit sha1_base64="MLPcHuqx/HW0NP4sQI0h/kf1ht0=">AAAD/3icbVLLahRBFK1kfMT2lejSTeEQEB+hZzbqIhAQxI0SJWMC081QXX27u0g92qrq6KTpjX/gX7gSd+JC0M/wb7zdGcN0kgsFl3POfdW9SSmF82H4d2V1cOnylatr14LrN27eur2+cee9M5XlMOFGGnuQMAdSaJh44SUclBaYSiTsJ4cvWn7/CKwTRu/5eQmxYrkWmeDMIzRbfxQlkAtdw4eqQx42QR0lirJm9i6IQKdLzGx9GG6FndHzzmjhDMnCdmcbq7+i1PBKgfZcMufqBFsD2wSbUeWgZPyQ5TBFVyPs4rqbp6GbiKQ0Mxaf9rRDg8iBPwk3ZdtPXYgUqDYeXLNEelClZB7qlqEl5m+mCAgdB21NX4CCOoWMVdJjXD9IsyORd9O6uUqMdIj7eRAV+HMWg6uyLtOsTapMCtsTB2+MhuXyXa9Y2yomqYdPvqmTfLsJgqWBa6Zcmx8HVcwX7izXghdx08pnz+Ja6LLCVvnJP2UV1jG03S1NhQXu5Rwdxq3wglNeMMu4x+77VWRuUFAowZse4cXh8ROe9sBp1wm3cQ0VZxLr0pe4m2gB9+ITdUonCteMn5i2uiDS8JEbpRieVLTXTMfxNG4vrR6O/ieqh+OmafrK1+eVVl0kVAthX4QdnK7GiWOcD1dCFbN481RC5rdDpR4vg1bkRYfizvDgR2fP+7wzGW893wrfjoc74eLy18g9cp88ICPylOyQV2SXTAgnX8gP8pv8GXwefB18G3w/ka6uLGLukp4Nfv4DR2Bayg==</latexit><latexit sha1_base64="MLPcHuqx/HW0NP4sQI0h/kf1ht0=">AAAD/3icbVLLahRBFK1kfMT2lejSTeEQEB+hZzbqIhAQxI0SJWMC081QXX27u0g92qrq6KTpjX/gX7gSd+JC0M/wb7zdGcN0kgsFl3POfdW9SSmF82H4d2V1cOnylatr14LrN27eur2+cee9M5XlMOFGGnuQMAdSaJh44SUclBaYSiTsJ4cvWn7/CKwTRu/5eQmxYrkWmeDMIzRbfxQlkAtdw4eqQx42QR0lirJm9i6IQKdLzGx9GG6FndHzzmjhDMnCdmcbq7+i1PBKgfZcMufqBFsD2wSbUeWgZPyQ5TBFVyPs4rqbp6GbiKQ0Mxaf9rRDg8iBPwk3ZdtPXYgUqDYeXLNEelClZB7qlqEl5m+mCAgdB21NX4CCOoWMVdJjXD9IsyORd9O6uUqMdIj7eRAV+HMWg6uyLtOsTapMCtsTB2+MhuXyXa9Y2yomqYdPvqmTfLsJgqWBa6Zcmx8HVcwX7izXghdx08pnz+Ja6LLCVvnJP2UV1jG03S1NhQXu5Rwdxq3wglNeMMu4x+77VWRuUFAowZse4cXh8ROe9sBp1wm3cQ0VZxLr0pe4m2gB9+ITdUonCteMn5i2uiDS8JEbpRieVLTXTMfxNG4vrR6O/ieqh+OmafrK1+eVVl0kVAthX4QdnK7GiWOcD1dCFbN481RC5rdDpR4vg1bkRYfizvDgR2fP+7wzGW893wrfjoc74eLy18g9cp88ICPylOyQV2SXTAgnX8gP8pv8GXwefB18G3w/ka6uLGLukp4Nfv4DR2Bayg==</latexit><latexit sha1_base64="MLPcHuqx/HW0NP4sQI0h/kf1ht0=">AAAD/3icbVLLahRBFK1kfMT2lejSTeEQEB+hZzbqIhAQxI0SJWMC081QXX27u0g92qrq6KTpjX/gX7gSd+JC0M/wb7zdGcN0kgsFl3POfdW9SSmF82H4d2V1cOnylatr14LrN27eur2+cee9M5XlMOFGGnuQMAdSaJh44SUclBaYSiTsJ4cvWn7/CKwTRu/5eQmxYrkWmeDMIzRbfxQlkAtdw4eqQx42QR0lirJm9i6IQKdLzGx9GG6FndHzzmjhDMnCdmcbq7+i1PBKgfZcMufqBFsD2wSbUeWgZPyQ5TBFVyPs4rqbp6GbiKQ0Mxaf9rRDg8iBPwk3ZdtPXYgUqDYeXLNEelClZB7qlqEl5m+mCAgdB21NX4CCOoWMVdJjXD9IsyORd9O6uUqMdIj7eRAV+HMWg6uyLtOsTapMCtsTB2+MhuXyXa9Y2yomqYdPvqmTfLsJgqWBa6Zcmx8HVcwX7izXghdx08pnz+Ja6LLCVvnJP2UV1jG03S1NhQXu5Rwdxq3wglNeMMu4x+77VWRuUFAowZse4cXh8ROe9sBp1wm3cQ0VZxLr0pe4m2gB9+ITdUonCteMn5i2uiDS8JEbpRieVLTXTMfxNG4vrR6O/ieqh+OmafrK1+eVVl0kVAthX4QdnK7GiWOcD1dCFbN481RC5rdDpR4vg1bkRYfizvDgR2fP+7wzGW893wrfjoc74eLy18g9cp88ICPylOyQV2SXTAgnX8gP8pv8GXwefB18G3w/ka6uLGLukp4Nfv4DR2Bayg==</latexit>

bR<latexit sha1_base64="RJWEYvOrWTNAQ3RLR3Oiz6URbhw=">AAAD/3icbVLLahRBFK1kfMT2lejSTeEQEB+hZzbqIhAQxI0SJWMC081QXX27u0g92qrq6KTpjX/gX7gSd+JC0M/wb7zdGcN0kgsFl3POfdW9SSmF82H4d2V1cOnylatr14LrN27eur2+cee9M5XlMOFGGnuQMAdSaJh44SUclBaYSiTsJ4cvWn7/CKwTRu/5eQmxYrkWmeDMIzRbfxQlkAtdw4eqQx42QR0liibN7F0QgU6XmNn6MNwKO6PnndHCGZKF7c42Vn9FqeGVAu25ZM7VCbYGtgk2o8pByfghy2GKrkbYxXU3T0M3EUlpZiw+7WmHBpEDfxJuyrafuhApUG08uGaJ9KBKyTzULUNLzN9MERA6DtqavgAFdQoZq6THuH6QZkci76Z1c5UY6RD38yAq8OcsBldlXaZZm1SZFLYnDt4YDcvlu16xtlVMUg+ffFMn+XYTBEsD10y5Nj8Oqpgv3FmuBS/ippXPnsW10GWFrfKTf8oqrGNou1uaCgvcyzk6jFvhBae8YJZxj933q8jcoKBQgjc9wovD4yc87YHTrhNu4xoqziTWpS9xN9EC7sUn6pROFK4ZPzFtdUGk4SM3SjE8qWivmY7jadxeWj0c/U9UD8dN0/SVr88rrbpIqBbCvgg7OF2NE8c4H66EKmbx5qmEzG+HSj1eBq3Iiw7FneHBj86e93lnMt56vhW+HQ93wsXlr5F75D55QEbkKdkhr8gumRBOvpAf5Df5M/g8+Dr4Nvh+Il1dWcTcJT0b/PwHSuJayw==</latexit><latexit sha1_base64="RJWEYvOrWTNAQ3RLR3Oiz6URbhw=">AAAD/3icbVLLahRBFK1kfMT2lejSTeEQEB+hZzbqIhAQxI0SJWMC081QXX27u0g92qrq6KTpjX/gX7gSd+JC0M/wb7zdGcN0kgsFl3POfdW9SSmF82H4d2V1cOnylatr14LrN27eur2+cee9M5XlMOFGGnuQMAdSaJh44SUclBaYSiTsJ4cvWn7/CKwTRu/5eQmxYrkWmeDMIzRbfxQlkAtdw4eqQx42QR0liibN7F0QgU6XmNn6MNwKO6PnndHCGZKF7c42Vn9FqeGVAu25ZM7VCbYGtgk2o8pByfghy2GKrkbYxXU3T0M3EUlpZiw+7WmHBpEDfxJuyrafuhApUG08uGaJ9KBKyTzULUNLzN9MERA6DtqavgAFdQoZq6THuH6QZkci76Z1c5UY6RD38yAq8OcsBldlXaZZm1SZFLYnDt4YDcvlu16xtlVMUg+ffFMn+XYTBEsD10y5Nj8Oqpgv3FmuBS/ippXPnsW10GWFrfKTf8oqrGNou1uaCgvcyzk6jFvhBae8YJZxj933q8jcoKBQgjc9wovD4yc87YHTrhNu4xoqziTWpS9xN9EC7sUn6pROFK4ZPzFtdUGk4SM3SjE8qWivmY7jadxeWj0c/U9UD8dN0/SVr88rrbpIqBbCvgg7OF2NE8c4H66EKmbx5qmEzG+HSj1eBq3Iiw7FneHBj86e93lnMt56vhW+HQ93wsXlr5F75D55QEbkKdkhr8gumRBOvpAf5Df5M/g8+Dr4Nvh+Il1dWcTcJT0b/PwHSuJayw==</latexit><latexit sha1_base64="RJWEYvOrWTNAQ3RLR3Oiz6URbhw=">AAAD/3icbVLLahRBFK1kfMT2lejSTeEQEB+hZzbqIhAQxI0SJWMC081QXX27u0g92qrq6KTpjX/gX7gSd+JC0M/wb7zdGcN0kgsFl3POfdW9SSmF82H4d2V1cOnylatr14LrN27eur2+cee9M5XlMOFGGnuQMAdSaJh44SUclBaYSiTsJ4cvWn7/CKwTRu/5eQmxYrkWmeDMIzRbfxQlkAtdw4eqQx42QR0liibN7F0QgU6XmNn6MNwKO6PnndHCGZKF7c42Vn9FqeGVAu25ZM7VCbYGtgk2o8pByfghy2GKrkbYxXU3T0M3EUlpZiw+7WmHBpEDfxJuyrafuhApUG08uGaJ9KBKyTzULUNLzN9MERA6DtqavgAFdQoZq6THuH6QZkci76Z1c5UY6RD38yAq8OcsBldlXaZZm1SZFLYnDt4YDcvlu16xtlVMUg+ffFMn+XYTBEsD10y5Nj8Oqpgv3FmuBS/ippXPnsW10GWFrfKTf8oqrGNou1uaCgvcyzk6jFvhBae8YJZxj933q8jcoKBQgjc9wovD4yc87YHTrhNu4xoqziTWpS9xN9EC7sUn6pROFK4ZPzFtdUGk4SM3SjE8qWivmY7jadxeWj0c/U9UD8dN0/SVr88rrbpIqBbCvgg7OF2NE8c4H66EKmbx5qmEzG+HSj1eBq3Iiw7FneHBj86e93lnMt56vhW+HQ93wsXlr5F75D55QEbkKdkhr8gumRBOvpAf5Df5M/g8+Dr4Nvh+Il1dWcTcJT0b/PwHSuJayw==</latexit>

cR<latexit sha1_base64="m1+JF5OlXgpsTj+DIsv/FR3m2OU=">AAAD/3icbVLLahRBFK1kfMT2lejSTeEQEB+hZzbqIhAQxI0SJWMC081QXX27u0g92qrq6KTpjX/gX7gSd+JC0M/wb7zdGcN0kgsFl3POfdW9SSmF82H4d2V1cOnylatr14LrN27eur2+cee9M5XlMOFGGnuQMAdSaJh44SUclBaYSiTsJ4cvWn7/CKwTRu/5eQmxYrkWmeDMIzRbfxQlkAtdw4eqQx42QR0livJm9i6IQKdLzGx9GG6FndHzzmjhDMnCdmcbq7+i1PBKgfZcMufqBFsD2wSbUeWgZPyQ5TBFVyPs4rqbp6GbiKQ0Mxaf9rRDg8iBPwk3ZdtPXYgUqDYeXLNEelClZB7qlqEl5m+mCAgdB21NX4CCOoWMVdJjXD9IsyORd9O6uUqMdIj7eRAV+HMWg6uyLtOsTapMCtsTB2+MhuXyXa9Y2yomqYdPvqmTfLsJgqWBa6Zcmx8HVcwX7izXghdx08pnz+Ja6LLCVvnJP2UV1jG03S1NhQXu5Rwdxq3wglNeMMu4x+77VWRuUFAowZse4cXh8ROe9sBp1wm3cQ0VZxLr0pe4m2gB9+ITdUonCteMn5i2uiDS8JEbpRieVLTXTMfxNG4vrR6O/ieqh+OmafrK1+eVVl0kVAthX4QdnK7GiWOcD1dCFbN481RC5rdDpR4vg1bkRYfizvDgR2fP+7wzGW893wrfjoc74eLy18g9cp88ICPylOyQV2SXTAgnX8gP8pv8GXwefB18G3w/ka6uLGLukp4Nfv4DTmRazA==</latexit><latexit sha1_base64="m1+JF5OlXgpsTj+DIsv/FR3m2OU=">AAAD/3icbVLLahRBFK1kfMT2lejSTeEQEB+hZzbqIhAQxI0SJWMC081QXX27u0g92qrq6KTpjX/gX7gSd+JC0M/wb7zdGcN0kgsFl3POfdW9SSmF82H4d2V1cOnylatr14LrN27eur2+cee9M5XlMOFGGnuQMAdSaJh44SUclBaYSiTsJ4cvWn7/CKwTRu/5eQmxYrkWmeDMIzRbfxQlkAtdw4eqQx42QR0livJm9i6IQKdLzGx9GG6FndHzzmjhDMnCdmcbq7+i1PBKgfZcMufqBFsD2wSbUeWgZPyQ5TBFVyPs4rqbp6GbiKQ0Mxaf9rRDg8iBPwk3ZdtPXYgUqDYeXLNEelClZB7qlqEl5m+mCAgdB21NX4CCOoWMVdJjXD9IsyORd9O6uUqMdIj7eRAV+HMWg6uyLtOsTapMCtsTB2+MhuXyXa9Y2yomqYdPvqmTfLsJgqWBa6Zcmx8HVcwX7izXghdx08pnz+Ja6LLCVvnJP2UV1jG03S1NhQXu5Rwdxq3wglNeMMu4x+77VWRuUFAowZse4cXh8ROe9sBp1wm3cQ0VZxLr0pe4m2gB9+ITdUonCteMn5i2uiDS8JEbpRieVLTXTMfxNG4vrR6O/ieqh+OmafrK1+eVVl0kVAthX4QdnK7GiWOcD1dCFbN481RC5rdDpR4vg1bkRYfizvDgR2fP+7wzGW893wrfjoc74eLy18g9cp88ICPylOyQV2SXTAgnX8gP8pv8GXwefB18G3w/ka6uLGLukp4Nfv4DTmRazA==</latexit><latexit sha1_base64="m1+JF5OlXgpsTj+DIsv/FR3m2OU=">AAAD/3icbVLLahRBFK1kfMT2lejSTeEQEB+hZzbqIhAQxI0SJWMC081QXX27u0g92qrq6KTpjX/gX7gSd+JC0M/wb7zdGcN0kgsFl3POfdW9SSmF82H4d2V1cOnylatr14LrN27eur2+cee9M5XlMOFGGnuQMAdSaJh44SUclBaYSiTsJ4cvWn7/CKwTRu/5eQmxYrkWmeDMIzRbfxQlkAtdw4eqQx42QR0livJm9i6IQKdLzGx9GG6FndHzzmjhDMnCdmcbq7+i1PBKgfZcMufqBFsD2wSbUeWgZPyQ5TBFVyPs4rqbp6GbiKQ0Mxaf9rRDg8iBPwk3ZdtPXYgUqDYeXLNEelClZB7qlqEl5m+mCAgdB21NX4CCOoWMVdJjXD9IsyORd9O6uUqMdIj7eRAV+HMWg6uyLtOsTapMCtsTB2+MhuXyXa9Y2yomqYdPvqmTfLsJgqWBa6Zcmx8HVcwX7izXghdx08pnz+Ja6LLCVvnJP2UV1jG03S1NhQXu5Rwdxq3wglNeMMu4x+77VWRuUFAowZse4cXh8ROe9sBp1wm3cQ0VZxLr0pe4m2gB9+ITdUonCteMn5i2uiDS8JEbpRieVLTXTMfxNG4vrR6O/ieqh+OmafrK1+eVVl0kVAthX4QdnK7GiWOcD1dCFbN481RC5rdDpR4vg1bkRYfizvDgR2fP+7wzGW893wrfjoc74eLy18g9cp88ICPylOyQV2SXTAgnX8gP8pv8GXwefB18G3w/ka6uLGLukp4Nfv4DTmRazA==</latexit>

X<latexit sha1_base64="Hu9fA9dopUmLwc0ves7vPFyLIvk=">AAAD+3icbVJba9RAFJ52vdR4aauPvgSXooiW7L6oD4WCIL4oFbq2sAllMjlJhp1LnJlUt0P8B/4Ln8Q38c3+EP+NJ+laNm0HEg7f9537SSvBrYuivyurg2vXb9xcuxXcvnP33vrG5v2PVteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkM5et/zBMRjLtdp38woSSQvFc86oQ+ho43GcQsGVh091hzxtgnjfH+IfVLaEHm0Mo+2oe+FlY7QwhmTx9o42V0/jTLNagnJMUGt9imWBaYKtuLZQUTajBUzRVAjbxHe9NOEWIlmYa4OfcmGHBrEFd+auq7YeX/IMQqUd2GaJdCArQR34lgkrjN9MEeAqCdqcrgQJPoOc1sKhX99J0WNedN3auUy1sIi7eRCXODWDznXlqyxvg0qdwc7EwnutYDl9VyvmNpKK0MEX1/i02GmCYKlhT6Vt42OjkrrSXuRa8CpuWrv8ZeK5qmoslZ3NKa8xjw7bvYYZN8CcmKNBmeGOs5CV1FDmsPp+FlFoFJSSs6ZHOD47ec6yHjjtKmEm8VAzKjBv+AZ3Ey/gnn8qz+lU4ppxiFmrC2IFn5mWkuJJxfvNdJxME48aPxz9D+SH46Zp+sp3l5VGXiWUC2FfhBWcr8byE+wPVxJKavDeQwG524mkfLYMGl6UHYo7w4MfXTzvy8ZkvP1qO/owHu5Gi8tfIw/JI/KEjMgLskvekj0yIYx8I7/IH3I6+Dr4Pvgx+HkmXV1Z+DwgvTf4/Q/cEFlN</latexit><latexit sha1_base64="Hu9fA9dopUmLwc0ves7vPFyLIvk=">AAAD+3icbVJba9RAFJ52vdR4aauPvgSXooiW7L6oD4WCIL4oFbq2sAllMjlJhp1LnJlUt0P8B/4Ln8Q38c3+EP+NJ+laNm0HEg7f9537SSvBrYuivyurg2vXb9xcuxXcvnP33vrG5v2PVteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkM5et/zBMRjLtdp38woSSQvFc86oQ+ho43GcQsGVh091hzxtgnjfH+IfVLaEHm0Mo+2oe+FlY7QwhmTx9o42V0/jTLNagnJMUGt9imWBaYKtuLZQUTajBUzRVAjbxHe9NOEWIlmYa4OfcmGHBrEFd+auq7YeX/IMQqUd2GaJdCArQR34lgkrjN9MEeAqCdqcrgQJPoOc1sKhX99J0WNedN3auUy1sIi7eRCXODWDznXlqyxvg0qdwc7EwnutYDl9VyvmNpKK0MEX1/i02GmCYKlhT6Vt42OjkrrSXuRa8CpuWrv8ZeK5qmoslZ3NKa8xjw7bvYYZN8CcmKNBmeGOs5CV1FDmsPp+FlFoFJSSs6ZHOD47ec6yHjjtKmEm8VAzKjBv+AZ3Ey/gnn8qz+lU4ppxiFmrC2IFn5mWkuJJxfvNdJxME48aPxz9D+SH46Zp+sp3l5VGXiWUC2FfhBWcr8byE+wPVxJKavDeQwG524mkfLYMGl6UHYo7w4MfXTzvy8ZkvP1qO/owHu5Gi8tfIw/JI/KEjMgLskvekj0yIYx8I7/IH3I6+Dr4Pvgx+HkmXV1Z+DwgvTf4/Q/cEFlN</latexit><latexit sha1_base64="Hu9fA9dopUmLwc0ves7vPFyLIvk=">AAAD+3icbVJba9RAFJ52vdR4aauPvgSXooiW7L6oD4WCIL4oFbq2sAllMjlJhp1LnJlUt0P8B/4Ln8Q38c3+EP+NJ+laNm0HEg7f9537SSvBrYuivyurg2vXb9xcuxXcvnP33vrG5v2PVteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkM5et/zBMRjLtdp38woSSQvFc86oQ+ho43GcQsGVh091hzxtgnjfH+IfVLaEHm0Mo+2oe+FlY7QwhmTx9o42V0/jTLNagnJMUGt9imWBaYKtuLZQUTajBUzRVAjbxHe9NOEWIlmYa4OfcmGHBrEFd+auq7YeX/IMQqUd2GaJdCArQR34lgkrjN9MEeAqCdqcrgQJPoOc1sKhX99J0WNedN3auUy1sIi7eRCXODWDznXlqyxvg0qdwc7EwnutYDl9VyvmNpKK0MEX1/i02GmCYKlhT6Vt42OjkrrSXuRa8CpuWrv8ZeK5qmoslZ3NKa8xjw7bvYYZN8CcmKNBmeGOs5CV1FDmsPp+FlFoFJSSs6ZHOD47ec6yHjjtKmEm8VAzKjBv+AZ3Ey/gnn8qz+lU4ppxiFmrC2IFn5mWkuJJxfvNdJxME48aPxz9D+SH46Zp+sp3l5VGXiWUC2FfhBWcr8byE+wPVxJKavDeQwG524mkfLYMGl6UHYo7w4MfXTzvy8ZkvP1qO/owHu5Gi8tfIw/JI/KEjMgLskvekj0yIYx8I7/IH3I6+Dr4Pvgx+HkmXV1Z+DwgvTf4/Q/cEFlN</latexit>

M<latexit sha1_base64="fQXu8n+2Bs6NOHe7rfWSYJA1mSQ=">AAAD+3icbVJba9RAFJ52vdR4a/XRl+BSFNGS3Rf1oVAQxJdKha4tbEKZTE6SYecSZybV7RD/gf/CJ/FNfLM/xH/jSbqWTduBhMP3fed+0kpw66Lo78rq4Nr1GzfXbgW379y9d39948FHq2vDYMK00OYwpRYEVzBx3Ak4rAxQmQo4SGdvWv7gGIzlWu27eQWJpIXiOWfUIXS0/iROoeDKw6e6Q541Qbzvd/EPKltCj9aH0VbUvfCyMVoYQ7J4e0cbq6dxplktQTkmqLU+xbLANMFmXFuoKJvRAqZoKoRt4rtemnATkSzMtcFPubBDg9iCO3PXVVuPL3kGodIObLNEOpCVoA58y4QVxm+mCHCVBG1OV4IEn0FOa+HQr++k6DEvum7tXKZaWMTdPIhLnJpB57ryVZa3QaXOYHti4b1WsJy+qxVzG0lF6OCLa3xabDdBsNSwp9K28bFRSV1pL3IteBU3rV3+KvFcVTWWys7mlNeYR4ftXsOMG2BOzNGgzHDHWchKaihzWH0/iyg0CkrJWdMjHJ+dvGBZD5x2lTCTeKgZFZg3fIu7iRdwzz+V53Qqcc04xKzVBbGCz0xLSfGk4v1mOk6miUeNH47+B/LDcdM0feXuZaWRVwnlQtgXYQXnq7H8BPvDlYSSGrz3UEDutiMpny+Dhhdlh+LO8OBHF8/7sjEZb73eij6MhzvR4vLXyCPymDwlI/KS7JB3ZI9MCCPfyC/yh5wOvg6+D34Mfp5JV1cWPg9J7w1+/wO1kFlC</latexit><latexit sha1_base64="fQXu8n+2Bs6NOHe7rfWSYJA1mSQ=">AAAD+3icbVJba9RAFJ52vdR4a/XRl+BSFNGS3Rf1oVAQxJdKha4tbEKZTE6SYecSZybV7RD/gf/CJ/FNfLM/xH/jSbqWTduBhMP3fed+0kpw66Lo78rq4Nr1GzfXbgW379y9d39948FHq2vDYMK00OYwpRYEVzBx3Ak4rAxQmQo4SGdvWv7gGIzlWu27eQWJpIXiOWfUIXS0/iROoeDKw6e6Q541Qbzvd/EPKltCj9aH0VbUvfCyMVoYQ7J4e0cbq6dxplktQTkmqLU+xbLANMFmXFuoKJvRAqZoKoRt4rtemnATkSzMtcFPubBDg9iCO3PXVVuPL3kGodIObLNEOpCVoA58y4QVxm+mCHCVBG1OV4IEn0FOa+HQr++k6DEvum7tXKZaWMTdPIhLnJpB57ryVZa3QaXOYHti4b1WsJy+qxVzG0lF6OCLa3xabDdBsNSwp9K28bFRSV1pL3IteBU3rV3+KvFcVTWWys7mlNeYR4ftXsOMG2BOzNGgzHDHWchKaihzWH0/iyg0CkrJWdMjHJ+dvGBZD5x2lTCTeKgZFZg3fIu7iRdwzz+V53Qqcc04xKzVBbGCz0xLSfGk4v1mOk6miUeNH47+B/LDcdM0feXuZaWRVwnlQtgXYQXnq7H8BPvDlYSSGrz3UEDutiMpny+Dhhdlh+LO8OBHF8/7sjEZb73eij6MhzvR4vLXyCPymDwlI/KS7JB3ZI9MCCPfyC/yh5wOvg6+D34Mfp5JV1cWPg9J7w1+/wO1kFlC</latexit><latexit sha1_base64="fQXu8n+2Bs6NOHe7rfWSYJA1mSQ=">AAAD+3icbVJba9RAFJ52vdR4a/XRl+BSFNGS3Rf1oVAQxJdKha4tbEKZTE6SYecSZybV7RD/gf/CJ/FNfLM/xH/jSbqWTduBhMP3fed+0kpw66Lo78rq4Nr1GzfXbgW379y9d39948FHq2vDYMK00OYwpRYEVzBx3Ak4rAxQmQo4SGdvWv7gGIzlWu27eQWJpIXiOWfUIXS0/iROoeDKw6e6Q541Qbzvd/EPKltCj9aH0VbUvfCyMVoYQ7J4e0cbq6dxplktQTkmqLU+xbLANMFmXFuoKJvRAqZoKoRt4rtemnATkSzMtcFPubBDg9iCO3PXVVuPL3kGodIObLNEOpCVoA58y4QVxm+mCHCVBG1OV4IEn0FOa+HQr++k6DEvum7tXKZaWMTdPIhLnJpB57ryVZa3QaXOYHti4b1WsJy+qxVzG0lF6OCLa3xabDdBsNSwp9K28bFRSV1pL3IteBU3rV3+KvFcVTWWys7mlNeYR4ftXsOMG2BOzNGgzHDHWchKaihzWH0/iyg0CkrJWdMjHJ+dvGBZD5x2lTCTeKgZFZg3fIu7iRdwzz+V53Qqcc04xKzVBbGCz0xLSfGk4v1mOk6miUeNH47+B/LDcdM0feXuZaWRVwnlQtgXYQXnq7H8BPvDlYSSGrz3UEDutiMpny+Dhhdlh+LO8OBHF8/7sjEZb73eij6MhzvR4vLXyCPymDwlI/KS7JB3ZI9MCCPfyC/yh5wOvg6+D34Mfp5JV1cWPg9J7w1+/wO1kFlC</latexit>

X ⇡ M = a1 � b1 � c1 + a2 � b2 � c2 · · ·+ aR � bR � cR<latexit sha1_base64="GmowxCMebr9fN2MppOopI6+XfO4=">AAAEhHicbVPdahNBFN7UaOv61yp4481gKIjakEREBSMFsXhTqaWxhewSZmdPNkNndtaZ2dp0mHfyBXwQ38az27Rm2w4snPOd7/yfTQrBje31/rZWbrVv31lduxveu//g4aP1jcc/jCo1gxFTQumjhBoQPIeR5VbAUaGBykTAYXL8ubIfnoA2XOUHdl5ALGmW8yln1CI0Wf8dJZDx3MHPskZe+jA6cEeeRLQotDolqO16MiRRIh31k37EuGa1ljQ0hhp5dUEbNGiDBm1AolRZ85+83yDvN8j7YQR5ulTeZL3T6/bqR64L/YXQCRZvb7Kx8gfzsVJCbpmgxrgE5wPah5tRaaCg7JhmMEYxR9jErh6qJ5uIpGSqNH65JTUaRgbsubsqqnrcjKdAcmXB+CWjBVkIasFVFlJgfD9GgOdxWOW0M5DgUpjSUlj0azrl9IRndbdmLhMlDOJ2HkYzXJ9G57JwRTqtgkqVwnBk4JvKYTl9XSvm1pIKYuHUepdkQx+GSw07Kk0VHxuV1M7MVVsF3mQbl3b6PnY8L0oslZ3PaVpiHkWqAyMp18CsmKNAmeaWM8JmVFNmsfpmFpEpJMwkZ75hsPz4bIulDXBcV8J07KBkVGBesoO7iRZwwz+Rl+ZE4ppxiGnFC6McfjElJcWTig78eBCPY1ddWqd/Ech1Bt77JnP3OlPLm4hyQWySsILL1Rh+hv3hSoikGn88ImBqhz0pXy+DmmezGsWd4cH3r573dWE06H7o9r4POtu9xeWvBc+C58GLoB+8C7aDr8FeMApY62nrY+tLa6e92t5qv2m/PaeutBY+T4LGa3/6B9iVhzU=</latexit><latexit sha1_base64="GmowxCMebr9fN2MppOopI6+XfO4=">AAAEhHicbVPdahNBFN7UaOv61yp4481gKIjakEREBSMFsXhTqaWxhewSZmdPNkNndtaZ2dp0mHfyBXwQ38az27Rm2w4snPOd7/yfTQrBje31/rZWbrVv31lduxveu//g4aP1jcc/jCo1gxFTQumjhBoQPIeR5VbAUaGBykTAYXL8ubIfnoA2XOUHdl5ALGmW8yln1CI0Wf8dJZDx3MHPskZe+jA6cEeeRLQotDolqO16MiRRIh31k37EuGa1ljQ0hhp5dUEbNGiDBm1AolRZ85+83yDvN8j7YQR5ulTeZL3T6/bqR64L/YXQCRZvb7Kx8gfzsVJCbpmgxrgE5wPah5tRaaCg7JhmMEYxR9jErh6qJ5uIpGSqNH65JTUaRgbsubsqqnrcjKdAcmXB+CWjBVkIasFVFlJgfD9GgOdxWOW0M5DgUpjSUlj0azrl9IRndbdmLhMlDOJ2HkYzXJ9G57JwRTqtgkqVwnBk4JvKYTl9XSvm1pIKYuHUepdkQx+GSw07Kk0VHxuV1M7MVVsF3mQbl3b6PnY8L0oslZ3PaVpiHkWqAyMp18CsmKNAmeaWM8JmVFNmsfpmFpEpJMwkZ75hsPz4bIulDXBcV8J07KBkVGBesoO7iRZwwz+Rl+ZE4ppxiGnFC6McfjElJcWTig78eBCPY1ddWqd/Ech1Bt77JnP3OlPLm4hyQWySsILL1Rh+hv3hSoikGn88ImBqhz0pXy+DmmezGsWd4cH3r573dWE06H7o9r4POtu9xeWvBc+C58GLoB+8C7aDr8FeMApY62nrY+tLa6e92t5qv2m/PaeutBY+T4LGa3/6B9iVhzU=</latexit><latexit sha1_base64="GmowxCMebr9fN2MppOopI6+XfO4=">AAAEhHicbVPdahNBFN7UaOv61yp4481gKIjakEREBSMFsXhTqaWxhewSZmdPNkNndtaZ2dp0mHfyBXwQ38az27Rm2w4snPOd7/yfTQrBje31/rZWbrVv31lduxveu//g4aP1jcc/jCo1gxFTQumjhBoQPIeR5VbAUaGBykTAYXL8ubIfnoA2XOUHdl5ALGmW8yln1CI0Wf8dJZDx3MHPskZe+jA6cEeeRLQotDolqO16MiRRIh31k37EuGa1ljQ0hhp5dUEbNGiDBm1AolRZ85+83yDvN8j7YQR5ulTeZL3T6/bqR64L/YXQCRZvb7Kx8gfzsVJCbpmgxrgE5wPah5tRaaCg7JhmMEYxR9jErh6qJ5uIpGSqNH65JTUaRgbsubsqqnrcjKdAcmXB+CWjBVkIasFVFlJgfD9GgOdxWOW0M5DgUpjSUlj0azrl9IRndbdmLhMlDOJ2HkYzXJ9G57JwRTqtgkqVwnBk4JvKYTl9XSvm1pIKYuHUepdkQx+GSw07Kk0VHxuV1M7MVVsF3mQbl3b6PnY8L0oslZ3PaVpiHkWqAyMp18CsmKNAmeaWM8JmVFNmsfpmFpEpJMwkZ75hsPz4bIulDXBcV8J07KBkVGBesoO7iRZwwz+Rl+ZE4ppxiGnFC6McfjElJcWTig78eBCPY1ddWqd/Ech1Bt77JnP3OlPLm4hyQWySsILL1Rh+hv3hSoikGn88ImBqhz0pXy+DmmezGsWd4cH3r573dWE06H7o9r4POtu9xeWvBc+C58GLoB+8C7aDr8FeMApY62nrY+tLa6e92t5qv2m/PaeutBY+T4LGa3/6B9iVhzU=</latexit>

1

Page 3: High performance tensor decomposition on emerging

§ CP minimization problem§ This assumes Gaussian model for tensor, resulting in least-squares

§ Solve via alternating linear least squares§ Fix all but one term, solve linear least squares problem, iterate

§ Matricized Tensor Times Khatri-Rao Product (MTTKRP):§ Most expensive part of CP-ALS

CP via Alternating Least-Squares (ALS)

Repeat until convergence…

M(1) = A(C �B)T

M(2) = B(C �A)T

M(3) = C(B �A)T<latexit sha1_base64="8upADMLkaXc7mLdnTbjDnw9oqMQ=">AAADX3icbZHbattAEIZXdg+u0yZOe1V6s9SkOIUaSb3oiUJiQ+lNIAW7CXhVs1qt7cV7EKtVi7voaUofqrd9kq5sJansDAj9zHyjmdEfp5xlxvf/eI3mnbv37rcetPcePto/6Bw+/pqpXBM6JoorfRnjjHIm6dgww+llqikWMacX8XJY1i++U50xJUdmldJI4LlkM0awcalp5zeK6ZxJizmby5dFG43sWTG1veC4gC8+QhQLe1 r0ytewQCpRppSD4vjbCCJ0TYc39GCHPt2mX9/Qww09uI2mMrlaa9rp+n1/HXBXBJXogirOp4feL5QokgsqDeE4yyaBn5rIYm0Y4dTdmWc0xWSJ53TipMSCZpFd/88CHrlMAmdKu0cauM7+32GxyLKViB0psFlk27UyeVttkpvZ28gymeaGSrIZNMs5NAqW5sCEaUoMXzmBiWZuV0gWWGNinIX1KXyuHLAQjNSOsYYtf74iSf3C9SZER5bmBHM3F35yx6EqXeuPxXU5Fu0jFCuelFwbSfqDKCGwswWNikkYTSJbOtYNrj5ku2FRFHXybJfUogKdrcG2ibtiHPbf9f0vYffEr/xtgWfgOeiBALwBJ+AzOAdjQLw9L/Deex8af5ut5n6zs0EbXtXzBNSi+fQf/UIRqg==</latexit><latexit sha1_base64="8upADMLkaXc7mLdnTbjDnw9oqMQ=">AAADX3icbZHbattAEIZXdg+u0yZOe1V6s9SkOIUaSb3oiUJiQ+lNIAW7CXhVs1qt7cV7EKtVi7voaUofqrd9kq5sJansDAj9zHyjmdEfp5xlxvf/eI3mnbv37rcetPcePto/6Bw+/pqpXBM6JoorfRnjjHIm6dgww+llqikWMacX8XJY1i++U50xJUdmldJI4LlkM0awcalp5zeK6ZxJizmby5dFG43sWTG1veC4gC8+QhQLe1 r0ytewQCpRppSD4vjbCCJ0TYc39GCHPt2mX9/Qww09uI2mMrlaa9rp+n1/HXBXBJXogirOp4feL5QokgsqDeE4yyaBn5rIYm0Y4dTdmWc0xWSJ53TipMSCZpFd/88CHrlMAmdKu0cauM7+32GxyLKViB0psFlk27UyeVttkpvZ28gymeaGSrIZNMs5NAqW5sCEaUoMXzmBiWZuV0gWWGNinIX1KXyuHLAQjNSOsYYtf74iSf3C9SZER5bmBHM3F35yx6EqXeuPxXU5Fu0jFCuelFwbSfqDKCGwswWNikkYTSJbOtYNrj5ku2FRFHXybJfUogKdrcG2ibtiHPbf9f0vYffEr/xtgWfgOeiBALwBJ+AzOAdjQLw9L/Deex8af5ut5n6zs0EbXtXzBNSi+fQf/UIRqg==</latexit><latexit sha1_base64="8upADMLkaXc7mLdnTbjDnw9oqMQ=">AAADX3icbZHbattAEIZXdg+u0yZOe1V6s9SkOIUaSb3oiUJiQ+lNIAW7CXhVs1qt7cV7EKtVi7voaUofqrd9kq5sJansDAj9zHyjmdEfp5xlxvf/eI3mnbv37rcetPcePto/6Bw+/pqpXBM6JoorfRnjjHIm6dgww+llqikWMacX8XJY1i++U50xJUdmldJI4LlkM0awcalp5zeK6ZxJizmby5dFG43sWTG1veC4gC8+QhQLe1 r0ytewQCpRppSD4vjbCCJ0TYc39GCHPt2mX9/Qww09uI2mMrlaa9rp+n1/HXBXBJXogirOp4feL5QokgsqDeE4yyaBn5rIYm0Y4dTdmWc0xWSJ53TipMSCZpFd/88CHrlMAmdKu0cauM7+32GxyLKViB0psFlk27UyeVttkpvZ28gymeaGSrIZNMs5NAqW5sCEaUoMXzmBiWZuV0gWWGNinIX1KXyuHLAQjNSOsYYtf74iSf3C9SZER5bmBHM3F35yx6EqXeuPxXU5Fu0jFCuelFwbSfqDKCGwswWNikkYTSJbOtYNrj5ku2FRFHXybJfUogKdrcG2ibtiHPbf9f0vYffEr/xtgWfgOeiBALwBJ+AzOAdjQLw9L/Deex8af5ut5n6zs0EbXtXzBNSi+fQf/UIRqg==</latexit>

Using… Repeat until convergence…

A = X(1)(C �B)(CTC ⇤BTB)†

B = X(2)(C �A)(CTC ⇤ATA)†

C = X(3)(B �A)(BTB ⇤ATA)†<latexit sha1_base64="gjwwhrkGM+q4UC4q6GiRV7gQL2o=">AAAECXicfVLditNAGE02/qzxb1cvvRksK1VwSeqFeiFstyDeKCu0bqGTLZPJNB06MwkzE6UOeQPfwjcQBPHWpxBv9T2ctHHdbKoDgZNzznfm5/vinFGlg+C7u+VduHjp8vYV/+q16zdu7uzeeqOyQmIywhnL5DhGijAqyEhTzcg4lwTxmJHjeDGo9OO3RCqaiaFe5iTiKBV0RjHSlpruuq+gIO9wxjkSiYGSJKWBq1hT4dKHMUmpMIjRVDyofrnpl+DeM7AyAzg043JquuH9sltpgxJmSaYreFiecifDWkOqltaMtZzABKUpkQBCf021w3vt8P6/w/s102+HDzaEP6qDDjeE/z3nf8J9SOzT1Q803ekE+8FqgTYIa9Bx6nVkO/ARJhkuOBEaM6TUJAxyHRkkNcWM2BcvFMkRXqCUTCwUiBMVmVWLSrBnmQTMMmk/ocGKPVthEFdqyWPr5EjP1XmtIjdpk0LPnkSGirzQROD1RrOCAZ2BaoxAQiXBmi0tQFhSe1aA50girO2wNXdhaWYNc05x4zJG08X7hzhp3nB1EiwjQwqMmN0XPLeXgzXdqI/5qRxzfw/GGUsqn9+Y6GE56UWTyFRN64R/gkynV1bDfdb5su2UvDbatobnm9gGo97+0/3gda9zENT93XbuOHedrhM6j50D54Vz5Iwc7H52f7g/3V/eB++T98X7urZuuXXNbaexvG+/AXigWIc=</latexit><latexit sha1_base64="gjwwhrkGM+q4UC4q6GiRV7gQL2o=">AAAECXicfVLditNAGE02/qzxb1cvvRksK1VwSeqFeiFstyDeKCu0bqGTLZPJNB06MwkzE6UOeQPfwjcQBPHWpxBv9T2ctHHdbKoDgZNzznfm5/vinFGlg+C7u+VduHjp8vYV/+q16zdu7uzeeqOyQmIywhnL5DhGijAqyEhTzcg4lwTxmJHjeDGo9OO3RCqaiaFe5iTiKBV0RjHSlpruuq+gIO9wxjkSiYGSJKWBq1hT4dKHMUmpMIjRVDyofrnpl+DeM7AyAzg043JquuH9sltpgxJmSaYreFiecifDWkOqltaMtZzABKUpkQBCf021w3vt8P6/w/s102+HDzaEP6qDDjeE/z3nf8J9SOzT1Q803ekE+8FqgTYIa9Bx6nVkO/ARJhkuOBEaM6TUJAxyHRkkNcWM2BcvFMkRXqCUTCwUiBMVmVWLSrBnmQTMMmk/ocGKPVthEFdqyWPr5EjP1XmtIjdpk0LPnkSGirzQROD1RrOCAZ2BaoxAQiXBmi0tQFhSe1aA50girO2wNXdhaWYNc05x4zJG08X7hzhp3nB1EiwjQwqMmN0XPLeXgzXdqI/5qRxzfw/GGUsqn9+Y6GE56UWTyFRN64R/gkynV1bDfdb5su2UvDbatobnm9gGo97+0/3gda9zENT93XbuOHedrhM6j50D54Vz5Iwc7H52f7g/3V/eB++T98X7urZuuXXNbaexvG+/AXigWIc=</latexit><latexit sha1_base64="gjwwhrkGM+q4UC4q6GiRV7gQL2o=">AAAECXicfVLditNAGE02/qzxb1cvvRksK1VwSeqFeiFstyDeKCu0bqGTLZPJNB06MwkzE6UOeQPfwjcQBPHWpxBv9T2ctHHdbKoDgZNzznfm5/vinFGlg+C7u+VduHjp8vYV/+q16zdu7uzeeqOyQmIywhnL5DhGijAqyEhTzcg4lwTxmJHjeDGo9OO3RCqaiaFe5iTiKBV0RjHSlpruuq+gIO9wxjkSiYGSJKWBq1hT4dKHMUmpMIjRVDyofrnpl+DeM7AyAzg043JquuH9sltpgxJmSaYreFiecifDWkOqltaMtZzABKUpkQBCf021w3vt8P6/w/s102+HDzaEP6qDDjeE/z3nf8J9SOzT1Q803ekE+8FqgTYIa9Bx6nVkO/ARJhkuOBEaM6TUJAxyHRkkNcWM2BcvFMkRXqCUTCwUiBMVmVWLSrBnmQTMMmk/ocGKPVthEFdqyWPr5EjP1XmtIjdpk0LPnkSGirzQROD1RrOCAZ2BaoxAQiXBmi0tQFhSe1aA50girO2wNXdhaWYNc05x4zJG08X7hzhp3nB1EiwjQwqMmN0XPLeXgzXdqI/5qRxzfw/GGUsqn9+Y6GE56UWTyFRN64R/gkynV1bDfdb5su2UvDbatobnm9gGo97+0/3gda9zENT93XbuOHedrhM6j50D54Vz5Iwc7H52f7g/3V/eB++T98X7urZuuXXNbaexvG+/AXigWIc=</latexit>

MTTKRP

minM

kX�Mk2F s.t. M = a1 � b1 � c1 + · · ·+ aR � bR � cR<latexit sha1_base64="ZUHqiAxqamATQe3VIAsScTaD67s=">AAAEjHicbVNtaxNBEL7UaOv50lY/CrIYCqI2JAFRKYGCUvzSUktjC9kY9vYml6W3e+fuXm263f/lF3+I/8a5Syy5tgshM8888z4X5akwttP521i517z/YHXtYfjo8ZOn6xubz76brNAcBjxLM30aMQOpUDCwwqZwmmtgMkrhJDr7XNpPzkEbkaljO8thJFmixERwZhEab/ymESRCOfhZVMgbH1Ip1NjRY7fvPd2hO4ReoXLqtyuIXo33fvTI3CCj7MKZtm37BVAySJ/QSDrmx13KheaVFtU0jhp5S2icWVP+z+lHNfpRjX4UUlDxUpnjjVan3akeuS10F0IrWLzD8ebKH8zHCwnK8pQZ4yKcE2gfbtHCQM74GUtgiKJC2IxcNVxPthCJySTT+FOWVGhIDdi5e5aX9bipiIGozILxS0YLMk+ZBVdaSI7x/RABoUZhmdNOQYKLYcKK1KJf3Umxc5FU3ZoZDjo1iNtZSKe4Ro3ORe7yeFIGlVkM/YGBg0zBcvqqVsytJUuJhQvrXZT0fRguNeyYNGV8bFQyOzU3bSV4l21Y2MnHkRMqL7BUPp/TpMA8GSkPjcRCA7fpDAXGtbCCEz5lmnGL1dezpEmGhKkU3NcMVpxdbvO4Bg6rSrgeOSg4SzEv2cPd0AVc84/ktTmSuGYcYlzyQqrgF8+kZHhS9NgPe6PhyJWX1ur+D+RaPe99nbl/m6nlXUS5INZJWMH1aoy4xP5wJUQyjR8gSWFi+x0p3y2DWiTTCsWd4cF3b573bWHQa39qd771WrudxeWvBS+CV8HroBt8CHaDr8FhMAh442XjS2O/cdBcb75v7jT7c+pKY+HzPKi95t4/KV+JHQ==</latexit><latexit sha1_base64="ZUHqiAxqamATQe3VIAsScTaD67s=">AAAEjHicbVNtaxNBEL7UaOv50lY/CrIYCqI2JAFRKYGCUvzSUktjC9kY9vYml6W3e+fuXm263f/lF3+I/8a5Syy5tgshM8888z4X5akwttP521i517z/YHXtYfjo8ZOn6xubz76brNAcBjxLM30aMQOpUDCwwqZwmmtgMkrhJDr7XNpPzkEbkaljO8thJFmixERwZhEab/ymESRCOfhZVMgbH1Ip1NjRY7fvPd2hO4ReoXLqtyuIXo33fvTI3CCj7MKZtm37BVAySJ/QSDrmx13KheaVFtU0jhp5S2icWVP+z+lHNfpRjX4UUlDxUpnjjVan3akeuS10F0IrWLzD8ebKH8zHCwnK8pQZ4yKcE2gfbtHCQM74GUtgiKJC2IxcNVxPthCJySTT+FOWVGhIDdi5e5aX9bipiIGozILxS0YLMk+ZBVdaSI7x/RABoUZhmdNOQYKLYcKK1KJf3Umxc5FU3ZoZDjo1iNtZSKe4Ro3ORe7yeFIGlVkM/YGBg0zBcvqqVsytJUuJhQvrXZT0fRguNeyYNGV8bFQyOzU3bSV4l21Y2MnHkRMqL7BUPp/TpMA8GSkPjcRCA7fpDAXGtbCCEz5lmnGL1dezpEmGhKkU3NcMVpxdbvO4Bg6rSrgeOSg4SzEv2cPd0AVc84/ktTmSuGYcYlzyQqrgF8+kZHhS9NgPe6PhyJWX1ur+D+RaPe99nbl/m6nlXUS5INZJWMH1aoy4xP5wJUQyjR8gSWFi+x0p3y2DWiTTCsWd4cF3b573bWHQa39qd771WrudxeWvBS+CV8HroBt8CHaDr8FhMAh442XjS2O/cdBcb75v7jT7c+pKY+HzPKi95t4/KV+JHQ==</latexit><latexit sha1_base64="ZUHqiAxqamATQe3VIAsScTaD67s=">AAAEjHicbVNtaxNBEL7UaOv50lY/CrIYCqI2JAFRKYGCUvzSUktjC9kY9vYml6W3e+fuXm263f/lF3+I/8a5Syy5tgshM8888z4X5akwttP521i517z/YHXtYfjo8ZOn6xubz76brNAcBjxLM30aMQOpUDCwwqZwmmtgMkrhJDr7XNpPzkEbkaljO8thJFmixERwZhEab/ymESRCOfhZVMgbH1Ip1NjRY7fvPd2hO4ReoXLqtyuIXo33fvTI3CCj7MKZtm37BVAySJ/QSDrmx13KheaVFtU0jhp5S2icWVP+z+lHNfpRjX4UUlDxUpnjjVan3akeuS10F0IrWLzD8ebKH8zHCwnK8pQZ4yKcE2gfbtHCQM74GUtgiKJC2IxcNVxPthCJySTT+FOWVGhIDdi5e5aX9bipiIGozILxS0YLMk+ZBVdaSI7x/RABoUZhmdNOQYKLYcKK1KJf3Umxc5FU3ZoZDjo1iNtZSKe4Ro3ORe7yeFIGlVkM/YGBg0zBcvqqVsytJUuJhQvrXZT0fRguNeyYNGV8bFQyOzU3bSV4l21Y2MnHkRMqL7BUPp/TpMA8GSkPjcRCA7fpDAXGtbCCEz5lmnGL1dezpEmGhKkU3NcMVpxdbvO4Bg6rSrgeOSg4SzEv2cPd0AVc84/ktTmSuGYcYlzyQqrgF8+kZHhS9NgPe6PhyJWX1ur+D+RaPe99nbl/m6nlXUS5INZJWMH1aoy4xP5wJUQyjR8gSWFi+x0p3y2DWiTTCsWd4cF3b573bWHQa39qd771WrudxeWvBS+CV8HroBt8CHaDr8FhMAh442XjS2O/cdBcb75v7jT7c+pKY+HzPKi95t4/KV+JHQ==</latexit>

minA

��X(1) �A(C �B)T��2F

minB

��X(2) �B(C �A)T��2F

minC

��X(3) �C(B �A)T��2F

<latexit sha1_base64="Wzop7MCYT0TmCPey/FDT31LwqRU=">AAAD93ichVJbb9MwFHYWLqPcNnjkxaIa6pBWJeEBkHjYWmniBWlILatUZ5XjuKlV24kSB9SZ/BekSYhXfgav8MS/wUkDahoElqJ8Puc73/G5BAlnmXKcn9aOfe36jZu7tzq379y9d39v/8G7LM5TQsck5nE6CXBGOZN0rJjidJKkFIuA0/NgOSz95+9pmrFYjtQqob7AkWRzRrAyptm+9QpJ+oHEQmAZapTSsNCoktUlLjoooBGTGnMWyafmKpicVTwUCH1SFBAFLEIf0UhPipnuuYcFfHIENxm98j8sUBzGqoSD4vBitI6anV54EKEt1UFL1dtSHbRVT/6nOmypPttSHdaqg3+rUtOouh2zva7Td6oD28CtQRfU58z0+wqFMckFlYpwnGVT10mUr3GqGOHU9DfPaILJEkd0aqDEgma+rgZSwANjCeE8Ts0nFaysmxEaiyxbicAwBVaLbNtXGv/mm+Zq/sLXTCa5opKsE81zDlUMy6WBIUspUXxlACYpM2+FZIFTTJRZrWYWHsWGsBCMNIrRii0vj0jYrLB6CUl9TXOCuckLT01xqDY34gPxxx2IzgEKYh6WvE5jf0fF1POnvi5n13V/C+muV5SrvMl802amoiaasbrbQ2yDsdd/2Xfeet1jp57vLngEHoMecMFzcAxegzMwBsS6sr5Z360f9qX9yf5sf1lTd6w65iFoHPvrL6JzUTo=</latexit><latexit sha1_base64="Wzop7MCYT0TmCPey/FDT31LwqRU=">AAAD93ichVJbb9MwFHYWLqPcNnjkxaIa6pBWJeEBkHjYWmniBWlILatUZ5XjuKlV24kSB9SZ/BekSYhXfgav8MS/wUkDahoElqJ8Puc73/G5BAlnmXKcn9aOfe36jZu7tzq379y9d39v/8G7LM5TQsck5nE6CXBGOZN0rJjidJKkFIuA0/NgOSz95+9pmrFYjtQqob7AkWRzRrAyptm+9QpJ+oHEQmAZapTSsNCoktUlLjoooBGTGnMWyafmKpicVTwUCH1SFBAFLEIf0UhPipnuuYcFfHIENxm98j8sUBzGqoSD4vBitI6anV54EKEt1UFL1dtSHbRVT/6nOmypPttSHdaqg3+rUtOouh2zva7Td6oD28CtQRfU58z0+wqFMckFlYpwnGVT10mUr3GqGOHU9DfPaILJEkd0aqDEgma+rgZSwANjCeE8Ts0nFaysmxEaiyxbicAwBVaLbNtXGv/mm+Zq/sLXTCa5opKsE81zDlUMy6WBIUspUXxlACYpM2+FZIFTTJRZrWYWHsWGsBCMNIrRii0vj0jYrLB6CUl9TXOCuckLT01xqDY34gPxxx2IzgEKYh6WvE5jf0fF1POnvi5n13V/C+muV5SrvMl802amoiaasbrbQ2yDsdd/2Xfeet1jp57vLngEHoMecMFzcAxegzMwBsS6sr5Z360f9qX9yf5sf1lTd6w65iFoHPvrL6JzUTo=</latexit><latexit sha1_base64="Wzop7MCYT0TmCPey/FDT31LwqRU=">AAAD93ichVJbb9MwFHYWLqPcNnjkxaIa6pBWJeEBkHjYWmniBWlILatUZ5XjuKlV24kSB9SZ/BekSYhXfgav8MS/wUkDahoElqJ8Puc73/G5BAlnmXKcn9aOfe36jZu7tzq379y9d39v/8G7LM5TQsck5nE6CXBGOZN0rJjidJKkFIuA0/NgOSz95+9pmrFYjtQqob7AkWRzRrAyptm+9QpJ+oHEQmAZapTSsNCoktUlLjoooBGTGnMWyafmKpicVTwUCH1SFBAFLEIf0UhPipnuuYcFfHIENxm98j8sUBzGqoSD4vBitI6anV54EKEt1UFL1dtSHbRVT/6nOmypPttSHdaqg3+rUtOouh2zva7Td6oD28CtQRfU58z0+wqFMckFlYpwnGVT10mUr3GqGOHU9DfPaILJEkd0aqDEgma+rgZSwANjCeE8Ts0nFaysmxEaiyxbicAwBVaLbNtXGv/mm+Zq/sLXTCa5opKsE81zDlUMy6WBIUspUXxlACYpM2+FZIFTTJRZrWYWHsWGsBCMNIrRii0vj0jYrLB6CUl9TXOCuckLT01xqDY34gPxxx2IzgEKYh6WvE5jf0fF1POnvi5n13V/C+muV5SrvMl802amoiaasbrbQ2yDsdd/2Xfeet1jp57vLngEHoMecMFzcAxegzMwBsS6sr5Z360f9qX9yf5sf1lTd6w65iFoHPvrL6JzUTo=</latexit>

V = X(3)(A�B) =) v(i, l) =X

(i,j,k)2N (X)

x(i, j, k)a(j, l)b(k, l), l = 1, . . . , R

<latexit sha1_base64="dnny4rw3Vy3sfCNLGdzO+uyJGOE=">AAADZ3icbVLdbtMwGHVWfkb5WTckhMSNoZrUoqxKugt2M2kbEuIGNFDbVaqjynHc1qvthNgZFMtPwy0vxCPwFjhdQMuKpShfznfO95PjOONM6SD45W017ty9d3/7QfPho8dPdlq7eyOVFjmhQ5LyNB/HWFHOJB1qpjkdZznFIub0Il6+LfMXVzRXLJUDvcpoJPBcshkjWDto2vqJYjpn0tAvxRp5bZsoFmZk4TFEAzO2U9M57NpOCZ5aiNIk1bD8OLNdiJhwM1IFrzrM591SogrhFMy/9JddxCREAusFwdx8dDXKel0Lv1V5iDuXpSzuLN3Lh8jNkEB+HPrIdVH+5yaiMrkx2rTVDnrB+sDNIKyCNqjO+XTX++FKkUJQqQnHSk3CINORwblmhFO3a6FohskSz+nEhRILqiKz/q0W7jskgbM0d4/UcI3eVBgslFqJ2DHLLdXtXAn+Lzcp9OwoMkxmhaaSXDeaFRzqFJYewYTllGi+cgEmOXOzQrLAOSbaOVnvwuepIywEI7VljGbL7wckqW+4noTkkaGFs8T1he/ccqiCa/pY/EvHormP4pQnJa+JJP1KUiGwcwYN7KQfTSJTXoh2+LeQafettXXmh01mLiqiszW8beJmMOr3wsNe8KnfPvErg7fBC/AKdEAI3oAT8B6cgyEg3p535J16Z1u/GzuNZ43n19Qtr9I8BbXTePkH1g0TaQ==</latexit>

2

Page 4: High performance tensor decomposition on emerging

GenTen : Software for Generalized Canonical Polyadic Tensor Decompositions

§ New software package Genten developed at SNL§ E. Phipps, T. Kolda, D. Dunlavy, G. Ballard, T. Plantenga§ Based on C++ port of Matlab Tensor Toolbox§ Publicly available at https://gitlab.com/tensors/genten§ Implements full CP-ALS algorithm for sparse (and dense)

tensors, as well as GCP algorithm for sparse tensors

§ Incorporates shared memory parallelism for emerging manycore hardware using Kokkos§ Multicore CPUs via OpenMP, pThreads§ GPUs via Nvidia Cuda (Intel and AMD coming soon)§ Intel Xeon Phi (a.k.a. KNC/KNL) via OpenMP

§ Implements parallelism for all performance-critical operations§ MTTKRP, tensor inner product, norms, ...§ Can use optimized third-party libraries (MKL, cuBLAS, ...)§ Natively handles data transfers between CPU, GPU memory

3

Page 5: High performance tensor decomposition on emerging

What is Kokkos?

DDR

HBM

DDR

HBM

DDRDDR

DDR

HBMHBM

Multi-Core Many-Core APU CPU+GPU

Drekar

TrilinosSPARC

Applications & Libraries

Kokkosperformance portability for C++ applications

Albany

EMPIRELAMMPS

SIERRA etc...

H.C. Edwards, C. Trott, et al., https://github.com/kokkos/kokkos

4

Page 6: High performance tensor decomposition on emerging

Genten CP-ALS Performance*§ Synthetic data tensor (double precision)

§ 30K x 40K x 50K, 10M non-zeros, R = 128 (perf_CpAlsRandomKtensor performance test in Genten)§ 10 CP-ALS iterations

§ Architectures§ HSW (OpenMP): Intel Xeon E5-2698v3 CPU, 2.3 GHz, 2 sockets, 16 cores/socket, 2 threads/core§ KNL (OpenMP): Intel Xeon Phi 7250, 68 cores (using up to 64), 4 threads/core, 16 GB HBM in cache mode§ K80 (Cuda): Nvidia Kepler K80 GPU, 12 GB GDDR5§ P100 (Cuda): Nvidia Pascal P100 GPU, 16 GB HBM

*E. Phipps and T. Kolda, Software for Sparse Tensor Decomposition on Emerging Computing Architectures, SIAM SISC, vol. 41 (3), 2019, (https://doi.org/10.1137/18M1210691).

5

Page 7: High performance tensor decomposition on emerging

MTTKRP

6

3 6 21 1 42 7 31 4 42 5 13 1 2

X.subs X.vals

�<latexit sha1_base64="TlncEkZ3eE0xh1tqB82m540GTR0=">AAAEEnicbVNNb9NAEN02fBTz1cKRi0VUgRBUSTjApVIlJMQFVKSmrRRb0Xo9tlfZD7O7LqQr/wRu/BNuCA6IKzdO/BvGTqjititZGr03b+bNjjcpBbduMPi7tt67cvXa9Y0bwc1bt+/c3dy6d2h1ZRiMmRbaHCfUguAKxo47AcelASoTAUfJ7FXDH52AsVyrAzcvIZY0VzzjjDqEppuPogRyrjx8qFrkSR1EOtUuiEClK+h0sz/YGbQnvBgMl0GfLM/+dGv9T5RqVklQjglqrU/QFpg62I4qCyVlM5rDBEOFsI19O0sdbiOShpk2+CkXtmgQWXALuS4bP77gKYRKO7D1CulAloI68A0Tlli/niDAVRw0PV0BEnwKGa2EQ11XpOgJz9tp7VwmWljE3TyICrw1g+Kq9GWaNUWlTmF3bOGdVrDavvWKvY2kInTwydU+yXfrIFgZ2FNpm/o4qKSusOe5BryMm1Quexl7rsoKrbLFPWUV9tFhs9cw5QaYE3MMKDPccRayghrKHLrvdhG5xoRCclZ3CMdnp89Y2gEnrRNmYg8VowL7hq9xN9ES7ugTeUYnssNYJ6mZm7RZPl5t2qiDSMFHpqWk+KNFB/VkFE9ij0rfH/4v7/ujuq67mW8vZhp5WaJcJnaT0MHZwiw/xalxUSHaw1cQCsjc7kDKp6ug4XnRorhJfAbD8z/9xeBwtDN8vjN4P+rvjZYPYoM8IA/JYzIkL8geeUP2yZgw8pl8Jd/Jj96X3rfez96vRer62lJzn3RO7/c/USpjAQ==</latexit>

⇥<latexit sha1_base64="cqKZeU0AoxwaG9P/Z/YuEXfOwaU=">AAAEE3icbVNNaxRBEO1k/YjjV6JHL4NLIIiG3fWgl0BAEC9KhGwS2BmWnp6amWb7Y+zuiW6a+Qve/CfeRA/i1ZMn/401s2vYSdIwULxXr+pV13RSCm7dYPB3bb137fqNmxu3gtt37t67v7n14MjqyjAYMy20OUmoBcEVjB13Ak5KA1QmAo6T2auGPz4FY7lWh25eQixprnjGGXUITTd3ogRyrjx8qFrkSR1EjkuwQQQqXYGnm/3B7qA94eVguAz6ZHkOplvrf6JUs0qCckxQa32CvsDUwXZUWSgpm9EcJhgqhG3s22HqcBuRNMy0wU+5sEWDyIJbyHXZ+PEFTyFU2oGtV0gHshTUgW+YsMT69QQBruKg6ekKkOBTyGglHOq6IkVPed5Oa+cy0cIi7uZBVOC1GRRXpS/TrCkqdQp7YwvvtILV9q1X7G0kFaGDT672Sb5XB8HKwJ5K29THQSV1hb3INeBV3KRy2cvYc1VWaJUt7imrsI8Om8WGKTfAnJhjQJnhjrOQFdRQ5tB9t4vINSYUkrO6Qzg+O3vG0g44aZ0wE3uoGBXYN3yNu4mWcEefyHM6kR3GOknN3KTN8vFq00YdRAo+Mi0lxR8tOqwno3gSe1T6/vB/ed8f1XXdzXx7OdPIqxLlMrGbhA7OF2b5GU6NiwrRHj6DUEDm9gZSPl0FDc+LFsVN4jMYXvzpLwdHo93h893B+1F/f7R8EBvkEXlMdsiQvCD75A05IGPCyGfylXwnP3pfet96P3u/Fqnra0vNQ9I5vd//AAT2Y3c=</latexit>

=<latexit sha1_base64="Sqiovq9umAtAI3wYwYoDsWYcH5k=">AAAEDnicbVNNaxRBEO1k/YjjRxI9ehlcAiK67K4HvSwEBPGiRMgmgZ0h9PTUzDTpj7G7J7pp5gd48594Ez2IV++e/DfWTNawk03DQPHqvapXXdNJKbh1w+HftfXetes3bm7cCm7fuXtvc2v7/oHVlWEwZVpoc5RQC4IrmDruBByVBqhMBBwmJ6+a/OEpGMu12nfzEmJJc8UzzqhD6HirHyWQc+XhQ9UiT+pgEkSg0iUEWcPBsD3hajBaBH2yOHvH2+t/olSzSoJyTFBrfYKWwNTBTlRZKCk7oTnMMFQI29i3c9ThDiJpmGmDn3JhiwaRBXcu12Xjxxc8hVBpB7ZeSjqQpaAOfJMJS6xfzxDgKg6anq4ACT6FjFbCoa4rUvSU5+20di4TLSzibh5EBd6YQXFV+jLNmqJSpzCZWninFSy3b71ibyOpCB18crVP8kkdBEsDeyptUx8HldQV9nKuAa/KzSqXvYw9V2WFVtn5PWUV9tFhs9Mw5QaYE3MMKDPccRayghrKHLrvdhG5RkIhOas7CcdPzp6xtAPOWifMxB4qRgX2DV/jbqIF3NEn8iKdyE7GOknN3KTN8vFq00YdRAo+Mi0lxR8t2q9n43gWe1T6/uh/ed8f13XdZb5dZRp5FVEuiF0SOrhYmOVnODUuKkR7+AJCAZmbDKV8ugwanhctipvEZzC6/NOvBgfjwej5YPh+3N8dLx7EBnlIHpHHZERekF3yhuyRKWHkM/lKvpMfvS+9b72fvV/n1PW1heYB6Zze738CA2EE</latexit>

A<latexit sha1_base64="eXdKyuqx5r+N/vTwtEuOafZD0Qc=">AAAEE3icbVNLb9QwEHa7PEp4dAtHLhGrShWCanc5wKVSERLiAipSt620iSrHmSRW/Qi2U9ha+Qvc+CfcEBwQV06c+DdMsku1aWspyuj75vGNx5OUgls3HP5dWe1du37j5tqt4Padu/fW+xv3D6yuDIMJ00Kbo4RaEFzBxHEn4Kg0QGUi4DA5edXwh6dgLNdq381KiCXNFc84ow6h4/5WlEDOlYcPVYs8roMokf4l/kClS/BxfzDcHrYnvGyMFsaALM7e8cbqnyjVrJKgHBPUWp+gLjB1sBlVFkrKTmgOUzQVwjb2bTN1uIlIGmba4Kdc2KJBZMHNw3XZ6PEFTyFU2oGtl0gHshTUgW+YsMT89RQBruKgqekKkOBTyGglXNNoJ0jRU5633dqZTLSwiLtZEBV4bQaDq9KXadYklTqFnYmFd1rBcvlWK9Y2korQwSdX+yTfqYNgqWFPpW3yY6OSusJe5BrwKm5auexF7LkqK5TK5veUVVhHh81gw5QbYE7M0KDMcMdZyApqKHOovltF5BodCslZ3SEcPzl7ytIOOG2VMBN7qBgVWDd8jbOJFnAnPpHndCI7jHWSmplJm+Hj1aZNdBAp+Mi0lBQfWrRfT8fxNPbN6xuM/qf3g3Fd113Pt5c9jbzKUS4cu06o4Hxglp9h1zioEOXhGoQCMrczlPLJMmh4XrQoThLXYHTx0V82Dsbbo2fbw/fjwe54sRBr5CF5RLbIiDwnu+QN2SMTwshn8pV8Jz96X3rfej97v+auqyuLmAekc3q//wGnZ2Nd</latexit>

B<latexit sha1_base64="yDsOwuEimKagtOJ16lKY5DZeBPo=">AAAEE3icbVNLb9QwEHa7PEp4dAtHLhGrShWCanc5wKVSBRLiAipSt620iSrHmSRW/Qi2U9ha+Qvc+CfcEBwQV06c+DdMsku1aWspyuj75vGNx5OUgls3HP5dWe1du37j5tqt4Padu/fW+xv3D6yuDIMJ00Kbo4RaEFzBxHEn4Kg0QGUi4DA5edXwh6dgLNdq381KiCXNFc84ow6h4/5WlEDOlYcPVYs8roMokf4l/kClS/BxfzDcHrYnvGyMFsaALM7e8cbqnyjVrJKgHBPUWp+gLjB1sBlVFkrKTmgOUzQVwjb2bTN1uIlIGmba4Kdc2KJBZMHNw3XZ6PEFTyFU2oGtl0gHshTUgW+YsMT89RQBruKgqekKkOBTyGglXNNoJ0jRU5633dqZTLSwiLtZEBV4bQaDq9KXadYklTqFnYmFd1rBcvlWK9Y2korQwSdX+yTfqYNgqWFPpW3yY6OSusJe5BrwKm5auexF7LkqK5TK5veUVVhHh81gw5QbYE7M0KDMcMdZyApqKHOovltF5BodCslZ3SEcPzl7ytIOOG2VMBN7qBgVWDd8jbOJFnAnPpHndCI7jHWSmplJm+Hj1aZNdBAp+Mi0lBQfWrRfT8fxNPbN6xuM/qf3g3Fd113Pt5c9jbzKUS4cu06o4Hxglp9h1zioEOXhGoQCMrczlPLJMmh4XrQoThLXYHTx0V82Dsbbo2fbw/fjwe54sRBr5CF5RLbIiDwnu+QN2SMTwshn8pV8Jz96X3rfej97v+auqyuLmAekc3q//wGq/mNe</latexit>

C<latexit sha1_base64="xm5GZo/8DPaFbu5KCYNnXoOShdw=">AAAEE3icbVNLaxRBEO5kfcTxkY0evQwugSAadteDXgKBgHhRImSTwM4QenpqZpr0Y+zuiW6a+Qve/CfeRA/i1ZMn/401s2vYSdIwTPF99fiqqyspBbduOPy7stq7cfPW7bU7wd179x+s9zceHlpdGQYTpoU2xwm1ILiCieNOwHFpgMpEwFFyutfwR2dgLNfqwM1KiCXNFc84ow6hk/5WlEDOlYcPVYs8rYMokX4Pf6DSJfikPxhuD9sTXjVGC2NAFmf/ZGP1T5RqVklQjglqrU9QF5g62IwqCyVlpzSHKZoKYRv7tpk63EQkDTNt8FMubNEgsuDm4bps9PiCpxAq7cDWS6QDWQrqwDdMWGL+eooAV3HQ1HQFSPApZLQSrmm0E6ToGc/bbu1MJlpYxN0siAq8NoPBVenLNGuSSp3CzsTCO61guXyrFWsbSUXo4JOrfZLv1EGw1LCn0jb5sVFJXWEvcw14HTetXPYq9lyVFUpl83vKKqyjw2awYcoNMCdmaFBmuOMsZAU1lDlU360ico0OheSs7hCOn54/Z2kHnLZKmIk9VIwKrBu+xtlEC7gTn8gLOpEdxjpJzcykzfDxatMmOogUfGRaSooPLTqop+N4Gvvm9Q1G/9P7wbiu667n26ueRl7nKBeOXSdUcDEwy8+xaxxUiPJwDUIBmdsZSvlsGTQ8L1oUJ4lrMLr86K8ah+Pt0Yvt4fvxYHe8WIg18pg8IVtkRF6SXfKG7JMJYeQz+Uq+kx+9L71vvZ+9X3PX1ZVFzCPSOb3f/wCulWNf</latexit>

§ Sparse tensors stored in coordinate (COO) format§ List of nonzero values and coordinates (subscripts)

X 2 R3⇥7⇥4, R = 4<latexit sha1_base64="bxY0ArI9IgAfxh/L/s7ZwOb7MOw=">AAADD3icbVHdihMxFE7Hv3X86+qlN8GyIKJlZnZhvVlYEMQbYV3a3UIzliSTtqH5GZPMSg3zED6DDyF4Id76CN55q09h2h1lZ+uBkI/vfCfnfDmkFNy6JPnRia5cvXb9xtbN+NbtO3fvdbfvn1hdGcqGVAttRgRbJrhiQ8edYKPSMCyJYKdk8WKVPz1jxnKtBm5ZslzimeJTTrEL1KQ7QITNuPLsXbVmntQxGvhRjbiCSGI3J8Qf12/9LnJcMgv3m3uvfgpRqCng8cFejJgqLjwx6faSfrIOuAnSBvRAE0eT7c4nVGhaSaYcFdjacZqULvfYOE4FCzNVlpWYLvCMjQNUOMyQ+7X9Gu4EpoBTbcJRDq7ZixUeS2uXkgTlypG9nFuR/8uNKzd9nnuuysoxRc8bTSsBnYarv4QFN4w6sQwAU8PDrJDOscHUhR9vdxEzHQRzyWnLjHd88eEZLdoO15NQk3tWUSxCX/gymEMN3aon8l+ayHgHES2KlS5Gir2nWkocNoMG9TjLx7kPGt9L/z7ke1ld123l602lkY0wrDW9vMRNcJL1091+8ibrHWbNgrfAQ/AIPAYp2AeH4BU4AkNAwRfwE/wCv6OP0efoa/TtXBp1mpoHoBXR9z96ffxf</latexit>

C = X(3)(A�B) =) c(k, l) =X

(i,j,k)2N (X)

x(i, j, k)a(i, l)b(j, l), l = 1, . . . , R

<latexit sha1_base64="P7CGdemQdXFiejvR847XpcDH9nE=">AAADenicbVJbb9MwGHVWLqPc2vHIi7VqUgOhSjok9jJpYxLiBQSo3SrVUeU4buvVdkLsAMXyr+GVP8R/4QGnC2hZsRTl5Hznu+U4yTlTOgx/eTutW7fv3N29177/4OGjx53u3rnKyoLQMcl4VkwSrChnko4105xO8oJikXB6kazOqvjFF1oolsmRXuc0Fngh2ZwRrB016/xECV0waejncsM8s22UCHNm4TFEIzOxM9M/9G2/Ik8tRFmaaVh9vLY+REy4GamCpL8KuF+lqFK4DBZcBisfMQmRwHpJMDfvXY2qnm/htzoOsQMuLelfulcAkZshhfw4CpDrooJPbURlem20WacXDsLNgdsgqkEP1GfW6W5tN+t6P1x1UgoqNeFYqWkU5jo2uNCMcOrWLxXNMVnhBZ06KLGgKjabP23hgWNSOM8K90gNN+z1DIOFUmuROGW1uLoZq8j/xaalnh/Fhsm81FSSq0bzkkOdwco2mLKCEs3XDmBSMDcrJEtcYKKduc0ufJE5wVIw0ljGaLb6/oKkzQ03k5AiNrR0Lrm+8I1bDtV0Iz8R/8KJaB+gJONppWsjSb+STAjszEIjOx3G09hUd6QX/S1kekNrbVP5bltZiFronI5u+roNzoeD6HAQfhz2Toa157vgKdgHfRCBV+AEvAUfwBgQb8878k698c7v1n7Lbz2/ku54dc4T0Ditl38AbRka1A==</latexit>

Page 8: High performance tensor decomposition on emerging

Sptensor X;Ktensor u;FacMatrix v;

const int nnz = X.nnz();for (int i=0; i<nnz; ++i){double *tmp = new double[R];

const size_t k = X.subscript(i,n);const double x_val = X.value(i);for (int j = 0; j < R; ++j)tmp[j] = x_val;

for (int m = 0; m < N; m++) {if (m != n) {const double *row = u[m].rowptr(X.subscript(i,m));for (int j = 0; j < R; ++j)tmp[j] *= row[j];

}}

for (int j = 0; j < R; ++j)v.entry(k,j) += tmp[j];

delete [] tmp;}

Serial MTTKRP Algorithm

Matricized Tensor Times Khatri-Rao Product (MTTKRP)§ Factor matrices stored row-wise (Kokkos::LayoutRight)

V = X(n)

�UN � · · ·�Un+1 �Un�1 � · · ·�U1

v(in, :) =X

i2N (X)

xiu1(i1, :) . . . u

n�1(in�1, :)un+1(in+1, :) . . . u

N (iN , :)

<latexit sha1_base64="lT79pwrtj0wpvxeooj05/pIADPM=">AAADz3icfVJbb9MwFE5WLiNctgFvvFhUQy3bqqQvXCSkSUiIF6qB2m1SHSrHcVurthPZzlixjHjlZyDe4Bfxb3DSdKItwlLio+/ic3yOk5xRpcPwt7/VuHb9xs3tW8HtO3fv7ezu3T9VWSExGeCMZfI8QYowKshAU83IeS4J4gkjZ8nsdcmfXRCpaCb6ep6TmKOJoGOKkXbQaM9/CBMyocIgRifiqQ1gws2pBU9eAdg353ZkWqJtISNj3Sqpgf1oehbALM00gO6nlnFNioPIriFH0f8MJSnpZKrbAMLgokVH4vBluypAFXxkKKQCQI70FCPmcrequtoWXI4oKEq/s0SlZXF6scjowGoviWJRVgUdVNCVtFeiPYcEkIh02YXRbjPshNUCm0FUB02vXieujz/ckbjgRGjMkFLDKMx1bJDUFDPi2lookiM8QxMydKFAnKjYVPOzYN8hKRhn0n1Cgwr922EQV2rOE6cs+6DWuRL8Fzcs9Ph5bKjIC00EXiQaFwzoDJSPAaRUEqzZ3AUIS+pqBXiKJMLaPZnVLGySOcGUU7xyGaPp7PMRTldvWFWCZWxI4Ybm8oI37nKwhlf8Cb+iEx7swyRjaakLoCCfcMY5cmOBfTvsxsPYlI+mGS0PMs2utXZV+W5TKXktdGON1oe4GQy6nRed8H23eXxYz3fbe+Q99lpe5D3zjr233ok38LBv/O/+T/9X40PjsvGl8XUh3fJrzwNvZTW+/QH4aDtI</latexit><latexit sha1_base64="lT79pwrtj0wpvxeooj05/pIADPM=">AAADz3icfVJbb9MwFE5WLiNctgFvvFhUQy3bqqQvXCSkSUiIF6qB2m1SHSrHcVurthPZzlixjHjlZyDe4Bfxb3DSdKItwlLio+/ic3yOk5xRpcPwt7/VuHb9xs3tW8HtO3fv7ezu3T9VWSExGeCMZfI8QYowKshAU83IeS4J4gkjZ8nsdcmfXRCpaCb6ep6TmKOJoGOKkXbQaM9/CBMyocIgRifiqQ1gws2pBU9eAdg353ZkWqJtISNj3Sqpgf1oehbALM00gO6nlnFNioPIriFH0f8MJSnpZKrbAMLgokVH4vBluypAFXxkKKQCQI70FCPmcrequtoWXI4oKEq/s0SlZXF6scjowGoviWJRVgUdVNCVtFeiPYcEkIh02YXRbjPshNUCm0FUB02vXieujz/ckbjgRGjMkFLDKMx1bJDUFDPi2lookiM8QxMydKFAnKjYVPOzYN8hKRhn0n1Cgwr922EQV2rOE6cs+6DWuRL8Fzcs9Ph5bKjIC00EXiQaFwzoDJSPAaRUEqzZ3AUIS+pqBXiKJMLaPZnVLGySOcGUU7xyGaPp7PMRTldvWFWCZWxI4Ybm8oI37nKwhlf8Cb+iEx7swyRjaakLoCCfcMY5cmOBfTvsxsPYlI+mGS0PMs2utXZV+W5TKXktdGON1oe4GQy6nRed8H23eXxYz3fbe+Q99lpe5D3zjr233ok38LBv/O/+T/9X40PjsvGl8XUh3fJrzwNvZTW+/QH4aDtI</latexit><latexit sha1_base64="lT79pwrtj0wpvxeooj05/pIADPM=">AAADz3icfVJbb9MwFE5WLiNctgFvvFhUQy3bqqQvXCSkSUiIF6qB2m1SHSrHcVurthPZzlixjHjlZyDe4Bfxb3DSdKItwlLio+/ic3yOk5xRpcPwt7/VuHb9xs3tW8HtO3fv7ezu3T9VWSExGeCMZfI8QYowKshAU83IeS4J4gkjZ8nsdcmfXRCpaCb6ep6TmKOJoGOKkXbQaM9/CBMyocIgRifiqQ1gws2pBU9eAdg353ZkWqJtISNj3Sqpgf1oehbALM00gO6nlnFNioPIriFH0f8MJSnpZKrbAMLgokVH4vBluypAFXxkKKQCQI70FCPmcrequtoWXI4oKEq/s0SlZXF6scjowGoviWJRVgUdVNCVtFeiPYcEkIh02YXRbjPshNUCm0FUB02vXieujz/ckbjgRGjMkFLDKMx1bJDUFDPi2lookiM8QxMydKFAnKjYVPOzYN8hKRhn0n1Cgwr922EQV2rOE6cs+6DWuRL8Fzcs9Ph5bKjIC00EXiQaFwzoDJSPAaRUEqzZ3AUIS+pqBXiKJMLaPZnVLGySOcGUU7xyGaPp7PMRTldvWFWCZWxI4Ybm8oI37nKwhlf8Cb+iEx7swyRjaakLoCCfcMY5cmOBfTvsxsPYlI+mGS0PMs2utXZV+W5TKXktdGON1oe4GQy6nRed8H23eXxYz3fbe+Q99lpe5D3zjr233ok38LBv/O/+T/9X40PjsvGl8XUh3fJrzwNvZTW+/QH4aDtI</latexit>

7

Page 9: High performance tensor decomposition on emerging

typedef Kokkos::RangePolicy<ExecSpace> Policy;const int nnz = X.nnz();Policy policy(0,nnz);Kokkos::parallel_for( policy,

KOKKOS_LAMBDA(const int i){double *tmp = new double[R];

const size_t k = X.subscript(i,n);const double x_val = X.value(i);

for (int j = 0; j < R; ++j)tmp[j] = x_val;

for (int m = 0; m < N; m++) {if (m != n) {const double *row = u[m].rowptr(X.subscript(i,m));for (int j = 0; j < R; ++j)tmp[j] *= row[j];

}}

for (int j = 0; j < R; ++j)Kokkos::atomic_add(&v.entry(k,j), tmp[j]);

delete [] tmp;});

Flat Parallelism Over Tensor Non-zeros

Parallelize MTTKRP over tensor non-zeros

Multiple threads writing to same row of V requires atomic update (not the only way to do it)

8

Page 10: High performance tensor decomposition on emerging

typedef Kokkos::RangePolicy<ExecSpace> Policy;const int nnz = X.nnz();Policy policy(0,nnz);Kokkos::parallel_for( policy,

KOKKOS_LAMBDA(const int i){double *tmp = new double[R];

const size_t k = X.subscript(i,n);const double x_val = X.value(i);

for (int j = 0; j < R; ++j)tmp[j] = x_val;

for (int m = 0; m < N; m++) {if (m != n) {const double *row = u[m].rowptr(X.subscript(i,m));for (int j = 0; j < R; ++j)tmp[j] *= row[j];

}}

for (int j = 0; j < R; ++j)Kokkos::atomic_add(&v.entry(k,j), tmp[j]);

delete [] tmp;});

Flat Parallelism Over Tensor Non-zeros

Improvements in MTTKRP implementation:§ Thread teams to process in parallel:

§ Tensor non-zeros§ Factor matrix columns

§ Thread-scalable mechanism for tmp buffer§ Kokkos scratch-pad memory (MTTKRP-A)§ Thread-local, compile-time polymorphic arrays

(MTTKRP-B)

§ Blocking on R for better cache performance, occupancy

§ Alternatives to atomics§ Thread privatization buffers

9

Page 11: High performance tensor decomposition on emerging

Permuted MTTKRP Inspired by COO Mat-Vec

10

3 6 21 1 42 7 31 4 42 5 13 1 2

X.subs X.vals

�<latexit sha1_base64="TlncEkZ3eE0xh1tqB82m540GTR0=">AAAEEnicbVNNb9NAEN02fBTz1cKRi0VUgRBUSTjApVIlJMQFVKSmrRRb0Xo9tlfZD7O7LqQr/wRu/BNuCA6IKzdO/BvGTqjititZGr03b+bNjjcpBbduMPi7tt67cvXa9Y0bwc1bt+/c3dy6d2h1ZRiMmRbaHCfUguAKxo47AcelASoTAUfJ7FXDH52AsVyrAzcvIZY0VzzjjDqEppuPogRyrjx8qFrkSR1EOtUuiEClK+h0sz/YGbQnvBgMl0GfLM/+dGv9T5RqVklQjglqrU/QFpg62I4qCyVlM5rDBEOFsI19O0sdbiOShpk2+CkXtmgQWXALuS4bP77gKYRKO7D1CulAloI68A0Tlli/niDAVRw0PV0BEnwKGa2EQ11XpOgJz9tp7VwmWljE3TyICrw1g+Kq9GWaNUWlTmF3bOGdVrDavvWKvY2kInTwydU+yXfrIFgZ2FNpm/o4qKSusOe5BryMm1Quexl7rsoKrbLFPWUV9tFhs9cw5QaYE3MMKDPccRayghrKHLrvdhG5xoRCclZ3CMdnp89Y2gEnrRNmYg8VowL7hq9xN9ES7ugTeUYnssNYJ6mZm7RZPl5t2qiDSMFHpqWk+KNFB/VkFE9ij0rfH/4v7/ujuq67mW8vZhp5WaJcJnaT0MHZwiw/xalxUSHaw1cQCsjc7kDKp6ug4XnRorhJfAbD8z/9xeBwtDN8vjN4P+rvjZYPYoM8IA/JYzIkL8geeUP2yZgw8pl8Jd/Jj96X3rfez96vRer62lJzn3RO7/c/USpjAQ==</latexit>

⇥<latexit sha1_base64="cqKZeU0AoxwaG9P/Z/YuEXfOwaU=">AAAEE3icbVNNaxRBEO1k/YjjV6JHL4NLIIiG3fWgl0BAEC9KhGwS2BmWnp6amWb7Y+zuiW6a+Qve/CfeRA/i1ZMn/401s2vYSdIwULxXr+pV13RSCm7dYPB3bb137fqNmxu3gtt37t67v7n14MjqyjAYMy20OUmoBcEVjB13Ak5KA1QmAo6T2auGPz4FY7lWh25eQixprnjGGXUITTd3ogRyrjx8qFrkSR1EjkuwQQQqXYGnm/3B7qA94eVguAz6ZHkOplvrf6JUs0qCckxQa32CvsDUwXZUWSgpm9EcJhgqhG3s22HqcBuRNMy0wU+5sEWDyIJbyHXZ+PEFTyFU2oGtV0gHshTUgW+YsMT69QQBruKg6ekKkOBTyGglHOq6IkVPed5Oa+cy0cIi7uZBVOC1GRRXpS/TrCkqdQp7YwvvtILV9q1X7G0kFaGDT672Sb5XB8HKwJ5K29THQSV1hb3INeBV3KRy2cvYc1VWaJUt7imrsI8Om8WGKTfAnJhjQJnhjrOQFdRQ5tB9t4vINSYUkrO6Qzg+O3vG0g44aZ0wE3uoGBXYN3yNu4mWcEefyHM6kR3GOknN3KTN8vFq00YdRAo+Mi0lxR8tOqwno3gSe1T6/vB/ed8f1XXdzXx7OdPIqxLlMrGbhA7OF2b5GU6NiwrRHj6DUEDm9gZSPl0FDc+LFsVN4jMYXvzpLwdHo93h893B+1F/f7R8EBvkEXlMdsiQvCD75A05IGPCyGfylXwnP3pfet96P3u/Fqnra0vNQ9I5vd//AAT2Y3c=</latexit>

=<latexit sha1_base64="Sqiovq9umAtAI3wYwYoDsWYcH5k=">AAAEDnicbVNNaxRBEO1k/YjjRxI9ehlcAiK67K4HvSwEBPGiRMgmgZ0h9PTUzDTpj7G7J7pp5gd48594Ez2IV++e/DfWTNawk03DQPHqvapXXdNJKbh1w+HftfXetes3bm7cCm7fuXtvc2v7/oHVlWEwZVpoc5RQC4IrmDruBByVBqhMBBwmJ6+a/OEpGMu12nfzEmJJc8UzzqhD6HirHyWQc+XhQ9UiT+pgEkSg0iUEWcPBsD3hajBaBH2yOHvH2+t/olSzSoJyTFBrfYKWwNTBTlRZKCk7oTnMMFQI29i3c9ThDiJpmGmDn3JhiwaRBXcu12Xjxxc8hVBpB7ZeSjqQpaAOfJMJS6xfzxDgKg6anq4ACT6FjFbCoa4rUvSU5+20di4TLSzibh5EBd6YQXFV+jLNmqJSpzCZWninFSy3b71ibyOpCB18crVP8kkdBEsDeyptUx8HldQV9nKuAa/KzSqXvYw9V2WFVtn5PWUV9tFhs9Mw5QaYE3MMKDPccRayghrKHLrvdhG5RkIhOas7CcdPzp6xtAPOWifMxB4qRgX2DV/jbqIF3NEn8iKdyE7GOknN3KTN8vFq00YdRAo+Mi0lxR8t2q9n43gWe1T6/uh/ed8f13XdZb5dZRp5FVEuiF0SOrhYmOVnODUuKkR7+AJCAZmbDKV8ugwanhctipvEZzC6/NOvBgfjwej5YPh+3N8dLx7EBnlIHpHHZERekF3yhuyRKWHkM/lKvpMfvS+9b72fvV/n1PW1heYB6Zze738CA2EE</latexit>

A<latexit sha1_base64="eXdKyuqx5r+N/vTwtEuOafZD0Qc=">AAAEE3icbVNLb9QwEHa7PEp4dAtHLhGrShWCanc5wKVSERLiAipSt620iSrHmSRW/Qi2U9ha+Qvc+CfcEBwQV06c+DdMsku1aWspyuj75vGNx5OUgls3HP5dWe1du37j5tqt4Padu/fW+xv3D6yuDIMJ00Kbo4RaEFzBxHEn4Kg0QGUi4DA5edXwh6dgLNdq381KiCXNFc84ow6h4/5WlEDOlYcPVYs8roMokf4l/kClS/BxfzDcHrYnvGyMFsaALM7e8cbqnyjVrJKgHBPUWp+gLjB1sBlVFkrKTmgOUzQVwjb2bTN1uIlIGmba4Kdc2KJBZMHNw3XZ6PEFTyFU2oGtl0gHshTUgW+YsMT89RQBruKgqekKkOBTyGglXNNoJ0jRU5633dqZTLSwiLtZEBV4bQaDq9KXadYklTqFnYmFd1rBcvlWK9Y2korQwSdX+yTfqYNgqWFPpW3yY6OSusJe5BrwKm5auexF7LkqK5TK5veUVVhHh81gw5QbYE7M0KDMcMdZyApqKHOovltF5BodCslZ3SEcPzl7ytIOOG2VMBN7qBgVWDd8jbOJFnAnPpHndCI7jHWSmplJm+Hj1aZNdBAp+Mi0lBQfWrRfT8fxNPbN6xuM/qf3g3Fd113Pt5c9jbzKUS4cu06o4Hxglp9h1zioEOXhGoQCMrczlPLJMmh4XrQoThLXYHTx0V82Dsbbo2fbw/fjwe54sRBr5CF5RLbIiDwnu+QN2SMTwshn8pV8Jz96X3rfej97v+auqyuLmAekc3q//wGnZ2Nd</latexit>

B<latexit sha1_base64="yDsOwuEimKagtOJ16lKY5DZeBPo=">AAAEE3icbVNLb9QwEHa7PEp4dAtHLhGrShWCanc5wKVSBRLiAipSt620iSrHmSRW/Qi2U9ha+Qvc+CfcEBwQV06c+DdMsku1aWspyuj75vGNx5OUgls3HP5dWe1du37j5tqt4Padu/fW+xv3D6yuDIMJ00Kbo4RaEFzBxHEn4Kg0QGUi4DA5edXwh6dgLNdq381KiCXNFc84ow6h4/5WlEDOlYcPVYs8roMokf4l/kClS/BxfzDcHrYnvGyMFsaALM7e8cbqnyjVrJKgHBPUWp+gLjB1sBlVFkrKTmgOUzQVwjb2bTN1uIlIGmba4Kdc2KJBZMHNw3XZ6PEFTyFU2oGtl0gHshTUgW+YsMT89RQBruKgqekKkOBTyGglXNNoJ0jRU5633dqZTLSwiLtZEBV4bQaDq9KXadYklTqFnYmFd1rBcvlWK9Y2korQwSdX+yTfqYNgqWFPpW3yY6OSusJe5BrwKm5auexF7LkqK5TK5veUVVhHh81gw5QbYE7M0KDMcMdZyApqKHOovltF5BodCslZ3SEcPzl7ytIOOG2VMBN7qBgVWDd8jbOJFnAnPpHndCI7jHWSmplJm+Hj1aZNdBAp+Mi0lBQfWrRfT8fxNPbN6xuM/qf3g3Fd113Pt5c9jbzKUS4cu06o4Hxglp9h1zioEOXhGoQCMrczlPLJMmh4XrQoThLXYHTx0V82Dsbbo2fbw/fjwe54sRBr5CF5RLbIiDwnu+QN2SMTwshn8pV8Jz96X3rfej97v+auqyuLmAekc3q//wGq/mNe</latexit>

C<latexit sha1_base64="xm5GZo/8DPaFbu5KCYNnXoOShdw=">AAAEE3icbVNLaxRBEO5kfcTxkY0evQwugSAadteDXgKBgHhRImSTwM4QenpqZpr0Y+zuiW6a+Qve/CfeRA/i1ZMn/401s2vYSdIwTPF99fiqqyspBbduOPy7stq7cfPW7bU7wd179x+s9zceHlpdGQYTpoU2xwm1ILiCieNOwHFpgMpEwFFyutfwR2dgLNfqwM1KiCXNFc84ow6hk/5WlEDOlYcPVYs8rYMokX4Pf6DSJfikPxhuD9sTXjVGC2NAFmf/ZGP1T5RqVklQjglqrU9QF5g62IwqCyVlpzSHKZoKYRv7tpk63EQkDTNt8FMubNEgsuDm4bps9PiCpxAq7cDWS6QDWQrqwDdMWGL+eooAV3HQ1HQFSPApZLQSrmm0E6ToGc/bbu1MJlpYxN0siAq8NoPBVenLNGuSSp3CzsTCO61guXyrFWsbSUXo4JOrfZLv1EGw1LCn0jb5sVFJXWEvcw14HTetXPYq9lyVFUpl83vKKqyjw2awYcoNMCdmaFBmuOMsZAU1lDlU360ico0OheSs7hCOn54/Z2kHnLZKmIk9VIwKrBu+xtlEC7gTn8gLOpEdxjpJzcykzfDxatMmOogUfGRaSooPLTqop+N4Gvvm9Q1G/9P7wbiu667n26ueRl7nKBeOXSdUcDEwy8+xaxxUiPJwDUIBmdsZSvlsGTQ8L1oUJ4lrMLr86K8ah+Pt0Yvt4fvxYHe8WIg18pg8IVtkRF6SXfKG7JMJYeQz+Uq+kx+9L71vvZ+9X3PX1ZVFzCPSOb3f/wCulWNf</latexit>

2 2 54 6 13 4 65 5 31 1 26 3 4

X.perm

§ Poorly structured tensors result in high atomic contention§ COO sparse matrix-vector product leverages sorting by increasing row-index

§ Apply this to coordinate-based sparse tensor format§ Don’t want multiple copies of tensor with different sortings§ Instead, compute permutation array p so that X.subscript(p(:,n),n) is increasing§ Each thread only writes when row index X.subscript(p(:,n),n) changes§ Substantially reduces number of atomic updates and improves their locality

Page 12: High performance tensor decomposition on emerging

MTTKRP Percentage of Peak BandwidthGB/s =

⇣(NR + 3)⇥ sizeof (Real) +N ⇥ sizeof (Ordinal)

⌘⇥ NNZ

t<latexit sha1_base64="pzKhIJoaNnnLHj1RzHv0KhcWQKg=">AAAEZXicbVPdbtMwFE63AiMw2ABxAxIWVVHHYLTlAm4qTUMCbihjWjdEE02Oc5Jai5NgO2Od5XfghXgQnoA34JqTtEzNNkuRjr7vO/8nQZ5wpbvd342l5ea16zdWbrq3bq/eubu2fu9AZYVkMGJZksmvAVWQ8BRGmusEvuYSqAgSOAyO35X84QlIxbN0X09z8AWNUx5xRjVCR2s/vQBinhr4XlTIc+t6IshOzYedV8qSAfF2eNzpDMne5usNT3MBiih+BlnUmen2gCZ2g2yS4VXsZxnytBSUYeb+XiQpM8PhN2u0Ja4HabiQ/mit1d3qVo9cNnpzo+XM3+7R+tIvL8xYISDVLKFKmQD7B2ndtlcoyCk7pjGM0UwRVr6phmZJG5GQRJnEL9WkQl1PgZ65Z3lZj5nwEEiaaVB2gdQg8oRqMCVDcoxvxwjw1HfLnHoCAkwIES0SjX51p5Se8LjqVk1xSIlCXE9db4Lrkehc5CYPozKoyEIYjBQMsxQW01e1Ym4paEI0nGprgnhgXXehYUOFKuNjo4LqibrIleBV3LjQ0Vvf8DQvsFQ2m1NUYJ6MlAdEQi6B6WSKBmWSa84Im1Bcqcbq61mSOEPBRHBma4Tmx2cvWVgDx1UlTPoGCoYXQ9rkPe7Gm8M1/0Cc04HANeMQw1Lnein8YJkQFE/K27fjvj/2DWpMq/c/kGn1rbV15afLSimuEoq5sC7CCs5XU16/KVdCBJX4Y5EEIj3oCvFiEZQ8nlQo7gwPvnfxvC8bB/2tXner96Xf2u7OT3/FeeQ8dTpOz3njbDsfnV1n5DDnb+Nxo914tvynudp80Hw4ky415j73ndprPvkHK4p78Q==</latexit><latexit sha1_base64="pzKhIJoaNnnLHj1RzHv0KhcWQKg=">AAAEZXicbVPdbtMwFE63AiMw2ABxAxIWVVHHYLTlAm4qTUMCbihjWjdEE02Oc5Jai5NgO2Od5XfghXgQnoA34JqTtEzNNkuRjr7vO/8nQZ5wpbvd342l5ea16zdWbrq3bq/eubu2fu9AZYVkMGJZksmvAVWQ8BRGmusEvuYSqAgSOAyO35X84QlIxbN0X09z8AWNUx5xRjVCR2s/vQBinhr4XlTIc+t6IshOzYedV8qSAfF2eNzpDMne5usNT3MBiih+BlnUmen2gCZ2g2yS4VXsZxnytBSUYeb+XiQpM8PhN2u0Ja4HabiQ/mit1d3qVo9cNnpzo+XM3+7R+tIvL8xYISDVLKFKmQD7B2ndtlcoyCk7pjGM0UwRVr6phmZJG5GQRJnEL9WkQl1PgZ65Z3lZj5nwEEiaaVB2gdQg8oRqMCVDcoxvxwjw1HfLnHoCAkwIES0SjX51p5Se8LjqVk1xSIlCXE9db4Lrkehc5CYPozKoyEIYjBQMsxQW01e1Ym4paEI0nGprgnhgXXehYUOFKuNjo4LqibrIleBV3LjQ0Vvf8DQvsFQ2m1NUYJ6MlAdEQi6B6WSKBmWSa84Im1Bcqcbq61mSOEPBRHBma4Tmx2cvWVgDx1UlTPoGCoYXQ9rkPe7Gm8M1/0Cc04HANeMQw1Lnein8YJkQFE/K27fjvj/2DWpMq/c/kGn1rbV15afLSimuEoq5sC7CCs5XU16/KVdCBJX4Y5EEIj3oCvFiEZQ8nlQo7gwPvnfxvC8bB/2tXner96Xf2u7OT3/FeeQ8dTpOz3njbDsfnV1n5DDnb+Nxo914tvynudp80Hw4ky415j73ndprPvkHK4p78Q==</latexit><latexit sha1_base64="pzKhIJoaNnnLHj1RzHv0KhcWQKg=">AAAEZXicbVPdbtMwFE63AiMw2ABxAxIWVVHHYLTlAm4qTUMCbihjWjdEE02Oc5Jai5NgO2Od5XfghXgQnoA34JqTtEzNNkuRjr7vO/8nQZ5wpbvd342l5ea16zdWbrq3bq/eubu2fu9AZYVkMGJZksmvAVWQ8BRGmusEvuYSqAgSOAyO35X84QlIxbN0X09z8AWNUx5xRjVCR2s/vQBinhr4XlTIc+t6IshOzYedV8qSAfF2eNzpDMne5usNT3MBiih+BlnUmen2gCZ2g2yS4VXsZxnytBSUYeb+XiQpM8PhN2u0Ja4HabiQ/mit1d3qVo9cNnpzo+XM3+7R+tIvL8xYISDVLKFKmQD7B2ndtlcoyCk7pjGM0UwRVr6phmZJG5GQRJnEL9WkQl1PgZ65Z3lZj5nwEEiaaVB2gdQg8oRqMCVDcoxvxwjw1HfLnHoCAkwIES0SjX51p5Se8LjqVk1xSIlCXE9db4Lrkehc5CYPozKoyEIYjBQMsxQW01e1Ym4paEI0nGprgnhgXXehYUOFKuNjo4LqibrIleBV3LjQ0Vvf8DQvsFQ2m1NUYJ6MlAdEQi6B6WSKBmWSa84Im1Bcqcbq61mSOEPBRHBma4Tmx2cvWVgDx1UlTPoGCoYXQ9rkPe7Gm8M1/0Cc04HANeMQw1Lnein8YJkQFE/K27fjvj/2DWpMq/c/kGn1rbV15afLSimuEoq5sC7CCs5XU16/KVdCBJX4Y5EEIj3oCvFiEZQ8nlQo7gwPvnfxvC8bB/2tXner96Xf2u7OT3/FeeQ8dTpOz3njbDsfnV1n5DDnb+Nxo914tvynudp80Hw4ky415j73ndprPvkHK4p78Q==</latexit><latexit sha1_base64="pzKhIJoaNnnLHj1RzHv0KhcWQKg=">AAAEZXicbVPdbtMwFE63AiMw2ABxAxIWVVHHYLTlAm4qTUMCbihjWjdEE02Oc5Jai5NgO2Od5XfghXgQnoA34JqTtEzNNkuRjr7vO/8nQZ5wpbvd342l5ea16zdWbrq3bq/eubu2fu9AZYVkMGJZksmvAVWQ8BRGmusEvuYSqAgSOAyO35X84QlIxbN0X09z8AWNUx5xRjVCR2s/vQBinhr4XlTIc+t6IshOzYedV8qSAfF2eNzpDMne5usNT3MBiih+BlnUmen2gCZ2g2yS4VXsZxnytBSUYeb+XiQpM8PhN2u0Ja4HabiQ/mit1d3qVo9cNnpzo+XM3+7R+tIvL8xYISDVLKFKmQD7B2ndtlcoyCk7pjGM0UwRVr6phmZJG5GQRJnEL9WkQl1PgZ65Z3lZj5nwEEiaaVB2gdQg8oRqMCVDcoxvxwjw1HfLnHoCAkwIES0SjX51p5Se8LjqVk1xSIlCXE9db4Lrkehc5CYPozKoyEIYjBQMsxQW01e1Ym4paEI0nGprgnhgXXehYUOFKuNjo4LqibrIleBV3LjQ0Vvf8DQvsFQ2m1NUYJ6MlAdEQi6B6WSKBmWSa84Im1Bcqcbq61mSOEPBRHBma4Tmx2cvWVgDx1UlTPoGCoYXQ9rkPe7Gm8M1/0Cc04HANeMQw1Lnein8YJkQFE/K27fjvj/2DWpMq/c/kGn1rbV15afLSimuEoq5sC7CCs5XU16/KVdCBJX4Y5EEIj3oCvFiEZQ8nlQo7gwPvnfxvC8bB/2tXner96Xf2u7OT3/FeeQ8dTpOz3njbDsfnV1n5DDnb+Nxo914tvynudp80Hw4ky415j73ndprPvkHK4p78Q==</latexit>

11

§ MTTKRP-A: Kokkos scratch-pad arrays§ MTTKRP-B: Compile-time polymorphic arrays

§ MTTKRP-C: Permutation-based MTTKRP

Page 13: High performance tensor decomposition on emerging

Performance on Real Tensors§ Look at Genten MTTKRP performance on several tensors from the FROSTT

collection§ Shaden Smith et al, https://frost.io§ Fixed 10 iterations of CP-ALS§ R =16 for all tests

§ Compare MTTKRP-B/C to SPLATT on HSW/KNL and ParTI! on GPU§ SPLATT: Shaden Smith et al, http://glaros.dtc.umn.edu/gkhome/splatt/overview

§ Compressed Sparse Fiber (CSF) storage format (Smith and Karypis, 2015)§ Use default 2 CSF modes, with and without tiling

§ ParTI!: J. Li et al, https://github.com/hpcgarage/ParTI§ Only works with order-3 and order-4 tensors§ Requires double-precision atomics (no K80)

12

SPARSE TENSOR DECOMPOSITION ON EMERGING ARCHITECTURES 19

Name Order Dimensions Nonzeros

LBNL 5 1.6K⇥ 4.2K⇥ 1.6K⇥ 4.2K⇥ 868K 1.7MUber 4 183⇥ 24⇥ 1.1K⇥ 1.7K 13MEnron 4 6.0K⇥ 5.7K⇥ 244K⇥ 1.2K 54MVAST 5 165K⇥ 11K⇥ 2⇥ 100⇥ 89 26MNELL2 3 12K⇥ 9.1K⇥ 29K 77M

Delicious3 3 532K⇥ 17M⇥ 2.5M 140M

(a) Real-world tensor data sets from FROSTT.

Arch. Method LBNL Uber Enron VAST NELL2 Delicious3

HSW MTTKRP-B 16.6 8.3 45.7 284.0 42.8 48.5MTTKRP-C 0.4 0.3 5.7 6.3 7.0 18.2SPLATT-M 0.8 0.2 1.1 26.1 1.3 13.7SPLATT-T 0.5 1.6 10.4 3.8 1.4 31.6DFacTo – – – – 34.6 338.3

KNL MTTKRP-B 41.2 9.2 108.0 1910.0 82.4 77.8MTTKRP-C 0.4 0.2 2.8 16.1 3.2 10.0SPLATT-M 2.0 1.2 2.6 150.5 9.0 16.5SPLATT-T 1.0 5.1 41.3 6.5 15.6 97.2DFacTo – – – – 47.4 370.2

K80 MTTKRP-B 72.4 5.6 218.0 2010.0 55.2 30.7MTTKRP-C 0.5 0.5 9.4 7.4 11.6 –

P100 MTTKRP-B 0.4 0.2 2.0 4.1 1.6 2.9MTTKRP-C 0.1 0.1 1.7 1.5 1.8 7.0

ParTI! – 0.3 1.9 – 1.5 2.8

(b) Total MTTKRP time (in seconds) for ten iterations with R=16 for each datatensor, architecture, and method. Due to the increased storage requirements ofthe permuted approach, global memory was exhausted for MTTKRP-B for theDelicious3 tensor on the K80 architecture. DFacTo implements MTTKRP only

for 3-way tensors, and ParTI! implements MTTKRP only for 3- or 4-way tensors.

(c) Speedup of the best GenTen MTTKRP methodover the best SPLATT, DFacTo and ParTI! methods.

Fig. 8: Results on real-world tensor data sets.

This manuscript is for review purposes only.

Page 14: High performance tensor decomposition on emerging

GenTen Speedup for Best MTTKRP Algorithm

13

Page 15: High performance tensor decomposition on emerging

§ Standard CP optimization problem:

§ Objective function derived through maximum likelihood estimation assuming the data tensor (X) entries are i.i.d. Gaussian

§ Generalize to other types of data through arbitrary loss function:

Review of Generalized CP (GCP)*

14

⇡ = + + · · ·+

a1<latexit sha1_base64="bZLlsxujpIq12ZHqnkAAJa+MPwo=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMqTNbBTEoLIlZrYxjLajzsKLzmjhDMnC9mabq7/iTLNagnJMUGt9iq2BaYKtuLZQUXZEC5iiqxC2ie/macItRLIw1wafcmGHBrEFdxquq7YfX/IMQqUd2GaJdCArQR34lgkrzN9MEeAqCdqargQJPoOc1sJhXD9I0WNedNPauUy1sIi7eRCX+HMGg+vKV1neJpU6g52JhbdawXL5rlesbSQVoYNPrvFpsdMEwdLAnkrb5sdBJXWlPc+14GXctHb588RzVdXYKjv9p7zGOjpsdxtm3ABzYo4OZYY7zkJWUkOZw+77VUShUVBKzpoe4fjRyVOW9cBp1wkziYeaUYF1w1e4m3gB9+JTeUanEteMn5i1uiBW8JFpKSmeVLzfTMfJNGkvzQ9H/xP54bhpmr7yzUWlkZcJ5ULYF2EHZ6ux/ATnw5WEkhq8+VBA7nYiKZ8sg4YXZYfizvDgR+fP+6IzGW+/2I7ejYe70eLy18h98oA8JCPyjOyS12SPTAgjX8gP8pv8GXwefB18G3w/la6uLGLukZ4Nfv4D0/JaqQ==</latexit><latexit sha1_base64="bZLlsxujpIq12ZHqnkAAJa+MPwo=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMqTNbBTEoLIlZrYxjLajzsKLzmjhDMnC9mabq7/iTLNagnJMUGt9iq2BaYKtuLZQUXZEC5iiqxC2ie/macItRLIw1wafcmGHBrEFdxquq7YfX/IMQqUd2GaJdCArQR34lgkrzN9MEeAqCdqargQJPoOc1sJhXD9I0WNedNPauUy1sIi7eRCX+HMGg+vKV1neJpU6g52JhbdawXL5rlesbSQVoYNPrvFpsdMEwdLAnkrb5sdBJXWlPc+14GXctHb588RzVdXYKjv9p7zGOjpsdxtm3ABzYo4OZYY7zkJWUkOZw+77VUShUVBKzpoe4fjRyVOW9cBp1wkziYeaUYF1w1e4m3gB9+JTeUanEteMn5i1uiBW8JFpKSmeVLzfTMfJNGkvzQ9H/xP54bhpmr7yzUWlkZcJ5ULYF2EHZ6ux/ATnw5WEkhq8+VBA7nYiKZ8sg4YXZYfizvDgR+fP+6IzGW+/2I7ejYe70eLy18h98oA8JCPyjOyS12SPTAgjX8gP8pv8GXwefB18G3w/la6uLGLukZ4Nfv4D0/JaqQ==</latexit><latexit sha1_base64="bZLlsxujpIq12ZHqnkAAJa+MPwo=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMqTNbBTEoLIlZrYxjLajzsKLzmjhDMnC9mabq7/iTLNagnJMUGt9iq2BaYKtuLZQUXZEC5iiqxC2ie/macItRLIw1wafcmGHBrEFdxquq7YfX/IMQqUd2GaJdCArQR34lgkrzN9MEeAqCdqargQJPoOc1sJhXD9I0WNedNPauUy1sIi7eRCX+HMGg+vKV1neJpU6g52JhbdawXL5rlesbSQVoYNPrvFpsdMEwdLAnkrb5sdBJXWlPc+14GXctHb588RzVdXYKjv9p7zGOjpsdxtm3ABzYo4OZYY7zkJWUkOZw+77VUShUVBKzpoe4fjRyVOW9cBp1wkziYeaUYF1w1e4m3gB9+JTeUanEteMn5i1uiBW8JFpKSmeVLzfTMfJNGkvzQ9H/xP54bhpmr7yzUWlkZcJ5ULYF2EHZ6ux/ATnw5WEkhq8+VBA7nYiKZ8sg4YXZYfizvDgR+fP+6IzGW+/2I7ejYe70eLy18h98oA8JCPyjOyS12SPTAgjX8gP8pv8GXwefB18G3w/la6uLGLukZ4Nfv4D0/JaqQ==</latexit>

b1<latexit sha1_base64="pNLT3wPMfHXDXOBuWbhoHTRD9yw=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMkyb2SiIQWVLzGxjGG1HnYUXndHCGZKF7c02V3/FmWa1BOWYoNb6FFsD0wRbcW2houyIFjBFVyFsE9/N04RbiGRhrg0+5cIODWIL7jRcV20/vuQZhEo7sM0S6UBWgjrwLRNWmL+ZIsBVErQ1XQkSfAY5rYXDuH6Qose86Ka1c5lqYRF38yAu8ecMBteVr7K8TSp1BjsTC2+1guXyXa9Y20gqQgefXOPTYqcJgqWBPZW2zY+DSupKe55rwcu4ae3y54nnqqqxVXb6T3mNdXTY7jbMuAHmxBwdygx3nIWspIYyh933q4hCo6CUnDU9wvGjk6cs64HTrhNmEg81owLrhq9wN/EC7sWn8oxOJa4ZPzFrdUGs4CPTUlI8qXi/mY6TadJemh+O/ifyw3HTNH3lm4tKIy8TyoWwL8IOzlZj+QnOhysJJTV486GA3O1EUj5ZBg0vyg7FneHBj86f90VnMt5+sR29Gw93o8Xlr5H75AF5SEbkGdklr8kemRBGvpAf5Df5M/g8+Dr4Nvh+Kl1dWcTcIz0b/PwH13Raqg==</latexit><latexit sha1_base64="pNLT3wPMfHXDXOBuWbhoHTRD9yw=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMkyb2SiIQWVLzGxjGG1HnYUXndHCGZKF7c02V3/FmWa1BOWYoNb6FFsD0wRbcW2houyIFjBFVyFsE9/N04RbiGRhrg0+5cIODWIL7jRcV20/vuQZhEo7sM0S6UBWgjrwLRNWmL+ZIsBVErQ1XQkSfAY5rYXDuH6Qose86Ka1c5lqYRF38yAu8ecMBteVr7K8TSp1BjsTC2+1guXyXa9Y20gqQgefXOPTYqcJgqWBPZW2zY+DSupKe55rwcu4ae3y54nnqqqxVXb6T3mNdXTY7jbMuAHmxBwdygx3nIWspIYyh933q4hCo6CUnDU9wvGjk6cs64HTrhNmEg81owLrhq9wN/EC7sWn8oxOJa4ZPzFrdUGs4CPTUlI8qXi/mY6TadJemh+O/ifyw3HTNH3lm4tKIy8TyoWwL8IOzlZj+QnOhysJJTV486GA3O1EUj5ZBg0vyg7FneHBj86f90VnMt5+sR29Gw93o8Xlr5H75AF5SEbkGdklr8kemRBGvpAf5Df5M/g8+Dr4Nvh+Kl1dWcTcIz0b/PwH13Raqg==</latexit><latexit sha1_base64="pNLT3wPMfHXDXOBuWbhoHTRD9yw=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMkyb2SiIQWVLzGxjGG1HnYUXndHCGZKF7c02V3/FmWa1BOWYoNb6FFsD0wRbcW2houyIFjBFVyFsE9/N04RbiGRhrg0+5cIODWIL7jRcV20/vuQZhEo7sM0S6UBWgjrwLRNWmL+ZIsBVErQ1XQkSfAY5rYXDuH6Qose86Ka1c5lqYRF38yAu8ecMBteVr7K8TSp1BjsTC2+1guXyXa9Y20gqQgefXOPTYqcJgqWBPZW2zY+DSupKe55rwcu4ae3y54nnqqqxVXb6T3mNdXTY7jbMuAHmxBwdygx3nIWspIYyh933q4hCo6CUnDU9wvGjk6cs64HTrhNmEg81owLrhq9wN/EC7sWn8oxOJa4ZPzFrdUGs4CPTUlI8qXi/mY6TadJemh+O/ifyw3HTNH3lm4tKIy8TyoWwL8IOzlZj+QnOhysJJTV486GA3O1EUj5ZBg0vyg7FneHBj86f90VnMt5+sR29Gw93o8Xlr5H75AF5SEbkGdklr8kemRBGvpAf5Df5M/g8+Dr4Nvh+Kl1dWcTcIz0b/PwH13Raqg==</latexit>

c1<latexit sha1_base64="57ecMXmzAHC0r+RoaNjdty+lspc=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMmTNbBTEoLIlZrYxjLajzsKLzmjhDMnC9mabq7/iTLNagnJMUGt9iq2BaYKtuLZQUXZEC5iiqxC2ie/macItRLIw1wafcmGHBrEFdxquq7YfX/IMQqUd2GaJdCArQR34lgkrzN9MEeAqCdqargQJPoOc1sJhXD9I0WNedNPauUy1sIi7eRCX+HMGg+vKV1neJpU6g52JhbdawXL5rlesbSQVoYNPrvFpsdMEwdLAnkrb5sdBJXWlPc+14GXctHb588RzVdXYKjv9p7zGOjpsdxtm3ABzYo4OZYY7zkJWUkOZw+77VUShUVBKzpoe4fjRyVOW9cBp1wkziYeaUYF1w1e4m3gB9+JTeUanEteMn5i1uiBW8JFpKSmeVLzfTMfJNGkvzQ9H/xP54bhpmr7yzUWlkZcJ5ULYF2EHZ6ux/ATnw5WEkhq8+VBA7nYiKZ8sg4YXZYfizvDgR+fP+6IzGW+/2I7ejYe70eLy18h98oA8JCPyjOyS12SPTAgjX8gP8pv8GXwefB18G3w/la6uLGLukZ4Nfv4D2vZaqw==</latexit><latexit sha1_base64="57ecMXmzAHC0r+RoaNjdty+lspc=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMmTNbBTEoLIlZrYxjLajzsKLzmjhDMnC9mabq7/iTLNagnJMUGt9iq2BaYKtuLZQUXZEC5iiqxC2ie/macItRLIw1wafcmGHBrEFdxquq7YfX/IMQqUd2GaJdCArQR34lgkrzN9MEeAqCdqargQJPoOc1sJhXD9I0WNedNPauUy1sIi7eRCX+HMGg+vKV1neJpU6g52JhbdawXL5rlesbSQVoYNPrvFpsdMEwdLAnkrb5sdBJXWlPc+14GXctHb588RzVdXYKjv9p7zGOjpsdxtm3ABzYo4OZYY7zkJWUkOZw+77VUShUVBKzpoe4fjRyVOW9cBp1wkziYeaUYF1w1e4m3gB9+JTeUanEteMn5i1uiBW8JFpKSmeVLzfTMfJNGkvzQ9H/xP54bhpmr7yzUWlkZcJ5ULYF2EHZ6ux/ATnw5WEkhq8+VBA7nYiKZ8sg4YXZYfizvDgR+fP+6IzGW+/2I7ejYe70eLy18h98oA8JCPyjOyS12SPTAgjX8gP8pv8GXwefB18G3w/la6uLGLukZ4Nfv4D2vZaqw==</latexit><latexit sha1_base64="57ecMXmzAHC0r+RoaNjdty+lspc=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMmTNbBTEoLIlZrYxjLajzsKLzmjhDMnC9mabq7/iTLNagnJMUGt9iq2BaYKtuLZQUXZEC5iiqxC2ie/macItRLIw1wafcmGHBrEFdxquq7YfX/IMQqUd2GaJdCArQR34lgkrzN9MEeAqCdqargQJPoOc1sJhXD9I0WNedNPauUy1sIi7eRCX+HMGg+vKV1neJpU6g52JhbdawXL5rlesbSQVoYNPrvFpsdMEwdLAnkrb5sdBJXWlPc+14GXctHb588RzVdXYKjv9p7zGOjpsdxtm3ABzYo4OZYY7zkJWUkOZw+77VUShUVBKzpoe4fjRyVOW9cBp1wkziYeaUYF1w1e4m3gB9+JTeUanEteMn5i1uiBW8JFpKSmeVLzfTMfJNGkvzQ9H/xP54bhpmr7yzUWlkZcJ5ULYF2EHZ6ux/ATnw5WEkhq8+VBA7nYiKZ8sg4YXZYfizvDgR+fP+6IzGW+/2I7ejYe70eLy18h98oA8JCPyjOyS12SPTAgjX8gP8pv8GXwefB18G3w/la6uLGLukZ4Nfv4D2vZaqw==</latexit>

a2<latexit sha1_base64="3QN6l7OepZ4BBf/JoX8lQSfg824=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMqTNbBzEoLIlZrYxjLajzsKLzmjhDMnC9mabq7/iTLNagnJMUGt9iq2BaYKtuLZQUXZEC5iiqxC2ie/macItRLIw1wafcmGHBrEFdxquq7YfX/IMQqUd2GaJdCArQR34lgkrzN9MEeAqCdqargQJPoOc1sJhXD9I0WNedNPauUy1sIi7eRCX+HMGg+vKV1neJpU6g52JhbdawXL5rlesbSQVoYNPrvFpsdMEwdLAnkrb5sdBJXWlPc+14GXctHb588RzVdXYKjv9p7zGOjpsdxtm3ABzYo4OZYY7zkJWUkOZw+77VUShUVBKzpoe4fjRyVOW9cBp1wkziYeaUYF1w1e4m3gB9+JTeUanEteMn5i1uiBW8JFpKSmeVLzfTMfJNGkvzQ9H/xP54bhpmr7yzUWlkZcJ5ULYF2EHZ6ux/ATnw5WEkhq8+VBA7nYiKZ8sg4YXZYfizvDgR+fP+6IzGW+/2I7ejYe70eLy18h98oA8JCPyjOyS12SPTAgjX8gP8pv8GXwefB18G3w/la6uLGLukZ4Nfv4D13Faqg==</latexit><latexit sha1_base64="3QN6l7OepZ4BBf/JoX8lQSfg824=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMqTNbBzEoLIlZrYxjLajzsKLzmjhDMnC9mabq7/iTLNagnJMUGt9iq2BaYKtuLZQUXZEC5iiqxC2ie/macItRLIw1wafcmGHBrEFdxquq7YfX/IMQqUd2GaJdCArQR34lgkrzN9MEeAqCdqargQJPoOc1sJhXD9I0WNedNPauUy1sIi7eRCX+HMGg+vKV1neJpU6g52JhbdawXL5rlesbSQVoYNPrvFpsdMEwdLAnkrb5sdBJXWlPc+14GXctHb588RzVdXYKjv9p7zGOjpsdxtm3ABzYo4OZYY7zkJWUkOZw+77VUShUVBKzpoe4fjRyVOW9cBp1wkziYeaUYF1w1e4m3gB9+JTeUanEteMn5i1uiBW8JFpKSmeVLzfTMfJNGkvzQ9H/xP54bhpmr7yzUWlkZcJ5ULYF2EHZ6ux/ATnw5WEkhq8+VBA7nYiKZ8sg4YXZYfizvDgR+fP+6IzGW+/2I7ejYe70eLy18h98oA8JCPyjOyS12SPTAgjX8gP8pv8GXwefB18G3w/la6uLGLukZ4Nfv4D13Faqg==</latexit><latexit sha1_base64="3QN6l7OepZ4BBf/JoX8lQSfg824=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMqTNbBzEoLIlZrYxjLajzsKLzmjhDMnC9mabq7/iTLNagnJMUGt9iq2BaYKtuLZQUXZEC5iiqxC2ie/macItRLIw1wafcmGHBrEFdxquq7YfX/IMQqUd2GaJdCArQR34lgkrzN9MEeAqCdqargQJPoOc1sJhXD9I0WNedNPauUy1sIi7eRCX+HMGg+vKV1neJpU6g52JhbdawXL5rlesbSQVoYNPrvFpsdMEwdLAnkrb5sdBJXWlPc+14GXctHb588RzVdXYKjv9p7zGOjpsdxtm3ABzYo4OZYY7zkJWUkOZw+77VUShUVBKzpoe4fjRyVOW9cBp1wkziYeaUYF1w1e4m3gB9+JTeUanEteMn5i1uiBW8JFpKSmeVLzfTMfJNGkvzQ9H/xP54bhpmr7yzUWlkZcJ5ULYF2EHZ6ux/ATnw5WEkhq8+VBA7nYiKZ8sg4YXZYfizvDgR+fP+6IzGW+/2I7ejYe70eLy18h98oA8JCPyjOyS12SPTAgjX8gP8pv8GXwefB18G3w/la6uLGLukZ4Nfv4D13Faqg==</latexit>

b2<latexit sha1_base64="nr/SoGwqBB6NMtha3atpqhNvlic=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMkyb2TiIQWVLzGxjGG1HnYUXndHCGZKF7c02V3/FmWa1BOWYoNb6FFsD0wRbcW2houyIFjBFVyFsE9/N04RbiGRhrg0+5cIODWIL7jRcV20/vuQZhEo7sM0S6UBWgjrwLRNWmL+ZIsBVErQ1XQkSfAY5rYXDuH6Qose86Ka1c5lqYRF38yAu8ecMBteVr7K8TSp1BjsTC2+1guXyXa9Y20gqQgefXOPTYqcJgqWBPZW2zY+DSupKe55rwcu4ae3y54nnqqqxVXb6T3mNdXTY7jbMuAHmxBwdygx3nIWspIYyh933q4hCo6CUnDU9wvGjk6cs64HTrhNmEg81owLrhq9wN/EC7sWn8oxOJa4ZPzFrdUGs4CPTUlI8qXi/mY6TadJemh+O/ifyw3HTNH3lm4tKIy8TyoWwL8IOzlZj+QnOhysJJTV486GA3O1EUj5ZBg0vyg7FneHBj86f90VnMt5+sR29Gw93o8Xlr5H75AF5SEbkGdklr8kemRBGvpAf5Df5M/g8+Dr4Nvh+Kl1dWcTcIz0b/PwH2vNaqw==</latexit><latexit sha1_base64="nr/SoGwqBB6NMtha3atpqhNvlic=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMkyb2TiIQWVLzGxjGG1HnYUXndHCGZKF7c02V3/FmWa1BOWYoNb6FFsD0wRbcW2houyIFjBFVyFsE9/N04RbiGRhrg0+5cIODWIL7jRcV20/vuQZhEo7sM0S6UBWgjrwLRNWmL+ZIsBVErQ1XQkSfAY5rYXDuH6Qose86Ka1c5lqYRF38yAu8ecMBteVr7K8TSp1BjsTC2+1guXyXa9Y20gqQgefXOPTYqcJgqWBPZW2zY+DSupKe55rwcu4ae3y54nnqqqxVXb6T3mNdXTY7jbMuAHmxBwdygx3nIWspIYyh933q4hCo6CUnDU9wvGjk6cs64HTrhNmEg81owLrhq9wN/EC7sWn8oxOJa4ZPzFrdUGs4CPTUlI8qXi/mY6TadJemh+O/ifyw3HTNH3lm4tKIy8TyoWwL8IOzlZj+QnOhysJJTV486GA3O1EUj5ZBg0vyg7FneHBj86f90VnMt5+sR29Gw93o8Xlr5H75AF5SEbkGdklr8kemRBGvpAf5Df5M/g8+Dr4Nvh+Kl1dWcTcIz0b/PwH2vNaqw==</latexit><latexit sha1_base64="nr/SoGwqBB6NMtha3atpqhNvlic=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMkyb2TiIQWVLzGxjGG1HnYUXndHCGZKF7c02V3/FmWa1BOWYoNb6FFsD0wRbcW2houyIFjBFVyFsE9/N04RbiGRhrg0+5cIODWIL7jRcV20/vuQZhEo7sM0S6UBWgjrwLRNWmL+ZIsBVErQ1XQkSfAY5rYXDuH6Qose86Ka1c5lqYRF38yAu8ecMBteVr7K8TSp1BjsTC2+1guXyXa9Y20gqQgefXOPTYqcJgqWBPZW2zY+DSupKe55rwcu4ae3y54nnqqqxVXb6T3mNdXTY7jbMuAHmxBwdygx3nIWspIYyh933q4hCo6CUnDU9wvGjk6cs64HTrhNmEg81owLrhq9wN/EC7sWn8oxOJa4ZPzFrdUGs4CPTUlI8qXi/mY6TadJemh+O/ifyw3HTNH3lm4tKIy8TyoWwL8IOzlZj+QnOhysJJTV486GA3O1EUj5ZBg0vyg7FneHBj86f90VnMt5+sR29Gw93o8Xlr5H75AF5SEbkGdklr8kemRBGvpAf5Df5M/g8+Dr4Nvh+Kl1dWcTcIz0b/PwH2vNaqw==</latexit>

c2<latexit sha1_base64="folyhAlQ30AvF13yYCLaUbC97y8=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMmTNbBzEoLIlZrYxjLajzsKLzmjhDMnC9mabq7/iTLNagnJMUGt9iq2BaYKtuLZQUXZEC5iiqxC2ie/macItRLIw1wafcmGHBrEFdxquq7YfX/IMQqUd2GaJdCArQR34lgkrzN9MEeAqCdqargQJPoOc1sJhXD9I0WNedNPauUy1sIi7eRCX+HMGg+vKV1neJpU6g52JhbdawXL5rlesbSQVoYNPrvFpsdMEwdLAnkrb5sdBJXWlPc+14GXctHb588RzVdXYKjv9p7zGOjpsdxtm3ABzYo4OZYY7zkJWUkOZw+77VUShUVBKzpoe4fjRyVOW9cBp1wkziYeaUYF1w1e4m3gB9+JTeUanEteMn5i1uiBW8JFpKSmeVLzfTMfJNGkvzQ9H/xP54bhpmr7yzUWlkZcJ5ULYF2EHZ6ux/ATnw5WEkhq8+VBA7nYiKZ8sg4YXZYfizvDgR+fP+6IzGW+/2I7ejYe70eLy18h98oA8JCPyjOyS12SPTAgjX8gP8pv8GXwefB18G3w/la6uLGLukZ4Nfv4D3nVarA==</latexit><latexit sha1_base64="folyhAlQ30AvF13yYCLaUbC97y8=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMmTNbBzEoLIlZrYxjLajzsKLzmjhDMnC9mabq7/iTLNagnJMUGt9iq2BaYKtuLZQUXZEC5iiqxC2ie/macItRLIw1wafcmGHBrEFdxquq7YfX/IMQqUd2GaJdCArQR34lgkrzN9MEeAqCdqargQJPoOc1sJhXD9I0WNedNPauUy1sIi7eRCX+HMGg+vKV1neJpU6g52JhbdawXL5rlesbSQVoYNPrvFpsdMEwdLAnkrb5sdBJXWlPc+14GXctHb588RzVdXYKjv9p7zGOjpsdxtm3ABzYo4OZYY7zkJWUkOZw+77VUShUVBKzpoe4fjRyVOW9cBp1wkziYeaUYF1w1e4m3gB9+JTeUanEteMn5i1uiBW8JFpKSmeVLzfTMfJNGkvzQ9H/xP54bhpmr7yzUWlkZcJ5ULYF2EHZ6ux/ATnw5WEkhq8+VBA7nYiKZ8sg4YXZYfizvDgR+fP+6IzGW+/2I7ejYe70eLy18h98oA8JCPyjOyS12SPTAgjX8gP8pv8GXwefB18G3w/la6uLGLukZ4Nfv4D3nVarA==</latexit><latexit sha1_base64="folyhAlQ30AvF13yYCLaUbC97y8=">AAAD/3icbVLLahRBFK1kfMT2kUSXbhqHgPgIPbNRF4GAIG6UCBkTmG6G6urb3UXq0VZVRydFb/wD/8KVuBMXgn6Gf+Ptzhimk1wouJxz7qvuTSvBrYuivyurgytXr11fuxHcvHX7zvrG5t33VteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkB69bPmDYzCWa7Xv5hUkkhaK55xRh9Bs43GcQsGVhw91hzxqAh+nMmTNbBzEoLIlZrYxjLajzsKLzmjhDMnC9mabq7/iTLNagnJMUGt9iq2BaYKtuLZQUXZEC5iiqxC2ie/macItRLIw1wafcmGHBrEFdxquq7YfX/IMQqUd2GaJdCArQR34lgkrzN9MEeAqCdqargQJPoOc1sJhXD9I0WNedNPauUy1sIi7eRCX+HMGg+vKV1neJpU6g52JhbdawXL5rlesbSQVoYNPrvFpsdMEwdLAnkrb5sdBJXWlPc+14GXctHb588RzVdXYKjv9p7zGOjpsdxtm3ABzYo4OZYY7zkJWUkOZw+77VUShUVBKzpoe4fjRyVOW9cBp1wkziYeaUYF1w1e4m3gB9+JTeUanEteMn5i1uiBW8JFpKSmeVLzfTMfJNGkvzQ9H/xP54bhpmr7yzUWlkZcJ5ULYF2EHZ6ux/ATnw5WEkhq8+VBA7nYiKZ8sg4YXZYfizvDgR+fP+6IzGW+/2I7ejYe70eLy18h98oA8JCPyjOyS12SPTAgjX8gP8pv8GXwefB18G3w/la6uLGLukZ4Nfv4D3nVarA==</latexit>

aR<latexit sha1_base64="MLPcHuqx/HW0NP4sQI0h/kf1ht0=">AAAD/3icbVLLahRBFK1kfMT2lejSTeEQEB+hZzbqIhAQxI0SJWMC081QXX27u0g92qrq6KTpjX/gX7gSd+JC0M/wb7zdGcN0kgsFl3POfdW9SSmF82H4d2V1cOnylatr14LrN27eur2+cee9M5XlMOFGGnuQMAdSaJh44SUclBaYSiTsJ4cvWn7/CKwTRu/5eQmxYrkWmeDMIzRbfxQlkAtdw4eqQx42QR0lirJm9i6IQKdLzGx9GG6FndHzzmjhDMnCdmcbq7+i1PBKgfZcMufqBFsD2wSbUeWgZPyQ5TBFVyPs4rqbp6GbiKQ0Mxaf9rRDg8iBPwk3ZdtPXYgUqDYeXLNEelClZB7qlqEl5m+mCAgdB21NX4CCOoWMVdJjXD9IsyORd9O6uUqMdIj7eRAV+HMWg6uyLtOsTapMCtsTB2+MhuXyXa9Y2yomqYdPvqmTfLsJgqWBa6Zcmx8HVcwX7izXghdx08pnz+Ja6LLCVvnJP2UV1jG03S1NhQXu5Rwdxq3wglNeMMu4x+77VWRuUFAowZse4cXh8ROe9sBp1wm3cQ0VZxLr0pe4m2gB9+ITdUonCteMn5i2uiDS8JEbpRieVLTXTMfxNG4vrR6O/ieqh+OmafrK1+eVVl0kVAthX4QdnK7GiWOcD1dCFbN481RC5rdDpR4vg1bkRYfizvDgR2fP+7wzGW893wrfjoc74eLy18g9cp88ICPylOyQV2SXTAgnX8gP8pv8GXwefB18G3w/ka6uLGLukp4Nfv4DR2Bayg==</latexit><latexit sha1_base64="MLPcHuqx/HW0NP4sQI0h/kf1ht0=">AAAD/3icbVLLahRBFK1kfMT2lejSTeEQEB+hZzbqIhAQxI0SJWMC081QXX27u0g92qrq6KTpjX/gX7gSd+JC0M/wb7zdGcN0kgsFl3POfdW9SSmF82H4d2V1cOnylatr14LrN27eur2+cee9M5XlMOFGGnuQMAdSaJh44SUclBaYSiTsJ4cvWn7/CKwTRu/5eQmxYrkWmeDMIzRbfxQlkAtdw4eqQx42QR0lirJm9i6IQKdLzGx9GG6FndHzzmjhDMnCdmcbq7+i1PBKgfZcMufqBFsD2wSbUeWgZPyQ5TBFVyPs4rqbp6GbiKQ0Mxaf9rRDg8iBPwk3ZdtPXYgUqDYeXLNEelClZB7qlqEl5m+mCAgdB21NX4CCOoWMVdJjXD9IsyORd9O6uUqMdIj7eRAV+HMWg6uyLtOsTapMCtsTB2+MhuXyXa9Y2yomqYdPvqmTfLsJgqWBa6Zcmx8HVcwX7izXghdx08pnz+Ja6LLCVvnJP2UV1jG03S1NhQXu5Rwdxq3wglNeMMu4x+77VWRuUFAowZse4cXh8ROe9sBp1wm3cQ0VZxLr0pe4m2gB9+ITdUonCteMn5i2uiDS8JEbpRieVLTXTMfxNG4vrR6O/ieqh+OmafrK1+eVVl0kVAthX4QdnK7GiWOcD1dCFbN481RC5rdDpR4vg1bkRYfizvDgR2fP+7wzGW893wrfjoc74eLy18g9cp88ICPylOyQV2SXTAgnX8gP8pv8GXwefB18G3w/ka6uLGLukp4Nfv4DR2Bayg==</latexit><latexit sha1_base64="MLPcHuqx/HW0NP4sQI0h/kf1ht0=">AAAD/3icbVLLahRBFK1kfMT2lejSTeEQEB+hZzbqIhAQxI0SJWMC081QXX27u0g92qrq6KTpjX/gX7gSd+JC0M/wb7zdGcN0kgsFl3POfdW9SSmF82H4d2V1cOnylatr14LrN27eur2+cee9M5XlMOFGGnuQMAdSaJh44SUclBaYSiTsJ4cvWn7/CKwTRu/5eQmxYrkWmeDMIzRbfxQlkAtdw4eqQx42QR0lirJm9i6IQKdLzGx9GG6FndHzzmjhDMnCdmcbq7+i1PBKgfZcMufqBFsD2wSbUeWgZPyQ5TBFVyPs4rqbp6GbiKQ0Mxaf9rRDg8iBPwk3ZdtPXYgUqDYeXLNEelClZB7qlqEl5m+mCAgdB21NX4CCOoWMVdJjXD9IsyORd9O6uUqMdIj7eRAV+HMWg6uyLtOsTapMCtsTB2+MhuXyXa9Y2yomqYdPvqmTfLsJgqWBa6Zcmx8HVcwX7izXghdx08pnz+Ja6LLCVvnJP2UV1jG03S1NhQXu5Rwdxq3wglNeMMu4x+77VWRuUFAowZse4cXh8ROe9sBp1wm3cQ0VZxLr0pe4m2gB9+ITdUonCteMn5i2uiDS8JEbpRieVLTXTMfxNG4vrR6O/ieqh+OmafrK1+eVVl0kVAthX4QdnK7GiWOcD1dCFbN481RC5rdDpR4vg1bkRYfizvDgR2fP+7wzGW893wrfjoc74eLy18g9cp88ICPylOyQV2SXTAgnX8gP8pv8GXwefB18G3w/ka6uLGLukp4Nfv4DR2Bayg==</latexit>

bR<latexit sha1_base64="RJWEYvOrWTNAQ3RLR3Oiz6URbhw=">AAAD/3icbVLLahRBFK1kfMT2lejSTeEQEB+hZzbqIhAQxI0SJWMC081QXX27u0g92qrq6KTpjX/gX7gSd+JC0M/wb7zdGcN0kgsFl3POfdW9SSmF82H4d2V1cOnylatr14LrN27eur2+cee9M5XlMOFGGnuQMAdSaJh44SUclBaYSiTsJ4cvWn7/CKwTRu/5eQmxYrkWmeDMIzRbfxQlkAtdw4eqQx42QR0liibN7F0QgU6XmNn6MNwKO6PnndHCGZKF7c42Vn9FqeGVAu25ZM7VCbYGtgk2o8pByfghy2GKrkbYxXU3T0M3EUlpZiw+7WmHBpEDfxJuyrafuhApUG08uGaJ9KBKyTzULUNLzN9MERA6DtqavgAFdQoZq6THuH6QZkci76Z1c5UY6RD38yAq8OcsBldlXaZZm1SZFLYnDt4YDcvlu16xtlVMUg+ffFMn+XYTBEsD10y5Nj8Oqpgv3FmuBS/ippXPnsW10GWFrfKTf8oqrGNou1uaCgvcyzk6jFvhBae8YJZxj933q8jcoKBQgjc9wovD4yc87YHTrhNu4xoqziTWpS9xN9EC7sUn6pROFK4ZPzFtdUGk4SM3SjE8qWivmY7jadxeWj0c/U9UD8dN0/SVr88rrbpIqBbCvgg7OF2NE8c4H66EKmbx5qmEzG+HSj1eBq3Iiw7FneHBj86e93lnMt56vhW+HQ93wsXlr5F75D55QEbkKdkhr8gumRBOvpAf5Df5M/g8+Dr4Nvh+Il1dWcTcJT0b/PwHSuJayw==</latexit><latexit sha1_base64="RJWEYvOrWTNAQ3RLR3Oiz6URbhw=">AAAD/3icbVLLahRBFK1kfMT2lejSTeEQEB+hZzbqIhAQxI0SJWMC081QXX27u0g92qrq6KTpjX/gX7gSd+JC0M/wb7zdGcN0kgsFl3POfdW9SSmF82H4d2V1cOnylatr14LrN27eur2+cee9M5XlMOFGGnuQMAdSaJh44SUclBaYSiTsJ4cvWn7/CKwTRu/5eQmxYrkWmeDMIzRbfxQlkAtdw4eqQx42QR0liibN7F0QgU6XmNn6MNwKO6PnndHCGZKF7c42Vn9FqeGVAu25ZM7VCbYGtgk2o8pByfghy2GKrkbYxXU3T0M3EUlpZiw+7WmHBpEDfxJuyrafuhApUG08uGaJ9KBKyTzULUNLzN9MERA6DtqavgAFdQoZq6THuH6QZkci76Z1c5UY6RD38yAq8OcsBldlXaZZm1SZFLYnDt4YDcvlu16xtlVMUg+ffFMn+XYTBEsD10y5Nj8Oqpgv3FmuBS/ippXPnsW10GWFrfKTf8oqrGNou1uaCgvcyzk6jFvhBae8YJZxj933q8jcoKBQgjc9wovD4yc87YHTrhNu4xoqziTWpS9xN9EC7sUn6pROFK4ZPzFtdUGk4SM3SjE8qWivmY7jadxeWj0c/U9UD8dN0/SVr88rrbpIqBbCvgg7OF2NE8c4H66EKmbx5qmEzG+HSj1eBq3Iiw7FneHBj86e93lnMt56vhW+HQ93wsXlr5F75D55QEbkKdkhr8gumRBOvpAf5Df5M/g8+Dr4Nvh+Il1dWcTcJT0b/PwHSuJayw==</latexit><latexit sha1_base64="RJWEYvOrWTNAQ3RLR3Oiz6URbhw=">AAAD/3icbVLLahRBFK1kfMT2lejSTeEQEB+hZzbqIhAQxI0SJWMC081QXX27u0g92qrq6KTpjX/gX7gSd+JC0M/wb7zdGcN0kgsFl3POfdW9SSmF82H4d2V1cOnylatr14LrN27eur2+cee9M5XlMOFGGnuQMAdSaJh44SUclBaYSiTsJ4cvWn7/CKwTRu/5eQmxYrkWmeDMIzRbfxQlkAtdw4eqQx42QR0liibN7F0QgU6XmNn6MNwKO6PnndHCGZKF7c42Vn9FqeGVAu25ZM7VCbYGtgk2o8pByfghy2GKrkbYxXU3T0M3EUlpZiw+7WmHBpEDfxJuyrafuhApUG08uGaJ9KBKyTzULUNLzN9MERA6DtqavgAFdQoZq6THuH6QZkci76Z1c5UY6RD38yAq8OcsBldlXaZZm1SZFLYnDt4YDcvlu16xtlVMUg+ffFMn+XYTBEsD10y5Nj8Oqpgv3FmuBS/ippXPnsW10GWFrfKTf8oqrGNou1uaCgvcyzk6jFvhBae8YJZxj933q8jcoKBQgjc9wovD4yc87YHTrhNu4xoqziTWpS9xN9EC7sUn6pROFK4ZPzFtdUGk4SM3SjE8qWivmY7jadxeWj0c/U9UD8dN0/SVr88rrbpIqBbCvgg7OF2NE8c4H66EKmbx5qmEzG+HSj1eBq3Iiw7FneHBj86e93lnMt56vhW+HQ93wsXlr5F75D55QEbkKdkhr8gumRBOvpAf5Df5M/g8+Dr4Nvh+Il1dWcTcJT0b/PwHSuJayw==</latexit>

cR<latexit sha1_base64="m1+JF5OlXgpsTj+DIsv/FR3m2OU=">AAAD/3icbVLLahRBFK1kfMT2lejSTeEQEB+hZzbqIhAQxI0SJWMC081QXX27u0g92qrq6KTpjX/gX7gSd+JC0M/wb7zdGcN0kgsFl3POfdW9SSmF82H4d2V1cOnylatr14LrN27eur2+cee9M5XlMOFGGnuQMAdSaJh44SUclBaYSiTsJ4cvWn7/CKwTRu/5eQmxYrkWmeDMIzRbfxQlkAtdw4eqQx42QR0livJm9i6IQKdLzGx9GG6FndHzzmjhDMnCdmcbq7+i1PBKgfZcMufqBFsD2wSbUeWgZPyQ5TBFVyPs4rqbp6GbiKQ0Mxaf9rRDg8iBPwk3ZdtPXYgUqDYeXLNEelClZB7qlqEl5m+mCAgdB21NX4CCOoWMVdJjXD9IsyORd9O6uUqMdIj7eRAV+HMWg6uyLtOsTapMCtsTB2+MhuXyXa9Y2yomqYdPvqmTfLsJgqWBa6Zcmx8HVcwX7izXghdx08pnz+Ja6LLCVvnJP2UV1jG03S1NhQXu5Rwdxq3wglNeMMu4x+77VWRuUFAowZse4cXh8ROe9sBp1wm3cQ0VZxLr0pe4m2gB9+ITdUonCteMn5i2uiDS8JEbpRieVLTXTMfxNG4vrR6O/ieqh+OmafrK1+eVVl0kVAthX4QdnK7GiWOcD1dCFbN481RC5rdDpR4vg1bkRYfizvDgR2fP+7wzGW893wrfjoc74eLy18g9cp88ICPylOyQV2SXTAgnX8gP8pv8GXwefB18G3w/ka6uLGLukp4Nfv4DTmRazA==</latexit><latexit sha1_base64="m1+JF5OlXgpsTj+DIsv/FR3m2OU=">AAAD/3icbVLLahRBFK1kfMT2lejSTeEQEB+hZzbqIhAQxI0SJWMC081QXX27u0g92qrq6KTpjX/gX7gSd+JC0M/wb7zdGcN0kgsFl3POfdW9SSmF82H4d2V1cOnylatr14LrN27eur2+cee9M5XlMOFGGnuQMAdSaJh44SUclBaYSiTsJ4cvWn7/CKwTRu/5eQmxYrkWmeDMIzRbfxQlkAtdw4eqQx42QR0livJm9i6IQKdLzGx9GG6FndHzzmjhDMnCdmcbq7+i1PBKgfZcMufqBFsD2wSbUeWgZPyQ5TBFVyPs4rqbp6GbiKQ0Mxaf9rRDg8iBPwk3ZdtPXYgUqDYeXLNEelClZB7qlqEl5m+mCAgdB21NX4CCOoWMVdJjXD9IsyORd9O6uUqMdIj7eRAV+HMWg6uyLtOsTapMCtsTB2+MhuXyXa9Y2yomqYdPvqmTfLsJgqWBa6Zcmx8HVcwX7izXghdx08pnz+Ja6LLCVvnJP2UV1jG03S1NhQXu5Rwdxq3wglNeMMu4x+77VWRuUFAowZse4cXh8ROe9sBp1wm3cQ0VZxLr0pe4m2gB9+ITdUonCteMn5i2uiDS8JEbpRieVLTXTMfxNG4vrR6O/ieqh+OmafrK1+eVVl0kVAthX4QdnK7GiWOcD1dCFbN481RC5rdDpR4vg1bkRYfizvDgR2fP+7wzGW893wrfjoc74eLy18g9cp88ICPylOyQV2SXTAgnX8gP8pv8GXwefB18G3w/ka6uLGLukp4Nfv4DTmRazA==</latexit><latexit sha1_base64="m1+JF5OlXgpsTj+DIsv/FR3m2OU=">AAAD/3icbVLLahRBFK1kfMT2lejSTeEQEB+hZzbqIhAQxI0SJWMC081QXX27u0g92qrq6KTpjX/gX7gSd+JC0M/wb7zdGcN0kgsFl3POfdW9SSmF82H4d2V1cOnylatr14LrN27eur2+cee9M5XlMOFGGnuQMAdSaJh44SUclBaYSiTsJ4cvWn7/CKwTRu/5eQmxYrkWmeDMIzRbfxQlkAtdw4eqQx42QR0livJm9i6IQKdLzGx9GG6FndHzzmjhDMnCdmcbq7+i1PBKgfZcMufqBFsD2wSbUeWgZPyQ5TBFVyPs4rqbp6GbiKQ0Mxaf9rRDg8iBPwk3ZdtPXYgUqDYeXLNEelClZB7qlqEl5m+mCAgdB21NX4CCOoWMVdJjXD9IsyORd9O6uUqMdIj7eRAV+HMWg6uyLtOsTapMCtsTB2+MhuXyXa9Y2yomqYdPvqmTfLsJgqWBa6Zcmx8HVcwX7izXghdx08pnz+Ja6LLCVvnJP2UV1jG03S1NhQXu5Rwdxq3wglNeMMu4x+77VWRuUFAowZse4cXh8ROe9sBp1wm3cQ0VZxLr0pe4m2gB9+ITdUonCteMn5i2uiDS8JEbpRieVLTXTMfxNG4vrR6O/ieqh+OmafrK1+eVVl0kVAthX4QdnK7GiWOcD1dCFbN481RC5rdDpR4vg1bkRYfizvDgR2fP+7wzGW893wrfjoc74eLy18g9cp88ICPylOyQV2SXTAgnX8gP8pv8GXwefB18G3w/ka6uLGLukp4Nfv4DTmRazA==</latexit>

X<latexit sha1_base64="Hu9fA9dopUmLwc0ves7vPFyLIvk=">AAAD+3icbVJba9RAFJ52vdR4aauPvgSXooiW7L6oD4WCIL4oFbq2sAllMjlJhp1LnJlUt0P8B/4Ln8Q38c3+EP+NJ+laNm0HEg7f9537SSvBrYuivyurg2vXb9xcuxXcvnP33vrG5v2PVteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkM5et/zBMRjLtdp38woSSQvFc86oQ+ho43GcQsGVh091hzxtgnjfH+IfVLaEHm0Mo+2oe+FlY7QwhmTx9o42V0/jTLNagnJMUGt9imWBaYKtuLZQUTajBUzRVAjbxHe9NOEWIlmYa4OfcmGHBrEFd+auq7YeX/IMQqUd2GaJdCArQR34lgkrjN9MEeAqCdqcrgQJPoOc1sKhX99J0WNedN3auUy1sIi7eRCXODWDznXlqyxvg0qdwc7EwnutYDl9VyvmNpKK0MEX1/i02GmCYKlhT6Vt42OjkrrSXuRa8CpuWrv8ZeK5qmoslZ3NKa8xjw7bvYYZN8CcmKNBmeGOs5CV1FDmsPp+FlFoFJSSs6ZHOD47ec6yHjjtKmEm8VAzKjBv+AZ3Ey/gnn8qz+lU4ppxiFmrC2IFn5mWkuJJxfvNdJxME48aPxz9D+SH46Zp+sp3l5VGXiWUC2FfhBWcr8byE+wPVxJKavDeQwG524mkfLYMGl6UHYo7w4MfXTzvy8ZkvP1qO/owHu5Gi8tfIw/JI/KEjMgLskvekj0yIYx8I7/IH3I6+Dr4Pvgx+HkmXV1Z+DwgvTf4/Q/cEFlN</latexit><latexit sha1_base64="Hu9fA9dopUmLwc0ves7vPFyLIvk=">AAAD+3icbVJba9RAFJ52vdR4aauPvgSXooiW7L6oD4WCIL4oFbq2sAllMjlJhp1LnJlUt0P8B/4Ln8Q38c3+EP+NJ+laNm0HEg7f9537SSvBrYuivyurg2vXb9xcuxXcvnP33vrG5v2PVteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkM5et/zBMRjLtdp38woSSQvFc86oQ+ho43GcQsGVh091hzxtgnjfH+IfVLaEHm0Mo+2oe+FlY7QwhmTx9o42V0/jTLNagnJMUGt9imWBaYKtuLZQUTajBUzRVAjbxHe9NOEWIlmYa4OfcmGHBrEFd+auq7YeX/IMQqUd2GaJdCArQR34lgkrjN9MEeAqCdqcrgQJPoOc1sKhX99J0WNedN3auUy1sIi7eRCXODWDznXlqyxvg0qdwc7EwnutYDl9VyvmNpKK0MEX1/i02GmCYKlhT6Vt42OjkrrSXuRa8CpuWrv8ZeK5qmoslZ3NKa8xjw7bvYYZN8CcmKNBmeGOs5CV1FDmsPp+FlFoFJSSs6ZHOD47ec6yHjjtKmEm8VAzKjBv+AZ3Ey/gnn8qz+lU4ppxiFmrC2IFn5mWkuJJxfvNdJxME48aPxz9D+SH46Zp+sp3l5VGXiWUC2FfhBWcr8byE+wPVxJKavDeQwG524mkfLYMGl6UHYo7w4MfXTzvy8ZkvP1qO/owHu5Gi8tfIw/JI/KEjMgLskvekj0yIYx8I7/IH3I6+Dr4Pvgx+HkmXV1Z+DwgvTf4/Q/cEFlN</latexit><latexit sha1_base64="Hu9fA9dopUmLwc0ves7vPFyLIvk=">AAAD+3icbVJba9RAFJ52vdR4aauPvgSXooiW7L6oD4WCIL4oFbq2sAllMjlJhp1LnJlUt0P8B/4Ln8Q38c3+EP+NJ+laNm0HEg7f9537SSvBrYuivyurg2vXb9xcuxXcvnP33vrG5v2PVteGwYRpoc1hSi0IrmDiuBNwWBmgMhVwkM5et/zBMRjLtdp38woSSQvFc86oQ+ho43GcQsGVh091hzxtgnjfH+IfVLaEHm0Mo+2oe+FlY7QwhmTx9o42V0/jTLNagnJMUGt9imWBaYKtuLZQUTajBUzRVAjbxHe9NOEWIlmYa4OfcmGHBrEFd+auq7YeX/IMQqUd2GaJdCArQR34lgkrjN9MEeAqCdqcrgQJPoOc1sKhX99J0WNedN3auUy1sIi7eRCXODWDznXlqyxvg0qdwc7EwnutYDl9VyvmNpKK0MEX1/i02GmCYKlhT6Vt42OjkrrSXuRa8CpuWrv8ZeK5qmoslZ3NKa8xjw7bvYYZN8CcmKNBmeGOs5CV1FDmsPp+FlFoFJSSs6ZHOD47ec6yHjjtKmEm8VAzKjBv+AZ3Ey/gnn8qz+lU4ppxiFmrC2IFn5mWkuJJxfvNdJxME48aPxz9D+SH46Zp+sp3l5VGXiWUC2FfhBWcr8byE+wPVxJKavDeQwG524mkfLYMGl6UHYo7w4MfXTzvy8ZkvP1qO/owHu5Gi8tfIw/JI/KEjMgLskvekj0yIYx8I7/IH3I6+Dr4Pvgx+HkmXV1Z+DwgvTf4/Q/cEFlN</latexit>

M<latexit sha1_base64="fQXu8n+2Bs6NOHe7rfWSYJA1mSQ=">AAAD+3icbVJba9RAFJ52vdR4a/XRl+BSFNGS3Rf1oVAQxJdKha4tbEKZTE6SYecSZybV7RD/gf/CJ/FNfLM/xH/jSbqWTduBhMP3fed+0kpw66Lo78rq4Nr1GzfXbgW379y9d39948FHq2vDYMK00OYwpRYEVzBx3Ak4rAxQmQo4SGdvWv7gGIzlWu27eQWJpIXiOWfUIXS0/iROoeDKw6e6Q541Qbzvd/EPKltCj9aH0VbUvfCyMVoYQ7J4e0cbq6dxplktQTkmqLU+xbLANMFmXFuoKJvRAqZoKoRt4rtemnATkSzMtcFPubBDg9iCO3PXVVuPL3kGodIObLNEOpCVoA58y4QVxm+mCHCVBG1OV4IEn0FOa+HQr++k6DEvum7tXKZaWMTdPIhLnJpB57ryVZa3QaXOYHti4b1WsJy+qxVzG0lF6OCLa3xabDdBsNSwp9K28bFRSV1pL3IteBU3rV3+KvFcVTWWys7mlNeYR4ftXsOMG2BOzNGgzHDHWchKaihzWH0/iyg0CkrJWdMjHJ+dvGBZD5x2lTCTeKgZFZg3fIu7iRdwzz+V53Qqcc04xKzVBbGCz0xLSfGk4v1mOk6miUeNH47+B/LDcdM0feXuZaWRVwnlQtgXYQXnq7H8BPvDlYSSGrz3UEDutiMpny+Dhhdlh+LO8OBHF8/7sjEZb73eij6MhzvR4vLXyCPymDwlI/KS7JB3ZI9MCCPfyC/yh5wOvg6+D34Mfp5JV1cWPg9J7w1+/wO1kFlC</latexit><latexit sha1_base64="fQXu8n+2Bs6NOHe7rfWSYJA1mSQ=">AAAD+3icbVJba9RAFJ52vdR4a/XRl+BSFNGS3Rf1oVAQxJdKha4tbEKZTE6SYecSZybV7RD/gf/CJ/FNfLM/xH/jSbqWTduBhMP3fed+0kpw66Lo78rq4Nr1GzfXbgW379y9d39948FHq2vDYMK00OYwpRYEVzBx3Ak4rAxQmQo4SGdvWv7gGIzlWu27eQWJpIXiOWfUIXS0/iROoeDKw6e6Q541Qbzvd/EPKltCj9aH0VbUvfCyMVoYQ7J4e0cbq6dxplktQTkmqLU+xbLANMFmXFuoKJvRAqZoKoRt4rtemnATkSzMtcFPubBDg9iCO3PXVVuPL3kGodIObLNEOpCVoA58y4QVxm+mCHCVBG1OV4IEn0FOa+HQr++k6DEvum7tXKZaWMTdPIhLnJpB57ryVZa3QaXOYHti4b1WsJy+qxVzG0lF6OCLa3xabDdBsNSwp9K28bFRSV1pL3IteBU3rV3+KvFcVTWWys7mlNeYR4ftXsOMG2BOzNGgzHDHWchKaihzWH0/iyg0CkrJWdMjHJ+dvGBZD5x2lTCTeKgZFZg3fIu7iRdwzz+V53Qqcc04xKzVBbGCz0xLSfGk4v1mOk6miUeNH47+B/LDcdM0feXuZaWRVwnlQtgXYQXnq7H8BPvDlYSSGrz3UEDutiMpny+Dhhdlh+LO8OBHF8/7sjEZb73eij6MhzvR4vLXyCPymDwlI/KS7JB3ZI9MCCPfyC/yh5wOvg6+D34Mfp5JV1cWPg9J7w1+/wO1kFlC</latexit><latexit sha1_base64="fQXu8n+2Bs6NOHe7rfWSYJA1mSQ=">AAAD+3icbVJba9RAFJ52vdR4a/XRl+BSFNGS3Rf1oVAQxJdKha4tbEKZTE6SYecSZybV7RD/gf/CJ/FNfLM/xH/jSbqWTduBhMP3fed+0kpw66Lo78rq4Nr1GzfXbgW379y9d39948FHq2vDYMK00OYwpRYEVzBx3Ak4rAxQmQo4SGdvWv7gGIzlWu27eQWJpIXiOWfUIXS0/iROoeDKw6e6Q541Qbzvd/EPKltCj9aH0VbUvfCyMVoYQ7J4e0cbq6dxplktQTkmqLU+xbLANMFmXFuoKJvRAqZoKoRt4rtemnATkSzMtcFPubBDg9iCO3PXVVuPL3kGodIObLNEOpCVoA58y4QVxm+mCHCVBG1OV4IEn0FOa+HQr++k6DEvum7tXKZaWMTdPIhLnJpB57ryVZa3QaXOYHti4b1WsJy+qxVzG0lF6OCLa3xabDdBsNSwp9K28bFRSV1pL3IteBU3rV3+KvFcVTWWys7mlNeYR4ftXsOMG2BOzNGgzHDHWchKaihzWH0/iyg0CkrJWdMjHJ+dvGBZD5x2lTCTeKgZFZg3fIu7iRdwzz+V53Qqcc04xKzVBbGCz0xLSfGk4v1mOk6miUeNH47+B/LDcdM0feXuZaWRVwnlQtgXYQXnq7H8BPvDlYSSGrz3UEDutiMpny+Dhhdlh+LO8OBHF8/7sjEZb73eij6MhzvR4vLXyCPymDwlI/KS7JB3ZI9MCCPfyC/yh5wOvg6+D34Mfp5JV1cWPg9J7w1+/wO1kFlC</latexit>

loss function

minM

F (X,M) = kX�Mk2F =X

i

(xi �mi )2

s.t. M = JA,B,CK = a1 � b1 � c1 + · · ·+ aR � bR � cR<latexit sha1_base64="wBAvBTN/nQMbfGoUwFzKAGJLINQ=">AAAFGXicbVPNbhMxEN6UAGX5a+HIxSIqaqGJknCAS6QCUsUFVKqWVopD5PVOEqv27mJ7S4PxW3DjTbghOCCunHgbZjcpbNpaWu3MN9/8ecZRJoWx7faf2tKl+uUrV5evhddv3Lx1e2X1zluT5prDPk9lqg8jZkCKBPatsBIOMw1MRRIOoqMXhf3gGLQRabJnpxkMFBsnYiQ4swgNV2tNGsFYJA7e5yX00IdzhEkxTiBGXYlk6Oiee+U9eUAoMmOyvY7Aod8s4Q3SI/RTCTRLgH4abr/rFqjJ1VCQ9ZOhaKKwgSClGDFKT5xp2db/gKVf4SFlpBk/AktopNwzTIG/57PfC0+1PjX3SgLzww7lQvNCiaoKR4U8IjROrSn+M/IuKQlkTq+q6LAbUkjiSu+F9v9uhiuNdqtdHnJe6MyFRjA/O8PVpd+Yn+cKEsslM8ZFOB3QPlyjuYEMO2Fj6KOYIGwGrhypJ2uIxGSUavwSS0o0pAbszD3NinrcRMRAktSC8RWjBZVJZsEVFpJhfN9HQCSDsMhpJ6DAxTBiubTltKtOCTsW47JbM8UpSYO4nYZ0gsuj0TnPXBaPiqAqjaG3b+B1mkA1fVkr5taKSWLhxHoXjXs+DCsNO6ZMER8bVcxOzFlbAV5k6+d29HTgRJLlWCqf3dMoxzwpKdabxEIDt3KKAuNaWMEJnzBcGIvVL2aR4xQJEyW4XzBYcfSxyeMFsF9WwvXAQc6ZxLxkG2dD5/CCf6T+mSO1YDFWMT3VcTF8vNq48A5pAh94qhTDRaN7vt8d9Aeu2MZG5zS8a3S994vMV+eZWl1EVHPiIgkr+DcwIz5i1zgoguXhyycSRrbXVmqzCmoxnpQoThKfQefs0p8X3nZbncet9ptuY2tz/iCWg3vB/WA96ARPgq3gZbAT7Ae89rn2tfa99qP+pf6t/rP+a0Zdqs197gYLp/77L7Jrv+Y=</latexit>

minM

F (X,M) =X

i

f (xi ,mi )

s.t. M = JA,B,CK<latexit sha1_base64="eZyS56pQXygIbUA45bomTZtYFrA=">AAAEq3icbVPfb9MwEE5HgRF+bfDIi0W1aUOjassDvFQaTJp4YGiIdZuoq8pxLqk1Owm2M9ZZ+d/4H3jiv+GShq3ZZinK+bv77j7f2UEmhbG93t/Wyr32/QcPVx/5j588ffZ8bf3FsUlzzWHEU5nq04AZkCKBkRVWwmmmgalAwklwtlf6T85BG5EmR3aewUSxOBGR4MwiNF37TQOIReLgZ14hbwq/RpgUcQIh7pVIpo4euYOiIJuEYmRI9rcQOC12KnibDAk1uZoKEm1dTMUOWtuEUqQG6YUzXdu9ZlaMkiBloBk/A0tooNxHzIW/T4vfXkG1rt0+hSRcklPuruVO1zq9bq9a5LbRr42OV6/D6frKHxqmPFeQWC6ZMS7AfoEu/A2aG8iwJothjGaCsJm4qskF2UAkJFGq8UssqVCfGrALepqVetxMhECS1IIplpwWVCaZBVd6SIb5izECIpn4ZU07AwUuhIjl0lYDWCYl7FzE1WnNHPspDeJ27tMZjlMjOc9cFkZlUpWGMBwZ+JomsFy+0oq1tWKSWLiwhQviYeH7Swd2TJkyPx5UMTszN30leJdvnNvow8SJJMtRKl/0KcqxTkrKC0dCoYFbOUeDcS2s4ITPGI7WovpmFRmnGDBTghcNhxVnl2952ADHlRKuJw5yziTWJfs4G1rDDX6grtyBaniMVUzPdVgOH1sblmyfJvCLp0oxvGj0qBgPJuOJK29lp/8/vesMiqJoRh7cjtTqrkBVBzaDUMHVwIy4xFPjoAjKw8dIJER22FNqZxnUIp5VKE4Sn0H/5qW/bRwPuv133d63QWd3p34Qq94r77W35fW9996u99k79EYeb222vrRGreP22/b39o82XYSutGrOS6+x2vAPZoOWLA==</latexit>

*T. G. Kolda, D. Hong. Stochastic Gradients for Large-Scale Tensor Decomposition. arXiv 1906.01687, 2019.

Page 16: High performance tensor decomposition on emerging

§ Define tensor Y such that

§ Then gradient of objective function given by

§ And so model can be fitted through direct optimization.

§ Unfortunately Y is in general dense, even when X is sparse, making standard optimization infeasible

§ Instead, employ Stochastic Gradient Descent (SGD) where Y is only randomly sampled§ Stratified: sample zeros and nonzeros separately (requires tensor search)§ Semi-stratified: skip tensor search and adjust for “zeros” that are really nonzeros

Fitting GCP Model

15

y(i1, . . . , id) = yi =@f

@m(xi ,mi )

<latexit sha1_base64="OjqL/zN99PXBo24ZvrJkbHwnl/M=">AAAENnicbVNNa9RAGJ52/ajxq9Wjl8Gl0MpaNnvRy0JBEC9KhW5b2IQwmbzJDp1J4sykdjvkf3nx7G/w4k08CZ706pt0rZu2A7u8PM/zfr+JSymMHQ6/rqz2bty8dXvtjnf33v0HD9c3Hh2YotIcJryQhT6KmQEpcphYYSUclRqYiiUcxsevGv7wBLQRRb5v5yWEimW5SAVnFqFonQUxZCJ38KFqkWe1N98SkT8IksKagYiSbTqm80jgf5Bqxl1QMm0FkzSt/9uq3jqNxEBFYtsLIE+W4kXr/eHOsH30quEvjD5ZvL1oY/UzJueVgtxyyYxxMTYEuvY2g8pAyfgxy2CKZo6wCV07hZpuIpLQtND4yy1tUS8wYM/di7Kpx81EAjQvLJh6ibSgSsksuIahJcavpwiIPPSanHYGClwCKaukRb+uU85ORNZ2a+YqLqRB3M69YIbz1uhcla5M0iaoKhIYTwy8K3JYTt/Wirm1wllaOLW1i7Nx7XlLDTumTBMfG1XMzsxlrgGv46aVTV+GTuRlhaXy8zmlFeYpaHMRNBEauJVzNBjXwgpO+Yzhqi1W380iswIFMyV43SGsOD57zpMOOG0r4Tp0UHEmMS99jbsJFnDHP1YXdKxwzTjEpNF5QQ4feaEUw5MK9uvpKJyGDjWu7/8L5Pqjuq67yrdXlVpdJ1QLYVeEFVysxogz7A9XQhXT+KVQCakdD5UaLINaZLMWxZ3hwfuXz/uqcTDa8Yc7/vtRf3e0OP018oQ8JVvEJy/ILnlD9siEcPKF/CS/yZ/ep9633vfej3Pp6srC5zHpvN6vv6f3b9c=</latexit><latexit sha1_base64="OjqL/zN99PXBo24ZvrJkbHwnl/M=">AAAENnicbVNNa9RAGJ52/ajxq9Wjl8Gl0MpaNnvRy0JBEC9KhW5b2IQwmbzJDp1J4sykdjvkf3nx7G/w4k08CZ706pt0rZu2A7u8PM/zfr+JSymMHQ6/rqz2bty8dXvtjnf33v0HD9c3Hh2YotIcJryQhT6KmQEpcphYYSUclRqYiiUcxsevGv7wBLQRRb5v5yWEimW5SAVnFqFonQUxZCJ38KFqkWe1N98SkT8IksKagYiSbTqm80jgf5Bqxl1QMm0FkzSt/9uq3jqNxEBFYtsLIE+W4kXr/eHOsH30quEvjD5ZvL1oY/UzJueVgtxyyYxxMTYEuvY2g8pAyfgxy2CKZo6wCV07hZpuIpLQtND4yy1tUS8wYM/di7Kpx81EAjQvLJh6ibSgSsksuIahJcavpwiIPPSanHYGClwCKaukRb+uU85ORNZ2a+YqLqRB3M69YIbz1uhcla5M0iaoKhIYTwy8K3JYTt/Wirm1wllaOLW1i7Nx7XlLDTumTBMfG1XMzsxlrgGv46aVTV+GTuRlhaXy8zmlFeYpaHMRNBEauJVzNBjXwgpO+Yzhqi1W380iswIFMyV43SGsOD57zpMOOG0r4Tp0UHEmMS99jbsJFnDHP1YXdKxwzTjEpNF5QQ4feaEUw5MK9uvpKJyGDjWu7/8L5Pqjuq67yrdXlVpdJ1QLYVeEFVysxogz7A9XQhXT+KVQCakdD5UaLINaZLMWxZ3hwfuXz/uqcTDa8Yc7/vtRf3e0OP018oQ8JVvEJy/ILnlD9siEcPKF/CS/yZ/ep9633vfej3Pp6srC5zHpvN6vv6f3b9c=</latexit><latexit sha1_base64="OjqL/zN99PXBo24ZvrJkbHwnl/M=">AAAENnicbVNNa9RAGJ52/ajxq9Wjl8Gl0MpaNnvRy0JBEC9KhW5b2IQwmbzJDp1J4sykdjvkf3nx7G/w4k08CZ706pt0rZu2A7u8PM/zfr+JSymMHQ6/rqz2bty8dXvtjnf33v0HD9c3Hh2YotIcJryQhT6KmQEpcphYYSUclRqYiiUcxsevGv7wBLQRRb5v5yWEimW5SAVnFqFonQUxZCJ38KFqkWe1N98SkT8IksKagYiSbTqm80jgf5Bqxl1QMm0FkzSt/9uq3jqNxEBFYtsLIE+W4kXr/eHOsH30quEvjD5ZvL1oY/UzJueVgtxyyYxxMTYEuvY2g8pAyfgxy2CKZo6wCV07hZpuIpLQtND4yy1tUS8wYM/di7Kpx81EAjQvLJh6ibSgSsksuIahJcavpwiIPPSanHYGClwCKaukRb+uU85ORNZ2a+YqLqRB3M69YIbz1uhcla5M0iaoKhIYTwy8K3JYTt/Wirm1wllaOLW1i7Nx7XlLDTumTBMfG1XMzsxlrgGv46aVTV+GTuRlhaXy8zmlFeYpaHMRNBEauJVzNBjXwgpO+Yzhqi1W380iswIFMyV43SGsOD57zpMOOG0r4Tp0UHEmMS99jbsJFnDHP1YXdKxwzTjEpNF5QQ4feaEUw5MK9uvpKJyGDjWu7/8L5Pqjuq67yrdXlVpdJ1QLYVeEFVysxogz7A9XQhXT+KVQCakdD5UaLINaZLMWxZ3hwfuXz/uqcTDa8Yc7/vtRf3e0OP018oQ8JVvEJy/ILnlD9siEcPKF/CS/yZ/ep9633vfej3Pp6srC5zHpvN6vv6f3b9c=</latexit><latexit sha1_base64="OjqL/zN99PXBo24ZvrJkbHwnl/M=">AAAENnicbVNNa9RAGJ52/ajxq9Wjl8Gl0MpaNnvRy0JBEC9KhW5b2IQwmbzJDp1J4sykdjvkf3nx7G/w4k08CZ706pt0rZu2A7u8PM/zfr+JSymMHQ6/rqz2bty8dXvtjnf33v0HD9c3Hh2YotIcJryQhT6KmQEpcphYYSUclRqYiiUcxsevGv7wBLQRRb5v5yWEimW5SAVnFqFonQUxZCJ38KFqkWe1N98SkT8IksKagYiSbTqm80jgf5Bqxl1QMm0FkzSt/9uq3jqNxEBFYtsLIE+W4kXr/eHOsH30quEvjD5ZvL1oY/UzJueVgtxyyYxxMTYEuvY2g8pAyfgxy2CKZo6wCV07hZpuIpLQtND4yy1tUS8wYM/di7Kpx81EAjQvLJh6ibSgSsksuIahJcavpwiIPPSanHYGClwCKaukRb+uU85ORNZ2a+YqLqRB3M69YIbz1uhcla5M0iaoKhIYTwy8K3JYTt/Wirm1wllaOLW1i7Nx7XlLDTumTBMfG1XMzsxlrgGv46aVTV+GTuRlhaXy8zmlFeYpaHMRNBEauJVzNBjXwgpO+Yzhqi1W380iswIFMyV43SGsOD57zpMOOG0r4Tp0UHEmMS99jbsJFnDHP1YXdKxwzTjEpNF5QQ4feaEUw5MK9uvpKJyGDjWu7/8L5Pqjuq67yrdXlVpdJ1QLYVeEFVysxogz7A9XQhXT+KVQCakdD5UaLINaZLMWxZ3hwfuXz/uqcTDa8Yc7/vtRf3e0OP018oQ8JVvEJy/ILnlD9siEcPKF/CS/yZ/ep9633vfej3Pp6srC5zHpvN6vv6f3b9c=</latexit>

Gk =@F

@Ak= Y(k)(Ad � · · ·� Ak+1 � Ak�1 � · · ·� A1)

<latexit sha1_base64="y/7hII3V+FXeYxHPlMXuAnCXo84=">AAAEmHicbVNbb9MwFE5HgRFuG3uDF4tq0gZsavoCPFQaTNrgATTQbqipKsc5aa3GcbCdQWf5J/FreOLfcJJmVbPNkqWj7/vOxeccR3nKtel2/7VW7rTv3ru/+sB/+Ojxk6dr689OtSwUgxMmU6nOI6oh5RmcGG5SOM8VUBGlcBZN90v+7AKU5jI7NrMchoKOM55wRg1Co7U/YQRjnln4WVTIK+eHkbCHbjQlfRImijIb5lQZTlNy4BZ2KfqAIleqju0PN7Jb0223VeNxKGNpQrx6bs1hO30duCawcwUstLbmArfth5DFS7WN1jrd3W51yE0jqI2OV5+j0frKXwzMCgGZYSnV2kbYHFDO3wwLDTllUzqGAZoZwnpoq446solITBKp8GaGVKgfajBzd5mX9dgJj4Fk0oB2S6QBkafUgC0ZkmN8N0CAZ0O/zGkmIMDGkNAiNWW3G04ZveDj6rV6JiKZasTNzA8nODuFzkVu8zgpgwoZQ/9Ew1eZwXL6qlbMrQROzMBv42w07jvfX3qwpUKX8fGhgpqJvs6V4G3coDDJu6HlWV5gqWzep6TAPJKU20ViroCZdIYGZYobzgibUFwig9U3s6RjiYKJ4Mw1CMOnlzssboCDqhKmhhYKRlPMSw5wNmENN/wjsaAj0WC0EVTNVFwOH1sbl95+mMEvJoWguGjhsRv0hoNhtYGd4Cq87fScc03ll5tKJW4TilrYFGEFi4FpfomvxkERLA//IkkhMf2uEG+WQcXHkwrFSeI3CK4v/U3jtLcbdHeDb73OXq/+EKveC++lt+UF3ltvz/vkHXknHmtttN63Prb228/be+3D9ue5dKVV+2x4jdP+/h9PgJJy</latexit><latexit sha1_base64="y/7hII3V+FXeYxHPlMXuAnCXo84=">AAAEmHicbVNbb9MwFE5HgRFuG3uDF4tq0gZsavoCPFQaTNrgATTQbqipKsc5aa3GcbCdQWf5J/FreOLfcJJmVbPNkqWj7/vOxeccR3nKtel2/7VW7rTv3ru/+sB/+Ojxk6dr689OtSwUgxMmU6nOI6oh5RmcGG5SOM8VUBGlcBZN90v+7AKU5jI7NrMchoKOM55wRg1Co7U/YQRjnln4WVTIK+eHkbCHbjQlfRImijIb5lQZTlNy4BZ2KfqAIleqju0PN7Jb0223VeNxKGNpQrx6bs1hO30duCawcwUstLbmArfth5DFS7WN1jrd3W51yE0jqI2OV5+j0frKXwzMCgGZYSnV2kbYHFDO3wwLDTllUzqGAZoZwnpoq446solITBKp8GaGVKgfajBzd5mX9dgJj4Fk0oB2S6QBkafUgC0ZkmN8N0CAZ0O/zGkmIMDGkNAiNWW3G04ZveDj6rV6JiKZasTNzA8nODuFzkVu8zgpgwoZQ/9Ew1eZwXL6qlbMrQROzMBv42w07jvfX3qwpUKX8fGhgpqJvs6V4G3coDDJu6HlWV5gqWzep6TAPJKU20ViroCZdIYGZYobzgibUFwig9U3s6RjiYKJ4Mw1CMOnlzssboCDqhKmhhYKRlPMSw5wNmENN/wjsaAj0WC0EVTNVFwOH1sbl95+mMEvJoWguGjhsRv0hoNhtYGd4Cq87fScc03ll5tKJW4TilrYFGEFi4FpfomvxkERLA//IkkhMf2uEG+WQcXHkwrFSeI3CK4v/U3jtLcbdHeDb73OXq/+EKveC++lt+UF3ltvz/vkHXknHmtttN63Prb228/be+3D9ue5dKVV+2x4jdP+/h9PgJJy</latexit><latexit sha1_base64="y/7hII3V+FXeYxHPlMXuAnCXo84=">AAAEmHicbVNbb9MwFE5HgRFuG3uDF4tq0gZsavoCPFQaTNrgATTQbqipKsc5aa3GcbCdQWf5J/FreOLfcJJmVbPNkqWj7/vOxeccR3nKtel2/7VW7rTv3ru/+sB/+Ojxk6dr689OtSwUgxMmU6nOI6oh5RmcGG5SOM8VUBGlcBZN90v+7AKU5jI7NrMchoKOM55wRg1Co7U/YQRjnln4WVTIK+eHkbCHbjQlfRImijIb5lQZTlNy4BZ2KfqAIleqju0PN7Jb0223VeNxKGNpQrx6bs1hO30duCawcwUstLbmArfth5DFS7WN1jrd3W51yE0jqI2OV5+j0frKXwzMCgGZYSnV2kbYHFDO3wwLDTllUzqGAZoZwnpoq446solITBKp8GaGVKgfajBzd5mX9dgJj4Fk0oB2S6QBkafUgC0ZkmN8N0CAZ0O/zGkmIMDGkNAiNWW3G04ZveDj6rV6JiKZasTNzA8nODuFzkVu8zgpgwoZQ/9Ew1eZwXL6qlbMrQROzMBv42w07jvfX3qwpUKX8fGhgpqJvs6V4G3coDDJu6HlWV5gqWzep6TAPJKU20ViroCZdIYGZYobzgibUFwig9U3s6RjiYKJ4Mw1CMOnlzssboCDqhKmhhYKRlPMSw5wNmENN/wjsaAj0WC0EVTNVFwOH1sbl95+mMEvJoWguGjhsRv0hoNhtYGd4Cq87fScc03ll5tKJW4TilrYFGEFi4FpfomvxkERLA//IkkhMf2uEG+WQcXHkwrFSeI3CK4v/U3jtLcbdHeDb73OXq/+EKveC++lt+UF3ltvz/vkHXknHmtttN63Prb228/be+3D9ue5dKVV+2x4jdP+/h9PgJJy</latexit><latexit sha1_base64="y/7hII3V+FXeYxHPlMXuAnCXo84=">AAAEmHicbVNbb9MwFE5HgRFuG3uDF4tq0gZsavoCPFQaTNrgATTQbqipKsc5aa3GcbCdQWf5J/FreOLfcJJmVbPNkqWj7/vOxeccR3nKtel2/7VW7rTv3ru/+sB/+Ojxk6dr689OtSwUgxMmU6nOI6oh5RmcGG5SOM8VUBGlcBZN90v+7AKU5jI7NrMchoKOM55wRg1Co7U/YQRjnln4WVTIK+eHkbCHbjQlfRImijIb5lQZTlNy4BZ2KfqAIleqju0PN7Jb0223VeNxKGNpQrx6bs1hO30duCawcwUstLbmArfth5DFS7WN1jrd3W51yE0jqI2OV5+j0frKXwzMCgGZYSnV2kbYHFDO3wwLDTllUzqGAZoZwnpoq446solITBKp8GaGVKgfajBzd5mX9dgJj4Fk0oB2S6QBkafUgC0ZkmN8N0CAZ0O/zGkmIMDGkNAiNWW3G04ZveDj6rV6JiKZasTNzA8nODuFzkVu8zgpgwoZQ/9Ew1eZwXL6qlbMrQROzMBv42w07jvfX3qwpUKX8fGhgpqJvs6V4G3coDDJu6HlWV5gqWzep6TAPJKU20ViroCZdIYGZYobzgibUFwig9U3s6RjiYKJ4Mw1CMOnlzssboCDqhKmhhYKRlPMSw5wNmENN/wjsaAj0WC0EVTNVFwOH1sbl95+mMEvJoWguGjhsRv0hoNhtYGd4Cq87fScc03ll5tKJW4TilrYFGEFi4FpfomvxkERLA//IkkhMf2uEG+WQcXHkwrFSeI3CK4v/U3jtLcbdHeDb73OXq/+EKveC++lt+UF3ltvz/vkHXknHmtttN63Prb228/be+3D9ue5dKVV+2x4jdP+/h9PgJJy</latexit>

minM

F (X,M) =X

i

f (xi ,mi ) s.t. M = JA1, . . . ,AdK<latexit sha1_base64="YAWJrdL9XSxQIBMhicrc9ZNqfKg=">AAAEjHicbVNdSxtBFN3YtNpta7V97MvQIGiJIUlLLUjAUpC+WCwYFTIhzM7eJIMzu9uZWWsc5sf0N/Wp/6Z311Wy6sDC3XPPuZ8zUSaFsd3uv8bKk+bTZ6trz8MXL1+tv97YfHNq0lxzGPJUpvo8YgakSGBohZVwnmlgKpJwFl18K/xnl6CNSJMTu8hgrNgsEVPBmUVosvGHRjATiYNfeYl88CFVIpk4euKOvCd0nxxuo33u2yWyQwaEmlxNBJluX01EG60dQlEdE6qi9MqZju34W6SQFAopI834BVhCI+W++kmvTePUmnb1G1OtK0ZIIYmX6plstLqdbnnIQ6NXGa2gOseTzZW/GJvnChLLJTPGRTgQ0D7cormBDLOwGYzQTBA2Y1dO0ZMtRGIyTTV+iSUlGlID9kaeZkU9bi5iIElqwfglpwWVSWbBFR6SYXw/QkAk47DIaeegwMUwZbm0qKuLEnYpZmW3ZoEjlAZxuwjpHPelUZxnLounRVCVxjAYGviRJrCcvqwVc2vFJLFwZb2LZgMfhksNO6ZMER8bVczOzX1fAT7mG+V2+mXsRJLlWCq/mdM0xzwpKW4UiYUGbuUCDca1sIITPme4TIvV17PIWYqEuRLc1xxWXFzv8rgGjspKuB47yDmTmJcc4m5oBdf0kbpzR6rmMVYxvdBxsXwcbVyoQ5rAb54qxfCi0RM/6o9HY1fcxFbvNrxr9b33debRQ6ZWjxFVRayTsIK7hRlxjV3jogiWh++PSJjaQVep9jKoxWxeorhJfAa9+5f+oXHa7/Q+dro/+62DbvUg1oJ3wftgO+gFe8FB8D04DoYBb6w2dhufG3vN9ean5n5zcENdaVSat0HtNA//A0kRicU=</latexit>

MTTKRP!

Page 17: High performance tensor decomposition on emerging

GCP-SGD (ADAM) Algorithm

16

1: function GcpAdam(X, {Ak}k=1,...,d , , ⌧)2: for k = 1, . . . , d do3: Gk ,Mk ,Vk zero matrix of size nk ⇥ r

4: Yf SampleForObjective(X)5: F ComputeObjective(Yf , {Ak})6: while F > tolerance and at most iterations do7: Save copies of {Ak}, {Mk}, {Vk}, F8: for ⌧ iterations do9: Yg SampleForGradient(X)

10: {Gk} MTTKRP(Yg , {Ak})11: {Ak}, {Mk}, {Vk} 12: Adam({Gk}, {Ak}, {Mk}, {Vk})13: {Ak} Clip({Ak})14: F ComputObjective(Yf , {Ak})15: if F > Fold then16: Restore saved copied of {Ak}, {Vk}, {Vk}, F17: Increase learning rate

<latexit sha1_base64="XR+IIgBLuHUPWAjT6Xjd8ZdOAnY=">AAAK9XicjVZbbxw1FN62EMpwa8ojL1ayEQWFKBse4CWoVUVChAKlTdKindXK4zmza3Y8HmxPbtb8FMQLQvDE/+Af8G849sxu1pP0Mg+R99y+c3y+c5ykzLk229v/3bp95623V965+2703vsffPjRvdX7J1pWisExk7lULxKqIecFHBtucnhRKqAiyeF5Mnvs9M9PQWkuiyNzUcJI0EnBM86oQdF4deV+nMCEF5bmE6m4mQrO6uFgFBH84r2qYM7O7rPyUUpFbfvxkX1R9zdJP7ZxIuyjejyL67Gd7Q4241QavZk22hktS+pPhlb92sfbiJ8ZamDue3iD7yZpdCc36yZgdJ9cgpJEUKM4A01kRjS/xKjFeEZiwwXKVL8tQCrbX4rRJoKaeSaItY9YGLvNqDk5/Otw5wFYB+vbIkW4ttDHUggojDUgSqmouiCnVHGKbdGk0pCSTCrirzQK0jmyP9fjrEHGYuDcaGafUVHmgMF/TH4BbMgp1A98Hz7rh95TashexxkzKSsDoasDmV9108J5qOdTnoP1oexeTb5ZxDEyB0ULBnWf0CIlCCWkNotWE25Q79iig5qe0VMgTJa8aVXAmyseHXZ+n1z9bjJZbqin1Mvw2jucvOwO9xVNObbmJVfo0bGzeQpIDJ9FJ9Lh0dH3T5/Mr3Fy8zUG4RrV3PAw+HWyBNHx/bWiafOnRfZ8edD47QdRXo/wisS6hMl5OUcJi1pi+BtQ7vWMWx4U0DhNLmQutSZnuIlIxs9xUDQYP3W+f/NWH2R2Do0M9Yex9fBW5mld9+ul0GwKbOYHLqM8rxQQI0kKDPekbvACAj3FVCQaaSRu2jA3fQVzT17L3OUyFeA2Ni6BnOLwQCnZ9FN9tRyCTA6KeY5AVcGLCUHCw6IXB9ni6Kc22LHxUzCVKjpZX/Wx3ewoiKFIg/U/vre+vbXtP3L9MGgP6732ezJevfMv7ldWuQoZ1qVtgk8QqDraiHHZlZTN6ASGeCxQrEfWv1vYosUmzGRhiJdGMTa8cZelf3umPAVSSOMu50rpFmuOhVqnISXGr4co4MUocphmCgJsChmtcoN+oVNBT/mk2R0XIpG5Rrm5iOIpvpAKnavSlmnmggqZwu6xhh9kAcvwPlfEVoLmxPGutslkt46ipYItFdrFx0KR2VPd1TnhTbphZbKvR5YXbm8XrLmnrModa9wbTlKucLLyCzxQhl3jjLApVZThRuygXLX1vO5qSg0Vtg0rDFWGzy6/YGkgHPosmRpZqBjNMSeCq4DErTjwT8RCnYhAo43Ap1Cljhh47anzjuICzhiOBz4qNj6qhzuj4chzdn0wD2/Xd+q6Di0Pr1sqcZOhaA1DI8xg0Uz3olvXRHzlFf4zhPOWmd1tITaXhYpPpl7auS2aVDlWFUoTKWeo0Y4ReNUuzue22Z/1Bq4YIfFVXMPJc/Rf8wwKDf1Ev5npQdax49lNZvORd80Jorby1genf9Cd9euHk52twZdbOz/trD/cbvfA3d4nvbXeg96g91XvYe+73pPecY+tXKz8vvLXyt/RWfRb9Ef0Z2N6+1br83Ev+KJ//ge9CsTg</latexit>

Page 18: High performance tensor decomposition on emerging

Improving Gradient Computation

§ Idea: For semi-stratified, fuse sampling and MTTKRP kernels§ Avoid explicit construction of sampled tensor§ Compute MTTKRP in all modes simultaneously§ Currently restricted to atomics for MTTKRP

factor matrix update§ Should work well for GPU

§ What about architectures without fast atomics: ``Sparse Array”§ Put MTTKRP contribution for each nonzero in

Y in a separate row of temporary buffer along with row index (no thread contention)

§ Reduce entries with the same row index (parallel reduce-by-key)

§ Perform ADAM update and clip with only the nonzero rows

§ Should work well when # samples << mode length

17

1: function SparseArrayUpdate(X, {Ak}, {Mk}, {Vk})2: for k = 1, . . . , d do3: Gk zero matrix of size s ⇥ r . Sample MTTKRPs4: Rk zero vector of size s . Row keys

5: parallel for j = 1, . . . , s do6: y , i1, . . . , id SemiStratifiedSample(X, j)7: for k = 1, . . . , d do8: Rk(j) ik9: Gk(j , :) y ·

Qj 6=k Aj(ij , :) . MTTKRP

10: for k = 1, . . . , n do11: Rk , Gk , tk ParallelReduceByKey(Rk , Gk)12: parallel for j = 1, . . . , tk do13: ik Rk(j)14: Ak(ik , :),Mk(ik , :),Vk(ik , :)) 15: Adam(Gk(j , :),Ak(ik , :),Mk(ik , :),Vk(ik , :))16: Ak(ik , :) Clip(Ak(ik , :))

<latexit sha1_base64="Owa3xrWOzXhH6GbKbP/k8g7bulk=">AAALgHiclVZbbyM1FJ4tBJYE2F0QT7xYbSqlKFua8gBaqdKWigW0KirpbaVMFHk8nsSZ8XjW9nSbWv5h/BSeeIV/wZlLLjNJ2ZIn51y+852Lj8dLIqb0wcGfj7Y++LDx0cePP2m2Pv3s8ydPn31xpUQqCb0kIhLyjYcVjVhMLzXTEX2TSIq5F9FrLzzJ9Nc3VCom4gs9S+iQ43HMAkawBtHoWaPvenTMYoOjsZBMTzgjdtAbNhFyX6UxyazMeYKlosdS4tll4mNNrWm7F+aNbXdR2zWux82xHYXuyv/T2v+r4r8F3F33XAPGXONqFvnU/GxzC+SOqVZthKPo+R2VAnGsJSNUIREgxe7ATSFw4SCRbeSeCM5prM055klE0enFxev+mdoQpv9g+BXUvniHQjrL8dxXQpp2eNTrur7QquvnyYB8HqaWyzzUMsztSpAHpVAF79dBbyjRQj6A+U+xD+Sz0xmWeRrTRRqqlsasi9hormQjv4gIJae3WhFzTjk71xKmJ2DULyjbTj4M3eleu4C6p1Lr6XSmeyU+G4Xt0mj3vop2pt0XezU+XOtQJrYDtIHEqe2iToX+XheFe+0N4e8DnrkEfN1ECn9kpm5M36LQlhM+BezMdKXORbfKEi7rDKei1BsGJ94wOFkxIIMary7S85YvEgZYGF4a9amfEvrj7DWdQf3vhZi3ZL3xAL3WF7YMt+zQpupl1x2KEUIxipindcHVUlCWdg3nbYr9eVrHPuZFHmuNKfD+X8T3cM75zEOfRCwpQh+vIVQbWenvfDs2XRr7lfU5erpzsH+Q/9D6oVcedpzydwZb+CvoCUmziSIRVsp4sMKptM1dN1U0wSTEYzqAYwxiNTT53rdoFyQ+CmAHBCLWKJc2XUV14S6SfHtPmE9RLDSFZbBUagpXF2pjMg1KAN8OQMDiYTOLqSeUU+PTAKeRBr+qU4xv2Dh/QdSMeyJSINezpjuBFwZeCp0mJvGDDJQLnx5dKvqbiOlq+JwrxJYcRyhrhDXe+Mg2mysJG8xVhg+Jwv6cqLouE27SDVId/DA0LE5SoEqKOgUpxBEoewORzySszmgGB0yga4wgMoF7RTSwr0ZZtvXW1jWJoim0DTKsqjQL754TvyIc5CyJHBqaEhwBJwSjhNxSXPH3+ELt8YpGaY7lTPrZYEDZ/cy7CTvqHYF1hGEM3Qs7OBwOhvmTt9Obw5udQ2tt1fJ03VLyTYa8NKwaAYNFM7Pnx2RNhHdOwscEimigjw44764KJRtPcmmtWthLI8iqKvWECEGjsomAUnuRIKEpbqI1i0tpc+WSrLvSL/hmgZth8lvuBSYp92Z2Xex7/HyxcPPFfxrD1a/FAYmLNgSTFBAym3kW8JFl1vm6CMqM1vjUMZYVMBvplGXLbL8xxdqyu6gP1/GGom0wy5ht5xevang9YfCgP8j016Bmx4JNZvM9mc10BbWUb/Ipc4NF2quvzfXD1eF+77v9w98Pd162y5X62Pna2XY6Ts/53nnp/OKcOZcOafzR+Kvxd+Of1lar0/q21StMtx6VPl86lV/rxb9qD/i0</latexit>

1: function FusedUpdate(X, {Ak}, {Gk}, {Mk}, {Vk})2: parallel for j = 1, . . . , s do3: y , i1, . . . , id SemiStratifiedSample(X, j)4: for k = 1, . . . , d do5: Gk(ik , :) a Gk(ik , :) + y ·

Qj 6=k Aj(ij , :)

6: {Ak}, {Mk}, {Vk} 7: Adam({Gk}, {Ak}, {Mk}, {Vk})8: {Ak} Clip({Ak})

<latexit sha1_base64="SzyIbCgeQtRmlTT3mE5uUDOf1iQ=">AAAIm3icjVVtb9s2EFaTdvO0t6bDPg0DiNgBktUL4uxDhwEBGgTLXjNkTZwEMA2DIimbsSipJJXVJfgv92V/ZZ92erEjKd46AQnIu+funjvenYM0EtocHPz1aGPz8ZP33u984H/40ceffPp069mVTjJF+ZAmUaJuAqJ5JGI+NMJE/CZVnMgg4tfB/CTXX99xpUUSX5pFyseSTGMRCkoMiCZbGxIHfCpiS6JpooSZSUHdaDD2EcKnWUxzlD3NNGfDlBHDne3hS3vjen3UwxYH0h67yRzX7j+07met+1V5d3mEc6JOE2V7t0eDPmaJ0X1dKEB1YSAa6i36SEyWSjFhCE+50Qgb/sZoai+4FBdGQTKh4OyCyDTibrdg2L/d65WuihDzVQhWhbgPsqS9Kybz/nd7ZYwJQW35c7TAFHzgVCVsYm9xzF+juauqcAuwW4D10A4+SaTksbFnl5e/vDqvUvo+ZkDFL09l6n6NRK2afVSv3fJWVq5k16tbvs4IK/9VZTlmREIZau+x9PFu/3u9fyHVKv1JJNJljOOaZZ5l1Tg+5jFrdNbkafdg/6D40MPDoDp0veo7n2w9+RxejWZ5NWlEtLYBdDdXzt/B0JUpoXMy5SM4xiDWY1uMhEM7ecuiMFHwFxtUSH2suSnNk7Ro7JlgHMWJ4drVlIZDG0H6NtegFPy7EQhEPPbzmGbGJbeMhySLDNg1jWJyJ6bFcOmFDJJIg9wsfDyD4VNgnKU2ZWHuVCaMHw01/y2JeT18wRViK0kilBfb2WB65Hy/lrAlUuf+IVFJzEy3dblwnW6UmfDbsRVxmgFVWtYpzCBOgvL1gJhQnJpoAQdC4dUERXRGFKEG2Dej3D/rG9fWpJpn8GyQYVNlxPzt15Q1hKOCJVVjyzNKIuCEYDIQrsQN+0Cu1IFsaLSRRC0UyxsDys5yax8G9A8Ko0igDfGlGx2OR+OiYbuDpXvbPXTONZFnD5FKrgPKCtgEAYPVY2rxFrKGR0RAD/Ysinhojg6k7NeFSkxnhbRVLRJkEWTVlAZJMgeNzjsCSh1ECZ3bcp04u9osrlDek8W190qJgsmwxSQHoYUriSIe5ePi3mHHkpUZS/4TDKPfigMSjNYEUxw85JhlFvD7Yx/yxQjKjB7wafu4r4BdS6cqW479ypYr2e2gVzCOdxxtAyxntl0MXhN4PRPw4/K/oD+FLZwI18GWezLv6YbXSr7OpsoNFumgvTYfHq4O9wff7B/+fth92atWasf7wtv2dr2B98J76f3onXtDj278ufH35ubm486XnZPOz51fS+jGo8rmM6/xdYb/AAIaBis=</latexit>

Page 19: High performance tensor decomposition on emerging

GCP-SGD Performance Comparison

18

15K zero and nonzero samples for all tensors

Page 20: High performance tensor decomposition on emerging

GenTen Speedup over Matlab TTB

19

6.42.1

6.1

62.6

0.0

10.0

20.0

30.0

40.0

50.0

60.0

70.0

HSW CPU KNL K80 GPU P100 GPU

x-Sp

eedu

pGCP-SGD speedup over serial state-of-the-art code

(Chicago crime data tensor)

GenTen

Page 21: High performance tensor decomposition on emerging

Summary & Conclusions

§ New software package Genten§ Publicly available at https://gitlab.com/tensors/genten§ POC: Eric Phipps, [email protected]

§ Built on Kokkos for shared-memory parallelism and high performance, portable to emerging parallel architectures

§ Simple modification of standard MTTKRP algorithm to store and iterate over permutation arrays improves performance for CP-ALS

§ Future Work§ CP-ALS

§ Continue to improve performance (MTTKRP, sorting, ...)§ Optimize algorithms for dense tensors§ Kokkos implementation of state-of-the-art formats such as CSF and HiCOO§ Merge with K. Devine’s distributed memory implementation

§ GCP:§ Investigate number of gradient samples vs. MTTKRP algorithms§ Continue to improve sampling/gradient kernels§ Sparse array approach motives asynchronous SGD (a la HogWild!)

20

Page 22: High performance tensor decomposition on emerging

Acknowledgments

§ This work was supported by the Laboratory Directed Research and Development program at Sandia National Laboratories, a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC., a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA-0003525.

21

Page 23: High performance tensor decomposition on emerging

Backup Slides

22

Page 24: High performance tensor decomposition on emerging

GenTen Project

§ Work funded by LDRD program at Sandia National Labs§ SNL: T. Kolda (PI), K. Devine, E. Phipps, D. Dunlavy, J. Duersch, C.

Anderson-Bergman§ Georgia Tech: Rich Vuduc, Jeff Young, Srinivas Eswar, Jiajia Li§ Wake Forest: G. Ballard

§ Goals of the project:§ Develop sparse tensor decomposition methods for more general

tensor data types (Tammy’s talk earlier)§ Develop distributed memory parallel implementations § Develop shared memory parallel implementations for next-generation

computing architectures (this talk)

23

Page 25: High performance tensor decomposition on emerging

Total MTTKRP Time per Nonzero

24

Page 26: High performance tensor decomposition on emerging

Tensor Sorting Cost

§ Permutation method requires N sorting operations§ Sort tensor coordinates for each mode§ Small, but not inconsequential cost

§ Sorting methods used§ OpenMP: Parallel stable sort from Intel§ Cuda: Thrust stable sort

Sorting time divided by avg. (permuted) CP-ALS iteration time

18 E. PHIPPS AND T. KOLDA

Table 6: Results on real-world tensor data sets.

(a) Real-world tensor data sets from FROSTT.

Name Order Dimensions Nonzeros

LBNL 5 1.6K⇥ 4.2K⇥ 1.6K⇥ 4.2K⇥ 868K 1.7MUber 4 183⇥ 24⇥ 1.1K⇥ 1.7K 13MEnron 4 6.0K⇥ 5.7K⇥ 244K⇥ 1.2K 54MVAST 5 165K⇥ 11K⇥ 2⇥ 100⇥ 89 26MNELL2 3 12K⇥ 9.1K⇥ 29K 77MDelicious 4 532K⇥ 17M⇥ 2.5M⇥ 1.4K 140M

(b) Total MTTKRP time (in seconds) for ten iterations with R=16 for eachdata tensor, architecture, and method. Due to the increased storage re-quirements of the permuted approach, global memory was exhausted forMTTKRP-B for the Delicious tensor on the K80 and P100 architectures.

Arch. Method LBNL Uber Enron VAST NELL2 Delicious

HSW MTTKRP-B 16.6 8.3 45.7 284.0 42.8 57.4MTTKRP-C 0.4 0.3 5.7 6.3 7.0 22.7SPLATT-M 0.8 0.2 1.1 26.1 1.3 23.4SPLATT-T 0.5 1.6 10.4 3.8 1.4 71.2

KNL MTTKRP-B 41.2 9.2 108.0 1910.0 82.4 79.1MTTKRP-C 0.4 0.2 2.8 16.1 3.2 12.9SPLATT-M 2.0 1.2 2.6 150.5 9.0 47.7SPLATT-T 1.0 5.1 41.3 6.5 15.6 219.1

K80 MTTKRP-B 72.4 5.6 218.0 2010.0 55.2 30.3MTTKRP-C 0.5 0.5 9.4 7.4 11.6 –

P100 MTTKRP-B 0.4 0.2 2.0 4.1 1.6 4.0MTTKRP-C 0.1 0.1 1.7 1.5 1.8 –

(c) Sorting cost for the permutation-based MTTKRP approach scaledby the average (permuted) CP-ALS iteration time with R=16.

Architecture LBNL Uber Enron VAST NELL2 Delicious

HSW 0.6 4.3 9.2 5.3 6.2 4.0KNL 1.0 5.5 8.6 1.3 7.6 4.9K80 1.1 3.0 3.3 3.3 2.6 –P100 0.5 1.3 4.2 3.6 4.6 –

Table 6b shows the results based on the total MTTKRP time for 10 CP-ALS itera-504

tions with R = 16 factor components. The timing results shown here are aggregated505

over all modes of the tensor, even though significant variation in performance is often506

observed for di↵erent modes of a given tensor. While performance of the COO-based507

MTTKRP is relatively insensitive to the length of each mode (since it parallelizes over508

nonzeros and not factor matrix rows), it is quite sensitive to the ordering of nonzero509

coordinates in each mode (since this a↵ects the amount of atomic contention). How-510

ever the permutation-based approach by design attempts to mitigate this.511

If we compare just MTTKRP-B and MTTKRP-C, then MTTKRP-C (the per-512

muted approach) is up to two orders of magnitude faster. The greater di↵erences for513

This manuscript is for review purposes only.

25

Page 27: High performance tensor decomposition on emerging

§ Given index set , define

§ SGD theory only requires

§ Uniform sampling§ Sample s indices uniformly with replacement: § Will miss nonzeros if X is very sparse

§ Stratified sampling§ Sample p nonzeros and q nonzeros uniformly with replacement§ Requires searching X for each sample to determine if zero or nonzero

§ Semi-stratified sampling§ Sample p nonzeros and q ”zeros” uniformly with replacement§ Correct for sampled zeros that were really nonzeros

Sampling for SGD

26

⌦<latexit sha1_base64="S1h3Vn4nuzhjhFPPrk4n3wNQoB0=">AAAD/XicbVLNahRBEO5k/YnjX6JHL4NLIIiG3b3oJRAQxIsaIZsEdoalp6dmptn+Gbt7optm8A18C2/iSfAU38O3sWZ2DTtJChqKr76q+qqrklJw6waDv2vrvRs3b93euBPcvXf/wcPNrUdHVleGwZhpoc1JQi0IrmDsuBNwUhqgMhFwnMxeN/HjUzCWa3Xo5iXEkuaKZ5xRh9B0cydKIOfKw6eqRZ7VQfRBQk6DCFS6Ak83+4PdQWvhVWe4dPpkaQfTrfXzKNWskqAcE9Ran6AuMHWwHVUWSspmNIcJugphG/t2mDrcRiQNM23wKRe2aBBZcIt0XTZ6fMFTCJV2YOuVoANZCurAN5GwxPr1BAGu4qDp6QqQ4FPIaCUc5nWTFD3leTutnctEC4u4mwdRgd9mMLkqfZlmTVGpU9gbW3ivFay2b7VibyOpCB18cbVP8r06CFYG9lTapj4OKqkr7OVYA14Xm1QuexV7rsoKpbLFP2UV9tFhs9gw5QaYE3N0KDPccRayghrKHKrvdhG5RkIhOas7AcdnZy9Y2gEnrRJmYg8VowL7hm9wN9ES7uQn8iKcSFwzfmLa8IJIwWempaR4UtFhPRnFk9gjx/eH/wv5/qiu6y7z3VWmkdcR5ZLYJaGCi9VYfobz4UpCSQ0efCggc3sDKZ+vgobnRYvizvDgh5fP+6pzNNodDnaHH0f9/dHy9DfIE/KU7JAheUn2yVtyQMaEkW/kFzknf3pfe997P3o/F9T1tWXOY9Kx3u9/W6xZyw==</latexit><latexit sha1_base64="S1h3Vn4nuzhjhFPPrk4n3wNQoB0=">AAAD/XicbVLNahRBEO5k/YnjX6JHL4NLIIiG3b3oJRAQxIsaIZsEdoalp6dmptn+Gbt7optm8A18C2/iSfAU38O3sWZ2DTtJChqKr76q+qqrklJw6waDv2vrvRs3b93euBPcvXf/wcPNrUdHVleGwZhpoc1JQi0IrmDsuBNwUhqgMhFwnMxeN/HjUzCWa3Xo5iXEkuaKZ5xRh9B0cydKIOfKw6eqRZ7VQfRBQk6DCFS6Ak83+4PdQWvhVWe4dPpkaQfTrfXzKNWskqAcE9Ran6AuMHWwHVUWSspmNIcJugphG/t2mDrcRiQNM23wKRe2aBBZcIt0XTZ6fMFTCJV2YOuVoANZCurAN5GwxPr1BAGu4qDp6QqQ4FPIaCUc5nWTFD3leTutnctEC4u4mwdRgd9mMLkqfZlmTVGpU9gbW3ivFay2b7VibyOpCB18cbVP8r06CFYG9lTapj4OKqkr7OVYA14Xm1QuexV7rsoKpbLFP2UV9tFhs9gw5QaYE3N0KDPccRayghrKHKrvdhG5RkIhOas7AcdnZy9Y2gEnrRJmYg8VowL7hm9wN9ES7uQn8iKcSFwzfmLa8IJIwWempaR4UtFhPRnFk9gjx/eH/wv5/qiu6y7z3VWmkdcR5ZLYJaGCi9VYfobz4UpCSQ0efCggc3sDKZ+vgobnRYvizvDgh5fP+6pzNNodDnaHH0f9/dHy9DfIE/KU7JAheUn2yVtyQMaEkW/kFzknf3pfe997P3o/F9T1tWXOY9Kx3u9/W6xZyw==</latexit><latexit sha1_base64="S1h3Vn4nuzhjhFPPrk4n3wNQoB0=">AAAD/XicbVLNahRBEO5k/YnjX6JHL4NLIIiG3b3oJRAQxIsaIZsEdoalp6dmptn+Gbt7optm8A18C2/iSfAU38O3sWZ2DTtJChqKr76q+qqrklJw6waDv2vrvRs3b93euBPcvXf/wcPNrUdHVleGwZhpoc1JQi0IrmDsuBNwUhqgMhFwnMxeN/HjUzCWa3Xo5iXEkuaKZ5xRh9B0cydKIOfKw6eqRZ7VQfRBQk6DCFS6Ak83+4PdQWvhVWe4dPpkaQfTrfXzKNWskqAcE9Ran6AuMHWwHVUWSspmNIcJugphG/t2mDrcRiQNM23wKRe2aBBZcIt0XTZ6fMFTCJV2YOuVoANZCurAN5GwxPr1BAGu4qDp6QqQ4FPIaCUc5nWTFD3leTutnctEC4u4mwdRgd9mMLkqfZlmTVGpU9gbW3ivFay2b7VibyOpCB18cbVP8r06CFYG9lTapj4OKqkr7OVYA14Xm1QuexV7rsoKpbLFP2UV9tFhs9gw5QaYE3N0KDPccRayghrKHKrvdhG5RkIhOas7AcdnZy9Y2gEnrRJmYg8VowL7hm9wN9ES7uQn8iKcSFwzfmLa8IJIwWempaR4UtFhPRnFk9gjx/eH/wv5/qiu6y7z3VWmkdcR5ZLYJaGCi9VYfobz4UpCSQ0efCggc3sDKZ+vgobnRYvizvDgh5fP+6pzNNodDnaHH0f9/dHy9DfIE/KU7JAheUn2yVtyQMaEkW/kFzknf3pfe997P3o/F9T1tWXOY9Kx3u9/W6xZyw==</latexit><latexit sha1_base64="BffOntVlnscNflW07UeXnEydn0Y=">AAADxXicbVNNa9wwEFWy/UjdtE3OvYgugR7a4M2luSwUCqWXlhSySWBtgiyPbRF9GElOuxE+lN567e/pD+m/6di7DetsBIbhvTfzZjRyVkvhfBz/3doePXj46PHOk+jpbvTs+Yu93TNnGsthxo009iJjDqTQMPPCS7ioLTCVSTjPrj50/Pk1WCeMPvWLGlLFSi0KwZlH6ORybxwfxv2hm8FkFYzJ6lzub/9JcsMbBdpzyZwLGZqBbaODpHFQM37FSphjqBF2aeg7bOkBIjktjMVPe9qjUeLAL9NN3XUTKpED1caDa9dID6qWzEPoGFpj/XaOgNBp1Hn6ChSEHArWSI95wyTNrkXZz+oWKjPSIe4XUVLhXVhMbupQ50VXVJkcpjMHX4yGdfu+V/S2iknq4btvQ1ZO2yhaGzgw5br6OKhivnJ3uQ68j5s3vjhOg9B1g63y5T0VDfoY2m2L5sIC93KBAeNWeMEpr5hl3GP3QxdZGhRUSvB2QHhxdfOW5wNw3nfCbRqg4UyiL/2Iu0lW8CA/U7d0pnDNeIl5p4sSDd+4UYrpPCSn7fwonacBNWE8+V8ojI/ath0qP28qrbpPqFbCoQg7uF2NEzc4H66EKmZLoamEwk9jpd6sg1aUVY/izvC9T+6+7s3g7OhwEh9OvsZkh7wkr8hrMiHvyHvyiZyQGeEkJ7/I79Hp6Gb0Y/lfbG+tfpB9Mjijn/8AfcdGjw==</latexit><latexit sha1_base64="vKcIIanVOTMwlf97ST/nNIDUeFs=">AAAD8nicbVPdatRAFJ52/amxauutN8GlUERLdm/0piAI4o1aodsWNmGZTE6SofMTZybV7RB8A9/CO/HK2/oevo0n2bVs2g4MHL7vO39zzqSV4NZF0d+19cGt23fubtwL7m8+ePhoa3vzyOraMJgwLbQ5SakFwRVMHHcCTioDVKYCjtPTNy1/fAbGcq0O3byCRNJC8Zwz6hCabe3GKRRcefhcd8izJog/SihoEIPKVuDZ1jDai7oTXjdGS2NIludgtr1+EWea1RKUY4Ja61OsC0wT7MS1hYqyU1rAFE2FsE1810wT7iCShbk2eJULOzSILbiFu67aenzJMwiVdmCbFdKBrAR14FsmrDB+M0WAqyRoc7oSJPgMcloLh359J0XPeNF1a+cy1cIi7uZBXOKzGXSuK19leRtU6gz2JxY+aAWr6btaMbeRVIQOvrrGp8V+EwQrDXsqbRsfG5XUlfYq14I3cdPa5a8Sz1VVY6ls8U55jXl02A42zLgB5sQcDcoMd5yFrKSGMofV97OIQqOglJw1PcLx0/MXLOuB064SZhIPNaMC84ZvcTbxEu75p/KSTiWOGR8xa3VBrOAL01JSXKn4sJmOk2niUeOHo/+B/HDcNE1f+f660sibhHIp7IuwgsvRWH6O/eFIQkkNLnwoIHf7kZTPV0HDi7JDcWa48KOr633dOBrvjaK90aeIbJAn5CnZJSPykrwm78gBmRBGvpPf5IL8GXwb/Bj8XHyN9bXlH3lMemfw6x95nVhz</latexit><latexit sha1_base64="vKcIIanVOTMwlf97ST/nNIDUeFs=">AAAD8nicbVPdatRAFJ52/amxauutN8GlUERLdm/0piAI4o1aodsWNmGZTE6SofMTZybV7RB8A9/CO/HK2/oevo0n2bVs2g4MHL7vO39zzqSV4NZF0d+19cGt23fubtwL7m8+ePhoa3vzyOraMJgwLbQ5SakFwRVMHHcCTioDVKYCjtPTNy1/fAbGcq0O3byCRNJC8Zwz6hCabe3GKRRcefhcd8izJog/SihoEIPKVuDZ1jDai7oTXjdGS2NIludgtr1+EWea1RKUY4Ja61OsC0wT7MS1hYqyU1rAFE2FsE1810wT7iCShbk2eJULOzSILbiFu67aenzJMwiVdmCbFdKBrAR14FsmrDB+M0WAqyRoc7oSJPgMcloLh359J0XPeNF1a+cy1cIi7uZBXOKzGXSuK19leRtU6gz2JxY+aAWr6btaMbeRVIQOvrrGp8V+EwQrDXsqbRsfG5XUlfYq14I3cdPa5a8Sz1VVY6ls8U55jXl02A42zLgB5sQcDcoMd5yFrKSGMofV97OIQqOglJw1PcLx0/MXLOuB064SZhIPNaMC84ZvcTbxEu75p/KSTiWOGR8xa3VBrOAL01JSXKn4sJmOk2niUeOHo/+B/HDcNE1f+f660sibhHIp7IuwgsvRWH6O/eFIQkkNLnwoIHf7kZTPV0HDi7JDcWa48KOr633dOBrvjaK90aeIbJAn5CnZJSPykrwm78gBmRBGvpPf5IL8GXwb/Bj8XHyN9bXlH3lMemfw6x95nVhz</latexit><latexit sha1_base64="2+Wqr1ls4GXWaRnk50ZkHPGf1p8=">AAAD/XicbVLNahRBEO5k/YnjX6JHL4NLIIiG2b3oJRAQxIsaIZsEdobQ01Mz02z/jN090U0z+Aa+hTfxJHiK7+HbWDNZw06Sgobiq6+qvuqqtBLcuij6u7I6uHHz1u21O8Hde/cfPFzfeHRgdW0YTJgW2hyl1ILgCiaOOwFHlQEqUwGH6ex1Gz88AWO5VvtuXkEiaaF4zhl1CB2vb8UpFFx5+FR3yLMmiD9IKGgQg8qW4OP1YbQddRZedUYLZ0gWtne8sXoWZ5rVEpRjglrrU9QFpgk249pCRdmMFjBFVyFsE98N04SbiGRhrg0+5cIODWIL7jxdV60eX/IMQqUd2GYp6EBWgjrwbSSssH4zRYCrJGh7uhIk+AxyWguHef0kRU940U1r5zLVwiLu5kFc4rcZTK4rX2V5W1TqDHYmFt5rBcvtO63Y20gqQgdfXOPTYqcJgqWBPZW2rY+DSupKeznWgtfFprXLXyWeq6pGqez8n/Ia++iwXWyYcQPMiTk6lBnuOAtZSQ1lDtX3u4hCI6GUnDW9gOOz0xcs64HTTgkziYeaUYF9wze4m3gB9/JTeRFOJa4ZPzFreUGs4DPTUlI8qXi/mY6TaeKR44ej/4X8cNw0TZ/57irTyOuIckHsk1DBxWosP8X5cCWhpAYPPhSQu51IyufLoOFF2aG4Mzz40eXzvuocjLdH0fboYzTcHS9Of408IU/JFhmRl2SXvCV7ZEIY+UZ+kTPyZ/B18H3wY/DznLq6ssh5THo2+P0PWwxZyQ==</latexit><latexit sha1_base64="S1h3Vn4nuzhjhFPPrk4n3wNQoB0=">AAAD/XicbVLNahRBEO5k/YnjX6JHL4NLIIiG3b3oJRAQxIsaIZsEdoalp6dmptn+Gbt7optm8A18C2/iSfAU38O3sWZ2DTtJChqKr76q+qqrklJw6waDv2vrvRs3b93euBPcvXf/wcPNrUdHVleGwZhpoc1JQi0IrmDsuBNwUhqgMhFwnMxeN/HjUzCWa3Xo5iXEkuaKZ5xRh9B0cydKIOfKw6eqRZ7VQfRBQk6DCFS6Ak83+4PdQWvhVWe4dPpkaQfTrfXzKNWskqAcE9Ran6AuMHWwHVUWSspmNIcJugphG/t2mDrcRiQNM23wKRe2aBBZcIt0XTZ6fMFTCJV2YOuVoANZCurAN5GwxPr1BAGu4qDp6QqQ4FPIaCUc5nWTFD3leTutnctEC4u4mwdRgd9mMLkqfZlmTVGpU9gbW3ivFay2b7VibyOpCB18cbVP8r06CFYG9lTapj4OKqkr7OVYA14Xm1QuexV7rsoKpbLFP2UV9tFhs9gw5QaYE3N0KDPccRayghrKHKrvdhG5RkIhOas7AcdnZy9Y2gEnrRJmYg8VowL7hm9wN9ES7uQn8iKcSFwzfmLa8IJIwWempaR4UtFhPRnFk9gjx/eH/wv5/qiu6y7z3VWmkdcR5ZLYJaGCi9VYfobz4UpCSQ0efCggc3sDKZ+vgobnRYvizvDgh5fP+6pzNNodDnaHH0f9/dHy9DfIE/KU7JAheUn2yVtyQMaEkW/kFzknf3pfe997P3o/F9T1tWXOY9Kx3u9/W6xZyw==</latexit><latexit sha1_base64="S1h3Vn4nuzhjhFPPrk4n3wNQoB0=">AAAD/XicbVLNahRBEO5k/YnjX6JHL4NLIIiG3b3oJRAQxIsaIZsEdoalp6dmptn+Gbt7optm8A18C2/iSfAU38O3sWZ2DTtJChqKr76q+qqrklJw6waDv2vrvRs3b93euBPcvXf/wcPNrUdHVleGwZhpoc1JQi0IrmDsuBNwUhqgMhFwnMxeN/HjUzCWa3Xo5iXEkuaKZ5xRh9B0cydKIOfKw6eqRZ7VQfRBQk6DCFS6Ak83+4PdQWvhVWe4dPpkaQfTrfXzKNWskqAcE9Ran6AuMHWwHVUWSspmNIcJugphG/t2mDrcRiQNM23wKRe2aBBZcIt0XTZ6fMFTCJV2YOuVoANZCurAN5GwxPr1BAGu4qDp6QqQ4FPIaCUc5nWTFD3leTutnctEC4u4mwdRgd9mMLkqfZlmTVGpU9gbW3ivFay2b7VibyOpCB18cbVP8r06CFYG9lTapj4OKqkr7OVYA14Xm1QuexV7rsoKpbLFP2UV9tFhs9gw5QaYE3N0KDPccRayghrKHKrvdhG5RkIhOas7AcdnZy9Y2gEnrRJmYg8VowL7hm9wN9ES7uQn8iKcSFwzfmLa8IJIwWempaR4UtFhPRnFk9gjx/eH/wv5/qiu6y7z3VWmkdcR5ZLYJaGCi9VYfobz4UpCSQ0efCggc3sDKZ+vgobnRYvizvDgh5fP+6pzNNodDnaHH0f9/dHy9DfIE/KU7JAheUn2yVtyQMaEkW/kFzknf3pfe997P3o/F9T1tWXOY9Kx3u9/W6xZyw==</latexit><latexit sha1_base64="S1h3Vn4nuzhjhFPPrk4n3wNQoB0=">AAAD/XicbVLNahRBEO5k/YnjX6JHL4NLIIiG3b3oJRAQxIsaIZsEdoalp6dmptn+Gbt7optm8A18C2/iSfAU38O3sWZ2DTtJChqKr76q+qqrklJw6waDv2vrvRs3b93euBPcvXf/wcPNrUdHVleGwZhpoc1JQi0IrmDsuBNwUhqgMhFwnMxeN/HjUzCWa3Xo5iXEkuaKZ5xRh9B0cydKIOfKw6eqRZ7VQfRBQk6DCFS6Ak83+4PdQWvhVWe4dPpkaQfTrfXzKNWskqAcE9Ran6AuMHWwHVUWSspmNIcJugphG/t2mDrcRiQNM23wKRe2aBBZcIt0XTZ6fMFTCJV2YOuVoANZCurAN5GwxPr1BAGu4qDp6QqQ4FPIaCUc5nWTFD3leTutnctEC4u4mwdRgd9mMLkqfZlmTVGpU9gbW3ivFay2b7VibyOpCB18cbVP8r06CFYG9lTapj4OKqkr7OVYA14Xm1QuexV7rsoKpbLFP2UV9tFhs9gw5QaYE3N0KDPccRayghrKHKrvdhG5RkIhOas7AcdnZy9Y2gEnrRJmYg8VowL7hm9wN9ES7uQn8iKcSFwzfmLa8IJIwWempaR4UtFhPRnFk9gjx/eH/wv5/qiu6y7z3VWmkdcR5ZLYJaGCi9VYfobz4UpCSQ0efCggc3sDKZ+vgobnRYvizvDgh5fP+6pzNNodDnaHH0f9/dHy9DfIE/KU7JAheUn2yVtyQMaEkW/kFzknf3pfe997P3o/F9T1tWXOY9Kx3u9/W6xZyw==</latexit><latexit sha1_base64="S1h3Vn4nuzhjhFPPrk4n3wNQoB0=">AAAD/XicbVLNahRBEO5k/YnjX6JHL4NLIIiG3b3oJRAQxIsaIZsEdoalp6dmptn+Gbt7optm8A18C2/iSfAU38O3sWZ2DTtJChqKr76q+qqrklJw6waDv2vrvRs3b93euBPcvXf/wcPNrUdHVleGwZhpoc1JQi0IrmDsuBNwUhqgMhFwnMxeN/HjUzCWa3Xo5iXEkuaKZ5xRh9B0cydKIOfKw6eqRZ7VQfRBQk6DCFS6Ak83+4PdQWvhVWe4dPpkaQfTrfXzKNWskqAcE9Ran6AuMHWwHVUWSspmNIcJugphG/t2mDrcRiQNM23wKRe2aBBZcIt0XTZ6fMFTCJV2YOuVoANZCurAN5GwxPr1BAGu4qDp6QqQ4FPIaCUc5nWTFD3leTutnctEC4u4mwdRgd9mMLkqfZlmTVGpU9gbW3ivFay2b7VibyOpCB18cbVP8r06CFYG9lTapj4OKqkr7OVYA14Xm1QuexV7rsoKpbLFP2UV9tFhs9gw5QaYE3N0KDPccRayghrKHKrvdhG5RkIhOas7AcdnZy9Y2gEnrRJmYg8VowL7hm9wN9ES7uQn8iKcSFwzfmLa8IJIwWempaR4UtFhPRnFk9gjx/eH/wv5/qiu6y7z3VWmkdcR5ZLYJaGCi9VYfobz4UpCSQ0efCggc3sDKZ+vgobnRYvizvDgh5fP+6pzNNodDnaHH0f9/dHy9DfIE/KU7JAheUn2yVtyQMaEkW/kFzknf3pfe997P3o/F9T1tWXOY9Kx3u9/W6xZyw==</latexit>

E[Y] = Y<latexit sha1_base64="ga4wCu9j861aR+rVacrBchIDp30=">AAAEIXicbVPNbtQwEHa7/JTw18KRi8WqEkJQ7S4HuKxUCYE4ACpSty3aRJXjzGatteNgO4Wt5afhDbjzANwQJxAneBIm6VJt2lqKMvrm++bHM05LKazr9X6trHYuXb5yde1adP3GzVu31zfu7FldGQ4jrqU2BymzIEUBIyechIPSAFOphP109rz27x+BsUIXu25eQqJYXoiJ4MwhdLj+Ok4hF4WHD1WDPAxRrJib6tI3/zT1L0IYx07IDHy869+HkNAhbawohiJbkh6ud3tbvebQ80Z/YXTJ4uwcbqx+jTPNKwWF45JZ61OsHUyINuPKQsn4jOUwRrNA2Ca+aTjQTUQyOtEGv8LRBo1iC+5Ersu6Hj8VGdBCO7BhyelAlZI58LWHlhg/jBEQRRLVOd0UFPgMJqySDnVtUcGORN50a+cq1dIi7uZRPMWrNSiuSl9mkzqo0hkMRxbe6gKW0ze1Ym6jmKQOPrng03wYomipYc+UreNjo/UM7FlfDV7kG1du8izxoigrLJWf3NOkwjya1sOnmTDAnZyjwbgRTnDKp8ww7rD6dhaZayRMleCh5XBidvyYZy1w3FTCTeKh4kxiXvoSZxMv4JY+VafuVOGY8RKzmhfFBXzkWimGKxXvhvEgGSceOb7b/x/IdwchhDbzzXmmURcR1YLYJmEFp6Ox4hj7w5FQxQw+Ciph4oY9pR4tg0bk0wbFmeHC98+u93ljb7DVf7LVezfobg8Wq79G7pH75AHpk6dkm7wiO2REOPlCfpI/5G/nc+db53vnxwl1dWWhuUtap/P7Hw2BaXE=</latexit>

Y : yi =

(!iyi , i 2 ⌦

0, i 62 ⌦<latexit sha1_base64="VuAhHpie/1ng8IfTYhimO6b2uX8=">AAAEbXicbVPbbtNAEHXaAMXcWhBPILQiFBBKIyc8gJAiVUJCvABFatqibBSt12Nn1V3beNeFdLX/wc/wIXwFf4AYuyYkbVeyNDpzZubMxWEuhTZB8Ku1tt6+cvXaxnX/xs1bt+9sbt090FlZcBjxTGbFUcg0SJHCyAgj4SgvgKlQwmF4/LbyH55AoUWW7pt5DhPFklTEgjOD0HTzBw0hEamFr2WNvHA+NUJGYOm+/eIc7b6h3QaZu6kgQ7+J4FhVIztTkDB0zKeiS54SQUVKP1UYodQnixd0Se1NM7Ng+BTSaJGosv/LmG52gl5QP3LR6DdGx2ve3nRr7SeNMl4qSA2XTGsb4hygcP42LTXkjB+zBMZopgjria2H58g2IhGJswK/1JAa9akGcxae5ZUeOxMRENReK104DahcMgO28pAc87sxAiKd+FVNMwMFNoKYldK4anDLQSk7EUndrZ6rMJMacTP36QzXVGBwmds8iqukKotgONLwMUthuXytFWsXikli4LtxNkyGzveXGrZM6So/NqqYmenzvgq8zDcuTfx6YkWalyiVn80pLrFORqpDIpEogBs5R4PxQhjBCZ+xgnGD6leryCRDwkwJ7lYcRhyf7vBoBRzXSngxsVByJrEueYe7oQ28Eh+qhTtUuGYcYlTxfJrCN54pxfCk6L4bDybjiUWO7fT/JbKdgXNulfnhIrNQlxFVQ1wloYLFarQ4xf5wJUSxAn8XIiE2w0Cp7jJYiGRWo7gzPPj++fO+aBwMev2XveDzoLMbNKe/4T3wHnvPvb73ytv13nt73sjj3p/Wk9ZOq7f+u32//bD96Iy61mpi7nkrr/3sLzASfFU=</latexit>

yi =

8<

:pi · N

p · (yi � ci ) + qi · nd

q · ci , i 2 ⌦nz

qi · nd

q · yi , i 2 ⌦z

, ci =@f

@m(0,mi )

<latexit sha1_base64="ZV0WJ2IBMhAaInOGmTSNxCOgQ7M=">AAAE+nicjVPLbtNAFHVKgGJeLSzZjAhFBdIqCUKwiVQJCbGhFKkvKROi8XjsjOoZuzPjQjLMF/AX7BArJFYgPoS/4doxadx2wUiRbs4993mugyzh2nQ6fxpLl5qXr1xdvuZfv3Hz1u2V1Tv7Os0VZXs0TVJ1GBDNEi7ZnuEmYYeZYkQECTsIjl4W/oMTpjRP5a6ZZGwoSCx5xCkxAI1WGw9wwGIuLTvOS+ix87HhScjsxI046vuVn0IRPfdl4MM0TA2OFKF22wFS/kfrkxHfoCP+CD1BFfm4TpbvQwfYjA7MNkIPEcdc4reCxWRk5dQhjAfPMjP0/yPFpEhRzzCFRpkMq6bbCMNwYVEL9dEsA86IMpwkKHKntnDrnbaA3mfRpysZrbQ6m53yofNGtzJaXvV2RqtLv3GY0lwwaWhCtLYBiMKU89dwrllG6BGJ2QBMCbAe2lJJh9YACVGUKvhJ2E6B+lgzMwtPs6IfO+YhQzI1pSBzp2EiS4hhtvCgDPK7AQBcwhYhrRkzwWzIIpInxhWyLgZJcsLjclo9EUGaaMDNxMdjuBkFwXlmszAqkoo0ZP09zbZTyRbLl71CbSVgk4Z9NM4Gcd/5/sLAlghd5IdBBTFjfdZXgBf5BrmJXgwtl1kOrdLZnqIc6qSouGoUcsWoSSZgEKq44RTRMQGhDXRfr5LEKRDGglNXcxh+NN2gYQ0clJ1QNbQspySBuugVaIMruBYfiLk7ECAzLDEseD6W7ANNhSBwUnjXDXrDwdACx7a6/xLZVs85V2e+Oc9U4iKiqIh1EnQwl0bzKcwHkiBBFHzMKGGR6XeEaC+CisfjEgXN4OC7Z8/7vLHf2+w+3ey867W2OtXpL3v3vPveutf1nntb3mtvx9vzaONz43vjZ+NX81PzS/Nr89uMutSoYu56tdf88RdO0bfC</latexit>

yi =

8<

:pi · N

p · yi , i 2 ⌦nz

qi · nd�Nq · yi , i 2 ⌦z

, N = nnz(X)

<latexit sha1_base64="TsGHMkNcr28Sm50hiht4WFG3sO8=">AAAEt3icbVNbb9MwFE5HgRFuGzzyYlENDTSmtggBD5UmIaG9MIa0m1SXynFOGmuxk8bOttbyT+OH8Mov4STNSrPNUqXT7/vO/STIEqFNt/untXavff/Bw/VH/uMnT58939h8caLTIudwzNMkzc8CpiERCo6NMAmcZTkwGSRwGpx/LfnTC8i1SNWRmWUwkmyiRCQ4MwiNN/7SACZCWZgWFfLO+dSIJAQ7c2NBBn7Nc8yhl1yGHOVhamiUM24PHCLVfzIbix3yhggqFP0hYcLGVs0doXT4MTOja/9p01/9Ct9jjOlqjBtB5pgbVFjXsUMo1huSAzIgVAbplVWYZZse2TP3diH839B4o9Pd7VaP3DZ6tdHx6nc43lz7TcOUFxKU4QnT2gY4Ucidv0ULDRnj52wCQzQVwnpkqzU4soVISKI0x58ypEJ9qsEs3NOsrMfGIgSiUlONc0kakFnCDNiSIRnGd0MEhMKZYVgTgwQbQsSKxLhyKatOil2ISdWtnuE0Eo24mfk0xoXn6FxkNgujMqhMQxgcazhIFaymr2rF3LlkCTFwZZwNJgPn+ysNWyZ1GR8blczE+iZXgndxw8JEn0dWqKzAUvliTlGBeVJSniQJRQ7cJDM0GM+FEZzwmOFhGKy+mSWZpCiIpeCuQRhxPn/PwwY4rCrh+chCwVmCeck33A2t4YZ/IJd0IHHNOMSw1PlUwSVPpWR4UvTIDfuj4ciixnZ614Fsp++cayq/31bm8i6hrIVNEVawXI0Wc+wPV0Iky/FTJAlEZtCVcmcVzMUkrlDcGR587+Z53zZO+ru9D7vdn/3OXrc+/XXvlffa2/Z63idvz9v3Dr1jj7f2W6p12bpqf2mP21E7XkjXWrXPS6/x2tN/FKmdcA==</latexit>

yi = si ·nd

s· yi , nd ⌘

Y

k

nk<latexit sha1_base64="Ulx3TnHpCrtgLjcFFQHJp/84jpM=">AAAES3icbVPdahNBFN4krdZtta1eejMYCiK1JPFCbwIFQbxRKjRtIRvD7OzZZMj8rDOz0XTYR/MRfADBt/BOvPDsdi3ZtgMLZ77vO39zzsaZ4Nb1ej9b7c7G5r37Ww/C7Z2Hj3b39h+fWZ0bBiOmhTYXMbUguIKR407ARWaAyljAebx4W/LnSzCWa3XqVhlMJJ0pnnJGHULTvWUUw4wrD1/yCnlRhJHjIgG/KqacDEl9s+UtYol2JEoNZV59TgpEa2w15YckwhgJQSLCaHxJoszoZLogaroII1DJWpLpXrd31KsOuW30a6Mb1Odkut/+HiWa5RKUY4Ja62PsEkwRHkS5hYyyBZ3BGE2FsJ346mkKcoBIQlJt8FOOVGgYWXBX7jor6/FzngBR2oEt1kgHMhPUgS8ZkmH8YowAV5OwzOnmIMEnkNJcOPRrOim65LOqW7uSsRYWcbcKozkOwaBznvksScugUicwHFn4qBWsp69qxdxGUkEcfHOFj2fDIgzXGvZU2jI+Niqpm9ubXAnexY1zl76ZeK6yHEtlV++U5phHk3JNSMINMCdWaFBmuOOMsDnFwTusvplFzDQK5pKzokE4vrh8yZIGOK4qYWbiIWdUYF7yDmcT1XDDP5bXdCxxzPiISakLIwVfmZaS4kpFp8V4MBlPPGp8t/8/kO8OiqJoKj/cVhp5l1DWwqYIK7gejeWX2B+OhEhq8PchAlI37El5uA4aPptXKM4MF75/c71vG2eDo/6ro96nQfe4V6/+VvA0eBY8D/rB6+A4eB+cBKOABb9a7dZ2a6fzo/O786fz90rabtU+T4LG2dj8B9hMdi8=</latexit>

Page 28: High performance tensor decomposition on emerging

Performance Considerations

§ Even for semi-stratified we use stratified for estimating objective function

§ Stratified sampling requires efficient way of determining whether a given sample corresponds to a zero or nonzero§ Sort tensor indices lexicographically and do binary search

§ O(log(N)) search, no increase in storage § Oversample Y tensor, sort, and do bulk search

§ Construct hash-map with tensor indices as keys§ O(1) search but O(N) increase in storage§ Use Kokkos::UnorderedMap for parallel, lock-free map construction

§ For gradient evaluation, explicitly construct sampled Y tensor and pass to MTTKRP§ Reuse code from CP-ALS§ Allows use of e.g., permutation data structure for efficient MTTKRP (but

with cost of recomputing permutation arrays each SGD iteration)

27

Page 29: High performance tensor decomposition on emerging

SGD-ADAM Convergence with Varying Sample Sizes

28

Page 30: High performance tensor decomposition on emerging

What is Kokkos?

§ κόκκος (Greek, not an acronym)§ Translation: “granule” or “grain” ; like grains of sand on a beach§ H.C. Edwards, C. Trott, et al

§ Performance Portable Thread-Parallel Programming Model§ E.g., “X” in “MPI+X” ; not a distributed-memory programming model§ Application identifies its parallelizable grains of computations and data§ Kokkos maps those computations onto cores and that data onto memory

§ Fully Performance Portable C++11 Library Implementation§ Not a language extension (e.g., OpenMP, OpenACC, OpenCL, …)§ Production – open source at https://github.com/kokkos/kokkos

ü Compilers: GNU, LLVM, Intel, NVIDIA, IBM XL, Crayü Multicore CPU - including NUMA architectural concernsü Intel Xeon Phi (KNC/KNL) ü NVIDIA GPU (Kepler, Pascal, Volta) ü IBM Power 8/9o AMD Fusion

29

Page 31: High performance tensor decomposition on emerging

Advantages of Kokkos (my view)

§ Tools to build portable, thread-parallel algorithms with good performance§ Tools to manage memory allocations and multiple memory spaces§ Ability to introduce data layout polymorphism§ Simple, portable interfaces to advanced architecture-specific capabilities§ Light-weight abstractions that don’t impede performance

§ Future-proofing§ Substantial buy-in from DOE and vendors§ Expected to be part of C++-2x standard

§ Flexibility to write directly to the underlying programming model§ Can always write specializations specific to an architecture if

necessary/desired

30

Page 32: High performance tensor decomposition on emerging

Kokkos Abstractions

§ Parallel Pattern of user’s computations§ parallel_for, parallel_reduce, parallel_scan, task-graph, ... (extensible)

§ Execution Policy tells how user computation will be executed§ Static scheduling, dynamic scheduling, thread-teams, ... (extensible)

§ Execution Space tells where user computations will execute§ Which cores, numa region, GPU, ... (extensible)

§ Memory Space tells where user data resides§ Host memory, GPU memory, high bandwidth memory, ... (extensible)

§ Layout (policy) tells how user data is laid out in memory§ Row-major, column-major, array-of-struct, struct-of-array … (extensible)

§ Differentiating: Layout and Memory Space§ Versus other programming models (OpenMP, OpenACC, …)§ Critical for performance portability …

31

Page 33: High performance tensor decomposition on emerging

typedef Kokkos::RangePolicy<ExecSpace> Policy;const int nnz = X.nnz();Policy policy(0,nnz);Kokkos::parallel_for( policy,

KOKKOS_LAMBDA(const int i){double *tmp = new double[R];

const size_t k = X.subscript(i,n);const double x_val = X.value(i);

for (int j = 0; j < R; ++j)tmp[j] = x_val;

for (int m = 0; m < N; m++) {if (m != n) {const double *row = u[m].rowptr(X.subscript(i,m));for (int j = 0; j < R; ++j)tmp[j] *= row[j];

}}

for (int j = 0; j < R; ++j)Kokkos::atomic_add(&v.entry(k,j), tmp[j]);

delete [] tmp;});

Flat Parallelism Over Tensor Non-zeros

Dynamic memory allocation (very bad on GPU)

Not exploiting parallelism over columns of factor matrices

32

Page 34: High performance tensor decomposition on emerging

Kokkos Hierarchical Parallelism

§ Kokkos maps a league of teams across a given index range§ Each team consists of a given number of “threads”§ Each “thread” can iterate over a “vector” range

§ Team:§ CPU/Phi: One or more cores§ GPU: Cuda thread block

§ Thread:§ CPU/Phi: Hyperthread in a core§ GPU: Warp in a thread block

§ Vector:§ CPU/Phi: Vector lane in a SIMD instruction§ GPU: Thread within a warp

33

Page 35: High performance tensor decomposition on emerging

typedef Kokkos::TeamPolicy<ExecSpace> Policy;const int nnz = X.nnz();const int VectorSize = 8;const int TeamSize = 1;Policy policy(nnz,TeamSize,VectorSize);

typedef Kokkos::View< double*,ExecSpace::scratch_memory_space, Kokkos::MemoryUnmanaged > ScratchSpace;

const size_t bytes = ScratchSpace::shmem_size(R);

Kokkos::parallel_for(policy.set_scratch_size(0,Kokkos::PerTeam(bytes)),KOKKOS_LAMBDA(Policy::member_type team)

{const size_t i = team.league_rank();

ScratchSpace tmp(team.team_scratch(0), R);

const size_t k = X.subscript(i,n);const double x_val = X.value(i);

Adding Scratch Space and Vector Parallelism

Vector parallelism across columns of factor matrices

Kokkos::parallel_for(Kokkos::ThreadVectorRange(team,R),[&] (const int r)

{tmp[r] = x_val;

});

for (int m = 0; m < N; ++m) {if (m != n) {const double *row = u[m].rowptr(X.subscript(i,m));Kokkos::parallel_for(Kokkos::ThreadVectorRange(team,R),

[&] (const int r){tmp[r] *= row[r];

});}

}

Kokkos::parallel_for(Kokkos::ThreadVectorRange(team,R), [&] (const int r)

{Kokkos::atomic_add(&v.entry(k,r), tmp[r]);

});});

34

Page 36: High performance tensor decomposition on emerging

typedef Kokkos::TeamPolicy<ExecSpace> Policy;const int nnz = X.nnz();const int VectorSize = 8;const int TeamSize = 1;Policy policy(nnz,TeamSize,VectorSize);

typedef Kokkos::View< double*,ExecSpace::scratch_memory_space, Kokkos::MemoryUnmanaged > ScratchSpace;

const size_t bytes = ScratchSpace::shmem_size(R);

Kokkos::parallel_for(policy.set_scratch_size(0,Kokkos::PerTeam(bytes)),KOKKOS_LAMBDA(Policy::member_type team)

{const size_t i = team.league_rank();

ScratchSpace tmp(team.team_scratch(0), R);

const size_t k = X.subscript(i,n);const double x_val = X.value(i);

Adding Scratch Space and Vector Parallelism

Allocation from scratch memory space (shared memory on GPU)

Kokkos::parallel_for(Kokkos::ThreadVectorRange(team,R),[&] (const int r)

{tmp[r] = x_val;

});

for (int m = 0; m < N; ++m) {if (m != n) {const double *row = u[m].rowptr(X.subscript(i,m));Kokkos::parallel_for(Kokkos::ThreadVectorRange(team,R),

[&] (const int r){tmp[r] *= row[r];

});}

}

Kokkos::parallel_for(Kokkos::ThreadVectorRange(team,R), [&] (const int r)

{Kokkos::atomic_add(&v.entry(k,r), tmp[r]);

});});

35

Page 37: High performance tensor decomposition on emerging

typedef Kokkos::TeamPolicy<ExecSpace> Policy;const int nnz = X.nnz();const int VectorSize = 8;const int TeamSize = 1;Policy policy(nnz,TeamSize,VectorSize);

typedef Kokkos::View< double*, ExecSpace::scratch_memory_space, Kokkos::MemoryUnmanaged > ScratchSpace;

const size_t bytes = ScratchSpace::shmem_size(R);

Kokkos::parallel_for(policy.set_scratch_size(0,Kokkos::PerTeam(bytes)),KOKKOS_LAMBDA(Policy::member_type team)

{const size_t i = team.league_rank();

ScratchSpace tmp(team.team_scratch(0), R);

const size_t k = X.subscript(i,n);const double x_val = X.value(i);

Kokkos::parallel_for(Kokkos::ThreadVectorRange(team,R),[&] (const int r)

{tmp[r] = x_val;

});

for (int m = 0; m < N; ++m) {if (m != n) {const double *row = u[m].rowptr(X.subscript(i,m));Kokkos::parallel_for(Kokkos::ThreadVectorRange(team,R),

[&] (const int r){tmp[r] *= row[r];

});}

}

Kokkos::parallel_for(Kokkos::ThreadVectorRange(team,R), [&] (const int r)

{Kokkos::atomic_add(&v.entry(k,r), tmp[r]);

});});

§ Final MTTKRP algorithm:§ Incorporates team size > 1 on GPU for better occupancy (process blocks of non-zeros)§ Blocking on R for better cache performance

§ Template algorithm on R block size for better vectorization§ Choose block size to balance occupancy, tensor reads, minimize remainder term

Adding Scratch Space and Vector Parallelism

36

Page 38: High performance tensor decomposition on emerging

typedef Kokkos::TeamPolicy<ExecSpace> Policy;const bool is_cuda = std::is_same<ExecSpace, Kokkos::Cuda>::value;

const int nnz = X.nnz();const int VectorSize = 8;const int TeamSize = is_cuda ? 128/VectorSize : 1;const int LeagueSize = (nnz+TeamSize-1)/TeamSize;Policy policy(LeagueSize,TeamSize,VectorSize);

typedef Kokkos::View< double**, Kokkos::LayoutRight, ExecSpace::scratch_memory_space,Kokkos::MemoryUnmanaged > ScratchSpace;

const size_t bytes = ScratchSpace::shmem_size(TeamSize,R);

Kokkos::parallel_for(policy.set_scratch_size(0,Kokkos::PerTeam(bytes)),KOKKOS_LAMBDA(Policy::member_type team)

{const int team_index = team.team_rank();const size_t i = team.league_rank()*TeamSize+team_index;if (i >= nnz) return;

ScratchSpace tmp(team.team_scratch(0), TeamSize, R);

const size_t k = X.subscript(i,n);const double x_val = X.value(i);

Kokkos::parallel_for(Kokkos::ThreadVectorRange(team,R),[&] (const int r)

{tmp(team_index,r) = x_val;

});

for (int m = 0; m < n; ++m) {if (m != n) {

const double *row = u[m].rowptr(X.subscript(i,m));Kokkos::parallel_for(Kokkos::ThreadVectorRange(team,R),

[&] (const int r){

tmp(team_index,r) *= row[r];});

}}

Kokkos::parallel_for(Kokkos::ThreadVectorRange(team,R),[&] (const int r)

{Kokkos::atomic_add(&v.entry(k,r), tmp(team_index,r));

});});

Adding Team Parallelism

37