12
Heterogenous Bitwidth Binarization: Weird Operators with Big Benefits Josh Fromm

Weird Operators with Big Benefits Heterogenous Bitwidth ...Josh Fromm. Network Binarization Multiply-accumulate becomes xnor-popcount. 5-30x theoretical speedup. 32x weight memory

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Weird Operators with Big Benefits Heterogenous Bitwidth ...Josh Fromm. Network Binarization Multiply-accumulate becomes xnor-popcount. 5-30x theoretical speedup. 32x weight memory

Heterogenous Bitwidth Binarization: Weird Operators with Big Benefits

Josh Fromm

Page 2: Weird Operators with Big Benefits Heterogenous Bitwidth ...Josh Fromm. Network Binarization Multiply-accumulate becomes xnor-popcount. 5-30x theoretical speedup. 32x weight memory

Network Binarization

● Multiply-accumulate becomes xnor-popcount.

● 5-30x theoretical speedup.● 32x weight memory compression.

Page 3: Weird Operators with Big Benefits Heterogenous Bitwidth ...Josh Fromm. Network Binarization Multiply-accumulate becomes xnor-popcount. 5-30x theoretical speedup. 32x weight memory

Network Binarization

● Multiply-accumulate becomes xnor-popcount.

● 5-30x theoretical speedup.● 32x weight memory compression.

● 1-bit accuracy is too low but fast.● 2-bit accuracy is high but too slow.● How to bridge the gap?

Page 4: Weird Operators with Big Benefits Heterogenous Bitwidth ...Josh Fromm. Network Binarization Multiply-accumulate becomes xnor-popcount. 5-30x theoretical speedup. 32x weight memory

Mixed Bitwidth Tensors

Page 5: Weird Operators with Big Benefits Heterogenous Bitwidth ...Josh Fromm. Network Binarization Multiply-accumulate becomes xnor-popcount. 5-30x theoretical speedup. 32x weight memory

Mixed Bitwidth Tensors

Page 6: Weird Operators with Big Benefits Heterogenous Bitwidth ...Josh Fromm. Network Binarization Multiply-accumulate becomes xnor-popcount. 5-30x theoretical speedup. 32x weight memory

Middle-Out Bit Distribution

Page 7: Weird Operators with Big Benefits Heterogenous Bitwidth ...Josh Fromm. Network Binarization Multiply-accumulate becomes xnor-popcount. 5-30x theoretical speedup. 32x weight memory

Middle-Out Bit Distribution

Page 8: Weird Operators with Big Benefits Heterogenous Bitwidth ...Josh Fromm. Network Binarization Multiply-accumulate becomes xnor-popcount. 5-30x theoretical speedup. 32x weight memory

Middle-Out Bit Distribution

Page 9: Weird Operators with Big Benefits Heterogenous Bitwidth ...Josh Fromm. Network Binarization Multiply-accumulate becomes xnor-popcount. 5-30x theoretical speedup. 32x weight memory

Super-Linear Scaling

Page 10: Weird Operators with Big Benefits Heterogenous Bitwidth ...Josh Fromm. Network Binarization Multiply-accumulate becomes xnor-popcount. 5-30x theoretical speedup. 32x weight memory

Super-Linear Scaling

Page 11: Weird Operators with Big Benefits Heterogenous Bitwidth ...Josh Fromm. Network Binarization Multiply-accumulate becomes xnor-popcount. 5-30x theoretical speedup. 32x weight memory

Super-Linear Scaling

Page 12: Weird Operators with Big Benefits Heterogenous Bitwidth ...Josh Fromm. Network Binarization Multiply-accumulate becomes xnor-popcount. 5-30x theoretical speedup. 32x weight memory

Hard to Implement!

Implementing on CPU

● Needs efficient sparse tensor library support

Implementing on FPGA

● Gates can be directly laid out for big benefits

● Designing FPGAs is hard, especially for non-uniform computation

TVM can enable these platforms!