A Random Forest using a Multi-valued Decision Diagram on an FPGA
Hiroki Nakahara (1), Akira Jinguji (1), Shimpei Sato (1), Tsutomu Sasao (2)
(1) Tokyo Institute of Technology, JP; (2) Meiji University, JP
May 22, 2017 @ ISMVL 2017
Outline
• Background
• Random forest (RF)
• Multi-valued decision diagram (MDD)
• RF using MDDs
• Experimental results
• Conclusion
Machine Learning
• Abundant computation power and big data
[Figure, left: "Single-Threaded Integer Performance," 2016. Right: Nakahara, "Trend of Search Engine on modern Internet," 2014]
Machine Learning Algorithms
[Figure: M. Warrick, "How to get started with machine learning," PyCon 2014]
Introduction
• Random Forest (RF)
  • Ensemble learning method
  • Consists of multiple decision trees (DTs)
  • Applications: segmentation, human pose detection
• It is based on binary DTs (BDTs)
  • A node is evaluated by an if-then-else statement
  • The same variable may appear several times on a path
• Multiple-valued decision diagram (MDD)
  • Each variable appears only once on a path
Introduction (Cont'd)
• Target platform
  • CPU: too slow
  • GPU: not well suited to the RF, so it is slow and consumes much power
  • FPGA: faster and low power, but long turnaround time (TAT)
• High-level synthesis (HLS) for the RF using MDDs on an FPGA
  • Low power, high performance, short design time
Random Forest
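Classification by a Binary Decision Tree (BDT)
• Partition of the feature map
[Figure: the (X1, X2) feature space, partitioned into regions labeled C1 and C2 by X1 thresholds 0.09, 0.63, 0.71 and X2 thresholds 0.29, 0.53, shown next to the corresponding BDT with decision nodes X2<0.53?, X2<0.29?, X1<0.09?, X1<0.63?, X1<0.71? whose Y/N branches end in leaves C1 and C2]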
Training of a BDT
• Each BDT is built from randomized samples
• The dataset is partitioned recursively so that each split maximizes the information gain (that is, reduces the entropy)
  → The same variable may appear several times on a path
[Figure: the same feature-space partition and BDT as on the previous slide]
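The recursive partitioning step can be illustrated with a short sketch. It scores a candidate threshold by information gain (parent entropy minus the weighted entropy of the two children), which is one common criterion; the slides do not say which criterion the authors used. The function names and the toy samples are assumptions made for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(samples, labels, var, threshold):
    """Gain obtained by splitting on samples[i][var] < threshold."""
    left = [y for x, y in zip(samples, labels) if x[var] < threshold]
    right = [y for x, y in zip(samples, labels) if x[var] >= threshold]
    n = len(labels)
    children = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - children

def best_split(samples, labels):
    """Try every (variable, observed value) pair and keep the highest gain."""
    best = (0.0, None, None)
    for var in range(len(samples[0])):
        for threshold in sorted({x[var] for x in samples}):
            gain = information_gain(samples, labels, var, threshold)
            if gain > best[0]:
                best = (gain, var, threshold)
    return best

# Toy data (hypothetical): (X1, X2) points with their class labels.
SAMPLES = [(0.05, 0.80), (0.40, 0.70), (0.80, 0.90), (0.30, 0.10), (0.90, 0.20)]
LABELS = ["C1", "C1", "C1", "C2", "C2"]
```

Applying `best_split` again to each resulting subset gives the recursion; nothing prevents a later split from picking a variable that was already used higher up, which is why the same variable can recur on a path.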
Random Forest (RF)
• Ensemble learning
• Classification and regression
• Consists of multiple BDTs
[Figure: a random forest. The input is fed to Tree 1, Tree 2, ..., Tree n, each of which is a BDT (for example, Tree 1 with decision nodes X1<0.53?, X3<0.71?, X2<0.63?, X3<0.72? and leaves C1, C2, C3). The per-tree outputs (for example C1, C2, C1) go to a voter, which outputs the class (C1).]
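The voter stage can be sketched as a simple majority vote over the per-tree predictions. This is a minimal illustration, not the authors' hardware voter; the three one-node trees and the tie-breaking rule are assumptions.

```python
from collections import Counter

def vote(predictions):
    """Return the class predicted by the most trees (on a tie, the first seen wins)."""
    return Counter(predictions).most_common(1)[0][0]

def forest_predict(trees, x):
    """Evaluate every tree on the same input and let the voter decide."""
    return vote([tree(x) for tree in trees])

# Hypothetical trees, each reduced to a single decision node for brevity.
TREES = [
    lambda x: "C1" if x[0] < 0.53 else "C2",
    lambda x: "C2" if x[1] < 0.63 else "C1",
    lambda x: "C1" if x[2] < 0.71 else "C3",
]
```

Because the trees do not depend on one another, they can be evaluated in parallel, and only the voter has to wait for all of them.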
Applications
• Key point matching [Lepetit et al., 2006]
• Object detector [Shotton et al., 2008][Gall et al., 2011]
• Handwritten character recognition [Amit & Geman, 1997]
• Visual word clustering [Moosmann et al., 2006]
• Pose recognition [Yamashita et al., 2010]
• Human detector [Mitsui et al., 2011][Dahang et al., 2012]
• Human pose estimation [Shotton 2011]
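Known Problem
• BDTs are built from randomized samples
  • The same variable may appear several times on a path
• Evaluation tends to be slow, even on GPUs
[Figure: a BDT with decision nodes X2<0.53?, X2<0.29?, X2<0.09?, X1<0.63?, X1<0.71? and leaves C1 and C2; X2 is tested more than once on a path. Each node corresponds to a statement of the form:]
if X2 < 0.09 then
  output C1;
else
  goto Child_node;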
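The if-then-else/goto pattern above can be sketched as a loop over a node table. The thresholds are taken from the slide figure, but the wiring between nodes and the leaf labels are assumptions, since the exact tree cannot be recovered from the extracted text.

```python
X1, X2 = 0, 1  # positions of the two features in the input tuple

# Each non-terminal node: (variable, threshold, target if true, target if false).
# A target is either another node index (int) or a class label (str).
# Hypothetical wiring; only the thresholds come from the slide.
BDT = [
    (X2, 0.53, 1, 2),        # node 0: X2 < 0.53 ?
    (X2, 0.29, 3, 4),        # node 1: X2 < 0.29 ?  (X2 tested again)
    (X1, 0.09, "C1", "C2"),  # node 2: X1 < 0.09 ?
    (X1, 0.63, "C1", "C2"),  # node 3: X1 < 0.63 ?
    (X1, 0.71, "C2", "C1"),  # node 4: X1 < 0.71 ?
]

def classify(bdt, x):
    """Walk the tree from node 0; return the class and the variables tested."""
    node, tested = 0, []
    while True:
        var, threshold, if_true, if_false = bdt[node]
        tested.append(var)
        target = if_true if x[var] < threshold else if_false
        if isinstance(target, str):   # reached a leaf: output the class
            return target, tested
        node = target                 # otherwise: goto child node

label, tested = classify(BDT, (0.5, 0.1))
```

For the input `(0.5, 0.1)` the walk tests X2 twice before it ever looks at X1, so the number of sequential comparisons depends on the tree depth rather than on the number of variables.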
Multi-valued Decision Diagram
Binary Decision Diagram (BDD)
• Recursively apply the Shannon expansion to a given logic function
  • Non-terminal node: if-then-else statement
  • Terminal node: sets the functional value
[Figure: a BDD over the variables x1, ..., x6 with non-terminal nodes and terminal nodes 0 and 1]
Measurement of BDD
• Memory size: (# of nodes) × (size of a node)
• Worst-case performance: LPL (longest path length)
  → Dedicated fully pipelined hardware
[Figure: a BDD over the variables x1, ..., x6 with terminal nodes 0 and 1]
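Both measures can be computed directly from a node table. The sketch below uses a small hypothetical BDD for f = (x1 AND x2) OR x3, counts LPL as the number of non-terminal nodes on the longest root-to-terminal path, and treats the node size in bits as a free parameter; all three choices are assumptions made for illustration.

```python
from functools import lru_cache

# node id -> (variable, child when the variable is 0, child when it is 1).
# Ids that are not keys of the table ("0" and "1") are terminal nodes.
BDD = {
    "a": ("x1", "c", "b"),
    "b": ("x2", "c", "1"),
    "c": ("x3", "0", "1"),
}
ROOT = "a"

def reachable_nodes(dd, root):
    """Set of non-terminal nodes reachable from the root (shared nodes count once)."""
    seen, stack = set(), [root]
    while stack:
        n = stack.pop()
        if n in dd and n not in seen:
            seen.add(n)
            stack.extend(dd[n][1:])
    return seen

def longest_path_length(dd, root):
    """Number of non-terminal nodes on the longest root-to-terminal path."""
    @lru_cache(maxsize=None)
    def lpl(n):
        if n not in dd:
            return 0
        return 1 + max(lpl(child) for child in dd[n][1:])
    return lpl(root)

def memory_bits(dd, root, bits_per_node):
    """Memory size = (# of nodes) x (size of a node)."""
    return len(reachable_nodes(dd, root)) * bits_per_node
```

The same functions work for nodes with more than two children, because they only look at the child entries after the variable name.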
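Multi-Valued Decision Diagram (MDD)
• MDD(k): each node has 2^k outgoing edges
• Evaluates k variables at a time
[Figure: a BDD over x1, ..., x6 next to the corresponding MDD(2) over the super variables X1 = {x1,x2}, X2 = {x3,x4}, X3 = {x5,x6}, both with terminal nodes 0 and 1]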
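Evaluating k binary variables at a time amounts to packing them into one index with 2^k possible values and using it to select an outgoing edge. The sketch below shows this for a hypothetical MDD(2) of f = (x1 AND x2) OR (x3 AND x4); the function and the node layout are assumptions, and four variables are used instead of the six in the figure to keep it short.

```python
K = 2  # number of binary variables grouped into one super variable

def group_index(bits):
    """Pack k binary values into one 2^k-valued index (first bit most significant)."""
    idx = 0
    for b in bits:
        idx = (idx << 1) | b
    return idx

# Each non-terminal node: (position of its first binary variable, 2^K children).
# Integers 0 and 1 are the terminal nodes.
NODE_X2 = (2, [0, 0, 0, 1])                     # super variable {x3, x4}
NODE_X1 = (0, [NODE_X2, NODE_X2, NODE_X2, 1])   # super variable {x1, x2}

def eval_mdd(node, x, k=K):
    """Follow one edge per super variable; return the value and the step count."""
    steps = 0
    while not isinstance(node, int):
        first, children = node
        node = children[group_index(x[first:first + k])]
        steps += 1
    return node, steps
```

With four binary inputs the walk takes at most two steps, where a BDD over the same variables could need up to four.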
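Comparison of the BDT with the MDD
[Figure: the BDT (decision nodes X2<0.53?, X2<0.29?, X1<0.09?, X1<0.63?, X1<0.71? with leaves C1 and C2) next to the equivalent MDD, in which a single X2 node has the outgoing edges <0.29, <0.53, <1.00 leading to X1 nodes, whose edges (<0.63, <0.71, <1.00) lead to the terminal nodes C1 and C2]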
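In the MDD, each variable is tested once and its value selects one of several interval edges. A minimal sketch of that evaluation follows; the thresholds come from the slide, while the wiring and the class labels per interval are assumptions, and `bisect_right` is used here as one convenient way to find the interval.

```python
from bisect import bisect_right

X1, X2 = 0, 1  # positions of the two features in the input tuple

def node(var, cuts, children):
    """A multi-valued node: len(cuts) thresholds define len(cuts) + 1 edges."""
    assert len(children) == len(cuts) + 1
    return {"var": var, "cuts": cuts, "children": children}

# Hypothetical MDD: one X2 node on top, one X1 node per X2 interval.
MDD = node(X2, [0.29, 0.53], [
    node(X1, [0.63], ["C1", "C2"]),   # X2 < 0.29
    node(X1, [0.71], ["C2", "C1"]),   # 0.29 <= X2 < 0.53
    node(X1, [0.09], ["C1", "C2"]),   # X2 >= 0.53
])

def classify_mdd(mdd, x):
    """Follow one edge per variable; return the class and the variables tested."""
    tested, current = [], mdd
    while isinstance(current, dict):
        var = current["var"]
        tested.append(var)
        # bisect_right counts the thresholds t with t <= x[var],
        # which is the index of the interval that contains x[var].
        current = current["children"][bisect_right(current["cuts"], x[var])]
    return current, tested
```

Every input reaches a terminal after exactly two node visits, one per variable, which bounds the path length by the number of variables.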
# of Nodes
[Figure: the same (X1, X2) feature-space partition into C1 and C2 regions (X1 thresholds 0.09, 0.63, 0.71; X2 thresholds 0.29, 0.53) shown twice, once for the BDT and once for the MDD]
Complexities of the BDT and the MDD
       # Nodes      LPL
BDT    O(Σ|Xi|)     O(Σ|Xi|)
MDD    O(|Xi|^k)    O(n)
• The RF prefers shallow decision trees to avoid overfitting
Random Forest using MDDs on an FPGA
FPGA (Field Programmable Gate Array)
• Reconfigurable architecture
  • Look-up table (LUT)
  • Configurable channel
• Advantages
  • Faster than a CPU
  • Dissipates less power than a GPU
  • Shorter design time than an ASIC
Fully Pipelined Circuit
[Figure: the input X is fed to Tree 1, Tree 2, ..., Tree b; the per-tree outputs (for example C1, C2, C1) go to a voter, which outputs the class (C1)]
MUX-based Realization
System Design Tool
1. Behavior design + pragmas
2. Profile analysis
3. IP core generation by HLS
4. Bitstream generation by FPGA CAD tool
5. Middleware generation
↓ Automatically done
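Proposed Tool Flow
[Figure: tool flow. scikit-learn trains the random forest from the training dataset, with hyperparameters chosen by grid search. RF2AOC converts the random forest into host code and kernel code. gcc compiles the host code into a binary for the host PC, and aoc (Intel SDK for OpenCL) compiles the kernel code into an aocx for the FPGA board.]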
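The conversion step in this flow turns a trained model into C-like kernel source. The slides do not show how RF2AOC does this, so the following is only a sketch of the general idea under stated assumptions: a hypothetical toy tree stored as nested tuples is printed as a C function of nested if-else statements. The tree, the class ids, and the function names are all invented for illustration, and the emitted text is plain C rather than a complete OpenCL kernel.

```python
# Hypothetical toy tree: (variable index, threshold, subtree if true, subtree if false).
# A leaf is a class label string.
TOY_TREE = (1, 0.53,
            (0, 0.63, "C1", "C2"),
            (0, 0.09, "C1", "C2"))
CLASS_IDS = {"C1": 0, "C2": 1}

def emit_tree(node, depth=1):
    """Return C source for one subtree as nested if-else statements."""
    pad = "    " * depth
    if isinstance(node, str):
        return f"{pad}return {CLASS_IDS[node]};\n"
    var, threshold, if_true, if_false = node
    return (
        f"{pad}if (x[{var}] < {threshold}f) {{\n"
        + emit_tree(if_true, depth + 1)
        + f"{pad}}} else {{\n"
        + emit_tree(if_false, depth + 1)
        + f"{pad}}}\n"
    )

def emit_function(name, tree):
    """Wrap the emitted statements in a C function that returns the class id."""
    return f"int {name}(const float *x) {{\n{emit_tree(tree)}}}\n"

if __name__ == "__main__":
    print(emit_function("tree0", TOY_TREE))
```

A generator of this kind would be run once per tree in the forest, with a voter function emitted alongside; an HLS compiler can then turn the generated source into hardware.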
Experimental Results
Comparison of the MDD-based RF with the BDT-based RF
                        BDT                         Max.    MDD
Name                    Path len.     #Nodes        path    Path len.     #Nodes
                        (perform.)    (mem.)                (perform.)    (mem.)
Dermatology             720           676           15      322           118336
Contraceptive Method    600           1055          9       198           7360
Glass Identification    952           1260          10      268           17204
Hayes-Roth              480           577           5       73            448
Hepatitis               720           1040          15      357           145664
Ionosphere              1196          1077          20      381           671744
Iris                    1056          777           4       199           517
Dataset: UCI Machine Learning Repository, http://archive.ics.uci.edu/ml/datasets.html
Comparison of Platforms
• Implemented the RF on the following devices
  • CPU: Intel Core i7 650
  • GPU: NVIDIA GeForce GTX Titan
  • FPGA: Terasic DE5-NET
• Measured dynamic power, including the host PC
• Test bench: 10,000 random vectors
• Execution time includes the communication time between the host PC and the devices
[Photos: GPU and FPGA boards]
Comparison of Platforms
                        GPU @ 86 W           CPU @ 13 W           FPGA @ 15 W
                        GeForce Titan        Xeon (R) E5607       Stratix V A7
Name                    LPS       LPS/W      LPS       LPS/W      LPS        LPS/W
Dermatology             336.2     3.9        211.6     16.3       3221.2     214.7
Contraceptive Method    521.9     6.1        286.4     22.0       10924.3    728.3
Glass Identification    726.7     8.5        587.5     45.2       6442.3     429.5
Hayes-Roth              1512.9    17.6       1165.5    89.7       12884.6    859.0
Hepatitis               739.1     8.6        662.7     51.0       8209.9     547.3
Ionosphere              821.0     9.5        595.9     45.8       9663.5     644.2
Iris                    446.6     5.2        436.7     33.6       4831.7     322.1
LPS: #Looks Per Second
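The LPS/W columns are the LPS values divided by the power listed for each platform. A short sketch that reproduces three rows of the table is given below; the dictionary layout and function names are assumptions, while the numbers are copied from the slide.

```python
# Power per platform in watts, as listed in the table header.
POWER_W = {"GPU": 86, "CPU": 13, "FPGA": 15}

# LPS values for three of the datasets, copied from the table.
LPS = {
    "Dermatology": {"GPU": 336.2, "CPU": 211.6, "FPGA": 3221.2},
    "Hayes-Roth": {"GPU": 1512.9, "CPU": 1165.5, "FPGA": 12884.6},
    "Iris": {"GPU": 446.6, "CPU": 436.7, "FPGA": 4831.7},
}

def lps_per_watt(lps, platform):
    """Performance per watt for one measurement."""
    return lps / POWER_W[platform]

def efficiency_table(rows=LPS):
    """LPS/W for every dataset and platform, rounded to one decimal place."""
    return {
        name: {p: round(lps_per_watt(v, p), 1) for p, v in by_platform.items()}
        for name, by_platform in rows.items()
    }

if __name__ == "__main__":
    for name, row in efficiency_table().items():
        print(name, row)
```

For Dermatology this gives 3.9, 16.3, and 214.7 LPS/W for the GPU, CPU, and FPGA respectively, matching the table.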
Conclusion
• Proposed the RF using MDDs
  • Reduced the path length
  • Increased the column multiplicity
    • # of nodes: O(|X|^k)
  • A shallow decision diagram is recommended to avoid overfitting
• Developed the high-level synthesis design flow toward the FPGA realization
  • 10.7x faster than the GPU
  • 14.0x faster than the CPU