
JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, VOL.19, NO.1, FEBRUARY, 2019 ISSN(Print) 1598-1657 https://doi.org/10.5573/JSTS.2019.19.1.129 ISSN(Online) 2233-4866

Manuscript received Feb. 8, 2019; accepted Feb. 8, 2019.
1 School of Electrical and Computer Engineering, UNIST
2 UX Factory Inc.
3 School of Electrical Engineering, KAIST
E-mail: [email protected]

A Low-power, Mixed-mode Neural Network Classifier for Robust Scene Classification

Kyuho Lee1, Junyoung Park2, and Hoi-Jun Yoo3

Abstract—A low-power neural network classifier processor is proposed for real-time mobile scene classification. It has an analog-digital mixed-mode architecture to save power and area while preserving fast operation speed and high classification accuracy. Its current-mode analog datapath replaces massive digital computations such as multiply-accumulate and look-up table operations, saving 84.0% of area and 82.2% of power compared with a digital ASIC implementation. Moreover, the processor integrates a multi-modal and highly controllable radial basis function circuit that compensates for environmental noise, so the processor maintains high classification accuracy despite the temperature and supply voltage variations that are critical in mobile devices. In addition, its reconfigurable architecture supports both the multi-layer perceptron and the radial basis function network. The proposed processor, fabricated in a 0.13 μm CMOS process, occupies 0.14 mm² with 2.20 mW average power consumption and attains 92% classification accuracy.

Index Terms—Mixed-mode SoC, classifier, neural network processor, multi-layer perceptron, radial basis function network

I. INTRODUCTION

Deep neural network has been actively developed as a

classifier in various computer vision applications due to its high classification accuracy [1-5]. However, deep networks involve too many computations and too large a memory footprint to be deployed in battery-driven mobile devices yet. Besides deep neural networks, Support Vector Machines and conventional neural networks have been commonly used for mobile applications, but the former suffers from massive computation and data storage compared with the latter. Among the variants of neural networks, the Radial Basis Function Network (RBFN) and the Multilayer Perceptron (MLP) are most commonly used for classification on account of their high classification accuracy and fast learning process compared with other types of classifiers [6, 7]. Mobile applications such as object recognition in unmanned aerial vehicles [8] and head-mounted displays [9, 10] require low power consumption while preserving high classification accuracy and fast processing speed. However, previous neural network implementations could not achieve such high accuracy and low power consumption at the same time due to the complementary advantages and disadvantages of analog and digital circuit implementations [11-19].

Digital implementations of neural networks [11-13] can achieve high accuracy and programmability with noise tolerance, but they consume large power and area. Hundreds of weights must be stored in SRAM, along with a number of nonlinear activation functions in Look-Up Tables (LUTs). The LUT values are usually approximated, because holding every entry requires large memory: storing one activation at 8-b resolution requires 256 B, and digital multiplication doubles the precision, so an external memory is needed to store the parameters. Moreover,
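As a back-of-the-envelope check on these figures, the LUT cost can be sketched as follows (a minimal illustration of the arithmetic in the text; the helper name is ours):

```python
# Rough LUT sizing for a digital neural-network datapath.
# Assumption (illustrative): one LUT entry per input code.

def lut_bytes(input_bits: int, entry_bytes: int = 1) -> int:
    """A full activation LUT needs one entry per input code."""
    return (2 ** input_bits) * entry_bytes

# An 8-b activation LUT holds 2^8 = 256 one-byte entries -> 256 B.
print(lut_bytes(8))                   # 256

# A digital multiply doubles precision: 8-b x 8-b -> 16-b product, so an
# exact post-multiply LUT would need 2^16 two-byte entries -> 128 KiB.
print(lut_bytes(16, entry_bytes=2))   # 131072
```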


parallel processing of the activation function for multiple neurons was not possible, degrading the processing time [14]. On the other hand, analog VLSI implementations provide low-cost parallelism as well as low-power computation such as addition, but their inaccurate circuit parameters, noise, and low precision degraded classification accuracy [15-17]. Several mixed-mode SoCs took the advantages of both analog and digital implementations, obtaining low power consumption within a small area; however, they paid no attention to noise compensation, even though mobile devices suffer from supply voltage fluctuation and temperature variation that directly ruin accuracy [18, 19].

In this paper, an analog-digital mixed-mode neural network classifier (NNC) is proposed for mobile scene classification. Its highly controllable radial basis function (RBF) circuit generates a sigmoid-shaped function as well as various shapes of RBF [20], supporting both an approximated MLP and RBFN. By adopting environmental noise compensation in the analog core, namely supply voltage (ΔVDD) and temperature (ΔT) compensation, the proposed SoC achieves 92% classification accuracy.

The rest of the paper is organized as follows. Section II introduces the basic operations of the NNC, and the hardware architecture of the proposed chip is explained in Section III. Section IV shows measurement and implementation results, followed by the conclusion in Section V.

II. NEURAL NETWORK CLASSIFIER

Fig. 1(a) shows the typical neural network architecture. It consists of an input layer (X), hidden layers (H), and an output layer (Z). Every input is fully connected to the hidden neurons; the inputs are multiplied by weights (c, w, v) and summed up before the transfer function f(·). The output of each hidden neuron j is given by (1), where the types of activation functions are shown in Fig. 1(b).

H_j = f\left(\sum_i X_i c_{ij}\right)   (1)

Among the different types of activation functions in Fig. 1(b), the sigmoid and ReLU functions are most widely used for the MLP and deep neural networks, and RBFs are used for the Support Vector Machine and RBFN. The output of the RBFN is represented as (2), where w_{ij} is the weight between the i-th center (c_i) of the RBF and output j, σ_i represents the width of the i-th center, ||·|| is the Euclidean norm on the input space, and φ is the output of each hidden RBF node, which corresponds to a bell-shaped function such as the Gaussian.

O_j = \sum_{i=1}^{N} w_{ij}\, \varphi(\lVert \vec{r} - \vec{c}_i \rVert, \sigma_i)   (2)
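For concreteness, Eqs. (1) and (2) can be sketched in a few lines of Python (a toy model only; a Gaussian is used for φ, and all weights, centers, and widths are illustrative):

```python
import math

def mlp_hidden(x, c):
    """Eq. (1): H_j = sigmoid(sum_i X_i * c_ij) for each hidden neuron j."""
    sums = [sum(xi * cij for xi, cij in zip(x, col)) for col in zip(*c)]
    return [1.0 / (1.0 + math.exp(-s)) for s in sums]

def rbfn_output(x, centers, widths, w):
    """Eq. (2): O = sum_i w_i * phi(||x - c_i||, sigma_i), Gaussian phi."""
    out = 0.0
    for c_i, s_i, w_i in zip(centers, widths, w):
        d2 = sum((xk - ck) ** 2 for xk, ck in zip(x, c_i))  # ||x - c_i||^2
        out += w_i * math.exp(-d2 / (2.0 * s_i ** 2))        # bell-shaped phi
    return out

x = [0.2, 0.8]
print(mlp_hidden(x, [[0.5, -0.5], [0.5, 0.5]]))      # two sigmoid hidden units
print(rbfn_output(x, [[0, 0], [1, 1]], [0.5, 0.5], [1.0, -1.0]))
```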

The operation of neural network classification differs by the type of function used for activation, as depicted in Fig. 1(c).

Fig. 1. (a) Neural network architecture, (b) Types of activation functions. Bell-shaped or Gaussian functions for RBFN; sigmoid and ReLU for MLP, (c) Basic concepts of linear and non-linear classification. Left diagram = MLP, right diagram = RBFN.

The number of hidden layers determines the combinations of linear decision boundaries of the MLP, while using RBFs yields nonlinear classification boundaries. Therefore, the MLP and RBFN have different classification accuracy under the same network size or computational cost. The RBFN shows great performance for simple classification with low complexity, in contrast to today's heavy deep neural networks, but its performance depends on the shape, center, and width of the RBFs. If the shape of an RBF is corrupted by noise, causing unwanted overlaps among the RBFs as depicted in Fig. 2, the trained weight values are no longer reliable, which severely degrades accuracy. Therefore, the NNC should provide versatile shapes of RBF with stability. To solve this noise problem, the proposed NNC processor contains a highly controllable RBF circuit that can generate various RBFs as well as a sigmoid-shaped function for reconfigurable processing.

III. MIXED-MODE HARDWARE ARCHITECTURE

Fig. 3 shows the overall hardware architecture of the proposed mixed-mode NNC processor. The analog datapath performs feed-forward neural network classification. For low power consumption, and to exploit Kirchhoff's current law, the analog core is designed with current-mode circuits. The entire analog core consists of DACs for the inputs, current multipliers for weight multiplication, I-V converters, RBF circuits, and a sigmoid circuit. The neural network module has 4 input neurons, 6 hidden neurons, and 1 output neuron, and is operated recursively to generate 25 scene categories.
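The recursive use of the single output neuron can be illustrated as below (a behavioral sketch, not the hardware; the parameter layout and function names are our assumptions):

```python
import math
import random

def forward_4_6_1(x, hidden_w, out_w):
    """One pass of the 4-6-1 network: 4 inputs, 6 hidden neurons, 1 output."""
    h = [1.0 / (1.0 + math.exp(-sum(xi * wi for xi, wi in zip(x, neuron))))
         for neuron in hidden_w]                     # 6 sigmoid hidden units
    return sum(hi * wi for hi, wi in zip(h, out_w))  # single output neuron

def classify(x, params_per_category):
    """Recursive operation: the one output neuron is reused by re-loading
    the weights for each of the 25 categories, then taking the argmax."""
    scores = [forward_4_6_1(x, hw, ow) for hw, ow in params_per_category]
    return max(range(len(scores)), key=scores.__getitem__)

# Hypothetical weights for 25 categories (in hardware: the weight MEM bank).
rnd = random.Random(0)
params = [([[rnd.uniform(-1, 1) for _ in range(4)] for _ in range(6)],
           [rnd.uniform(-1, 1) for _ in range(6)]) for _ in range(25)]
print(classify([0.1, 0.4, 0.3, 0.2], params))
```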

The neural network parameters are stored in the memory bank of the digital controller, which sets the corresponding parameter values in the analog core through the DAC bank. The digital controller also performs on-line learning of the weights and RBF shapes.

1. Analog Neural Network Core Circuits

The required number of DACs and current multipliers in fully-connected layers would normally be identical to the number of weights. Instead of having that many DACs and multipliers, binary-weighted current mirrors are implemented to save area and power, as shown in Fig. 4. They perform weight multiplication when the NNC is used in MLP-mode. In RBFN-mode, the multipliers in the first layer are used as wire connections and those in the second layer are used for weight multiplication. The input current is multiplied by twofold 4-b weights, w[3:0] and w[7:4], to save area. The outputs mirrored through M1~M2 and M3~M4 represent the LSBs and MSBs, respectively, and they have different sizes. The amount of


Fig. 2. Changes of RBF shape due to environmental noise ΔVDD.


Fig. 3. Overall architecture of the mixed-mode NNC processor.


Fig. 4. 8-bit current multiplier.


Fig. 5. Controllable sigmoid circuit for activation and measured waveform.


final output current is given by (3).

I_{out} = I_{in} \times \left(\sum_i w_i 2^i\right)   (3)
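A behavioral model of the binary-weighted multiplier of Fig. 4, showing how the two 4-b nibbles reconstruct Eq. (3) (the function name and units are illustrative; the transistor sizing is abstracted away):

```python
def current_multiply(i_in_nA: float, w: int) -> float:
    """Behavioral model of the 8-b binary-weighted current multiplier:
    I_out = I_in * sum_i w_i * 2^i  (Eq. 3), with the weight split into
    a low nibble w[3:0] (LSB mirror, M1~M2) and a high nibble w[7:4]
    (MSB mirror, M3~M4, whose mirror ratio is 16x larger)."""
    assert 0 <= w <= 0xFF
    lsb = w & 0x0F          # contribution of the LSB mirror
    msb = (w >> 4) & 0x0F   # contribution of the MSB mirror
    return i_in_nA * (lsb + (msb << 4))

print(current_multiply(1.0, 0b00000011))  # weight 3  -> 3x input current
print(current_multiply(1.0, 0b00010000))  # weight 16 -> 16x input current
```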

Fig. 5 shows the sigmoid circuit for the output neuron. Bias voltages VBP and VBN control the transient points and output range. The reference voltages define shape and slope of the sigmoid function, and setting their values can also provide approximately linear functions. The controllable sigmoid circuit saves area and power of the NNC processor.

For low power consumption, the analog core operates with currents on the order of nA. Therefore, the circuits become very sensitive to the environmental noise of ΔT and ΔVDD variations. To compensate for such noise, all the analog circuits adopt a stable current reference from [21] with modifications, as shown in Fig. 6. The stacked output nodes generate a stable current whose amount is controlled by the 5-b switches BS[4:0].

Fig. 7(a) shows the proposed RBF circuit, which is highly controllable with six parameters: Vref1, Vref2, B[4:0], Ix, Sel_p, and F_Sel. Since the bias voltages Vref1 and Vref2 define the transition points of the V-I curve, their combination defines the center and width of the RBF and the approximated sigmoid function shape. The switches B[4:0] control the gm of the curve; the height is set by Ix, through the binary switches BS[4:0] in the current source; and the up/down phase of the function is set by the multiplexer Sel_p. The MUX, F_Sel, and level shifter invert the input voltage domain to generate much sharper bump functions, as described in [20]. Fig. 7(b) shows measured waveforms of the circuit. The circuit generates various shapes of RBF (black, red, green, blue) as well as a sigmoid-shaped function, depicted with purple dots, where Vref1 is 0 V and Vref2 is 0.6 V. Moreover, it reduced the ΔT noise error by 92.7% when tested over [−37 °C, 87 °C], and achieved a stable output current within a ±0.2 V window of ΔVDD, where VDD is 1.2 V.
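The roles of the control parameters can be mimicked with a simple difference-of-sigmoids model (purely behavioral; this is not the transistor-level transfer curve, and the numeric values are illustrative):

```python
import math

def rbf_iout(vin, vref1, vref2, gm=10.0, ix=1.0, sel_p=True):
    """Behavioral sketch: the difference of two sigmoids rises near vref1
    and falls near vref2, giving a bump whose center is (vref1+vref2)/2
    and whose width is set by vref2-vref1; gm sets the slope, ix the
    height, and sel_p the up/down phase."""
    s = lambda v: 1.0 / (1.0 + math.exp(-gm * v))
    bump = ix * (s(vin - vref1) - s(vin - vref2))
    return bump if sel_p else ix - bump

# Bump centered at 0.6 V; pushing vref1 toward 0 V moves the rising edge
# off-range, degenerating the response into a sigmoid-like curve.
print(rbf_iout(0.6, 0.4, 0.8))   # near the peak of the bump
print(rbf_iout(0.0, 0.4, 0.8))   # far in the tail
```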

2. Digital Controller for Learning

Fig. 8 shows the digital controller that consists of a


Fig. 6. Environmental noise robust current source.


Fig. 7. (a) The highly controllable RBF circuit with noise compensation, (b) Measured analog waveforms of RBFs show diversity of activation functions with noise compensation.


Fig. 8. Digital controller architecture.

Process: 0.13 μm 1P8M CMOS
Area: 68,400 μm² (Analog), 71,800 μm² (Digital)
Power Supply: 1.2 V
Operating Freq.: 200 MHz (Digital)
Power Consumption: 723 μW (Analog), 1.48 mW (Digital)

Fig. 9. Chip photograph and summary table.


control unit, a learning unit, and a configuration memory, which controls the analog parameters through the DAC bank. In the learning unit, the K-means Clustering Accelerator (KCA) and the Back Propagation Accelerator (BPA) are used for learning the RBF parameters and the weights of the fully-connected layers. In RBFN-mode, the current multipliers in the first layer work as wire connections, and the NNC FSM controller sets the RBF parameters, which are trained by the KCA, in the analog core. The BPA consists of a 4-way SIMD multiply-accumulator array for back-propagation and a sum-of-squared-distance unit for loss computation. The KCA contains a centroid unit that finds the center of each cluster, or class, and an RBF identifier that finds the shape of the RBFs. The KCA is not used in MLP-mode, where the NNC FSM Controller supervises the correct weights via the current multipliers in the first layer. The learning unit is used only for on-line training of the weights, so it is not necessarily used for feed-forward classification once the neural network parameters are trained.
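The division of labor between the KCA and the BPA can be sketched as follows (a 1-D software analogy with hypothetical data; the hardware operates on descriptor vectors, and the names and constants here are ours):

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """KCA analogy: find k cluster centers (1-D toy version)."""
    rnd = random.Random(seed)
    centers = rnd.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[min(range(k), key=lambda j: abs(p - centers[j]))].append(p)
        centers = [sum(g) / len(g) if g else centers[j]
                   for j, g in enumerate(groups)]
    return sorted(centers)

def train_weights(samples, centers, sigma=0.3, lr=0.5, epochs=200):
    """BPA analogy: gradient descent on the output weights under a
    sum-of-squared-error loss, with the RBF layer kept fixed."""
    w = [0.0] * len(centers)
    for _ in range(epochs):
        for x, t in samples:
            phi = [math.exp(-(x - c) ** 2 / (2 * sigma ** 2)) for c in centers]
            err = sum(wi * p for wi, p in zip(w, phi)) - t   # output error
            w = [wi - lr * err * p for wi, p in zip(w, phi)] # weight update
    return w

pts = [0.05, 0.1, 0.15, 0.9, 0.95, 1.0]
centers = kmeans(pts, 2)                  # two clusters near 0.1 and 0.95
samples = [(x, 0.0) for x in pts[:3]] + [(x, 1.0) for x in pts[3:]]
w = train_weights(samples, centers)
```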

IV. IMPLEMENTATION & MEASUREMENT

Fig. 9 shows the chip micrograph and the performance summary of the proposed NNC processor. It is fabricated in a 0.13 μm CMOS process as a part of the mobile object recognition processor [8]. The NNC processor occupies 0.140 mm² and consumes 2.20 mW running at 200 MHz in the digital domain, while the whole SoC [8] occupies 25.0 mm² and consumes 260 mW on average. The power consumption of the analog neural network core is only 723 μW due to the current-mode circuits. Compared with a conventional fully-digital implementation, the proposed processor saves area and power by 84.0% and 82.2%, respectively, by utilizing the mixed-mode architecture.

Table 1 shows the performance comparisons with the

Table 1. Comparisons with Neural Network Processors

Reference | Classifier Type | Signal Type | Complexity [# of weights] | Programmability | Process | Power [mW] | Area [mm²] | Power η [#weights/mW] | Area η [#weights/mm²]
Seo [11]* | SNN 1) | Digital | 64k (256x256) | High | 45 nm | ~3.00 | 4.20 | 7560 | 1870
Yang [13] | RBFN | Digital (FPGA) | 135 | Extremely High | 0.18 μm | 967 | N/A | 0.140 | N/A
Lont [16] | MLP | Analog | 161 | Very Low | 3 μm | 25.0 | 2.40 | 6.44 | 67.1
Peng [17] | RBFN | Analog | 14 | Low | 0.5 μm | 2.24 2) | 0.0482 2) | 6.25 | 290
Kim [18] | NFL 3) | Mixed | 12 (3x4) | High | 0.13 μm | 2.83 | 0.163 | 4.24 | 73.6
Oh [19] | NFL 3) | Mixed | 27 | High | 0.13 μm | 1.20 | 0.765 | 271 | 35.3
This Work | MLP/RBFN | Mixed | 750 (30x25) | High | 0.13 μm | 2.20 | 0.140 | 341 | 5360

1) SNN: Spiking Neural Network; 2) Numbers are available only with RBF circuits; 3) NFL: Neuro-Fuzzy Logic
* Power dissipation differs by the variants in [11]; power and area efficiencies are scaled to the 0.13 μm process


Fig. 10. Measurement process of scene classification for object recognition.


analog/digital ASIC and FPGA implementations. We define complexity, the number of weights, to compare the power and area efficiencies. The efficiencies of [11] are scaled to the 0.13 μm process by applying Dennard scaling for comparison. By the natural characteristics of the spiking neural network, [11] shows the greatest power efficiency and the densest complexity, but it requires a large area to implement every neuron; therefore, its area efficiency is lower than that of this work. In addition, this work can support both the MLP and RBFN thanks to its reconfigurable architecture. Compared with analog circuits, mixed-mode implementations are advantageous for obtaining high programmability. Among the mixed-mode processors [18, 19], this work achieved the highest efficiency in terms of power and area due to its recursive operating architecture for 25-category classification.
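The efficiency figures of merit in Table 1 follow directly from the definition (a quick check; values are taken from the table's "This Work" row):

```python
def efficiencies(n_weights, power_mw, area_mm2):
    """Table 1 figures of merit: weights per mW and weights per mm^2."""
    return n_weights / power_mw, n_weights / area_mm2

# This work: 750 weights (30x25), 2.20 mW, 0.140 mm^2.
p_eff, a_eff = efficiencies(750, 2.20, 0.140)
print(p_eff, a_eff)  # matches the table's 341 and 5360 to 3 significant figures
```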

Fig. 10 shows the measurement process of the proposed processor, which serves as visual attention for the entire object recognition SoC [8]. The input image is decomposed into 128x128-pixel macro-blocks, and HMAX is performed over each block to extract a statistical descriptor vector that becomes the input to the RBFN. Then, each macro-block is classified into one of the 25 pre-trained scene categories by the recursive operation of the RBFN. Finally, the input image turns into a spatially organized scene map, and object recognition is performed within the macro-blocks. The scene classification result provides the likelihood of the target object to the object recognition pipeline, so only correct objects of interest are detected. For example, in safe driving, drivers are interested in moving vehicles on the road, not vehicles on an advertisement. Scene classification with the RBFN provides the contextual information to detect objects in the road scene category. Fig. 11 shows the evaluation platform and results. The SoC is integrated with a multimedia expansion board and evaluated in a city-view experiment set, where the target object is a toy car on the road and the distractor is a vehicle on an advertising board. The RBFN scene classification results in the context-aware map depicted at the right-bottom, and only the target object in the road scene is recognized while the advertisement is neglected.
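The macro-block decomposition can be sketched as follows (a stand-in classifier replaces the HMAX + RBFN pipeline; the tile and frame sizes are from the text, everything else is illustrative):

```python
# Tile the frame into 128x128 macro-blocks, classify each block, and
# assemble the spatial scene map described in the text.

def scene_map(height, width, classify_block, tile=128):
    rows = (height + tile - 1) // tile   # ceil-divide: edge tiles included
    cols = (width + tile - 1) // tile
    return [[classify_block(r * tile, c * tile) for c in range(cols)]
            for r in range(rows)]

# Toy classifier by block position (stand-in for the trained RBFN).
toy = lambda y, x: "SKY" if y < 384 else "ROAD"
m = scene_map(720, 1280, toy)            # HD (720p) frame -> 6x10 scene map
print(len(m), len(m[0]))
```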

Thanks to the mixed-mode implementation of the proposed NNC processor, the overall visual attention accuracy is increased to 84%, a 1.40x improvement over the conventional visual attention model. As a result, the entire SoC [8] achieved 96% object recognition accuracy in a test of 200 objects with 25 scene categories. In addition to the scene classification with the HMAX descriptor, the standalone classification accuracy was measured with handcrafted test vectors, and the proposed NNC processor achieved 92%.

V. CONCLUSIONS

In this work, a reconfigurable mixed-mode neural network classifier processor is proposed for robust and low-power scene classification as a part of a mobile object recognition processor. It consists of noise-tolerant analog circuits that compensate for temperature and supply voltage variations in order to achieve high classification accuracy, and it supports both the MLP and RBFN. The proposed processor, fabricated in a 0.13 μm CMOS process, consumes 2.20 mW running at 200 MHz and achieves 92% classification accuracy. Thanks to the analog-digital mixed-mode implementation, the proposed neural network classifier processor reduced area and power by 84.0% and 82.2%, respectively, compared with a fully-digital ASIC implementation.

ACKNOWLEDGMENTS

This work was supported by the research fund (1.180081.01) of UNIST.


Fig. 11. Evaluation system and results.


REFERENCES

[1] A. Krizhevsky, et al, “ImageNet classification with deep convolutional neural networks,” in Advances in Neural Information Processing Systems (NIPS) 25, 2012.

[2] C. Szegedy, et al, “Going deeper with convolutions,” IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 1-9, Jun. 2015.

[3] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 770-778, Jun. 2016.

[4] S. Xie, et al, “Aggregated residual transformations for deep neural networks,” IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 5987-5995, Jul. 2017.

[5] J. Redmon, et al, “You only look once: unified, real-time object detection,” IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 779-788, Jun. 2016.

[6] Y. Ou and Y. Oyang, “A novel radial basis function network classifier with centers set by hierarchical clustering,” in International Joint Conference on Neural Networks, pp. 1383-1388, Aug. 2005.

[7] S. Sardar, et al, “A hardware/software co-design model for face recognition using cognimem neural network chip,” in IEEE International Conference on Image Information Processing, pp. 1-6, Nov. 2011.

[8] J. Park, et al, “A 646GOPS/W multi-classifier many-core processor with cortex-like architecture for super-resolution recognition,” IEEE International Solid-State Circuits Conference Digest of Tech. Papers, pp. 168-169, 2013.

[9] G. Kim, et al, “A 1.22 TOPS and 1.52mW/MHz augmented reality multi-core processor with neural network NoC for HMD applications,” IEEE Journal of Solid-State Circuits, vol. 50, no. 1, pp. 113-124, Jan. 2015.

[10] I. Hong, et al, “A 2.71nJ/pixel gaze-activated object recognition system for low-power mobile smart glasses,” IEEE Journal of Solid-State Circuits, vol. 51, no. 1, pp. 45-55, Jan. 2016.

[11] J. Seo, et al, “A 45nm CMOS neuromorphic chip with a scalable architecture for learning in networks of spiking neurons,” in Proceedings of

IEEE Custom Integrated Circuits Conference, Oct. 2011.

[12] P. Ienne, et al, “Special-purpose digital hardware for neural networks: an architectural survey,” Journal of VLSI Signal Processing Systems, vol. 13, no. 1, pp. 5-25, Aug. 1996.

[13] F. Yang and M. Paindavoine, “Implementation of an RBF neural network on embedded systems: real-time face tracking and identity verification,” IEEE Transactions on Neural Networks, vol. 14, no. 5, pp. 1162-1175, Sept. 2003.

[14] K.-L. Du and M.N.S. Swamy, Neural Networks in a Softcomputing Framework, London, Springer, 2006, Ch. 6.14, pp.285.

[15] K. Kang and T. Shibata, “An on-chip-trainable gaussian-kernel analog support vector machine,” IEEE Transactions on Circuits and Systems I, vol. 57, no. 7, pp. 1513-1524, Jul. 2010.

[16] J. Lont and W. Guggenbuhl, “Analog CMOS implementation of a multilayer perceptron with nonlinear synapses,” IEEE Transactions on Neural Networks, vol. 3, no. 3, pp. 457-465, May 1992.

[17] S. Peng, P. Hasler, and D. Anderson, “An analog programmable multi dimensional radial basis function based classifier,” IEEE Transactions on Circuits and Systems, vol. 54, no. 10, pp. 2148-2158, Oct. 2007.

[18] M. Kim, et al, “A 54GOPS 51.8mW analog-digital mixed mode neural perception engine for fast object detection,” in IEEE Custom Integrated Circuits Conference, pp. 649-652, 2009.

[19] J. Oh, S. Lee, and H.-J. Yoo, “1.2mW online learning mixed-mode intelligent inference engine for low-power real-time object recognition processor,” IEEE Transactions on VLSI Systems, vol. 21, no. 5, pp. 921-933, May 2013.

[20] K. Lee, et al, “A multi-modal and tunable radial-basis-function circuit with supply and temperature compensation,” in Proceedings of IEEE International Symposium on Circuits and Systems, pp. 1608-1611, May 2013.

[21] C. Yoo and J. Park, “CMOS current reference with supply and temperature compensation,” IEEE Electronics Letters, vol. 43, no. 25, pp. 1422-1424, Dec. 2007.


Kyuho Jason Lee (S’12-M’17) received B.S., M.S., and Ph.D. degrees in the School of Electrical Engineering from Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Korea, in 2012, 2014, and 2017, respectively. From 2017 to 2018, he was a postdoctoral researcher at the Information Engineering and Electronics Research Institute, KAIST, Daejeon, Korea, and a Principal Engineer at UX Factory Inc., Pangyo, Korea. He is now an Assistant Professor at the School of Electrical and Computer Engineering, Ulsan National Institute of Science and Technology (UNIST). His research interests include mixed-mode neuromorphic SoC, deep learning processors, Network-on-Chip architectures, and intelligent computer vision processors for mobile devices and autonomous vehicles.

Junyoung Park (S’09-M’15) received the Ph.D. degree in Electrical Engineering from Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Korea, in 2014. His Ph.D. research focused on System-on-a-Chip (SoC) architectures for energy-efficient vision processing and artificial intelligence. He is interested in customized architectures and circuits for computationally intensive algorithms such as computer vision and machine learning, and their integration on the mobile platform. Since 2015, he has been running a start-up, UX Factory Inc., which is dedicated to delivering AI solutions derived from Software-System-on-chip technologies.

Hoi-Jun Yoo (M’95 – SM’04 – F’08) received the B.S. degree in electronics engineering from Seoul National University, Seoul, South Korea, in 1983, and the M.S. and Ph.D. degrees in electrical engineering from Korea Advanced Institute of Science and Technology (KAIST), Daejeon, in 1985 and 1988, respectively. From 2001 to 2005, he was the Director of the Korean System Integration and IP Authoring Research Center (SIPAC), South Korea. From 2003 to 2005, he was a full-time Advisor to the Korean Ministry of Information and Communication, South Korea, and the National Project Manager for System-on-Chip and Computer. In 2007, he founded the System Design Innovation & Application Research Center (SDIA) at KAIST. Since 1998, he has been with the Department of Electrical Engineering, KAIST, where he is currently a Full Professor. He has coauthored DRAM Design (Hongrung, 1996), High Performance DRAM (Sigma, 1999), Future Memory: FRAM (Sigma, 2000), Networks on Chips (Morgan Kaufmann, 2006), Low-Power NoC for High-Performance SoC Design (CRC, 2008), Circuits at the Nanoscale (CRC, 2009), Embedded Memories for Nano-Scale VLSIs (Springer, 2009), Mobile 3D Graphics SoC from Algorithm to Chip (Wiley, 2010), Bio-medical CMOS ICs (Springer, 2011), Embedded Systems (Wiley, 2012), and Ultra-Low-Power Short-Range Radios (Springer, 2015). His current research interests include computer vision system-on-chip, body-area networks, and biomedical devices and circuits. Dr. Yoo has been serving as the General Chair of the Korean Institute of Next Generation Computing since 2010. He was a member of the Executive Committee of ISSCC, the Symposium on VLSI Circuits, and A-SSCC, the TPC Chair of A-SSCC 2008 and ISWC 2010, an IEEE Distinguished Lecturer from 2010 to 2011, the Far East Chair of ISSCC from 2011 to 2012, the Technology Direction Sub-Committee Chair of ISSCC 2013, the TPC Vice Chair of ISSCC 2014, and the TPC Chair of ISSCC 2015.
He was a recipient of the Electronic Industrial Association of Korea Award for his contribution to DRAM technology in 1994, the Hynix Development Award in 1995, the Korea Semiconductor Industry Association Award in 2002, the Best Research of KAIST Award in 2007, the Scientist/Engineer of this month Award from the Ministry of Education, Science, and Technology of Korea in 2010, the Best Scholarship Awards of KAIST in 2011, and the Order of Service Merit from the Ministry of Public Administration and Security of Korea in 2011. He was a co-recipient of the ASP-DAC Design Award 2001, the Outstanding Design Awards of 2005, 2006, 2007, 2010, 2011, 2014 A-SSCC, and the Student Design Contest Award of 2007, 2008, 2010, 2011 DAC/ISSCC.