VLSI - COMPATIBLE IMPLEMENTATIONS FOR ARTIFICIAL NEURAL NETWORKS



VLSI - COMPATIBLE IMPLEMENTATIONS FOR ARTIFICIAL

NEURAL NETWORKS


THE KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE

ANALOG CIRCUITS AND SIGNAL PROCESSING

Consulting Editor: Mohammed Ismail, Ohio State University

Related Titles:

CHARACTERIZATION METHODS FOR SUBMICRON MOSFETs, edited by Hisham Haddara. ISBN: 0-7923-9695-2

LOW-VOLTAGE LOW-POWER ANALOG INTEGRATED CIRCUITS, edited by Wouter Serdijn. ISBN: 0-7923-9608-1

INTEGRATED VIDEO-FREQUENCY CONTINUOUS-TIME FILTERS: High-Performance Realizations in BiCMOS, Scott D. Willingham, Ken Martin. ISBN: 0-7923-9595-6

FEED-FORWARD NEURAL NETWORKS: Vector Decomposition Analysis, Modelling and Analog Implementation, Anne-Johan Annema. ISBN: 0-7923-9567-0

FREQUENCY COMPENSATION TECHNIQUES FOR LOW-POWER OPERATIONAL AMPLIFIERS, Ruud Eschauzier, Johan Huijsing. ISBN: 0-7923-9565-4

ANALOG SIGNAL GENERATION FOR BIST OF MIXED-SIGNAL INTEGRATED CIRCUITS, Gordon W. Roberts, Albert K. Lu. ISBN: 0-7923-9564-6

INTEGRATED FIBER-OPTIC RECEIVERS, Aaron Buchwald, Kenneth W. Martin. ISBN: 0-7923-9549-2

MODELING WITH AN ANALOG HARDWARE DESCRIPTION LANGUAGE, H. Alan Mantooth, Mike Fiegenbaum. ISBN: 0-7923-9516-6

LOW-VOLTAGE CMOS OPERATIONAL AMPLIFIERS: Theory, Design and Implementation, Satoshi Sakurai, Mohammed Ismail. ISBN: 0-7923-9507-7

ANALYSIS AND SYNTHESIS OF MOS TRANSLINEAR CIRCUITS, Remco J. Wiegerink. ISBN: 0-7923-9390-2

COMPUTER-AIDED DESIGN OF ANALOG CIRCUITS AND SYSTEMS, L. Richard Carley, Ronald S. Gyurcsik. ISBN: 0-7923-9351-1

HIGH-PERFORMANCE CMOS CONTINUOUS-TIME FILTERS, Jose Silva-Martinez, Michiel Steyaert, Willy Sansen. ISBN: 0-7923-9339-2

SYMBOLIC ANALYSIS OF ANALOG CIRCUITS: Techniques and Applications, Lawrence P. Huelsman, Georges G. E. Gielen. ISBN: 0-7923-9324-4

DESIGN OF LOW-VOLTAGE BIPOLAR OPERATIONAL AMPLIFIERS, M. Jeroen Fonderie, Johan H. Huijsing. ISBN: 0-7923-9317-1

STATISTICAL MODELING FOR COMPUTER-AIDED DESIGN OF MOS VLSI CIRCUITS, Christopher Michael, Mohammed Ismail. ISBN: 0-7923-9299-X

SELECTIVE LINEAR-PHASE SWITCHED-CAPACITOR AND DIGITAL FILTERS, Hussein Baher. ISBN: 0-7923-9298-1

ANALOG CMOS FILTERS FOR VERY HIGH FREQUENCIES, Bram Nauta. ISBN: 0-7923-9272-8

ANALOG VLSI NEURAL NETWORKS, Yoshiyasu Takefuji. ISBN: 0-7923-9273-6

ANALOG VLSI IMPLEMENTATION OF NEURAL NETWORKS, Carver A. Mead, Mohammed Ismail. ISBN: 0-7923-9049-7

AN INTRODUCTION TO ANALOG VLSI DESIGN AUTOMATION, Mohammed Ismail, Jose Franca. ISBN: 0-7923-9071-7


VLSI - COMPATIBLE IMPLEMENTATIONS FOR ARTIFICIAL

NEURAL NETWORKS

by

Sied Mehdi Fakhraie
University of Tehran

Kenneth Carless Smith
University of Toronto
Hong Kong University of Science & Technology

SPRINGER SCIENCE+BUSINESS MEDIA, LLC



ISBN 978-1-4613-7897-6 ISBN 978-1-4615-6311-2 (eBook) DOI 10.1007/978-1-4615-6311-2

Library of Congress Cataloging-in-Publication Data

A C.I.P. Catalogue record for this book is available from the Library of Congress.

Copyright © 1997 by Springer Science+Business Media New York. Originally published by Kluwer Academic Publishers in 1997. Softcover reprint of the hardcover 1st edition 1997.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher, Springer Science+Business Media, LLC.

Printed on acid-free paper.


To our families, our teachers, and our students.

S. Mehdi Fakhraie

K.C.Smith


Contents

Contents ............................................. vii

List of Figures ...................................... xiii

Foreword ............................................. xxi

Preface .............................................. xxiii

Acknowledgments ...................................... xxix

CHAPTER 1 Introduction and Motivation 1

1.1 Introduction ..................................... 1

1.2 Motivation ....................................... 2

1.3 Objectives of this Work ................................ 3

1.4 Organization of the Book ............................... 6

CHAPTER 2 Review of Hardware-Implementation Techniques 7

2.1 Introduction ..................................... 7

2.2 Taxonomies of Neural Hardware ......................... 7


2.3 Pulse-Coded Implementations ......................... 12

2.4 Digital Implementations .............................. 12

2.5 Analog Implementations ............................. 14

2.5.1 General-Purpose Analog Hardware ............ 15

2.5.2 Networks with Resistive Synaptic Weights ...... 15

2.5.3 Weight-Storage Techniques in Analog ANNs .... 17

2.5.4 Switched-Capacitor Synthetic Neurons ......... 18

2.5.5 Current-Mode Neural Networks .............. 19

2.5.6 Sub-Threshold Neural-Network Designs ........ 19

2.5.7 Reconfigurable Structures ................... 19

2.5.8 Learning Weights .......................... 20

2.5.9 Technologies Other Than CMOS ................... 20

2.5.10 Neuron-MOS Transistors ........................ 21

2.5.11 General Non-Linear Synapses ................ 22

2.6 Comparison of Some Existing Systems .................. 23

2.7 Summary .......................................... 24

CHAPTER 3 Generalized Artificial Neural Networks (GANNs) 25

3.1 Introduction ..................................... 25

3.2 Generalized Artificial Neural Networks (GANNs) ......... 27

3.2.1 Possible Variations of GANNs ................... 29

3.2.2 Quadratic Networks ........................ 29

3.3 Nonlinear MOS-Compatible Semi-Quadratic Synapses ..... 30

3.4 Networks Composed of Semi-Quadratic Synapses ......... 31

3.5 Training Equations .................................. 32

3.5.1 Gradients for the Synapses in the Output Layer . 34

3.5.2 Gradients for the Synapses in the Hidden Layer . 34

3.5.3 Updates ........................................ 35

3.6 Simulation and Verification of the Approach ............. 35

3.6.1 Simulator Developed ....................... 35

3.6.2 The Test Problems Used .................... 36

3.6.3 Performance of Simple MOS-Compatible Synapses 38

3.7 Summary......................................... 39



CHAPTER 4 Foundations: Architecture Design 41

4.1 Introduction ........................................ 41

4.2 Feedforward Networks with Linear Synapses .............. 41

4.2.1 The Effect of Constrained Weights ............. 44

4.3 Feedforward Networks with Quadratic Synapses ........... 46

4.4 Single-Transistor-Synapse Feedforward Networks .......... 49

4.4.1 Analysis .................................. 49

4.5 Intelligent MOS Transistors: SyMOS .................... 53

4.6 Performance of Simple SyMOS Networks ................ 53

4.6.1 Advantages of Simple SyMOS Networks (SSNs) .. 53

4.6.2 Limitations of Simple SyMOS Networks (SSNs) .... 55

4.7 Architecture Design in Neural-Network Hardware .......... 56

4.8 The Resource-Finding Exploration ...................... 57

4.9 The Current-Source-Inhibited Architecture (CSIA) ......... 59

4.10 The Switchable-Sign-Synapse Architecture (SSSA) ......... 62

4.11 Digital-Analog Switchable-Sign-Synapse Architecture (DASA) 67

4.12 Simulation Results and Comparison ..................... 67

4.12.1 Summary of Results ......................... 68

4.13 Our Choice of the Way to Go .......................... 70

4.14 Summary .......................................... 71

CHAPTER 5 Design, Modeling, and Implementation of a Synapse-MOS Device 73

5.1 Introduction ..................................... 73

5.2 Design of a SyMOS Device in a CMOS Technology ........ 74

5.2.1 Modeling ................................. 77

5.3 Implementation ..................................... 81

5.3.1 Experimental Results ........................ 82

5.4 Reliability Issues .................................... 83

5.5 Summary .......................................... 84



CHAPTER 6 Synapse-MOS Artificial Neural Networks (SANNs) 85

6.1 Introduction ..................................... 85

6.2 Overview of the Work Leading to Hardware Implementation 85

6.3 Guidelines for Neural-Network Hardware Design ......... 87

6.4 Design and VLSI Implementation ...................... 89

6.4.1 Design of a Neuron ............................. 89

6.4.2 Design of a Switchable-Sign Synapse ............ 91

6.4.3 Design of the Sign-Select Block ................ 93

6.4.4 Dynamic Capacitive Storage ..................... 95

6.4.5 Decoding Scheme ................................ 99

6.4.6 The Connectivity Issue ......................... 101

6.4.7 Augmented Synaptic Units ................. 102

6.4.8 Accumulating, Subtracting, and Scaling Unit .... 103

6.4.9 Sigmoid Output Unit ............................ 104

6.4.10 Biasing or Constant-Term (Radius) Unit ........ 106

6.4.11 Figures of Merit Established by Simulation .... 107

6.5 Structure of an SSSA Chip ........................... 108

6.5.1 X and Y Decoders ............................... 108

6.5.2 Trimming and Offset-Cancellation Units ......... 110

6.5.3 Output Analog Multiplexer ...................... 110

6.5.4 Input-Feeding Units ............................ 111

6.6 Training Algorithm ................................ 111

6.6.1 Effect of Training on Offset and Mismatch ...... 115

6.7 Experimental Results ............................... 115

6.8 Summary .......................................... 123

CHAPTER 7 Analog Quadratic Neural Networks (AQNNs) 125


7.1 Introduction ..................................... 126

7.2 Background ....................................... 127

7.2.1 Networks with Linear Synapses .................. 127

7.2.2 Closed-Boundary-Discriminating-Surface Nets .... 128

7.3 Design of an "Analog Quadratic Neural Network (AQNN)" 130


7.3.1 Block Diagram of a General AQNN ........... 130

7.3.2 Overview of Our CMOS Design .............. 130

7.3.3 Design of the Building Blocks ................ 132

7.4 Training ......................................... 137

7.5 VLSI Implementation ............................... 137

7.5.1 Subtracting-Squaring Unit (SSU) ............. 138

7.5.2 A Complete Neuron With Two Synapses ....... 139

7.6 Test Results ....................................... 140

7.6.1 Subtracting-Squaring Unit ................... 140

7.6.2 A Complete Neuron .............................. 141

7.7 Applications ....................................... 142

7.7.1 Single-Layer Networks ..................... 142

7.7.2 Multi-Layer Networks ........................... 143

7.7.3 Unsupervised Competitive Learning (UCL) ..... 143

7.7.4 Function Approximation .................... 145

7.8 Summary and Future Work ........................... 149

CHAPTER 8 Conclusion and Recommendations for Future Work 151

8.1 Summary of the Work ............................... 151

8.2 Contributions of this Work ....................... 153

8.3 Recommendations for Future Work 153

APPENDIX A Review of Nonvolatile Semiconductor Memory Devices .... 157

A.1 Background ....................................... 157

A.2 Device Review .................................... 157

A.2.1 Charge-Trapping Devices ........................ 157

A.2.2 Floating-Gate Devices .......................... 158

A.3 Conclusion ....................................... 162

APPENDIX B Scaling Effects 163



B.1 The Effect of the Scaling of CMOS Technology on SANNs .. 163

APPENDIX C Performance Evaluation 167

C.1 Speed ............................................ 167

C.2 Power Consumption ................................ 168

C.2.1 Detailed Calculation of Power Dissipation ...... 169

C.3 Area ............................................. 171

C.4 Overall Performance .............................. 172

References 173

Index ................................ 185



List of Figures

Fig. 1-1: A block diagram showing a general procedure for problem-solving using artificial neural networks. The dashed block can be considered to be a process of "early simulation" leading to a direction for an appropriate solution. ................................ 4

Fig. 2-1: This figure (extended to the next two pages) shows several taxonomic trees for neural-network-hardware implementations based on the criteria specified at the top of each tree. ..... 8

Fig. 2-2: Digital implementations for ANNs ...................................................... 13

Fig. 2-3: The schematic diagram of a digital-storage interconnecting weight with a resolution of four bits plus sign. The circuit does a D/A conversion of the weight value and an analog multiplication of the input signal and the weight. The notation n:1 signifies the width-to-length ratio of the MOS device. Note that negative four-bit numbers have their B4 bit equal to one and are stored in 1's-complement form (e.g., -2 is written as 1101 with B4=1; this gives rise to an analog current proportional to -2i, where "i" is the current produced in the smallest MOS device with Vir as its input). ..... 17

Fig. 2-4: (a) A drawing of a neuron MOSFET. (b) Demonstration of terminal voltages and capacitances involved in the device. ..... 22

Fig. 2-5: The table in this figure compares the computational power and power consumption of a few existing systems that use (or can be used for) neuro-computational techniques. ..... 24

Fig. 3-1: A general model of a neuron as discussed in this book. .. ..................... 27

Fig. 3-2: A feed-forward network composed of neurons having semi-quadratic synapses. The output of each synapse is equal to (input − weight)^2 subject to practical operating constraints. See Section 3.5 for more detail. (Triangles represent unity-gain input buffers.) ..... 32

Fig. 3-3: A block diagram of the simulator program. .......................................... 37

Fig. 3-4: Different input characters can be mapped onto an input retina. The neural network is trained to recognize several different patterns. ..... 38

Fig. 3-5: Simulated performance of a two-layer neural network composed of neurons with constrained single-transistor quadratic synapses. The squared-error measure at the output is defined as the square of the difference of the desired and actual output levels ....................................................................................................................... 39

Fig. 4-1: (a) A two-synapse neuron. (b) Input-output characteristic of a hard-limiter function. ..... 42

Fig. 4-2: (a) A 3-D plot of the input-output relation for a two-linear-synapse neuron. (b) Equi-value lines at the output of the synapse mapped on the input space. ..... 43

Fig. 4-3: General forms of regions discriminated by a multi-layer feedforward network using linear synapses (adapted from [88]). ..... 45

Fig. 4-4: (a) Solving the XOR logic problem with a two-layer perceptron. (b) The AND logic problem can be solved with a single-layer network. ..... 46

Fig. 4-5: Values of the synaptic weights in a two-layer perceptron with linear synapses used to solve the XOR problem. (a) A theoretical solution with a hard limiter. (b) A practical solution obtained by using sigmoid units and a backpropagation training algorithm with output target values of 0.2 and 0.8. Triangles show input buffers, and bias terms are written inside the circles. ..... 47

Fig. 4-6: (a) 3-D plot of the input-output relation implemented by a two-input quadratic neuron. (b) Equi-value surfaces mapped onto the input space. ..... 48

Fig. 4-7: (a) This figure illustrates the way quadratic networks can solve the XOR problem. (b) A solution for the NAND problem. ..... 49

Fig. 4-8: A neuron with two single-transistor synapses and one bias term. ..... 50

Fig. 4-9: (a) Representation of internal parameters of a neuron having two single-transistor synapses. (b) Discriminating areas as recognized by the neuron. A high output value is assigned to input points above and to the right of the dashed line; otherwise the output stays at a low value. Note that different operating regions of the transistors are determined by the thin solid lines. ..... 51

Fig. 4-10: (a) 3-D plot of the input-output relationship of a simulated neuron with two single-transistor synapses. (b) Drawing of equi-value surfaces. ..... 54

Fig. 4-11: Required inclusion (high) and exclusion (low) areas needed to solve a NAND problem. SSNs cannot support the above format, but can do its reverse, which implements an AND function (that is, one in which area I is high and area II is low). ..... 56

Fig. 4-12: 2-D representations of the different quadratic functions obtainable by changing the sign of various terms in the synaptic relation. ..... 60

Fig. 4-13: A synapse in the Current-Source-Inhibited Architecture. ..... 62

Fig. 4-14: (a) 3-D plot of the input-output characteristic of a CSI two-input neuron. (b) Equi-value curves mapped on the input space. ..... 63



Fig. 4-15: Block diagram of a synapse in the Switchable-Sign-Synapse Architecture. ..... 64

Fig. 4-16: A pictorial demonstration of the operation of a two-synapse SSSA neuron using MATLAB simulation. Signs of the two synaptic terms and the radius term for each case are: (a) (1,1,-1), (b) (-1,-1,1), (c) (1,-1,1), (d) (-1,1,-1), (e) (-1,1,1), (f) (1,-1,-1). Notice the complementary nature of each horizontal pair. ..... 65

Fig. 4-17: Equi-value surfaces obtained for different configurations of the sign variables in a two-synapse SSSA neuron using MATLAB simulations. Signs of the two synaptic terms and the radius term are given for each case: (a) (1,1,-1), (b) (-1,-1,1), (c) (1,-1,1), (d) (-1,1,-1), (e) (-1,1,1), (f) (1,-1,-1). Notice the complementary nature of each horizontal pair. Members of each pair have similar-looking equi-value surfaces; however, their corresponding high- and low-output regions are exact complements (see Fig. 4-16). ..... 66

Fig. 4-18: Block diagram of the combined digital and analog Switchable-Sign-Synapse Architecture (DASA). ..... 68

Fig. 4-19: This table compares typical results obtained by using three different architectures for solving the XOR problem. Each test network has two input units, two hidden units, and one output unit. ..... 70

Fig. 5-1: A schematic representation of a controllable-threshold device. ..... 75

Fig. 5-2: Simulation of the externally-controllable-threshold-voltage transistor with HSPICE; current is plotted versus the sweep voltage applied to the input gate. For each curve, a different bias is applied to the control gate. ..... 76

Fig. 5-3: For transistors with W=5um and having different channel lengths in a 1.2um process, the threshold voltage is extracted from SQRT(I)-versus-input-voltage curves. For L>2um, different devices show essentially the same threshold voltage. ..... 77

Fig. 5-4: (a) A typical layout for a SyMOS device. (b) A schematic representation for the device. ..... 78

Fig. 5-5: (a) Major capacitances involved in a SyMOS device are shown. (b) An equivalent circuit after the less-significant capacitances are deleted and the source and substrate are connected together. ..... 78

Fig. 5-6: The terminal capacitances involved in a SyMOS transistor in its primary operating regions. Lov is the length of gate overlap with drain and source. (a) Cutoff region. (b) Saturation region. (c) Triode region. ..... 80

Fig. 5-7: Experimental chip fabricated to characterize various SyMOS devices. ..... 82



Fig. 5-8: SQRT(I)-versus-Vgs graphs measured for a SyMOS device with W=L=5um. ..... 83

Fig. 5-9: Measured threshold voltages as seen from the input gate, when the voltage at the control gate is changed. ........................ .......................................... ............... 84

Fig. 6-1: Basic structure of a neuron in SSSA. Here SSSi represents the switchable-sign synapse number i ............................................................................................. 90

Fig. 6-2: Schematic diagram of a synaptic cell in the SSS architecture. ........ ...... 91

Fig. 6-3: The effect of coupling-capacitor size on the behavior of a SyMOS device. W=5um and L=2um. ..... 92

Fig. 6-4: Layout diagram of the SRAM cell used in this work. ........................... 93

Fig. 6-5: Dimensions of the transistors used in the synaptic SRAM cell. ............ 94

Fig. 6-6: Voltage drop on the transmission-gate switches used in a synapse between the receiving mirror and the SyMOS device. Results for a transmission gate and for single NMOS and PMOS transistors are shown, and the operating region of the synaptic transistor is specified. ..... 95

Fig. 6-7: (a) A switch connected to a capacitor. (b) The equivalent noise circuit when the switch is on. ..... 96

Fig. 6-8: Thermal noise injected by the switch for different capacitance values. ..... 97

Fig. 6-9: Two possible selection schemes for addressing an analog cell. Scheme (A) uses transmission gates directly, while scheme (B) uses AND decoding ........... 100

Fig. 6-10: Complete layout of a synaptic cell in our first design. ..... 102

Fig. 6-11: Layout diagram of a two-unit synaptic cell. ..... 103

Fig. 6-12: Schematic diagram of the Current Accumulating, Scaling, and Subtracting (CASS) unit. Each neuron has one of these units. ..... 104

Fig. 6-13: Layout of the CASS unit. ..... 105

Fig. 6-14: Sigmoid output unit used in this work. It also performs current-to-voltage conversion. ..... 105

Fig. 6-15: Layout of the CASS and sigmoid units compacted to a form compatible with that of synaptic cells ..................................................................................... 107

Fig. 6-16: Simulated delays from the input of a synapse to the neuronal output stage with different output loads. ..... 108

Fig. 6-17: The building block used to generate the layout of a 6-input, 32-output NAND decoder. .................................................................................................... 109

Fig. 6-18: Layout of the two-inverter circuit used to buffer the input addressing lines and provide the required complementary signals to drive the NAND decoders. Doughnut-shaped gates provide more driving power in a smaller area and with fewer parasitics. ..... 110

Fig. 6-19: Layout of the output analog multiplexer. The state of the SRAM determines whether the inputs are connected to the outputs or not. SWi&j label different pairs of switches. ..... 111

Fig. 6-20: The experimental chip fabricated to characterize various subblocks in a SyMOS network. ..... 116

Fig. 6-21: The measured characteristic curve of a sigmoid unit ....................... 117

Fig. 6-22: Input-output characteristic curves of a neuron with one synapse and one bias term. ..... 118

Fig. 6-23: A photomicrograph of our third fabricated test chip. It includes 50 switchable-sign synapses and 10 complete neurons. ..... 119

Fig. 6-24: Plot of the measured equi-value surfaces generated by a two-synapse SSSA neuron. Different areas correspond to different operating regions of the transistors. The central discriminating curve is highlighted. ..... 120

Fig. 6-25: This table shows the signs of different terms in the characteristic equation of a neuron (Equation 6-4) which have been used to obtain the characteristic curves plotted in Fig. 6-26 and Fig. 6-27. ..... 120

Fig. 6-26: 3-D plots of the input-output characteristics of a switchable-sign-synapse neuron for the different sign configurations listed in Fig. 6-25. ..... 121

Fig. 6-27: Plots of the equi-value curves in the input-output characteristics of a switchable-sign-synapse neuron for the different sign configurations listed in Fig. 6-25. ..... 122

Fig. 7-1: (a) The block diagram of a neuron with quadratic synapses, where Kj are synaptic gains, Wj determine the centre of the quadratic shape, Vj are the inputs, and R2 is a constant bias term. (b) Typical quadratic discriminating surfaces in 2-D represented by this block. ..... 131

Fig. 7-2: Basic diagram demonstrating the operation of a symmetric MOS difference-squarer circuit. ..... 133

Fig. 7-3: (a) A CMOS pair is substituted for each MOS transistor. (b) A CMOS pair biased with a current source implements a floating voltage source [141] ........... 135

Fig. 7-4: Schematic diagram of the subtracting-squaring unit used as a basic synaptic block ..................................................................................................................... 136

Fig. 7-5: A photomicrograph of an AQNN test chip fabricated in a 1.2-micron CMOS technology. This design includes basic blocks as well as a two-synapse quadratic analog neuron (to the left). ..... 138

Fig. 7-6: Layout of the subtracting-squaring unit. ................ ..... ....................... 139

Fig. 7-7: Block diagram of a neuron with two quadratic synapses ................... 140

Fig. 7-8: Test results for the subtracting-squaring unit. As seen, a symmetric and quadratic output current versus differential input voltage results ............ 141

Fig. 7-9: 3-D plot of the output versus inputs of a two-synapse AQNN ............ 142

Fig. 7-10: Equi-value points in the output. Circular discriminating functions are demonstrated here ............ 143

Fig. 7-11: A single layer of quadratic analog neurons can isolate different islands of data in their input space ............ 144

Fig. 7-12: Some of the geometrical shapes readily discriminated by a network using one layer of AQNs and other layers of conventional linear neurons .................... 144

Fig. 7-13: The net stimulation current received by an AQN is directly proportional to the square of the Euclidean distance between input and weight vectors .......... 145

Fig. 7-14: (a) A linear-synapse bump-generator network and its output (b) ..... 146

Fig. 7-15: (a) A neuron with a single quadratic synapse is a good candidate for use in 1-D function approximators. (b) Its typical output ............ 148

Fig. 7-16: A general function approximator in an N-D space ........................... 150

Fig. 8-1: A complete switchable-sign synapse in a floating-gate technology. It employs only 3 charge-injection floating-gate transistors, requiring no other synaptic memory device or support circuitry ............ 154

Fig. A-1: An idealized cross section of an n-channel MNOS device ............ 158

Fig. A-2: A simple representation of a floating-gate device with a thin-oxide layer on its drain area ............ 159

Fig. A-3: (a) An inter-poly charge-injector can be implemented using a standard double-poly CMOS process. (b) Cross-sectional view of the charge injector ............ 161

Fig. B-1: The effect of linear operation of MOS devices in an SANN. (a) Synapses have positive signs and the radius term is negative. (b) and (c) One of the synaptic terms is negative while the bias term is positive (continued on subsequent pages). W1 = W2 = 2 and R = 1 ............ 164


Foreword

This book introduces several state-of-the-art VLSI implementations of artificial neural networks (ANNs). It reviews various hardware approaches to ANN implementations: analog, digital and pulse-coded. The analog approach is emphasized as the main one taken in the later chapters of the book.

The area of VLSI implementation of ANNs has been progressing for the last 15 years, but not at the fast pace originally predicted. Several reasons have contributed to the slow progress, the main one being that VLSI implementation of ANNs is an interdisciplinary area into which only a few researchers, academics and graduate students are willing to venture. The work of Professors Fakhraie and Smith, presented in this book, is a welcome addition to the state of the art and will greatly benefit researchers and students working in this area. Of particular value is the use of experimental results to back up extensive simulations and in-depth modeling. The introduction of a synapse-MOS device is novel. The book applies the concept to a number of applications and guides the reader through further possible applications for future work.

I am confident that the book will benefit a potentially wide readership.

M. I. Elmasry

University of Waterloo

Waterloo, Ontario

Canada


Preface

Neural Networks (NNs), generally defined as parallel networks that employ a large number of simple processing elements to perform computation in a distributed fashion, have attracted a great deal of attention in the past fifty years. As a result, many new discoveries have been made. For example, while conventional serial computational techniques are reaching intrinsic physical limits of speed and performance, parallel neural-computation techniques introduce a new horizon directing humans towards the era of tera-computation. The inherent parallelism of a neural-network solution is one of the features that has attracted the most attention. In addition, the learning capability embedded in a neural solution provides techniques by which to adapt potentially low-cost low-precision hardware in ways which are of great interest on the implementation side.

Although the majority of advancements in neuro-computation have resulted from theoretical analysis, or by simulation of parallel networks on serial computers, many of the potential advantages of neural networks await effective hardware implementations. Fortuitously, rapid advancement of VLSI technology has made many of the previously-impossible ideas now quite feasible.

The first transistor was invented nearly fifty years ago; yet it took more than a decade until early integrated circuits began to appear. Since then, the dimensions of a minimum-size device on an integrated circuit chip have shrunk dramatically. Nowadays, to have a billion transistors on a chip seems quite possible. However, such transistors have their own limitations which impose special conditions on their effective use.

The inherent parallelism of neural networks and their trainable-hardware implementation provide a natural means for employment of VLSI technologies. Analog and digital storage techniques are conveniently available in VLSI circuits. Also implementation of addition, multiplication, division, exponentiation and threshold operations have proved to be possible. As well, through advances in VLSI technology, multilayer metal and polysilicon connecting lines have eased the communication problems inherent in any implementation of artificial neural networks (ANNs). Thus the VLSI environment naturally suits neural-network implementation. Moreover, fortuitously, tolerances, mismatches, noise, and other hardware imperfections in VLSI can be best accommodated in the adaptive-training process which ANNs naturally incorporate.

Thus, it appears that these two landmark technologies, ANNs and VLSI, rapidly emerging in the final years of the second millennium, must be united for mutual benefit: one provides a seemingly never-satisfiable demand for more and more processing elements organizable to deal with real-world information-processing problems; the other delivers an apparently ever-increasing number of resources on a


chip. One provides a reasonable performance with a great degree of tolerance to the operation of any single device; the other can best take advantage of this property to increase the overall acceptable yield of working systems, despite the increasing level of imperfections which accompanies the ever-shrinking dimensions of a single device on a chip of potentially ever-increasing area. Locality of individual operations with global communication of information, and possible modularity of system-level designs, in combination with unsupervised training algorithms, are among the many interesting features that encourage the expectation of a better future available through the combination of these fields.

However, one annoying problem, which shows itself every now and then, is that although the marriage of VLSI implementation and neural-information-processing techniques seems natural and inevitable, one which is publicly encouraged and even announced as attractive by most researchers in these fields, their union is not without difficulty. In fact, at some level of detail, it may be quite unnatural! The problem is that the basic premises on which each works are quite different.

VLSI technology has its own way of expressing relations and connections. In reality, a VLSI layout is composed of different layers of metals, other conductors, various insulators, semiconductors, and so on. When several of these basic elements are combined, one of several basic electrical elements is formed (in particular, resistors, capacitors, inductors, diodes, and transistors), each represented by a small number of characteristic equations whose role and validity are generally understood and not simply proprietary knowledge of the layout engineer. Designers use these abstract representations to realize their circuits as larger abstractions, which, after several levels of translation and compilation, are finally mapped onto a piece of silicon or other electronic medium.

Correspondingly, but differently, neural networks have their own block structure for structuring, expressing, and processing the facts they embody. Their structure is normally quite simple, repeatable, and usually complete with some appropriate analytical interpretation. Each processing element employs the basic characteristics of its building blocks, whose origin, naturally enough, is in the modelling of real biological systems. Logically enough, the simplicity of these building blocks has a dramatic impact on the feasibility of the construction of a computational medium composed of potentially billions of basic elements. Ironically, while the simplicity of neural characteristics is well-adapted to biological implementations, it is not so suited to silicon ones!

The disturbing fact is that the normal characteristic equation of a processing element (a neuron), or of an interconnection (a synapse) in an artificial neural network, is not at all similar to the characteristic equation or physical behavior of any basic building element easily available in any current VLSI technology. This is in sharp contrast to the structure of biological neural networks, where a startling natural harmony exists between the resources available in biological implementation media


(including liquids, ions, membranes, ...), and the abstract system-level behavioral models describing their operation.

Following the lead represented by this observation, it appears logical that instead of continuing to work on purely biologically-inspired models, one should concentrate on another, more global, aspect of biological neural networks: the fact that they are parallel interconnected networks of simple processing and interconnecting elements that naturally employ existing resources available in their implementation media. Therefore, one can conclude that the simplicity of the processing elements, and the way their natural and intrinsic characteristic equations are employed, are important elements in attempting to make the most out of what is readily available.

These biologically-inspired observations encouraged us to view the VLSI implementation of artificial neural networks in another light: first, to understand the optimality principles governing the development of biological neural networks; then, with these principles as a guideline, to consider the resources available in our VLSI implementation medium, with a view to designing distributed parallel networks in a way that makes the most of them.

Correspondingly, in the presentation to follow, the major inspiration to be derived from biological neural networks will be the simple idea of employing parallel networks composed of distributed simple processing elements. However, rather than on biologically-inspired models and equations, the reader will find the emphasis to be placed on the intrinsic characteristics of the electronic building blocks available in the CMOS-VLSI technology being used.

Through this approach, we believe that an effective union of the VLSI and neural-network fields can be achieved in which natural simplicity is emphasized. Correspondingly, rather than a direct synapse equivalent being implemented as a system composed of dozens of transistors, as has been done conventionally, in our work the intrinsic operating equation of a simple MOS transistor will be employed as the basic synaptic operation.
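As a rough illustration of this premise, consider the textbook square-law (Shichman-Hodges) model of an NMOS transistor; this is a generic sketch, not the book's exact SyMOS formulation, and all parameter values below are hypothetical. In the triode region the drain current already resembles a weight-input product: for small drain-source voltages, I_D ≈ β(V_GS − V_T)·V_DS, i.e. the product of a storable gate overdrive (a "weight") and an applied voltage (an "input").

```python
def triode_current(v_gs, v_ds, beta=1e-4, v_t=0.7):
    """Simplified NMOS triode-region drain current (amps).

    Square-law model: I_D = beta * ((V_GS - V_T)*V_DS - V_DS**2 / 2),
    clamped at the edge of saturation (V_DS = V_GS - V_T).
    Parameter values are illustrative, not from any particular process.
    """
    v_ov = v_gs - v_t                  # gate overdrive: the stored "weight"
    if v_ov <= 0:
        return 0.0                     # device off below threshold
    v_ds = min(v_ds, v_ov)             # clamp at the saturation boundary
    return beta * (v_ov * v_ds - 0.5 * v_ds ** 2)

# For small V_DS the current closely tracks the ideal weight*input product:
beta, v_t = 1e-4, 0.7
weight = 1.0                           # overdrive V_GS - V_T = 1.0 V
v_in = 0.05                            # small V_DS "input"
i_d = triode_current(weight + v_t, v_in)
ideal = beta * weight * v_in           # ideal multiplier output
# relative error is V_DS / (2 * V_ov), i.e. 2.5% for these values
```

The quadratic -V_DS²/2 term is the "imperfection" relative to an ideal multiplier; the premise of the approach described above is to design the network around such intrinsic device equations rather than to correct them with dozens of extra transistors.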

In this book, we introduce the basic premise of our approach to biologically-inspired and VLSI-compatible definition, simulation, and implementation of artificial neural networks. As well, we develop a set of guidelines for general hardware implementation of ANNs. These guidelines are then used to find solutions for the usual difficulties encountered in any potential work, and as guidelines by which to reach the best compromise when several options exist. As well, system-level consequences of using the proposed techniques in future submicron technologies with almost-linear MOS devices are discussed.

While the major emphasis in this book is on our desire to develop neural networks optimized for compatibility with their implementation media, we have also extended this work to the design and implementation of a fully-quadratic ANN based


on the desire to have network definitions optimized for both efficient discrimination of closed-boundary circular areas and ease of implementation in a CMOS technology.

Overall, this book implements a comprehensive approach which starts with an analytical evaluation of specific artificial networks. This provides a clear geometrical interpretation of the behavior of different variants of these networks. In combination with the guidelines developed towards a better final implementation, these concepts have allowed us to overcome the various problems encountered and to make effective compromises. Then, to facilitate the investigation of the models needed when more difficult problems must be faced, a custom simulation program for various cases is developed. Finally, in order to demonstrate our findings and expectations, several VLSI integrated circuits have been designed, fabricated, and tested. While these results demonstrate the feasibility of our approach, we emphasize that they merely show a direction in which to go, rather than the realization of the ultimate destination!

Finally, it is our hope that our readers, while being provided with the results of advanced ongoing research, will also find here the outlines of a comprehensive framework within which they can develop, examine, and firmly analyze their own innovative neural-network models and ideas. Further, it is our hope that, in this respect, theoretical neural-network researchers and software developers might also find the proposed methodology quite useful for their own purposes.

Organization of the Book

After discussing the underlying motivation and objectives in Chapter 1, we will review various existing hardware-implementation techniques in Chapter 2. In Chapter 3, we present the general model we have used in developing our neural networks, together with the idea of employing MOS-transistor-like processing elements directly in a neural network. Our simulation program and the test problems used in developing and verifying the ideas presented in the book are discussed as well. In Chapter 4, the foundation elements for the architectural design are described. To begin, conventional linear- and quadratic-synapse networks are introduced together with simple geometrical interpretations of their operation. A detailed analysis of the operation of our proposed single-transistor-based network follows, and related problems and potentials are examined. Then, architectures by which to take full advantage of the proposed synaptic elements, and to solve their associated problems, are described. Next, the performances of different architectures are compared based on extended simulations, and the effectiveness of our new direction is shown. In Chapter 5, we highlight our expectations for a low-level silicon device, one which we call a Synapse-MOS or SyMOS device, which can be employed in a range of networks. Then, our present approach to implementing such a SyMOS device in a standard double-polysilicon CMOS process is described. Experimental results based on fabricated chips complete Chapter 5. In Chapter 6, further details of the VLSI implementation of the proposed Synapse-MOS Artificial Neural Networks (SANNs) are


discussed, and experimental results are reported. In Chapter 7, we describe a second approach, the parallel development of a novel fully-quadratic analog neural network, for which we show circuit-design detail and the results of VLSI implementation. As well, the applied advantages of this type of network in unsupervised-competitive-learning and function-approximation problems are discussed. Finally, Chapter 8 concludes this book by providing an overall view and summary, together with the introduction of directions for possible future work.

In addition, several appendices have been added to further facilitate and extend future application of the techniques introduced in this book. Appendix A provides some information about different approaches to the implementation of nonvolatile semiconductor devices. Appendix B describes our view of possible utilization of the techniques developed in this book in coming submicron CMOS technologies. Finally, in Appendix C, based on power, speed, and chip-area performance measures, some of the practical advantages of our approach are discussed.
