8/2/2019 Project2MTECH
http://slidepdf.com/reader/full/project2mtech 1/11
RESEARCH PAPERS

DESIGN ENHANCEMENT OF COMBINATIONAL NEURAL NETWORKS USING HDL BASED FPGA FRAMEWORK FOR PATTERN RECOGNITION

By
PRIYANKA MEKALA *
JEFFREY FAN **
* Research Assistant and PhD Candidate, Department of Electrical and Computer Engineering, Florida International University, Miami, FL, USA.
** Assistant Professor, Department of Electrical and Computer Engineering, Florida International University, Miami, FL, USA.

ABSTRACT
The fast-emerging, highly integrated multimedia devices require complex video/image processing tasks, leading to a very challenging design process, as it demands more efficient and higher-performance processing systems. Neural networks are used in many of these imaging applications to represent complex input-output relationships. Software implementations of these networks attain accuracy with tradeoffs between processing performance (to achieve specified frame rates when working on large image data sets), power and cost constraints. The current trend involves the conventional processor being replaced by Field Programmable Gate Array (FPGA) systems due to their high performance when processing huge amounts of data. The goal is to design Combinational Neural Networks (CNN) for pattern recognition using an FPGA-based platform for accelerated performance. The enhancement in speed and computation from the hardware is compared to a software model (using MATLAB). The employment of HDL on the FPGA enables operations to be performed in parallel, thus allowing the exploitation of the vast parallelism found in many real-world applications such as robotics, controller-free gaming and sign/gesture recognition. As a validation of the CNN hardware model, a case study in pattern recognition is explored and implemented on a Xilinx Spartan 3E FPGA board. To measure the quality of learning in the trained network, the mean squared error is used. The processing performance of this non-linear stochastic tool is determined by comparing the HDL (parallel model) simulations with the MATLAB design (sequential model). The gain in training time and memory used for processing is also derived.

Keywords: VHDL, Combinatorial Neural Networks, Back Propagation, Pattern Recognition.

i-manager's Journal on Electronics Engineering, Vol. 2, No. 1, September - November 2011

INTRODUCTION
Neuroscience, the study of the human brain, is thousands of years old. This fascination with the human brain has led to the development of Artificial Neural Networks (ANNs), made possible by advances in electronics. ANNs have been used successfully in a broad spectrum of applications such as pattern recognition, data classification, control systems, signal processing and functional approximation. Much work has been done in these fields that relies on software simulations investigating the capabilities of the ANN models using both analog and digital implementations (Torres-Huitzil, Girau, & Gauffriau, 2007). Digital implementations are more popular for their basic advantages of higher accuracy, less noise sensitivity, more flexibility and compatibility with different types of processors. These digital implementations can be done either with a digital signal processor, an FPGA or a programmable logic design. An FPGA-based implementation would be the best choice among the previously mentioned platforms since it can work in parallel, as is the case with ANN behavior (Cantrell & Wurtz, 1993)(Baker & Hammerstrom, 1989)(Blais & Mertz, 2001)(Vargas, Barba, Torres & Mattos, 2011). Previous research on implementing various kinds of neural networks on the HDL platform in (Ali & Mohammed, 2010)(Omondi & Rajapakse, 2006)(Izeboudjen, Farah, Bessalah, Bouridane & Chikhi, 2008)(Schemmel, Meier & Schurmann, 2001) has focused on developing the neuron models and their
The rest of the paper is as follows: Section 1 presents the design progression using HDL and FPGA logistics, Section 2 explains the Combinational Neural Networks (CNN) and HDL-CNN, Section 3 explains the sign/gesture recognition model, Section 4 presents the results, and Section 5 concludes the paper.

1. Design progression using HDL
The first question that comes to mind is: why use a high-level design methodology (such as HDL) for CNN implementation as opposed to other object-oriented simulations? The answer is that high-speed processing can be achieved through dedicated hardware working in parallel, which can be implemented on FPGAs using HDL. ANNs are powerful systems capable of modeling complex input-output relationships. Information is processed via the mathematical models using the interconnections of neurons. Some interesting features displayed by the network engine are adaptation, parallelism, classification, optimization and generalization. The debate over whether to build a generic system that can be reprogrammed on user demands/applications or a single specialized system dedicated to one application with high-speed performance still prevails (Omondi & Rajapakse, 2006).

The researchers propose a HDL-based design methodology to trade off the high-level application requirements against the low-level FPGA hardware for pattern recognition. The HDL description has the advantages of being generic, flexible, dynamically reconfigurable on user demand and useful for gaining more control of parallel processes. Efficient reusability and performance are derived by providing the characteristics of entities in a model library (Izeboudjen, Farah, Bessalah, Bouridane & Chikhi, 2008). Table 1 shows the comparison of VHDL with the procedural languages and outlines the advantages of characterizing digital hardware using a hardware description language based on entity validation. The computations involved are mostly fixed-point integer rather than floating point, which results in some false outputs. This can be fixed by introducing libraries defining float-type variables and vectors. Pattern recognition using neural networks is dealt with recently in (Vargas, Barba, Torres & Mottos, 2011). There, the values of the pixels of an image frame were used as inputs for recognition, causing increased memory usage and computation. We choose to perform the recognition on the bitmapped (depth 4) image rather than the grayscale (8 bits) image; a gain in bandwidth is achieved in terms of memory storage. Also, once the image is preprocessed, the features are extracted and used as inputs in the proposed architecture. In this paper, the authors present the new generic design of the Combinational Neural Network (CNN) proposed in earlier research (Mekala, Erdogan & Fan, 2010) for pattern recognition based on the Xilinx Spartan 3E board using a VHDL model, called HDL-CNN. The simulation of the VHDL models is facilitated by the use of stimulus sequences and checkers (e.g., VHDL test benches, mean square error). The training time and computation variations (which depend on global parameters defined by the user) are analyzed and displayed in later sections of this paper. A comparison is made in order to establish the performance in speed of the proposed model.

Table 1. VHDL vs. Procedural languages

VHDL (hardware description):
- VHDL contains components that are concurrent, i.e. they run in parallel/simultaneously.
- VHDL provides ways to describe propagation of time and signal dependencies. Hardware oriented: digital logic design (the operations and structure are described at gate level and RT level; hierarchical design).
- VHDL supports unsynthesizable constructs that are useful in writing high-level models, test benches and other non-hardware artifacts needed in hardware logic design.
- VHDL has static type checking: many errors can be caught before synthesis and/or simulation.
- VHDL has a rich collection of data types and a well-defined standard, with a full-featured language and module system (libraries and packages).

Procedural languages (C, C++, MATLAB):
- Traditional software languages like C, C++ and MATLAB are sequential.
- No way to describe time and signal dependency. Software oriented: binary executable (dataflow language and non-hierarchical design).
- Explicit constructs and assignments are not supported by the procedural languages.
- Errors can be analyzed only after debugging. Synthesis errors are hard to debug.
- Object-oriented programs are written with pure logical or algorithmic thinking. Inherently procedural (single-threaded), with limited syntactical and semantic support to handle concurrency.
connections, concurrent operations, propagation delay and timing information (Omondi & Rajapakse, 2006)(Berry, 2002)(Schemmel, Meier & Schurmann, 2001)(Short, 2009)(Ashenden, 1995).

2. Hardware Descriptive Language - Combinational Neural Networks (HDL-CNN)
The CNN is a special class of ANN, described as follows. The design resembles a tree structure in addition to the generic architecture of a neural network. The previous research on the software-solution CNN design proposed in (Mekala, Erdogan & Fan, 2010) is based on the address search of the virtual memory of a CPU. This paper examines an alternative implementation of the CNN on the hardware platform, called the HDL-CNN, which modifies the architecture with the help of a VHDL design on an FPGA to better the performance. A basic neural network engine and the extension of the back-propagation network onto the HDL-CNN model are described below.

2.1 Generic Neuron Model in HDL
In order to model an artificial neuron from a biological neuron, three basic components are used: inputs to the neuron, synaptic weights and an activation threshold function. The synapses of the biological neuron (i.e. the ones which interconnect the neural network and give the strength of the connection) are modeled as synaptic weights. Mathematically they can be considered as functions: two linear and one non-linear. All inputs are modified by the weights and summed together. This activity is referred to as a linear combination. The linear combination of the input stage and aggregation is modeled as a simple MAC (multiply and accumulate) function. The output of the MAC is passed through a non-linear activation threshold to determine the output. The activation function considered could be a step function (the simplest non-linear function), a ramp function or a sigmoid function (Mehrotra, Chilukuri & Ranka, 1997)(Caudill & Butler, 1992)(Stergiou & Siganos, 1996)(Dreyfus, 2005).

Figure 1 shows the neural network engine with the three-layer structure (Fausett, 1994). Each neuron receives several inputs x_i and generates the pre-output v_k (k representing the neuron generating the output) through the linear function of multiply and accumulate as follows:

v_k = Σ_{i=0}^{p} w_ki · x_i        (1)

The synaptic weight of the connection is given by w_ki, where 'p' is the number of incoming inputs to the neuron and w_k0 = b_k is the bias. The output of the model is given by y_k, the pre-output passed through the activation function φ(.) (the sigmoid function defined in Eq. 3), as shown below:

y_k = φ(v_k)        (2)

Figure 1. Generic Neural Network Model

2.2 HDL-CNN Architecture Model
The CNN is built on the basic network of a back propagation described in previous research (Mekala, Erdogan & Fan, 2010). The features extracted from the prior module block are divided into classes or stages, where a set of features describes some information on the probability of the decision of the recognition. Hence, from a set of M actual features to be extracted, the probabilistic decision is made on three levels, with the first level monitoring the other two levels. Each level is fed with a vector V of variable length; hence there exist vectors V of size L1, L2 and L3. Since the platform is designed to serve as a generic model, flexible to user demands, the value of M varies from application to application depending on the linearity of the output classes. A parallel communication bus is provided between the feature extraction layer and the three-level CNN recognition model in order to allow the flow of the three level sets of the vector data, as well as an initialization clock signal to choose either level 2 or level 3 once the decision on level 1 is made (Mekala, Erdogan & Fan, 2010).
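The paper's neuron is implemented in VHDL; as a language-neutral illustration of Eqs. (1) and (2), the MAC stage followed by the sigmoid activation can be sketched in a few lines of Python (the input and weight values below are arbitrary examples, not from the paper):

```python
import math

def neuron(x, w, bias):
    """One neuron: pre-output v_k = sum_i w_ki * x_i + b_k (the MAC stage),
    then the sigmoid activation y_k = phi(v_k) = 1 / (1 + e^-v_k)."""
    v = bias + sum(wi * xi for wi, xi in zip(w, x))  # multiply-accumulate
    return 1.0 / (1.0 + math.exp(-v))

# Example: weights chosen so the pre-output is exactly 0, giving y_k = 0.5
y = neuron([1.0, 0.5], [0.2, -0.4], 0.0)  # v = 0.2 - 0.2 = 0
```

In the hardware model this whole loop collapses into a single MAC unit per neuron, which is what lets all neurons of a layer fire in the same clock cycle.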
The activation function used in the modeling of the CNN architecture is the sigmoid function. Each node in the network receives several input values and combines them to produce an output value. The node's activation function determines the manner in which these values are combined. It is necessary for the activation function to combine the input values in a non-linear manner so as to fit a wider range of task applications. Since each stage of the CNN is constructed using a back-propagation network (three layers: input, hidden and output), it is important that the activation function used be continuously differentiable. Several functions meet this criterion, but the most commonly used activation function is the sigmoid function, as described by Equation 3 below (Kwan, 1992):

φ(x) = 1 / (1 + e^(-x))        (3)

It is not easy to represent the sigmoid function in digital design, since it contains the exponential series. In object-oriented programming based models it is defined with the help of a look-up table, consuming more memory resources. In the HDL-CNN design, a piecewise second-order approximation of the function using quadratic functions is implemented, where c_0, c_1 and c_2 are the coefficients of the quadratic function (Tommiska, 2003). This requires two adders and three multiplier operators, redesigned as two MAC operations and a shift register for calculating the square.

Each level's vector-based back-propagation module is evaluated as described below. Assuming H is the vector of hidden-layer neurons, I is the vector of input-layer neurons, W1 is the weight matrix between the input and hidden layers, W2 is the matrix of synapses connecting the hidden and output layers, th1 and th2 are the effective biases on the computed activations (set to the value 1 for this design), T is the target activation of the output layer, μ is the momentum factor used to allow the previous weight change to influence the current weight change, and α is the learning rate adopted for the training, the mathematical equations are given in Table 2 for the design of each level of the CNN adopted (Fausett, 1994).

Table 2. Equations governing each level of CNN (back-propagation neural network)

Hidden-layer neuron activations:           H = φ(I·W1 + th1)
Output-layer neuron activations:           O = φ(H·W2 + th2)
Output-layer error:                        D = O(1-O)(O-T)
Hidden-layer error:                        E = H(1-H)·W2·D
Weights for second layer of synapses:      W2 = W2 + ΔW2_t, where ΔW2_t = α·H·D + μ·ΔW2_{t-1}
Weight adjustment of first layer:          W1 = W1 + ΔW1_t, where ΔW1_t = α·I·E + μ·ΔW1_{t-1}

Generally, the error threshold adjustment and learning rate (generally between 0 and 0.5) variations add little to the process, so the idea of momentum is used to boost the performance. On each pass through the layers, the weight change of a matrix of synapses is influenced by the previous pass's weight change. The degree to which it is influenced is determined by the momentum term (which generally varies between 0 and 1). The weight adjustments are made as epoch-based training, where at the end of each epoch the cumulative error is also tracked. All the factors, such as the sizes of the three level vectors L1, L2 and L3, each BP input, hidden and output layer, and the learning rate parameters, are user defined and are set in a configuration file. Figure 2 shows the block diagram of the HDL-CNN model. The time and memory consumption involved in computing the feature vector depend on its length; thus, instead of deriving all the values of the vector at a time, three levels are involved in order to improve the speed and performance of the HDL-CNN model.

Figure 2. HDL-CNN recognition model – Block diagram
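Tommiska's exact segment boundaries and coefficients c_0, c_1, c_2 are not reproduced in the paper, but the idea of the piecewise second-order approximation of Eq. 3 can be sketched as follows. The three segments and the least-squares coefficients below are illustrative assumptions, and the symmetry φ(-x) = 1 - φ(x) halves the table:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative segments of [0, 8); each gets one quadratic c0 + c1*x + c2*x^2,
# fitted here by least squares (the paper's coefficients are not given).
segments = [(0.0, 2.0), (2.0, 4.0), (4.0, 8.0)]
coeffs = [np.polyfit(np.linspace(lo, hi, 200),
                     sigmoid(np.linspace(lo, hi, 200)), 2)
          for lo, hi in segments]

def sigmoid_quad(x):
    """Piecewise-quadratic sigmoid; the Horner form c0 + x*(c1 + c2*x)
    maps onto two MAC operations per evaluation in hardware."""
    neg, xa = x < 0, abs(x)
    for (lo, hi), c in zip(segments, coeffs):
        if xa < hi or hi == segments[-1][1]:
            y = float(np.polyval(c, min(xa, hi)))  # clamp beyond last segment
            break
    return 1.0 - y if neg else y

worst = max(abs(sigmoid_quad(t) - float(sigmoid(t)))
            for t in np.linspace(-10.0, 10.0, 801))
```

Even with these rough segments, the worst-case deviation from the true sigmoid stays small, which is why the quadratic scheme can replace a memory-hungry look-up table.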
To generate the random weights, a linear shift-register module is used (weights between -1 and 1 are generated). When the asynchronous RESET is set high, the internal finite state machine of the CNN is reset to the initial state. During the initialization phase, the CNN randomizes all connection weights using the shift-register module, and when this is completed it enters the idle state. Training and testing are done in two different modes, called TRAIN and TEST. When the mode is set to TRAIN, the CNN enters the training state from idle; during TEST, the CNN enters the run state, with the corresponding flags being set.

2.2.1 Computational Analysis
The number of computations involved and the gain acquired by shifting the architecture to the HDL platform are modeled below. The feature vector V of variable length is a function of the number of patterns 'p' considered for recognition and the number of features extracted 'n':

V = f(p, n)        (4)

The general consideration for a CNN level is n input neurons, h hidden neurons and l output neurons, where n < h < 2n-1; MAC represents multiply and accumulate, A is an adder, M is a multiplier and S is a shifter operation. For a sigmoid operation, the software solution uses a look-up table (LUT), where the time taken for performing one LUT access depends on the speed of the processor. In the HDL-CNN, the quadratic equation depicted uses one MAC and one shifter (MAC+S) for the calculation of a single neuron activation function.

In the HDL-CNN, the matrix operations involved in the weight-layer updates and error calculations are performed by the dedicated adders, multipliers, shifters and MAC units, and hence concurrently (in one complete clock cycle) for all neurons at a time, rather than with the for-loop control used in software models. Each level of the CNN has a different number of computations involved, listed in Table 3, where the values of n, h and l (the input, hidden and output layer neurons) vary across level 1, level 2 and level 3. The average gain in training time, plotted and discussed in the results section, supports this analysis.

Table 3. Comparison of the number of computations involved in each level between the MATLAB CNN and the HDL-CNN

                                     CNN (MATLAB)                      HDL-CNN
Hidden-layer neuron activations:     {nhM + (1+h(n-1))A} + h(LUT)      {MAC}h + (MAC+S)h
Output-layer neuron activations:     {hlM + (1+l(h-1))A} + l(LUT)      {MAC}l + (MAC+S)l
Output-layer error:                  2lA + 2lM                         (l+1)MAC
Hidden-layer error:                  (h(l-1))A + (lh+2h)M              h(MAC+M)
Weight adjustment of second layer:   (2hl)A + (2lh)M                   hMAC + 2A
Weight adjustment of first layer:    (nh)A + (3nh)M                    nMAC + 2(A+M)

3. Sign/Gesture Recognition Model
A recognition model is shown in Figure 3. Sign recognition using neural networks is based on the learning of the network using a database set of signs/gestures (Vargas, Barba, Torres & Mottos, 2011). The architecture is designed based on a camera-based recognition methodology. Once the video/image is obtained from the acquisition unit, the image (256x256 pixels) is processed in various stages and data is extracted to implement the recognition model. The first step of the model after the image data is acquired is pre-processing. In general, raw image data processing consumes high memory and other resources due to redundancy on the spatial and temporal basis. Pre-processing involves filtering and background subtraction in order to account for various environment factors such as illumination, unwanted noise and other scene conditions, done using MATLAB. These pre-processed frames are taken as input (bitmapped images) into the FPGA for LoG (Laplacian of

Figure 3. System overview of the sign recognition model
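The Table 3 rows can be totalled programmatically. The sketch below sums each column for an assumed level size (n = 15 inputs as in Level 1, h = 11 hidden units as in the best-case results, and a hypothetical l = 5 outputs), giving a feel for the sequential operation count versus the number of concurrent hardware units:

```python
def matlab_ops(n, h, l):
    """Sequential (MATLAB) model: total multiplies, adds and sigmoid
    look-ups for one pass of a level, summed over the Table 3 rows."""
    mult = n*h + h*l + 2*l + (l*h + 2*h) + 2*l*h + 3*n*h
    add  = (1 + h*(n-1)) + (1 + l*(h-1)) + 2*l + h*(l-1) + 2*h*l + n*h
    lut  = h + l
    return mult, add, lut

def hdl_units(n, h, l):
    """HDL-CNN: dedicated operator instances that fire concurrently,
    summed over the Table 3 rows (MAC, shifter, multiplier, adder)."""
    mac   = 2*h + 2*l + (l + 1) + h + h + n  # all plain-MAC terms
    shift = h + l                            # squares in the sigmoid approximation
    mult  = h + 2
    add   = 4
    return mac, shift, mult, add

seq = matlab_ops(15, 11, 5)   # (912, 535, 16) sequential operations
par = hdl_units(15, 11, 5)    # (75, 16, 13, 4) concurrent units
```

Roughly 1,450 sequential operations per pass collapse onto about a hundred hardware operators that complete their rows in parallel, which is where the training-time gain reported later comes from.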
Gaussian) edge detection. The feature extraction block and the CNN block perform in parallel, with a dual-bus communication interface provided. The feature extraction layer extracts the necessary features of size, shape and state attributes of the hand (described in detail in (Mekala, Gao, Fan & Davari, 2011)). Since it is time consuming for the processor to wait until all 55 features have been extracted, the CNN layer is initiated at the arrival of the first 15 feature elements to Level 1, and then the remaining 40 go to the next level (level 2 or level 3, adopted based on the decision of the level 1 network) (Mekala, Gao, Fan & Davari, 2011). The training of the CNN is done using the sign language patterns from A to Z (without the J and Z characters, which involve motion). In order to test the ability and performance of the network, a test set of independent examples is used, so as to generalize the network with regard to example sets which are not present in the training set.

The case study of American Sign Language (ASL) recognition is interpreted step by step as follows:
· Image acquisition via camera and generation of still image frame data (video-to-frames conversion with background subtraction), done using MATLAB and stored as ".coe" files.
· Transfer of the image data to a Xilinx Spartan 3E FPGA (Field Programmable Gate Array) board via USB 2.0 using a PC.
· Saving the data to the onboard SRAM (Static Random Access Memory) to allow image processing functions to be performed on the image.
· Implementing the edge detection and feature extraction algorithms on the image and storing the feature vector back in the SRAM.
· Recognition via the CNN model, with parallel interaction with the feature extraction unit.
· Display of the input frame and the processed frame on the PC, to be viewed by the user via the VGA controller, with the recognized sign displayed on the LCD segment.

The model schematic of the sign recognition is shown in Figure 4. It contains the SRAM module, preprocessor module, feature extraction module and the CNN recognition module. There are three main hardware components in use in this case-study realization of the CNN using HDL for sign language recognition. Figure 5 shows the connections between the FPGA board – Xilinx Spartan

Figure 4. System overview of the sign recognition model – Preprocessing, Feature Extraction and CNN Engine
Figure 5. Xilinx Spartan 3E kit hardware connections
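The LoG operator used in the edge-detection step is standard; as a host-side sketch of what the FPGA computes (the kernel sizes, the binomial approximation of the Gaussian and the threshold below are illustrative assumptions, not the paper's parameters), an edge map can be produced as:

```python
import numpy as np

def conv2d(img, k):
    """Naive 'valid'-mode 2-D convolution (the kernels here are symmetric,
    so correlation and convolution coincide)."""
    kh, kw = k.shape
    out = np.zeros((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = float(np.sum(img[i:i+kh, j:j+kw] * k))
    return out

g1 = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0   # binomial ~ Gaussian
gauss = np.outer(g1, g1)                          # 5x5 smoothing kernel
laplace = np.array([[0., 1., 0.],
                    [1., -4., 1.],
                    [0., 1., 0.]])                # discrete Laplacian

def log_edges(img, thresh=0.5):
    """Laplacian of Gaussian: smooth, then take the Laplacian;
    large absolute responses mark edges."""
    return np.abs(conv2d(conv2d(img, gauss), laplace)) > thresh

# A depth-4 bitmap (pixel values 0..15) with a vertical step edge
frame = np.zeros((16, 16)); frame[:, 8:] = 15.0
edges = log_edges(frame)   # True only near the step
```

On the FPGA the same two convolutions become fixed-coefficient MAC pipelines over the SRAM-resident frame, so the operator fits naturally alongside the feature extraction block.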
3E, the USB-to-peripheral communications module and a monitor with a VGA connection in order to display the recognized output sign. The authors adopt the Xilinx Integrated Software Environment (ISE 10.1), which is a powerful, flexible integrated design environment that allows designing Xilinx FPGA devices from basic modules to complete microprocessor architectures. Project Navigator is the user interface that manages the entire design process, including design entry, simulation, synthesis, implementation, and finally downloading the configuration to the FPGA device. PACE is responsible for placing and routing the code for optimization. IMPACT then generates the programming files and downloads the code to the hardware (Xilinx, 2009).

4. Results
Most components of the architecture perform in parallel, and hence the potentially infinite training times are reduced reasonably. The training time is the time taken to train the network for a given number of patterns 'p' without duplicate input frames. The training time is plotted as it varies with respect to the number of patterns being trained in Figure 6. It clearly shows that the HDL-CNN model saves on average 13x the time involved in training when compared to the software-based CNN model. The curve also shows that the time saved increases as the number of patterns increases, i.e. as the complexity of recognition becomes more non-linear, and thus an average is considered for comparison. The adjustments of the weight matrices and the neuron activation vector are all parallel procedures, hence the gain in speed. On average (the mean over different numbers of patterns), the time consumed for the same design on MATLAB was around 381.36 seconds, while the time taken for the design in VHDL is 29.34 seconds of CPU time and 30 seconds of real time, as listed in Table 4. The HDL solution proposed proved to be about 13x better in terms of the time and speed of training the network.

The images after background subtraction are converted from grayscale to bitmap of depth 4 (sent as input in the form of '.coe' files to the ISE). Grayscale involves 8 bits to represent each pixel, whereas the bitmap image has a depth of 4. Thus, processing a 256x256 image saves 262144 (256x256x4) bits in representation, that is, 256 Kb in bandwidth, as listed in Table 4.

To validate the performance of the HDL-CNN, the authors generate the mean square error test bench considering the actual situation of the neural network operation. The test bench adopts the three-level feature vector as the input signal vectors, and the weight coefficients of the hidden and output layers are stored in RAMs. Both the mean square error and the evolution of the weights are transferred to text files and plotted using MATLAB. Epoch-based updating of the weights is performed, and the mean square error decreases at an exponential rate, settling down to an almost constant value, as shown in Figure 7.

An epoch is the presentation of the entire training set to the neural network once; for the network to reach the minimum threshold error, the training is done multiple times, counted as the number of epochs. The maximal weight change in each epoch decreases, finally reaching the least value possible. The best, intermediate and worst case scenarios are shown in Figure 7, where the evolution of the weights settles down to a constant value at the end of 1815 epochs for the best case, obtained by the global

Figure 6. HDL-CNN vs. CNN architecture: training time (in seconds) variation based on the number of patterns to be trained (plot annotations: 9x, 13x and 15x gains)

Table 4. HDL-CNN recognition model vs. CNN recognition model (hardware vs. software architecture)

                                   HDL-CNN (hardware solution)    CNN (MATLAB solution)
Average training time:             29.34 secs                     381.36 secs
Single pattern recognition time:   43.45 ms                       0.52 secs
Average performance:               95%                            92.8%
Average noise immunity:            51%                            48%
Epochs (best case scenario):       1815                           1832
Limitations:                       J, Z (signs with motion)       -
Gain in bandwidth:                 256 Kb per frame               -
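The bandwidth figure in Table 4 is simple arithmetic and can be checked directly (frame size and bit depths as stated in the text):

```python
W = H = 256                       # frame dimensions in pixels
grayscale_bits = W * H * 8        # 8 bits per pixel
bitmap_bits    = W * H * 4        # depth-4 bitmap, 4 bits per pixel
saved_bits     = grayscale_bits - bitmap_bits

assert saved_bits == 262144       # = 256 x 256 x 4 bits per frame
assert saved_bits // 1024 == 256  # i.e. 256 Kb saved per frame
```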
parameters (learning rate, momentum and error threshold) optimization. The inclusion of the momentum proved to be useful with training sets that include a few patterns that are very different from the rest (e.g. patterns B, W and Y are completely different from patterns A, C and O, where the finger tips are not present), as demonstrated in the worst case, which has no momentum, compared to the best and intermediate cases. Normally, such patterns will upset the convergence towards the minimum defined by the majority of the patterns. To improve that, one could use a very small learning rate (<0.1), but then the convergence would be very slow. Instead, the study keeps a moderate learning rate (0.1), but the authors involve the previous weight change, in addition to the current data (weight change), in defining the weight update. This provides a certain inertia to the training, which minimizes the disruption of the convergence caused by strange patterns.

Figure 8 generalizes the results of a few test patterns with the LoG edge operator and the sign recognized. A few noisy patterns are also tested in order to evaluate the accuracy of the architecture. Though the network is trained using different test patterns, it appears that the noise immunity levels vary for each sign involved. Noise immunity is the level of noise under which the pattern can still be recognized accurately. The correlation between the signs plays a role in the inconsistency of the noise immunity seen. The performance is calculated as the ratio of correct patterns recognized to the total number of test patterns. On average, a performance of 95% is achieved in pattern identification, and it takes around 43.45 ms to retrieve one image pattern. Given an input frame for testing, the time taken by the network architecture to process and recognize the sign is the

Figure 7. Various simulations for the best, intermediate and worst case considerations for each level of HDL-CNN training acquired (15 input neurons). Best case scenario: 11 hidden-layer neurons, learning rate 0.1, momentum 0.4, threshold 0.0001. Intermediate case scenario: hidden-layer units = 11, learning rate = 0.1, momentum = 0.7, threshold = 0.0001. Worst case scenario: hidden-layer units = 11, learning rate = 0.01, no momentum, threshold = 0.001.
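The momentum effect described above can be reproduced with a NumPy sketch of the Table 2 update rules on a toy single-level network. The OR-style pattern set, layer sizes, learning rate 0.5 and momentum 0.4 below are assumptions for the demonstration (not the paper's configuration), and since D is defined with (O - T), the delta is subtracted here so that the error descends:

```python
import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])   # toy inputs
T = np.array([[0.], [1.], [1.], [1.]])                   # OR-style targets
rng = np.random.default_rng(0)
W1 = rng.uniform(-1, 1, (2, 4))      # input->hidden, drawn in [-1, 1]
W2 = rng.uniform(-1, 1, (4, 1))      # hidden->output
dW1, dW2 = np.zeros_like(W1), np.zeros_like(W2)
alpha, mu = 0.5, 0.4                 # learning rate and momentum

def epoch():
    """One pass over the training set using the Table 2 rules."""
    global W1, W2, dW1, dW2
    sq_err = 0.0
    for I, t in zip(X, T):
        Hv = sigmoid(I @ W1)                      # H = phi(I.W1)
        O = sigmoid(Hv @ W2)                      # O = phi(H.W2)
        D = O * (1 - O) * (O - t)                 # output-layer error
        E = Hv * (1 - Hv) * (W2 @ D)              # hidden-layer error
        dW2 = alpha * np.outer(Hv, D) + mu * dW2  # momentum carries the
        dW1 = alpha * np.outer(I, E) + mu * dW1   # previous weight change
        W2, W1 = W2 - dW2, W1 - dW1
        sq_err += float(np.mean((O - t) ** 2))
    return sq_err / len(X)

first = epoch()
for _ in range(499):
    last = epoch()                   # mean squared error after training
```

Running the loop shows the mean squared error falling well below its initial value, the same epoch-wise decay the paper's Figure 7 curves exhibit.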
8/2/2019 Project2MTECH
http://slidepdf.com/reader/full/project2mtech 9/11
RESEARCH PAPERS
updates and hence concurrently done at a t ime for all
neurons rather than the for loop control used in object
oriented languages. Arithmetic precision is also achieveddue to the use of f loating point l ibraries. Moving to this
parallel hardware provided the speedups in orders of
magnitude (13x t imes in this case). Many advanced
famil ies of FPGAs have been manufactured (Vargas,
Barba, Torres & Mattos, 2011) that contain more logic
blocks and also video input control lers which clearly
implies the design to be optimized on different goals of
area, power and speed. The use of VHDL for the
architectural design represents a very practical option
when dealing with complex systems. Thus the FPGAsconstitute a very powerful option for implementing CNNs
s ince we can really exploit their parallel processing
capabilities to improve the performance. To progress the
research the algorithm needs to be extended to
recognize words or sentences which involve a set of
images (i.e. video frames) to be processed at a time with
thehelp of a vectorbank. Also theHDL-CNN architectureis
generic as it could be used for other pattern recognition
,
single pattern recognition time. The signs involvingmotion
(J, Z)arethe limitations of thearchitecture as compared to
the software solution. The epochs involved to reach thesteady s tate and the noise immunity achieved are
approximately equal in both cases. An inclusion of a SRAM
vector back to store the motion vectors of adjacent
frames could be considered for future research in order to
eliminate theabove limitations.
Conclusion
The combinational neural networks are one of the most
powerful tools in the recognition/ identification process
applications. The VHDL based model design of the sign
recognition model presents a performance pretty good
to identify the static images of the American Sign
Language alphabets with implementation on the Xilinx
Spartan 3E FPGA. Per formance is achieved as the
expensive operations are optimized in VHDL by the use of
a matrix-vector multiplication performed during each
layer and level data flow. Dedicated adder and
multipliers are used for per forming the weight layer
Figure 8. Sign language alphabets recognized by the HDL-CNN recognition model (LoG (Laplacian of Gaussian) edge-detected clean and noisy images, with the recognized alphabets B, Y, I, V, C, L)
i-manager's Journal on Electronics Engineering, Vol. 2, No. 1, September - November 2011
References

[1]. Torres-Huitzil, C., Girau, B., and Gauffriau, A., (2007). Hardware/Software Co-design for Embedded Implementation of Neural Networks. Reconfigurable Computing: Architectures, Tools and Applications - Lecture Notes in Computer Science, 4419, 167-178.

[2]. Cantrell, C., and Wurtz, L., (1993). A Parallel Bus Architecture for Artificial Neural Networks. Southeastcon '93 Proceedings, IEEE, p. 5. doi: 10.1109/SECON.1993.465674.

[3]. Baker, T., and Hammerstrom, D., (1989). Characterization of Artificial Neural Network Algorithms. Circuits and Systems - IEEE International Symposium, Vol. 1, 78-81. doi: 10.1109/ISCAS.1989.100296.

[4]. Blais, A., and Mertz, D., (2001, July). An Introduction to Neural Networks - Pattern Learning with Back Propagation Algorithm. Retrieved from http://www.ibm.com/developerworks/library/l-neural/.

[5]. Vargas, P. Lorena, Barba, L., Torres, C. O., and Mattos, L., (2011). Sign Language Recognition System using Neural Network for Digital Hardware Implementation. Journal of Physics: Conference Series, 274(1). doi: 10.1088/1742-6596/374/1/012051.

[6]. Ali, H. K., and Mohammed, E. Z., (2010, August). Design Artificial Neural Network using FPGA. International Journal of Computer Science and Network Security, 10(8), 88-92.

[7]. Omondi, A. R., and Rajapakse, J. C., (2006, July). FPGA Implementations of Neural Networks. Springer.

[8]. Izeboudjen, N., Farah, A., Bessalah, H., Bouridane, A., and Chikhi, N., (2008, July). High Level Design Approach for FPGA Implementation of ANNs. Encyclopedia of Artificial Intelligence, IGI Global Publishers. doi: 10.4018/978-1-59904-849-9.

[9]. Perry, D. L., (2002). VHDL: Programming by Example. McGraw-Hill, fourth edition.

[10]. Schemmel, J., Meier, K., and Schurmann, F., (2001, October). A VLSI Implementation of an Analog Neural Network Suitable for Genetic Algorithms. ICES '01 Proceedings of the 4th International Conference on Evolvable Systems: From Biology to Hardware. Springer-Verlag London, 50-61.

[11]. Short, Kenneth L., (2009). VHDL for Engineers. NJ: Pearson Prentice Hall.

[12]. Ashenden, Peter J., (1995). The Designer's Guide to VHDL. San Francisco: Morgan Kaufmann Publishers.

[13]. Mekala, P., Erdogan, S., and Fan, J., (2010, November). Automatic Object Recognition using Combinational Neural Networks in Surveillance Networks. IEEE 3rd International Conference on Computer and Electrical Engineering (ICCEE '10), Chengdu, China, Vol. 8, pp. 387-391.

[14]. Mehrotra, K., Chilukuri, K. M., and Ranka, S., (1997). Elements of Artificial Neural Networks. The MIT Press, pp. 1-2.

[15]. Caudill, M., and Butler, C., (1992). Understanding Neural Networks: Computer Explorations. MIT Press.

[16]. Stergiou, C., and Siganos, D., (1996). Report: Neural Networks. Vol. 4. Retrieved from http://www.doc.ic.ac.uk/~nd/surprise_96/journal/vol4/cs11/report.html.

[17]. Dreyfus, G., (2005). Neural Networks: Methodology and Applications. Berlin, New York: Springer.

[18]. Kwan, H. K., (1992, July). Simple sigmoid-like activation function suitable for digital hardware implementation. Electronics Letters, 28(15), 1379-1380. doi: 10.1049/el:19920877.

[19]. Fausett, L., (1994). Fundamentals of Neural Networks: Architectures, Algorithms, and Applications. Prentice Hall.

[20]. Mekala, P., Gao, Y., Fan, J., and Davari, A., (2011, March). Real-time Sign Language Recognition Based on Neural Network Architecture. Joint IEEE International Conference on Industrial Technology & 43rd Southeastern Symposium on System Theory (SSST '11), Auburn, AL, pp. 197-201.

[21]. Xilinx (2009). XST User Guide. Xilinx Inc. Retrieved from http://www.xilinx.com/support/documentation/sw_manuals/xilinx12_2/xst.pdf.

[22]. Tommiska, M. T., (2003, November). Efficient digital implementation of the sigmoid function for reprogrammable logic. IEE Proceedings - Computers and Digital Techniques, 150(6). doi: 10.1049/ip-cdt:20030965.
ABOUT THE AUTHORS

Priyanka Mekala received her M.S. degree in Electrical Engineering from Arizona State University and her B.E. degree in Electronics and Communications from Osmania University, India, in May 2009 and June 2007, respectively. She started to work on her Ph.D. degree in Electrical Engineering at FIU in fall 2009 and is currently a Ph.D. candidate. Her research interests include Signal Processing, Real-time Image/Video Processing, and VLSI Design/Testing. She is also a student member of IEEE.

Dr. Fan is currently working as an Assistant Professor in Electrical and Computer Engineering at Florida International University. His research interests include very-large-scale integration (VLSI) circuit simulation, modeling, optimization, bio-electronics, embedded real-time operating systems in application to robotic control, and wireless communications in sensor networks. Prior to his academic career, he served as Vice President of Vivavr Technology, Inc., and General Manager/co-founder of Musica Technologies, Inc. From 1988 to 2002, he held various senior technical positions in California at Western Digital, Emulex Corporation, Adaptec Inc., and Toshiba America. His product line of research and development includes Virtual Reality (VR) 3-D animation, MP3 players, hard drives, fiber channel adapters, SCSI/ATAPI adapters, RAID disk arrays, PCMCIA cards, and laser printer controllers. He received his Ph.D. degree in Electrical Engineering from the University of California, Riverside in 2007, and his Master of Science degree in Electrical Engineering from the State University of New York at Buffalo in 1987. He also holds a Bachelor of Science degree in Electronics Engineering from National Chiao Tung University in Taiwan, R.O.C. He has served as a steering committee member of SSST, a technical program committee member for ICESS, CAMAD, ISQED, and ISCAS, and an invited tutorial speaker for ASICON '07. He is a Senior Member of IEEE.