EMPIRICAL AND FEED FORWARD NEURAL NETWORKS MODELS OF TAPIOCA STARCH HYDROLYSIS

This article was downloaded by: [University Of Maryland]
On: 17 October 2014, At: 07:13
Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Applied Artificial Intelligence: An International Journal
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/uaai20

EMPIRICAL AND FEED FORWARD NEURAL NETWORKS MODELS OF TAPIOCA STARCH HYDROLYSIS
Roslina Rashid a, Hishamuddin Jamaluddin b & Nor Aishah Saidina Amin a
a Faculty of Chemical & Natural Resources Engineering, Universiti Teknologi Malaysia, Skudai, Malaysia
b Faculty of Mechanical Engineering, Universiti Teknologi Malaysia, Skudai, Malaysia
Published online: 23 Feb 2007.

To cite this article: Roslina Rashid, Hishamuddin Jamaluddin & Nor Aishah Saidina Amin (2006) EMPIRICAL AND FEED FORWARD NEURAL NETWORKS MODELS OF TAPIOCA STARCH HYDROLYSIS, Applied Artificial Intelligence: An International Journal, 20:1, 79-97, DOI: 10.1080/08839510500191422

To link to this article: http://dx.doi.org/10.1080/08839510500191422


Taylor & Francis makes every effort to ensure the accuracy of all the information (the "Content") contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content.

This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http://www.tandfonline.com/page/terms-and-conditions


EMPIRICAL AND FEED FORWARD NEURAL NETWORKS MODELS OF TAPIOCA STARCH HYDROLYSIS

Roslina Rashid
Faculty of Chemical & Natural Resources Engineering, Universiti Teknologi Malaysia, Skudai, Malaysia

Hishamuddin Jamaluddin
Faculty of Mechanical Engineering, Universiti Teknologi Malaysia, Skudai, Malaysia

Nor Aishah Saidina Amin
Faculty of Chemical & Natural Resources Engineering, Universiti Teknologi Malaysia, Skudai, Malaysia

The aim of dynamic modeling of the tapioca starch hydrolysis process is to generate models for forecasting the future product concentration (glucose) from the initial conditions of available process measurements. This paper compares two methods of modeling the tapioca starch hydrolysis process: (1) the empirical approach and (2) the feed forward neural network (FFNN) approach. Experiments were conducted to obtain a set of data for the modeling purpose. The Gauss-Newton method was used for parameter estimation in the empirical analysis, and a multilayer neural network with one hidden layer was utilized in the neural network approach. This study indicates that the FFNN model of tapioca starch hydrolysis produces better predictive accuracy, is simpler to develop, and has a generalization capability that the empirical model lacks.

Tapioca starch hydrolysis is a process to produce glucose syrup that can be used in various applications in the food and pharmaceutical industries because of its unique properties and its interaction with other ingredients present. Glucose production is actually the net overall effect of many simultaneous reactions, including hydrolysis of maltose, maltotriose, and so forth. Often it is neither possible nor practical to measure and monitor the concentrations of all species present in the reaction mixture. In addition, the molecular weight distribution of a native starch is rarely known (Schenck and Hebeda 1992). Therefore, it is very difficult to determine the individual reaction rates and the kinetic parameters that are required in a theoretical modeling approach. Moreover, this approach involves a large number of experiments to adequately characterize the kinetic parameters in order to describe the reaction curve faithfully (Zanin and De Moraes 1996). Consequently, a theoretical approach to modeling the process is often impractical.

Address correspondence to Roslina Rashid, Faculty of Chemical & Natural Resources Engineering, Universiti Teknologi Malaysia, 81310 Skudai, Malaysia. E-mail: [email protected]

Applied Artificial Intelligence, 20:79–97
Copyright © 2006 Taylor & Francis Inc.
ISSN: 0883-9514 print/1087-6545 online
DOI: 10.1080/08839510500191422

For these reasons, many researchers have used an empirical approach as an alternative to the theoretical approach. Empirical or semi-empirical equations that relate the reaction rate to operating variables have been proposed by various authors to model similar processes. An empirical model describing the kinetics of cassava starch hydrolysis with Termamyl enzyme was proposed by Paolucci-Jeanjean et al. (2000). The model satisfactorily fits experimental data for oligosaccharides with a degree of polymerization (DP) ranging from 1 to 7. However, the model satisfactorily accounts only for cassava starch hydrolysis by Termamyl at T = 80°C, 50 < S0 < 270 g·dm⁻³, and 0.17 < E < 1 cm³·dm⁻³. Gonzalez-Tello et al. (1996) proposed a potato starch hydrolysis model that is based on fitting the experimental data to cubic spline functions. They claimed that the empirical model of enzyme hydrolysis in whey proteins could be successfully used in the enzyme hydrolysis of other biopolymers. Akerberg et al. (2000) utilized a semi-empirical model that involved determination of more than five kinetic parameters to describe the wheat starch saccharification process. The disadvantage of models developed using this approach is that they depend strongly on operating conditions such as temperature and pH. Different operating conditions are represented by different equations, and the coefficients need to be determined every time the conditions change. Thus, attempts were made to model the process using a neural network approach. This method was chosen because of its ability to capture nonlinear behaviors from input-output data of a process (Billings et al. 1992).

In recent years, artificial neural networks, or simply neural networks, have become one of the major buzzwords in the biotechnology and biochemical fields. Previous studies have demonstrated the potential of neural networks for estimation and control in many bioprocesses (Di Massino et al. 1992; Linko and Zhu 1992; Kurtanjek 1994). For example, Thibault et al. (2000) applied the neural network modeling approach to the fermentation of glucose to gluconic acid by Pseudomonas ovalis, and Petrova et al. (1998) used neural networks for the specific growth rate approximation of a strain of Saccharomyces cerevisiae on a glucose-limited medium. The capability of neural networks in the identification of starches in manufacturing food products was demonstrated in the work of Huang et al. (1993). Zhu (1995) constructed recurrent neural networks for application in a dynamic bioprocess for the production of ethanol and glucoamylase. Chaudhuri and Modak (1998) utilized neural networks for the optimization of a fed-batch bioreactor. Yang and Linkens (1994) applied neural networks for controlling stirred tank bioreactors. Syu and Chang (1999) successfully used neural networks to adaptively control Penicillin acylase fermentation. There have also been several successful developments of software sensors (estimating hard-to-measure variables from other online measurable process variables) based on the neural network approach (Thibault et al. 1990; Willis et al. 1991; Montague et al. 1992; Linko et al. 1997). Until recently, the neural network approach had not been utilized for modeling the tapioca starch hydrolysis process. Rashid et al. (2003) provided a comprehensive study to determine a suitable neural network model for tapioca starch hydrolysis. It was shown that a 4-5-1 structure with log sigmoid activation functions in both the hidden and output layers is the best network model for the process. To the best of our knowledge, no comparative study of neural network and empirical models of tapioca starch hydrolysis has been reported so far. Thus, the main contribution of this paper is to provide a comparative study of the performances of neural network and empirical models of the tapioca starch hydrolysis process.

The purpose of modeling the process is to predict the glucose concentration profiles with the neural network model while accounting for the existence of measurement errors. The neural network model will then be assessed for process design using the model predicted output (MPO). In the study presented here, empirical and feed forward neural network (FFNN) models of tapioca starch hydrolysis are discussed, and their predictive performances are compared and analyzed.

PROCESS BACKGROUND

In the study presented here, the starch was hydrolyzed in two separate steps, liquefaction and saccharification, to obtain the product behavior or profile. Two parameters were varied: enzyme dosage (EN) and initial dry solid (DS), since those parameters have a major influence on the production of glucose. Thirteen experiments were conducted to obtain a data set for the modeling purpose. For the neural network modeling approach, the data sets were divided into two groups: 10 sets were used for parameter estimation and three sets were used for generalization by cross validation.

In the liquefaction step, Termamyl from Novo was utilized for the enzymatic liquefaction of starch to dextrin. Termamyl is a thermo-stable alpha-amylase enzyme. At a temperature of 105°C, it is highly active and sufficiently stable for the reaction to occur. That temperature permits thorough gelatinization of the starch without any significant loss of enzyme activity. The enzyme attacks the alpha (1-4) glucosidic linkages of the starch to reduce the length of the starch chain. This reaction gives a rapid reduction of the viscosity and increases the dextrose equivalence (DE). The breakdown products are dextrins of different chain lengths and oligosaccharides.


In the subsequent step, dextrins were further hydrolyzed by amyloglucosidase (AMG), or glucoamylase, enzyme. The enzyme breaks down the 1,4 linkages faster than the 1,6 linkages. AMG products are free from transglucosidase activity, which could otherwise result in the formation of panose and isomaltose by transfer of glucosyl moieties from the 1,4-alpha to the 1,6-alpha position, which in turn could result in a lower glucose yield. The reaction should be stopped when the maximum dextrose level is obtained. If the reaction continues, the glucose level will fall due to the reverse reaction, whereby a condensation reaction produces maltose and isomaltose.

EMPIRICAL MODEL

Problem Formulation

A modified version of the empirical model developed by Gonzalez-Tello et al. (1996) was utilized to describe the rate of glucose production. The study showed that the overall rate of enzyme hydrolysis could be successfully represented by

$$\frac{dx}{dt} = a(c + x)\exp(-bx), \tag{1}$$

where a and b are the kinetic parameters of the process and c is a constant. Gonzalez-Tello et al. (1996) assumed c to be negligible. Nevertheless, c cannot be ignored in this case, as will be shown later: its value is appreciable compared with the value of x. Based on the experimental data, a, b, and c for tapioca starch hydrolysis are estimated. The investigations of Gonzalez-Tello et al. (1996) showed that a is a function of enzyme dosage and temperature and that b is equal to 8.75. Before parameter estimation is performed, the product concentration G is expressed using the dimensionless conversion x

$$x = \frac{G}{1.1\,(S_0)}, \tag{2}$$

where G represents the glucose concentration and S0 is the initial dry solid. By integrating equation (1) between an initial value (t0, x0) and any given instant (t, x), the following equation is obtained:

$$\int_{x_0}^{x} \frac{\exp(bx)}{c + x}\,dx = a(t - t_0) \tag{3}$$

The term exp(bx) of equation (3) is expanded using a Taylor series expansion evaluated at x0. Then, each term of the expansion is divided by (c + x) before integrating each expression. Thus, the integral expression in equation (3) can be approximated by the series expansion evaluated at x0, defined as

$$\int_{x_0}^{x} \frac{\exp(bx)}{c + x}\,dx = \exp\left(\frac{999}{2000}b\right)\log(c + x) + \exp\left(\frac{999}{2000}b\right)bx - \frac{999}{2000}\exp\left(\frac{999}{2000}b\right)b\log(2000c + 2000x) + \cdots \tag{4}$$

Finally, equation (4) is substituted into equation (3) and the following equation is obtained:

$$\exp\left(\frac{999}{2000}b\right)\log(c + x) + \exp\left(\frac{999}{2000}b\right)bx - \frac{999}{2000}\exp\left(\frac{999}{2000}b\right)b\log(2000c + 2000x) + \cdots = a(t - t_0) \tag{5}$$

In this study, the Gauss-Newton method was utilized to estimate the parameters a, b, and c in equation (5) from the measured values of x. The estimated parameters were then substituted into equation (1), and the equation was solved numerically to obtain the conversion profile, which was compared with the experimental data.
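The numerical solution step can be illustrated with a minimal Python sketch. It integrates equation (1) with a classical fourth-order Runge-Kutta scheme and converts the conversion back to a glucose concentration by inverting equation (2). The integrator choice, the step count, and the values of a and c are assumptions made only for illustration (b = 8.75 is the value quoted from Gonzalez-Tello et al.); the paper does not state which solver was used.

```python
import math

def rate(x, a, b, c):
    """Right-hand side of equation (1): dx/dt = a (c + x) exp(-b x)."""
    return a * (c + x) * math.exp(-b * x)

def integrate_conversion(a, b, c, x0, t_end, n_steps=1000):
    """Integrate equation (1) from t = 0 to t_end with classical RK4.

    Returns a list of (t, x) pairs. RK4 is an assumed solver choice.
    """
    h = t_end / n_steps
    x = x0
    profile = [(0.0, x)]
    for i in range(1, n_steps + 1):
        k1 = rate(x, a, b, c)
        k2 = rate(x + 0.5 * h * k1, a, b, c)
        k3 = rate(x + 0.5 * h * k2, a, b, c)
        k4 = rate(x + h * k3, a, b, c)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        profile.append((i * h, x))
    return profile

def conversion_to_glucose(x, s0):
    """Invert equation (2): G = 1.1 * S0 * x."""
    return 1.1 * s0 * x

# Illustrative parameter values only; they are not fitted to the paper's data.
profile = integrate_conversion(a=0.5, b=8.75, c=0.01, x0=0.0, t_end=10.0)
```

Because the rate in equation (1) stays positive whenever x > -c, the computed conversion rises monotonically and flattens as exp(-bx) decays.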

Least Squares Problem

In the present work, a nonlinear least-squares formulation was utilized to estimate the parameters a, b, and c of equation (5). A MATLAB statistical toolbox (version 5.3) was utilized to implement the parameter estimation. Suppose that some variable y is thought to depend upon a variable t through a formula of the form

$$y = Y(t; x), \tag{6}$$

where Y is a known function of t and of a vector x of n parameters. A nonlinear least-squares problem is

$$\text{Minimize } f(x) = R(x)^T R(x) = \sum_{i=1}^{m} r_i(x)^2 = \sum_{i=1}^{m} \left[Y(t_i; x) - y_i\right]^2 \quad (x \in \mathbb{R}^n), \tag{7}$$

where m > n, the residual function R is nonlinear in x, $r_i(x)$ denotes the ith component function of R(x), and $y_i$ is the experimental data. If Y is a linear function of x, equation (7) is a linear least-squares problem. In the least-squares problem, one tries to fit experimental data with a model Y(t; x) by estimating x so that the residuals are minimized. Numerical methods such as Newton and Gauss-Newton can be used to solve equation (7). In our study, the estimated parameters are a, b, and c. The Gauss-Newton method was used to calculate the estimated parameters because this method deals only with the first derivatives of the residuals. This offers a substantial computational saving compared with the Newton method, which involves the calculation of the second derivatives of the residuals.

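The Hessian matrix H of f(x) has the elements

$$H_{kj} = \sum_{i=1}^{m} \left[\left(\frac{\partial r_i}{\partial x_k}\right)\left(\frac{\partial r_i}{\partial x_j}\right) + r_i \frac{\partial^2 r_i}{\partial x_k \partial x_j}\right]. \tag{8}$$

The basis of the Gauss-Newton method is to approximate H by neglecting the second derivatives of the residuals. The resulting approximation to H can be written as

$$H \cong S^T S, \tag{9}$$

where

$$S_{ik} = \frac{\partial r_i}{\partial x_k}. \tag{10}$$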

The matrix S is called the Jacobian of the residuals. The algorithm of the Gauss-Newton method involves the following steps:

1. Assume initial guesses for the parameter vector.

2. Evaluate the Jacobian matrix S in equation (9).

3. Calculate the correction vector $\Delta x$

$$\Delta x = -H^{-1} g, \tag{11}$$

where

$$g_k = \sum_{i=1}^{m} \frac{\partial r_i}{\partial x_k}\, r_i. \tag{12}$$

4. Repeat steps 2 and 3 until the correction vector has been reduced to some error goal and f(x) does not change appreciably.
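The steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the toolbox routine used in the study: the two-parameter saturating curve in `model` is a hypothetical stand-in for equation (5), the Jacobian is approximated by forward differences, the data are synthetic and noise-free, and the stopping rule checks only the size of the correction vector.

```python
import math

def model(t, x):
    # Hypothetical saturating curve, used only to exercise the algorithm.
    return x[0] * (1.0 - math.exp(-x[1] * t))

def residuals(x, ts, ys):
    # r_i(x) = Y(t_i; x) - y_i, as in equation (7).
    return [model(t, x) - y for t, y in zip(ts, ys)]

def jacobian(x, ts, ys, h=1e-6):
    # S[i][k] = d r_i / d x_k (equation (10)), by forward differences.
    r0 = residuals(x, ts, ys)
    S = [[0.0] * len(x) for _ in r0]
    for k in range(len(x)):
        xp = list(x)
        xp[k] += h
        rk = residuals(xp, ts, ys)
        for i in range(len(r0)):
            S[i][k] = (rk[i] - r0[i]) / h
    return S

def solve_linear(A, b):
    # Gaussian elimination with partial pivoting for a small dense system.
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for j in range(col, n + 1):
                M[r][j] -= f * M[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def gauss_newton(x0, ts, ys, tol=1e-9, max_iter=50):
    x = list(x0)                      # step 1: initial guess
    n = len(x)
    for _ in range(max_iter):
        r = residuals(x, ts, ys)
        S = jacobian(x, ts, ys)       # step 2: Jacobian of the residuals
        m = len(r)
        H = [[sum(S[i][k] * S[i][j] for i in range(m)) for j in range(n)]
             for k in range(n)]       # equation (9): H ~ S^T S
        g = [sum(S[i][k] * r[i] for i in range(m)) for k in range(n)]  # eq. (12)
        dx = solve_linear(H, [-gk for gk in g])  # step 3: H dx = -g, eq. (11)
        x = [xi + di for xi, di in zip(x, dx)]
        if max(abs(di) for di in dx) < tol:      # step 4: stop on small correction
            break
    return x

# Synthetic, noise-free data generated from known parameters (2.0, 0.5).
ts = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [model(t, [2.0, 0.5]) for t in ts]
estimate = gauss_newton([1.5, 0.4], ts, ys)
```

The sketch has no step-length control, so, as the paper notes for the empirical model, a poor initial guess can prevent convergence.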


NEURAL NETWORK MODEL

A feed forward neural network (FFNN) is one of the most important and widely used neural network models, which was first introduced by Minsky and Papert (1969). In the most simplified representation of an FFNN (Billings et al. 1992), nodes are arranged in three layers: input, hidden, and output. Each connection has a weight associated with it. The hidden and output layers have their own specific activation functions, such as the log sigmoid and hyperbolic tangent functions. The input layer accepts input signals and redistributes these signals to all neurons (nodes) in the hidden layer. The input layer does not contain computing nodes; thus, it does not process input patterns. The output layer accepts output signals from the hidden layer and establishes the output pattern of the entire network. Depending on the strength of the interconnections (i.e., the magnitude of the weight factor), those signals can excite or inhibit the neurons.

FIGURE 1 A two-layer feed forward network, showing the notation for nodes and weights.

Figure 1 shows a two-layer neural network. The activations of the output units are denoted by $a_i$, hidden units by $V_j$, and input terminals by $p_k$. There are connections $w_{jk}$ from the inputs to the hidden units, and $W_{ij}$ from the hidden units to the output units. Since a bias can be interpreted as a weight acting on an input clamped to one, the joint description "weight" will most often be applied to cover both weights and biases. The index i refers to an output unit, j to a hidden one, and k to an input terminal. Hidden unit j receives a net input

$$h_j = \sum_{k} w_{jk}\, p_k, \tag{13}$$

and produces output

$$V_j = f(h_j) = f\left(\sum_{k} w_{jk}\, p_k\right), \tag{14}$$

where f( ) is the activation function. Output unit i thus receives a net input

$$h_i = \sum_{j} W_{ij} V_j = \sum_{j} W_{ij}\, f\left(\sum_{k} w_{jk}\, p_k\right). \tag{15}$$

The calculated output of the neural network takes the form

$$a_i = f(h_i) = f\left(\sum_{j} W_{ij} V_j\right) = f\left(\sum_{j} W_{ij}\, f\left(\sum_{k} w_{jk}\, p_k\right)\right). \tag{16}$$
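The forward pass of equations (13) to (16) can be sketched in a few lines of Python. The function and variable names are illustrative, the log sigmoid is used for both layers as in the 4-5-1 network described in the paper, and the weight and input values are placeholders rather than trained or measured values. Biases are not shown; following the text, they could be handled by appending a constant input of one.

```python
import math

def logsig(h):
    """Log sigmoid activation: f(h) = 1 / (1 + exp(-h))."""
    return 1.0 / (1.0 + math.exp(-h))

def forward(p, w, W, f=logsig):
    """Forward pass of a two-layer feed forward network.

    p: input values p_k
    w: hidden-layer weights, w[j][k] connects input k to hidden unit j
    W: output-layer weights, W[i][j] connects hidden unit j to output i
    Returns (a, V): output activations a_i and hidden activations V_j.
    """
    # Equations (13)-(14): V_j = f(sum_k w_jk p_k)
    V = [f(sum(w_jk * p_k for w_jk, p_k in zip(w_j, p))) for w_j in w]
    # Equations (15)-(16): a_i = f(sum_j W_ij V_j)
    a = [f(sum(W_ij * V_j for W_ij, V_j in zip(W_i, V))) for W_i in W]
    return a, V

# A 4-5-1 layout with placeholder (all-zero) weights and placeholder inputs.
w = [[0.0] * 4 for _ in range(5)]
W = [[0.0] * 5]
output, hidden = forward([0.2, 0.1, 0.7, 20.0], w, W)
```

With all weights at zero every net input is zero, so each unit returns logsig(0) = 0.5, which gives a quick sanity check of the wiring.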

The neural network design procedure includes four steps: (1) data collection, (2) model structure selection, (3) training or model estimation, and (4) model validation. The first step is to collect sets of data that describe how the process behaves over its entire range of operation. In the training step, for each neural network structure that is being considered, an optimal set of weights is determined using a training algorithm such as back-propagation. Finally, the estimated model is evaluated to determine whether it represents the process adequately.

Optimizing a neural network model involves varying many parameters, such as the size of the architecture (number of layers and number of nodes per layer), the types of activation functions, the size of the training set, and the training algorithm. When constructing a neural network model, the problem of choosing a network of the correct size for the task is commonly faced. With too few nodes (also called neurons), the network may not be powerful enough for learning. The developed model will be inadequate and will not be able to represent the underlying process. With a large number of nodes (and connections), computation becomes expensive and requires a longer training time (Geman et al. 1992). In addition, an oversized neural network may perform poorly on new test samples, resulting in a generalization problem. Because the error surface of the network usually has more than one local minimum, it is recommended to always train the network 5–7 times with different initializations of the weights (Norgaard et al. 2000).

Model Development

This work focuses on one specific topology, the feed forward network with time delay as depicted in Figure 2. The model predicts the value of the glucose concentration at the next instant G(t+1) using the information for the present G(t) and past values G(t−1) of the glucose concentration, the present enzyme dosage EN(t), and the initial dry solid concentration DS. From a theoretical model (physical model) of the starch hydrolysis, the process can be approximated by a first order system (Kusunoki et al. 1982). Thus, G(t−1) was also considered as an input variable to the network. One hidden layer was chosen in this study because it has been proved by Cybenko (1989) that one hidden layer is enough to approximate any continuous function. Based on our previous study (Rashid et al. 2003), the best structure of the feed forward neural network model for the tapioca starch hydrolysis process is 4-5-1 (four input nodes, five hidden nodes, and one output node) with log sigmoid activation functions in the hidden and output layers. A MATLAB neural networks toolbox (version 5.3) was used for modeling the process.
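As an illustration of how such time-delay patterns might be assembled from one experimental run, the sketch below builds the four inputs and the one-step-ahead target from a sampled glucose series. The function name, the list-based layout, and the sample values are assumptions for illustration; any scaling of inputs and targets is omitted because the text does not describe it.

```python
def make_patterns(glucose, enzyme, dry_solid):
    """Build time-delay training patterns from one experimental run.

    glucose:   sampled glucose concentrations G(0), G(1), ...
    enzyme:    enzyme dosage EN(t) at the same sampling instants
    dry_solid: initial dry solid concentration DS (constant per run)
    Returns (inputs, targets) with inputs [G(t), G(t-1), EN(t), DS]
    and targets G(t+1).
    """
    inputs, targets = [], []
    # t starts at 1 so that G(t-1) exists; it stops one sample early
    # so that the target G(t+1) exists.
    for t in range(1, len(glucose) - 1):
        inputs.append([glucose[t], glucose[t - 1], enzyme[t], dry_solid])
        targets.append(glucose[t + 1])
    return inputs, targets

# Made-up numbers, only to show the shape of the result.
inputs, targets = make_patterns(
    glucose=[0.0, 1.0, 2.0, 3.0, 4.0],
    enzyme=[0.7, 0.7, 0.7, 0.7, 0.7],
    dry_solid=20.0,
)
```

A run with N samples yields N - 2 patterns, since the first sample has no predecessor and the last has no successor.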

Model assessment by one-step-ahead (OSA) prediction was used. The one-step-ahead prediction at time instant k is defined as

$$\hat{y}_{OSA}(k) = \hat{F}\left[y(k-1), \ldots, y(k-n_y), u(k-1), \ldots, u(k-n_u), e(k-1), \ldots, e(k-n_e)\right], \tag{17}$$

where $\hat{F}$ is the approximate model for the process, y is the measured (observed) output, u denotes the input, and e is the residual. This method of validation is also called cross-validation: the data set not used for training (the test set or validation set) was utilized, and error index values were calculated to measure the generalization capability of the model.

Training Algorithm

Training algorithms are employed to determine the optimal set of weights. A back-propagation algorithm is commonly used for the updating (training) of the weights of a multilayer neural network, and random initialization of the weight parameters is utilized. This back-propagation learning mechanism is a generalization of the delta rule, similar to the LMS (least mean squared error) learning algorithm, and it is considered a supervised learning method. For a small or moderate total number of network weights, the most efficient calculation procedure adopted for learning is the Levenberg-Marquardt method, which is based on nonlinear optimization (Molga 2003).

FIGURE 2 Feed forward neural networks model with time delay for tapioca starch hydrolysis.

In this work, the Levenberg-Marquardt algorithm was employed instead of the back-propagation algorithm. It gives fast convergence (Rashid et al. 2002), it is robust and simple to implement, and it does not involve any peculiar design parameter initialization (Norgaard et al. 2000). Furthermore, the algorithm is very efficient when training networks that have up to a few hundred weights (Hagan and Menhaj 1994). The Levenberg-Marquardt algorithm is basically a Hessian-based algorithm for nonlinear least squares optimization. For neural network training, the objective function is an error function of the type

$$e(\theta) = \frac{1}{2} \sum_{k=0}^{p-1} \sum_{l=0}^{n-1} (d_{kl} - a_{kl})^2, \tag{18}$$

where $a_{kl}$ is the actual (calculated) output at the output neuron l for the input k and $d_{kl}$ is the desired (target) output at the output neuron l for the input k. p is the total number of training patterns, and n represents the total number of neurons in the output layer of the network. $\theta$ represents the weights and biases of the network. The weights and biases update vector $\Delta\theta$ is calculated as

$$\Delta\theta = \left[J^T J + \mu I\right]^{-1} J^T e, \tag{19}$$

where J is the Jacobian matrix, which contains the first derivatives of the network errors with respect to the weights and biases, I is the identity matrix, e is a vector of network errors, and $\mu$ is an adaptive learning rate. If the learning rate is decreased to zero, the algorithm becomes Gauss-Newton. The steps involved in training a neural network using the Levenberg-Marquardt algorithm are as follows:

1. Compute the corresponding network outputs and the mean square error over all inputs as in equation (18).

2. Compute the Jacobian matrix, $J(\theta)$, that is, the first derivative of the error with respect to the weights and biases of the network.

3. Solve equation (19) to obtain the update vector $\Delta\theta$.


4. Recompute the error using $\theta + \Delta\theta$. If this new error is smaller than the error computed in step 1, then reduce the training parameter $\mu$ by $\mu_{low}$, let $\theta = \theta + \Delta\theta$, and go back to step 1. If the error is not reduced, then increase $\mu$ by $\mu_{high}$ and go back to step 3. $\mu_{low}$ and $\mu_{high}$ are predefined values.

5. The algorithm is assumed to have converged when the error has been reduced to some error goal.
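The five steps can be sketched as the loop below. This is a hedged illustration, not the toolbox implementation used in the study. Several choices are assumptions: a hypothetical two-parameter curve (`predict`) stands in for the network output, the Jacobian is taken as the derivative of the model outputs (by forward differences) with errors defined as target minus output so that an update in the form of equation (19) reduces the error, the damping parameter (`mu` in the code) is changed multiplicatively by the factors `mu_low` and `mu_high`, and a safeguard stops the search if the damping grows without any improvement.

```python
import math

def predict(theta, t):
    # Hypothetical two-parameter model standing in for the network output.
    return theta[0] * (1.0 - math.exp(-theta[1] * t))

def errors(theta, ts, ds):
    # Error vector e: desired output minus calculated output.
    return [d - predict(theta, t) for t, d in zip(ts, ds)]

def half_sse(e):
    # Objective in the form of equation (18).
    return 0.5 * sum(v * v for v in e)

def output_jacobian(theta, ts, h=1e-6):
    # J[i][k] = d a_i / d theta_k, approximated by forward differences.
    J = []
    for t in ts:
        base = predict(theta, t)
        row = []
        for k in range(len(theta)):
            tp = list(theta)
            tp[k] += h
            row.append((predict(tp, t) - base) / h)
        J.append(row)
    return J

def solve2(A, b):
    # Cramer's rule for a 2x2 linear system.
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - b[0] * A[1][0]) / det]

def levenberg_marquardt(theta0, ts, ds, mu=0.01, mu_low=0.1, mu_high=10.0,
                        goal=1e-12, max_epochs=200):
    theta = list(theta0)
    e = errors(theta, ts, ds)                      # step 1
    for _ in range(max_epochs):
        if half_sse(e) <= goal:                    # step 5: error goal reached
            break
        J = output_jacobian(theta, ts)             # step 2
        JTJ = [[sum(r[a] * r[b] for r in J) for b in range(2)] for a in range(2)]
        JTe = [sum(r[a] * ei for r, ei in zip(J, e)) for a in range(2)]
        while True:
            A = [[JTJ[a][b] + (mu if a == b else 0.0) for b in range(2)]
                 for a in range(2)]
            step = solve2(A, JTe)                  # step 3: equation (19)
            trial = [th + s for th, s in zip(theta, step)]
            e_trial = errors(trial, ts, ds)        # step 4: recompute the error
            if half_sse(e_trial) < half_sse(e):
                theta, e, mu = trial, e_trial, mu * mu_low   # accept, relax damping
                break
            mu *= mu_high                          # reject, increase damping
            if mu > 1e10:                          # assumed safeguard
                return theta
    return theta

# Synthetic, noise-free targets generated from known parameters (2.0, 0.5).
ts = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ds = [predict([2.0, 0.5], t) for t in ts]
theta_hat = levenberg_marquardt([1.5, 0.4], ts, ds)
```

With a small damping value the step is close to a Gauss-Newton step, while a large value shrinks it toward a short gradient-descent step, which is what makes the method more tolerant of poor starting points.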

In this study, training was terminated when the maximum number of epochs was reached or when the mean square error (MSE) of the training set reached the goal of 0.0001. Each run was repeated five times, each time starting with a different set of initial random weights, to assure the reproducibility of the results.

RESULTS AND DISCUSSION

The goal of this analysis is to determine whether the empirical modeling resulted in a comparable or better model than the FFNN model. The predictive performances of both models were compared and analyzed. The statistic used to measure the prediction performance of both models is the error index (EI), defined as

$$EI = \sqrt{\frac{\sum \left(\hat{y}(k) - y(k)\right)^2}{\sum y^2(k)}} = \sqrt{\frac{\sum e^2(k)}{\sum y^2(k)}}, \tag{20}$$

where e is the residual between the estimated and observed values of glucose concentration, $\hat{y}$ is the estimated value, and y is the observed value of glucose concentration.
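Equation (20) translates directly into code. The short Python sketch below computes the error index as a fraction; the function name and the numbers in the example are made up for illustration. Multiplying the result by 100 gives the percentage form in which EI is reported in the tables.

```python
import math

def error_index(y_pred, y_obs):
    """Error index of equation (20): sqrt(sum(e^2) / sum(y^2))."""
    num = sum((yp - yo) ** 2 for yp, yo in zip(y_pred, y_obs))
    den = sum(yo ** 2 for yo in y_obs)
    return math.sqrt(num / den)

# Made-up values: squared residuals sum to 0.25, observations to 25.
ei = error_index([3.0, 4.5], [3.0, 4.0])
ei_percent = 100.0 * ei
```

Because the residual sum is normalized by the observed signal energy, EI is dimensionless and allows runs with different glucose levels to be compared on the same scale.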

Results from the Empirical Approach

The least squares method was used to determine the optimum values of a, b, and c. This method minimizes the residuals (the differences between calculated and observed values). Table 1 presents the predictive accuracy in terms of the error index (EI) and the values of the estimated parameters of the empirical model. The table shows that the estimated values of a range from 0.09 to about 46,000, the estimated values of b range from 25 to 459, and the estimated values of c range from −0.3 to 3.5. For different operating conditions, the coefficients of the model need to be determined separately, and extensive trial-and-error guessing of the initial parameters was involved in order to ensure convergence and to avoid complex solutions. Thus, for each operating condition, many runs were required to estimate the parameters, and only the one that produces the best predictive accuracy is presented here. As can be seen in Table 1, some of the estimated values (such as a equal to 28,123 or 45,561) are so large that they do not appear to be feasible solutions for the process under study.

Comparisons between the predicted (solid line) and the experimental values of glucose concentration at different operating conditions are shown in Figures 3 to 8, and each figure corresponds to a different set of data. Those figures indicate that the models do not capture the relevant features of the glucose concentration variation at the end of the reaction time. One possible explanation for the poor agreement between the model and the experimental values at the end of the reaction is the failure of the model to account for end-product inhibition. In this study, two trends of the prediction curve are observed. The first trend is displayed in Figures 3, 5, and 6, and the second trend is presented in Figures 4, 7, and 8. Here, it is interesting to note that for large values of parameter a, the models produce the second trend of the prediction curve. The results also demonstrate that different sets of data produce different models, and the models cannot be generalized.

TABLE 1 Prediction Accuracy and Estimated Parameters of the Empirical Model

Exp. no.     Initial dry solid (% w/v)   Enzyme dosage (L/ton)   EI (%)   Estimated a   Estimated b   Estimated c
Set 1        20                          0.95                    3.67     0.95          150.00        −0.022
Set 2        30                          0.95                    4.47     0.09          164.98        −0.003
Set 3        40                          1.2                     0.89     1599.50       26.50         −0.300
Set 4        10                          1.2                     3.15     28123.00      240.50        −0.005
Set 5        40                          0.7                     3.06     1.80          28.79         −0.134
Set 6        30                          0.7                     2.77     11.67         350.91        −0.014
Test set 1   20                          0.7                     0.85     45561.00      390.00        0.000
Test set 2   20                          1.2                     6.83     30000.00      459.00        −0.020
Test set 3   10                          0.7                     2.77     25660.00      25.00         3.500

FIGURE 3 Prediction curve of set 3. Initial dry solid = 40% w/v, enzyme dosage = 1.2 L/ton.

In many cases, the empirical model is able to achieve prediction performance with less than 5% error (refer to Table 1). However, very poor prediction performance (EI of 6.83%) is observed, as depicted in Figure 7. This is due to the inability of the model to deal with noise or inconsistent data. A weakness of this method for solving the coefficients of the nonlinear equation is that accurate initial values of a0, b0, and c0 are needed to ensure convergence. In addition, the wrong starting values of the coefficients may lead to a solution of complex numbers that has no physical meaning in the process under study.




Results from Neural Network Approach

Results from the previous section have demonstrated that the empirical model is good only for modeling the prediction curve of a particular data set. Thus, model assessment for generalization capability by cross-validation is not possible for the empirical modeling approach. It is shown here that the feed forward neural network model has a distinctive advantage over the empirical model: only one FFNN model is required to model the predictive curves for all data sets with different initial conditions.

In this study, a feed forward neural network with a time delay structure of 4-5-1 was employed. Six hundred and thirty points were used for parameter estimation, and 189 points were used for assessing the generalization capability of the model by cross-validation. The model predicts the value of the glucose concentration at the next instant G(t+1) using the information for the present G(t) and past values G(t−1) of the glucose concentration, the present enzyme dosage EN(t), and the initial dry solid concentration DS. One-step-ahead prediction plots are presented here.

The predictive accuracies of the FFNN model and the empirical model on the training set are compared in Table 2. The overall prediction error of the FFNN model is 1.77% and that of the empirical model is 3%. In most cases, as shown in Table 2, the FFNN model gave a better fit to the measured data than the empirical model. As can be seen from Figures 4 to 8, the FFNN model gives significantly better predictive performance than the empirical model. Although the FFNN model of data set 3 gives a slightly worse performance index value (see Table 2), comparison of the predictive curves in Figure 3 indicates that the FFNN model predicts the final glucose concentration better than the empirical model.

The results of cross validation of the FFNN model are tabulated in Table 3. Test sets were used here to assess the generalization capability of the model. For these test sets, prediction errors ranged from 1.10% to 1.92% and the overall prediction error is 1.39%. Scatter plots of the predictions on the test sets in Figures 7 and 8 demonstrate good modeling accuracy and generalization ability of the FFNN model. For different sets of data, the FFNN model is still able to produce a good glucose concentration profile. Only one set of coefficients (weights) was employed, unlike the empirical model, where different coefficients need to be determined every time the operating condition changes or for different data sets. This clearly demonstrates that the FFNN model has a generalization capability, as illustrated in Figures 7 and 8.




FIGURE 7 Prediction curve of test set 2. Initial dry solid = 20% w/v, enzyme dosage = 1.2 L/ton.

FIGURE 8 Prediction curve of test set 3. Initial dry solid = 10% w/v, enzyme dosage = 0.7 L/ton.


CONCLUSIONS

The present work was undertaken to compare the performance of the empirical model with that of the FFNN model of the tapioca starch hydrolysis process. A modified version of the empirical model developed by Gonzalez-Tello et al. (1996) was employed, and an FFNN model with a time delay structure of 4-5-1 was utilized in this study. The model performances were evaluated based on the error index values and graphical representation.

A deficiency observed when using the empirical model to model tapioca starch hydrolysis is its lack of generalization capability. For different operating conditions (different data sets), the coefficients of the empirical model need to be determined again. Thus, an exhaustive guess of the starting values was made to guarantee the optimum solutions. If the guessed values are far from the solutions, the simulation will not converge or complex number solutions are obtained. By contrast, the FFNN model in this study is capable of accurately estimating glucose concentration for different operating conditions. In addition, the FFNN approach results in only one model, whereas the empirical modeling approach requires separate models to predict the glucose profile for different initial conditions. Moreover, the FFNN model is able to deal with noise or inconsistent data.

Upon analyzing the results, it becomes evident that the FFNN model presented here is a better model than the empirical model used for predicting the glucose concentration of tapioca starch hydrolysis under different operational conditions. Furthermore, the FFNN model has a generalization capability, so it can produce good estimates even for a different data set with different initial conditions.

TABLE 2 Prediction Performances of the Neural Network and Empirical Models on the Training Set

Exp. no.    Initial dry solid (% w/v)    Enzyme dosage (L/ton)    EI for neural network (%)    EI for empirical (%)
Set 1       20                           0.95                     2.40                         3.67
Set 2       30                           0.95                     0.88                         4.47
Set 3       40                           1.2                      1.17                         0.89
Set 4       10                           1.2                      2.61                         3.15
Set 5       40                           0.7                      1.19                         3.06
Set 6       30                           0.7                      2.38                         2.77
Overall prediction error                                          1.77                         3.00

TABLE 3 Prediction Accuracy of the Neural Network Model on the Test Set

Exp. no.    Initial dry solid (% w/v)    Enzyme dosage (L/ton)    EI for neural network (%)    EI for empirical (%)
Test set 1  20                           0.7                      1.10                         0.85
Test set 2  20                           1.2                      1.92                         6.83
Test set 3  10                           0.7                      1.14                         2.77
Overall prediction error                                          1.39                         3.48

REFERENCES

Akerberg, C., G. Zacchi, N. Torto, and L. Gorton. 2000. A kinetic model for enzymatic wheat starch saccharification. Journal of Chemical Technology and Biotechnology 75:306–314.

Billings, S. A., H. B. Jamaluddin, and S. Chen. 1992. Properties of neural networks with applications to modeling non-linear dynamical systems. International Journal of Control 55(1):193–224.

Chaudhuri, B. and J. M. Modak. 1998. Optimization of fed-batch bioreactor using neural network model. Bioprocess Engineering 19:71–79.

Cybenko, G. 1989. Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signals and Systems 2:303–314.

Di Massimo, C., G. A. Montague, M. J. Willis, M. T. Tham, and A. J. Morris. 1992. Towards improvement penicillin fermentation via artificial neural networks. Comp. Chem. Eng. 16:283–291.

Geman, S., E. Bienenstock, and R. Doursat. 1992. Neural networks and the bias/variance dilemma. Neural Computation 4:41–58.

Gonzalez-Tello, P., F. Camacho, E. Jurado, and E. M. Guadix. 1996. A simple method for obtaining kinetic equations to describe the enzymatic hydrolysis of biopolymers. Journal of Chem. Tech. Biotechnol. 67:286–329.

Hagan, M. T. and M. B. Menhaj. 1994. Training feedforward networks with the Marquardt algorithm. IEEE Transactions on Neural Networks 5:989–993.

Huang, W., R. Mithani, K. Takahashi, and L. T. Fan. 1993. Modular neural networks for identification of starches in manufacturing food products. Biotechnol. Prog. 9:401–410.

Kurtanjek, Z. 1994. Modeling and control by artificial neural network in biotechnology. Comp. Chem. Eng. 18:627.

Kusunoki, K., K. Kawakami, and F. Shiraishi. 1982. A kinetic expression for hydrolysis of soluble starch by glucoamylase. Biotechnology and Bioengineering 24:347–354.

Linko, P. and Y. Zhu. 1992. Neural network programming in bioprocess estimation and control. In: Modeling and Control of Biotechnological Process, eds. M. N. Karim and G. Stephanopoulos, 163–163. Oxford: Pergamon Press.

Linko, S., J. Luopa, and Y. H. Zhu. 1997. Neural networks as software sensors in enzyme production. Journal of Biotechnology 52(3):257–266.

Minsky, M. L. and S. A. Papert. 1969. Perceptrons. Cambridge, MA: The MIT Press.

Molga, E. J. 2003. Neural network approach to support modelling of chemical reactors: Problems, resolutions, criteria of application. Chemical Engineering and Processing 42:675–695.

Montague, G. A., A. J. Morris, and M. T. Tham. 1992. Enhancing bioprocess operability with generic software sensors. Journal of Biotechnology 25(1–2):183–201.

Norgaard, M., O. Ravn, N. K. Poulsen, and L. K. Hansen. 2000. Neural Networks for Modelling and Control of Dynamic Systems. London: Springer.

Paolucci-Jeanjean, D., M.-P. Belleville, N. Zakhia, and G. M. Rios. 2000. Kinetics of cassava starch hydrolysis with termamyl enzyme. Biotechnology and Bioengineering 68:71–77.

Petrova, M., P. Koprinkova, T. Patarinska, and M. Bliznakova. 1998. Neural network modelling of fermentation processes. Bioprocess Engineering 18:281–287.

Rashid, R., H. Jamaluddin, and N. A. Saidina Amin. 2002. Comparison of four schemes of training algorithms for neural network modeling of tapioca starch hydrolysis. In Proceedings of the Second World Engineering Congress, pages 300–304, Kuching, Sarawak, July 22–25.

Rashid, R., H. Jamaluddin, and N. A. Saidina Amin. 2003. Modeling of tapioca starch hydrolysis process via neural network approach. Biochemical Engineering Journal. (under review).

Schenck, F. W. and R. E. Hebeda. 1992. Starch Hydrolysis Products Worldwide Technology, Production, and Applications. New York: VCH Publishers, Inc.


Syu, M.-J. and C.-B. Chang. 1999. Experimental studies of network parameters and operational variables on recurrent backpropagation neural network adaptive control of penicillin acylase fermentation by Arthrobacter viscosus. Bioprocess Engineering 21:69–76.

Thibault, J., G. Acuna, R. Perez-Correa, H. Jorquera, P. Molin, and E. Agosin. 2000. A hybrid representation approach for modeling complex dynamic bioprocesses. Bioprocess Engineering 22:547–556.

Thibault, J., V. Van Breusegem, and A. Cheruy. 1990. On line prediction of fermentation variables using neural networks. Biotechnol. Bioeng. 36(10):1041–1048.

Willis, M. J., G. A. Montague, M. T. Tham, and A. J. Morris. 1991. Artificial neural networks in process engineering. IEEE Proc. Pt. D. 138:256–266.

Yang, Y. Y. and D. A. Linkens. 1994. Adaptive neural-network-based approach for the control of continuously stirred tank reactor. IEE Proc.-Control Theory Appl. 141:341–349.

Zanin, G. M. and F. F. De Moraes. 1996. Modeling cassava starch saccharification with amyloglucosidase. Appl. Biochem. Biotechnol. 57/58:617–625.

Zhu, Y.-H. 1995. Neural Network Applications in Bioprocess Engineering. Teknillinen korkeakoulu, Finland: DRTECHN.


Dow

nloa

ded

by [

Uni

vers

ity O

f M

aryl

and]

at 0

7:13

17

Oct

ober

201

4

Documents

EMPIRICAL AND FEED FORWARD NEURAL NETWORKS MODELS OF TAPIOCA STARCH HYDROLYSIS