

PREDICTING HAZ HARDNESS WITH ARTIFICIAL NEURAL NETWORKS

BILLY CHAN,† MALCOLM BIBBY† and NEAL HOLTZ‡

†Department of Mechanical & Aerospace Engineering, Carleton University, Ottawa, Ontario, K1S 5B6, Canada
‡Department of Civil Engineering, Carleton University, Ottawa, Ontario, K1S 5B6, Canada

Abstract - The use of artificial neural networks (backpropagation networks) for predicting heat-affected zone hardness, given the 800-500°C cooling time and chemical composition, is investigated in this study. The experimental training data are taken from a database assembled by Yurioka et al. Network predicted hardness values are compared with experimental values from the entire Yurioka database and reasonable agreement is found (correlation factor = 0.98). The network results are also compared with values calculated from the regression relationships of Yurioka and Suzuki based on the same database. Finally, an optimal network architecture (1 hidden layer, 4 hidden nodes and 40 training patterns) is suggested.


1. INTRODUCTION

Heat-affected zone (HAZ) hardness is often used by welding engineers to assess the quality of a weld. During the welding of carbon steels, hard microstructures form in the HAZ because of relatively high cooling rates. This can and often does lead to so-called 'cold cracking' in this region, which is dangerous in practice. Because of this, HAZ hardening has been extensively studied in the past twenty years [1]. For those studies that purport to be able to predict the hardening level, the 800-500°C cooling time and base metal (steel) chemical composition is usually correlated to the HAZ hardness by means of regression analyses.

The major objective of this study is to explore a new technique for predicting HAZ hardness. Recent developments in artificial neural network (ANN) concepts provide an attractive alternative to conventional regression techniques. In this investigation, the ANN backpropagation network technique is used. Experimental data taken from the work of Yurioka et al. [2] are used as the basis for the study. One hundred and forty HAZ hardness values are reported for 11 low-alloy steels covering a wide range of materials used in practice and with 800-500°C cooling times ranging upward from 3 s. Regression relationships have been generated by Yurioka [2] and Suzuki [3] based on this database. It is therefore convenient to use this in the current study so that ANN can be compared with regression methods.

The backpropagation network technique is well known for its ability to handle pattern recognition and data mapping problems [4]. The principle of ANN is that a network is generated that can respond to a specific physical problem. It is then trained with a data set; in this case the authors selected 30-50 experimental values from the Yurioka database. As the network trains, the nodes in the hidden layers organize themselves in such a way that different nodes learn to recognize different features from the total input space. When the trained network is presented with a new input (beyond training), the network responds according to the knowledge it acquired during training.

2. ARTIFICIAL NEURAL NETWORKS

Artificial neural network studies represent a major field of interest in current artificial intelligence research. Researchers of ANN are trying to understand and simulate the human nervous system, and to adapt this knowledge to solve physical problems [5]. Unlike most of the other AI technologies, ANN is based on a rigorous mathematical foundation. Knowledge is stored in the network as weights, which are distributed throughout the system. The ability to generalize a problem with ANN is an interesting feature, which allows the network to respond to inputs that it has never before experienced.

The ANN used in this study is a crude approximation of a sophisticated biological nervous system. Neurons are represented by processing "nodes". An ANN is built up of a collection of nodes connected together by weighted links (W).


Fig. 1. A one-hidden-layer artificial neural network: input layer (h), hidden layer (i) and output layer (j), joined by weighted links that carry the input signals through to the output responses.

Usually, nodes are organized in layers: the input layer nodes (h) that gather information from the outside world; the output layer nodes (j) that pass network responses to the outside world; and the hidden layer nodes (i) that manipulate the network's internal representation [6]. Usually there are no interconnections within a layer, as shown in Fig. 1.
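As a concrete illustration of this layered arrangement, the weighted links can be held in two matrices, one per pair of adjacent layers. The Python/NumPy sketch below uses hypothetical layer sizes (four input nodes, four hidden nodes, one output node) and is not taken from the authors' implementation.

```python
import numpy as np

# Hypothetical layer sizes: 4 input nodes (h), 4 hidden nodes (i), 1 output node (j).
n_h, n_i, n_j = 4, 4, 1

rng = np.random.default_rng(0)
w_hi = rng.uniform(-0.5, 0.5, (n_h, n_i))   # weighted links, input layer -> hidden layer
w_ij = rng.uniform(-0.5, 0.5, (n_i, n_j))   # weighted links, hidden layer -> output layer
# There are no connections within a layer, so these two matrices describe the whole network.
```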

Each node carries two mathematical functions: summation and transfer. The summation function is the method used to combine the weighted incoming information,

S_i = \sum_h W_{hi} O_h,    (1)

where node i receives the signals O_h over the weighted links W_{hi}. The outgoing signal depends on the value of the summed information (S) and a specific transfer function (F). Many mathematical formulations can be used as the transfer function, e.g. step functions, ramp functions, sigmoid functions and other S-shaped curves. In this study, the sigmoid function

F(S) = \frac{1}{1 + e^{-S}}    (2)

is used because both the function and its derivative are continuous, which is similar to biological response. Therefore, each node responds to a stimulus according to the connection weights. In theory, there is a set of weights that responds correctly to a specific domain of problems. The search for this specific set of weights is called network training or learning.
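To make equations (1) and (2) concrete, the following Python sketch computes a single node's response; the function names and the three-signal example are illustrative assumptions, not part of the original study.

```python
import numpy as np

def sigmoid(s):
    """Transfer function of eq. (2): continuous, with a continuous derivative."""
    return 1.0 / (1.0 + np.exp(-s))

def node_response(incoming, weights):
    """Summation (eq. 1) followed by the sigmoid transfer (eq. 2) for one node."""
    s = np.dot(weights, incoming)   # weighted sum of the incoming signals
    return sigmoid(s)

# Hypothetical example: a node receiving three normalized input signals.
inputs = np.array([0.1, 0.9, 0.5])
weights = np.array([0.2, -0.4, 0.7])
print(node_response(inputs, weights))
```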

In backpropagation learning [6], a set of desired input-output patterns is chosen for network training. The input signals are fed into the network with random initial weights (Fig. 2). Network errors (E),

E = \frac{1}{2} \sum_j \left(O^{d}_{j} - O^{o}_{j}\right)^{2},    (3)

are calculated by comparing the network output (O^o) and the desired output (O^d). The weights connecting the hidden and output layers are adjusted according to a change of weights (ΔW_ij):

\Delta W_{ij} = \eta\,\delta_j O_i + \alpha\,\Delta W_{ij}^{\mathrm{old}},    (4)

where

\delta_j = F(S_j)\left(1 - F(S_j)\right)\left(O^{d}_{j} - O^{o}_{j}\right).    (5)

Fig. 2. A backpropagation network: input signals pass through the input layer (h), hidden layer (i) and output layer (j) over the weighted links (W_hi, W_ij); the difference between the output responses and the desired output is backpropagated as a weight adjustment.

Adjustments are made according to the network errors (δ_j), the learning rate (η) and the momentum coefficient (α). The learning rate (η) can be viewed as the step size in the learning iteration. The larger the step size, the faster the network learns. However, a large step size may pose the danger of overlooking a specific set of weights. The momentum coefficient (α) helps the network continue in the direction of searching and hence reduces the training time. Both the learning rate and the momentum coefficient should have a value between 0 and 1 [6]. These errors are backpropagated to refine the weights connecting the input and hidden layers (ΔW_hi):

\Delta W_{hi} = \eta\,O_h \delta_i + \alpha\,\Delta W_{hi}^{\mathrm{old}},    (6)

where

\delta_i = F(S_i)\left(1 - F(S_i)\right) \sum_j \delta_j W_{ij}.    (7)
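A minimal sketch of one pattern presentation, implementing equations (4)-(7) for a one-hidden-layer network, is given below (Python/NumPy). The array names and the absence of bias terms are assumptions, and the default values η = 0.1 and α = 0.25 anticipate the settings reported in Section 3; the code is an illustrative reconstruction, not the authors' program.

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def backprop_step(x, target, w_hi, w_ij, dw_hi, dw_ij, eta=0.1, alpha=0.25):
    """One pattern presentation: forward pass, then weight changes per eqs (4)-(7).
    Returns the updated weights and the weight changes (kept for the momentum term)."""
    # Forward pass: input layer (h) -> hidden layer (i) -> output layer (j).
    o_i = sigmoid(w_hi.T @ x)          # hidden-node responses, eqs (1)-(2)
    o_j = sigmoid(w_ij.T @ o_i)        # network output O^o

    # Output-layer deltas, eq. (5).
    delta_j = o_j * (1.0 - o_j) * (target - o_j)
    # Hidden-layer deltas, eq. (7): output errors backpropagated through w_ij.
    delta_i = o_i * (1.0 - o_i) * (w_ij @ delta_j)

    # Weight changes with learning rate and momentum, eqs (4) and (6).
    dw_ij = eta * np.outer(o_i, delta_j) + alpha * dw_ij
    dw_hi = eta * np.outer(x, delta_i) + alpha * dw_hi
    return w_hi + dw_hi, w_ij + dw_ij, dw_hi, dw_ij
```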

After the first pattern has been exposed to the network, the network adjusts its weights accordingly. Then the 2nd, 3rd, ..., and the pth patterns are submitted to the network and the weights are adjusted after each exposure. The network has finished one epoch of learning when the whole set of patterns (1st to pth) has been provided to the network. Similar to biological learning, the network requires many epochs of learning before it yields an acceptable accuracy, the benchmark for which is determined by the root mean squared error for an epoch (RMS error):

\mathrm{RMS\ error} = \sqrt{\frac{1}{P\,J} \sum_{p=1}^{P} \sum_{j=1}^{J} \left(O^{d}_{pj} - O^{o}_{pj}\right)^{2}},    (8)

where P is the number of training patterns in the epoch and J the number of output nodes.
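An epoch loop with the RMS-error stopping rule of equation (8) could then be sketched as follows, reusing sigmoid and backprop_step from the previous listing. The random weight initialization range, the epoch cap and the pattern format (pairs of NumPy arrays) are assumptions; the 0.05 RMS-error threshold is the one quoted later in Section 4.

```python
import numpy as np

def train(patterns, n_hidden=4, rms_target=0.05, max_epochs=100_000, seed=0):
    """Train a one-hidden-layer backpropagation network on (input, output) pattern
    pairs until the epoch RMS error of eq. (8) falls below rms_target."""
    rng = np.random.default_rng(seed)
    n_in = patterns[0][0].size
    n_out = patterns[0][1].size
    w_hi = rng.uniform(-0.5, 0.5, (n_in, n_hidden))   # random initial weights
    w_ij = rng.uniform(-0.5, 0.5, (n_hidden, n_out))
    dw_hi = np.zeros_like(w_hi)
    dw_ij = np.zeros_like(w_ij)

    for epoch in range(max_epochs):
        sq_err = 0.0
        for x, target in patterns:                    # 1st ... pth pattern
            w_hi, w_ij, dw_hi, dw_ij = backprop_step(x, target, w_hi, w_ij,
                                                     dw_hi, dw_ij)
            o_j = sigmoid(w_ij.T @ sigmoid(w_hi.T @ x))
            sq_err += np.sum((target - o_j) ** 2)
        # Epoch RMS error, eq. (8): averaged over patterns and output nodes.
        rms = np.sqrt(sq_err / (len(patterns) * n_out))
        if rms < rms_target:
            break
    return w_hi, w_ij
```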

3. PREDICTION OF HAZ HARDNESS

As mentioned above, the original experimental data used for this study were taken from the work of Yurioka et al. [2]. Regression analyses were performed based on the hardness values, carbon equivalents (a quantitative measurement of the hardenability effect of alloying elements in steel) and the 800-500°C cooling time (t_8/5) by Yurioka [2] and subsequently by Suzuki [3].


In the present study, six backpropagation network arrangements are utilized for predicting the HAZ hardness based on a training subset of the original data from the Yurioka study. Four different backpropagation networks with various numbers of hidden layers and hidden nodes are studied in this investigation. The inputs to the network are the 800-500°C cooling time, the carbon content (C wt%) and the carbon equivalents (P_cm, proposed by Suzuki [3], and C_eq, proposed by Ito et al. [7]):

P_{cm} = C + \frac{Si}{30} + \frac{Mn + Cu + Cr}{20} + \frac{Ni}{60} + \frac{Mo}{15} + \frac{V}{10} + 5B    (9)

C_{eq} = C + \frac{Mn}{6} + \frac{Cu + Ni}{15} + \frac{Cr + Mo + V}{5}    (10)

where the element symbols denote contents in wt%.

Two different carbon equivalents are used because, as suggested by Suzuki [3], P_cm is superior to C_eq where the cooling is fast, while C_eq is a better indicator than P_cm for slow cooling conditions. Carbon content is also submitted to the networks as an independent parameter because carbon equivalents indicate the hardenability of the steel, while the maximum hardness is determined by the carbon content. Submitting all these parameters (C wt%, C_eq and P_cm) to the system may seem confusing; however, the network includes them all and weights them to best match the training data set.

According to equation (2), the result of F(S) must be a value between 0 and 1. Furthermore, the result will be 0 (or 1) if the summation is equal to negative (or positive) infinity. Therefore, it is logical to normalize all network inputs and outputs between limits such as 0.1 and 0.9, as selected for this study. Since the training time is not a major concern of this study, a conservative learning rate (0.1) and momentum coefficient (0.25) are used.
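For illustration, the input preparation described above might look like the following sketch: the two carbon equivalents of equations (9) and (10) are computed from a composition in wt% and each raw input is rescaled linearly onto the 0.1-0.9 interval. The example composition, the cooling time and the assumed minimum/maximum ranges are hypothetical and are not taken from the Yurioka database.

```python
def p_cm(c):
    """Carbon equivalent P_cm of eq. (9); c maps element symbols to wt%."""
    return (c['C'] + c['Si'] / 30 + (c['Mn'] + c['Cu'] + c['Cr']) / 20
            + c['Ni'] / 60 + c['Mo'] / 15 + c['V'] / 10 + 5 * c['B'])

def c_eq(c):
    """Carbon equivalent C_eq of eq. (10)."""
    return (c['C'] + c['Mn'] / 6 + (c['Cu'] + c['Ni']) / 15
            + (c['Cr'] + c['Mo'] + c['V']) / 5)

def normalize(value, lo, hi):
    """Rescale a raw input linearly onto the 0.1-0.9 interval used for the network."""
    return 0.1 + 0.8 * (value - lo) / (hi - lo)

# Hypothetical low-alloy steel composition (wt%) and cooling time.
steel = {'C': 0.12, 'Si': 0.25, 'Mn': 1.40, 'Cu': 0.02, 'Ni': 0.02,
         'Cr': 0.03, 'Mo': 0.01, 'V': 0.00, 'B': 0.0}
t85 = 10.0                                  # 800-500 C cooling time in s (assumed)
x = [normalize(t85, 3.0, 300.0),            # assumed range for t_8/5
     normalize(steel['C'], 0.05, 0.25),     # assumed range for C wt%
     normalize(p_cm(steel), 0.15, 0.35),    # assumed range for P_cm
     normalize(c_eq(steel), 0.25, 0.55)]    # assumed range for C_eq
```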

4. RESULTS AND DISCUSSION

In general, designing a backpropagation network is more art than mathematics. There are no rules governing the number of hidden layers, hidden nodes or training patterns. Too few hidden nodes pose the danger of limiting the memory of the network, rendering it unable to learn. On the other hand, if there are too many hidden nodes, the network may memorize the training patterns instead of generating general knowledge for solving the problem, i.e. the network performs extremely well during training but fails to respond to unseen inputs. A backpropagation network used for prediction (such as those used in this study) can be viewed as a function for mapping the input to the output. If the relationship between the input and output is very complex, more hidden nodes or even more hidden layers must be used. For those more mathematically inclined, the network may be considered a curve-fitting technique involving high nonlinearity.

The networks investigated in this study were trained with 30, 40 or 50 patterns until the root mean squared errors (RMS errors) were less than 0.05. The finished networks were then tested against all 140 patterns (P_cm, C_eq, t_8/5 and C wt% vs HAZ hardness values) in the Yurioka database. The correlation factors for the different networks are tabulated in Table 1 together with comparable results from the regression analyses of Yurioka [2] and Suzuki [3].
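The test step can be reproduced in outline as below; the sketch assumes that "correlation factor" denotes the Pearson correlation coefficient between predicted and measured hardness, which the paper does not state explicitly, and the short arrays merely stand in for the 140 database patterns.

```python
import numpy as np

def correlation_factor(predicted, measured):
    """Pearson correlation coefficient between network-predicted and measured VHN
    (assumed interpretation of the paper's 'correlation factor')."""
    predicted = np.asarray(predicted, dtype=float)
    measured = np.asarray(measured, dtype=float)
    return np.corrcoef(predicted, measured)[0, 1]

# Placeholder arrays standing in for the 140 Yurioka patterns.
print(correlation_factor([310.0, 255.0, 402.0], [305.0, 260.0, 398.0]))
```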

From the correlation factors listed in Table 1, a single hidden layer with 4 nodes and 40 training patterns was found to be the best for predicting HAZ hardness in this study. More layers (networks 1, 2, 3 and 6), nodes (networks 1, 2 and 3) or training patterns (networks 2, 4 and 5) may not be necessary for improving accuracy. More patterns can confuse the network, causing the learning to oscillate between conflicts. More patterns increase the risk of having more conflicts between patterns, and hence increase the learning time. As suggested by Hafez [8], the appropriate number of training sets should be about twice the number of total connections and the training set should cover the entire scope of the problem. Although the ANN technique does not outperform conventional regression analysis in this case (correlation factors, Table 1), it is comparable. Indeed, the ANN technique provides some advantages. At the beginning of the analysis, the user of ANN is not required to provide strong insight into the input-output relationship. The complexity of the relationship can always be increased by adding more hidden nodes or even hidden layers. Furthermore, the network can isolate non-sensitive variables during training.

On the other hand, for regression analyses the relationship must be pre-formulated. Moreover, the number of sensitive variables must be determined precisely in advance, since non-sensitive parameters can reduce the accuracy of the model [9]. Finally, a backpropagation network formulation is more flexible than conventional regression techniques. As new data come along (e.g. an extension of steel compositions, etc.), the network can be re-trained without resorting to an entirely new analysis.

Table 1. Backpropagation network configurations and correlation factors

Network number    Number of hidden layers    Number of hidden nodes (1st, 2nd layer)    Number of training patterns    Correlation factor (140 samples)
1                 1                          ?                                          40                             0.966
2                 1                          4                                          40                             0.982†
3                 1                          ?                                          40                             0.967
4                 1                          4                                          30                             0.956
5                 1                          4                                          50                             0.981
6                 2                          4, ?                                       40                             0.977
Yurioka [2]       regression analysis                                                   140                            0.984
Suzuki [3]        regression analysis                                                   140                            0.977

† A plot of the network predicted hardness vs the measured [2] hardness is included in Fig. 3.


Fig. 3. Backpropagation network (1 hidden layer, 4 neurons, 40 training patterns) predicted HAZ hardness vs measured [2] hardness values (VHN).

The major disadvantage of ANN is that reasoning about the solution is not available, because the knowledge for achieving the solution is stored as weights and distributed throughout the entire network. Therefore, the physical basis for the solution is not apparent.

5. CONCLUSIONS

(1) As-welded HAZ hardness can be predicted by ANN technology (backpropagation network) with acceptable accuracy given the 800-500°C cooling time (t_8/5) and the steel chemical composition.

(2) A backpropagation network with too many or too few hidden nodes and/or layers may not necessarily improve the network performance. The optimal architecture found in this study was a single hidden layer network with four hidden neurons.

(3) A backpropagation network trained with too many or too few training patterns may not necessarily improve the network accuracy. The optimal condition found in this study was 40 training patterns.

Acknowledgements - The authors would like to acknowledge the support of the Natural Sciences and Engineering Research Council, grant NSERC A4601. Additionally, the authors greatly appreciate the valuable comments on the application of backpropagation networks by Dr H. M. Hafez, Department of Systems and Computer Engineering, Carleton University.

REFERENCES

1. B. Chan and M. Bibby, paper presented at the AWS/AWI/NIST Int. Conf. Computerization for Welding Information IV, 3-6 Nov. 1992, Orlando, FL.
2. N. Yurioka, S. Ohshita and H. Tamehiro, Studies on Carbon Equivalents to Assess Cold Cracking Tendency and Hardness in Steel Welding, Specialist Symposium on 'Pipeline Welding in the 80s', Melbourne, March 1981, Australian Welding Research Association, Melbourne (1981). Also in JWS WM-784-80, 11 pp. (1981).
3. H. Suzuki, Trans. Jpn Welding Soc. 15(1), 25 (1984).
4. E. Gelenbe (ed.), Neural Networks: Advances and Applications 2, North-Holland, Amsterdam (1992).
5. B. Muller and J. Reinhardt, Neural Networks: an Introduction, pp. 2-21, Springer-Verlag, Berlin (1991).
6. D. E. Rumelhart, G. Hinton and R. J. Williams, in Parallel Distributed Processing, Vol. 1 (edited by Rumelhart and McClelland), pp. 318-362, MIT Press, Cambridge, MA (1986).
7. Y. Ito and K. Bessyo, Weldability Formula of High Strength Steels Related to Heat-Affected-Zone Cracking, IIW Doc. IX-576-68, 45 pp. (1968); JWS 37(9), 983 (1968); IIW Doc. IX-631-69, 18 pp. (1969).
8. H. M. Hafez, personal communication (Oct. 1993).
9. L. J. Yang, A Study on the Submerged Arc Welding Process Variables, internal report, School of Mechanical and Production Engineering, Nanyang Technological Institute, Singapore (1991).