

Computational Materials Science 91 (2014) 20–27

Contents lists available at ScienceDirect

Computational Materials Science

journal homepage: www.elsevier.com/locate/commatsci

Correlating structure topological metrics with bulk composite properties via neural network analysis

http://dx.doi.org/10.1016/j.commatsci.2014.04.014
0927-0256/© 2014 Elsevier B.V. All rights reserved.

⇑ Corresponding author at: Department of Mechanical Engineering, Brigham Young University, 435 Crabtree Building, 84602 Provo, UT, USA. Tel.: +1 650 847 0402.

E-mail address: [email protected] (D.D. Gerrard).

Dustin D. Gerrard a,b,⇑, David T. Fullwood a, Denise M. Halverson b

a Department of Mechanical Engineering, Brigham Young University, 435 Crabtree Building, 84602 Provo, UT, USA
b Department of Mathematics, Brigham Young University, 275 TMCB, Provo, UT, USA

Article info

Article history:
Received 19 April 2013
Received in revised form 25 February 2014
Accepted 4 April 2014

Keywords:
Homology
Topology
Stiffness
Microstructure
Machine learning
Data mining

Abstract

Given a database of any quantifiable set of cause and effect, machine learning methods can be trained to predict future effects based upon an assumed set of causes. In this paper, neural networks are trained to predict the bulk Young’s modulus and electrical conductivity of a two-phase composite with high material property contrast, based upon a sample’s microstructure. Various structure metrics are used to quantify the topological connectivity and disorder of analytically generated heterogeneous samples. The neural network is trained to predict the Young’s modulus and coefficient of electrical conductivity based upon values calculated for a training set of samples using a finite element model. By repeating the process with various subsets of structure metrics we can determine which metrics, or combinations of metrics, have the strongest influence in accurately predicting bulk material properties. Not only are neural net predictions of bulk properties in good agreement with calculated values for the 2D two-phase composites, but the insights into which metrics most strongly correlate with these properties (in this case, the connectivity metrics) may help focus the development of improved structure–property relations.

© 2014 Elsevier B.V. All rights reserved.

1. Introduction

Composites composed of two or more constituent materials are studied extensively in materials science and engineering. Homogenizing the locally heterogeneous properties and structure to arrive at global properties for such a material is a common goal [1–6]. Understanding how the microstructure of a heterogeneous material influences its bulk properties (such as Young’s modulus and electrical conductivity) may ultimately allow the microstructure of the material to be designed and fabricated in such a way as to produce specifically desired bulk properties in the material [7–9].

Exact solutions for the homogenization of spheres, ellipsoids, and other geometries of a material deposited within a matrix of a second material have been determined analytically [1,2]. However, not all material arrangements have an analytical solution describing homogenized behavior. To gain better understanding of these materials, theories have been developed to predict bulk material properties as well as possible ranges of values. Effective medium theory is one such theory that works well in predicting the homogenization values of a sample whose constituent materials have material property values similar to each other [10]. Effective medium theory becomes less accurate for homogenization as the constituent material properties become more and more dissimilar. For samples composed of dramatically different constituent materials, percolation theory is a more appropriate tool to determine bulk properties [11–13]. Furthermore, various theories exist for providing bounds on the range of possible values of homogenized physical properties for a heterogeneous material [14–17].

Homogenization methods can often produce an accurate estimation of bulk properties under certain conditions, but each falls short in pinpointing exact values consistently. This may be due as much to insufficiency of structure information in the structure metrics that are input to the models as it is to the overall failings of the model itself to capture underlying physics. Typical descriptors of structural arrangement include n-point statistics [2,8], clustering [18] and percolation metrics [10,12,19,20]. More recently, interest has risen in more esoteric structure metrics, such as homology [21–23] and geometrical entropy [24,25]. The aim of this paper is to assemble a variety of different structure metrics that describe connectivity, entropy, etc. and feed these into a machine learning environment to determine which structure descriptors have the strongest influence on the global properties, and to determine whether an accurate prediction of homogenized properties is possible using the full set of metrics.


2. Materials and methods

In order to assess the structure–property relations of a range of 2D heterogeneous material samples, two-phase composites are created on a simple square lattice of 128 by 128 pixels using Matlab [26]. These two-dimensional samples are referred to as plates. The size of the plates was chosen to enable numerous computations to be undertaken in a reasonable amount of time, while also providing samples that are at least close to being representative of a bulk material geometry. In our investigation, all samples are 50% matrix (black) and 50% particulate (white). The contrast between the properties of the two material phases is high for both stiffness and electrical conductivity. The matrix has a Young’s modulus of $E_1 = 10^{7}$ Pa and an electrical conductivity of $\sigma_1 = 10^{5}$ S/m. The particles have the properties $E_2 = 7 \times 10^{9}$ Pa and $\sigma_2 = 10^{9}$ S/m. These properties represent a typical order-of-magnitude contrast that might be found in common composite materials.

Three different sample-generating algorithms are employed so that a variety of geometrical arrangements can be analyzed. These algorithms employ different methods for clustering particles. The first algorithm creates microstructures that exhibit chains of approximately spherical particles that are “strung” together, hence the name “stringy clusters”. In this algorithm, particles are seeded by randomly selecting center points until 10% of the desired volume fraction is reached. Then, as the additional particles are placed, they are moved some percentage closer to their nearest neighbor. The second algorithm creates microstructures featuring clusters that are circular in nature, called “circle clusters”. In this algorithm, particles are randomly placed into a virtual circle, which represents a cluster, until a specified volume fraction is reached. The cluster is then randomly placed in the sample space. This process continues until the desired sample volume fraction is reached. Clusters never have a diameter of more than half the height of the sample plate (64 pixels). The third and final algorithm randomly assigns each pixel of the lattice to be either matrix or particle material. Fig. 1 shows examples of plates generated from these algorithms. The volume fraction of each of these algorithms is adjustable; more details regarding these algorithms are given in Ref. [27].
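As an illustration of the third (random placement) algorithm only, the following is a minimal Python sketch, not the authors’ Matlab code. The function name `random_plate` and its parameters are hypothetical, and the sketch assumes that the 50% volume fraction is met exactly by shuffling a fixed number of particle pixels; the text does not say whether the original implementation enforces the fraction this way.

```python
import random

def random_plate(size=128, volume_fraction=0.5, seed=None):
    """Return a size x size grid of 0 (matrix) / 1 (particle) pixels.

    Assumption: exactly round(volume_fraction * size**2) pixels are set
    to 1, so the requested fraction is met as closely as the lattice allows.
    """
    rng = random.Random(seed)
    n_pixels = size * size
    n_particles = round(volume_fraction * n_pixels)
    flat = [1] * n_particles + [0] * (n_pixels - n_particles)
    rng.shuffle(flat)  # random assignment of particle pixels to lattice sites
    return [flat[r * size:(r + 1) * size] for r in range(size)]

plate = random_plate(seed=0)
fraction = sum(map(sum, plate)) / (128 * 128)
print(fraction)  # 0.5
```

The stringy and circle cluster algorithms depend on details given in Ref. [27] and are therefore not sketched here.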

3. Calculation

3.1. Microstructure metrics

The constituent materials of any heterogeneous sample can form into an essentially infinite number of geometric arrangements. Attempting to describe the precise geometric layout of material phases is impractical, and may even be impossible. However, there are topological and large-scale behaviors that can easily be described in a quantitative manner.

Fig. 1. Three different examples of geometrical arrangements of samples that might be used. Left: stringy cluster algorithm. Middle: circle cluster algorithm. Right: random placement.

In this analysis nine metrics of the composite structure are used. These metrics relate to the connectivity/percolative nature, the homology and the entropy of the system. In the list below, the first six are measures of topological connectivity related to homology; the last two are geometric measures; and the remaining metric, entropy, is a measurement of disorder. The range of values observed in this study is given in square brackets after each metric. See the Appendix for more details regarding the individual definitions, and examples of metric values for particular structures.

1. Betti zero, $\beta_0(v)$: the number of disconnected pieces of particulate material. [1, 291].

2. Betti one, $\beta_1(v)$: the number of independent interior loops within particle clusters. [0, 1100].

3. Betti zero/Betti one, $\beta_0(v)/\beta_1(v)$. [0.00658, 65535]; the peak value is capped when $\beta_1(v)$ is zero.

4. Relative Betti zero homology, $\beta_0(v, T+B)$: the Betti zero number for the plate with the top and bottom edges of the plate acting as particulate material. [0, 263].

5. Relative Betti one homology, $\beta_1(v, T+B)$: the Betti one number for the plate with the top and bottom edges of the plate acting as particulate material. [0, 1156].

6. Percolation, $P$: equal to one if percolation occurs across the particles, else zero. [0, 1].

7. Entropy, $e$: the disorder of the materials within the sample plate (see Appendix). [183.3, 5700.66].

8. Maximum cluster height, $h_{max}$: the height of the tallest cluster of particulates. [33, 128].

9. Mean cluster height, $h_{avg}$: the arithmetic mean height of all clusters of particulates. [2.94, 126].
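Several of the metrics listed above can be computed with a simple flood fill. The Python sketch below is illustrative and is not the authors’ implementation: the function names are hypothetical, cluster height is assumed to mean vertical extent in pixels, percolation is assumed to mean that a single cluster touches both the top and bottom rows, and Betti one is counted as the number of enclosed matrix regions using the complementary connectivity rule (a standard convention for binary images). The relative homology metrics and entropy rely on the Appendix definitions and are not attempted.

```python
from collections import deque

def components(grid, value, connect_corners):
    """Return the connected regions of cells equal to `value` as lists of (row, col)."""
    rows, cols = len(grid), len(grid[0])
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]          # shared edge
    if connect_corners:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]   # shared vertex
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != value or seen[r][c]:
                continue
            seen[r][c] = True
            queue, region = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for dy, dx in steps:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == value and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions

def structure_metrics(grid, connect_corners=True):
    """Betti zero, Betti one, percolation flag and cluster heights of the 1-phase."""
    rows, cols = len(grid), len(grid[0])
    clusters = components(grid, 1, connect_corners)
    tops = [min(y for y, _ in reg) for reg in clusters]
    bottoms = [max(y for y, _ in reg) for reg in clusters]
    heights = [b - t + 1 for t, b in zip(tops, bottoms)]
    # Pad with matrix so every matrix region touching the plate edge joins one
    # outer region; the remaining matrix regions are the enclosed holes.
    padded = ([[0] * (cols + 2)]
              + [[0] + list(row) + [0] for row in grid]
              + [[0] * (cols + 2)])
    holes = len(components(padded, 0, not connect_corners)) - 1
    spans = any(t == 0 and b == rows - 1 for t, b in zip(tops, bottoms))
    return {
        "betti0": len(clusters),
        "betti1": holes,
        "percolation": int(spans),
        "h_max": max(heights, default=0),
        "h_avg": sum(heights) / len(heights) if heights else 0.0,
    }

demo = [
    [1, 1, 1, 0, 0],
    [1, 0, 1, 0, 1],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
]
print(structure_metrics(demo, connect_corners=True))
print(structure_metrics(demo, connect_corners=False))
```

In the small `demo` plate, the ring and the two-pixel column below it touch only at a corner, so switching `connect_corners` changes the cluster count, the percolation flag and the cluster heights, while the single enclosed hole is unaffected.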

3.2. Finite element analysis

The bulk properties are calculated using the finite element software ANSYS [28]. The two material properties that we examine, Young’s modulus and electrical conductivity, are calculated only in the vertical direction (see Fig. 2). Material one (black) has a Young’s modulus of $E_1 = 1.0 \times 10^{7}$ Pa and a conductivity of $\sigma_1 = 1.0 \times 10^{5}$ S/m; material two (white) has a Young’s modulus of $E_2 = 7.0 \times 10^{9}$ Pa and a conductivity of $\sigma_2 = 1.0 \times 10^{9}$ S/m.

To calculate Young’s modulus for a 2-D plate, the bottom nodes are prevented from moving in the y-direction. The top nodes are displaced by an equal distance in the y-direction to produce a sample strain of 0.01, thus keeping the top edge of the sample flat; the side nodes are free to move in x and y. The overall force required to produce the strain is used to calculate Young’s modulus for the entire plate. Similarly, conductivity is found by applying a voltage of 100 V to the top of the plate and 0 V to the bottom. The sides of the sample have zero flux passing across them. Current density is integrated for each element along the top of the plate to determine the bulk conductivity of the plate.

Fig. 2. Top: sample plate. Bottom left: elastic strain (vertical direction). Bottom right: current density (vertical direction).
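The paper does not spell out how the finite element outputs are converted to bulk values. One conventional reading, given here as an assumption rather than as the authors’ stated procedure, takes a plate of width $w$, height $h$ and out-of-plane thickness $t$ (symbols introduced for illustration), a total vertical reaction force $F_y$ on the top edge, and a total current $I$ obtained from the integrated current density $J_y$:

```latex
\begin{align}
E_{\text{bulk}} &= \frac{F_y / (w\,t)}{\bar{\varepsilon}_{yy}},
  \qquad \bar{\varepsilon}_{yy} = 0.01, \\
\sigma_{\text{bulk}} &= \frac{I / (w\,t)}{\Delta V / h},
  \qquad I = t \int_{0}^{w} J_y(x, h)\,\mathrm{d}x ,
  \qquad \Delta V = 100\ \text{V}.
\end{align}
```

Both expressions divide an averaged flux (stress or current density) by the imposed average gradient (strain or electric field), which is why only the vertical component of each property is obtained.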

3.3. Neural network model

The role of the modeling component discussed in this section is twofold. Firstly, a model is required that connects the microstructural parameters (the nine metrics) with the resultant properties (Young’s modulus or conductivity calculated via the FEA); this model must be sophisticated enough to capture subtle relationships between the metrics and properties. Secondly, the model must be able to quantify its own accuracy (based upon a suitable data set). Machine learning provides an ideal environment for developing such a model. Several machine learning approaches were tested and compared in order to arrive at a suitable modeling approach. These included logistic regression, support vector machines, decision tree analysis and neural networks. The neural network approach, connecting the structure metrics (inputs to the model) to the bulk properties of a given sample (the outputs of the model), consistently gave the most accurate relationship, and was therefore chosen as the base model. In order for the neural network to learn the relationships between the independent and dependent variables, a set of 1000 samples is used as training data. The built-in multilayer perceptron function in the open source machine learning package WEKA was used to create a 10-fold cross-validated neural network [29]. The numeric values for each of the nine material metrics (described in Section 3.1) are treated as independent variables. Both Young’s modulus and bulk conductivity, as calculated from the FE analysis, are treated as dependent variables. A further 1000 samples are subsequently given to the neural network to test the accuracy of the predicted values of Young’s modulus and electrical conductivity. The corresponding ranges for each of the metrics are the same for the training data and the test data. The WEKA environment provides a significant amount of information relating to the influence of each of the input metrics on the resultant properties. One of these parameters, the correlation coefficient, quantifies how well the model is able to capture the relationship between the structure metrics and the bulk material properties, and was chosen as the simplest and most direct quantification of underlying relationships for the purposes of this paper.
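To make the accuracy measure concrete, the sketch below computes a correlation coefficient between predicted and FE-calculated property values and builds index sets for 10-fold cross-validation. It is a plain-Python illustration, not WEKA code: it assumes the reported coefficient is the Pearson correlation between predicted and actual values, and the function names `pearson` and `k_fold_indices`, as well as the sample numbers, are hypothetical.

```python
import math
import random

def pearson(predicted, actual):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    mp, ma = sum(predicted) / n, sum(actual) / n
    cov = sum((p - mp) * (a - ma) for p, a in zip(predicted, actual))
    var_p = sum((p - mp) ** 2 for p in predicted)
    var_a = sum((a - ma) ** 2 for a in actual)
    return cov / math.sqrt(var_p * var_a)

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k disjoint validation folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

# Made-up values for illustration only (not data from the paper).
actual = [1.0e7, 5.0e7, 2.0e8, 9.0e8]
predicted = [1.2e7, 4.5e7, 2.6e8, 8.0e8]
print(round(pearson(predicted, actual), 4))

folds = k_fold_indices(1000, k=10)
print(len(folds), len(folds[0]))  # 10 100
```

Each fold would serve once as the validation set while the remaining nine are used for training; the separate set of 1000 test samples described above is held out entirely.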

3.4. Test procedures

The first neural network is created using all nine material metrics as independent variables with Young’s modulus as the dependent variable. All data sets (other than those summarized in Table 8) are one-third “stringy cluster” plates, one-third “circle cluster” plates, and one-third random plates. To then determine which metrics are the best predictors of Young’s modulus, new neural networks are created, leaving out one of the nine material metrics to observe how the correlation coefficient is affected. Based on the correlation coefficients we rank how well each of the material metrics predicts Young’s modulus. Neural networks are also created with only one metric used as the independent variable, to again compare how well each metric predicts Young’s modulus.

We then investigate how well certain combinations of metrics predict Young’s modulus. The correlation coefficient for the neural network based upon the three metrics that are the best predictors of bulk properties is determined. Similarly, the correlation coefficient for the neural network based upon the three metrics that are the worst predictors of bulk properties is calculated.

Several other factors, relating to the practicality of quantifying the structure metrics, their detailed definitions, and the types of structure that they are applied to, are then explored.

Some metrics would be easier to calculate than others if we applied this process to tangible materials. For example, percolation ($P$) and maximum cluster height ($h_{max}$) can be calculated very easily in a material. However, entropy ($e$) and mean cluster height ($h_{avg}$) are much more difficult to calculate. We test how well Young’s modulus can be predicted when only the most accessible metrics, $P$, $h_{max}$, $\beta_0$, $\beta_1$, and $\beta_0(v)/\beta_1(v)$, are used in the neural network.

Eight metrics are affected by the way connectedness is defined ($\beta_0(v)$, $\beta_1(v)$, $\beta_0(v)/\beta_1(v)$, $\beta_0(v, T+B)$, $\beta_1(v, T+B)$, $P$, $h_{max}$, $h_{avg}$). We note at this point that “connectedness” of particulate pixels can be defined in different ways. In the definitions of the structure metrics so far, two particulate pixels are assumed to be connected if they share a common vertex; an alternative definition of connectedness might be based upon pixels sharing a common edge. The way connectedness is defined affects homology, percolation, and grain height. We explore which definition of connectedness yields a higher correlation coefficient.

The window size used to define entropy for the initial neural network was a two by two square. We look at six other window sizes/shapes for measuring entropy and determine which best predicts Young’s modulus.

We also separate our samples by type (stringy, circle, and random) and see how well neural networks function for each material type.

4. Results

The first neural network was created using all nine material metrics as independent variables. Young’s modulus and electrical conductivity were treated as dependent variables. The following tables describe how well various combinations of metrics correlate with “measured” Young’s modulus and electrical conductivity. Table 1 presents the correlation coefficients when all defined metrics are used in the neural net model. Tables 2 and 3 then look at the correlations with either a single metric, or with one metric deleted from the model, for Young’s modulus and conductivity, respectively. Tables 4 and 5 consider various subgroupings of metrics to model the two physical properties. The highest and lowest ranking three metrics (from the previous tables) are combined to determine how well they perform as a group in Table 4, and the easiest and most computationally expensive metrics to quantify are compared in Table 5. Table 6 considers the case of pixels being defined as “connected” when they share only a vertex, compared to a definition which requires a shared edge. Table 7 and Fig. 3 present results for different window sizes in the definition of entropy. Table 8 compares the correlation coefficients for different types of clustering in structures. Fig. 4 compares the performance of a single metric ($\beta_0$) vs. the neural net for predicting Young’s modulus for many samples with different structure types. The results are discussed in the next section.

Table 1
Correlation coefficients. All nine material metrics are used in the neural network.

| | Determining Young’s modulus | Determining conductivity |
|---|---|---|
| Correlation coefficient (c) | 0.8534 | 0.9453 |

Table 2
Young’s modulus correlation coefficients. Each metric is used as the only variable in the neural network. A second neural network is made with one material metric excluded.

| Metric | Correlation coefficient when metric is the only variable | Rank of strength | Correlation coefficient when metric is excluded | Rank of strength |
|---|---|---|---|---|
| $P$ | 0.7694 | 1 | 0.7879 | 1 |
| $h_{max}$ | 0.756 | 2 | 0.8534 | 4 |
| $\beta_1(v)$ | 0.6213 | 3 | 0.8495 | 3 |
| $\beta_1(v, T+B)$ | 0.6213 | 3 | 0.8543 | 6 |
| $h_{avg}$ | 0.5992 | 5 | 0.8621 | 9 |
| $e$ | 0.5688 | 6 | 0.8059 | 2 |
| $\beta_0(v, T+B)$ | 0.513 | 7 | 0.8567 | 8 |
| $\beta_0(v)$ | 0.493 | 8 | 0.8534 | 4 |
| $\beta_0(v)/\beta_1(v)$ | 0.1053 | 9 | 0.8561 | 7 |

Table 3
Electrical conductivity correlation coefficients. Each metric is used as the only variable in the neural network. A second neural network is made with one material metric excluded.

| Metric | Correlation coefficient when metric is the only variable | Rank of strength | Correlation coefficient when metric is excluded | Rank of strength |
|---|---|---|---|---|
| $P$ | 0.9446 | 1 | 0.8883 | 1 |
| $h_{max}$ | 0.9274 | 2 | 0.9452 | 4 |
| $\beta_1(v, T+B)$ | 0.8404 | 3 | 0.9469 | 7 |
| $\beta_1(v)$ | 0.8403 | 4 | 0.9465 | 6 |
| $e$ | 0.8379 | 5 | 0.9442 | 2 |
| $h_{avg}$ | 0.8305 | 6 | 0.9544 | 9 |
| $\beta_0(v, T+B)$ | 0.7955 | 7 | 0.9456 | 5 |
| $\beta_0(v)$ | 0.7879 | 8 | 0.945 | 3 |
| $\beta_0(v)/\beta_1(v)$ | 0.1442 | 9 | 0.9497 | 8 |

Table 4
Correlation coefficients for neural networks using different combinations of metrics.

| Combination of metrics | Correlation coefficient when determining Young’s modulus | Correlation coefficient when determining conductivity |
|---|---|---|
| $P$, $e$, $h_{max}$ | 0.7952 | 0.9457 |
| $\beta_0(v)$, $\beta_1(v)$, $\beta_1(v, T+B)$ | 0.6231 | 0.8411 |
| $\beta_0(v)/\beta_1(v)$, $\beta_0(v, T+B)$, $h_{avg}$ | 0.589 | 0.8181 |

Table 5
Correlation coefficients for neural networks using different combinations of metrics.

| Combination of metrics | Correlation coefficient when determining Young’s modulus | Correlation coefficient when determining conductivity |
|---|---|---|
| $P$, $h_{max}$, $\beta_0$, $\beta_1$, and $\beta_0(v)/\beta_1(v)$ (easy to measure in actual materials) | 0.7839 | 0.9483 |
| $\beta_0(v, T+B)/\beta_1(v, T+B)$, $e$, and $h_{avg}$ | 0.6429 | 0.8478 |

Table 6
Correlation coefficients for neural networks with nine metrics used.

| | Correlation coefficient when determining Young’s modulus | Correlation coefficient when determining conductivity |
|---|---|---|
| Corners connected | 0.8534 | 0.9453 |
| Corners disconnected | 0.8576 | 0.9321 |

Table 7
Correlation coefficients for neural networks with nine metrics used and varying entropy window.

| Entropy window | Correlation coefficient when determining Young’s modulus | Correlation coefficient when determining conductivity |
|---|---|---|
| $e_1$ (2 × 2) | 0.8534 | 0.9453 |
| $e_2$ (4 × 4) | 0.8454 | 0.9483 |
| $e_3$ (8 × 8) | 0.8854 | 0.9495 |
| $e_4$ (16 × 2) | 0.8582 | 0.9472 |
| $e_5$ (16 × 16) | 0.8756 | 0.9490 |
| $e_6$ (32 × 2) | 0.8532 | 0.9456 |
| $e_7$ (32 × 32) | 0.8484 | 0.9452 |

Fig. 3. Left: entropy window size and correlation coefficient for Young’s modulus (panel title: Elasticity). Right: entropy window size and correlation coefficient for conductivity (panel title: Conductivity). Both panels plot the correlation coefficient (c) against entropy window sizes 2x2, 4x4, 8x8, 16x16 and 32x32.

Table 8
Correlation coefficients for neural networks with all nine metrics used (approximately 325 training samples and 325 test samples).

| Material type | Correlation coefficient when determining Young’s modulus | Correlation coefficient when determining conductivity |
|---|---|---|
| Stringy clusters | 0.6190 | 0.9883 |
| Circle clusters | 0.7123 | 0.8291 |
| Random | 0.2521 | 0.0394 |

Fig. 4. Left: correlation between $\beta_0$ and Young’s modulus (panel title: Elasticity vs. Betti-0; Young’s modulus, E (Pa), plotted against Betti-0 number, $\beta_0$, with regression line). $R^2 = 0.3872$. Right: correlation between predicted and actual Young’s modulus using the neural net (panel title: Neural Network for Young’s Modulus; Young’s modulus, E (Pa), plotted against predicted Young’s modulus, E (Pa), with regression line). $R^2 = 0.9063$.

5. Discussion

When all nine metrics are used for predicting Young’s modulus a correlation coefficient (c) of c = 0.8534 is achieved; when used for predicting conductivity a value of c = 0.9453 is obtained. In general, the neural network returns a higher value of c when correlating material metrics to conductivity.

Rankings of the importance of each metric vary significantly depending upon whether a single metric is the only independent variable in the analysis or whether it is the only metric excluded from the analysis. In Table 2 percolation ($P$) proved to have the strongest correlation in the determination of Young’s modulus, both when considering the case where it was the only metric and the case where it was the only metric excluded from the neural network. However, there is less agreement regarding rank of importance for the other metrics. We see that when maximum grain height ($h_{max}$) is the only metric used it is the second strongest metric, but when it is the only one excluded it ranks fourth in strength. It is unlikely that these two methods of ranking importance of metrics will yield the same results, mainly because the metrics are not independent. The nine metrics that are used are an attempt to thoroughly describe the morphology of the microstructure phases; however, these metrics may contain similar information and certainly do not provide a complete description of structure. For example, if the second phase of a plate has percolated then the maximum grain height ($h_{max}$) will be the height of the plate. Other more subtle morphological characteristics may be collected by more than one metric. Consider the two analyses of the metric $\beta_1(v, T+B)$ in Table 2. When $\beta_1(v, T+B)$ is the only metric used in a neural network its rank of strength is 3, but when it is the only metric excluded its rank of strength is 6. This suggests that $\beta_1(v, T+B)$ is effective at correlating to Young’s modulus but at least some of the information it gathers is also collected by the other eight metrics.

The metrics correlate to conductivity in a similar manner. Percolation ($P$) is the metric with the strongest correlation. Consider the metric $\beta_0(v)$. It ranks 8th in strength when it is the only metric used but it ranks 3rd when it is the only one excluded. The rank of strength may also be affected by noise. When a metric is excluded for a conductivity analysis the range of correlation coefficient (c) is from 0.8883 to 0.9544, whereas this range is 0.1446–0.9446 when individual metrics are used; the smaller range for the first ranking makes it more susceptible to noise.

In general, whether correlating to Young’s modulus or conductivity through including or excluding a metric, percolation ($P$) proves to be the strongest metric amongst those considered. The metric $\beta_0(v)/\beta_1(v)$ is always quite weak.

The strength of several metrics combined is not always greater than the individual metric strengths. The combined metrics $\beta_0(v)/\beta_1(v)$, $\beta_0(v, T+B)$ and $h_{avg}$ have correlation coefficients c = 0.589 and c = 0.8181 for Young’s modulus and conductivity, respectively. Using $h_{avg}$ by itself provides stronger correlation values of c = 0.5992 for correlating Young’s modulus, and c = 0.8305 for correlating conductivity. Note also that the strengths of dependent metrics can be significantly different using this approach; for example, $\beta_0(v)/\beta_1(v)$ generally has a lower strength than either of the metrics used to form it. This illustrates the fact that the particular model used may find it easier to relate certain combinations of metrics to the resultant properties than others.
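The two ranking schemes compared in this discussion (each metric used alone, and each metric left out in turn) can be reproduced from the Young’s modulus coefficients reported in Table 2. The Python sketch below is illustrative only: the dictionary keys are plain-text labels for the paper’s metrics, the helper name `competition_rank` is hypothetical, and tied coefficients are assumed to share a rank, which matches the ranks printed in the table.

```python
# Young's modulus correlation coefficients copied from Table 2.
single = {  # metric used as the only input
    "P": 0.7694, "hmax": 0.756, "b1(v)": 0.6213, "b1(v,T+B)": 0.6213,
    "havg": 0.5992, "e": 0.5688, "b0(v,T+B)": 0.513, "b0(v)": 0.493,
    "b0(v)/b1(v)": 0.1053,
}
excluded = {  # metric left out, the other eight used
    "P": 0.7879, "hmax": 0.8534, "b1(v)": 0.8495, "b1(v,T+B)": 0.8543,
    "havg": 0.8621, "e": 0.8059, "b0(v,T+B)": 0.8567, "b0(v)": 0.8534,
    "b0(v)/b1(v)": 0.8561,
}

def competition_rank(scores, strongest_is_high):
    """Rank metrics 1..n; tied scores share the better rank (1, 2, 3, 3, 5, ...)."""
    ordered = sorted(scores.values(), reverse=strongest_is_high)
    return {name: ordered.index(value) + 1 for name, value in scores.items()}

# Alone: a high coefficient means a strong metric.
rank_single = competition_rank(single, strongest_is_high=True)
# Excluded: a large drop (low remaining coefficient) means a strong metric.
rank_excluded = competition_rank(excluded, strongest_is_high=False)

for name in single:
    print(f"{name:12s} alone: {rank_single[name]}  excluded: {rank_excluded[name]}")
```

Only percolation holds the same rank under both schemes; most other metrics shift, which is consistent with the point made above that the metrics carry overlapping information.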


Fig. 5. Left: heterogeneous sample. Middle: independent grains highlighted by different colors ($\beta_0 = 7$). Right: independent interior loops indicated by brown circles ($\beta_1 = 2$).

Fig. 6. Left: $\beta_0(v, T+B) = 2$, $\beta_1(v, T+B) = 2$. Right: $\beta_0(v, T+B) = 0$, $\beta_1(v, T+B) = 5$.

Fig. 7. Left: percolation of the white phase has not occurred across the height of the sample. Right: percolation of the white phase has occurred across the height of the sample.


Table 5 divides the metrics into a set that can all be easily measured on a real material and a set that cannot be measured with as much ease. For example, percolation ($P$), maximum grain height ($h_{max}$), Betti zero ($\beta_0$), Betti one ($\beta_1$), and $\beta_0(v)/\beta_1(v)$ can all be measured easily. Using all these metrics yields a stronger correlation than any individual metric. The metrics relative-Betti-zero ($\beta_0(v, T+B)$), relative-Betti-one ($\beta_1(v, T+B)$), entropy ($e$), and average grain height ($h_{avg}$) require more effort and computational power to calculate but still provide a relatively strong correlation.

The samples used in this analysis were pixelated squares. Connectivity of phases can be described in two ways. Pixels of the same material that are connected by an edge are always considered connected, but two pixels that only share a vertex can be defined as connected or disconnected. We used a neural network to determine which definition of connectivity correlated most strongly with material properties. The results in Table 6 indicate that one definition is not always stronger than the other. Defining a shared vertex as being connected provides a stronger correlation when determining conductivity but a weaker correlation when determining Young’s modulus (see also [30]). This might be expected, since electrical conduction across a corner between two pixels may be easier than stress transfer.
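The two connectivity definitions can be illustrated with a minimal sketch. It assumes the sample is stored as a 2D list with 1 marking the particulate phase; the function name `count_clusters` and the flag `connect_corners` are hypothetical and are not taken from the study.

```python
from collections import deque


def count_clusters(grid, phase=1, connect_corners=False):
    """Count connected clusters of `phase` pixels in a 2D list of ints.

    connect_corners=False: only pixels sharing an edge are connected.
    connect_corners=True: pixels sharing only a vertex are also connected.
    """
    rows, cols = len(grid), len(grid[0])
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if connect_corners:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    seen = [[False] * cols for _ in range(rows)]
    clusters = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != phase or seen[r][c]:
                continue
            # New cluster found: flood-fill it with a breadth-first search.
            clusters += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in steps:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == phase and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return clusters


# A diagonal line of particulate pixels: the pixels touch only at vertices.
sample = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
edge_only = count_clusters(sample, connect_corners=False)    # 3 clusters
with_corners = count_clusters(sample, connect_corners=True)  # 1 cluster
```

The same microstructure therefore yields different cluster counts under the two definitions, which is why every connectivity-based metric has to be recomputed for each definition before the resulting correlations are compared.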

Similarly, neural networks can be used to determine which window size/shape for measuring entropy gives a better predictor of the physical properties. Table 7 lists the correlation coefficient (c) for all nine metrics with varying entropy windows. There are very small differences in c from one window to the other. The window e3 (8 × 8) provides the strongest correlation when predicting Young’s modulus and conductivity.

Fig. 8. Example 1.

Fig. 9. Example 2.

Table 8 separates samples by structure type; all nine material metrics are used in the neural network. The conductivity of stringy clusters correlates very strongly, for example, but the correlation for conductivity of random materials is very low. This illustrates that random samples are all very similar from a global property point of view; the other structures are much less homogeneous.

Finally, Fig. 4 is used to compare the performance of a group of structure metrics with a single homology metric used in a previous study [23]. The prior paper reported reasonable correlation between homology metrics and bulk modulus. The new results in Fig. 4 dramatically emphasize the benefits of combining a range of metrics to accurately predict the bulk properties.

6. Conclusions

Neural networks have been demonstrated as an effective method both for creating a model that captures relationships between structure metrics and bulk properties of a composite, and for determining the relative strength of individual metrics in predicting these properties. The presence of a percolating path of particles most influences both modulus and conductivity of the samples tested in this study. The related metrics of maximum cluster size and relative homology similarly play a strong role. The entropy and the zeroth Betti number, b0 (the number of disconnected particulate clusters), play much weaker roles.

The conductivity analysis results in higher correlation coefficients than the elasticity analysis (see Tables 2 and 3). This is because elasticity depends on the geometry of rigidity whereas conductivity relies more on connectivity. We identified percolation (P) and maximum grain height (hmax) as strong predictors of material properties; the usefulness of these metrics hinges on the contrast between the constituent material properties. The difference in Young’s modulus for the two materials is less than 3 orders of magnitude, but the difference in conductivity is 4 orders of magnitude. Our set of nine metrics is therefore better suited to the conductivity analysis.

Combinations of various metrics provide more accurate models than single metrics do alone. Moreover, the metrics that are easiest to measure (relating to connectivity) provide the best correlation.

The definition of connectedness of pixels (via a corner or an edge) does not have a very strong influence on the resultant performance of the metrics as property predictors, at least for the sample types in this study. Similarly, the window shape and size used in the definition of entropy did not greatly affect the strength of entropy as a metric. However, an optimal window size was identified.

Finally, the type of clustering present in a sample significantly affected how well the structure metrics were able to capture the bulk properties. Most importantly, the bulk properties of random structures of the size used in this study do not vary greatly; hence the ability of the metrics to capture the bulk performance is reduced for this type of structure.

Neural net predictions of Young’s modulus and conductivity agree well with the finite element calculations that they are built from. Comparing neural nets created from different subsets of structure metrics provides insight into which metrics most strongly correlate with these bulk composite properties (in this case, the homology metrics) and may therefore facilitate development of improved structure–property relations.

Acknowledgements

The authors wish to thank Prof. Christophe Giraud-Carrier for his helpful insights and discussions on the topic of machine learning. D. Gerrard wishes to thank the BYU College of Physical and Mathematical Sciences for funding. D. Fullwood acknowledges NSF funding via grant #CMMI-1235365.

Appendix A. Definition of structure metrics

A.1. Betti numbers, b0, b1

Homology is a mathematical description of phase connectivity within a sample. The homology of a two-dimensional sample can be completely described using two numbers known as Betti numbers. Betti zero (b0) indicates the number of connected pieces, and Betti one (b1) indicates the number of generators for the group of loop classes (see [23,31]). Fig. 5 illustrates the physical meaning of these numbers. If one loop can be pushed to a second loop without leaving the particulate material, then the two loops are in the same class (see Fig. 6).
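For a pixelated 2D sample, both Betti numbers can be obtained by counting connected components. The sketch below is an illustration under stated assumptions rather than the procedure used in the study: the particulate phase is marked 1, the region outside the sample is treated as matrix, and b1 is taken as the number of enclosed matrix regions (holes), counted with the connectivity complementary to that used for the particulate phase. The names `count_components` and `betti_numbers` are hypothetical.

```python
from collections import deque

EDGE = [(-1, 0), (1, 0), (0, -1), (0, 1)]
CORNER = EDGE + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def count_components(grid, value, steps):
    """Count connected components of pixels equal to `value`."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != value or seen[r][c]:
                continue
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in steps:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == value and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


def betti_numbers(grid, connect_corners=False):
    """Return (b0, b1) of the phase marked 1 (assumed conventions, see text)."""
    fg_steps = CORNER if connect_corners else EDGE
    # The matrix uses the complementary connectivity so that a loop of
    # particulate pixels actually separates its inside from its outside.
    bg_steps = EDGE if connect_corners else CORNER
    b0 = count_components(grid, 1, fg_steps)
    # Pad with a ring of matrix pixels so the exterior is one component.
    cols = len(grid[0])
    padded = ([[0] * (cols + 2)]
              + [[0] + list(row) + [0] for row in grid]
              + [[0] * (cols + 2)])
    b1 = count_components(padded, 0, bg_steps) - 1  # exclude the exterior
    return b0, b1


sample = [
    [1, 1, 1, 0, 1],
    [1, 0, 1, 0, 0],
    [1, 1, 1, 0, 1],
]
b0, b1 = betti_numbers(sample)  # one ring plus two isolated pixels: (3, 1)
```

Because the exterior is treated as matrix, a loop that is closed only by the sample boundary is not counted in this sketch; a different boundary convention would change b1.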

A.2. Relative homology, b0(v, T + B), b1(v, T + B)

In our analysis we consider the elasticity and conductivity only in the vertical direction across the samples. Therefore, it is useful to have a description of the number of paths in the particulate phase between the top and bottom of a sample. Relative homology is also described through the use of Betti numbers. The region occupied by the particulate phase is denoted by v. The top and bottom edges of the sample are denoted T and B. The relative homology of (v, T + B) is the same as the reduced homology of v/(T + B), where v/(T + B) is the quotient space of v in which T + B is identified to a point. Thus b1(v, T + B) = b1(v/(T + B)) and b0(v, T + B) = b0(v/(T + B)) − 1.

Relative homology can provide a good indication of the bulk properties of a material, but there is still a lot of information that is not used in this metric. Including the areas and heights of the percolated grains gives more information and provides a stronger correlation. The best results are obtained when as much of the available information as possible is used.
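As a hedged illustration of the relative Betti zero number, the sketch below relies on the standard fact that the rank of the zeroth relative homology group counts the connected components that do not meet the subspace being collapsed. It assumes that T and B correspond to the first and last pixel rows and that the particulate phase is marked 1; the function name `relative_betti_zero` is hypothetical. The relative Betti one number requires more machinery and is not attempted here.

```python
from collections import deque


def relative_betti_zero(grid, phase=1, connect_corners=False):
    """Count clusters of `phase` that touch neither the top nor the bottom row."""
    rows, cols = len(grid), len(grid[0])
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if connect_corners:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    seen = [[False] * cols for _ in range(rows)]
    interior_clusters = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != phase or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            touches_boundary = False
            while queue:
                y, x = queue.popleft()
                if y == 0 or y == rows - 1:
                    touches_boundary = True
                for dy, dx in steps:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == phase and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if not touches_boundary:
                interior_clusters += 1
    return interior_clusters


sample = [
    [1, 0, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 1],
]
# Four clusters in total; three touch the top or bottom row, one is interior.
rel_b0 = relative_betti_zero(sample)
```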

A.3. Percolation, P

A simple descriptor of material arrangement is whether or not percolation has occurred. Percolation occurs if there exists a connected path between opposite edges of the plate through a chosen material type. A sample where percolation has occurred behaves differently from one where percolation has not occurred [10]. Fig. 7 shows two heterogeneous samples; percolation has occurred in one sample but not in the other.

We determine whether percolation has occurred across the height or the width of the sample for both phases of material. A ‘one’ is assigned if percolation has occurred; a zero is assigned otherwise.
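A percolation flag of this kind can be computed with a flood fill started from one edge of the sample. The following is a minimal sketch, assuming a 2D list of phase labels; the function name `percolates` and its parameters are hypothetical.

```python
from collections import deque


def percolates(grid, phase, vertical=True, connect_corners=False):
    """Return 1 if `phase` connects opposite edges of the sample, else 0.

    vertical=True tests top-to-bottom; vertical=False tests left-to-right.
    """
    if not vertical:
        # Transpose so the left-to-right test reuses the top-to-bottom code.
        grid = [list(col) for col in zip(*grid)]
    rows, cols = len(grid), len(grid[0])
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if connect_corners:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    seen = [[False] * cols for _ in range(rows)]
    queue = deque()
    # Seed the search with every pixel of the chosen phase in the first row.
    for c in range(cols):
        if grid[0][c] == phase:
            seen[0][c] = True
            queue.append((0, c))
    while queue:
        y, x = queue.popleft()
        if y == rows - 1:
            return 1  # reached the opposite edge
        for dy, dx in steps:
            ny, nx = y + dy, x + dx
            if (0 <= ny < rows and 0 <= nx < cols
                    and grid[ny][nx] == phase and not seen[ny][nx]):
                seen[ny][nx] = True
                queue.append((ny, nx))
    return 0


sample = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
]
# One flag per phase and direction, as described above.
flags = {
    (phase, direction): percolates(sample, phase, vertical=(direction == "height"))
    for phase in (0, 1)
    for direction in ("height", "width")
}
```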

A.4. Entropy, e

Geometric entropy is a measure of the disorder of a sample. Entropy is found by taking a window and moving it down and across a sample. Each pixel within this window is one of two phases. These values are used to calculate the entropy using the equation

$$e = \ln(\Omega(k)), \quad \text{where} \quad \Omega(k) = \prod_{\forall i} \binom{n}{a_i} \qquad (1)$$

and n is the area of the window in pixels and ai is the area of matrix (material one) in pixels [24]. We use a 2 × 2 pixelated window for all entropy calculations, and an additional study on the effect of varying window size is given in Table 7.
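Since the logarithm of a product is a sum of logarithms, the entropy in Eq. (1) can be accumulated one window position at a time. The sketch below assumes a square window moved one pixel at a time (overlapping positions) and a matrix phase marked 0; both choices, and the name `geometric_entropy`, are assumptions made for illustration rather than details confirmed by the text.

```python
from math import comb, log


def geometric_entropy(grid, window=2, matrix_phase=0):
    """Sum ln C(n, a_i) over all positions of a square sliding window.

    n is the window area in pixels and a_i is the number of matrix pixels
    found in the window at position i.
    """
    rows, cols = len(grid), len(grid[0])
    n = window * window
    entropy = 0.0
    for r in range(rows - window + 1):
        for c in range(cols - window + 1):
            a_i = sum(
                1
                for dy in range(window)
                for dx in range(window)
                if grid[r + dy][c + dx] == matrix_phase
            )
            entropy += log(comb(n, a_i))
    return entropy


sample = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
# Each of the four 2 x 2 windows holds three matrix pixels, so e = 4 ln C(4, 3).
e = geometric_entropy(sample, window=2)
```

Because C(n, a) = C(n, n − a), the result does not depend on which of the two phases is counted as the matrix. A window that is entirely one phase contributes ln 1 = 0, so uniform regions add nothing to the entropy.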

A.5. Maximum cluster height, hmax

The height (in pixel lengths) of the connected cluster whose height is greater than that of all other clusters is the maximum cluster height.

A.6. Mean cluster height, havg

The arithmetic mean of all cluster heights (in pixel lengths) is the mean cluster height.
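Both height metrics follow directly from a cluster labeling. The sketch below assumes that the height of a cluster is its vertical extent (number of pixel rows spanned) and that the particulate phase is marked 1; the function name `cluster_heights` is hypothetical.

```python
from collections import deque


def cluster_heights(grid, phase=1, connect_corners=False):
    """Return the vertical extent, in pixels, of every cluster of `phase`."""
    rows, cols = len(grid), len(grid[0])
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if connect_corners:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    seen = [[False] * cols for _ in range(rows)]
    heights = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != phase or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            top = bottom = r
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                for dy, dx in steps:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == phase and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            heights.append(bottom - top + 1)
    return heights


sample = [
    [1, 0, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 1],
]
heights = cluster_heights(sample)
h_max = max(heights)                 # maximum cluster height
h_avg = sum(heights) / len(heights)  # mean cluster height
```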

A.7. Plate instance example

Example 1 (Fig. 8) is created with the ‘‘stringy cluster’’ algorithm.

1. Betti zero: b0(v) = 12.
2. Betti one: b1(v) = 7.
3. Betti zero/Betti one: b0(v)/b1(v) = 1.724.
4. Relative Betti zero homology: b0(v, T + B) = 9.
5. Relative Betti one homology: b1(v, T + B) = 10.
6. Percolation: P = 0.
7. Entropy: e = 1.2621E+03.
8. Maximum cluster height: hmax = 23.
9. Mean cluster height: havg = 32.055.

Example 2 (Fig. 9) is created with the ‘‘circle cluster’’ algorithm.

1. Betti zero: b0(v) = 10.
2. Betti one: b1(v) = 58.
3. Betti zero/Betti one: b0(v)/b1(v) = 1.724.
4. Relative Betti zero homology: b0(v, T + B) = 7.
5. Relative Betti one homology: b1(v, T + B) = 60.
6. Percolation: P = 0.
7. Entropy: e = 1.0790E+03.
8. Maximum cluster height: hmax = 67.
9. Mean cluster height: havg = 28.80.

References

[1] G.W. Milton, The Theory of Composites, Cambridge University Press, Cambridge, 2002.
[2] S. Torquato, Random Heterogeneous Materials, Springer-Verlag, New York, 2002.
[3] R. Phillips, I. NetLibrary, Crystals, defects and microstructures modeling across scales, Cambridge University Press, Cambridge; New York, 2001.
[4] W.A. Curtin, R.E. Miller, Mater. Sci. Eng. 11 (2003) 33.
[5] H.M. Zbib, T. Diaz de la Rubia, Int. J. Plast. 18 (2002) 1133.
[6] R.E. Rudd, J.Q. Broughton, Phys. Status Solidi B 217 (2000) 251.
[7] B.L. Adams, S.R. Kalidindi, D.T. Fullwood, Microstructure-Sensitive Design for Performance Optimization, Butterworth-Heinemann, 2012.
[8] D. Fullwood, S. Niezgoda, B. Adams, S. Kalidindi, Prog. Mater. Sci. 55 (2010) 477.
[9] G.B. Olson, Science 277 (1997) 1237.
[10] Y. Chen, C.A. Schuh, Acta Mater. 54 (2006) 4709.
[11] G. Grimmett, Percolation, second ed., Springer, 1999.
[12] D. Stauffer, A. Aharony, Introduction to Percolation Theory, second ed., Taylor and Francis, Bristol (PA), 1992.
[13] M. Sahimi, J. Phys. C: Solid State Phys. 16 (1983) L521.
[14] B. Paul, Trans. Metall. Soc. AIME 218 (1960) 36.
[15] R. Hill, Ser. A Math. Phys. Sci. 65 (1952) 349.
[16] Z. Hashin, S. Shtrikman, J. Mech. Phys. Solids 11 (1963) 127.
[17] E. Kroner, J. Mech. Phys. Solids 25 (1977) 137.
[18] S.E. Wilding, D.T. Fullwood, J. Comput. Mater. 50 (2011) 2262.
[19] M. Frary, C.A. Schuh, Acta Mater. 53 (2005) 4323.
[20] O. Johnson, C. Gardner, N. Mara, A. Dattelbaum, G. Kaschner, T. Mason, D. Fullwood, Met. Trans. A 42 (2011) 3898.
[21] M. Gameiro, K. Mischaikow, T. Wanner, Acta Mater. 53 (2005) 693.
[22] T. Wanner Jr., Acta Mater. 58 (2010) 102.
[23] D.D. Gerrard, D.T. Fullwood, D.M. Halverson, S.R. Niezgoda, Comput. Mater. Continua 15 (2010) 129.
[24] R. Piasecki, A. Plastino, Physica A 389 (2010) 397.
[25] D.D. Gerrard, D.T. Fullwood, D.M. Halverson, S.R. Niezgoda, Quality Control of Heterogeneous Materials Through Topological Homology and Entropy, SAMPE, Salt Lake City, 2010.
[26] Matlab. The Mathworks Inc, 2006.
[27] Samuel E. Wilding, T. David, Comput. Mater. Sci. 50 (2011) 2262–2272.
[28] ANSYS 11.0. ANSYS Inc, 2007.
[29] Hall, Mark, Eibe Frank, Geoffrey Holmes, Bernhard Pfahringer, Peter Reutemann, Ian H. Witten, The WEKA Data Mining Software: An Update; SIGKDD Explorations, vol. 11 (1) 2009.
[30] Y. Chen, C.A. Schuh, Phys. Rev. E (2009) 80.
[31] T. Kaczynski, K. Mischaikow, M. Mrozek, Computational Homology, Springer, New York, 2004.