Neuroevolution of a Hybrid Power Plant Simulator


Shauharda Khadka, Oregon State University

[email protected]

Kagan Tumer, Oregon State University

[email protected]

Mitch Colby, Oregon State University

[email protected] Tucker

[email protected]

Paolo Pezzini, AMES Laboratory

[email protected]

Kenneth Bryden, AMES Laboratory

[email protected]

ABSTRACT
Ever increasing energy demands are driving the development of high-efficiency power generation technologies such as direct-fired fuel cell turbine hybrid systems. Due to the lack of an accurate system model, high nonlinearities and high coupling between system parameters, traditional control strategies are often inadequate. To resolve this problem, learning-based controllers trained using neuroevolution are currently being developed. In order for the neuroevolution of these controllers to be computationally tractable, a computationally efficient simulator of the plant is required. Despite the availability of real-time sensor data from a physical plant, supervised learning techniques such as backpropagation are deficient, as minute errors at each step tend to propagate over time. In this paper, we implement a neuroevolutionary method in conjunction with backpropagation to ameliorate this problem. Furthermore, a novelty search method is implemented which is shown to diversify our neural network based simulator, making it more robust to local optima. Results show that our simulator is able to achieve an overall average error of 0.39% and a maximum error of 1.26% for any state variable, averaged over the time-domain simulation of the hybrid power plant.

CCS Concepts
• Hardware → Fuel-based energy;

Keywords
Hybrid Performance Project; System Modeling; Evolutionary Algorithms; Novelty Search

1. INTRODUCTION
To keep up with the ever increasing demands for energy, as well as to minimize emissions and environmental impacts, research on all aspects of power plants, ranging from combustion, gasification, turbines, fuel cells and carbon dioxide


GECCO ’16, July 20-24, 2016, Denver, CO, USA
© 2016 ACM. ISBN 978-1-4503-4206-3/16/07...$15.00

DOI: http://dx.doi.org/10.1145/2908812.2908948

separation and capture is being conducted. The U.S. Department of Energy’s Hybrid Performance Project (Hyper) is one such example that incorporates a gas turbine power cycle with a solid oxide fuel cell.

A hybrid system emulated by Hyper comes with the cost of increased complexity. Empirical system models are difficult to obtain, and typically only describe limited operating regimes. The challenge of controlling a hybrid system thus creates fundamentally new problems in the areas of modeling, sensing, optimization and control. Traditional control strategies are potentially inadequate due to high nonlinearities, coupling and the lack of an accurate system model [15]. Neuroevolutionary controllers offer a promising solution to the control of complex systems such as Hyper [2]. However, for a neuroevolutionary algorithm to be computationally tractable, a fast simulator is required in order to quickly evaluate and assign fitness to different candidate controllers.

In this paper, we develop a neural network based simulator of the Hyper facility. Neural networks have been shown to be universal function approximators [1] successful in modeling complex system dynamics. A key advantage of a neural network is that once trained, the network can be used to make predictions with minimal computational cost, allowing for fast and efficient evaluations of controllers.

The algorithm for developing the simulator is shown in Figure 1. Initially, a network was trained with backpropagation using a dataset obtained from a physical power plant. This network was unable to accurately model Hyper. The failure occurs because backpropagation places equal weight on all training points, which often led to large changes in the system state being ignored if the state subsequently returned to its previous value. In order to capture the dynamic response of the Hyper facility, we use a neuroevolutionary algorithm to develop a Hyper simulator, using the networks trained from backpropagation to seed the evolutionary search.

A key issue in designing a Hyper model is the distribution of error across state variables. In the case of simulation for testing controllers, it is critical that errors in the system model are spread across state variables, rather than clustered in one or a few state variables. To address this concern, our neuroevolutionary algorithm uses both a fitness metric based on model accuracy and a novelty metric based on how the model error is distributed across state variables.

The key contribution of this paper is the development of the Hyper simulator using both backpropagation and neuroevolution, implementing novelty metrics to ensure that


Figure 1: Development of the Hyper simulator. Data was collected from physical plant runs, and is used to train a neural network via backpropagation. The networks from backpropagation are used to seed a novelty-based neuroevolutionary algorithm, in order to better capture transient responses in the plant.

system error is distributed across state variables in the system model.

The rest of this paper is organized as follows. Section 2 details background on the Hyper facility and novelty search. Section 3 explains the algorithms used to develop the Hyper simulator. Section 4 shows a number of experiments which probe the performance of our simulator, and Section 5 concludes by highlighting future areas of research for our simulator.

2. BACKGROUND
The following sections provide detail on the Hyper facility and novelty-based search.

2.1 Hybrid Performance Project (Hyper)
The Hybrid Performance Project facility (Figure 2), Hyper, is located at the Department of Energy’s National Energy Technology Laboratory (NETL) campus in Morgantown, West Virginia. The purpose of this experimental plant is to study the complex interactions of the direct-fired Solid Oxide Fuel Cell (SOFC) gas turbine hybrid configuration, as well as to develop control strategies for such a system. Hyper is a small-scale SOFC hybrid hardware simulation, capable of emulating 320 to 820 kW hybrid plants [17]. Hyper contains a hardware simulation of a 200 kW to 700 kW solid oxide fuel cell (SOFC) system coupled with a 120 kW turbine (Figure 2) [16].

The hardware-based fuel cell simulation makes Hyper unique in that it allows for a wide range of fuel cells to be simulated without additional cost, and allows for testing of control strategies without risk of damaging a costly fuel cell. The fuel cell model can be reset in software rather than being rebuilt in hardware, allowing for significant progress in the understanding of such hybrid configurations [11, 13, 14].

Figure 2: Diagram for Hybrid Performance Project facility at NETL.

The Brayton cycle is the conventional gas turbine configuration used for power generation. Gas cycle turbines are effective in power generation because they are more efficient and flexible compared to steam turbine cycles, have fast start-up times, can be built for a wide range of power outputs, and can use readily available fuels such as natural gas. The Recuperation cycle, which is the fundamental building block of the Hyper facility, is similar to the Brayton cycle but adds functionality to recuperate turbine exhaust, leading to greater efficiency [17].

Typically, atmospheric air is drawn into the system through a compressor. The pressurized air is then mixed with fuel in a combustion chamber and the fuel is ignited, adding thermal energy to the air mixture. The exhaust gases from the combustion chamber are then expanded through a gas turbine, generating mechanical work in a rotating shaft used to drive the compressor and electric generator. A standard gas turbine then vents the exhaust gases from the turbine out into the atmosphere. In a turbine with regeneration, the hot exhaust from the turbine is used to preheat the air entering the combustion chamber with a heat exchanger, reducing the amount of fuel required to heat the air.

Solid oxide fuel cells utilize a ceramic electrolyte to channel oxygen ions to react with hydrogen, producing an electric current. Fuel cell turbine hybrids operate at very high efficiency, typically up to 60-75%, and with low carbon emissions [19]. These fuel cells operate at a high temperature, reforming natural gas or other hydrocarbon fuels to produce the hydrogen needed for the reaction, and ionizing the hydrogen and oxygen to be transported across the electrolyte. Temperatures can reach up to 1000 °C, and much of this heat is wasted in the exhaust gas. Fuel cells are also typically slow to heat and start up, limiting their use in applications requiring fast plant start-up [17]. Pressurized air enters the cathode of the fuel cell, and fuel enters the anode. Hot exhaust leaves the fuel cell at a high temperature, along with any unconverted fuel.

The Hyper project places a fuel cell between the regeneration heat exchangers and combustion chamber of a typical recuperated Brayton cycle. Primary heat generation to run the turbine comes from the fuel cell exhaust. The combustion chamber burns any unspent fuel, assists in start-up, and regulates turbine inlet temperature. Exhaust from the turbine runs through a set of parallel heat exchangers to preheat air into the fuel cell. More than 200 sensors are located


across the plant designed to provide real-time information to the controller and log the system state during experiments.

2.2 Novelty-based Search
Novelty search is a recently developed method in evolutionary computation where individuals in a population are selected solely on how different they are from other solutions evaluated so far [6]. The method is an alternative to objective-based searches that are prone to the pathology of local optima. The landscape drawn by objective functions (fitness) that measure the progress towards an objective is often deceptive and may prevent the objective from being reached [8]. The stepping stones that may lead to the objective can be in discord with this landscape and thus may be ignored by objective-based search.

For example, consider fingers stuck in a Chinese finger trap. The goal here is to free the fingers; a natural choice of fitness metric is to reward the separation between fingers. Performing the direct action of pulling the fingers apart will follow the landscape drawn by this objective function but will yield no progress. The necessary precursor to solving this trap is to push the two fingers together, which enlarges the trap’s openings and frees the fingers. This action of pushing the fingers together, however, initially only seems to entrap the fingers more severely [8]. It is at odds with the landscape drawn by the objective function and will likely not be explored by a fitness-based search using the fitness metric defined above.

The main idea behind novelty search is that searching for novelty instead of explicitly seeking an objective can help circumvent this deception. In the Chinese finger trap example, searching for novel behaviors would eventually lead the search to behaviors that push the fingers together. Initially this would only entrap the fingers more severely. Since novelty search does not consider the fitness alone, however, it is not dissuaded by this deception and will continue searching in this area of the behavior space, eventually achieving the goal of releasing the fingers.

In order to prevent deception, complex search problems are often carefully designed with layers of sequential tasks. The goal is broken down into a strictly defined series of objectives. This sort of decomposition is, however, ad hoc, requires intimate domain knowledge [8], and is not always feasible. Novelty search, with its ability to circumvent deception, offers an interesting alternative in these domains.

Evolutionary algorithms are suited for novelty search because of their population-based method. The primary change required to implement novelty search in evolutionary algorithms is to replace the fitness metric with a novelty metric. The definition of this metric is up to the user, but should capture and quantify differences among behaviors. Novelty search succeeds where objective-based searches fail by rewarding stepping stones that could lead to better performance. The novelty of a newly generated individual is determined by how different its behavior (phenotypical expression) is with respect to the behavior of an archive of past individuals. The goal is to quantify the novelty of the individual’s behavior relative to the behaviors seen in the past.

Rewarding novelty pushes the population to constantly explore new areas of the behavior space [7]. Thus the novelty metric should measure the sparseness of any point in the behavior space. A simple metric often used to measure sparseness is to compute the average distance to the k-nearest neighbors of that point [7].

$$\rho(x) = \sum_{i=1}^{k} \mathrm{dist}(x_i, \mu_i) \tag{1}$$

Here ρ represents the sparseness at any point, k is the number of neighbors and dist computes the Euclidean distance between x_i (candidate behavior) and µ_i (behavior in the archive).

An extension to novelty search is minimal criteria novelty search (MCNS), which enforces a minimal criteria that must be met by all individuals within the population [7]. If an individual satisfies the minimal criteria m_o, it is assigned the normal novelty score as described above. If an individual does not satisfy this criteria, however, it is assigned a novelty score of 0. This harsh penalty means that the individual’s chances of selection are drastically reduced. Such individuals are only considered for selection if there exist no other individuals in the population that meet the criteria, at which point a random selection is used among these candidates. The minimal criteria effectively fractures the behavior space into feasible space and infeasible space. In doing so, it reduces the behavior space being explored and increases the efficiency of novelty search in large search problems [7].

$$\mathrm{score}(i) = \begin{cases} nov_i & \text{for } w_i \le m_o \\ 0 & \text{for } w_i > m_o \end{cases} \tag{2}$$

A further extension to this abstraction is the idea of a progressive minimal criteria novelty search (PMCNS) [3], where the aforementioned minimal criteria is made progressively stricter, leading to faster convergence.
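The sparseness measure and the minimal criteria scoring described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' implementation: the function names `sparseness` and `mcns_score` are hypothetical, and the archive is assumed to be a simple array of behavior vectors. The text describes an average over the k nearest neighbors, so the sketch divides by k; Eq. 1 as printed shows only the sum, and the two differ by a constant factor.

```python
import numpy as np

def sparseness(behavior, archive, k):
    """Average Euclidean distance from `behavior` to its k nearest archive entries."""
    archive = np.asarray(archive, dtype=float)
    dists = np.linalg.norm(archive - np.asarray(behavior, dtype=float), axis=1)
    k = min(k, len(dists))          # guard against an archive smaller than k
    nearest = np.sort(dists)[:k]    # the k smallest distances
    return float(nearest.mean())

def mcns_score(novelty, meets_criteria):
    """Eq. 2: keep the novelty score if the minimal criteria is met, else 0."""
    return novelty if meets_criteria else 0.0
```

A candidate sitting in a crowded region of the archive receives a low score, while one far from everything seen so far receives a high score.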

3. APPROACH
The data collected from the Hyper facility in Morgantown, WV contained 30,626 data points sampled at 12.5 Hz, which translates to approximately 10 minutes of real-time plant operation. The dataset pertains to an experiment involving a characterization of the cold air valve. In this experiment, the cold air valve was opened to a position and was systematically changed once the plant had reached steady state. The valve was changed between 10% and 80% open. Each resultant data point comprised 19 plant state variables and 2 control variables. To decrease the training time of our neural network and reduce noise, the data was downsampled by averaging over 25 data points. The resulting dataset comprising 1224 data points was split into training/testing subsets consisting of 1000/224 data points respectively. The data was normalized to be between 0 and 1.

In this section, we detail the experimental approach used in our research. The Hyper abstract simulator was developed in two distinct steps. First we utilize backpropagation to train a neural network using the time-domain data (Section 3.1). Then we use neuroevolution to evolve the simulator (Section 3.2).
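The preprocessing described at the start of this section (block averaging, scaling to [0, 1], and a sequential train/test split) might look roughly like the sketch below. The data here is a synthetic stand-in generated at random, and the helper names are hypothetical. How leftover samples at the end of the recording were handled is not stated in the text, so the block count in this sketch may differ by one from the figures reported above; per-column min-max scaling is likewise an assumption.

```python
import numpy as np

def downsample_mean(data, block=25):
    """Average consecutive blocks of `block` rows; trailing leftover rows are dropped."""
    n_blocks = data.shape[0] // block
    trimmed = data[: n_blocks * block]
    return trimmed.reshape(n_blocks, block, data.shape[1]).mean(axis=1)

def min_max_normalize(data):
    """Scale each column to [0, 1]; constant columns map to 0."""
    lo = data.min(axis=0)
    hi = data.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    return (data - lo) / span

# Synthetic stand-in for the plant log: 19 state + 2 control columns.
rng = np.random.default_rng(0)
raw = rng.normal(size=(30626, 21))

reduced = downsample_mean(raw)          # one row per 25 raw samples
scaled = min_max_normalize(reduced)
train, test = scaled[:1000], scaled[1000:]
```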

3.1 Backpropagation
A single hidden layer neural network was created with 21 inputs, 19 of which represented the current state of the plant while the remaining 2 represented the control inputs. The hidden layer had 35 nodes and was followed by the output layer with 19 outputs which return the plant state at the


next time step. Our network used a fully connected architecture with a sigmoid activation function. Initial network weights were drawn from a Gaussian distribution with a mean of 0 and a standard deviation of 0.5.

Backpropagation was run for 120 training episodes, which translates to approximately five minutes of runtime. Adam [4], a first-order gradient-based method, was used for optimization with a batch size of 1. The learning rate was set to 0.01 with a momentum of 0.6. The parameters given here represent the best performing set found in a parameter sweep. To avoid convergence to local optima, the training process was run for 100 independent trials and the best network was selected.
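As a rough illustration of the 21-35-19 architecture described above, the following NumPy sketch builds the weight matrices and runs one forward pass. It makes several assumptions not stated in the text: bias terms are included, biases share the same Gaussian initialization as the weights, and the sigmoid is applied at the output layer as well (plausible since the data is normalized to [0, 1]). The names `init_network` and `predict_next_state` are hypothetical, and training is not shown.

```python
import numpy as np

N_STATE, N_CONTROL, N_HIDDEN = 19, 2, 35

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_network(rng, std=0.5):
    """Draw all parameters from N(0, std); biases are an assumption of this sketch."""
    return {
        "W1": rng.normal(0.0, std, size=(N_STATE + N_CONTROL, N_HIDDEN)),
        "b1": rng.normal(0.0, std, size=N_HIDDEN),
        "W2": rng.normal(0.0, std, size=(N_HIDDEN, N_STATE)),
        "b2": rng.normal(0.0, std, size=N_STATE),
    }

def predict_next_state(net, state, control):
    """Map the current state (19) and control inputs (2) to the next state (19)."""
    x = np.concatenate([state, control])
    hidden = sigmoid(x @ net["W1"] + net["b1"])
    return sigmoid(hidden @ net["W2"] + net["b2"])
```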

3.2 Neuroevolution
Following backpropagation, the Hyper abstract simulator was evolved using two distinct neuroevolutionary methods in parallel. First, a weakness-based neuroevolution was used to develop the simulator (Section 3.2.1). The second method used novelty search for neuroevolution of our simulator (Section 3.2.2). For clarity, the simulator developed using only backpropagation will be referred to as BP-simulator. The simulators evolved using weakness-based neuroevolution and novelty-based neuroevolution will be referred to as WEAK-simulator and NOV-simulator respectively.

3.2.1 Weakness-based Search
The neural network obtained by backpropagation was used as a seed for neuroevolution, which was run for 25,000 generations; this translates to approximately 30 hours of runtime. The error at any time step was the absolute value of the difference between the training value (y) and network output (t). The error vector was then computed as the sum of these errors over all t_o time steps for each of the 19 plant state variables; each component was defined as:

$$E_i = \sum_{t=1}^{t_o} |t_i - y_i| \tag{3}$$

The term weakness (w) was then computed as the squared L2 norm of the error vector normalized by the total number of time steps.

$$w = \frac{1}{t_o} \lVert E \rVert_2^2 \tag{4}$$
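Equations 3 and 4 translate directly into array operations. The sketch below assumes the predicted and target trajectories are stored as arrays of shape (t_o, number of state variables); the function names are hypothetical.

```python
import numpy as np

def error_vector(predicted, target):
    """Eq. 3: per-variable sum over time of absolute errors."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(target, dtype=float)
    return np.abs(diff).sum(axis=0)

def weakness(predicted, target):
    """Eq. 4: squared L2 norm of the error vector divided by the number of time steps."""
    E = error_vector(predicted, target)
    t_o = np.asarray(predicted).shape[0]
    return float(E @ E) / t_o
```

For a two-step, two-variable toy trajectory with absolute errors [[1, 0], [2, 3]], the error vector is [3, 3] and the weakness is (9 + 9) / 2 = 9.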

The weakness used here is akin to anti-fitness and the objective of the evolutionary algorithm is to minimize it. The weakness could have easily been inverted to represent fitness like most evolutionary algorithms, but the term weakness in this domain more intuitively represents the error and the goal to minimize it. The population size was set to 100 and mutation was carried out by introducing random perturbations within the weights of the neural network. The severity of mutation was controlled by two parameters: mutation strength (s) and mutation quantity (q). Each individual perturbation of a weight represented an addition of a quantity drawn from a Gaussian distribution with mean zero and standard deviation equal to the mutation strength. The mutation strength parameter determines the strength of each mutation, while mutation quantity controls the number of weights to be mutated within the neural network at each mutation event. The values for q (integer) and s were picked randomly during each mutation event from ranges (1, 5) and (0.1, 1.0) respectively.

foreach mutation event do
    Assign randomly picked values for s and q
    foreach iteration 1 to q do
        Pick a random weight w
        w = w + N(0, s)
    end
end

Algorithm 1: Mutation
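A minimal Python rendering of Algorithm 1 is given below, assuming the network's weights are held in a NumPy array of any shape. Whether the range for q is inclusive of its endpoints and whether s is drawn uniformly are not stated in the text; both are assumptions of this sketch, as is the function name `mutate`.

```python
import numpy as np

def mutate(weights, rng, q_range=(1, 5), s_range=(0.1, 1.0)):
    """Return a mutated copy of `weights`; the parent array is left untouched."""
    child = np.array(weights, dtype=float, copy=True)
    q = int(rng.integers(q_range[0], q_range[1] + 1))  # mutation quantity, inclusive (assumed)
    s = rng.uniform(s_range[0], s_range[1])            # mutation strength, uniform (assumed)
    for _ in range(q):
        idx = rng.integers(child.size)                 # pick a random weight
        child.flat[idx] += rng.normal(0.0, s)          # w = w + N(0, s)
    return child
```

Because weights are picked with replacement, the same weight can be perturbed more than once in a single mutation event, so at most q distinct weights change.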

3.2.2 Novelty-based Search
Novelty-based neuroevolution was implemented according to the algorithm presented in Algorithm 2. The primary parameters such as mutation, population size and weakness computation were kept identical to the weakness-based neuroevolution described earlier. The distinguishing factor in novelty-based neuroevolution is the novelty metric, which guides the search towards seeking novel behaviors rather than explicitly towards minimizing the weakness. The definition of behavior that shapes the novelty metric is therefore critical to the success of novelty search [5]. Previous work in the application of novelty search generally utilizes lower dimensional domains with visualizable behaviors such as the maze task [12, 3, 8, 10, 18]. Our simulator, however, spans a very high dimensional space without a readily visualizable behavior. Designing a behavior metric that was easily computable, distinguishable and relatable to the simulator objective was important. Acknowledging these factors, we defined the behavior metric for our simulator to be the aforementioned error vector, which represents the average error distribution across the 19 plant state variables. The idea behind this is that, apart from the magnitude of error, the distribution of errors across the plant state variables is an important characteristic in the performance of a simulator. This is particularly relevant in a time-domain simulator where minute errors at individual time steps tend to propagate over time.

The novelty score is then defined as the average k-nearest-neighbor Euclidean distance between the error vector and the archive. The value of k was selected to be 15 after performing a parameter sweep and the archive size was limited to 1500. The archive was updated after each generation by adding the two most novel error vectors from the population and removing the oldest ones if the size limit was reached. The minimal criteria (m_o), which represents an upper threshold on weakness in this work, was defined dynamically to be the constraint coefficient (α) times the best weakness score (w_b) of the population at each generation. This is a modified definition from the original PMCNS algorithm [3].

$$m_o = \alpha \times w_b \tag{5}$$

The novelty score is first used to select 85% of the population. In order to retain the best performing candidates in terms of minimizing weakness, 10% of the population is then selected based on the lowest weakness scores available in the remaining population. This is a modification from the original novelty search algorithm, which does not have a weakness-based portion like ours. This effectively gives the novelty search method a weakness-based search subcomponent which,


apart from retaining the best candidates, also serves to exploit promising portions of the behavior space found by the exploration inherent within the novelty search method.

The penalization of candidates exceeding the weakness threshold (m_o) was also modified from the original Minimal Criteria Novelty Search (MCNS) algorithm [7] such that each candidate exceeding the threshold would get a novelty score of -1.

$$\mathrm{score}(i) = \begin{cases} nov_i & \text{for } w_i \le m_o \\ -1 & \text{for } w_i > m_o \end{cases} \tag{6}$$

where nov_i represents the novelty score computed using the k-nearest neighbor algorithm.

This was necessary to separate the candidates who were penalized by the minimal criteria from the candidates that achieved a novelty score of 0. The candidates with a score of -1 now represent the infeasible part of the population fragmented by the minimal criteria constraint. The infeasible portion of the population often discarded by the minimal criteria constraint could sometimes serve as a stepping stone to a better solution [9]. Five percent of our population is thus selected randomly from this infeasible set, marked by their novelty score of -1. Preserving these solutions serves to diversify the population and increases the possibility for the discovery of more novel behaviors.
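The three-way survivor selection described above (85% by novelty, 10% by lowest weakness, 5% at random from the infeasible set) can be sketched as follows. This is an illustration under assumptions: the function name `select_survivors` is hypothetical, the candidate pool is taken to be larger than the number of survivors (parents plus successors), and if fewer infeasible candidates remain than the random quota calls for, the sketch simply returns fewer survivors, a case the text does not address.

```python
import numpy as np

def select_survivors(novelty, weakness, m_o, n_survivors, rng):
    """Return indices of surviving candidates and the Eq. 6 scores."""
    novelty = np.asarray(novelty, dtype=float)
    weakness = np.asarray(weakness, dtype=float)
    score = np.where(weakness <= m_o, novelty, -1.0)   # Eq. 6 penalty

    n_novel = int(round(0.85 * n_survivors))
    n_weak = int(round(0.10 * n_survivors))
    n_rand = n_survivors - n_novel - n_weak

    # 1) Most novel candidates first (penalized ones sort last).
    remaining = list(np.argsort(-score, kind="stable"))
    chosen = remaining[:n_novel]
    remaining = remaining[n_novel:]

    # 2) Lowest weakness among those not yet chosen.
    remaining.sort(key=lambda i: weakness[i])
    chosen += remaining[:n_weak]
    remaining = remaining[n_weak:]

    # 3) Random picks from the infeasible set (score of -1).
    infeasible = [i for i in remaining if score[i] == -1.0]
    take = min(n_rand, len(infeasible))
    if take:
        chosen += list(rng.choice(infeasible, size=take, replace=False))
    return [int(i) for i in chosen], score
```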

Initialize a population of k neural networks
foreach Epoch do
    foreach candidate 1 to k do
        Generate successor network using mutation
    foreach Network do
        Calculate weakness
        if weakness > m_o then
            Encode novelty score as -1
        else
            Compute and encode novelty (nov_i)
    Rank the population based on novelty scores
    Extract the first n candidates for survival
    foreach iteration x do
        Select a candidate with the lowest weakness from the remaining population
    Extract the x selected candidates for survival
    foreach iteration y do
        Select a candidate at random from the infeasible region
    Extract the y selected candidates for survival
    foreach iteration z do
        Add z best candidate’s (based on novelty score) error distribution to the archive
        if archive size > max archive size then
            Delete oldest error distribution from the archive
    if Epoch mod f == 0 then
        Decay constraint coefficient α
    Find the lowest weakness score (w_b) in the population
    m_o = w_b ∗ α

Algorithm 2: Novelty-based Search Algorithm
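The bookkeeping at the end of each epoch in Algorithm 2 (archive update, decay of α, and recomputation of the threshold) can be sketched as below. A `collections.deque` with `maxlen` drops the oldest entry automatically once the size limit is reached. The decay schedule is not specified in the text, so the multiplicative `rate` and the `floor` of 1.0 are purely illustrative assumptions, as are the function names.

```python
from collections import deque
import numpy as np

MAX_ARCHIVE = 1500
archive = deque(maxlen=MAX_ARCHIVE)   # oldest entries fall off when full

def update_archive(archive, error_vectors, scores, n_add=2):
    """Append the error vectors of the n_add most novel candidates."""
    top = np.argsort(scores)[::-1][:n_add]
    for i in top:
        archive.append(np.asarray(error_vectors[i], dtype=float))

def minimal_criteria(weaknesses, alpha):
    """Eq. 5: threshold is alpha times the best (lowest) weakness."""
    return alpha * float(np.min(weaknesses))

def decay_alpha(alpha, epoch, every, rate=0.99, floor=1.0):
    """Shrink alpha every `every` epochs; rate and floor are hypothetical values."""
    if epoch % every == 0:
        return max(floor, alpha * rate)
    return alpha
```

Keeping α at or above 1 guarantees that at least the best candidate always satisfies the minimal criteria.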

3.3 Neural Network Simulator
Once a neural network is trained to map the current plant state (s_t) and control action (a_t) to the next plant state (s_{t+1}), the network can be used to develop the time-domain simulator. At each time step, a control action is chosen according to the current state. The current plant state and control action are then fed into the neural network to obtain the next plant state. This process can be repeated to perform a time-domain simulation for the power plant. This is illustrated in Algorithm 3.

Initial plant state s_0 and controller f(s_t) = a_t
foreach time step t = 1 to t = t_end do
    find current state s_t
    find control action a_t = f(s_t)
    find next state s_{t+1} = NN(s_t, a_t)

Algorithm 3: Neural network based simulator
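Algorithm 3 amounts to a closed-loop rollout in which each predicted state is fed back as the next input. A minimal sketch, assuming the trained network and the controller are both available as plain Python callables (the names `simulate`, `model` and `controller` are hypothetical):

```python
import numpy as np

def simulate(model, controller, s0, n_steps):
    """Roll the model forward from s0; returns an array of shape (n_steps + 1, state_dim)."""
    states = [np.asarray(s0, dtype=float)]
    for _ in range(n_steps):
        s_t = states[-1]                   # current state
        a_t = controller(s_t)              # control action a_t = f(s_t)
        states.append(model(s_t, a_t))     # next state s_{t+1} = NN(s_t, a_t)
    return np.stack(states)
```

Since only the initial state comes from data, any one-step prediction error is carried into every later step, which is the error propagation discussed in Section 4.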

4. RESULTS
The neural network simulator developed using each method was tested against the data obtained from physical Hyper experiments. The plots show our simulator’s performance against the actual data. First, we plot the time-domain BP-simulator’s performance for a full time-domain simulation against the same BP-simulator tasked with predicting only one time step ahead (1-step BP-simulator). It is important to note that our simulator requires full time-domain prediction, and this 1-step BP-simulator is just an artificial construct used here to illustrate the challenges of a time-domain simulation.

Figures 3a and 3b show the result for two representative variables among our 19 state variables. The BP-simulator performs very well when tasked with predicting one time step ahead. Given the training dataset, backpropagation is able to achieve minute errors at each time step, as illustrated by the 1-step BP-simulator. Our simulator, however, needs to be able to predict the entire time domain using only the starting state variables given at time step zero and the control actions at each step. When the same BP-simulator is tasked with this full time-domain simulation, its performance suffers drastically. Tiny errors at each time step across the 19 state variables propagate over time, leading to larger errors.

Figures 4, 5 and 6 show the BP-simulator’s performance against the simulators constructed using neuroevolutionary methods: NOV-simulator and WEAK-simulator. All of the plots represent full time-domain simulations where only the initial state of the plant, and control actions at each time step, are fed into the simulator.

Figures 4 and 5 represent our simulator’s performance for state variables FT-162 and PDT-158; both state variables have a similar response in large parts of the time-domain simulation, which resembles a steady state. The transitions that occur periodically, however, are the most salient parts, and capturing the magnitude and location of these responses is critical to a good time-domain simulator. The BP-simulator fails to capture most of these transitions in each of the modeled state variables. It is particularly insensitive to the magnitude of these transitions. Backpropagation places equal weights on all of its training examples which leads to


Figure 3: A full time-domain BP-simulator performance against a 1-step BP-simulator. (a) FT-162 sensor measures flow rate on a compressor outlet (b) FT-432 sensor measures flow rate into the Fuel Cell Simulator Combustor

these transitions being diluted as the state subsequently returns to the steady state. This causes the muted response seen for the BP-simulator at these transition phases. The neuroevolutionary methods (WEAK-simulator and NOV-simulator) evade this dilution by use of a weakness and novelty metric respectively, each of which takes the entire time-domain simulation into account. This aggregation in the computation of error makes them sensitive to transitions even when they are followed by the steady state. This is a key factor in our neuroevolutionary simulators, which leads them to perform significantly better than the BP-simulator.

Figure 6 shows a particularly noisy dataset representing the FT-432 state variable. The noise in this dataset, originating from the physical sensor itself, poses a further challenge to our neural network based simulator. Similar to the response seen before in Figures 4 and 5, the BP-simulator fails to capture the magnitude of transition in the state variable. It also suffers heavily from the noisy nature of the dataset. The

Figure 4: Full time-domain simulation for FT-162 sensor which measures flow rate on a compressor outlet

Figure 5: Full time-domain simulation for PDT-158 sensor which measures the pressure difference between compressor outlet and the turbine inlet

Figure 6: Full time-domain simulation for FT-432 sensor which measures flow rate into the Fuel Cell Simulator Combustor

neuroevolutionary simulators, however, are able to capture the transition very well despite the noise.

The neuroevolutionary simulators exhibited superior performance across all of the 19 plant state variables when compared to the BP-simulator. The neuroevolutionary methods’ use of a weakness/novelty metric that captured the overall performance of the simulator was key to this superior performance. The aggregated performance metric, combined with a population-based search, allowed the evolutionary method to consider weight updates that improved accuracy in portions of higher variation while temporarily leading to more error elsewhere. The survival of these candidates eventually led to better solutions with superior overall performance.

To further probe the differences between our methods, we tested each set of simulators developed using our methods for genetic diversity. For neuroevolutionary methods, the final population obtained was used to test for diversity. For backpropagation, a population equal in size to that of the neuroevolutionary methods was constructed by running p independent trials, where p is the population size of our evolutionary methods. The genetic diversity for each method was calculated as the aggregate variance in each weight value across its population of size p:

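$$\mathrm{diversity} = \frac{1}{p} \sum_{i=1}^{w} \sum_{j=1}^{p} \left( w_{ij} - w_{i,\mathrm{avg}} \right)^2 \tag{7}$$

where w here represents the total number of weights across all the layers in our neural network, w_{ij} is the value of weight i in individual j, and w_{i,avg} is the mean of weight i over the population.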
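Under the reading of Eq. 7 in which each weight's deviation is taken from that weight's own population mean, the diversity measure is the sum of per-weight population variances. A short sketch, assuming each network's weights have been flattened into one row of a (p, w) array (the function name is hypothetical):

```python
import numpy as np

def genetic_diversity(population_weights):
    """Sum over weights of the population variance of each weight (Eq. 7)."""
    W = np.asarray(population_weights, dtype=float)   # shape (p, w)
    p = W.shape[0]
    deviations = W - W.mean(axis=0)                   # subtract each weight's mean
    return float((deviations ** 2).sum() / p)
```

For a population of two networks with weights [0, 0] and [2, 4], the per-weight means are [1, 2] and the diversity is (1 + 4 + 1 + 4) / 2 = 5.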

Figure 7: Genetic diversity in the population. [Bar chart of weight variance for the novelty-based method, the weakness-based method, and backpropagation only.]

Figure 7 represents the genome diversity for a population of simulators attained using each method. Unsurprisingly, novelty search yields the most diverse genome, followed by weakness-based search and finally by the method using backpropagation only. While genome diversity for a neural network based method with fully connected architecture is not particularly informative, it demonstrates the search breadth of our methods. A more diverse genome, such as the one obtained by the novelty-based method, is demonstrative of its ability to find competitive solutions with varying encoding. This successful exploration is a contributing reason for its superior performance in the design of our abstract simulator.

To utilize this diversity of each population and probe for diversity in the phenome space, we construct a quasi-ensemble neural network made out of the best performing candidates from the final population for each plant state variable. The results show histograms of errors across the 19 plant state variables averaged across all the time steps for these quasi-ensemble systems.

Figure 8: Error histogram: NOV-simulator versus BP-simulator. [Error percentage (%) for each of the 19 plant state variables.]

As shown in Figure 8, the NOV-simulator clearly exhibits superior performance compared to the BP-simulator. The time-averaged percent error is lower for the NOV-simulator than for the BP-simulator for each of the 19 state variables. The highest time-averaged error percentage for both methods is seen for state variable 3, where the BP-simulator reaches 19.60% ± 0.35% compared to 1.26% ± 0.01% for the NOV-simulator. The average error percentage across all 19 state variables is 3.56% for the BP-simulator; the NOV-simulator reduces this considerably, to 0.39%.
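The reported quantities can be illustrated with a short sketch. It assumes that the percent error at each time step is the absolute difference between prediction and measurement relative to the measured value, which is then averaged over time for each state variable. This definition, along with the function name and the toy numbers, is an assumption made for the example.

```python
def time_averaged_percent_error(predicted, actual):
    """Per-variable percent error averaged over all time steps.

    `predicted[t][v]` and `actual[t][v]` hold the value of state variable v
    at time step t. Measured values are assumed to be nonzero.
    """
    steps = len(actual)
    num_variables = len(actual[0])
    result = []
    for v in range(num_variables):
        total = sum(
            abs(predicted[t][v] - actual[t][v]) / abs(actual[t][v])
            for t in range(steps)
        )
        result.append(100.0 * total / steps)
    return result


# Toy data: two time steps, two state variables.
toy_actual = [[100.0, 50.0], [200.0, 50.0]]
toy_predicted = [[101.0, 50.0], [198.0, 55.0]]

per_variable = time_averaged_percent_error(toy_predicted, toy_actual)
overall_average = sum(per_variable) / len(per_variable)
worst_variable = max(range(len(per_variable)), key=lambda v: per_variable[v])
print(per_variable, overall_average, worst_variable)
```

The overall average and the worst-case variable computed here correspond in form to the 0.39% average and the 1.26% maximum quoted for the NOV-simulator.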

[Figure 9 plot: error percentage (%) (vertical axis, 0 to 1.8) versus plant state variable (1 to 19) for the NOV-simulator and the WEAK-simulator.]

Figure 9: Error histogram: NOV-simulator versus WEAK-simulator

Figure 9 shows the time-averaged error distributions for the NOV-simulator compared with the WEAK-simulator. Comparing these two search methods is not the focus of this paper, and the results shown above should thus not be interpreted as a rigorous comparison between them. For our simulator, the error distributions achieved by both methods are very similar overall. As highlighted most prominently by the error bars for state variable 3, however, the novelty-based method is more robust to local optima and offers more reliable performance. When we ran the weakness-based method for 5 independent runs, we observed one instance where it got stuck in a local optimum and converged to an inferior performance value. The novelty-based method, however, consistently finds its equilibrium. The differences are marginal, but in a time-domain simulation even tiny errors and imbalances tend to propagate over time and are thus germane to the discussion.

5. DISCUSSION AND FUTURE WORK

In this paper, we successfully constructed a time-domain simulator of a hybrid power plant, Hyper, using a neural network trained with evolutionary-based methods in conjunction with backpropagation. The time-domain nature of the simulator poses grave challenges to supervised learning techniques like backpropagation, as minute errors at each time step tend to propagate over time. Backpropagation's tendency to weigh all its training examples equally also often leads to large changes in the system being ignored when the system subsequently returns to its steady state. Our proposed algorithm ameliorates this problem by using a neuroevolutionary method that utilizes both a fitness function and a novelty metric that take into account the entire time-domain simulation. This conflation in its computation of error makes it sensitive to transitions even when they are followed by the steady state. This is a key factor in the success of our neuroevolutionary simulators, which led them to significantly outperform the BP-simulator. The population-based search methods also expand the breadth of search and find diverse encodings that eventually lead to better solutions.

Our neural network simulator is able to map the plant state and action to the next state virtually instantaneously and with good accuracy. The simulator can be used as a fast fitness assignment operator, which is a crucial component in optimization methods. Due to high nonlinearities and coupling between system parameters, traditional control strategies are often inadequate for a hybrid system emulated by Hyper. The fast fitness assignment operator facilitated by our simulator can now be used to train learning based neuroevolutionary controllers. Further, the simulator can also be a helpful tool for existing control strategies in performing real-time simulations.

Our simulator utilized a neural network with a fully connected architecture. Future work could examine the use of a recurrent architecture, as it may better fit the time-domain nature of our training set. Other future work involves testing and validating the simulator, using the simulator to train a neuroevolutionary Hyper controller, and testing and validating the final controller at the Hyper facility.
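The difference between a per-step error and an error taken over the entire time-domain simulation can be illustrated with a toy example. The sketch below is not our simulator: the scalar plant, the slightly biased model, and all function names are hypothetical, chosen only to show how a small one-step error compounds once predictions are fed back as inputs.

```python
def rollout(model, initial_state, actions):
    """Closed-loop simulation: each predicted state becomes the next input."""
    states = [initial_state]
    for action in actions:
        states.append(model(states[-1], action))
    return states


def one_step_error(model, recorded_states, actions):
    """Mean absolute error when every prediction starts from recorded data."""
    diffs = [
        abs(model(recorded_states[t], actions[t]) - recorded_states[t + 1])
        for t in range(len(actions))
    ]
    return sum(diffs) / len(diffs)


def trajectory_error(model, recorded_states, actions):
    """Mean absolute error over the whole closed-loop trajectory."""
    simulated = rollout(model, recorded_states[0], actions)
    diffs = [abs(s - r) for s, r in zip(simulated[1:], recorded_states[1:])]
    return sum(diffs) / len(diffs)


def toy_plant(state, action):
    """Hypothetical ground-truth dynamics used to generate 'sensor' data."""
    return state + action


def biased_model(state, action):
    """Hypothetical learned model with a 1% gain error on the state."""
    return 1.01 * state + action


toy_actions = [1.0] * 50
recorded = rollout(toy_plant, 10.0, toy_actions)

step_err = one_step_error(biased_model, recorded, toy_actions)
traj_err = trajectory_error(biased_model, recorded, toy_actions)
print(step_err, traj_err)
```

In this toy setting the closed-loop error is strictly larger than the one-step error, because each step inherits the error of the step before it. A fitness computed over the full rollout therefore penalizes a model for exactly the kind of drift that a per-example loss does not see.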

6. ACKNOWLEDGMENTS

This research was supported in part by the US Department of Energy - Office of Fossil Energy under Contract No. DE-AC02-07CH11358 through the Ames Laboratory. This research was also supported in part by the US Department of Energy - National Energy Technology Laboratory under Contract No. DE-FE0012302.

7. REFERENCES[1] M. K. Colby, E. M. Nasroullahi, and K. Tumer.

Optimizing ballast design of wave energy convertersusing evolutionary algorithms. In Proceedings of the13th annual conference on Genetic and evolutionarycomputation, pages 1739–1746. ACM, 2011.

[2] A. Gabler, M. Colby, and K. Tumer. Learning basedcontrol of hybrid fuel cell power plant. In Proceedingsof the 2015 International Society of AutomationPower Industry Division Symposium, 2015.

[3] J. Gomes, P. Urbano, and A. L. Christensen.Progressive minimal criteria novelty search. InAdvances in Artificial Intelligence–IBERAMIA 2012,pages 281–290. Springer, 2012.

[4] D. Kingma and J. Ba. Adam: A method for stochasticoptimization. In International Conference on LearningRepresentations (ICLR 2015), San Diego, 2015.

[5] S. Kistemaker and S. Whiteson. Critical factors in theperformance of novelty search. In Proceedings of the13th annual conference on Genetic and evolutionarycomputation, pages 965–972. ACM, 2011.

[6] J. Lehman and K. O. Stanley. Exploitingopen-endedness to solve problems through the searchfor novelty. In ALIFE, pages 329–336, 2008.

[7] J. Lehman and K. O. Stanley. Revising theevolutionary computation abstraction: minimalcriteria novelty search. In Proceedings of the 12thannual conference on Genetic and evolutionarycomputation, pages 103–110. ACM, 2010.

[8] J. Lehman and K. O. Stanley. Abandoning objectives:Evolution through the search for novelty alone.Evolutionary computation, 19(2):189–223, 2011.

[9] A. Liapis, G. N. Yannakakis, and J. Togelius.Enhancements to constrained novelty search:Two-population novelty search for generating gamecontent. In Proceedings of the 15th annual conferenceon Genetic and evolutionary computation, pages343–350. ACM, 2013.

[10] J. Mouret. Novelty-based multiobjectivization. Studiesin Computational Intelligence, Volume 341, page 139,2011.

[11] P. Pezzini, D. Tucker, and A. Traverso. Avoidingcompressor surge during emergency shutdown hyrbidturbine systems. Journal of Engineering for GasTurbines and Power, 2015.

[12] S. Risi, S. D. Vanderbleek, C. E. Hughes, and K. O.Stanley. How novelty search escapes the deceptive trapof learning to learn. In Proceedings of the 11th AnnualConference on Genetic and EvolutionaryComputation, pages 153–160. ACM, 2009.

[13] T. Smitch, C. Haynes, W. Wepfer, D. Tucker, andE. Liese. Hardware-based simulation of a fuel cellturbine hybrid response to imposed fuel cell loadtransients. ASME 2006 International MechanicalEngineering Congress and Exposition, 2006.

[14] A. Traverso, D. Tucker, and C. Haymes. Preliminaryexperimental results of igfc operation using hardwaresimulation. ASME J. Eng. Gas Turbines Power,134(7), 2011.

[15] A. Tsai, D. Tucker, and T. Emami. Adaptive controlof a nonlinear fuel cell-gas turbine balance of plantsimulation facility. Journal of Fuel Cell Science andTechnology, 11(6):061002, 2014.

[16] D. Tucker, L. Lawson, and R. Gemmen. Preliminaryresult of a cold flow test in a fuel cell gas turbinehybrid simulation facility. ASME Turbo Expo, 2003.

[17] D. Tucker, M. Shelton, and A. Manivannan. The roleof solid oxide fuel cells in advanced hybrid powersystems of the future. The Electrochemical SocietyInterface, 18(3):45, 2009.

[18] R. Velez and J. Clune. Novelty search creates robotswith general skills for exploration. In Proceedings ofthe 2014 conference on Genetic and evolutionarycomputation, pages 737–744. ACM, 2014.

[19] W. Winkler, P. Nehter, M. C. Williams, D. Tucker,and R. Gemmen. General fuel cell hybrid synergiesand hybrid system testing status. Journal of PowerSources, 159(1):656–666, 2006.
