

Advanced Engineering Informatics 29 (2015) 101–114

Contents lists available at ScienceDirect

Advanced Engineering Informatics

journal homepage: www.elsevier.com/locate/aei

Model-based fault localization in bottling plants

http://dx.doi.org/10.1016/j.aei.2014.09.007
1474-0346/© 2014 Elsevier Ltd. All rights reserved.

⇑ Corresponding author.

Tobias Voigt a,⇑, Stefan Flad a, Peter Struss b

a Technische Universität München, Food Packaging Technology, Weihenstephaner Steig 22, D-85354 Freising, Germany
b Technische Universität München, Software & Systems Engineering, Boltzmannstr. 3, D-85748 Garching bei München, Germany

Article info

Article history:
Received 9 November 2012
Received in revised form 21 September 2014
Accepted 27 September 2014
Available online 16 October 2014

Keywords:
Model-based fault localization
Automatic fault diagnosis
Consistency-based diagnosis
Bottling plant
Packaging line

Abstract

The bottling of beverages is carried out in complex plants that consist of several machines and material flows. To realize an efficient bottling process and high quality products, operators try to avoid plant downtimes. With actual non-productive times of between 10% and 60%, the operators require diagnosis tools that allow them to locate plant components that cause downtime by exploiting automatically acquired machine data.

This paper presents a model-based solution for automatic fault diagnosis in bottling plants. There are currently only a few plant-specific solutions (based on statistical calculations or artificial neural networks) for automatic bottling plant diagnosis. In order to develop a customizable solution, we followed the model-based diagnosis approach which allows the automatic generation of diagnosis solutions for individual plants. The existing stochastic and discrete-event models for bottling plants are not adequate for model-based diagnosis. Therefore, we developed new first-principle models for the relevant plant components, validated them numerically, and abstracted them to qualitative diagnosis models. Based on the diagnosis engine OCC’M Raz’r, application systems for two real plants and one virtual plant (based on discrete-event simulation) were generated and evaluated. Compared to the reasons for downtime identified by experts, we obtained up to 87.1% of compliant diagnosis results. The diagnosis solution was tested by practitioners and judged as a useful tool for plant optimization.

© 2014 Elsevier Ltd. All rights reserved.

1. Introduction

A bottling plant for filling beverages into returnable bottles is an assembly of different types of specialized machines and conveyors. They automatically handle the complete process, from pallets with crates containing empty bottles to the final output of pallets with (cleaned) crates and filled and labeled bottles. The plants can be large, distributed over several halls, and have a complex 3D layout, as illustrated in Fig. 1. This shows a bottling hall of a brewery with two large bottling plants. One can see the filling, labeling, packing, and cleaning machines at the back and conveying systems for empty, filled, and labeled bottles at the front.

From an abstract point of view, which reflects the flows and manipulation of different types of objects, the basic schematic topology can be simplified as indicated in Fig. 2. There are lines for primary packaging (M3 for cleaning bottles, M4 for filling and capping bottles, and M5 for labeling bottles), secondary packaging (unpacking and packing crates with M2 and M6) and tertiary packaging (de-palletizing and palletizing with M1 and M7), all organized as an automated branching but directed flow. Certain backward loops, such as re-submission of improperly cleaned or filled bottles to previous steps, are omitted in Fig. 2. Besides the machines (M1–M7) shown in Fig. 2, there are additional machines for inspection and for sorting out improper objects.

In order to prevent oxygen intake or microbiological contamination of the beverage, a major objective is to avoid interruptions to the filling process (M4). Apart from internal reasons, the filling and capping machine will stop operating if there is a lack of input, i.e. bottles, or a tailback of filled bottles preventing further output, i.e. disturbances caused by other machines (Section 5 presents two examples). Due to the high speeds and output rates (up to 100,000 packages per hour), machines and conveyors are failure-sensitive with a degree of availability of 92–98% [1]. So that not every disturbance of a machine in the line stops the filling machine, the conveyors are designed as buffers (BB2-5, BC1,2,5, BP0,1), which should provide a continuous supply and output to/from other machines and, in particular, the filling and capping machine (M4). This works in conjunction with a general operating principle: machines and conveyors upstream and downstream from the filler operate at higher throughput rates than the filling machine. This principle is usually the only global one. There is no global control and the machines are controlled individually (or, sometimes, as small aggregates).
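The interplay of buffers and over-speed described above can be illustrated with a small back-of-the-envelope calculation. The following is only a sketch under simplifying assumptions (constant rates, a single buffer in front of the filler); the function names and all numbers are hypothetical and are not taken from the plants studied here.

```python
def bridging_time_s(buffer_content: int, filler_rate_per_h: float) -> float:
    """Seconds the filler can keep running from the buffer alone
    while the upstream machine is stopped (constant filler rate assumed)."""
    return buffer_content * 3600.0 / filler_rate_per_h


def refill_time_s(missing: int, upstream_rate_per_h: float,
                  filler_rate_per_h: float) -> float:
    """Seconds needed to rebuild `missing` bottles in the buffer once the
    upstream machine runs again; only its surplus rate fills the buffer."""
    surplus = upstream_rate_per_h - filler_rate_per_h
    if surplus <= 0:
        raise ValueError("upstream machine must be faster than the filler")
    return missing * 3600.0 / surplus


# Hypothetical example values (bottles and bottles per hour).
FILLER_RATE = 40_000.0
WASHER_RATE = 44_000.0   # upstream machine runs 10% faster than the filler
BUFFER_CONTENT = 2_000

print("bridged stop [s]:", bridging_time_s(BUFFER_CONTENT, FILLER_RATE))
print("refill time  [s]:", refill_time_s(BUFFER_CONTENT, WASHER_RATE, FILLER_RATE))
```

The asymmetry between the two results shows why a buffer that has just absorbed a disturbance offers little protection against the next one until the faster neighbouring machine has had time to refill it.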

Fig. 2. Generic structure of a bottling plant for returnable bottles. Machines: M1 depalletizing, M2 unpacking, M3 bottle cleaning, M4 bottle inspection, filling and capping, M5 labeling, M6 packing, M7 palletizing. Buffers: BP0 pallet feeding, BP1 pallet transport incl. magazine, BP7 pallet releasing, BC1 crate conveyor, BC2 empty crate conveyor incl. cleaning and magazine, BC6 crate conveyor, BB2-5 bottle conveyors.


In practice, these provisions cannot guarantee avoidance of unwanted idle time of the filling machine. Unplanned downtime of the plant can be 10–60% ([2,3]) of the planned production time. Taking steps to reduce downtime by identifying frequent causes requires statistics and an analysis based on the recorded operating data supplied by (some of) the machines. These operating data capture only status information about the machine, such as normal_operation, stopped (due to an internal cause or intervention), lack (stopped because of missing input material), and tailback (stopped because output is blocked by the next component). Due to the interlaced flows of the various object types, time offsets, the large scale of the plants, and the amount and often fragmentary nature of the data (for instance, there are no status data of conveyors), this analysis can be difficult and time-consuming. As a consequence, the bottle filling and packaging industries are very interested in an automated diagnosis tool for their plants that provides information about bottlenecks and weaknesses in the plant, regarding both the physical performance and configuration and the control principles and parameters.
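To make the kind of operating data concrete, the following sketch aggregates status records of one machine into time shares per status. The record layout (machine, status, start and end in seconds) and the sample log are assumptions made for illustration only; real plants deliver such data in manufacturer-specific formats.

```python
from collections import defaultdict
from typing import NamedTuple


class StatusRecord(NamedTuple):
    """Hypothetical record layout for one status interval of one machine."""
    machine: str
    status: str      # e.g. normal_operation, stopped, lack, tailback
    start_s: int
    end_s: int


def time_share_by_status(records, machine):
    """Return the fraction of recorded time the machine spent in each status."""
    totals = defaultdict(int)
    for rec in records:
        if rec.machine == machine:
            totals[rec.status] += rec.end_s - rec.start_s
    total = sum(totals.values())
    return {status: t / total for status, t in totals.items()} if total else {}


# Invented sample log for the filler M4.
LOG = [
    StatusRecord("M4", "normal_operation", 0, 600),
    StatusRecord("M4", "lack", 600, 720),
    StatusRecord("M4", "normal_operation", 720, 900),
    StatusRecord("M4", "tailback", 900, 1000),
]

shares = time_share_by_status(LOG, "M4")
for status, share in sorted(shares.items()):
    print(f"{status}: {share:.1%}")
```

Such an aggregation only says how long the filler was in lack or tailback, not which upstream or downstream component caused it; attributing each stop to its origin is the diagnosis task addressed in this paper.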

Providing such a tool was the goal of the LineMod project that is described in this paper. The project was carried out to find a solution for automatic fault diagnosis of bottling plants. The project addressed industrial needs by localizing the causes of reduced performance of the plant (mainly tied to the output of the filling machine) based on the available production data of the machines (e.g. over a period of weeks or months). Many of the potential end users, for example breweries, are small or medium-sized enterprises that cannot afford to spend many resources on the establishment or adaptation of a tailored diagnosis system for their plant. Another practical requirement was to be able to cheaply accommodate frequent changes to the structure of the line, due to rearrangement or addition of machines. Additionally, a plant is a combination of machines from various manufacturers with different instrumentation and data availability. These issues suggested a need for a model-based solution to diagnosis, which allows adaptation to be performed by simply (re-)specifying the plant structure. The project focused on interruptions to transportation that caused a total standstill of the filling machine due to a lack or tailback of containers (as opposed to an automatic reduction of machine speeds to prevent filler stops).

This paper starts with a survey of previous literature in order to position our work (Section 2). Section 3 first summarizes the foundations of the chosen approach to diagnosis, called consistency-based diagnosis. Then it presents the key contribution of this paper, the library of the component models that were developed for the domain of bottling plants, and describes the diagnosis solution. Both have been described in more detail in previous publications (see [4–6]).

Fig. 1. Conveyors connecting machines of a bottling plant for returnable bottles [Photo: Deutscher Brauer-Bund e.V.].

Section 4 discusses the validation of the component models and the evaluation of the results of the implemented solution using both incidents on real plants and simulated incidents. We carried out a validation of the numerical base model components and present exemplary results. To evaluate the final diagnosis solution we followed two routes, one based on simulation data acquired by discrete event simulation on a plant and the other based on recorded data from two real world bottling plants and a comparison with manual analysis. We present the results and explain them with the help of sample scenarios. Finally, the project outcome is discussed with regard to industrial applicability.

2. Previous work

2.1. Modeling of bottling plants

Among the existing approaches for modeling bottling plants, several deal with statistical distributions of disturbances and failures and aim to provide a basis for modeling or simulation of the respective plant structures ([7–9], or [10]). Models of chained production lines, which include the application area of bottling and packaging considered here, can be found in queuing theory [11]. Many other solutions pursue analytical approaches. Based on Markov chains, filling degrees of buffer elements can be approximately predicted (see [12,13]). The approach is based on stochastic models describing state changes of a plant with differential equations. However, all stochastic approaches need simplifications to enable the equation systems to be solved. Some important simplifications are (for a more detailed listing see [14]):

• Disturbances and failures of machines occur randomly. Lacks or tailbacks in the flow of material caused by failures of downstream or upstream machines have no effect on other machines. Hence, failures are not propagated.
• The number of operating employees is always sufficient to resolve a disturbance within its determined period of malfunction.
• Disturbances only occur while machines are operating; if a machine is in lack or tailback states, no disturbance can occur.
• Along the whole plant, no objects are removed or rejected.

Due to these few but necessary simplifications, this approach based on Markov chains is an inadequate solution for diagnosing faulty behavior in bottling plants and is therefore unsuitable for a diagnosis tool. Further characteristics of the application domain, which are not considered, are:

• Besides there being a flow of primary packaging materials (bottles), closed flows of secondary packaging materials (crates) and tertiary packaging materials (pallets) also exist.
• Machines can operate with different output rates and conveyors can transport objects with different velocities; in modern plants, velocities and output rates are controlled.
• Certain machines can be electronically or mechanically “locked” to form one machine (such as the filling machine and the inspector of the investigated plant structures).
• Certain machines (such as bottle washing machines) can buffer a considerable number of objects. This has a high impact on the overall behavior of a plant because temporary failures of upstream machines can be compensated due to buffer capacities or failure propagation can be considerably delayed.
• Inspector machines can remove or reject defective objects.

In industry, discrete-event simulation models are also used for detailed modeling of system behavior. Plants are planned and designed using these kinds of simulation models. Furthermore, attempts have been made to raise productivity by adjusting and optimizing individual machine parameters, and to predict the output rates of machines and plants ([15–18], or [10]). For research activities, simulations have also been used to model virtual systems with realistic behavior ([19,20], or [21]). However, while discrete-event models are useful for simulating the possible behavior of plants under some given initial conditions, they do not lend themselves as diagnosis solutions where the task is to identify a specific actual fault scenario in the plant.

2.2. Diagnosis approaches

2.2.1. Methods

Up until now, only a few solutions for diagnosing filling machine downtimes and finding the origins of these disturbances in bottling plants have been described in the scientific literature. One solution [22] is described for a special bottling plant in Patras, Greece, based on a decision tree. In order to achieve an executable solution, the decision rules were simplified considerably and only downtimes longer than one minute were diagnosed. The obtained algorithm seems to be inflexible and very much customized to the single plant under study, and no evaluation (e.g. regarding diagnosis correctness) is described. Generally, decision tree solutions, which were also examined in preliminary studies for our LineMod project, depend on the availability of complete data about all plant components, a precondition that is rarely satisfied in practical applications. Another solution is based on artificial neural networks (ANNs), which achieved good results when tested on simulated data. However, for a real plant, coverage of training data is impossible to achieve. Also, the flexibility of the solution turned out to be unacceptable, since every reconfiguration and rearrangement of the plant structure implies a new training phase of the ANN. For a complete description and evaluation of the solution, we refer to the closing report of the respective project [23].

2.2.2. Industrial solutions

There are two commercial systems for diagnosis of filling machine downtimes. The “Filler Stop Tracker” [24] developed by ProLeit AG requires the operator to enter the cause of a filler stop. Although it is not an automatic solution, it demonstrates the need for diagnosis solutions in this domain.

The “Downalyse KIT” was the first system for automatic fault localization in bottling plants in the marketplace [25]. It was developed by Krones AG as a module within their “Line Diagnosis System” LDS. The “Downalyse KIT” algorithm uses the downtimes of all machines of a bottling plant to compare propagation time and length with nominal values and thereby categorizes them into plant-relevant and non-plant-relevant failures. It is limited to the primary material flow (bottles) and ignores secondary material flows (e.g. crates), tertiary material flows (e.g. pallets), and backward loops as well as the dynamic behavior of the buffering, which leads to wrong diagnoses.

In summary, while there is a need for a diagnosis system in this domain, no acceptable solution is available as of now. The solutions proposed to date are either not flexible enough to be adapted to new plant configurations or their diagnosis quality is too low.

2.3. Model-based diagnosis

Model-based diagnosis [26] is the most advanced field within model-based systems with respect to theory and industrial applications. It was developed as a response to limitations of rule-based diagnostic expert systems when applied to engineered systems. Rule-based systems are aimed at capturing experiential knowledge and are hence confined to faults or fault combinations, symptoms, and diagnostic contexts that were experienced in the past. In contrast, the model-based approach attempts to represent first-principles knowledge about the physics of the systems to be diagnosed and thus provides a basis for diagnosing new devices under novel situations, and even performing fault localization in the presence of unknown faults. A key idea is the creation of libraries of models of system building blocks which can be re-used to configure plant models just like the building blocks themselves are aggregated to form a plant.

This progress is shown by a number of technical and industrial applications. Success has been achieved, for example, with fault localization in power transmission networks [27], monitoring and fault detection in ballast water tank systems [28], and diagnosis and self-reconfiguration for spacecraft of NASA [29]. Also, model-based fault detection and identification within the scope of the commercial monitoring system for gas turbines TIGER [30], the generation of fault trees for the diagnosis of forklifts [31], diagnosis of dyeing factories [32], generation of decision trees for onboard diagnostics of dynamic automotive systems [33], and process fault detection in the chemical industry [34] were made possible. Ref. [35] gives an additional overview of successful applications of model-based diagnosis.

3. Diagnosis approach

3.1. Consistency-based diagnosis

In the presented work we followed the consistency-based approach. Bottling plants contain a fixed set of components, COMPS, which interact in a fixed system structure. These are the machines and connecting conveyors displayed in Fig. 2. It is assumed that the system is well-designed, i.e. behaves and performs as intended if all components behave correctly. A disturbance of the entire system can be caused by a single faulty component or a set of faulty components. The diagnosis task is to decide whether there are components that are not behaving as intended, i.e. are showing faulty behavior (fault detection). Furthermore, it can be determined which components are operating in a fault mode (fault localization) and in which fault modes they operate (fault identification). Details about the theory or algorithmic aspects may be found in [26,28,36], or [37].

For consistency-based diagnosis, the system structure and the behavior need to be represented in a model which allows automation of the location and identification of causes of misbehavior within the system. A model-based diagnosis algorithm then compares the model’s predicted behavior with the observed behavior of the real world system. The core step is to check whether or not the observed system behavior contradicts the predictions generated by the system model (hence the name consistency-based diagnosis, see Fig. 3). If the predictions of the nominal model, i.e. the combination of models of correct behavior for all relevant components, are inconsistent with the observations, a fault has been detected. However, consistency-based diagnosis can derive more information if the model allows determination of the causes of the inconsistency more precisely: if, for instance, two component models OK(C1) and OK(C2) give rise to inconsistency, it can be concluded that at least one of them must be defective. If there are more such inconsistent submodels, their combined evidence allows the fault localization to be narrowed down: if, in addition to {OK(C1), OK(C2)}, the set {OK(C1), OK(C3)} is also inconsistent, then it can be concluded that C1 is a possible single fault, while C2 is not, but is part of a potential double fault {C2, C3}.

Please note that this illustrates that consistency-based diagnosis is also able to perform fault localization based on a model of the system’s correct behavior (MODEL_OK) only. This means that no assumptions about the possible faults need to be made and, hence, faults that have not been encountered before and/or are not described can be localized. Secondly, the example shows that consistency-based diagnosis can easily diagnose multiple faults. Both capabilities are in contrast to an experience-based approach to diagnosis as used in rule-based expert systems.
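The step from inconsistent sets of correctness assumptions (conflicts) to fault candidates can be sketched as a search for minimal hitting sets: a candidate must contain at least one component from every conflict. The brute-force enumeration below is only an illustration of this idea for the three-component example; it is not the algorithm of any particular diagnosis engine, and the function name is hypothetical.

```python
from itertools import combinations


def minimal_diagnoses(components, conflicts):
    """Return all minimal sets of components that intersect every conflict.

    Each conflict is a set of components whose OK assumptions are jointly
    inconsistent with the observations, so at least one of them is faulty.
    """
    found = []
    for size in range(len(components) + 1):
        for candidate in combinations(sorted(components), size):
            cand = set(candidate)
            if any(smaller <= cand for smaller in found):
                continue  # a smaller diagnosis is contained, so not minimal
            if all(cand & conflict for conflict in conflicts):
                found.append(cand)
    return found


# Conflicts from the example: {OK(C1), OK(C2)} and {OK(C1), OK(C3)}.
conflicts = [{"C1", "C2"}, {"C1", "C3"}]
diagnoses = minimal_diagnoses({"C1", "C2", "C3"}, conflicts)
print(diagnoses)
```

For the two conflicts of the example, the enumeration yields the single fault C1 and the double fault consisting of C2 and C3, without any fault model being involved.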

When component models are supplemented by models of their particular failures, the principle described above also allows checking of the consistency of certain assumed faults with the given observations, refuting certain fault hypotheses, and hence also allows for fault identification.

The intuitive idea explained above can be turned into a rigorous logical theory which enables us to precisely define the concepts and goals of model-based diagnosis, to design solution algorithms, and to prove their correctness. In the following, we summarize these logical foundations (for details, see [26]).

For a particular plant, COMPS denotes the set of all components the plant is composed of (in our application domain, this is the set of conveyors and machines listed in Fig. 2). To refer to normal or faulty behavior, each component Ci has a set of behavior modes, modes(Ci), where the mode OK(Ci), representing the intended normal behavior of Ci, is always included in the set modes(Ci). All other modes represent faults. This may simply be the negation of OK, ¬OK(Ci), which has no specific associated behavior, i.e. no fault model. Alternatively, there can also be particular fault modes, such as conveyor_stopped, with an associated behavior model. A mode assignment MA is the assignment of one mode to each component contained in (a subset of) COMPS. MA is called complete if modes are assigned to all components in COMPS.

Fig. 3. Principle of consistency-based diagnosis (diagram labels: System, Observations, Model, Predictions, mode_j(Ci) with i = 1, 2, ...).

The required description of a system contains two elements:

• LIB: A model library, assigning behavior models to component modes. From a logical point of view, the library contains a set of implications mode(Ci) ⇒ model(Ci). The library represents domain knowledge, which can be re-used within different plant models.
• STRUCTURE: The structural description of the specific plant, which determines the variables that are shared by the component (mode) models according to the topology of the plant, i.e. the connectivity of the components.

Together, these elements allow a system model to be derived from a mode assignment, which can then be checked for consistency with the observations. Based on this, we can define the concept of a consistency-based diagnosis for a set of observations OBS of the actual system behavior (in our case the recorded status messages described in Section 1) as a complete mode assignment MA that is consistent with the structural description of the plant, the component (mode) models and OBS. These elements can be stated as sets of first-order formulas and the criterion for MA being a diagnosis under the given set of observations as

$$\mathrm{STRUCTURE} \cup \mathrm{LIB} \cup \{MA\} \cup \mathrm{OBS} \nvDash \bot$$

where $\bot$ denotes “False” and $\nvDash$ means it is not entailed by the theory (in other words: the theory is consistent). For detecting faults, it is first of all assumed that all components operate correctly: in the mode assignment $MA_{OK}$, OK modes are assigned to all components and the system behaves as intended. It then only has to be checked whether the model of the correctly behaving system is inconsistent with the observations OBS:

$$\mathrm{STRUCTURE} \cup \mathrm{LIB} \cup \{MA_{OK}\} \cup \mathrm{OBS} \vDash \bot$$

If faulty components are to be localized, i.e. correctly operating components are to be separated from faulty ones, it is often sufficient to restrict the modes of each component C to OK(C) and ¬OK(C). In particular, minimal fault localizations are of practical interest, i.e. a minimal set of faulty components which suffices to explain a symptom. There is no need to assume additional components to be faulty.
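The consistency criterion and the notion of minimal fault localization can be made tangible with a deliberately tiny model. The sketch below assumes a hypothetical chain of three components (B1, M, B2) connected by Boolean flow variables f0 to f3, an OK model stating that the output flow equals the input flow, and a NOT_OK mode without any constraint. Consistency is tested by brute-force enumeration of the unobserved variables, which stands in for the inference of a real diagnosis engine.

```python
from itertools import combinations, product

# STRUCTURE (hypothetical): component -> (input variable, output variable).
STRUCTURE = {"B1": ("f0", "f1"), "M": ("f1", "f2"), "B2": ("f2", "f3")}

# LIB (hypothetical): behavior model per mode; NOT_OK has no fault model.
LIB = {
    "OK": lambda inflow, outflow: inflow == outflow,
    "NOT_OK": lambda inflow, outflow: True,
}


def is_consistent(mode_assignment, obs):
    """True if some value assignment satisfies all mode models and OBS."""
    variables = sorted({v for pair in STRUCTURE.values() for v in pair})
    free = [v for v in variables if v not in obs]
    for values in product([False, True], repeat=len(free)):
        state = {**obs, **dict(zip(free, values))}
        if all(LIB[mode_assignment[c]](state[i], state[o])
               for c, (i, o) in STRUCTURE.items()):
            return True
    return False


def minimal_fault_localizations(obs):
    """Smallest sets of NOT_OK components that restore consistency."""
    comps = sorted(STRUCTURE)
    minimal = []
    for size in range(len(comps) + 1):
        for faulty in combinations(comps, size):
            if any(set(m) <= set(faulty) for m in minimal):
                continue  # superset of a known localization
            ma = {c: ("NOT_OK" if c in faulty else "OK") for c in comps}
            if is_consistent(ma, obs):
                minimal.append(faulty)
    return minimal


MA_OK = {c: "OK" for c in STRUCTURE}
obs_coarse = {"f0": True, "f3": False}             # flow enters, none leaves
obs_fine = {"f0": True, "f2": True, "f3": False}   # extra observation at f2

print(is_consistent(MA_OK, obs_coarse))
print(minimal_fault_localizations(obs_coarse))
print(minimal_fault_localizations(obs_fine))
```

With only the two boundary observations, every component remains a single-fault candidate; the additional observation at f2 exonerates B1 and M, which illustrates how the available machine data determines the achievable resolution of the localization.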

To perform fault identification and/or refine fault localization, fault modes of components and their respective behavior models can be defined, as previously mentioned. The combinations of fault models span an entire space of models. The larger the space of models grows, the more models need to be checked for consistency with the observations. In order to respond to this situation, advanced search strategies and heuristics have been developed (see Ref. [26]).

A number of algorithms have been developed and described in the literature, such as the:

• General Diagnostic Engine (GDE) [38] which uses only OK models and fault probabilities.
• GDE+ [39] which also exploits fault models.
• Default-based Diagnostic Engine (DDE) ([28,40]) which allows orders on fault modes to guide the search.

The diagnosis solution presented in this paper uses the commercial diagnosis engine “OCC’M’s Raz’r” [41], which is an implementation of GDE+.

3.2. Basic component models

3.2.1. Assumptions

We first list the most important assumptions underlying the material transportation models presented here, which are fulfilled in our project domain (under normal conditions) but should also apply to a much broader class of problems.

• The transported objects (bottles) are rigid bodies with fixed spatial extensions and they are not significantly deformed by transportation.
• They are transported with a fixed orientation (e.g. crates), or the orientation does not affect transportation times significantly (e.g. due to the symmetric cross-section, such as for bottles).
• There is no interaction among the objects or between objects and the components that has a significant impact on the transportation process (such as bouncing).
• Objects can only move in the direction of motion of the transportation means (or not at all).

3.2.2. Component models

Based on these general assumptions, we generated a library of component models which allows the configuration of models for different plants. For typical bottling plants, we needed four basic component models and one virtual component model for connection:

• Material Transporter (MT) for conveyors or machines processing and/or transporting objects of one kind in a linear way (e.g. belt, labeling machine, inspector).
• Split Element (SpE) and Merge Element (ME) that respectively divide a flow of objects of one kind or join two flows.
• Separate Element (SE) for machines splitting a flow of aggregate objects into the flows of its constituent objects (crate unpacker and depalletizer).
• Combine Element (CE) for machines assembling objects of two kinds to create a flow of aggregate objects (crate packer and palletizer).
• Transportation Connector (TC), a virtual component to connect the elements.

In the following, we describe the modeling concept of the components using MT as an example. The other components are built in a similar way and are described in Refs. [42,43].

In order to present the essentials of the modeling approach, we consider a model archetype which can be specialized or extended to accommodate other kinds of machines. This is a generic machine that:

• has one input and one output with vin, vout being the respective velocities of the means of transportation (e.g. belts),
• possibly transforms or modifies one kind of object (for instance, cleaning of bottles), but does not amalgamate several objects to form a new one,
• has a buffer with a (constant) capacity C.

The process of buffering the objects can be fairly random. For instance, bottles may gather in bulk with gaps in between. However, it is assumed that (under normal behavior) no object is prevented from approaching the output unless it is blocked by other objects ahead, waiting for output. For instance, within the bottle conveyor, its shape and several parallel belts with different speeds ensure that bottles are not left in some corner, but are pushed towards the “ideal” fastest belt, if there is space. In the following, the intuition behind the model is described in terms of three fundamental concepts and four “behavior rules”, each of which is first introduced informally and then turned into equations. One of the problems to be solved stems from the fact that a local machine model in isolation cannot determine whether an actual flow occurs at its input and output. However, it can and has to express the limits on the machine’s potential to take in or output objects. Table 1 summarizes the variables and equations of the model of an MT and Fig. 4 illustrates it.

Concept 1: The potential input and output flow, in.qpot and out.qpot, represent the maximal flow the machine can accept or generate, dependent on its internal state.

The actual flows are represented by two different variables, in.qact and out.qact. The first restriction is determined by:

Rule 1: The potential input flow is given by the input speed of the Material Transporter, unless the buffer is full. In this case, it cannot be higher than the actual output flow.

In the mathematical model (see Table 1), this rule is formalized by

$$
\mathit{in}.q_{pot}(t) =
\begin{cases}
v_{in}(t)/d_0 & \text{if } B(t) < C \\
\min\bigl(v_{in}(t)/d_0,\ \mathit{out}.q_{act}(t)\bigr) & \text{if } B(t) = C
\end{cases}
\tag{1}
$$

where d0 denotes the diameter of the object cross-section and B is the filling degree of the buffer (in terms of number of objects). It involves the assumption that an actual outflow generates the potential for intake instantaneously, which is not true in practice and is hence another reason for expressing tolerance intervals with values and time. Note that we assume all velocities and flows are positive, as their sign is determined by their association with the intrinsic direction of the Material Transporter. Computing B is straightforward:

Rule 2: The change in the total number of buffered objects is determined by the actual input and output flows.

The respective equation

$$
\frac{dB}{dt} = \mathit{in}.q_{act}(t) - \mathit{out}.q_{act}(t)
\tag{2}
$$

indicates that B is computed by integrating the difference of the actual flows. Setting up the model fragments for the potential output flow is based on the second key idea:

Concept 2: Bout denotes the number of buffered output objects at time t, i.e. the number of objects that can possibly be subject to output at this time. Before we clarify this crucial concept, we use its intuitive understanding and the third concept to formulate the rule for the potential output flow.

Concept 3: The minimal transportation time td is the time an object needs to get directly from the input to the output, i.e. if it is not delayed by other objects that are piling up.

Rule 3: The potential output flow is determined solely by the output speed, if there is at least one buffered output object. Otherwise, it cannot be higher than the actual input flow at time t − td, i.e. at the current time reduced by the minimal transportation time.

One should be aware that in the second case each single object may (potentially) leave the output with speed vout. However, if the input flow at the time when it entered was lower, there will be a gap after the output of the object, which makes the (average) flow lower than vout. As a special case, the potential output flow becomes zero if the actual input flow was zero at the respective time.

Again, the respective equation

$$
\mathit{out}.q_{pot}(t) =
\begin{cases}
v_{out}(t)/d_0 & \text{if } B_{out}(t) \ge 1 \\
\min\bigl(\mathit{in}.q_{act}(t - t_d),\ v_{out}(t)/d_0\bigr) & \text{else}
\end{cases}
\tag{3}
$$

formalizes this. Computing Bout also involves the minimal transportation time td. If an object entered the Material Transporter later than time t − td, it cannot possibly reach the output at time t and hence cannot become part of the buffered output objects. If it entered earlier, it may or may not have already left the output before t, dependent on how the actual output flow reduced Bout. This consideration is captured by:

Rule 4: The change in the number of buffered output objects at time t is determined by the actual input flow at time t − td diminished by the actual outflow at time t.

Hence, Bout is also obtained by integration according to equation

$$
\frac{dB_{out}(t)}{dt} = \mathit{in}.q_{act}(t - t_d) - \mathit{out}.q_{act}(t)
\tag{4}
$$

which completes the model of the Material Transporter with buffer. Note that Bout is not necessarily the number of objects that form a contiguous pile in front of the output. It could be less, because the last objects that joined the pile entered later than t − td.

Table 1
Model of the Material Transporter.

| Group | Symbol | Description | Unit |
|---|---|---|---|
| State variables | B | Objects in buffer | #objects |
| State variables | Bout | Objects buffered for immediate output | #objects |
| State variables | vin | Velocity of input transportation means | m/s |
| State variables | vout | Velocity of output transportation means | m/s |
| State variables | td | Minimal transportation time | s |
| Parameters | d0 | Diameter of transported object (in transportation plane) | m |
| Parameters | C | Capacity | #objects |
| Interface variables | in.qpot | Potential inflow | #objects/s |
| Interface variables | out.qpot | Potential outflow | #objects/s |
| Interface variables | in.qact | Actual inflow | #objects/s |
| Interface variables | out.qact | Actual outflow | #objects/s |

Equations
(1) in.qpot(t) = vin(t)/d0 if B(t) < C
    in.qpot(t) = min(vin(t)/d0, out.qact(t)) if B(t) = C
(2) dB/dt = in.qact(t) − out.qact(t)
(3) out.qpot(t) = vout(t)/d0 if Bout(t) ≥ 1
    out.qpot(t) = min(in.qact(t − td), vout(t)/d0) else
(4) dBout(t)/dt = in.qact(t − td) − out.qact(t)

Another class of machines produces an output by combining objects (CE) of different kinds, such as for instance the packaging of 20 bottles in a crate. The ratio of the number of different objects participating in this combination is usually not arbitrary, but exactly specified. This ratio links the various potential and actual inflows and the outflow, which is then limited by the “slowest” input flow (relative to the ratio of the respective object type).

The counterpart to this very generic combination element is the Separate Element (SE), with unpackers being a subclass, in which the slowest actual outflow of a separation result limits the potential inflow of the composite object.

Finally, one flow of objects may be split (SpE) into (usually) two flows (either randomly or with a preference) or generated by merging objects from two input flows (ME).

Each of the respective models describes the behavior of an individual instance of the respective element class but does not determine the actual flows, because this can only be determined from the interaction of two (or more) connected elements. For instance, if one element has a non-empty output buffer and could produce an output (out.qpot > 0), but the next element is blocked and cannot receive input (in.qpot = 0), then there is no actual flow at this point: out.qact = 0. The laws for this interaction are modeled by the virtual component Transportation Connector TC. TC links the potential and actual flows of two elements.

Fig. 4. Schematic model of the Material Transporter.

Rule TC: The actual output flow of a machine MTn is limited by both its own potential output flow and the potential input flow of the following machine MTn+1 (and equal to the actual input flow of this machine):

$$
\begin{aligned}
MT_n.\mathit{out}.q_{act}(t) &= \min\bigl(MT_{n+1}.\mathit{in}.q_{pot}(t),\ MT_n.\mathit{out}.q_{pot}(t)\bigr) \\
MT_n.\mathit{out}.q_{act}(t) &= MT_{n+1}.\mathit{in}.q_{act}(t)
\end{aligned}
\tag{5}
$$

This relatively small set of fairly generic model types turns out to cover the variety of machines in a bottling plant and also, more generally, the machines in food packaging plants that we have encountered.
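To make the interplay of Eqs. (1)–(5) concrete, the following is a minimal discrete-time sketch of one Material Transporter placed between an upstream supplier and a downstream consumer. It is not the authors' implementation: the class name, the explicit Euler step `dt`, the ring buffer used for the delay td, and the evaluation order inside `step` are all assumptions made for illustration.

```python
from collections import deque


class MaterialTransporter:
    """Illustrative Euler-step version of the MT model (Eqs. 1-4)."""

    def __init__(self, v_in, v_out, d0, capacity, t_d, dt):
        self.v_in, self.v_out = v_in, v_out   # belt velocities (m/s)
        self.d0 = d0                          # object diameter (m)
        self.capacity = capacity              # C (#objects)
        self.dt = dt                          # step size (s), an assumption
        self.B = 0.0                          # objects in buffer
        self.B_out = 0.0                      # objects buffered for output
        # Ring buffer holding in.qact over the last t_d seconds.
        n = max(1, round(t_d / dt))
        self._history = deque([0.0] * n, maxlen=n)

    def step(self, upstream_out_qpot, downstream_in_qpot):
        delayed_in = self._history[0]         # in.qact(t - td)
        # Eq. (3): potential output flow.
        if self.B_out >= 1.0:
            out_qpot = self.v_out / self.d0
        else:
            out_qpot = min(delayed_in, self.v_out / self.d0)
        # Eq. (5) at the output connector.
        out_qact = min(downstream_in_qpot, out_qpot)
        # Eq. (1): potential input flow.
        if self.B < self.capacity:
            in_qpot = self.v_in / self.d0
        else:
            in_qpot = min(self.v_in / self.d0, out_qact)
        # Eq. (5) at the input connector.
        in_qact = min(upstream_out_qpot, in_qpot)
        # Eqs. (2) and (4), integrated with one explicit Euler step.
        self.B += (in_qact - out_qact) * self.dt
        self.B_out += (delayed_in - out_qact) * self.dt
        self._history.append(in_qact)
        return in_qact, out_qact


def run(mt, upstream_out_qpot, downstream_in_qpot, steps):
    """Apply constant boundary conditions and return the last flows."""
    flows = (0.0, 0.0)
    for _ in range(steps):
        flows = mt.step(upstream_out_qpot, downstream_in_qpot)
    return flows


# Tailback: the downstream element accepts nothing, so the buffer fills.
blocked = MaterialTransporter(0.5, 0.5, 0.08, capacity=50, t_d=5.0, dt=0.1)
blocked_flows = run(blocked, float("inf"), 0.0, steps=2000)

# Free flow: 2 objects/s are supplied and pass through after td.
free = MaterialTransporter(0.5, 0.5, 0.08, capacity=50, t_d=5.0, dt=0.1)
free_flows = run(free, 2.0, float("inf"), steps=1000)
```

In the blocked case the inflow drops to zero once B reaches C (the Euler step may overshoot C by at most one step of inflow), and in the free-flow case the buffer settles at the number of objects in transit, i.e. the inflow times td.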

3.3. Abstraction to diagnosis models

Using the model presented in Section 3.2 directly for diagnosis is not appropriate. Firstly, as for all numerical models, its accuracy is only “pretended” in many respects, for example in assuming conservation laws to hold and in ignoring the imprecision in the available data, for example when flows are determined via counters or the speed of belts. Secondly, the diagnostic task requires the analysis of qualitative rather than arbitrarily small numerical deviations from the nominal behavior and hence needs to be addressed by an appropriate level of abstraction in the model. Finally, the status messages indicate only whether a machine is running or has been stopped and, hence, whether a flow is zero or non-zero, and any model that requires more detailed information cannot exploit these observations and is useless.

This level of model abstraction is appropriate for the intended goal of the diagnosis: we focused on “hard” failures (namely stopping of the filling machine) caused by hard faults (blockage of another machine), which can be based on distinguishing zero from non-zero flow only. For capturing “soft” faults (deviating behaviors) that lead, perhaps in combination, to a hard failure or a non-optimal behavior, a different model will be required. The total interruption of the flow requires distinctions between zero and non-zero flows only.

Hence, a transformation ℝ → Sign from the domain of real numbers, ℝ, to the domain Sign = {−, 0, +} is applied, which introduces a straightforward transformation of the numerical model introduced above to a qualitative model over the Sign domain. Each “qualitative equation” of the latter has a finite number of solutions, i.e. a finite relation over qualitative variables, which can be represented as a table. Checking the consistency of a set of finite relations can be performed by various algorithms, including so-called constraint satisfaction algorithms developed in Artificial Intelligence (see [44]).

Material Transporter (MT) with Buffer

[1] [in.qpot(t)] = [vin(t)] if C − B(t) > 0
    [in.qpot(t)] = min([vin(t)], [out.qact(t)]) if C − B(t) = 0

| [in.qpot(t)] | [vin(t)] | [out.qact(t)] | [C − B(t)] |
|---|---|---|---|
| 0 | 0 | * | + |
| + | + | * | + |
| + | + | + | 0 |
| 0 | 0 | + | 0 |
| 0 | + | 0 | 0 |

[3] [out.qpot(t)] = [vout(t)] if Bout(t) − 1 ≥ 0
    [out.qpot(t)] = min([in.qact(t − td)], [vout(t)]) if Bout(t) − 1 < 0

| [out.qpot(t)] | [vout(t)] | [in.qact(t − td)] | [Bout(t) − 1] |
|---|---|---|---|
| 0 | 0 | * | 0 |
| 0 | 0 | * | + |
| + | + | * | 0 |
| + | + | * | + |
| 0 | 0 | + | − |
| 0 | + | 0 | − |
| + | + | + | − |

Transportation Connector (TC) connecting two Material Transporters MTn and MTn+1

[5] [MTn.out.qact(t)] = min([MTn+1.in.qpot(t)], [MTn.out.qpot(t)])
    [MTn.out.qact(t)] = [MTn+1.in.qact(t)]

| [MTn.out.qact(t)] | [MTn+1.in.qpot(t)] | [MTn.out.qpot(t)] |
|---|---|---|
| 0 | 0 | + |
| 0 | + | 0 |
| + | + | + |

Fig. 5. Sign-based qualitative models of the Material Transporter and Transport Connector. [x] denotes the sign of x. “*” in a row means “no restriction” and hence the entire row represents multiple tuples.
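The Sign abstraction and the idea of a qualitative equation as a finite relation can be sketched in a few lines. The helper names `sign`, `qmin` and `min_relation` are hypothetical, and the enumerated relation is derived purely from the generic equation y = min(a, b) over non-negative quantities; it is not claimed to coincide row by row with any table printed in the paper.

```python
from itertools import product

# Ordering of the Sign domain, used to take a qualitative minimum.
ORDER = {"-": -1, "0": 0, "+": 1}


def sign(x):
    """Map a real number to the Sign domain {-, 0, +}."""
    if x > 0:
        return "+"
    if x < 0:
        return "-"
    return "0"


def qmin(a, b):
    """Qualitative minimum of two sign values."""
    return a if ORDER[a] <= ORDER[b] else b


def min_relation(domain=("0", "+")):
    """Finite relation {(y, a, b)} with y = min(a, b) over `domain`.

    The default domain reflects that flows and speeds are non-negative.
    """
    return {(qmin(a, b), a, b) for a, b in product(domain, repeat=2)}


# Abstraction commutes with min: sign(min(a, b)) == qmin(sign(a), sign(b)).
example = sign(min(0.0, 3.5)) == qmin(sign(0.0), sign(3.5))
relation = min_relation()
```

Because each qualitative variable ranges over at most three values, every such relation is a small table, which is what makes table lookups and constraint satisfaction applicable.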

Fig. 5 shows the result of the Sign abstraction of the numerical model: Eqs. (1) and (3) explained above yield the Constraints [1,3] on the qualitative variables, which are stated as both the abstracted qualitative equations and the respective finite relations (remember that flows and speeds cannot be negative). Eqs. (2) and (4) are omitted, because, although required for numerical simulation, they are difficult or impossible to exploit in their qualitative versions, because B(t) and Bout(t) can be neither observed nor predicted properly.

Fig. 5 also shows the Constraints [5] as the qualitative model of Transportation Connector (TC) obtained from Eq. (5). These virtual components appear between any two connected Material Transporters and machines.

The abstraction of Combination Elements (CE, such as the crate packer outlined in Section 3.2) includes the application of the three model fragments of Fig. 5 to all individual inflows as well as a constraint simply stating the qualitative equality of all inflows (the ratio of the flows drops out, because it is a positive number):

$$
[\mathit{in}_1.q_{pot}(t)] = [\mathit{in}_2.q_{pot}(t)] = \dots = [\mathit{in}_k.q_{pot}(t)]
$$

This captures, for instance, the fact that one lacking input will stop all other inputs as well. The same applies to the outputs of Separation Elements.

We briefly demonstrate that the inferential power of the model, despite its simplicity, suffices for handling the class of faults and failures under consideration: assume that a Material Transporter MTn with a single speed vin(t) = vout(t) produces an output, i.e. [MTn.out.qact(t)] = +, but has no inflow, [MTn.in.qact(t)] = 0. Then the constraints (stated as bracketed numbers) yield:

$$
\begin{aligned}
&[MT_n.\mathit{out}.q_{act}(t)] = + \;\overset{[5]}{\Rightarrow}\; [MT_n.\mathit{out}.q_{pot}(t)] = + \;\overset{[3]}{\Rightarrow}\; [MT_n.v_{out}(t)] = [MT_n.v_{in}(t)] = + \\
&[MT_n.\mathit{out}.q_{act}(t)] = + \,\wedge\, [MT_n.v_{in}(t)] = + \;\overset{[1]}{\Rightarrow}\; [MT_n.\mathit{in}.q_{pot}(t)] = + \\
&[MT_n.\mathit{in}.q_{pot}(t)] = + \,\wedge\, [MT_n.\mathit{in}.q_{act}(t)] = 0 \;\overset{[5]}{\Rightarrow}\; [MT_{n-1}.\mathit{out}.q_{pot}(t)] = 0
\end{aligned}
$$

If MTn−1 is operational, which implies [MTn−1.vout(t)] = +, then:

$$
[MT_{n-1}.\mathit{out}.q_{pot}(t)] = 0 \,\wedge\, [MT_{n-1}.v_{out}(t)] = + \;\overset{[3]}{\Rightarrow}\; [MT_{n-1}.\mathit{in}.q_{act}(t - t_d)] = 0
$$

This means, even without information about the buffers, that the lack is propagated backwards across the models of correct elements (but will be consistent with a “blocked” mode, for instance), as expected.
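The chain of inferences above can be replayed mechanically as lookups in the finite relations of Fig. 5. The sketch below encodes the three tables as Python sets and filters them by the known values; the function names `expand` and `infer` and the column labels are hypothetical, and this is only an illustration of table-based inference, not the mechanism used inside the diagnosis engine.

```python
from itertools import product


def expand(rows, wildcard=("0", "+")):
    """Replace every '*' entry by each value in `wildcard`."""
    tuples = set()
    for row in rows:
        choices = [wildcard if v == "*" else (v,) for v in row]
        tuples.update(product(*choices))
    return tuples


# Constraint [1]: ([in.qpot], [vin], [out.qact], [C - B])
C1_COLS = ("in_qpot", "v_in", "out_qact", "free_capacity")
C1 = expand([("0", "0", "*", "+"), ("+", "+", "*", "+"),
             ("+", "+", "+", "0"), ("0", "0", "+", "0"),
             ("0", "+", "0", "0")])

# Constraint [3]: ([out.qpot], [vout], [in.qact(t - td)], [Bout - 1])
C3_COLS = ("out_qpot", "v_out", "in_qact_delayed", "bout_minus_1")
C3 = expand([("0", "0", "*", "0"), ("0", "0", "*", "+"),
             ("+", "+", "*", "0"), ("+", "+", "*", "+"),
             ("0", "0", "+", "-"), ("0", "+", "0", "-"),
             ("+", "+", "+", "-")])

# Constraint [5]: ([MTn.out.qact], [MTn+1.in.qpot], [MTn.out.qpot])
C5_COLS = ("out_qact", "next_in_qpot", "out_qpot")
C5 = expand([("0", "0", "+"), ("0", "+", "0"), ("+", "+", "+")])


def infer(relation, columns, known, target):
    """Values of `target` in all tuples that agree with `known`."""
    idx = {name: i for i, name in enumerate(columns)}
    return {row[idx[target]] for row in relation
            if all(row[idx[k]] == v for k, v in known.items())}


# MTn produces output, so its potential output flow is non-zero ...
step1 = infer(C5, C5_COLS, {"out_qact": "+"}, "out_qpot")
# ... which requires a running output belt (and vin = vout here).
step2 = infer(C3, C3_COLS, {"out_qpot": "+"}, "v_out")
# With vin = + and actual outflow, MTn could take in objects.
step3 = infer(C1, C1_COLS, {"v_in": "+", "out_qact": "+"}, "in_qpot")
# No actual inflow despite in.qpot = +: MTn-1 offers no output.
step4 = infer(C5, C5_COLS, {"out_qact": "0", "next_in_qpot": "+"}, "out_qpot")
# MTn-1 runs but offers nothing: it had no inflow td earlier.
step5 = infer(C3, C3_COLS, {"out_qpot": "0", "v_out": "+"}, "in_qact_delayed")
```

Each step returns a single value, which mirrors the implication arrows in the derivation: the tables leave no alternative once the observed signs are fixed.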

3.4. The diagnosis solution

3.4.1. The challenges

A solution to fault localization in bottling plants based on the models and consistency-based diagnosis as introduced above has to address some fundamental problems:

• Focusing on relevant faults: disruptions in the operation of the various machines and conveyors occur frequently, due to manual intervention or physical problems. Most of them are irrelevant, because they do not affect the crucial component, i.e. the filling machine. Hence, there is no use in globally searching the data for evidence for disturbances in the components. A search for a cause is triggered only if the filler is interrupted due to an external cause.
• Diagnosis with incomplete observations: while many of the controlled machines supply data about their status, not all of them do, and sometimes the messages are not compliant with the standard and cannot be used for diagnostic purposes. In particular, conveyors usually do not provide data. Also, as stated before, the available status data imply only binary (no numerical) information about the presence of a flow of objects.
• Coping with delays: interruptions of flows propagate across the components of the plant with non-negligible delays, as captured by the model of Material Transporter in Table 1. The delays can range from fractions of a second, e.g. in the labeling machines, to the order of 15 min in the conveyors with large buffers. The space of the relevant time points is practically infinite. This creates a problem for exploiting finite constraint satisfaction techniques for the consistency check that is needed for consistency-based diagnosis. We solved this by a technique for factorizing out time and using timeless constraints for the consistency check.
• Handling uncertainty of delays: the delay times are highly uncertain, due to noise (e.g. removed bottles) and unknown states of components (in particular, the number of stored objects in buffers, which is neither measurable nor predictable). One has to assume (conservatively) intervals for the value of the actual delays. This raises the problem of growing intervals upon propagation across several components. Our solution focuses on propagating only the start of some interrupted flow and exploits retrieved observations to narrow down the relevant time intervals.

3.4.2. Architecture and diagnosis process

Fig. 6 shows the general architecture, modules, and information flow of the diagnosis solution, which exploits the diagnosis models presented in Section 3.3. Firstly, the Symptom Scanner requests the Data Interpreter to check for the presence of a set of symptoms in a certain time period. The Data Interpreter takes the request to search for evidence for these symptoms in a standardized database, and returns all time periods for a confirmed symptom, a negative result, or some undecided status. Currently, the only symptom considered is a stop of the filling machine and the database is scanned for the presence of this kind of symptom only. Whenever a symptom has been detected, it is passed, along with temporal information, to Diagnosis, which generates fault hypotheses that could explain a symptom. It does so, in a loop, by letting the Predictor constrain more values based on the next connected component models. With these values, the Predictor searches the database for evidence or refuting information in terms of status data of the various machines indicating either a local disturbance or the effect of a propagated one (lack or tailback) in an appropriate time window. From the retrieved data (or, rather, their abstract meaning), the DiagnosisEngine, RAZ’R [41], a consistency-based diagnosis engine as mentioned in Section 3.1, generates fault hypotheses, if possible. Otherwise, predictions are continued along the chain of components. In this case, the model of correct behavior generates predictions concerning the behavior of the next machine(s), which are then checked against data retrieved from the database. Propagation is also continued if there is no information available or if it is too weak.

When refuting or confirming observations are found, their temporal extensions are used for the next prediction step, replacing the original predictions. This is important, since model-based temporal prediction has to be very conservative, i.e. generate large intervals due to the uncertainty in the delays of propagated effects in order to guarantee that no evidence is missed. Using the observed time periods wherever available restricts widening of the time intervals significantly.

3.4.3. Behavior modes

Each component (Ci) has only two behavior modes, OK(Ci) and Faulty(Ci), which have fairly simple models. Besides Constraints [1] and [3], the OK model of a Material Transporter with single input and output states that it is operational. This means there is either an unrestricted flow (row 1 in Table 2) or, otherwise, there is either a lack at the input causing zero output (row 2) or a tailback, which is propagated to the input (row 3). Note that the latter two cases are undesirable, but due to a cause external to the component and, hence, part of its OK mode.

Fig. 6. Architecture of the LineMod Diagnosis Solution. (Modules: Symptom Scanner, Data Interpreter, Database, Diagnosis, Predictor, Diagnosis Engine; exchanged information: Symptom?, Symptom(t), Obs(t), Fault(t).)

The fault mode of a machine (i.e. a Material Transporter without buffer) simply states that there is no potential (and, therefore, no actual) flow (Table 3), while a conveyor (a Material Transporter with buffer) that is blocked somewhere may still allow an inflow (filling the buffer) or outflow (emptying it), as shown in Table 4.

Models of components with multiple inputs or outputs are slightly more complex, but not different in the underlying principles.

3.4.4. Encoding the available observations

In order to be processed by the model, status messages in the recorded data have to be encoded as value restrictions on model variables. This is straightforward, as shown in Table 5 for single input and output.

The table indicates that the first three rows are consistent with the OK model (Table 2), while the last one is identical to the fault model (Table 3).

For multiple inputs or outputs, the status message may indicate which of them suffers from a tailback or lack.
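The encoding of status messages and its relation to the two behavior modes can be checked with a short sketch. The tuples are copied from Tables 2, 3 and 5 in the column order (in.qact, in.qpot, out.qact, out.qpot); the dictionary and function names are hypothetical and only serve the illustration.

```python
# Column order: (in.qact, in.qpot, out.qact, out.qpot)
OK_MODEL = {
    ("+", "+", "+", "+"),   # normal operation
    ("0", "+", "0", "0"),   # suffering from a lack
    ("0", "0", "0", "+"),   # suffering from a tailback
}
FAULT_MODEL = {("0", "0", "0", "0")}   # no operation

# Table 5: status message -> value restrictions ('*' = no restriction).
STATUS_ENCODING = {
    "Operation": ("+", "+", "+", "+"),
    "Lack": ("0", "+", "*", "*"),
    "Tailback": ("*", "*", "0", "+"),
    "All others": ("0", "0", "0", "0"),
}


def matches(pattern, row):
    """True if `row` satisfies every non-wildcard entry of `pattern`."""
    return all(p == "*" or p == r for p, r in zip(pattern, row))


def consistent_modes(status):
    """Behavior modes whose model admits the encoded observation."""
    pattern = STATUS_ENCODING[status]
    modes = set()
    if any(matches(pattern, row) for row in OK_MODEL):
        modes.add("OK")
    if any(matches(pattern, row) for row in FAULT_MODEL):
        modes.add("Faulty")
    return modes


modes_by_status = {s: consistent_modes(s) for s in STATUS_ENCODING}
```

Under this encoding the first three messages are compatible only with the OK mode and the catch-all message only with the fault mode, which matches the statement made about Table 5.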

3.4.5. An example

We illustrate the operation of the system by considering two steps in the loop (depicted in Fig. 7). We assume that for component C17, its observed message “Lack” indicates that the actual inflow to the component is zero (C17.in.qact = 0) for some time interval i17 = [si17, ei17] and non-zero before. Since C17.in.qpot = + (because C17 was not stopped for internal problems and could have taken in objects), the model predicts B16.out.qpot = 0 for the conveyor B16, which supplies C17. Conveyors usually do not produce data, and the OK model of B16 propagates the lack, i.e. predicts B16.in.qpot = + and B16.in.qact = 0 for some time interval in i16 = [si17, si17] + d16 = [si17, si17] + [sd16, ed16] = [si17 + sd16, si17 + ed16], where d16 is an interval that is guaranteed to contain the actual delay across B16. This implies C15.out.qpot = 0 for a subinterval of i16, i.e. C15 was not able to supply the connected conveyor B16.
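The interval computation for i16 can be written down directly. The following sketch assumes times in seconds and treats the delay bounds as signed offsets, so that a cause preceding the symptom has negative bounds; that sign convention, the example numbers and the helper names are assumptions, since the text only states the interval sum itself.

```python
from typing import NamedTuple, Optional


class Interval(NamedTuple):
    start: float
    end: float


def propagate_start(symptom: Interval, delay: Interval) -> Interval:
    """[s, s] + [sd, ed] = [s + sd, s + ed]: only the start is propagated."""
    return Interval(symptom.start + delay.start, symptom.start + delay.end)


def next_anchor(predicted: Interval, observed: Optional[Interval]) -> Interval:
    """Prefer an observed interval over the wider model-based prediction."""
    return observed if observed is not None else predicted


# Hypothetical numbers: lack at C17 observed from t = 1000 s to t = 1200 s,
# delay across conveyor B16 between 5 s and 900 s before the symptom.
i17 = Interval(1000.0, 1200.0)
d16 = Interval(-900.0, -5.0)
i16 = propagate_start(i17, d16)

# If C15 reports a lack within i16, that observation replaces the prediction.
i15 = next_anchor(i16, Interval(400.0, 1150.0))
```

Propagating only the start time keeps the predicted window as wide as the delay uncertainty, not the sum of symptom duration and delay uncertainty, and replacing it by an observation avoids further widening at the next component.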

Having reached the upstream component C15, there are four different cases to be considered:

1. There are no data available about C15 (either in general or for the relevant period). Then its OK model will propagate the lack just as B16 did (and expand the interval further). The respective prediction has the support {OK(B16), OK(C15)}. Whenever the resulting model-based predictions contradict observations in a future step, these two components (and, perhaps, additional ones) may occur in fault hypotheses.
2. There are data
   2.1. Refuting the predictions (if the status of C15 is “Operating” or “Tailback”, this implies C15.out.qpot = +). Then {OK(B16)} is inconsistent and the DiagnosisEngine produces the only fault hypothesis that B16 is the cause of the disturbance.
   2.2. Consistent with the predictions. Then
      2.2.1. either the data state an internal fault of C15, and the DiagnosisEngine generates the fault hypothesis ¬OK(C15),
      2.2.2. or C15 is in state “Lack”. Then there is an observed time interval i15, for which C15.in.qact = 0 and C15.in.qpot = +. This is the same situation as the one we started from and the system reiterates the steps. This case is depicted in Fig. 7.

Table 2
OK model of a machine with one input and output.

| in.qact | in.qpot | out.qact | out.qpot | Comment |
|---|---|---|---|---|
| + | + | + | + | Normal operation |
| 0 | + | 0 | 0 | Suffering from a lack |
| 0 | 0 | 0 | + | Suffering from a tailback |

Table 3
Fault model of a machine.

| in.qact | in.qpot | out.qact | out.qpot | Comment |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | No operation |

Table 5
Encoding of status messages for a machine with single input and output (“*” means “no restriction”).

| in.qact | in.qpot | out.qact | out.qpot | Status message |
|---|---|---|---|---|
| + | + | + | + | Operation |
| 0 | + | * | * | Lack |
| * | * | 0 | + | Tailback |
| 0 | 0 | 0 | 0 | All others |

Fig. 7. Illustration of two steps of the LineMod Diagnosis Solution. (Rows: C17, B16, C15; columns: Prediction, Observation, Time; time intervals i17, i16, i15.)
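The case distinction for C15 amounts to a small decision function. The sketch below is a paraphrase of the four cases, not the interface of the diagnosis engine: the function name, the returned action labels and the accepted status strings (taken from Table 5, plus the spelling “Operating” used in the text) are assumptions.

```python
from typing import Optional, Set, Tuple


def assess_upstream(status: Optional[str],
                    conveyor: str = "B16",
                    machine: str = "C15") -> Tuple[str, Set[str]]:
    """Return (action, components) for one step of the upstream search."""
    if status is None:
        # Case 1: no data; keep propagating under both OK assumptions.
        return "propagate", {f"OK({conveyor})", f"OK({machine})"}
    if status in ("Operation", "Operating", "Tailback"):
        # Case 2.1: the machine could deliver, so the conveyor is suspect.
        return "fault", {conveyor}
    if status == "Lack":
        # Case 2.2.2: same situation one step further upstream.
        return "iterate", {machine}
    # Case 2.2.1: any other message is read as an internal fault.
    return "fault", {machine}


decisions = {s: assess_upstream(s)
             for s in (None, "Operation", "Tailback", "Lack", "Failure")}
```

Here “Failure” merely stands for any status message outside the first three rows of Table 5.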

The example also motivates the applied technique of temporal factorization: we separate the calculation of the temporal relationships (needed to focus the access to the database) from the consistency check, which only needs to test whether the respective variable values are consistent. In other words, we directly use the third table in Fig. 5 and ignore the temporal dependency, i.e. we do not associate a temporal index with the values.

Of course, this will cause spurious inconsistencies if the propagation infers or retrieves values for the same variable at different times, which could be different. We address this by duplicating components (and, thus, their variables) that the scheme described above may visit more than once. In our application, this is very restricted regarding time and structure, and can be analyzed beforehand.

4. Results

4.1. Validation of the base model

Firstly, we validated the basic component models described in Section 3.2. We implemented them as numerical simulation models in MATLAB/SIMULINK® [45] and compared the simulated behavior (using the solver “ode4” (Runge–Kutta) with a fixed step size of one second) to the behavior of real plants. Every component was modeled using the relevant equations (see Table 1), and the model was tested in isolation to check whether it was adequate and stated in a context-independent manner, which is a prerequisite for compositionality. In a second step, a model of a complete plant was configured using the validated components.

To test the individual components, values of single parameters and variables were varied and the response of the simulated behavior was monitored. For example, the predicted changes in the buffered material B of a component for different values of the input speed vin and the output speed vout are shown in Fig. 8. It is shown that the buffer fills as long as the input speed is higher than the output speed (assuming a sufficient supply), whereas when the input speed reduces to its minimum of 0.1 and the output speed is still high, the quantity of buffered objects decreases.

Due to the minimal transportation time td of the component, the buffer is not completely emptied, as long as there is input available. Furthermore, only the objects represented by the variable Bout determine the existence of an output flow. Another real characteristic behavior can be reproduced by increasing the input speed whilst keeping the output speed constant. Although vin is still higher than vout, the buffer filling degree remains constant after a certain time, because it is limited by the maximum capacity of the component. Similar results were achieved by testing the other component type models, providing evidence that the models capture the features relevant to the diagnostic task and do not violate context-independence.

Fig. 8. Buffer response (lower graph) to variation of vin and vout. (Upper graph: v [m/s] over time in s; lower graph: B [objects] over time in s.)

The second challenge was validation by comparing the simulated behavior of a plant model to the behavior of a real plant (Plant A). Several test cases were constructed, based on real-world downtime scenarios of the bottling plant whose topology is shown in Fig. 9.

Table 4
Model of a blocked conveyor (“*” means “no restriction”).

| in.qact | in.qpot | out.qact | out.qpot | Comment |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | Completely blocked: produces tailback and lack |
| * | + | 0 | 0 | Produces only lack at succeeding component |
| 0 | 0 | * | + | Produces only tailback for preceding component |
| 0 | 0 | * | + | Produces only tailback for preceding component |

The simulated plant consists of a primary flow of bottles and a secondary object flow of crates. In one test case, the downtime propagation of a failure of the crate washer was simulated and analyzed. This failure affects both object flows. After some delay, missing input occurs at the crate packer. Also the unpacker stops at some point, due to one of its outputs being blocked. The details of the propagation of failure depend on the capacities and filling degrees of the various buffers connecting the machines. For instance, if the crate magazine is empty and all other buffers are filled to a sufficient degree, the lack of crates will rapidly reach the crate packer. This causes blockage of the labeling machine and the bottle filling machine (because the packer is not able to process the bottles) before the lack of bottles in the primary flow (caused by the inoperable unpacker) reaches the filling machine. In contrast, if the crate magazine is completely full, the crate packer keeps working for some time, and the filling machine will stop due to a lack of bottles. Even for this complex scenario, the simulation model reproduces the behavior of the real world plant. Similarly, the characteristics of fault propagation occurring in real plants were predicted for other relevant scenarios.

Fig. 9. The structure of the test plant A. (Components shown: depalletizer, crate unpacker, sniffer block, bottle washer, empty bottle inspector, bottle filler, labeling machine, crate packer, palletizer, crate washer, crate magazine.)

4.2. Evaluation of the implemented diagnosis solution

Secondly, we evaluated the diagnosis solution based on the implemented qualitative component models (see Section 3.3). We followed two routes. One was comparison with fault localization on two real plants (Plants “A” and “B”) performed by human experts. Since it is difficult and time-consuming for them to do this based on the recorded data (which is actually what motivated the project), the reference diagnoses were obtained by direct on-site observation. A number of experts observed the plant and immediately tried to identify the causes of filling machine stops. Thus, a protocol was created that contains disturbances of the Bottle Filling machine as symptoms with relevant time intervals, and with manually associated causes of each symptom. This protocol served as the reference symptom cause for assessing the quality of the diagnosis result produced by the diagnosis solution. Obviously this could not be done for longer periods than a few days, but it nevertheless produced a set of more than a hundred relevant cases for the evaluation.

Table 6
Evaluation of results for the diagnosis solution.

| Group | Item | Real plant A: Cases | Real plant A: Percent | Real plant B: Cases | Real plant B: Percent | Sum real plants A and B: Cases | Sum real plants A and B: Percent | Simulated plant C: Cases | Simulated plant C: Percent |
|---|---|---|---|---|---|---|---|---|---|
| Failures | Caused by filler | 56 | 83.6 | 256 | 61.7 | 312 | 64.7 | 0 | 0.0 |
| Failures | Lack | 4 | 6.0 | 122 | 29.4 | 126 | 26.1 | 46 | 59.7 |
| Failures | Tailback | 7 | 10.4 | 37 | 8.9 | 44 | 9.1 | 31 | 40.3 |
| Diagnosis correctness, all cases | Compliance | 63 | 94.0 | 397 | 95.7 | 460 | 95.4 | 80 | 100.0 |
| Diagnosis correctness, all cases | No compliance | 4 | 6.0 | 18 | 4.3 | 22 | 4.6 | 0 | 0.0 |
| Diagnosis correctness, non-trivial cases | Compliance | 7 | 63.6 | 141 | 88.7 | 148 | 87.1 | 80 | 100.0 |
| Diagnosis correctness, non-trivial cases | No compliance | 4 | 36.4 | 18 | 11.3 | 22 | 12.9 | 0 | 0.0 |

The numbers in italics show the diagnosis compliance results of the two real plant case studies (plant A and plant B). This is the essential result of the evaluation of the presented diagnosis solution related to practical applicability and therefore illustrated in special letters.

As one cannot expect to encounter all kinds of interesting cases in a limited time (and in the chosen plants), the second way of evaluation was performed by simulation. For this purpose, a model of the real plant A (see Fig. 9) was created for discrete event simulation and named Plant “C”. It was represented in the software tool “Plant Simulation” [46] and its behavior was verified against the real plant. With the help of this simulation environment, we generated data for a set of 77 additional non-trivial virtual cases of disturbances for comparison with the results. The evaluation checked for what we call compliance, i.e. whether the relevant symptoms (filling machine stops) were accurately detected and whether the real (observed or simulated) causes were diagnosed. Table 6 summarizes the statistical evaluation of the results.
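The compliance percentages for the real plants follow directly from the case counts in Table 6. As a quick check, the sketch below recomputes them for the non-trivial cases; the counts are copied from the table, while the variable and function names are hypothetical.

```python
def percent(part: int, total: int) -> float:
    """Share of `part` in `total`, in percent, rounded to one decimal."""
    return round(100.0 * part / total, 1)


# (compliant, non-compliant) non-trivial cases per real plant, from Table 6.
NON_TRIVIAL = {"A": (7, 4), "B": (141, 18)}

per_plant = {plant: percent(ok, ok + bad)
             for plant, (ok, bad) in NON_TRIVIAL.items()}

total_ok = sum(ok for ok, _ in NON_TRIVIAL.values())
total_cases = sum(ok + bad for ok, bad in NON_TRIVIAL.values())
overall = percent(total_ok, total_cases)
```

The overall figure is a case-weighted value (148 of 170 cases), so it is dominated by plant B, which contributes most of the non-trivial cases.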

Fig. 10. Gantt chart of failure scenario 1. Legend: red = failure, yellow = tailback, blue = lack. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

During the observation period only 67, mostly trivial, filling machine stops of plant A occurred. Unfortunately, plant A produced a fairly insufficient database. Bad data included even erroneous state messages of machines. Despite this, 7 of 11 non-trivial symptom causes could be diagnosed correctly.

Plant B was designed in accordance with a data acquisition standard (see [47]), which resulted in much better quality of the automatically acquired machine data. Nevertheless, even for this plant some obviously bad data had to be processed. Due to the fact that data was available for four production days, the statistical evaluation of the diagnosis results of plant B can be seen as a representative example indicating the quality of results based on an appropriate database. Indeed, 88.7% of the non-trivial symptom causes were diagnosed correctly.

In sum, the two real plant case studies (plant A and plant B) showed a compliance of 87.1%, i.e. 87.1% of the non-trivial symptom causes were correctly diagnosed.

The data of the simulated plant C have been generated completely and free of errors. Based on this, all of the 80 simulated non-trivial symptom causes of plant C were correctly diagnosed.

4.3. Non-trivial diagnosis scenarios

Fig. 11. Failure propagation (red arrows) via a sub-branch of the production line causing a lack of bottles at the filler. Annotated nodes: crate washer (failure), crate unpacker (tailback of crates), sniffer block, bottle washer, empty bottle inspector and bottle filler (lack of bottles); CF = crate flow, BF = bottle flow. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

To illustrate the complexity of the diagnosis task, we present two exemplary non-trivial scenarios from the evaluation. While blockage of machines or conveyors directly downstream or upstream from the filling machine are frequent causes of lacks or tailbacks, disturbances may propagate via several machines and conveyors not only on the main-branch of the production line (involving the bottles) but also via its sub-branches. In the first scenario taken from the simulated plant (“plant C”), a disturbance at the crate washer was the origin of a lack of bottles at the bottle filling machine via the following causal chain displayed in Fig. 9:

• the disturbance at the crate washer causes a tailback of crates, which ultimately reaches and stops the crate unpacker;
• the stopped crate unpacker interrupts the flow of bottles, causing a lack which propagates successively to the sniffer block, bottle washer, empty bottle inspector, and, finally, to the bottle filling machine.
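The causal chain above can be reproduced with a small graph traversal. The following Python sketch is illustrative only and is not the qualitative model used in the project: the edge list `FLOW` is an assumed simplification of the plant topology, and the propagation rules (failures and tailbacks spread upstream as tailbacks, every stopped component starves its downstream neighbors) deliberately ignore buffers, timing, and the conversion of a lack into a tailback at merging machines that occurs in the second scenario.

```python
from collections import deque

# Directed material flow (crates and bottles), simplified from the first scenario.
# Component names follow the text; the edges are an illustrative assumption.
FLOW = {
    "crate unpacker": ["crate washer", "sniffer block"],
    "crate washer": ["crate magazine"],
    "crate magazine": ["crate packer"],
    "sniffer block": ["bottle washer"],
    "bottle washer": ["empty bottle inspector"],
    "empty bottle inspector": ["bottle filler"],
    "bottle filler": ["labeling machine"],
    "labeling machine": ["crate packer"],
    "crate packer": [],
}


def upstream_of(flow):
    """Invert the flow graph: map each component to its direct suppliers."""
    up = {name: [] for name in flow}
    for src, targets in flow.items():
        for dst in targets:
            up[dst].append(src)
    return up


def propagate(flow, origin):
    """Return the qualitative state of every component reached from a hard fault."""
    up = upstream_of(flow)
    state = {origin: "failure"}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        # A failed or blocked component cannot accept input: tailback upstream.
        if state[node] in ("failure", "tailback"):
            for pred in up[node]:
                if pred not in state:
                    state[pred] = "tailback"
                    queue.append(pred)
        # Any stopped component delivers nothing: lack downstream.
        for succ in flow[node]:
            if succ not in state:
                state[succ] = "lack"
                queue.append(succ)
    return state


states = propagate(FLOW, "crate washer")
for machine in ("crate unpacker", "sniffer block", "bottle filler"):
    print(machine, "->", states[machine])
```

With the crate washer as origin, the crate unpacker ends up in a tailback and the components along the bottle flow, including the bottle filler, end up with a lack, which matches the chain described in the text.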

Fig. 10 shows a Gantt chart of the failure scenario. Such charts are currently the main aid for experts in their diagnosis process. Every line represents the operating states of a certain machine within a defined time interval. The lines marked with arrows are of particular interest because they show the failure propagation through the plant. Manual localization is considered complicated because the disturbance propagated from the crate line all the way to the bottle filling machine. The diagnosis solution correctly localized the crate washer as the origin of the symptom.

The second scenario (shown in Fig. 11) is taken from the diagnostic data of the real plant A, whose data quality was quite poor. It illustrates the capabilities of the model-based diagnosis solution even under such conditions. A defective checkmat (a crate inspection machine) upstream from the crate unpacker caused a tailback of bottles at the bottle filling machine as follows:

• a lack of crates propagates and successively stops the crate unpacker, crate washer, crate magazine, and crate packer;
• the stopped crate packer produces a tailback of bottles, which propagates via the labeling machine to the bottle filling machine.

Despite the completely non-observable crate line between the crate unpacker and crate packer (Crate flow (CF) section of Fig. 11), the system model produced predictions for this section and enabled the diagnosis system to produce a proper fault localization, namely the upstream process component which contains the defective checkmat, but did not allow further discrimination due to the lack of data. This example shows that the diagnosis solution provides the best possible results even if the available data is incomplete or not available for some components.
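The effect of unobserved components on discrimination can be illustrated with a toy consistency check. The sketch below is not the algorithm of the diagnosis engine used in the project; it is a brute-force enumeration of single-fault hypotheses on a hypothetical linear line (`m1` to `m4`), keeping every hypothesis whose predicted states agree with all available observations.

```python
LINE = ["m1", "m2", "m3", "m4"]  # hypothetical linear line, material flows m1 -> m4


def predict(line, faulty):
    """Predicted states for a single hard fault: tailback upstream, lack downstream."""
    i = line.index(faulty)
    return {
        name: ("tailback" if k < i else "failure" if k == i else "lack")
        for k, name in enumerate(line)
    }


def consistent_candidates(line, observations):
    """Single-fault hypotheses whose predictions match every observed state.

    Components missing from `observations` are unobserved and constrain nothing.
    """
    candidates = []
    for faulty in line:
        prediction = predict(line, faulty)
        if all(prediction[m] == s for m, s in observations.items()):
            candidates.append(faulty)
    return candidates


# Only the two ends of the line deliver data: m2 and m3 cannot be told apart.
print(consistent_candidates(LINE, {"m1": "tailback", "m4": "lack"}))
# One more observed machine is enough to discriminate.
print(consistent_candidates(LINE, {"m1": "tailback", "m2": "tailback", "m4": "lack"}))
```

With data from the two ends of the line only, both inner machines remain candidates; this mirrors the situation in which the fault could be localized to a section of the plant but not to a single component.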

5. Discussion and outlook

The results have triggered significant interest from both end users (breweries, etc.) and suppliers. This demonstrates that the diagnosis solution addresses the needs explained above by localizing the interruptions of transportation causing downtime of the filling machine based on the available data from the machines (collected over a period of days to months) stored in a database. The goal of the research project was to find a solution for automatic fault diagnosis in bottling plants. This solution needed to be affordable for the potential end users, which are mostly small or medium-sized enterprises. Additionally, it had to be easily adaptable to different plant layouts and to changes to the plant structure. We met these requirements by following the model-based approach and developing a new method for considering time propagation. This enabled us to use a readily available standard algorithm for consistency-based diagnosis. The necessary flexibility was gained by conceptual separation of the representation of the plant structure and the diagnosis algorithm.

The correctness of the diagnosis (an average of 87.1% was determined for the two real plants that were considered) already makes it a useful tool in practice, but there is still scope for further improvement. Wrong diagnosis results were often caused by errors in the database, inexact manual downtime analysis, or plant-specific critical points which were just too difficult to find automatically (missing observation data for the specific problem). If there are missing data, for example, due to a communication failure of a machine, the diagnosis algorithm may find several fault hypotheses that are compliant with the data. A plant usually comprises machines from different suppliers, all coming with their own control systems and specific data. Since it is not feasible for end users to generate a homogeneous set of data from these different sources, work was undertaken by one of the project partners to establish a standard for production data acquisition (PDA) in bottling plants, resulting in the so-called “Weihenstephaner Standard 2005” (WS2005, [47]). This standard is now widely accepted and will be an essential prerequisite for successful diagnosis solutions in the future. The project’s results have already led to extension of this standard for diagnosis purposes. The extended WS2005 standard includes data points such as:

• Operating states of machines according to WS2005.
• Maximal output rate setting of machines.
• Object counters (input, output, or rejection).
• Information about mechanical barriers.
• Assignment of jam switches or buffer filling degrees, if available.
• Transportation or machine speeds, if available.
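As an illustration of how such data points might be held in a diagnosis application, the listed items can be grouped into one record per machine and timestamp. The following Python sketch is hypothetical: the class name, field names, units, and the example state value are assumptions made for illustration and are not the tag names defined by WS2005. Optional fields reflect that some data points are only provided "if available".

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class MachineSample:
    """One acquired sample for one machine (hypothetical field names)."""
    machine: str
    timestamp: float                         # seconds since epoch (assumed unit)
    operating_state: str                     # operating state according to WS2005
    max_output_rate: Optional[float] = None  # maximal output rate setting
    count_in: Optional[int] = None           # object counter: input
    count_out: Optional[int] = None          # object counter: output
    count_rejected: Optional[int] = None     # object counter: rejection
    barrier_closed: Optional[bool] = None    # mechanical barrier information
    jam_switch: Optional[bool] = None        # jam switch, if available
    buffer_fill: Optional[float] = None      # buffer filling degree, if available
    speed: Optional[float] = None            # transportation or machine speed

    def missing_fields(self) -> List[str]:
        """Names of data points this machine did not deliver."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


sample = MachineSample("bottle filler", 0.0, "operating", count_out=1200)
print(sample.missing_fields())
```

Listing the missing data points per machine makes it explicit which parts of the plant are unobserved before a diagnosis is attempted.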

The standard specifies an optimal set of data, but in practice many machines provide only a small subset or no data at all, different suppliers may interpret and handle the data differently (e.g. counters in a cumulative or incremental way), and erroneous data may even be present.
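The counter example can be made concrete. Before counters from different suppliers are compared, they have to be brought into one representation; a minimal Python sketch, assuming integer readings at equidistant sampling times and ignoring counter resets and overflow, could look as follows. The function name and signature are hypothetical.

```python
from typing import List, Sequence


def to_increments(readings: Sequence[int], cumulative: bool) -> List[int]:
    """Convert counter readings to objects counted per sampling interval.

    Incremental counters are returned unchanged. For cumulative counters the
    differences of successive readings are returned, so the result has one
    element fewer than the input.
    """
    if not cumulative:
        return list(readings)
    increments = []
    for previous, current in zip(readings, readings[1:]):
        if current < previous:
            # Assumption: resets and overflow are not modeled in this sketch.
            raise ValueError("cumulative counter decreased; reset handling is not modeled")
        increments.append(current - previous)
    return increments


print(to_increments([100, 130, 130, 175], cumulative=True))
print(to_increments([30, 0, 45], cumulative=False))
```

A zero increment in the normalized series indicates that no object passed the machine in that interval, which is the kind of information a diagnosis of lacks and tailbacks relies on.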

The LineMod research project discussed here solely considered reasons for filling machine downtimes, and only those within the plant, and thus has a number of limitations:


• LineMod focused only on “hard” failures (stop of the bottle filling machine) caused by hard faults (zero flow). For capturing “soft” faults (e.g. reduced output) which lead, perhaps in combination, to a hard failure or non-optimal behavior, a different model is required.
• Many efficiency losses are not caused by technical components but rather by logistic processes downstream or upstream of the plant. To analyze such situations, we need to extend the plant model.
• Sometimes machine downtimes could not be assigned to the actual cause, because a machine treated objects incorrectly (e.g. improper cleaning of dirty bottles). The effect of the defective treatment may become evident only after some time period at another machine (e.g. the empty bottle inspector, which will reject a large number of bottles, thus potentially reducing the available input to the filling machine). The impact of the process steps on the objects is not included in the LineMod model, which prevents the algorithm from identifying the real downtime cause.
• Filling machine downtimes are often caused by a combination of disturbances of different machines. In such situations, the diagnosis has to generate more sophisticated fault hypotheses.

To address these additional challenges, we have started a follow-up research project called LineMET (model-based efficiency analysis tool for complex and interconnected filling and packaging lines). This project will extend the system model and further optimize the diagnosis algorithm. Additionally, we will work on a user-friendly demo application to be tested by practitioners in many bottling plants. This will pave the way for commercial applications of the model-based fault localization approach.

Acknowledgments

We would like to thank all colleagues involved in the LineMod project from the Model-based Systems and Qualitative Modeling Group (MQM) and the Department of Food Packaging Technology (LVT). Special thanks go to B. Ertl who produced the first version under high time pressure and to A. Kather for his outstanding work for this project. We are also indebted to the end users and suppliers for their outstanding support. O. Dressler from OCC’M Software, which provided the diagnosis engine and also gave technical support, deserves a special mention and contributed greatly to the success of the project. Finally, we wish to thank the Federal Ministry of Economics and Technology – Germany for their financial support. The LineMod (233 ZBG/1) research project was sponsored via the AiF by the IVLV, Wifoe, and FKM under the program to promote joint industrial research and development (IGF).

References

[1] T. Al-Hawari, F. Aqlan, M. Al-Buhaisi, Z. Al-Faqeer, Simulation-based analysis and productivity improvement of a fully automatic bottle-filling production system: a practical case study, in: Second International Conference on Computer Modeling and Simulation, ICCMS ’10, 2010, pp. 195–199.

[2] P. Tsarouhas, Evaluation of overall equipment effectiveness in the beverage industry: a case study, Int. J. Product. Res. 51 (2) (2013) 515–523.

[3] F. Pereira Castro, F. Oliveira de Araujo, Proposal for OEE (Overall Equipment Effectiveness) indicator deployment in a beverage plant, Braz. J. Oper. Product. Manage. 9 (1) (2012) 71–84.

[4] P. Struss, A. Kather, D. Schneider, T. Voigt, Qualitative modeling for diagnosis of machines transporting rigid object, in: L. Bradley, L. Trave-Massuyes (Eds.), 22nd International Workshop on Qualitative Reasoning QR08, Boulder Co, 2008.

[5] P. Struss, Extensions to ATMS-based diagnosis, in: J.S. Gero (Ed.), Artificial Intelligence in Engineering: Diagnosis and Learning, 1988, pp. 3–27.

[6] P. Struss, A. Kather, D. Schneider, T. Voigt, Qualitative modeling for diagnosis of machines transporting rigid objects, in: A. Grastien, M. Stumptner (Eds.), Proceedings 19th International Workshop on Principles of Diagnosis, Blue Mountains, Australia, 2008.

[7] J. Buzacott, Stochastic Models of Manufacturing Systems, Prentice Hall, Englewood Cliffs, 1993.

[8] G. Liberopoulos, P. Tsarouhas, Reliability analysis of an automated pizza production line, J. Food Eng. 69 (2005) 79–96.

[9] T. Rädler, Modellierung und Simulation von Abfüllinien, Dissertation, Technische Universität München, 1999.

[10] M. Weiß, Simulationsgestützte schwachstellenanalyse – produktionsreserven in anlagen erschließen, Ind. Anz. 118 (49) (1996) 46–47.

[11] H.T. Papadopoulos, C. Heavey, Queueing theory in manufacturing systems analysis and design: a classification of models for production and transfer lines, Eur. J. Oper. Res. 92 (1) (1996) 1–27.

[12] A. Dolgui, A. Eremeev, A. Kolokolov, V. Sigaev, A genetic algorithm for the allocation of buffer storage capacities in a production line with unreliable machines, J. Math. Model. Algor. 1 (2) (2002) 89–104.

[13] D. Spinellis, C.T. Papadopoulos, A simulated annealing approach for buffer allocation in reliable production lines, Ann. Oper. Res. 93 (2000) 373–384.

[14] A. Kather, Fehlerlokalisierung in verketteten Produktionslinien am Beispiel von Lebensmittelverpackungsanlagen, Dissertation, Technische Universität München, 2008.

[15] E. Bottani et al., Sizing and design of a bottling plant by means of a simulation model, Ind. Delle Bevande 201 (1) (2006) 1–10.

[16] R. Füssmann, Anlagentechnik – Simulation, KHS – Maschinen und Anlagenbau AG, 2005 <www.khs-ag.com>.

[17] W. Krug, Modellierung, Simulation und Optimierung für Prozesse der Fertigung, Organisation und Logistik, Ghent u.a., SCS-Europe, 2001.

[18] J. Sedlaczek, Projektbegleitende Anlagensimulation spart Zeit und Kosten, Brauwelt 140 (8) (2000) 311–315.

[19] A. Kumpf, Anforderungsgerechte Modellierung von Materialflusssystemen zur planungsbegleitenden Simulation, Dissertation, Technische Universität München, 2001.

[20] B. Schmidt, Systemanalyse und Modellaufbau – Grundlagen der Simulationstechnik, Springer-Verlag, Heidelberg u.a., 1985.

[21] T. Voigt, Neue Methoden für den Einsatz der Informationstechnologie bei Getränkeabfüllanlagen, Dissertation, Technische Universität München, 2004.

[22] D. Troupis, S. Manesis, N.T. Koussoulas, T. Chronopoulos, Computer integrated monitoring, fault identification and control for a bottling line, in: Conference Record of the 1995 IEEE Industry Applications Conference, vol. 2, 1995, pp. 1549–1556.

[23] T. Voigt, M. Schmidt, H. Weisser, Entwickeln eines wissensbasierten Werkzeugs zum Zuordnen füllerrelevanter Störungen, Project report, Technische Universität München, 2003.

[24] N.N., Automation Solutions for the Brewing Industry, Proleit AG, January 2012 <http://www.proleit.com/ag/main/industries/breweries/grolsch-brewery-netherlands/>.

[25] N.N., Krones Information Technology KIT – IT Solutions for Optimizing Production, Krones AG, January 2012 <http://www.krones.com/downloads/it_kit_e.pdf>.

[26] P. Struss, Model-based problem solving, in: F. van Harmelen, V. Lifschitz, B. Porter (Eds.), Handbook of Knowledge Representation, Elsevier, 2008, pp. 395–465.

[27] A. Beschta, O. Dressler, H. Freitag, M. Montag, P. Struss, A model-based approach to fault localization in power transmission networks, Intell. Syst. Eng. 2 (1) (1993) 3–14.

[28] O. Dressler, P. Struss, The consistency-based approach to automated diagnosis of devices, in: G. Brewka (Ed.), Principles of Knowledge Representation, CSLI Publications, Stanford, 1996, pp. 267–311.

[29] B. Williams, P. Nayak, A model-based approach to reactive self-configuring systems, in: 7th International Workshop on Principles of Diagnosis (DX96), Montreal, 1996.

[30] L. Trave-Massuyes, R. Milne, Gas turbine condition monitoring using qualitative model based diagnosis, IEEE Exp. Mag. (1997).

[31] R. Cunis, Modellbasierte Diagnose mit DiaMon, in: L. Hotz, T. Guckenbiehl, P. Struss (Eds.), Intelligente Diagnose in der industriellen Anwendung, Shaker, Aachen, 2000.

[32] T. Guckenbiehl, B. Münker, Überwachung und Diagnose in Färbereianlagen, in: L. Hotz, T. Guckenbiehl, P. Struss (Eds.), Intelligente Diagnose in der industriellen Anwendung, Shaker Verlag, Aachen, 2000.

[33] F. Cascio, L. Console, M. Guagliumi, M. Osella, A. Panati, S. Sottano, D. Theseider-Dupré, Strategies for on-board diagnostics of dynamic automotive systems using qualitative models, AI Commun. (1999).

[34] V. Venkatasubramanian, R. Rengaswamy, S.N. Kavuri, K. Yin, A review of process fault detection and diagnosis, Part III, Comput. Chem. Eng. 27 (3) (2003) 293–346.

[35] B. Peischl, F. Wotawa, Model-based diagnosis or reasoning from first principles, IEEE Intell. Syst. 18 (3) (2003) 32–37.

[36] Y. El Fattah, R. Dechter, Diagnosing tree-decomposable circuits, in: Proceedings of the International Joint Conferences on Artificial Intelligence, 1995.

[37] M. Stumptner, F. Wotawa, Diagnosing tree-structured systems, Artif. Intell. 127 (2001) 1–29.

[38] J. De Kleer, B.C. Williams, Diagnosing multiple faults, Artif. Intell. 32 (1) (1987) 97–130.

[39] P. Struss, O. Dressler, Physical negation – integrating fault models into the general diagnostic engine, in: Proceedings of the 11th International Joint Conference on Artificial Intelligence, 1989, pp. 153–158.


[40] O. Dressler, P. Struss, Model-based diagnosis with the default-based diagnostic engine, effective control strategies that work in practice, in: 11th European Conference on Artificial Intelligence, ECAI-94, 1994.

[41] OCC’M Software GmbH <www.occm.de>.

[42] S. Flad, T. Voigt, P. Struss, Automatic detection of critical points in bottling plants with a model-based diagnosis algorithm, J. Inst. Brew. 116 (4) (2010) 354–359.

[43] N.N., Diagnosemodelle für verkettete Abfüll- und Verpackungslinien in der Lebensmittelindustrie (LineMod), Closing report, AiF ZUTECH Project (233 ZBG), 2008, obtainable via <www.ivlv.org>.

[44] F. Rossi, P. van Beek, T. Walsh, Constraint Programming, in: F. van Harmelen, V. Lifschitz, B. Porter (Eds.), Handbook of Knowledge Representation, Elsevier, 2008, pp. 181–212.

[45] Math Works <http://www.mathworks.com/products>.

[46] Siemens PLM Software <www.emPlant.com>.

[47] A. Kather, C. Kreikler, T. Voigt, Weihenstephan Standards for Production Data Acquisition, Version 2005.05, Technische Universität München, Chair of Food Packaging Technology, 2010.