Benchmarking and robust multi-agent-based production planning and control

Engineering Applications of Artificial Intelligence 16 (2003) 307–320

ARTICLE IN PRESS

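*Corresponding author. Tel.: +49-721-608-4264; fax: +49-721-608-7141.

E-mail address: [email protected] (D. Frey).

0952-1976/$ - see front matter © 2003 Elsevier Ltd. All rights reserved.

doi:10.1016/S0952-1976(03)00075-7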


Daniel Frey a,*, Jens Nimis b, Heinz Wörn a, Peter Lockemann b

a Institute for Process Control and Robotics, Universität Karlsruhe (TH), Engler-Bunte-Ring 8, Karlsruhe 76131, Germany

b Institute for Program Structures and Data Organisation, Universität Karlsruhe (TH), Am Fasanengarten 5, Karlsruhe 76128, Germany

Abstract

Multi-agent systems (MAS) offer new perspectives compared to conventional, centrally organised architectures in the scope of production planning and control. They are expected to be more flexible and robust while dealing with a turbulent production environment and disturbances. In this paper, an MAS is developed and compared to an Operations Research Job-Shop algorithm using a simulation-based benchmarking scenario. Environmental constraints for a successful application of MAS are identified and classified. Furthermore, the topic of MAS robustness is addressed by applying database technologies on the basis of transactions.

© 2003 Elsevier Ltd. All rights reserved.

Keywords: Modelling; Simulation; Production planning; Scheduling; Flexibility; Reliability and multi-agent approach

1. Introduction

Companies nowadays have to face a global market characterised by numerous competitors, a steadily increasing complexity of business processes and a highly turbulent production environment. Consequently, manufacturing systems have to provide the flexibility and reliability that is required to stay competitive.

Decentralised planning and controlling approaches offer interesting perspectives compared to conventional centralised architectures. In the scope of production planning and control (PPC), multi-agent systems (MAS) are expected to be more flexible than centrally organised systems. Nevertheless, they lack the reliability and robustness that are necessary for an industrial deployment.

To prove or disprove the thesis that MAS are more flexible and are thus able to increase the planning quality for well-defined shop floor scenarios, a simulation-based benchmarking platform on the basis of a real test case scenario was developed at the University of Karlsruhe in the scope of the Karlsruhe Robust Agent SHell (KRASH) project. A performance measurement system is included to provide not only qualitative, but also quantitative results. The platform is used to compare existing PPC approaches based on Operations Research (OR) algorithms with decentralised MAS approaches. Furthermore, different scenarios can be simulated with various levels of complexity. This makes it possible to set up a map that identifies application scenarios where MAS provide a real benefit to potential industrial users. In the next step, abstract rules can be extracted from these results to gather further knowledge about the preferences of MAS.

Besides the quality of the planning results, robustness is a very important aspect of a manufacturing system, especially since the focus of the project is set upon handling disturbances like machine failures or tardiness caused by external suppliers. On the shop floor, reliability may be guaranteed by sophisticated planning algorithms.

On the other hand, the software implementation of the MAS has to be robust, too. Due to the distributed architecture consisting of autonomous and intelligent entities, MAS are more error-prone than central approaches. Thus, special attention has to be paid to technical robustness issues. Robustness and reliability are common features of modern database systems. Consequently, database technologies are used to provide services that guarantee the robust execution of agent tasks. The implementation of robust MAS is simplified by defining a framework for the transaction-based execution of agent tasks. Local and dispersed agent plans are executed in a robust way by using a transaction service. Its performance and scalability were evaluated by using simulation technologies.

In Section 3, the simulation-based benchmarking platform is described. The developed MAS approach is presented in Section 4. Section 5 shows the results of the comparison of the centralised and decentralised planning approaches and draws conclusions from the results. The transaction-based robustness service and simulation results are introduced in Section 6. Section 7 summarises this paper.

2. Benchmarking scenario

In this project, the suitability of MAS in the range of production planning and control is analysed. The transferability of the results to industrial shop floor scenarios is one of the major prerequisites of the KRASH project. Thus, the evaluation has to be performed on the basis of a real, or at least a realistic, production scenario. Besides the industrial relevance, the application of MAS has to be motivated. In technical literature (for example Weigelt and Mertens, 1999; Spieck et al., 1995 or Cavalieri, 2000), MAS are described as more flexible and robust in a dynamic, turbulent production environment compared to centralised approaches. In addition, they are able to handle complex production planning problems more effectively by dividing them into less complex partial planning problems. As a consequence, the scenario has to be characterised by

* a sufficient production planning complexity,

* the occurrence of short-term disturbances like machine failures, and

* robustness and flexibility as key requirements.

The benchmarking scenario is based upon a circuit breaker production plant. The shop floor layout is depicted in Fig. 1. Within this plant, a production area ("Unit Assembly Area") is chosen, where components are assembled that are used in the final assembly later. The area consists of 13 assembly lines. Six different component families and four sub-component families, which are part of the components, are assembled. Thus, the test case represents a multi-level assembly.

Fig. 1. Shop floor layout.

The material flow is controlled by a Kanban system (Ohno, 1993). As mentioned above, the component assembly is the predecessor of the final assembly. The raw material consumption in the final assembly determines the production in the component assembly. The orders in this area, including start dates, product IDs and quantities, have been extracted in a simulation study beforehand, so this part of the plant can be analysed separately to reduce the complexity of the task in a reasonable way. The correctness of the order data has been confirmed by the industrial partner that runs the production plant.

A Kanban system is especially suitable for the integration of a MAS. Both systems are highly distributed, since Kanban consists of decentralised, self-controlling control cycles. In this project, one of the control cycles is internally planned and controlled by a MAS. To level the workload of the machines, a line balancing has to be performed. The main goal while setting up a Kanban system is the minimisation of the internal buffer stock. The two parameters that directly affect the buffer stock are the maximum consumption rate of the raw material and the maximum replenishment lead time (Ohno, 1993). The problem of minimising the consumption rate is handled by production smoothing (dispatching of a suitable product mix, so that the consumption rate of the raw materials becomes almost constant), whereas the replenishment lead time depends on the internal production planning and control and the corresponding process structure (production of small lot sizes and a material flow-oriented shop floor layout). In the case of disturbances, the standard deviation of the replenishment lead time (and thus its maximum) has to be minimised. MAS are expected to deal with this task more effectively due to their flexible and robust behaviour.
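To make the dependency between these two parameters and the buffer stock more tangible, the following minimal sketch applies a common Kanban rule of thumb: the buffer has to cover the maximum consumption during the maximum replenishment lead time. The rule, the function name `required_buffer_stock`, the optional safety factor and all numbers are illustrative assumptions and are not taken from the KRASH scenario.

```python
def required_buffer_stock(max_consumption_rate, max_lead_time, safety_factor=0.0):
    """Parts that may be consumed while a replenishment is under way.

    max_consumption_rate: parts per hour (assumed unit)
    max_lead_time:        hours until the buffer is refilled (assumed unit)
    safety_factor:        optional relative margin, e.g. 0.1 for 10 %
    """
    if max_consumption_rate < 0 or max_lead_time < 0 or safety_factor < 0:
        raise ValueError("all arguments must be non-negative")
    return max_consumption_rate * max_lead_time * (1.0 + safety_factor)


# A disturbance that stretches the worst-case lead time from 2.0 h to 3.5 h
# raises the required buffer, which is why the spread of the lead time matters.
undisturbed = required_buffer_stock(12.0, 2.0)
disturbed = required_buffer_stock(12.0, 3.5)
print(undisturbed, disturbed)
```

Under this assumed rule, reducing the maximum lead time (for example by keeping its standard deviation small) lowers the buffer stock directly, which matches the motivation given above.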

Fig. 2. Platform components.

3. Benchmarking platform

As already mentioned above, the solutions generated by MAS are expected to be more flexible and robust than conventional centralised approaches. On the other hand, centralised OR algorithms should provide better results in non-disturbed production environments. In addition, MAS are mostly developed in the scope of academic projects, thus there are only few insights related to real-world scenarios and the related shop floor complexity.

One of the goals of the KRASH project is to prove or disprove the above assertion and compare the two approaches not only on a qualitative, but also on a quantitative basis. Thus, benchmarking scenarios have to be defined and implemented. Cavalieri et al. (1999) and Cavalieri et al. (2000) define a benchmarking framework and provide a common platform, so that the results of different planning approaches are comparable, taking into account the requirements for qualified benchmarks. The benchmarking platform developed in this project is used to perform the comparison task on the basis of simulation and thus provides the necessary dynamic behaviour. The simulation model maps a real production scenario of a circuit breaker assembly. The planning task is a mixed-model assembly line balancing problem (MALBP) and a mixed-model sequencing problem (MSP). They differ in the planning horizon: the first one is a long-term and the second one a short-term decision problem (Scholl, 1999).

The formal representation of the scenario requires the definition of a meta-model. CIMOSA (Open System Architecture for CIM) (CIMOSA Association, 1996) was chosen as the modelling methodology, since it is publicly available, not restricted to a certain tool, and well-documented. eMPlant was chosen as the event-driven simulation tool including a building block library. The tool VICTOR (VIrtual FaCTORy lab) (Reithofer, 1995 and Reithofer, 1996) was developed at the Institute for Process Control and Robotics. It merges both CIMOSA and eMPlant. The modelling process is performed using CIMOSA elements that are mapped onto the original eMPlant building blocks. In the next step, an executable eMPlant model is created automatically. The CIMOSA model itself is represented in a textual format.

Cavalieri et al. (2003) present design guidelines for suitable production benchmarks. In addition, a highly sophisticated web-based modelling framework was developed that enables the user to define scenarios and processes that are afterwards stored in a test case library. Both the static and the dynamic features of the scenario are modelled. In addition, measures of performance are defined. This approach is capable of providing a wide range of benchmarking scenarios representing various PPC problem classes. An interesting extension would be the integration of a simulation component that maps the non-deterministic dynamic behaviour of the scenarios.

The KRASH benchmarking platform (see Fig. 2) consists of

* a process model (CIMOSA Function and Information View),

* a performance measurement system and

* interfaces for the MAS (accessing the CIMOSA Function and Information View).

The process model is built up from a building block library. This enables the user to create his own scenarios by combining the building blocks in an adequate manner. The process structure of the building blocks is represented by the CIMOSA Function View. The information related to the scenario is mapped onto the CIMOSA Information View. The definition of the interfaces is based on a PPC database structure that is presented in Fig. 3. The technical specifications represented by database schemas are available at DB Specifications (2003).

Fig. 3. PPC database structure. [Diagram of the tables Order, Disturbance Profile, Production Order, Machine, Machine List, Operation, Operation List, Product, BoM, BoM Entry and Schedule.]

Fig. 4. System architecture.

The following elements of the scenario's information view are mapped within the tables of the database:

* Master Data (including the bill of materials and the operation list for each product).

* Operation (assignment of machines to operations).

* Order (list of the customer orders).

* Production Order (production plan generated by the PPC module).

* Disturbance Profile (disturbance profile for each machine).

* Schedule (schedule/waiting queue for each machine).
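The tables listed above could be mirrored in an agent implementation by plain record types. The following sketch is only an illustration: the table names follow the list, whereas every field name, type and unit is a hypothetical choice and does not reproduce the schemas published at DB Specifications (2003).

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List


@dataclass
class Product:                      # Master Data
    product_id: str
    bill_of_materials: Dict[str, int]   # part id -> required quantity
    operation_list: List[str]           # worksteps in assembly order


@dataclass
class Operation:                    # which machines can perform a workstep
    operation_id: str
    capable_machines: List[str]
    setup_minutes: float = 0.0


@dataclass
class Order:                        # customer order
    order_id: str
    product_id: str
    quantity: int
    start_date: datetime


@dataclass
class ProductionOrder:              # result of the planning process
    order_id: str
    machine_id: str
    quantity: int
    planned_start: datetime


@dataclass
class DisturbanceProfile:           # one profile per machine
    machine_id: str
    interval_minutes: float
    duration_minutes: float


@dataclass
class Schedule:                     # waiting queue of one machine
    machine_id: str
    waiting_queue: List[str] = field(default_factory=list)


schedule = Schedule(machine_id="M1")
schedule.waiting_queue.append("PO-0001")
print(schedule)
```

Keeping one record type per table makes it straightforward to exchange the in-memory objects for database rows later, which is the decoupling the interface definition aims at.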

The table Master Data contains product-specific data. The bill of materials lists all parts that are necessary to assemble the product, whereas the operation list represents the single worksteps of the product assembly.

Operation is a table that assigns operations to machines that are able to perform this particular workstep (including a potential set-up process).

The Order table lists all customer orders. Besides the product and the quantity, the due dates or starting dates of the orders are defined.

Based on the orders and the other parameters, the Production Order table is the result of the planning process and finally determines the production dates, facilities and quantities.

The Disturbance Profile table is machine-specific and rests upon disturbance histories gathered from a Machine Data Acquisition (MDA) or a Production Data Acquisition (PDA) system and rules of thumb.

The planning results are stored in the Schedule table. It contains the planning results of the PPC modules and the current status of the machines' waiting queues.

The CIMOSA Information View contains static information like master data, order data and production data, and dynamic statistical data gathered within the simulation. Based on this data, performance measures like average buffer stock, consumption rates, throughput times, transportation times of the AGV system and processing times are implemented (see Frey and Wörn, 2001).

Tailored interfaces enable other external MAS to access the benchmarking platform. In the first version of the benchmarking platform, the integration is performed using standardised interfaces of eMPlant. The MAS has to be directly integrated into the simulation environment by using the socket or C++ interface or by implementing the MAS in the eMPlant-specific object-oriented programming language Simtalk. This procedure is not very comfortable for the MAS developers, yet it makes it possible to use the discrete event mechanism of the simulation software.

The second approach is a more sophisticated one. Its system architecture is depicted in Fig. 4.

In this approach, the MAS is de-coupled from the benchmarking platform. Thus, it is not apparent to the MAS whether it is working in a real application scenario or in a simulation model, which guarantees the portability of results.

The current work is focussing on the synchronisation of states and events on the basis of database triggers. They are used to synchronise the simulation and the MAS. In addition, numerous MAS that are working in one integrated scenario can be synchronised, too. In this case, the agent systems are co-operatively dealing with different tasks.

Another major goal of the project is the management of disturbances, since MAS are a promising technology to handle them more effectively compared to existing approaches. Thus, a complex parameterisation of the scenario is possible. It affects two dimensions:

1. Complexity of the scenario.

2. Homogeneity of the problem space.

Both dimensions are expected to have an effect on the planning system. They can be varied by individually setting up the following parameters:

* lot size;

* number of operations;

* number of machines;

* workload;

* disruption profile;

* disruption interval and

* disruption duration.

Fig. 5. Positioning of the KRASH MAS. [Diagram positioning the KRASH MAS with respect to the functions Planning, Execution and Tracking and the levels intraplant, interplant and external.]

The lot size represents the number of pallets that are produced within one order. It affects the number of partial orders that are generated from one customer order. The number of machines varies the number of production facilities that are able to perform a particular operation, whereas the workload factor extends the period between the start dates of two consecutive customer orders by a factor. These two parameters provide additional production capacity. The disruption interval is the time gap between the occurrence of two consecutive disruptions. Along with the disruption duration, it makes up the disruption profile of a machine.

During the simulation studies, these parameters have been varied continuously and a map has been set up that marks scenarios that favour a MAS treatment.
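A parameter study of this kind can be organised as a full factorial sweep over discrete levels. The sketch below is restricted to five of the parameters for brevity; the class `ScenarioConfig`, the helper names and the concrete levels are illustrative assumptions and not the configuration actually used in the project.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ScenarioConfig:
    disruption_duration: int    # minutes
    num_operations: int
    lot_size: int               # pallets
    disruption_interval: int    # multiples of the processing time of one lot
    workload: float             # factor on the gap between consecutive orders


# Hypothetical levels, three per parameter.
LEVELS = {
    "disruption_duration": [5, 30, 60],
    "num_operations": [1, 2, 3],
    "lot_size": [1, 3, 5],
    "disruption_interval": [1, 2, 3],
    "workload": [0.5, 1.0, 1.5],
}


def all_configurations(levels):
    """Every combination of the given levels as a ScenarioConfig."""
    names = list(levels)
    return [ScenarioConfig(**dict(zip(names, values)))
            for values in product(*(levels[name] for name in names))]


def slice_for_map(configs, fixed):
    """Configurations that share the fixed values (one cell of a result map)."""
    return [c for c in configs
            if all(getattr(c, key) == value for key, value in fixed.items())]


configs = all_configurations(LEVELS)
cell = slice_for_map(configs, {"workload": 1.5, "lot_size": 1})
print(len(configs), len(cell))
```

With three levels for each of five parameters the sweep yields 243 configurations, and fixing two of them leaves 27 configurations per map cell.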

4. Multi-agent system

The developed multi-agent system was intended to solve a mixed-model sequencing problem (MSP). With respect to the throughput time, the corresponding centralised OR algorithm produces optimum solutions for the non-disturbed case, whereas its performance is limited in the case of disruptions.

The problem of disturbance handling and MAS-based scheduling has been addressed by the MASCADA project at the KU Leuven. Van Brussel et al. (1998) present a MAS based on the Product, Resource, Order and Staff Agent (PROSA) architecture. Jain and Foley (2002) analyse the effects of interruptions on flexible manufacturing systems and deduce guidelines for the development and implementation of schedules to cope with these uncertainties.

Within the KRASH project, a MAS was developed to handle a line balancing decision problem. The system operates on an intraplant level (see Fig. 5), whereas there are interfaces to external suppliers and customers (orders). It performs production planning and control tasks; however, the focus is set upon the execution functionality. The MAS was directly integrated into the simulation environment for performance and maintainability reasons. Additionally, an implementation in FIPA-OS and JADE is available. Three PPC algorithms have been implemented:

* Job-Shop (long-term planning horizon).

* MAS_pre (short-term planning horizon).

* MAS_act (no planning horizon).

They differ in the planning horizon. The planning horizon is the part of the waiting queue that is "visible" to the algorithm and thus may be used for optimisation purposes. "Job-Shop" is an MSP line balancing algorithm (Scholl, 1999) and performs the planning task at the beginning of the simulation. Consequently, it is not able to react to changes.

Having n machines, the order is assigned to the machine x that satisfies

$$CD_{x,\,m_x-1} = \min_{i}\left(CD_{i,\,m_i-1}\right), \quad i = 1 \ldots n, \qquad (1)$$

where $CD_{i,j}$ is the planned completion date of the order at position $j$ in the waiting queue of machine $i$, $n$ the number of machines that are able to process the current order, and $m_i$ the position of the current order within the waiting queue of machine $i$.

The completion dates are calculated using the processing times of an order. In the non-disturbed case, this approach represents a forward scheduling algorithm that leads to an optimum solution with respect to the throughput time.

MAS_pre assigns the orders to the production facility as soon as they are available (see Fig. 6). This leads to a short-term planning horizon. MAS_pre is able to react to changes within the planning horizon.

On the other hand, MAS_act is an exclusively reactive system. The machine agents ask for new orders as soon as they finish the current order (see Fig. 7). A performance comparison of both reactive and planning-based control architectures is also performed by Brennan (2000). Similar to the results presented in this paper, the role of the planning horizon with respect to the performance of the system is analysed.

The communication of both MAS approaches is performed using a protocol similar to the well-known ContractNet protocol (Parunak, 1987). The system consists of machine agents and order agents. Each machine agent represents a production facility and applies for an assembly order if it is available. The order agent represents a production order and is responsible for the material and information flow of the system. Due to the reactive behaviour of the MAS, the system is not able to reach globally optimum solutions. Archimede and Coudert (2001) address this challenge by presenting the Supervisor, Customers, Environment, Producers (SCEP) framework for a reactive MAS that improves the range of co-operation while sustaining the ability to react to disturbances.

Fig. 6. AUML diagram of the MAS_pre approach. [Messages between OrderAgent and MachineAgent: call-for-proposal, refuse, not-understood, propose, accept-proposal, reject-proposal; guard: new order.]

Fig. 7. AUML diagram of the MAS_act approach. [Messages between OrderAgent and MachineAgent: request, call-for-proposal, refuse, not-understood, propose, accept-proposal, reject-proposal; guards: new order, done last job.]

The rating process for the machine agents in this project is based on a performance measurement number (PMN, see Eq. (2)).

$$PMN = \frac{1}{\left(\sum_{i=1}^{n} RPT_i + MDT\right)/60}, \qquad (2)$$

where $n$ is the number of orders in the waiting queue of the machine agent, $RPT_i$ is the remaining processing time of order $i$, and $MDT$ is the average disruption time.

The remaining processing time of an order is calculated on the basis of the current schedule and the processing time for each product. The primary goal of the planning strategy was the minimisation of the throughput time and an effective line balancing of the various machines. Besides the throughput time itself, its standard deviation is of interest, since it determines the predictability of the system. This is an important feature for a production planner, since MAS are assumed to act unpredictably due to their distributed architecture.

The performance is decisively affected by this performance measurement number. Similar to the Job-Shop algorithm, the processing time of the remaining orders in the waiting queue is calculated. The rating results in the inverse processing time and is scaled to 1, i.e. $0 \le x \le 1$, where 1 is the highest rating. As a consequence, the order is assigned to the machine agent that will finish its queued orders first. If the machine agent is disrupted, then the disruption time has to be considered in addition to the processing time. Normally, the expected disruption time is not known in advance. To avoid this problem, each machine agent maintains a disturbance history gathered during the simulation. In reality, this information may be extracted from an MDA or a PDA system. The global effects of several local balancing objectives in flexible manufacturing systems on performance measures like throughput time, makespan, mean flow time and mean tardiness are analysed by Kumar and Shanker (2002).
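The two assignment rules of this section can be contrasted in a few lines. The sketch below is a simplified illustration and not the KRASH implementation: `job_shop_assign` picks the machine with the earliest planned completion date of its last queued order, and `MachineAgent.pmn` rates a machine by the inverse of its remaining processing time plus the average disruption time, divided by 60. The class and function names, the use of minutes, the rule that the disruption term only counts while a machine is disrupted, and the infinite rating of an idle machine are assumptions. The scaling of the rating to the range from 0 to 1 is not reproduced.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


def job_shop_assign(last_completion: Dict[str, float]) -> str:
    """Job-Shop rule: machine whose last queued order completes earliest."""
    return min(last_completion, key=last_completion.get)


@dataclass
class MachineAgent:
    name: str
    remaining_times: List[float] = field(default_factory=list)     # minutes per queued order
    disruption_history: List[float] = field(default_factory=list)  # observed downtimes, minutes
    disrupted: bool = False

    def mean_disruption_time(self) -> float:
        """Average disruption time from the agent's own history (0 if empty)."""
        history = self.disruption_history
        return sum(history) / len(history) if history else 0.0

    def pmn(self) -> float:
        """Rating: inverse of (remaining processing time + disruption time) / 60."""
        total = sum(self.remaining_times)
        if self.disrupted:                  # assumption: only counted while disrupted
            total += self.mean_disruption_time()
        if total <= 0.0:
            return float("inf")             # assumption: an idle machine always wins
        return 1.0 / (total / 60.0)


def award_order(processing_time: float,
                machines: List[MachineAgent]) -> Optional[MachineAgent]:
    """Order agent: collect the ratings and accept the best proposal."""
    if not machines:
        return None
    winner = max(machines, key=lambda machine: machine.pmn())
    winner.remaining_times.append(processing_time)
    return winner


m1 = MachineAgent("M1", remaining_times=[30.0, 60.0])
m2 = MachineAgent("M2", remaining_times=[40.0],
                  disruption_history=[80.0, 100.0], disrupted=True)

print(job_shop_assign({"M1": 90.0, "M2": 40.0}))   # prints M2
print(award_order(20.0, [m1, m2]).name)            # prints M1
```

In this made-up situation the planned completion dates favour M2, whereas the rating that includes the disruption history prefers M1, which is the kind of adaptive dispatching the MAS is meant to provide.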

5. Benchmarking results

5.1. Simulation studies

Two targets were set by the KRASH project:

1. To compare a decentralised MAS and a centralised OR approach on a quantitative basis.

2. To gather further knowledge on and acquire a deeper understanding of the behaviour of MAS.

Within the simulation studies, the complexity of the scenario and the homogeneity of the problem space were continuously varied by changing the corresponding production parameters. Discrete values for the disruption duration, the number of operations, the lot size, the disruption interval and the workload were chosen. It was investigated whether the behaviour of the MAS is continuous and predictable, depending on the environmental production constraints, or whether the MAS performs non-deterministically.

Table 1

Parameter units

Parameter               Unit
Disruption duration     Minutes
Number of operations    Number
Lot size                Pallets
Disruption interval     Processing time for one lot size
Workload                Factor that extends the interval between the dispatching of two consecutive orders


For each configuration, represented by a tuple of the five parameters listed in Table 1, the corresponding average throughput time and the standard deviation of the throughput time were analysed, which led to more than 1000 simulation runs. Within the evaluation process, the results were abstracted with respect to only two parameters to reduce the complexity of the results and increase their clarity. The final decision variable is the scaled difference between the number of scenarios in which the MAS performs better and the number of scenarios in which the Job-Shop algorithm performs better. The evaluation procedure may be described as follows:

1. Choose two of the parameters.

2. Fixing those two parameters, perform simulation runs with the other three varying.

3. For these simulation runs, calculate the number of scenarios with the MAS performing better, minus the number of scenarios with the Job-Shop algorithm performing better.

4. Scale the result, so that $-2 \le x \le 2$, $x$ being the final decision variable.
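The four steps above can be sketched as a small aggregation routine. How exactly the difference is scaled is not stated here, so the sketch assumes a linear scaling by the number of runs in a cell, multiplied by 2. The function name `decision_map`, the dictionary keys and the sample numbers are hypothetical, and a smaller metric value (throughput time or its standard deviation) is taken to be better.

```python
from collections import defaultdict


def decision_map(runs, axis_a, axis_b, scale=2.0):
    """Decision variable x per (axis_a, axis_b) cell.

    runs: list of dicts holding the parameter values of a run plus the
          metric of both algorithms under the keys "mas" and "job_shop".
    """
    mas_wins = defaultdict(int)
    job_shop_wins = defaultdict(int)
    totals = defaultdict(int)
    for run in runs:
        cell = (run[axis_a], run[axis_b])
        totals[cell] += 1
        if run["mas"] < run["job_shop"]:
            mas_wins[cell] += 1
        elif run["job_shop"] < run["mas"]:
            job_shop_wins[cell] += 1
    # Assumed scaling: difference of wins relative to the runs in the cell.
    return {cell: scale * (mas_wins[cell] - job_shop_wins[cell]) / totals[cell]
            for cell in totals}


sample_runs = [
    {"workload": 1.5, "lot_size": 1, "mas": 100.0, "job_shop": 120.0},
    {"workload": 1.5, "lot_size": 1, "mas": 90.0, "job_shop": 95.0},
    {"workload": 1.5, "lot_size": 1, "mas": 110.0, "job_shop": 100.0},
    {"workload": 0.5, "lot_size": 5, "mas": 100.0, "job_shop": 80.0},
]
x = decision_map(sample_runs, "workload", "lot_size")
print(x)
```

With this scaling, a cell in which the MAS wins every run gets x = 2 and a cell in which the Job-Shop algorithm wins every run gets x = -2.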

The benchmarking results are presented in the form of area diagrams, the two axes mapping the two parameters mentioned above. The diagrams are intuitive graphical representations of the decision variable $x$. In addition, the colour gradient enables the recognition of general tendencies concerning the behaviour of MAS. Within the area diagrams, the bright areas identify scenarios where a centralised Job-Shop algorithm produces better results concerning the average throughput time or its standard deviation ($x < 0$). The dark areas represent boundary conditions where the MAS is superior ($x > 0$).

5.2. Preliminary analysis

The aim of the preliminary analysis is the limitation of the evaluation space. The effects of the number of additional machines and the variation of the disruption profiles are investigated, so that further evaluations can be performed on the basis of these results while fixing these two parameters.

In the first step, the complexity of the planning task was increased by increasing the number of operations that are necessary to assemble a product, introducing additional production facilities and increasing the disruption intervals and durations for the individual machines. Similar machines that are able to handle the same operations were assigned the same disruption profiles. Surprisingly, the introduction of disturbances had almost no effect on the results. On the contrary, the Job-Shop algorithm even produced better results. Closer investigation of the results revealed the reason for this behaviour. The production plant ran at its capacity limit. The workload of the machines is almost 100% during day shift. Consequently, there is no vacant capacity for rescheduling activities. When handling disturbances, vacant production capacity is one prerequisite for the successful operational use of a reactive MAS. Otherwise, Job-Shop algorithms produce better results.

But even for this scenario, which is rather disadvantageous for a MAS, it is clearly identifiable that the standard deviation of the throughput time decreases as the complexity of the scenario increases. Small lot sizes increase the complexity of the planning process and offer more degrees of freedom. This leads to a more complex decision space. On the other hand, more production orders are generated, which leads to a highly dynamic situation on the shop floor when disruptions occur. The results for the standard deviation of the throughput time depending on the complexity of the system show that the MAS produces almost constant throughput times. The standard deviation is clearly smaller compared to the centralised approach. This is a general result of the project. Even if the average MAS results are worse than those of the Job-Shop algorithm, the standard deviation is normally smaller.

In the next step, additional production capacity was introduced.

Fig. 8 shows that no clear pattern is noticeable. The MAS performs well and gets slightly better when the interval between the dispatching of two consecutive orders is extended (i.e. the workload factor increases; see Table 1), which leads to additional vacant production capacity. Now the MAS has the opportunity to reschedule orders as soon as disturbances occur on one of the machines. But even now, the results got only slightly better compared to the centralised approach. Since redundant production facilities had the same disruption profile, rescheduling often had the effect that the order was disrupted again on the machine that was chosen by the rescheduling algorithm. The superior planning quality of the Job-Shop algorithm and the flexibility of the MAS nearly compensated each other.

In the third step, both the complexity of the scenario and the homogeneity of the problem space were varied.

Fig. 8. Average throughput time and standard deviation of the throughput time (additional production capacity). [Two area diagrams, "Throughput Time" and "Standard Deviation"; axes: Workload and Lot Size.]

Fig. 9. Average throughput time and standard deviation of the throughput time (homogeneity of the problem space). [Two area diagrams, "Throughput Time" and "Standard Deviation"; axes: Workload and Lot Size.]

Fig. 10. Average throughput time and standard deviation of the throughput time (disruption duration). [Two area diagrams, "Throughput Time" and "Standard Deviation"; axes: Workload and Disruption Duration.]


Now the MAS could fully benefit from its flexibility and its reactive behaviour. The redundant machines were given different disruption profiles representing different reliabilities. In reality, this corresponds to a mixed shop floor consisting of more than one generation of machines, as is quite common. The older machines are less reliable than the newer ones. In addition, the planning complexity was also increased.

For this class of scenarios (different disruption profiles), the application of MAS clearly makes an impact (see Fig. 9). The results get better with decreasing lot sizes and increasing workload factor, as expected beforehand and explained above.

Based on the results so far, the rest of the evaluation process is performed on the basis of a scenario class without additional machines (i.e. the original circuit breaker assembly production plant) and with different disruption profiles for the machines. This limits the evaluation space to a reasonable degree, and the general features of MAS are expected to show up more clearly.

5.3. Disruption duration

The room for improvement of a decentralised MAS approach is positively affected by the disruption duration. While the centralised Job-Shop approach is not able to react to machine failures, the MAS can dispatch the orders adaptively. However, vacant production capacity is a prerequisite.


Fig. 10 clearly points out that the results of the MAS improve steadily compared to the centralised OR algorithm as the disruption duration increases. This assertion is also valid for the standard deviation of the throughput time. Furthermore, it is apparent that the MAS gets slightly better with the workload decreasing. The factor 1.5 leads to vacant production capacity. The standard deviation remains nearly unaffected by the workload. It is defined by the dispatching mechanism, which is not changed by the workload factor.

5.4. Lot size

The lot size is an input variable for the production planning process. An order in the order list is split into $\lceil \text{order quantity} / \text{lot size} \rceil$ partial orders. The number of partial orders is thus inversely proportional to the lot size. Small lot sizes entail many orders being dispatched, which increases the complexity of the shop floor scenario.

Due to the large number of partial orders and the long planning horizon, the planned start and completion dates deviate more and more from the real ones when disturbances occur. The centralised Job-Shop approach is not able to correct this deviation (see Fig. 11). The original goal of a line balancing algorithm, to minimise the completion dates by effectively balancing the orders, is no longer achievable, because disturbances are not considered in the planning process. The less reliable machines are loaded disproportionately. The effect is even more obvious when looking at the standard deviation. The throughput time is calculated by summing the processing time and the waiting time of the corresponding order. The waiting times for the machines are almost constant when using a MAS, which is caused by the highly effective treatment of waiting queues. Consequently, the completion dates can be estimated more precisely.

Fig. 11. Average throughput time and standard deviation of the throughput time (lot size). Both panels are plotted over disruption duration (5, 30, 60 min) and lot size (1, 3, 5).
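As a minimal sketch of the lot-size split and the throughput-time measure described in this subsection, the following Python fragment may help. It assumes that the number of partial orders is rounded up (the brackets in the formula are read as a ceiling), that the last partial order carries the remainder, and that the population standard deviation is the intended spread measure. All function names are illustrative and are not taken from the simulation used in the benchmark.

```python
import statistics


def split_into_partial_orders(order_quantity: int, lot_size: int) -> list[int]:
    """Split one order into ceil(order_quantity / lot_size) partial orders."""
    if order_quantity <= 0 or lot_size <= 0:
        raise ValueError("order_quantity and lot_size must be positive")
    # Integer ceiling division (assumption: partial lots are rounded up).
    n_partial = -(-order_quantity // lot_size)
    # Every partial order carries a full lot except possibly the last one.
    sizes = [lot_size] * (n_partial - 1)
    sizes.append(order_quantity - lot_size * (n_partial - 1))
    return sizes


def throughput_time(processing_time: float, waiting_time: float) -> float:
    """Throughput time of one order: processing time plus waiting time."""
    return processing_time + waiting_time


def throughput_statistics(times: list[float]) -> tuple[float, float]:
    """Mean and population standard deviation of a list of throughput times."""
    return statistics.mean(times), statistics.pstdev(times)
```

Halving the lot size roughly doubles the number of partial orders that have to be dispatched, which is the growth in complexity the text refers to.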

5.5. Number of operations

The number of operations is, like the lot size, a parameter within the production planning process. One customer order is divided into several partial orders, whereby in this case, the partial orders have to keep a pre-defined assembly sequence.

The disposition into partial orders is handled differently for the lot size and the number of operations (see Fig. 12). In the first case, the partial orders are dispatched in parallel. This leads to long waiting queues that are handled more effectively by the MAS. In the second case, the partial orders are sequentially dispersed. This leads to short waiting queues, and the positive features of the MAS do not show up as obviously as before.
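The difference between parallel and sequential release can be illustrated with a toy calculation. The sketch below is not the benchmark simulation: it assumes a single machine, first-come-first-served processing and identical processing times, and all names are hypothetical. It only shows why releasing all partial orders at once builds up a queue, while releasing each one after its predecessor has finished does not.

```python
def waiting_times(release_times: list[float], processing_time: float) -> list[float]:
    """FIFO waiting time of each partial order on a single machine."""
    machine_free_at = 0.0
    waits = []
    for release in sorted(release_times):
        start = max(release, machine_free_at)  # wait if the machine is busy
        waits.append(start - release)
        machine_free_at = start + processing_time
    return waits


def parallel_release(n: int, start: float = 0.0) -> list[float]:
    """Lot-size case: all partial orders are released at the same time."""
    return [start] * n


def sequential_release(n: int, processing_time: float, start: float = 0.0) -> list[float]:
    """Operation case: each partial order is released when its predecessor ends."""
    return [start + i * processing_time for i in range(n)]
```

With three partial orders of two time units each, the parallel case yields growing waiting times, whereas the sequential case yields none under these assumptions.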

5.6. Disruption interval

Besides the disruption duration, the disruption interval is the second important parameter that makes up the disruption profile.

Fig. 12. Average throughput time and standard deviation of the throughput time (number of operations). Both panels are plotted over number of operations (1, 2, 3) and workload (0.5, 1, 1.5).


Fig. 13. Average throughput time and standard deviation of the throughput time (disruption interval). Both panels are plotted over disruption interval (1, 2, 3) and disruption duration (5, 30, 60 min).

Fig. 14. Average throughput time and standard deviation of the throughput time (workload). Both panels are plotted over workload (0.5, 1, 1.5) and lot size (1, 3, 5).


At first sight, the results in Fig. 13 are counterintuitive: the disruption interval seems to have almost no effect on the suitability of MAS. In fact, this is not true. The disruption interval is just not a dominant parameter. It affects the results in a qualitative, but not in a quantitative way. The average throughput times are in fact smaller for small disruption intervals. Unfortunately, this performance number is not directly displayed in Fig. 13, which represents a qualitative comparison of the two approaches (based on the throughput time). The above assertion is validated on a qualitative basis when looking at the original simulation results (before the representation as an area diagram).

5.7. Workload

The workload factor controls the time interval defined by the start dates of two consecutive orders. Factor 1 corresponds to the original data. By spreading out the start dates, additional production capacity is freed that is available to the MAS when dispatching the orders in the case of disruptions.

The results in Fig. 14 show that the dynamic dispatching of orders is handled slightly more efficiently with decreasing workload. Non-deterministic vacant production capacity becomes accessible to the MAS, whereas the centralised approach is not able to make use of it. The general advantage of MAS is caused by two factors. On the one hand, MAS can dispatch orders to vacant machines dynamically. On the other hand, they are able to handle waiting queues more efficiently than centralised approaches. Depending on the order density, these MAS features come into conflict with each other, which leads to the almost constant behaviour depicted in Fig. 14. This example once more shows that several parameters affect the behaviour of a MAS at the same time.
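One possible reading of the workload factor is sketched below. The text only states that the factor controls the interval between consecutive start dates and that factor 1 reproduces the original data; the multiplicative scaling used here is an assumption, as is the function name.

```python
def scaled_start_dates(start_dates: list[float], workload_factor: float) -> list[float]:
    """Stretch (factor > 1) or compress (factor < 1) the gaps between start dates.

    Assumption: the factor multiplies each interval between two consecutive
    orders, so factor 1 leaves the original start dates unchanged.
    """
    if workload_factor <= 0:
        raise ValueError("workload_factor must be positive")
    if not start_dates:
        return []
    ordered = sorted(start_dates)
    scaled = [ordered[0]]  # the first order keeps its start date
    for previous, current in zip(ordered, ordered[1:]):
        scaled.append(scaled[-1] + (current - previous) * workload_factor)
    return scaled
```

Under this reading, a factor of 1.5 spreads the same orders over a longer period, which is consistent with the vacant production capacity mentioned for that factor.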

5.8. Conclusion

The inhomogeneity of the problem space turned out to be the decisive factor for the suitability of a MAS in the scope of production planning and control. In this case, the quality of the solutions evolves dynamically and cannot be predetermined. The complexity of the planning task is the other important factor. In this project, the complexity was mapped onto production parameters, which are listed in Section 2. During the evaluation, it was possible to rank the parameters according to their impact on the results:

1. disruption duration;
2. number of operations;
3. lot size;
4. disruption interval;
5. workload.

Interestingly, the order was identical for all of the performance figures (throughput time, processing time and their deviations).

Two key features that explain the superiority of MAS in a turbulent production environment were identified. First of all, MAS have the ability to "follow" good results. Due to the short planning horizon, the machine agents are able to consider time-dependent planning variables for their ratings (see Section 3), which leads to more precise results. Secondly, the waiting queues of the lines are handled more efficiently when disturbances occur. The line balancing process is more effective and thus the average throughput times and their standard deviations are smaller. The second factor is important with respect to the predictability of the results. These two key features affect each other, which finally explains the behaviour of the MAS.

Fig. 15. Robust MAS architecture (Nagi et al., 2001). Diagram elements: planning agents and execution agents, each with a local plan; shared goal and shared plan; co-operation; co-ordination; action (DB transaction), perception and execution; robustness service; production database.

Fig. 16. Local open-nested transaction tree (Nagi et al., 2001). Legend: action node, control node, synchronisation node; sequential, alternative and parallel execution; co-ordination message to other execution agent.

6. Technical robustness

6.1. Robustness service

As we have shown in the preceding sections, MAS are, from a functional point of view, superior to centralised systems in certain complex environment settings. For the implementation of MAS in mission-critical industrial-scale applications, two additional non-functional aspects have to be considered: the technical robustness of the MAS itself and the usability of possible robustness services. Robustness services are common in database technologies. Consequently, it is logical to try to apply the mechanisms used in this area to MAS. Nevertheless, some challenges have to be faced.

In the first approach, the robust execution of agent plans was realised by representing them, together with their possible contingency behaviour, in a common structure, the so-called open-nested transaction trees (for a closer examination, see Nagi, 2001b). We entrust the robust execution of these trees to special components within our MAS, called Execution Agents. As plans are developed co-operatively by several agents, e.g. order agents and machine agents, the execution of the local parts of a negotiated distributed plan must be co-ordinated by the corresponding execution agents. Fig. 15 shows the resulting system architecture.

In the presented system architecture, open-nested transaction trees play a crucial role, as they are the generic means of passing agent plans from the layer of planning agents to the layer of execution agents. In these trees, every single agent action is encapsulated in a so-called ACID transaction (Weikum and Vossen, 2002), which accesses the underlying databases. ACID transactions satisfy the Atomicity, Consistency, Isolation and Durability conditions, which guarantee the absence of side effects in error-free operation and recoverability in case of technical disturbances. An example of a local open-nested transaction tree is depicted in Fig. 16.

Besides the definition of agent actions in so-called Control Nodes, several other properties are defined within the transaction tree, such as the contingency behaviour that is executed in case of disturbances, control flow rules, co-ordination rules and various other control parameters.

Fig. 17. Throughput of the robustness service (Nagi, 2001a). Axes: group size (0 to 25) against transactions / (sec. * agent) (0 to 0.006); legend FC: 0, 0.5, 1, 2, 3, 6.

Fig. 18. System architecture of conversation-based robustness service. Diagram elements: MAS Platform I with Agent A and MAS Platform II with Agent B; TA-Tasks embedding User Task A1 and User Task B1; standard FIPA conversations; transaction manager; database.

This structure enables the developer to map the original negotiated agent plans onto transaction trees, which allow the execution agents to maintain the global correctness criterion, even in the case of disturbances:

* the transaction tree is executed completely, or
* the transaction tree appears not to have been executed at all.
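The correctness criterion above can be made concrete with a small sketch. The following Python fragment is a simplified, in-memory illustration and not the implementation described in Nagi (2001b): it assumes that every action comes with a compensating action, supports only sequential and alternative execution (parallel execution, synchronisation nodes and co-ordination messages are left out), and uses hypothetical class and function names throughout.

```python
from dataclasses import dataclass
from typing import Callable


class ActionFailed(Exception):
    """Raised by an action that could not be carried out."""


@dataclass
class ActionNode:
    name: str
    do: Callable[[], None]          # the agent action (assumed to commit on its own)
    compensate: Callable[[], None]  # contingency behaviour that undoes the action


@dataclass
class ControlNode:
    mode: str       # "sequential" or "alternative"
    children: list  # ActionNode or ControlNode instances


def undo(log: list, mark: int) -> None:
    """Compensate committed actions in reverse order back to position mark."""
    while len(log) > mark:
        log.pop().compensate()


def execute(node, log: list) -> bool:
    """Execute a subtree; on failure the subtree leaves no visible effects."""
    if isinstance(node, ActionNode):
        try:
            node.do()
        except ActionFailed:
            return False
        log.append(node)
        return True
    if node.mode == "sequential":
        mark = len(log)
        for child in node.children:
            if not execute(child, log):
                undo(log, mark)  # roll back the siblings that already ran
                return False
        return True
    if node.mode == "alternative":
        # Try the children in order until one of them succeeds.
        return any(execute(child, log) for child in node.children)
    raise ValueError(f"unknown control mode: {node.mode}")


def run_tree(root) -> bool:
    """Return True if the whole tree was executed, False if it was undone."""
    return execute(root, [])
```

Because each action is committed individually and undone by compensation rather than by a global rollback, intermediate results are visible to other agents while the tree runs. This is the trade-off that distinguishes open-nested trees from a single flat ACID transaction.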

The usability of the approach is provided in three ways. First of all, the generality of the mechanism is described in (Nagi et al., 2001) by giving exemplary mappings for different plan structures, like

* task delegation,
* task co-ordination and
* termination co-ordination.

Secondly, the conformance to standards is guaranteed, as the transaction service is implemented on a FIPA-compliant (Foundation for Intelligent Physical Agents) basis (Nimis, 2001). For an industrial application, standardisation is a decisive argument for the introduction of new technologies.

Thirdly, as shown in the next subsection, the architecture is highly scalable.

6.2. Scalability

The scalability of the approach was demonstrated for local plans (Nagi, 2000) and distributed plans (Nagi, 2001a). In these experiments, the different parameters of the transaction model, like the average number of child nodes in a transaction tree, were varied to reflect different agent planning strategies and their mapping onto transaction trees. Under these variations, different performance indices of the MAS robustness service, like the throughput, the response time or the ratio of compensated actions to terminated ones, were measured. The results of the studies showed that the service scales in the expected, predictable manner.

To give an example of the performed measurements, Fig. 17 shows the throughput of the robustness service for different plan conflict ratios (FC) and group sizes of agent groups sharing a common plan. Thus, it gives an indication for choosing the optimal group size. This must be considered when creating new scheduling algorithms by delegating partial tasks to agent groups.
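For orientation, two of the performance indices named above can be written down as simple ratios. The sketch below assumes that throughput is normalised as in the axis label of Fig. 17, i.e. transactions per second and agent; the function and parameter names are illustrative and the input numbers in any call are made up, not measured values.

```python
def throughput_per_agent(transactions: int, duration_seconds: float, n_agents: int) -> float:
    """Transactions per second and agent, as on the vertical axis of Fig. 17."""
    if duration_seconds <= 0 or n_agents <= 0:
        raise ValueError("duration_seconds and n_agents must be positive")
    return transactions / (duration_seconds * n_agents)


def compensation_ratio(compensated_actions: int, terminated_actions: int) -> float:
    """Ratio of compensated actions to terminated ones."""
    if terminated_actions <= 0:
        raise ValueError("terminated_actions must be positive")
    return compensated_actions / terminated_actions
```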

6.3. Conclusions

The presented approach has a notable disadvantage regarding usability. Agents have to represent their plans and potential compensation activities in the format of open-nested transactions, i.e. the robustness of the MAS is achieved by demanding an additional implementation effort from the developer, especially if distributed agent plans are considered. This motivates the request for a more "natural" and "transparent" integration of the robustness service into the MAS implementation process.

As a consequence, in our current research, a different approach is chosen, based on distributed ACID transactions, which are identified with conversations between negotiating planning agents. The related advantage is a natural and seamless integration of the mechanism into standardised MAS platforms like FIPA-OS. The system architecture is depicted in Fig. 18.

In the new approach, the original User Tasks, which handle the planning conversations in each involved agent, are embedded into transactional tasks (TA Tasks). The TA Tasks automatically handle all necessary additional duties caused by the use of the robustness service for their embedded User Tasks. As a result, the realised approach only interferes with the communication layer and does not affect processes within the agents themselves.

On the other hand, distributed ACID transactions do not make use of the complete semantics of the agent conversation, i.e. they are probably too restrictive. This may lead to a decrease in performance or to cascading recoveries. These considerations are the focus of our current research activities.
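The idea of embedding a User Task into a TA Task can be sketched with a local database transaction. The fragment below is only an analogy under strong simplifications: it uses a single SQLite database in one process, whereas the approach described above relies on distributed ACID transactions spanning several agents on a FIPA-OS platform. The function `run_as_ta_task` and the `schedule` table used in the usage note are hypothetical.

```python
import sqlite3
from typing import Callable

# A user task receives a connection and performs its planning-related updates.
UserTask = Callable[[sqlite3.Connection], None]


def run_as_ta_task(conn: sqlite3.Connection, user_task: UserTask) -> bool:
    """Run user_task inside one database transaction.

    The wrapper, not the user task, takes care of commit and rollback, so
    the user task itself stays free of robustness code.
    """
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if an exception escapes from it.
        with conn:
            user_task(conn)
        return True
    except Exception:
        return False
```

A user task that books two machine slots and fails on the second one leaves no booking behind, because the wrapper rolls back the first insert as well. The price of this simplicity is the restrictiveness discussed above: the whole conversation either commits or is undone.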


The final goal is a flexible, conversation-based transaction model that also takes the internal agent states into account. A combined mechanism, consisting of the first and second approaches described above and including knowledge about the semantics of the messages inside a conversation, may be a promising new approach.

7. Summary

To summarise this paper, MAS-based decentralised planning approaches produce better planning results than centralised Job-Shop algorithms when the decision space for a MSP problem is both complex and inhomogeneous. MAS have the ability to "follow" good results in the decision space by taking into account time-dependent planning parameters. In addition, the waiting queues of the lines are handled more efficiently when disturbances occur. On the other hand, MAS normally provide less optimal solutions due to their local approach to problem solving through negotiation. OR algorithms are highly sophisticated and effective in the undisturbed case due to their much longer planning horizon and their global point of view. Yet the corresponding re-scheduling algorithms that are applied in the case of disruptions do not perform well enough: the planning process for real industrial applications takes too much time to comply with real-time demands.

Another aspect that has not been mentioned yet is the maintainability of the different manufacturing systems. Due to their modular design, MAS are easier to maintain, and extensibility is guaranteed by a simple plug-and-play mechanism. On the other hand, the development effort to implement a MAS is higher, mainly because of communication issues. MAS and rescheduling algorithms provide a means to guarantee robustness on the shop floor. On the other hand, technical robustness has to be handled, too, due to the increased complexity of MAS compared to conventional centralised approaches. Transaction mechanisms known from database technology offer interesting perspectives to the MAS user. Nevertheless, additional migration work is necessary to merge these two technologies. A service-oriented architecture, integrating transaction-based recovery mechanisms into FIPA-compliant MAS platforms, is a promising and, even more importantly, user-friendly new approach.

Standardisation is a crucial task for the industrial propagation of agent-based technologies. This affects both the technological platform layer and the application layer. Standardised MAS solutions for well-defined PPC problems, including realistic and comprehensible benchmarks, are key factors for successful dissemination.

Acknowledgements

The KRASH project is funded by the DFG (Deutsche Forschungsgemeinschaft - German Research Foundation). The project is part of the priority research program 1083 "Intelligente Softwareagenten und betriebswirtschaftliche Anwendungsszenarien" ("Intelligent Software Agents and Realistic Application Scenarios"). The results of the projects involved are publicly available at RealAgentS (2003).

Technical Resources

CIMOSA Association (1996) CIMOSA-Open System Architecture for CIM, Technical Baseline; Version 3.2. Private Publication.

DB Specifications (2003) "DB Specifications on RealAgentS", [Internet], SPP 1083. Available from: http://www.realagents.org/public.php?ID=901&div1=879&action=open [Accessed 28.1.03]

RealAgentS (2003) "RealAgentS", [Internet], SPP 1083. Available from: http://www.realagents.org [Accessed 28.1.03]

References

Archimede, B., Coudert, T., 2001. Reactive scheduling using a multi-agent model: the SCEP framework. Engineering Applications of Artificial Intelligence 14 (5), 667–683.

Brennan, R.W., 2000. Performance comparison and analysis of reactive and planning-based control architectures for manufacturing. Robotics and Computer-Integrated Manufacturing 16 (2–3), 191–200.

Cavalieri, S., 2000. Improving Performance of a Flexible Manufacturing System by Petri Net based Modelling and Simulation. In: Proceedings of the 26th Annual Conference of the IEEE IECON 2000, Nagoya, pp. 1298–1305.

Cavalieri, S., Bongaerts, L., Macchi, M., Taisch, M., Wyns, J., 1999. A benchmark framework for manufacturing control. In: Proceedings of the Second International Workshop on Intelligent Manufacturing Systems, Leuven, pp. 225–236.

Cavalieri, S., Garetti, M., Macchi, M., Taisch, M., 2000. An experimental benchmarking of two multi-agent architectures for production scheduling and control. Computers in Industry 43 (2), 139–152.

Cavalieri, S., Macchi, M., Valckenaers, P., 2003. Benchmarking the performance of manufacturing control systems: design principles for a web-based simulated testbed. Journal of Intelligent Manufacturing 14 (1), 43–58.

Frey, D., Wörn, H., 2001. Development of a complex production scenario for the application of MAS in the scope of production planning and control. In: Proceedings of the Third DFG colloquium of the German Priority Research Program on Intelligent Agents and Realistic Commercial Application Scenarios, Hameln (in German).

Jain, S., Foley, W.J., 2002. Impact of interruptions on schedule execution in flexible manufacturing systems. International Journal of Flexible Manufacturing Systems 14 (4), 319–344.

Kumar, N., Shanker, K., 2002. Comparing the effectiveness of workload balancing objectives in FMS loading. International Journal of Production Research 39 (5), 843–871.


Nagi, K., 2000. Scalability of a transactional infrastructure for multiagent systems. In: Proceedings of the First workshop on Infrastructure for Scalable Multiagent Systems at Autonomous Agents 2000 (Agents2000), Barcelona.

Nagi, K., 2001a. Modeling and simulation of cooperative multiagents in transactional database environments. In: Proceedings of the Second workshop on Infrastructures for Scalable Multiagent Systems at Autonomous Agents 2001 (Agents2001), Montreal.

Nagi, K., 2001b. Transactional agents - towards a robust multi-agent system. Lecture Notes in Computer Science, Vol. 2249. Ph.D. Thesis, Heidelberg, Springer.

Nagi, K., Nimis, J., Lockemann, P., 2001. Transactional support for cooperation in multiagent-based information systems. In: Proceedings of the Joint Conference on Distributed Information Systems on the basis of Objects, Components and Agents (VertIS), Bamberg.

Nimis, J., 2001. Integration of a Transaction-based Robustness Service in the FIPA-OS Platform. In: Proceedings of the Third DFG colloquium of the German Priority Research Program on Intelligent Agents and Realistic Commercial Application Scenarios, Hameln (in German).

Ohno, T., 1993. Toyota Production System. Productivity Press Incorporated, Portland, OH.

Parunak, H., 1987. Manufacturing experience with the contract net. In: Huhns, M.N. (Ed.), Distributed artificial intelligence. Morgan Kaufmann Publishers, Los Altos.

Reithofer, W., 1995. VICTOR - designing the factory of the future. In: Proceedings of the 11th International Conference on CAD/CAM, Robotics and Factories of the Future (CARS&FOF), Pereira.

Reithofer, W., 1996. Bottom-up Modelling with CIMOSA. In: Proceedings of the Second International Conference on the Design of Information Infrastructure Systems for Manufacturing (DIISM), Kaatsheuvel.

Scholl, A., 1999. Balancing and Sequencing of Assembly Lines. Physica-Verlag, Heidelberg.

Spieck, S., Weigelt, M., Falk, J., Mertens, P., 1995. Decentralized problem solving in logistics and production with partly intelligent agents and comparison with alternative approaches. In: Nunamaker Jr., J.F., Sprague Jr., R.H. (Eds.), Proceedings of the 28th Hawaii International Conference on System Sciences, Maui, p. 52.

Van Brussel, H., Wyns, J., Valckenaers, P., Bongaerts, L., Peeters, P., 1998. Reference architecture for holonic manufacturing systems: PROSA. Computers in Industry 37 (3 special issue on IMS), 255–274.

Weigelt, M., Mertens, P., 1999. Comparison of decentralised and centralised computer based production control. In: Baskin, A.B., Kovacs, G.L., Jacucci, G. (Eds.), Cooperative Knowledge Processing for Engineering Design IFIP, Vol. 137. Kluwer, Dordrecht, pp. 83–92.

Weikum, G., Vossen, G., 2002. Transactional Information Systems: Theory, Algorithms, and the Practice of Concurrency Control and Recovery. Morgan Kaufmann Publishers, Los Altos, CA.
