14
A Contribution Towards the Discovery of Causal Relationships in Agent-Based Models of Human Behaviour Marcel Kvassay * Institute of Informatics Slovak Academy of Sciences Dúbravska cesta 9, 845 07 Bratislava, Slovakia [email protected] Abstract This dissertation investigates causal relationships leading to emergence in agent-based models of human behaviour. A new method based on nonlinear structural causality is formulated and applied to a model of human behaviour from project EUSAS for European Defence Agency EDA. The method is based on the concept of a causal partition of a model variable which quantifies the contribution of various factors to its numerical value. Quantifying and separating their effects makes it possible to judge their relative importance in crucial early periods during which the emergent behaviour begins to form. Moreover, causal partitions can be used as the predictors of emergence: the time evolution of the predictive accuracy of their compo- nents hints at the deeper causes of emergence and at the possibilities to bring it under control. Categories and Subject Descriptors I.2.11 [Artificial Intelligence]: Distributed Artificial Intelligence—Intelligent agents, Multiagent systems ; I.6.6 [Simulation and Modeling]: Simulation Output Anal- ysis Keywords human behaviour modelling, agent-based models, multia- gent simulation, emergence, causal analysis, data mining and machine learning 1. Introduction The recent European Agent Technology Roadmap [15] views agent-based computing as “one of the most vibrant and important areas of research and development” in IT. * Recommended by thesis supervisor: Assoc. Prof. Ladislav Hluch´ y To be defended at Faculty of Informatics and Informa- tion Technologies, Slovak University of Technology in Bratislava on July 7, 2017. c Copyright 2017. All rights reserved. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy other- wise, to republish, to post on servers, to redistribute to lists, or to use any component of this work in other works requires prior specific per- mission and/or a fee. Permissions may be requested from STU Press, Vazovova 5, 811 07 Bratislava, Slovakia. The Roadmap considers agent technologies from three perspectives: (a) agents as design metaphor; (b) agents as a source of technologies; and (c) agents as simulation. This dissertation is concerned mainly with the third per- spective in which agents figure as natural models for many complex real-world phenomena. In particular, their abil- ity to generate surprising emergent behaviours from sim- ple rules governing individual agents has made them an attractive modelling tool for social sciences. Yet the un- derlying causes and the actual processes by which the emergent behaviours arise have often remained a mys- tery. The Roadmap states that“understanding the mech- anisms that can be used to model, assess and engineer self-organisation and emergence in multi-agent systems is an issue of major interest.” In a similar vein, a UK gov- ernmental report [3] declares that the “difficulty in form- ing rigorous causal characterisations of the aggregate be- haviour of a complex system (rather than the absence of regularity or predictability in this aggregate behaviour) . . . is the more legitimate barrier to adopting complex- systems approaches in an ICT engineering context.” This dissertation is an attempt to contribute towards the dis- covery and exploration of causal relationships leading to emergence in agent-based models of human behaviour. This goal has been realised in three steps. First, a generic agent architecture has been designed for easy incorpora- tion and exploration of various models of human nature and behaviour. Second, a new analytical method was pro- posed on the basis of the theory of nonlinear structural causality, which is a nonlinear and nonparametric form of Structural Equation Modelling – SEM. Third, the new method was demonstrated through an exemplary applica- tion to the simulation data from experiments conducted in the proposed agent architecture. As a result, the dissertation spans four large fields: (a) human behaviour modelling; (b) agent-based simulations; (c) nonlinear structural causality; and (d) data mining and machine learning. The first has provided the in- vestigated model, the second various technologies for its software implementation and simulation, the third gen- eral analytical principles, and the fourth the means for applying those analytical principles to large amounts of computer-generated simulation data. Each of these four fields is large enough to merit a dedicated dissertation of its own. In order to keep the dissertation manageable, it was therefore necessary to restrict the attention only to

A Contribution Towards the Discovery of Causal ...acmbulletin.fiit.stuba.sk › abstracts › kvassay2017.pdf · second phase, the short listed candidates were evaluated in depth

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

  • A Contribution Towards the Discovery of CausalRelationships in Agent-Based Models of Human

    Behaviour

    Marcel Kvassay∗

    Institute of InformaticsSlovak Academy of Sciences

    Dúbravska cesta 9, 845 07 Bratislava, [email protected]

    AbstractThis dissertation investigates causal relationships leadingto emergence in agent-based models of human behaviour.A new method based on nonlinear structural causality isformulated and applied to a model of human behaviourfrom project EUSAS for European Defence Agency EDA.The method is based on the concept of a causal partitionof a model variable which quantifies the contribution ofvarious factors to its numerical value. Quantifying andseparating their effects makes it possible to judge theirrelative importance in crucial early periods during whichthe emergent behaviour begins to form. Moreover, causalpartitions can be used as the predictors of emergence: thetime evolution of the predictive accuracy of their compo-nents hints at the deeper causes of emergence and at thepossibilities to bring it under control.

    Categories and Subject DescriptorsI.2.11 [Artificial Intelligence]: Distributed ArtificialIntelligence—Intelligent agents, Multiagent systems; I.6.6[Simulation and Modeling]: Simulation Output Anal-ysis

    Keywordshuman behaviour modelling, agent-based models, multia-gent simulation, emergence, causal analysis, data miningand machine learning

    1. IntroductionThe recent European Agent Technology Roadmap [15]views agent-based computing as “one of the most vibrantand important areas of research and development” in IT.

    ∗Recommended by thesis supervisor: Assoc. Prof.Ladislav HluchýTo be defended at Faculty of Informatics and Informa-tion Technologies, Slovak University of Technology inBratislava on July 7, 2017.

    c© Copyright 2017. All rights reserved. Permission to make digitalor hard copies of part or all of this work for personal or classroom useis granted without fee provided that copies are not made or distributedfor profit or commercial advantage and that copies show this notice onthe first page or initial screen of a display along with the full citation.Copyrights for components of this work owned by others than ACMmust be honored. Abstracting with credit is permitted. To copy other-wise, to republish, to post on servers, to redistribute to lists, or to useany component of this work in other works requires prior specific per-mission and/or a fee. Permissions may be requested from STU Press,Vazovova 5, 811 07 Bratislava, Slovakia.

    The Roadmap considers agent technologies from threeperspectives: (a) agents as design metaphor; (b) agentsas a source of technologies; and (c) agents as simulation.

    This dissertation is concerned mainly with the third per-spective in which agents figure as natural models for manycomplex real-world phenomena. In particular, their abil-ity to generate surprising emergent behaviours from sim-ple rules governing individual agents has made them anattractive modelling tool for social sciences. Yet the un-derlying causes and the actual processes by which theemergent behaviours arise have often remained a mys-tery. The Roadmap states that “understanding the mech-anisms that can be used to model, assess and engineerself-organisation and emergence in multi-agent systems isan issue of major interest.” In a similar vein, a UK gov-ernmental report [3] declares that the “difficulty in form-ing rigorous causal characterisations of the aggregate be-haviour of a complex system (rather than the absence ofregularity or predictability in this aggregate behaviour). . . is the more legitimate barrier to adopting complex-systems approaches in an ICT engineering context.” Thisdissertation is an attempt to contribute towards the dis-covery and exploration of causal relationships leading toemergence in agent-based models of human behaviour.

    This goal has been realised in three steps. First, a genericagent architecture has been designed for easy incorpora-tion and exploration of various models of human natureand behaviour. Second, a new analytical method was pro-posed on the basis of the theory of nonlinear structuralcausality, which is a nonlinear and nonparametric formof Structural Equation Modelling – SEM. Third, the newmethod was demonstrated through an exemplary applica-tion to the simulation data from experiments conductedin the proposed agent architecture.

    As a result, the dissertation spans four large fields: (a)human behaviour modelling; (b) agent-based simulations;(c) nonlinear structural causality; and (d) data miningand machine learning. The first has provided the in-vestigated model, the second various technologies for itssoftware implementation and simulation, the third gen-eral analytical principles, and the fourth the means forapplying those analytical principles to large amounts ofcomputer-generated simulation data. Each of these fourfields is large enough to merit a dedicated dissertation ofits own. In order to keep the dissertation manageable, itwas therefore necessary to restrict the attention only to

  • 2 Kvassay, M.: A Contribution Towards the Discovery of Causal Relationships in Agent-Based Models of ...

    the most relevant aspects of each field. The focus was pro-vided by the needs and requirements of project EUSAS(“European Urban Simulation for Asymmetric Scenarios”)for European Defence Agency EDA, in the context ofwhich a major part of this dissertation was realised. Theproject dealt with asymmetric security threats in whichsecurity forces face rioting crowds, insurgents or terroristsrather than regular military forces [9]. Its main goal wasto build a virtual training environment for security forcesin which real people (security forces) would interact withsimulated characters (civilians) in a highly realistic 3-Dcyber environment in real time.

    The rest of this article is organised as follows. Section 2briefly reviews the four areas relevant for the dissertation,and section 3 lists the main thesis objectives. Section 4describes the investigated model from project EUSAS andthe simulation scenario that served as the motivating use-case of the dissertation. It also briefly introduces the de-sign of a generic agent architecture for human behaviourmodels as a contribution toward the fulfilment of the firstthesis objective. Section 5 then introduces the concept ofa causal partition, section 6 provides a generalized ver-sion of causal partitioning, and section 7 summarises theresults achieved by the application of causal partitioningto the investigated model. Finally, section 8 characterisesthe main scientific contribution of this dissertation andoutlines ideas for further research.

    2. State of the Art2.1 Human Behaviour ModellingIn the relevant subfield of human behaviour modelling forsecurity and military applications, the focus is often onmodelling the military combatants and higher-level mil-itary units [22]. Project EUSAS on the other hand fo-cused on modelling the non-military participants – civil-ians in their various roles as rioters, insurgents or casualbystanders – and the emergence of collective aggression onthe civilian side. Collective aggression means the instru-mental use of violence by people who identify themselvesas members of a group against another group or set of in-dividuals, in order to achieve political, economic or socialobjectives [28]. Metzger [16] notes that collective violenceshould not be equated with the aggregation of individualaggressive behaviours. Individual aggression mostly tar-gets an individual person, whereas collective aggressiontargets another collective. According to Nolting [17], in-dividual violence is self-motivated, i.e. the individual de-cides and acts on his own. By contrast, collective violenceis mostly caused by external motivations (e.g. obeyingcommands or following role models) and compunction isoften reduced because of anonymity, distribution of re-sponsibility and group ideology. Moreover, “the activitiesof most functioning groups are not determined by the mo-tivations of each member in equal proportion to the othersin his group” [2]. This implies that individual and collec-tive violence need to be distinguished. Based on theseobservations, a socio-psychological model for the emer-gence of collective aggression was developed for projectEUSAS by a team from Cassidian (now Airbus Defence& Space)1. This model underlies the motivating use caseof this dissertation and is presented in detail in section 4.

    2.2 Agent-Based Simulation Platforms

    1 http://airbusdefenceandspace.com/

    The choice of a suitable agent-based simulation platformfor project EUSAS was a two-step process. The first phaseconsisted of perusing existing surveys, such as [5, 1, 24],and short listing the most promising candidates. In thesecond phase, the short listed candidates were evaluatedin depth through the implementation of a simplified ex-emplary scenario. Simulation platforms mentioned in thesurveys included NetLogo, MASON, Repast, Swarm andJava Swarm. Out of these, NetLogo and MASON wereshortlisted for an in-depth evaluation by implementation.Based on the exemplary implementation, MASON2 – aJava-based open-source simulation library – was selectedas the most suitable agent-based platform for project EU-SAS.

    Since MASON was not geared specifically towards mod-elling human behaviour, the PECS reference model [30,25] was chosen for the purpose. According to [25], “PECSis a multi-purpose reference model for the simulation ofhuman behaviour in a social environment.” The “PECS”acronym stands for Physical conditions, Emotional state,C ognitive capabilities and Social status – the four kindsof internal factors that need to be modelled in order toachieve realistic agent behaviour. At the same time, thePECS model only provides these four empty slots with-out specifying the factors to be modelled or the level ofmodelling detail: these decisions are left to the modelleras they depend on the task at hand. The PECS simplyrequires that these factors be modelled as state variableswith associated state transition functions conforming togeneral systems theory.

    The“canonical”internal structure of PECS agents is shownin Figure 1. Black arrows represent causal dependencies,red arrows the actual flow of information. Following gen-eral systems theory, PECS agents are structured into in-put, internal state and output. Sensor and Perceptioncomponents receive and pre-process the information fromthe environment, so they represent input. The middlefour components (Social Status, Cognition, Emotion andPhysis) together represent the internal state of the agent.Each of them has its own internal state variables (Z) alongwith their state transition functions (F). Cognition com-ponent additionally includes goals (G), action plans (P)and thinking functions. Behaviour and Actor componentsrepresent output: the first determines the actions to beexecuted, the second executes them.

    Figure 1: Structure of a PECS Agent (reproducedfrom [25])

    2http://www.cs.gmu.edu/ eclab/projects/mason/

  • Information Sciences and Technologies Bulletin of the ACM Slovakia 3

    The PECS also requires that some state variables produceor act as motives, i.e. forces that drive agents to action.A good example is the level of physical energy (a statevariable in the “physical conditions” slot) and hunger –a motive driving us to search for food and replenish theenergy. In the PECS model the motives compete for con-trol over the behaviour of the agent: the strongest winsand becomes action-guiding. The motives thus need to bemutually comparable, which is typically achieved by nor-malizing and restricting their values to the closed interval[0, 1].

    Agent behaviours are conceptualized in PECS as sequencesof atomic, uninterruptible elementary actions, e.g. onestep in a certain direction or one stone-throw. When anew motive becomes action-guiding, it only takes effectafter the current elementary action is completed: the cur-rent behaviour pattern is then cancelled and a new oneactivated.

    2.3 Causal AnalysisThis dissertation deals mainly with practical aspects ofcausality, such as the relationship between causality andmanipulability [32] or between causality and scientific ex-planation [33]. The focus is on computational approachesto causality surveyed for example in [26]. Although thereexist several competing accounts of causation, a promi-nent place among them belongs to the comprehensive non-linear structural approach formulated by Judea Pearl andothers [20, 6, 7], which subsumes and unifies the proba-bilistic, manipulative, counterfactual, and other special-ized approaches. Its comprehensiveness makes it partic-ularly suitable for the purposes of this dissertation. Thissection broadly follows Pearl’s account from [20, 18, 19].

    The structural approach to causality is usually termedstructural equation modelling (SEM). It first appeared in1921 in the work of American geneticist Sewall Wright andwas further developed by Trygve Haavelmo and TjallingKoopmans in the 1940s and 1950s. These early structuralmodels were linear.

    The extension of structural approach from linear to non-linear systems was primarily due to Judea Pearl and hiscollaborators. In his seminal book on structural causal-ity [20], Pearl considers a very general system describedby the structural equations of the form

    xk = fk(pak, uk), k = 1 . . . n, (1)

    where pak stands for the set of the parent variables of xk,uk for the errors (or disturbances) due to omitted factors,and fk for an unspecified nonlinear function encoding thereal or hypothesised causal mechanism determining thevalue of xk. Pearl and his collaborators used the lan-guage and formalism of directed acyclic graphs (DAGs)to establish a number of strong results for such systemswithout any need to restrict the form of fk. As a result,this approach is not simply nonlinear, but also nonpara-metric in the sense that we do not need to estimate anyhypothesised internal coefficients of fk.

    As in the linear case, the causal mechanisms encoded infk (and, in consequence, the equations as well) shouldbe autonomous so that any of them could be changed

    by external intervention without affecting the remainingones [21]. A set of such equations is called a “structuralmodel.” If, in addition, each variable (apart from theerror terms uk) has a distinct equation in which it appearson the left-hand side, then the model is called a “causalmodel.”

    In order to illustrate the concept of a structural equation,let us consider two simple electrical circuits shown in Fig-ure 2. Both include a variable resistor connected to anideal source of (a) voltage or (b) current.

    Figure 2: Two simple electrical circuits

    The relationship between the current I passing throughthe resistor and the voltage U on its terminals dependson its resistance R and conforms to the Ohm’s law, whichcan be expressed in many nearly equivalent ways:

    I ·R = U ; U/I = R; I = U/R; R = U/I.

    From the algebraic point of view all these formulations arepermissible and any of them could be said to apply to bothcircuits. Structural causal theory, by adding extra rulesgoverning the form of equations, manages to encode inthem additional information about the flow of causality.Essentially, it treats the equality sign as an assignmentoperator in programming languages. Thus, on the left-hand side of a structural equation there can be only onevariable. Moreover, this variable has to be genuinely “de-pendent” on the right-hand side, which is meant to cap-ture the causal mechanism determining (or “assigning”)its value in the system under investigation. Interpretedin this way, each of the two circuits can be representedby just one form of the Ohm’s law. In order to identifythe correct “structural” forms, we need to contemplatethe effect of an “intervention” by an experimenter: whatwould happen if he or she changed the value of the resistorfrom r0 to r1? We can immediately see that for (a) it isthe current that would change, while for (b) it is voltage.Thus the structural equations describing the circuits are(a) I = U/R and (b) U = I ·R.

    Let us now consider a slightly modified version of Fig-ure 2(b): instead of an ideal current source we might havea photovoltaic cell producing current dependent on theintensity of light (E), and our resistor might be a ther-mistor whose resistance depends on temperature (T ). Asimplified structural model of such a circuit might look asfollows:

  • 4 Kvassay, M.: A Contribution Towards the Discovery of Causal Relationships in Agent-Based Models of ...

    I = g(E) (2a)

    R = h(T ) (2b)

    U = I ·R (2c)

    where g, h are unspecified nonlinear functions. By in-cluding only these equations in the model we stipulatethat there are no other significant dependencies amongthe model variables, at least in the space of parametervalues under consideration. It is not our intention to de-fend this model as realistic; it serves merely to illustratethe principles of structural causality and our new methodof causal partitions.

    The variables that occur on the left-hand side of struc-tural equations are called endogenous, i.e. determined bythe model. In our example, these are {I,R, U}. The re-maining variables {E, T} are exogenous: in their case weare not interested in modelling the mechanisms that settheir values and simply consider their values as given.The value assignment to the exogenous variables, e.g.(E = e1, T = τ1), is called the background or contextin which we try to solve the model equations.

    In structural theory, causation is interpreted as a relationbetween events. A primitive event is defined as a modelvariable assuming a value from its permitted range, e.g.R = r1 [6]. More complex events can be expressed asBoolean combinations of primitive events. We normallyspeak of an event when something happens as a conse-quence of model equations, e.g. in our case the eventR = r1 would imply that the temperature T assumed avalue τ1 such that h(τ1) evaluated to r1. Causal thinkingalso requires a special kind of event called intervention(sometimes also action): this is when we intervene from“outside”and impose the value assignment R = r1 regard-less of the temperature or the causal mechanism h(T ), e.g.by replacing the thermistor with a normal resistor whoseresistance equals r1. An intervention means that we wipeout the affected structural equation from the model andreplace it with another (typically a straightforward valuesubstitution). The solution of this modified system ofequations represents the response of the model to the in-tervention. In our case, equation (2b) would be replacedwith R = r1, giving the model response U = g(e1) · r1.

    Structural approach to causality enables us to inquirewhether one event (X = x) is a cause of another (Y = y)in a given context C = c. Unfortunately, for causal mod-els with real-valued variables, it will typically lead to theconclusion that each event X = x is a cause of the eventY = y, so long as (a) X is a parent variable of Y (i.e. Xoccurs on the right-hand side of the structural equationwhich has Y on the left-hand side), and (b) X = x wasobserved in the system alongside Y = y. And while suchan answer may be formally correct, it is not really helpfulfor understanding the dynamic behaviour of the modelledsystem. A key contribution of this dissertation is the wayto extract additional meaningful information from suchmodels through causal partitioning of its model variables,which we introduce in section 5.

    2.4 Data Mining and Machine LearningIn this dissertation, data mining and machine learningare used as auxiliary techniques in order to determine the

    predictive accuracy of causal partitions with respect tothe investigated phenomena. Drawing on [31], we reca-pitulate here their salient aspects.

    Data mining can be viewed as the extraction of informa-tion from raw data, i.e. of regularities or patterns fromrecorded facts. Machine learning is the technical basisof data mining – the algorithms used to extract informa-tion and to express it comprehensibly. Machine learningtries to infer the structure underlying the raw data – inother words, to perform abstraction or generalization. Itcan be characterised as the acquisition of structural de-scriptions from examples. These structural descriptionssupport not only prediction of future data but also expla-nation and understanding of the existing data from whichthe description was derived.

    The input to the machine learner typically takes the formof a set of examples or instances. Each instance usu-ally belongs to one, and only one, class. Instances arecharacterized by the values of attributes (features) thatmeasure their different aspects. There are many differenttypes of attributes, although typical data mining schemesdeal only with numeric and nominal (also called “categor-ical”) ones. When the measured attributes are too lowlevel, they may be augmented by intermediate concepts(i.e. functions of basic attributes) which are defined inconsultation with experts and embody some causal do-main knowledge.

    In general, there are four types of machine learning. Inclassification learning, the learning scheme is presentedwith a set of classified examples from which it is expectedto learn a way of classifying unseen examples. In associ-ation learning, any association among features is sought,not just ones that predict a particular class value. Inclustering, groups of examples that belong together aresought. In numeric prediction, the outcome to be pre-dicted is not a discrete class but a numeric quantity. Re-gardless of the type of learning involved, we call the thingto be learned the concept and the output produced bya learning scheme the concept description. Concept de-scriptions have to be intelligible so that they can be un-derstood, discussed, and disputed, and operational so thatthey can be applied to actual examples.

    Considering the very specific and limited role that datamining and machine learning play in this dissertation, webriefly characterise below only the most relevant groupsof machine learning methods.

    2.4.1 Linear modelsA linear model is a simple style of representation appli-cable when both the output and the input attributes arenumeric: its output is a weighted sum of the attributevalues and the trick is to come up with good values forthe weights. Linear models are easiest to visualize intwo dimensions, where they are tantamount to drawinga straight line through a set of data points.

    Linear models can also be applied to binary classifica-tion problems. In this case, the line produced by themodel separates the two classes: It defines where the de-cision changes from one class value to the other. Such aline is often referred to as the decision boundary. Thismodel can be extended to multiple attributes, in whichcase the boundary becomes a high-dimensional plane, or

  • Information Sciences and Technologies Bulletin of the ACM Slovakia 5

    “hyperplane,” in the instance space. Examples of linearclassifiers include perceptrons (precursors to modern neu-ral networks), logistic regression, and support vector ma-chines (SVM).

    In this dissertation, SVM with SMO (sequential minimaloptimization) was used to gauge the predictive accuracyof causal partition components. SVM tries to maximizethe margin between the decision boundary and the closestinstances of both classes as shown in Figure 3. Theseclosest instances are called support vectors.

    Figure 3: A maximum-margin hyperplane sepa-rating two classes in two dimensions illustratesthe principle of support vector machines (SVM).(Reproduced from [31])

    2.4.2 TreesWhen the classes in question are not linearly separable,more powerful methods are needed. One of them, basedon a “divide-and-conquer” approach, is a decision tree.Its internal nodes involve testing a particular attributeagainst a constant. (In two dimensions this amounts tosplitting the data vertically or horizontally). Leaf nodesthen give a classification that applies to all instances thatreach the leaf. To classify an unknown instance, it isrouted down the tree according to the values of the at-tributes tested in successive nodes, and when a leaf isreached the instance is classified according to the classassigned to the leaf.

    The same kind of tree can be used to predict numericquantities. In that case, each leaf might contain a numericvalue that is the average of all the training set values towhich the leaf applies. Such trees are called regressiontrees. A regression tree whose leaf nodes contain linearmodels is called a model tree. A model tree with obliquesplits (a result of testing linear combinations of attributesagainst a constant in its internal nodes) is called a func-tional tree.

    The results obtained in this dissertation through SVMclassifiers were independently validated in [8, 13] by C4.5decision trees and M5P model trees.

    2.4.3 ClustersWhen a clusterer rather than a classifier is learned, theoutput takes the form of a diagram that shows how theinstances fall into clusters. In the simplest case this in-volves associating a cluster number with each instance.Some algorithms produce a hierarchical structure of clus-ters so that at the top level the instance space divides

    into just a few clusters, each of which divides into its ownsubcluster at the next level down, and so on. Clusteringis often followed by a stage in which a decision tree or ruleset is inferred that allocates each instance to the clusterin which it belongs. Then, the clustering operation is justone step on the way to a structural description.

    In this dissertation, clustering is used as a prerequisitefor classification. In the early stages the Weka’s simplek-means clusterer was used. Later experiments relied onthe Weka’s hierarchical clusterer.

    3. Thesis ObjectivesIn line with the main goal of this dissertation, the follow-ing detailed thesis objectives were set:

    1. Design a generic agent architecture enabling easyincorporation and exploration of various models ofhuman nature and behaviour;

    2. Analyse the limitations of the standard approach tostructural causal analysis from the point of view ofsimulation studies;

    3. Propose a new method based on structural causalanalysis that overcomes these limitations and canassist in the discovery of causal relationships lead-ing to emergence in agent-based models of humanbehaviour;

    4. Implement the algorithmic elements of the proposedmethod at the level of “proof of concept”;

    5. Apply the method to simulation data and validatethe results.

    These objectives were realised in the context of projectEUSAS. As a result, the dissertation analyses the agent-based model used in project EUSAS and investigates itssurprising emergent behaviour in one of the project’s sce-narios, which became a motivating use-case for the dis-sertation. Both the model and the scenario are describedin more detail in the next section.

    4. The Model and the ScenarioIn this scenario, which was inspired by the peacekeepingISAF mission in Afghanistan, a crowd of civilians is loot-ing a shop and an approaching security patrol is supposedto stop the looting and disperse the crowd. The scene isdepicted in Figure 4. The black areas represent buildingsand barriers unreachable to agents. The rectangle withgray interior near the top is the looted shop. It is sur-rounded by dots, each representing one agent. The darkones are the looters; the white ones are aggressive indi-viduals whose intention is to attack the soldiers. In thescenario there are 40 looters and 15 aggressive individuals.The security patrol is represented by the three mediumgray dots in the bottom part of the figure.

    Civilian agents are endowed with one“default”motive anda matching behaviour by which they try to satisfy it. Forlooters this leads to looting and for the violence-prone in-dividuals to stone-pelting the soldiers. The agents alsomonitor their surroundings. As the patrol approaches,this may induce fear in some looters who then start leav-ing the scene. The violence-prone individuals, however,

  • 6 Kvassay, M.: A Contribution Towards the Discovery of Causal Relationships in Agent-Based Models of ...

    Figure 4: Initial setting of the simulation scenario

    do not get afraid but rather attack the patrol. The vi-olence may impact the remaining looters in two possibleways – they may either get afraid and leave, or get angryand join the attack. The ratio of looters who get afraidto those who get angry depends on their motivational dy-namics, which we explain next.

    A simplified diagram of the key factors affecting the be-haviour of our civilian agents is shown in Figure 5. Themodel draws primarily on Berkowitz [2], Prentice-Dunn etal. [23], Staub [27] and Cañamero [4]. The main ideas andprocesses underlying the model were developed by AirbusDefence and Space (former EADS Deutschland GmbH)3

    in collaboration with the Department of Social Psychol-ogy of the University of Zurich and the chair for Oper-ations Research at the University of Passau on behalf ofthe German Bundeswehr.

    As depicted in Figure 5 (starting from the top left cor-ner), the number of people surrounding the agent, theiractions and other events in the vicinity affect the agent’semotional motives (fear, anger) and other internal vari-ables (arousal, readiness for aggression). Besides eventsand actions, there is also a direct social influence of otheragents on the agent’s fear and anger. This was modelledaccording to Latané’s formula of strength, physical prox-imity and the number of influencing agents [14].

    Speaking qualitatively, the agent’s internal arousal de-pends on the number of people in the vicinity and theirviolence: the higher the number and the more violent theyare, the sharper the increase of the agent’s arousal. De-individuation means the agent considers himself a partof the crowd and no longer a separate individual: thehigher the agent’s arousal and the cohesion of his group,the higher the de-individuation. Readiness for aggression(RFA) is jointly affected by the norms for anti-aggression,de-individuation and external events as follows: (a) the

    3 http://airbusdefenceandspace.com/

    Figure 5: Key factors affecting the behaviour ofcivilian agents

    higher the norms for anti-aggression, the lower the RFA;(b) the higher the de-individuation, the higher the RFA;and (c) the more violent actions are witnessed, the higherthe RFA. The form of the agent’s eventual aggression de-pends primarily on the RFA. In our scenario, the initialvalue of RFA was set to a level guaranteeing that civilianswould resort to stone-pelting the soldiers whenever theybecame aggressive.

    Our model includes four state variables that act directlyas motives: fear, anger, looting motive (present only inlooters) and will to attack (present only in violence-pronecivilians). These compete for control over the behaviourof the agent: the strongest wins and becomes action-guiding. Unlike fear and anger, which were endowed withcomplex dynamics described below, the looting motiveand will to attack were both set to constant values whichfear and anger had to cross in order to affect the agents’behaviour.

    Each motive, when it becomes action-guiding, pre-selectsa group of behaviours that can satisfy it. In our model,for example, there are three “fearful” behaviours: with-drawal (walking away), flight (running away) and panicflight (running away at extra speed with sensory percep-tion blocked). Which of them is triggered when fear be-comes action-guiding depends on a secondary selectioncriterion, in this case the actual intensity of fear. Thiscriterion can be arbitrary, e.g. while anger pre-selectsa group of aggressive behaviours, the final choice of theform of aggression depends on the readiness for aggres-sion (RFA). As we have already mentioned, we set RFAso that aggressive civilians would always resort to stone-pelting the soldiers.

    As for the dynamics of the simulated emotions fear andanger, they comprise a continuous part and a discretepart. The continuous dynamics of fear (F) is driven bythe differential equation

    dF

    dt= c1F · F · (c2F − F ) · (c3F + IF ) (3)

  • Information Sciences and Technologies Bulletin of the ACM Slovakia 7

    Figure 6: Internal structure of civilian agents for project EUSAS (implementation view)

    where c1F , c2F , c3F are fear-related constants (see Table 1)and IF fear-related social influence of nearby agents. Anal-ogously, the continuous dynamics of anger (A) is drivenby the equation

    dA

    dt= c1A ·A · (c2A −A) · (IA + L− c3A) (4)

    where c1A, c2A, c3A are anger-related constants (see Ta-ble 1), IA anger-related social influence of nearby agentsand L the model variable arousal.

    Social influence IV of nearby agents on a motive variableV (where V stands either for fear F or anger A) of anobserving agent j is defined by the following sum:

    IV = c4V ·∑k 6=j

    [(Vk − Vj) · prestigek · sympathyjk

    distancejk

    ](5)

    Here, the summation is over those agents (indexed by thesubscript k) who are not farther away from agent j than acertain social influence radius (set to 100 metres for bothfear and anger), and in whom the motive V happens to beaction-guiding at the moment of evaluation. Each agentis assigned a constant social rank prestigek which mod-ulates its influence on others: group leaders enjoy higherprestige than ordinary members, and thus influence theothers more. Analogously, there are constant sympathiesassigned between groups: sympathyjk captures the sym-pathy of agent j’s group towards agent k’s group, and isinterpreted here as the susceptibility of the former to theinfluence of the latter. Finally, distancejk is the physicaldistance between the two agents. In the original model,

    Table 1: Fear- and Anger-related constantsName Value Interpretationc1F 5 Sensitivity of the first derivative of fear

    dF/dt.c2F 1.1 Maximum value of fear F at which

    dF/dt becomes zero.c3F 0.1 Inbuilt tendency of fear to increase – a

    composite constant defined as the dif-ference between the effect of the ex-pected negative consequences and theagent’s resilience to fear.

    c4F 12.5 Sensitivity to fear-related social influ-ence of other agents.

    c5F 1 Sensitivity to fear-inducing events (seeTable 2).

    c1A 2 Sensitivity of the first derivative ofanger dA/dt.

    c2A 1.1 Maximum value of anger A at whichdA/dt becomes zero.

    c3A 0.3 Resilience to anger.c4A 12.5 Sensitivity to anger-related social in-

    fluence of other agents.c5A 1 Sensitivity to anger-inducing events

    (see Table 2).

    variables and many constants were expressed in the per-centage scale [0, 100]. For the sake of simplicity, in thispaper we have converted them into the ratio scale [0, 1](see Table 1).

    During the simulation, differential equations (3) and (4)for each agent are solved numerically by the Euler methodwith a constant (but user-definable) time step ∆t. Resort-ing to numerical method enabled us to ignore the specificform of the equations’ right-hand side and consider thegeneral case

  • 8 Kvassay, M.: A Contribution Towards the Discovery of Causal Relationships in Agent-Based Models of ...

    Table 2: Main event impacts on Fear and AngerImpact on Fear Impact on Anger

    Event Direct Indirect Direct IndirectEffective shot 0.4 0.35 0.1 0.25Warning shot 0.3 0.3 0.1 0.1Stone thrown 0.002 0.002 0.18 0.15

    dF

    dt= f(F, IF ) (6a)

    dA

    dt= h(A, IA, L) (6b)

    where f, h are bounded (but not necessarily continuous)nonlinear functions. The Euler method approximates thenew values of fear F (t+ ∆t) and anger A(t+ ∆t) on thebasis of the current ones F (t), A(t):

    F (t+ ∆t) ≈ F (t) + ∆t · f(F (t), IF (t)) (7a)A(t+ ∆t) ≈ A(t) + ∆t · h(A(t), IA(t), L(t)) (7b)

    After calculating these new “continuous” values, discretedynamics come into play: the cumulative effects of theperceived external events on fear (∆EF ) and anger (∆EA)are added in order to obtain the new “total” values:

    FT (t+ ∆t) = F (t+ ∆t) + c5F ·∆EF (8a)AT (t+ ∆t) = A(t+ ∆t) + c5A ·∆EA (8b)

    In the next iteration the new total values of fear FT andanger AT will be used as initial conditions in the nu-merical solution of the differential equations representingtheir continuous dynamics. This sequential coupling ofthe continuous and the discrete dynamics qualifies ouragent models as sequential hybrid in the sense of Swinerdand McNaught [29].

    The constants c5F , c5A (see Table 1) capture the agent’sindividual sensitivity, while ∆EF and ∆EA are the sumsof the emotion-inducing impacts as per Table 2 for all theevents perceived by the agent during the time interval(t, t + ∆t). The “direct” values from Table 2 are usedwhen the perceiving agent is within 20% of the maximumperception distance. When the agent is farther away than40% of this maximum distance, the “indirect” values areused. In the intermediate zone (from 20% to 40%), aweighted average is used, sliding down linearly from thedirect value towards the indirect one. Sensory perceptionof the agents is limited by a radius of 50 m for events likethrowing stones and 150 m for gun shots.

    Besides the constants in Table 1 and the event impacts inTable 2, emotional dynamics is greatly influenced by theinitial values of fear F0 and anger A0. In our scenario,these were set to F0 = 0.3 and A0 = 0.2. Additionally, ouragents underwent the moderating influence of emotionsand fatigue: when the average of fear F and anger A(termed in our model the emotional moderator) crossedthe level of 0.5, further sensory perception of externalevents was blocked. Moreover, as the physical energy of

    our agents decreased with expended effort and sustainedinjuries, they slowed down and their actions took longer.

    Numerical solution of this model by iterating throughequations (7) and (8) can give us a complete and de-tailed time-evolution of simulated fear and anger of civil-ian agents. Our present goal, however, is more ambi-tious. In the case of fear, for example, we want to knowwhat portion of its actual value at any time should beattributed to the social influence of nearby agents (vari-able IF in equation (7a)) as opposed to the direct impactof external events (variable ∆EF in equation (8a)). Wetackle this question in the sections that follow.

    In contrast to our civilian agents, the soldier patrol char-acters were intended primarily as avatars to be controlledby real people in a high-fidelity 3D cyber-environmentof a commercial battlefield simulator VBS24. In conse-quence, our soldier agents were not defined at the samelevel of detail as the civilian ones. In our scenario, forexample, they are just passing by and act in self-defence.Their rules of self-defence say that when a given civilianfirst throws a stone at a particular soldier, that soldierresponds by a warning shot in the air. If the same civil-ian throws a stone at the same soldier a second time, thatsoldier is permitted to use an effective shot aimed at thelegs of the attacker in order to immobilize him. That is,of course, an extreme simplification, but it proved usefulin the early phases of project EUSAS for calibrating thecivilian agents.

    While experimenting with this model in this scenario, wenoticed that for the parameter setting listed above, theemergent collective behaviour of the civilians seemed tobifurcate along two different trajectories. In some cases,almost all the looters got afraid and left the scene, whilein others almost all got angry and joined the attack. Be-cause our model incorporated an element of randomness,some variation in its behaviour was expected, but the ex-treme variation we witnessed was unusual and called foran explanation. This spurred our search for analyticalmethods that could unravel causal chains and dependen-cies in complex systems of this kind. The method of causalpartitions presented below is the result.

    5. The Concept of a Causal PartitionWe shall illustrate the concept of a causal partition onthe same electrical circuit that was used in subsection 2.3to elucidate the concept of a structural equation. Letus again consider a variation of Figure 2(b) with a photo-voltaic cell and a thermistor described by structural equa-tions (2). As mentioned there, the application of the clas-sical structural causal theory to the variable U (voltage)in this circuit leads to the correct but rather unillumi-nating conclusion that its actual value was caused by thevalues of its parent variables I and R on the right-handside of its structural equation (2c). This problem of thetriviality of results occurs as a side-effect of working withreal-valued model variables.

    In our previous work [11] we suggested how to extendthe structural approach in order to cope with real-valuedvariables. Instead of asking, “What is the cause?” weproposed to ask a modified question: “In what proportion

    4http://www.army-technology.com/contractors/training/bohemia-interactive/

  • Information Sciences and Technologies Bulletin of the ACM Slovakia 9

    have all the causes contributed to the effect?” Contin-uing with the example, we should try to determine theproportion in which the change of U from some previ-ous level u0 to the present one u1 could be attributed to(or split among) its parent variables I and R. Let us as-sume that U = u0 was observed at time t0 together withI = i0, R = r0 in the context C = c0 = (E = e0, T = τ0).For the sake of simplicity, let us further assume that ourtwo observations were so close in time that the state-spacetrajectory of the system between them can be consideredlinear. In such a case we can approximate the change ofU by a scalar product of two vectors. The first is thegradient of U along its parent variables according to itsstructural equation (2c). The second is the vector formof the change in these parent variables between the twoobservations (contexts C = c0 and C = c1):

    ∆U ≈ grad(U) · (∆I,∆R) (9a)

    ≈(∂U

    ∂I,∂U

    ∂R

    )· (∆I,∆R) (9b)

    ≈ ∂U∂I·∆I + ∂U

    ∂R·∆R (9c)

    Equation (9c) can be interpreted as a general “recipe” forquantifying the responsibility of the parent variables I, Rfor the change of their dependent variable U : the firstsummand on the right-hand side (∆I · ∂U/∂I) capturesthe contribution of I, the second one (∆R · ∂U/∂R) thatof R.

    Suppose that we trace further evolution of U at timest2 . . . tm as it assumes new values u2 . . . um along sometrajectory representing our observational study or exper-iment. Expressing each of these changes of U as per (9c)and summing up the contributions of I separately fromthose of R, we obtain a general formula quantifying thetotal contribution of each causal factor (parent variable)along an arbitrary trajectory (implicitly approximatedhere by a piecewise linear curve):

    m∑j=1

    ∆U (j) ≈m∑

    j=1

    ∆I(j) · ∂U∂I

    ∣∣∣∣I=Î(j)

    R=R̂(j)

    +

    m∑j=1

    ∆R(j) · ∂U∂R

    ∣∣∣∣I=Î(j)

    R=R̂(j)

    (10)

    The above formula uses superscript indexing where X(j)

    denotes the value of the variable X at time tj , and ∆X(j)

    its backward difference: ∆X(j) = X(j) −X(j−1). Partialderivatives are evaluated at midpoint (Î(j), R̂(j)) of each

    linear segment of the trajectory, so Î(j) = 0.5 · (I(j) +I(j−1)) and R̂(j) = 0.5 · (R(j) +R(j−1)).

    In order to keep the total contribution of each causal fac-tor separate, we proposed a new, vector-like representa-tion of model variables termed a causal partition. Thusthe variable U at the end of our observational study wouldbe represented by the causal partition vector (UU0, UI , UR)where

    UU0 = u0,

    UI =

    m∑j=1

    (∆I(j) · ∂U

    ∂I

    ∣∣∣∣I=Î(j)

    R=R̂(j)

    )

    UR =

    m∑j=1

    (∆R(j) · ∂U

    ∂R

    ∣∣∣∣I=Î(j)

    R=R̂(j)

    )

    Note that in this example the contribution of the initialvalue u0 remained constant throughout the trajectory, i.e.

    U(j)U0 = u0, j = 1 . . .m. In general, however, this contri-

    bution can vary.

    6. Summary and Generalization of the Methodof Causal Partitioning

    The concept of a causal partition introduced above on asimple electrical circuit can be generalized. This leadsto a generalized method of causal partitioning applicableto systems modelled by structural equations whose right-hand sides are differentiable with respect to their parentvariables.

    Causal partitions and their components. A causal par-tition of a model variable Y is a vector-like structure(YC1, YC2 . . . YCn) whose components sum up to Y andwhere each component YCi represents the portion of thevalue of Y attributed to one specific causal factor Ci. Inthis way it is possible to quantify the extent to which var-ious factors can be considered “responsible” for the valueof Y at a given time and in a given simulation scenario.

    In order to cover the general case, let us consider a modelvariable Y driven by the structural equation

    Y = f(Xk, k = 1 . . . n), (11)

    where {Xk, k = 1 . . . n} are the parent variables of Y andfk is an unspecified nonlinear function differentiable withrespect to all these parent variables. Let us further as-sume that Y starts from the initial value Y = y0 andproceeds through Y = yj , j = 1 . . .m. (This meansthat the potentially arbitrary trajectory along which Ymoves is implicitly approximated here by a piecewise lin-ear curve.) The causal partition vector of Y at the end ofthis trajectory (Y = ym) is then expressed as the struc-ture

    (YY 0, YX1, YX2 . . . YXn), (12)

    where each partition component YXk stands for the to-tal contribution of the corresponding parent variable Xkalong this trajectory and is calculated as

  • 10 Kvassay, M.: A Contribution Towards the Discovery of Causal Relationships in Agent-Based Models of ...

    YXk =

    m∑j=1

    ∆X(j)k · ∂Y∂Xk∣∣∣∣X1=X̂(j)1···Xn=X̂

    (j)n

    =

    m∑j=1

    ∆X(j)k · ∂f∂Xk∣∣∣∣X1=X̂(j)1···Xn=X̂

    (j)n

    (13)

    where (X̂(j)1 . . . X̂

    (j)n ) is the midpoint of the j-th linear

    segment of the approximated trajectory.

    The first partition component YY 0 has a special role: itdenotes the contribution of the initial setting Y = y0 tothe final value Y = ym. It makes the partition compo-nents sum up to the value of the represented variable,which helps to interpret causal partitions:

    ym = YY 0 +

    n∑k=1

    (YXk) (14)

    This approach implies that, at the beginning of the tra-jectory, Y is represented by the causal partition vector(y0, 0, 0 . . . 0).

    Practical steps of causal partitioning. Let us assumethat we already have at our disposal the structural equa-tions describing the system of interest, as well as their ex-ecutable implementation in the form of a simulator. Theapplication of causal partitioning to such a system theninvolves the following steps:

    1. Choice of model variables for partitioning. Inall but the most rudimentary cases it will be im-practical to partition all the variables in the model.The analyst should start with the variables directlyconnected with the phenomenon of interest, and addothers as and when needed.

    Regarding the agent-based model investigated in thisdissertation, the phenomenon of interest was thesurprising bifurcation of simulation trajectories alongeither the timid or the aggressive path. Since thetimid behaviours were triggered by fear and the ag-gressive ones by anger, the model variables repre-senting these two emotions were the obvious choicesfor causal partitioning. As a matter of fact, theanalysis was successfully concluded without havingto partition any other model variable like arousal orreadiness for aggression.

    2. Deriving formulas for individual partition com-ponents. In the ideal case, each model variable Ychosen for causal partitioning will have a dedicatedstructural equation in the model in which it appearson the left-hand side, such as (11) above. The “ba-sic version” of causal partitioning will then produceone partition component for each parent variable onthe right-hand side of that equation, and each suchcomponent can, in general, be calculated as per for-mula (13). In practice, however, it may be necessary

    to go beyond this basic form of partitioning in orderto gain more information, improve interpretability,remove redundant parent variables, or factor in de-pendencies among them.

    These more advanced forms of causal partitioningwere demonstrated on the simulated emotion of fearF in chapter 5 of the dissertation, whose full text isavailable online elsewhere in this ACM bulletin5.

    3. Software implementation and data provision.The algorithms for computing and logging the causalpartition components for the chosen model variableshave to be implemented in the simulator, and ap-propriate simulation experiments then need to bedefined and executed.

    4. Data analysis and hypothesis generation. Logsof simulation runs deemed relevant for causal anal-ysis need to be analysed by suitable machine learn-ing techniques (clustering, classification, etc.) us-ing partition components as predictors in order togenerate and test hypotheses about the causes anddevelopmental stages of the phenomena of interest.This typically proceeds iteratively in conjunctionwith the next step.

    5. Hypothesis validation by confirmatory sim-ulation experiments. In this phase we need togo beyond statistical hypothesis testing by takingadvantage of the fact that simulation models arefully observable and manipulable by setting theirinitial conditions, parameters and, if necessary, bythe modification of their software implementation.Thus we can ultimately prove or disprove our causalhypotheses by directly manipulating the suspectedcauses and observing the effects of our interventionson the behaviour of the simulated system.

    This step is required because high predictive accu-racy of a given partition component does not byitself guarantee that the causal factor behind it isactually causing the phenomenon – it may be merelyassociated with it. Additional validation activitiesare needed – separate experiments manipulating thesuspected cause and confirming or disconfirming itseffect on the emergent phenomenon. In any case, thebest early predictors help us focus on the relevantaspects of the system in search for its real causesand generating processes.

    In conclusion it needs to be stressed that causal partition-ing involves numerous heuristic elements. Its applicationtherefore does not automatically guarantee analytical suc-cess. Rather, it is meant as a set of guidelines that shouldsignificantly improve our chances in this respect.

    7. Results of the Application of Causal Parti-tioning to the Investigated Model

    Causal partitioning was applied to the model and the sce-nario from project EUSAS described in section 4. In thisscenario, for certain initial values of fear F0 and anger A0,a surprising emergent phenomenon was observed: simula-tions starting from these initial values seemed to bifurcatealong two radically different paths. Along one branch, al-most all the looters in the simulation became angry and

    5http://acmbulletin.fiit.stuba.sk/abstracts.php

  • Information Sciences and Technologies Bulletin of the ACM Slovakia 11

    Figure 7: Two global clusters of simulations in thenew set of experiments.Each simulation is represented as a dot with coordinates(A-count, F-count), where A-count says how many timesanger prevailed among its civilian agents and F-count howmany times fear prevailed. (Each simulation included atotal of 55 civilian agents, out of which 15 were aggres-sive.)

    attacked the security patrol, while along the other nearlyall the looters became afraid and fled to safety. Such bi-furcation is typical of complex systems whose behaviourcannot be captured by a closed-form expression. The goalof this investigation was to uncover the causes and thestages of the observed bifurcation and, if possible, to sup-press it. Through the application of causal partitioning,the following results were obtained:

    1. Confirmation of the existence of two high-level clusters of simulations. Our clusteringexercises confirmed the existence of two clusters ofsimulations in both series of simulation experimentscarried out in the dissertation. These clusters broadlycorresponded to each simulation taking either a timidor an aggressive turn. Clustering results for the sec-ond set of simulation experiments are shown in Fig-ure 7.

    2. Confirmation of the relevance of causal par-titions. Already the first set of experiments usinga less-developed version of causal partitioning suc-ceeded in establishing the relevance of causal parti-tions to the observed phenomenon. In other words,causal partitions reliably indicated the timid or ag-gressive turn of each simulation and equalled in thisrespect the more traditional Measures of Effective-ness (MoE) used in project EUSAS (these MoE in-cluded the numbers of stones thrown as well as ofwarning and effective gun shots).

    3. Support for hypothesis generation. Causal par-titions also supported and guided the process of hy-pothesis generation regarding the possible causes

    Figure 8: Time-evolution of predictive accuracyof selected predictors.Predictor ALL incorporates all components of causal par-titions of fear and anger, predictor MoE the three tradi-tional MoE measures (numbers of stone-throws, warningshots and effective shots), predictor Fec only the con-tribution of civilian actions to the value of fear, predictorFes only the contribution of security actions, and Fs onlythe contribution of social influence of nearby agents.

    and stages of the observed phenomenon. In con-trast, the more traditional Measures of Effectivenessdid not provide any such guidance.

    4. Early onset of the predictive accuracy of causalpartitions. Subsequent improvements to the methodof causal partitioning made it possible to study thetime evolution of the predictive accuracy of causalpartitions. Figure 8 shows that already in the 16thsecond of simulated time they could predict the sub-sequent timid or aggressive turn of each simulationwith 99% accuracy. In this respect they signifi-cantly surpassed MoE, which in this early stage onlyreached about 70% accuracy.

    5. Establishing the time limits for the casualmechanism of emergence. Early onset of thepredictive accuracy of causal partitions helped usdelimit the time interval in which the “cause” ofthe bifurcation must have acted: the future turn ofeach simulation was shown to be invisibly decidedbetween the 8th and the 14th second of simulatedtime.

    6. Establishing the stages of emergence. Thepeaks of predictive accuracy of individual partitioncomponents in this time-frame (shown as local max-ima in Figure 8) helped us identify three distinctstages through which the resulting emergent behaviourof the system gradually arose: (a) the initial attackof aggressive individuals on the security patrol; (b)the patrol’s response; and (c) the reaction of thelooters who could either join the attack or flee tosafety.

    7. Identifying the mechanism of emergence. In-vestigation of selected model variables in this time-frame revealed that, by the 14th second, sensoryperception of most civilian agents was already blocked.This led to the formulation of the final, correct hy-pothesis that the timid or aggressive turn of eachsimulation depended on how quickly and resolutelythe patrol members reacted before the sensory per-ception of civilians got blocked. This hypothesis wasconfirmed by a subsequent simulation experiment inwhich the patrol reaction time was varied and whose

  • 12 Kvassay, M.: A Contribution Towards the Discovery of Causal Relationships in Agent-Based Models of ...

    Figure 9: Proportion of “aggressive” simulationsin the batch as a function of patrol reaction time.

    results are shown in Figure 9. Prolonged patrol re-action time was shown to dramatically increase theprobability of the aggressive turn of the scenario.

    8. Suppressing the bifurcation. Setting the patrolreaction time to five seconds suppressed the bifur-cation and forced all the simulations along the “ag-gressive” path.

    8. Conclusions and Future WorkThe main scientific contribution of this dissertation hasthree aspects:

    1. Formulation of a new causal analytical method.Causal partitioning is based on nonlinear structuralapproach to causality, which is a nonlinear and non-parametric form of structural equation modelling (SEM).

    2. Successful application of the new method toa surprising emergent phenomenon in a com-plex system. Results of the application of causalpartitioning to the investigated agent-based modelof human behaviour are summarised in section 7.

    3. Generalisation of the method. Causal parti-tioning was generalised in section 6 so as to applyto systems described by nonlinear structural equa-tions with differentiable right-hand side.

    Regarding future work, there are several directions. Thefirst is to investigate our agent-based model in greaterdepth and elucidate certain subsidiary aspects before wetry to apply our method to other systems.

    The first such aspect is the significance of social influencefor the bifurcation of simulation trajectories. At presentwe tend to think that social influence is not crucial, i.e.the bifurcation would persist even if we kept IF and IAin equations (3) and (4) at zero level, e.g. by setting theconstant c4V in equation (5) to zero. Furthermore, we donot expect this to significantly affect the time-evolution ofpredictive accuracy of causal partition components shownin Figure 8. What we expect to change is the compositionof the clusters, that is, some simulations might shift theircluster affiliation, although we are at present unable tosay which cluster would grow and which would shrink.We also expect this effect to be relatively mild (if any).

    Another interesting question is how many clusters reallyare there. In our early report [10] we decided to workwith two clusters, but that was a provisional decision inorder to keep the analysis simple. We were not at all surethat we would succeed and were ready to consider moreclusters if necessary. The fact that we could completethe analysis using only two clusters does not by itself set-tle the question; it might merely show that our methodtolerates some uncertainty in this respect.

    The second direction – and a natural next step – is toapply our method to other similar systems. By similaritywe mean that, besides being modelled through structuralequations, the variables and functions used in them shouldbe numerical (ideally, real-valued). The most straightfor-ward application would be to systems governed by ordi-nary differential equations, but we believe an extension tosystems driven by partial differential equations should bepossible as well.

    Agent-based models using structural equations would makeanother legitimate and interesting analytical target. (Note,however, that other types of agents, e.g. simple rule-basedagents, are likely to fall outside the scope of this method.In order to apply causal partitioning to them we wouldhave to define, first, what we mean by cause and effectin these systems and, second, how we could quantify suchcausal effects.)

    We also see an exciting possibility to adapt our method foruse in artificial neural networks – these too are describedby structural equations, because each neuron has only oneoutput (the dependent variable) which is a function of oneor more inputs. In this way we might be able to study e.g.the processes of learning in complex neural architectures.

    System Dynamics example. The discipline of System Dy-namics (SD) merits special mention: it not only deals withdynamical systems, but also meticulously maps the flowsof causality through them. In most cases, therefore, SDmodel equations should qualify as structural equations,which would make causal partitioning applicable to themat least in principle. This enables us to offer a few broadhints regarding the application of causal partitioning toan SD model. As an example, let us consider a variable X(modelled in SD as a stock) with n inflows and outflowsx1, x2 . . . xn. For simplicity, let us assume that the dy-namical process we are interested in starts at time t = 0with zero initial value of X. Then its value at an arbitrarysubsequent point of time t can be expressed as the sumof all its inflows and outflows separately integrated (orsummed) over the time interval [0, t]. Thus, if Xi denotesthe integrated or summed flow xi over the concerned pe-riod, then X(t) = X1 + · · ·+Xn. We can then representX(t) by a vector-like structure (X1, . . . Xn) which closelycorresponds to what we called the basic causal partitionintroduced in the dissertation.

    In this “basic” version of causal partitioning we look oneach flow as contributing to its own dedicated partitioncomponent of X. More sophisticated (“enhanced”) formsof causal partitioning might try to re-define this basic par-tition, typically by splitting some components into twoor more in order to gain more information, or by elim-inating others in order to improve interpretability. Anysignificant functional dependencies among the inflows and

  • Information Sciences and Technologies Bulletin of the ACM Slovakia 13

    outflows of X might have to be taken into account as well.

    The whole point in calculating causal partitions is to gainadditional information: we learn not only the resultingvalue of X but also how much each flow contributed. Thismight provide valuable clues for understanding the globaldynamics of the model, especially if we study the time-evolution of predictive accuracy of partition componentsand succeed in mapping their local maxima to variousdevelopmental stages of the observed global behaviour.

    Finally, the last direction of future work comprises thestudy of the method’s mathematical properties, especiallythe limits of its stability. Elsewhere in the dissertationwe mentioned that we had to reject some data becauseit exhibited signs of numerical instability. We did notgo into details, but in [12] we mentioned how this couldbe improved, and we intend to follow it up alongside ourwork on further practical applications.

    Acknowledgements. This work was supported by EDAproject A-0938-RT-GC EUSAS, national project VEGANo. 2/0167/16 and the Slovak Research and Develop-ment Agency under the contracts No. APVV-0233-10 andAPVV-0809-11. The author would also like to thank Dr.Ladislav Halada of the Institute of Informatics of the Slo-vak Academy of Sciences for helpful suggestions regardingmathematical notation.

    References[1] R. J. Allan. Survey of agent based modelling and simulation tools.

    Technical Report DL-TR-2010-007, Science and TechnologyFacilities Council, 2010. ISSN 1362-0207.

    [2] L. Berkowitz, editor. Aggression: A Psychological Analysis.McGraw-Hill, New York, 1962.

    [3] S. Bullock and D. Cliff. Complexity and emergent behaviour inict systems. Technical report, Hewlett-Packard Labs, 2004.http://eprints.soton.ac.uk/261478/1/HPL-2004-

    187.pdf, Accessed November2015.

    [4] D. Cañamero. Modeling motivations and emotions as a basis forintelligent behaviour. In Proc 1st Int. Conf. on AutonomousAgents, pages 148–155. ACM Press, 1998.

    [5] N. Gilbert and S. Bankes. Platforms and methods for agent-basedmodeling. Proceedings of the National Academy of Sciences,99(suppl 3):7197–7198, 2002.

    [6] J. Y. Halpern and J. Pearl. Causes and explanations: Astructural-model approach. Part I: Causes. The British journal forthe philosophy of science, 56(4):843–887, 2005.

    [7] J. Y. Halpern and J. Pearl. Causes and explanations: Astructural-model approach. Part II: Explanations. The BritishJournal for the Philosophy of Science, 56(4):889–911, 2005.

    [8] P. Krammer, M. Kvassay, and L. Hluchý. Validation of parameterimportance by regression. In INES 2014 – IEEE 18thInternational Conference on Intelligent Engineering Systems,Proceedings, pages 27–31, 2014.

    [9] M. Kvassay, L. Hluchý, v. Dlugolinský, B. Schneider, H. Bracker,A. Tavčar, M. Gams, M. Contat, L. Dutka, D. Król, M. Wrzeszcz,and J. Kitowski. A novel way of using simulations to supporturban security operations. Computing and Informatics, 34(6),2015.

    [10] M. Kvassay, L. Hluchý, P. Krammer, and B. Schneider. Exploringhuman behaviour models through causal summaries and machinelearning. In INES 2013 – IEEE 17th International Conference onIntelligent Engineering Systems, Proceedings, pages 231–236,2013.

    [11] M. Kvassay, L. Hluchý, P. Krammer, and B. Schneider. Causalanalysis of the emergent behavior of a hybrid dynamical system.Acta Polytechnica Hungarica, 11(4):21–40, 2014.

    [12] M. Kvassay, L. Hluchý, and B. Schneider. Summarizing thebehaviour of complex dynamic systems. In SAMI 2013 – IEEE11th International Symposium on Applied Machine Intelligenceand Informatics, Proceedings, pages 15–20, 2013.

    [13] M. Kvassay, P. Krammer, and L. Hluchý. Validation of parameterimportance in data mining: a case study. In WIKT 2013 WorkshopProceedings, pages 115–120, 2013.

    [14] B. Latané. Dynamic social impact. Journal of Communication,46(4):13–25, 1996.

    [15] M. Luck, P. McBurney, O. Shehory, and S. Willmott. Agenttechnology roadmap: A roadmap for agent based computing.Technical report, University of Southampton on behalf ofAgentLink III, 2005.http://eprints.soton.ac.uk/261788/1/al3roadmap.pdf,Accessed November 2015.

    [16] L. Metzger. Kollektive Gewalt: Entstehungsbedingungen, Verlaufund Eskalation. Research report, University of Zurich,Psychological Institute, 2002.

    [17] H.-P. Nolting. Aggression ist nicht gleich Aggression. Der Bürgerim Staat, 2:91–95, 1993.

    [18] J. Pearl. Theart and science of cause and effect. 1996 Faculty Research Lecture,http://singapore.cs.ucla.edu/LECTURE/lecture:sec1.htm,Accessed November 2015.

    [19] J. Pearl. Reasoning with cause and effect. 1999 IJCAI AwardLecture,http://singapore.cs.ucla.edu/IJCAI99/index.html,Accessed November 2015.

    [20] J. Pearl. Causality: Models, Reasoning, and Inference. CambridgeUniversity Press, New York, 2000.

    [21] J. Pearl. An introduction to causal inference. The InternationalJournal of Biostatistics, 6(2), 2010.

    [22] R. W. Pew, A. S. Mavor, et al. Modeling Human andOrganizational Behavior: Application to Military Simulations.National Academies Press, 1998.

    [23] S. Prentice-Dunn and R. Rogers. Deindividuation and theself-regulation of behaviour. In P. Paulus, editor, Psychology ofGroup Influence, pages 89–109. Lawrence Erlbaum Associates,Hillsdale, 1989.

    [24] S. F. Railsback, S. L. Lytinen, and S. K. Jackson. Agent-basedsimulation platforms: Review and developmentrecommendations. Simulation, 82(9):609–623, 2006.http://www.humboldt.edu/ecomodel/documents/ABMPlatformReview.pdf,Accessed November 2015.

    [25] B. Schmidt. The modelling of human behaviour: The PECSreference model. In A. Verbraeck and W. Krug, editors,Proceedings of the14th European Simulation Symposium, pages13–18. SCS Europe BVBA, 2002.

    [26] P. Spirtes. Introduction to causal inference. The Journal ofMachine Learning Research, 11:1643–1662, 2010.

    [27] E. Staub. Predicting collective violence: The psychological andcultural roots of turning against others. In C. Summers andE. Markusen, editors, Collective Violence, pages 195–209.Rowman & Littlefield Publishers, Lanham, 1999.

    [28] C. Summers and E. Markusen, editors. Collective Violence:Harmful Behavior in Groups and Governments. Rowman &Littlefield Publishers, Lanham, 1999.

    [29] C. Swinerd and K. R. McNaught. Design classes for hybridsimulations involving agent-based and system dynamics models.Simulation Modelling Practice and Theory, 25:118–133, 2012.

    [30] C. Urban. PECS: A reference model for the simulation ofmultiagent systems. In R. Suleiman, K. G. Troitzsch, andN. Gilbert, editors, Tools and techniques for social sciencesimulation, pages 83 – 114. Physica-Verlag, 2000.

    [31] I. H. Witten and E. Frank. Data Mining: Practical machinelearning tools and techniques. Morgan Kaufmann, 2011.

    [32] J. Woodward. Causation and manipulability. In StanfordEncyclopedia of Philosophy.http://plato.stanford.edu/entries/causation-mani/,Accessed November 2015.

    [33] J. Woodward. Scientific explanation. In Stanford Encyclopedia of

  • 14 Kvassay, M.: A Contribution Towards the Discovery of Causal Relationships in Agent-Based Models of ...

    Philosophy.http://plato.stanford.edu/entries/scientific-

    explanation/, Accessed November2015.

    Selected Papers by the AuthorM. Kvassay, P. Krammer, L. Hluchý, B. Schneider. Causal Analysis of

    an Agent-Based Model of Human Behaviour. Complexity,Volume 2017, Article doi:10.1155/2017/8381954, 18 pages.

    M. Kvassay, L. Hluchý, Š. Dlugolinský, B. Schneider, H. Bracker, A.Tavčar, M. Gams, M. Contat, L. Dutka, D. Król, M. Wrzeszcz, J.Kitowski. A novel way of using simulations to support urbansecurity operations. Computing and Informatics, 34(6):1201–1233, 2015.

    M. Kvassay, L. Hluchý, P. Krammer, B. Schneider. Causal analysis ofthe emergent behavior of a hybrid dynamical system. ActaPolytechnica Hungarica, 11(4): 21–40, 2014.

    L. Hluchý, M. Kvassay, Š. Dlugolinský, B. Schneider, H. Bracker, B.Kryza, J. Kitowski. Towards more realistic human behavioursimulation: Modelling concept, deriving ontology and semanticframework. In Applied Computational Intelligence inEngineering and Information Technology : revised selectedpapers from the 6th IEEE International Symposium on AppliedComputational Intelligence and Informatics (SACI 2011), pages1–17. Springer, 2012.

    M. Laclavík, Š. Dlugolinský, M. Šeleng, M. Kvassay, B. Schneider,H. Bracker, M. Wrzeszcz, J. Kitowski, L. Hluchý. Agent-basedsimulation platform evaluation in the context of human behaviormodeling. In Advanced Agent Technology: AAMAS 2011Workshops (Revised Selected Papers), LNAI 7068, pages396–410. Springer, 2012.

    M. Laclavík, Š. Dlugolinský, M. Šeleng, M. Kvassay, E. Gatial, Z.Balogh, L. Hluchý. Email analysis and information extraction forenterprise benefit. Computing and Informatics, 30(1): 57–87,2011.