

A Multi-Formalism Framework to Generate Diagnostic Decision Support Systems

Giuseppe Cicala, Marco De Luca, Marco Oreggia, Armando Tacchella

KEYWORDS

Modeling and simulation of Cyber-Physical Systems, Knowledge-based modeling formalisms, Actor-based simulation.

ABSTRACT

The task of a Diagnostic Decision Support System (DDSS) is to deduce the health status of a physical system. In this paper, a multi-formalism framework to generate DDSS software based on formal descriptions of the application domain and the diagnostic computations is proposed. The key idea is to describe systems and related data with a domain ontology, and to describe diagnostic computations with an actor-based model. Implementation-specific code is automatically generated from such dual-formalism descriptions, while the structure of the DDSS is invariant across applications. An evaluation involving an artificial scalable domain related to the diagnosis of air conditioning systems is presented to exemplify and to test the proposed framework.

I. INTRODUCTION

Diagnostic Decision Support Systems (DDSSs) help humans deduce information about the health status of some observed physical system. From a practical point of view, the availability of digital sensors, reliable and high-capacity networks, and powerful processing units makes automated diagnosis applicable to an increasing number of systems. However, data and diagnostic rules remain domain-dependent, and the implementation of a DDSS requires the development of substantial portions of ad-hoc software which can hardly be reused. Indeed, while most of the existing literature about DDSSs focuses on improving performance in some domain of interest, to the best of our knowledge there is no contribution on generating customized DDSSs from high-level specifications.

The research presented in this paper attempts to fill this gap by developing a framework to generate customized DDSSs using a multi-formalism approach. Multi-formalism modeling (see, e.g., [GI13]) refers to tools and techniques wherein several different formalisms are exploited to achieve a specific goal. The combination of formalisms is useful whenever specifying a system with a single modeling language would be hard, if not impossible. In this paper, two “classical” AI formalisms are combined to generate DDSSs: systems are described with ontologies in the sense of [Gru95], i.e., “formal and explicit specification of conceptualizations”; diagnostic computations are described with actor-based models as introduced in [Agh85] with the extensions found in [LTSS11].

G. Cicala, M. Oreggia and A. Tacchella are with “Dipartimento di Informatica, Bioingegneria, Robotica e Ingegneria dei Sistemi” (DIBRIS), University of Genoa, Viale Causa 13, 16145 Genoa, Italy, E-mail: [email protected]; M. De Luca is with ABIRK Italia S.r.l., Corso MonteGrappa 1/1A, 16137 Genova, Italy, E-mail: [email protected]

In more detail, the choice of ontologies is motivated by their increasing popularity outside the AI community, mainly due to Semantic Web applications, and by the added flexibility that they provide over traditional relational data models, e.g., the ability to cope with taxonomies and part-whole relationships, and the ability to handle heterogeneous attributes. It should be noted that other proposals exist in the literature to extend the basic relational data model in order to handle more expressive domains, e.g., [CM94]. We consider ontologies because they provide a general-purpose, logically well-founded extension which also enjoys widespread use. Since it is expected that large quantities of data should be handled to provide meaningful input to the DDSS, the choice of the ontology language should be restricted to those designed for tractable reasoning, like the DL-Lite family introduced by [CGL+05]. The choice of actor-based models is motivated by their support for heterogeneous modeling, i.e., a situation wherein different parts of a system have inherently distinct properties, and therefore require different types of models. DDSSs are no exception to this pattern, since they are required to monitor and diagnose the behavior of heterogeneous systems, and they are themselves a composition of physical processes and computational elements.

Following the approach outlined above, a DDSS generator called DiSeGnO (for “Diagnostic Server Generation through Ontology”) has been developed. DiSeGnO outputs a DDSS given a formal description of the application domain (the domain ontology) and of the associated diagnostic computations (the diagnostic rules). DiSeGnO interprets such dual-formalism descriptions by generating a relational database from the domain ontology and then computing diagnostic rules using PTOLEMY II [EJL+03], an open-source software supporting experimentation with actor-based design. The generated DDSS is also wrapped by automatically generated web services which connect to the plant on one side, and to diagnostic dashboards on the other. The conversion of the ontology design to a database structure is a key element in DiSeGnO. It preserves the high-level description but, at the same time, it ensures quick access to data and leverages industry-standard database systems. The usage of PTOLEMY II as a rule engine enabled fast prototyping of DiSeGnO, and it might be replaced by a diagnostic rule compiler in practical applications. However, as the experimental analysis herein presented shows, even in its current PTOLEMY II-based implementation, DiSeGnO can process a substantial flow of data from an incoming (simulated) plant in real time. In this sense, our work is similar in spirit to [FMMV16], as both contributions propose to merge different formalisms in order to describe complex systems properly.

Proceedings 30th European Conference on Modelling and Simulation ©ECMS Thorsten Claus, Frank Herrmann, Michael Manitz, Oliver Rose (Editors) ISBN: 978-0-9932440-2-5 / ISBN: 978-0-9932440-3-2 (CD)

The rest of the paper is structured as follows. In Section II an introduction to ontologies and actor-based models is given. A case study about Heating, Ventilation and Air Conditioning (HVAC) systems in households is presented in Section III. In Section IV the architecture of DiSeGnO and the main components to generate DDSSs are presented. Finally, Section V shows the experimental evaluation of DiSeGnO on the HVAC case study. The paper is concluded in Section VI with some final remarks and an outline of a future research agenda.

II. BACKGROUND

Ontology-based data access (OBDA) relies on the concept of knowledge base, i.e., a pair $\mathcal{K} = \langle \mathcal{T}, \mathcal{A} \rangle$ where $\mathcal{T}$ is the terminological box (Tbox for short) specifying the intensional knowledge, i.e., known classes of data and relations among them, and $\mathcal{A}$ is the assertional box (Abox for short) specifying the extensional knowledge, i.e., factual data and their classification. Filling the Abox with known facts structured according to the Tbox is a process known as ontology population. One of the mainstream languages for defining knowledge bases is OWL 2 (Web Ontology Language Ver. 2) described in [MPSP+09]. Since OWL 2 is a World Wide Web Consortium recommendation, it is supported by several ontology-related tools. However, the logical underpinning of OWL 2 is the description logic SROIQ, whose decision problem is 2NEXPTIME-complete according to [Kaz08]. This makes the use of the full expressive power of OWL 2 prohibitive for an application like the one we are considering.

To retain most of the practical advantages of OWL 2, but to improve on its applicability, Motik et al. introduced OWL 2 profiles (see [MPSP+09]). Formally, an OWL 2 profile is a sub-language of OWL 2 featuring limitations on the available language constructs and their usage. In particular, the OWL 2 QL profile is described in the official W3C recommendation as “[the sub-language of OWL 2] aimed at applications that use very large volumes of instance data, and where query answering is the most important reasoning task.” Given our application domain, OWL 2 QL is more appealing than both OWL 2 and other profiles, because it guarantees that conjunctive query answering and the consistency of the ontology can be evaluated efficiently.

The logical underpinning of OWL 2 QL is DL-Lite$_R$, one of the members of the DL-Lite family; a detailed description of DL-Lite$_R$ can be found in [CGL+05]. The most important feature of OWL 2 QL in this context is that, using the mapping techniques introduced in [RMC12], it is possible to keep the terminological view to reason about data, while storing the Abox elements as records in a relational database. Formally, given a knowledge base $\mathcal{K} = \langle \mathcal{T}, \mathcal{A} \rangle$, it is possible to build a database with a set of relations (tables) $R_{\mathcal{K}}$ such that the query $\mathcal{K} \models \alpha$ can be translated to a relational algebra expression over $R_{\mathcal{K}}$ returning the same result set. The choice of OWL 2 QL guarantees that the mapping from $\mathcal{K}$ to $R_{\mathcal{K}}$ is feasible, and that the translation of ontology-based queries to SQL queries will yield polynomially bounded expressions. In this way, it is possible to take the best of the two approaches, i.e., use ontologies to define the conceptual view of the domain, and databases to store actual data and connect to the other performance-critical elements of the generated DDSS, like data I/O and processing components.
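The idea of answering a terminological query through plain SQL can be illustrated with a minimal sketch. It assumes a toy Tbox containing only concept inclusions and one table per concept; the `SUBCLASS_OF` map, the table layout and the `rewrite_concept_query` helper are hypothetical and do not reproduce the mapping technique of [RMC12].

```python
import sqlite3

# Hypothetical Tbox: each concept maps to its direct super-concept.
SUBCLASS_OF = {
    "AlarmEvent": "OutgoingEvent",
    "FaultEvent": "OutgoingEvent",
    "DescriptorEvent": "OutgoingEvent",
}

def subconcepts(concept):
    """Return the concept plus every concept that (transitively) specializes it."""
    found = [concept]
    for child, parent in SUBCLASS_OF.items():
        if parent == concept:
            found.extend(subconcepts(child))
    return found

def rewrite_concept_query(concept):
    """Rewrite 'all instances of concept' as a UNION over one table per concept."""
    selects = [f'SELECT id FROM "{c}"' for c in subconcepts(concept)]
    return " UNION ".join(selects)

# Toy Abox stored as rows of an in-memory relational database.
conn = sqlite3.connect(":memory:")
for table in subconcepts("OutgoingEvent"):
    conn.execute(f'CREATE TABLE "{table}" (id INTEGER PRIMARY KEY)')
conn.execute('INSERT INTO "AlarmEvent" (id) VALUES (1)')
conn.execute('INSERT INTO "FaultEvent" (id) VALUES (2)')

sql = rewrite_concept_query("OutgoingEvent")
instances = sorted(row[0] for row in conn.execute(sql))
```

In this sketch the rewritten query grows linearly with the number of sub-concepts, which gives a flavor of why restricting the ontology language keeps the SQL translation bounded.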

The following notation and definitions are from [LTSS11]. Let $S$ be a set of variables that take values in some universe $U$. A valuation over $S$ is a function $x : S \to U$ that assigns to each variable $v \in S$ some value $x(v) \in U$. The set of all valuations over $S$ is denoted by $\hat{S}$. If $x \in \hat{S}$, $v \in S$ and $\alpha \in U$, then $\{x \mid v \mapsto \alpha\}$ denotes the new valuation $x'$ obtained from $x$ by setting $v$ to $\alpha$ and leaving other variables unchanged. Timers are a special type of variables that take values in $\mathbb{R}_+$, i.e., non-negative real numbers. Let $\mathbb{R}_+^\infty$ denote the set $\mathbb{R}_+ \cup \{\infty\}$, where $\infty$ denotes positive infinity. Finally, let $\bot \in U$ and $\mathit{absent} \in U$ denote an “unknown” value and the “absence” of a signal at a particular point in time, respectively.

An actor is a tuple $A = (I, O, S, s_0, F, P, D, T)$ where $I$ is a set of input variables, $O$ is a set of output variables, $S$ is a set of state variables, and $s_0 \in \hat{S}$ is a valuation over $S$ representing the initial state; $F$ is the fire function, defined as $F : \hat{S} \times \hat{I} \to \hat{O}$, that produces output based on input and current state; $P$ is the postfire function, defined as $P : \hat{S} \times \hat{I} \to \hat{S}$, that updates the state based on the same information as the fire function; $D$ is a deadline function defined as $D : \hat{S} \times \hat{I} \to \mathbb{R}_+^\infty$; and $T$ is a time-update function defined as $T : \hat{S} \times \hat{I} \times \mathbb{R}_+ \to \hat{S}$. It is assumed that $F$, $P$, $D$, $T$ are total functions, and that $I$, $O$, and $S$ are pairwise disjoint. In the following, the terms input, output and state refer to valuations over $I$, $O$ and $S$, respectively.

Every actor $A$ defines a set of behaviors whose model is inspired by the semantic models of timed or hybrid automata. A timed behavior of $A$ is a sequence

$$s_0 \xrightarrow{x_0/y_0} s'_0 \xrightarrow{x'_0/d_0} s_1 \xrightarrow{x_1/y_1} s'_1 \xrightarrow{x'_1/d_1} s_2 \xrightarrow{x_2/y_2} s'_2 \cdots$$

where for all $i \in \mathbb{N}$, $s_i, s'_i \in \hat{S}$, $d_i \in \mathbb{R}_+$, $x_i, x'_i \in \hat{I}$, $y_i \in \hat{O}$,

$$y_i = F(s_i, x_i), \qquad s'_i = P(s_i, x_i), \qquad d_i \le D(s'_i, x'_i), \qquad s_{i+1} = T(s'_i, x'_i, d_i).$$


Fig. 1. Domain ontology for HVAC monitoring. Concepts are represented by ovals, concept inclusions (is-a relationships) are denoted by dashed arrows, roles are denoted by solid arrows, and attributes are denoted by dots attached to classes.

Fig. 2. Thermal model of a house with an HVAC unit sketched with PTOLEMY II graphical syntax. Gray boxes which include rectangles inside, e.g., House, are composite actors, whereas those with circles inside, e.g., Thermostat, are finite-state machines.

Intuitively, if $A$ is in state $s_i$ at some time $t \in \mathbb{R}_+$ and the environment provides input $x_i$ to $A$, then $A$ instantaneously produces output $y_i$ using the fire function $F$, and moves to state $s'_i$ using the postfire function $P$. The environment then proposes to advance time, but it does so “respecting” any restriction on the amount of time that may elapse. $A$ “declares” such restrictions by returning a deadline $D(s'_i, x'_i)$. Next, the environment chooses to advance time by some concrete delay $d_i \in \mathbb{R}_+$, making sure that $d_i$ does not violate the deadline provided by $A$. Finally, the environment notifies $A$ that it advanced time by $d_i$, and $A$ updates its state to $s_{i+1}$ accordingly, using the time-update function $T$.
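The fire, postfire, deadline and time-update cycle can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Actor` class, the `run` driver and the `ticker` example are hypothetical names, valuations are collapsed to plain Python values, and the environment is assumed to keep the same input while time advances (that is, $x'_i = x_i$).

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Actor:
    """An actor reduced to its initial state and the four functions F, P, D, T."""
    s0: Any
    fire: Callable[[Any, Any], Any]                # F(s, x) -> y
    postfire: Callable[[Any, Any], Any]            # P(s, x) -> s'
    deadline: Callable[[Any, Any], float]          # D(s', x') -> maximum delay
    time_update: Callable[[Any, Any, float], Any]  # T(s', x', d) -> next state

def run(actor, inputs, requested_delay):
    """Drive the actor through one timed behavior; return the list of (y_i, d_i)."""
    s, trace = actor.s0, []
    for x in inputs:
        y = actor.fire(s, x)
        s_post = actor.postfire(s, x)
        # The environment never advances time beyond the declared deadline.
        d = min(requested_delay, actor.deadline(s_post, x))
        s = actor.time_update(s_post, x, d)
        trace.append((y, d))
    return trace

# Hypothetical example: the state is the time left before the next "tick".
ticker = Actor(
    s0=0,
    fire=lambda s, x: "tick" if s == 0 else None,  # None stands for "absent"
    postfire=lambda s, x: 10 if s == 0 else s,     # re-arm the timer after a tick
    deadline=lambda s, x: s,                       # do not skip past the next tick
    time_update=lambda s, x, d: s - d,
)
trace = run(ticker, inputs=[None] * 4, requested_delay=4)
```

In this run the third delay is shortened from 4 to 2 because the actor declares a deadline of 2 time units, so the next tick is not missed.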

III. HVAC CASE STUDY

HVAC systems are a classic topic in diagnostics (see, e.g., [NAL+07]). Here, the model shown in Figure 2 is considered as an example (see footnote 1). This model takes into account topology, thermal properties of materials, and warmer characteristics, i.e., temperature of output hot air and flow rate. As shown in Figure 2, the main model components are the Thermostat, Warmer and House subsystems. Thermostat allows fluctuations within a certain range above or below the desired set point. If the House temperature drops below the set point minus the allowed fluctuation, Thermostat turns on Warmer to provide a hot air flow at a constant rate and temperature. The heat flow is expressed by

$$\frac{dQ_{warmer}}{dt} = (T_{warmer} - T_{room}) \cdot M_{dot} \cdot c$$

where $\frac{dQ_{warmer}}{dt}$ is the heat flow from Warmer to House, $c$ is the heat capacity of air at constant pressure, $M_{dot}$ is the air mass flow rate, $T_{warmer}$ is the temperature of hot air and $T_{room}$ is the air temperature in the house. The House subsystem calculates internal temperature variations. It takes into account the heat flow from Warmer and heat losses. Heat losses and the temperature time derivative are governed by

$$\frac{dQ_{losses}}{dt} = \frac{T_{room} - T_{outside}}{R_{eq}}$$

and

$$\frac{dT_{room}}{dt} = \frac{1}{M_{air} \cdot c} \cdot \left( \frac{dQ_{warmer}}{dt} - \frac{dQ_{losses}}{dt} \right)$$

where $M_{air}$ is the mass of air inside House and $R_{eq}$ is the equivalent thermal resistance of House. The DailyTempVar subsystem generates daily fluctuations of the outdoor temperature. Both inside and outside temperatures are affected by Gaussian noise to simulate readings from real sensors.

1 The house thermal model can be downloaded from http://www.mathworks.it/help/simulink/examples/thermal-model-of-a-house.html
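To make the interplay of the three equations concrete, the following sketch integrates them with a forward Euler step and a thermostat with hysteresis. It is not the PTOLEMY II model of the paper: the function name `simulate_house`, every numeric parameter, the constant outdoor temperature and the absence of sensor noise are illustrative assumptions.

```python
def simulate_house(steps, dt=1.0, t_set=20.0, band=1.0, t_out=10.0):
    """Forward-Euler integration of the house thermal model (temperatures in Celsius)."""
    # Illustrative parameters, not taken from the paper.
    c = 1005.4        # heat capacity of air at constant pressure [J/(kg K)]
    m_dot = 1.0       # air mass flow rate through the warmer [kg/s]
    t_warmer = 50.0   # temperature of the hot air [Celsius]
    m_air = 1470.0    # mass of air inside the house [kg]
    r_eq = 1e-3       # equivalent thermal resistance of the house [K/W]

    t_room, warmer_on, history = t_set, False, []
    for _ in range(steps):
        # Thermostat: switch on below the band, off above it, otherwise hold.
        if t_room < t_set - band:
            warmer_on = True
        elif t_room > t_set + band:
            warmer_on = False
        dq_warmer = (t_warmer - t_room) * m_dot * c if warmer_on else 0.0
        dq_losses = (t_room - t_out) / r_eq
        t_room += dt * (dq_warmer - dq_losses) / (m_air * c)
        history.append((t_room, warmer_on))
    return history

history = simulate_house(steps=3600)  # one simulated hour with a 1 s step
temperatures = [t for t, _ in history]
```

With these assumed values the room temperature oscillates around the set point, staying close to the band defined by the allowed fluctuation while the warmer switches on and off.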

The ontology of the HVAC domain is shown in Figure 1. The main concepts in the static part of the domain are System and DataSource. They are related by isInSystem, stating that every system has (possibly several) data sources attached to it. The hasSubsystem relationship indicates that one System could be composed of one or more SystemComponent, which are themselves subclasses of System. DataSource is the comprehensive class of elements that can generate diagnostic-relevant information. The main concept in the dynamic part of the ontology is DDSS, which receives instances of IncomingEvent and sends instances of OutgoingEvent. Notice that IncomingEvent instances are connected to DataSource instances by the role generates, denoting that all incoming events, i.e., data from the observed system, are generated by some data source, i.e., some field sensor. Also, every OutgoingEvent instance, i.e., every diagnostic event, relatesTo some instance of DataSource. This is because the end user must be able to reconstruct which data source(s) provided information that caused diagnostic rules to fire a given diagnostic event. OutgoingEvent specializes to AlarmEvent, FaultEvent and DescriptorEvent. Every OutgoingEvent instance is connected to one of the DiagnosticIndicator instances, i.e., the Alarm, Fault and Descriptor sub-concepts, by the reports relation, in order to have a reference message about the diagnostic rules.

Fig. 3. Functional architecture and work-flow of the current DiSeGnO framework.

Diagnostic rules of interest have been extracted from the literature on HVAC systems (see, e.g., [RWLF04]). In particular, assuming that there is a single fault in the system at any time, air filter obstruction, thermostat fault and pressure loss in the compressor are investigated. In case of air filter obstruction, a reduction of air flow in output from the warmer results in a slow temperature drift away from the comfort zone. If the thermostat ceases to work properly, e.g., because its state is stuck to either on or off, then the house temperature stays permanently away from the comfort zone. In case of pressure loss in the compressor, a loss of refrigerant charge happens which diminishes the capability of the compressor. The domain ontology, as well as the model rules herein described, are available on-line from

http://www.aimslab.org/disegno

IV. DISEGNO FRAMEWORK

A. Software architecture

Figure 3 shows the current functional architecture and work-flow of DiSeGnO, organized in three phases. In the USER phase, the domain ontology and the rules model are designed by the user. In the DiSeGnO phase, the system reads and analyzes both the domain ontology and the rules model. The output is code structured as shown in the DDSS phase. Here, input web services receive data from the observed physical system and record them in the generated data store. The rule engine feeds the diagnostic rules with records loaded from the data store and logs results, if any. Output web services can then be invoked to query the data store.

In the USER phase, the user is required to provide an ontology of the observed physical system, which must be written in the OWL 2 QL language. While this can be accomplished in several ways, the tool PROTEGE [GMF+03] is suggested because it is robust, easy to use, and it provides, either directly or through plug-ins, several add-ons that facilitate ontology design and testing. The diagnostic computation model must be a sound actor diagram generated by PTOLEMY II which describes the processing to be applied to incoming data in order to generate diagnostic events. The only additional requirement on the rules model is that the set of external inputs of the diagram must coincide with the incoming events described in the ontology. Analogously, the set of external outputs of the diagram must coincide with the outgoing events.

The DiSeGnO phase contains the actual DDSS generation system, which consists of two modules in the current implementation, namely the Data Store Generator and the Web Services Generator. Given the domain ontology, a data store is generated to record data and events. The data store is a relational database which is obtained by mapping the domain ontology to suitable tables. The Web Services Generator creates services whose interface asks for incoming events of the correct type (input web services) and services which can be queried to obtain diagnostic events (output web services). Currently, the working prototype uses the PTOLEMY II internal engine to run the rules model as if it were code run on top of an interpreter. This solution is straightforward to implement, but has the disadvantage of being potentially slow for real-world applications.

In the DDSS phase, the customized DDSS runs in a loop wherein (i) data is acquired from the observed system and stored in the internal database, (ii) the rules engine processes data and generates diagnostic events which are recorded in the database, and (iii) diagnostic data is served to end-user applications. The details of the data acquisition on the observed system are not of concern to the DDSS generated by DiSeGnO, because it is the responsibility of the observed system control logic to implement the data acquisition part. This choice effectively isolates the physical details of data acquisition from the rest of the DDSS. Similarly, the generated DDSS is not concerned with the details of displaying and representing diagnostic data, because these data are made available through output web services and it is the responsibility of the user applications to read such data and present them in a meaningful way.


procedure VISITONTOLOGY(onto, d, r)
    g ← new Graph()
    VISITONTOLOGYREC(onto, onto.getThing(), g, d, r)
    return g
end procedure

procedure VISITONTOLOGYREC(onto, c, g, d, r)
    for all Concept s ∈ c.getSubConcepts() do
        father ← NIL
        if c ≠ onto.getThing() then
            father ← g.getTable(c)
        end if
        T ← g.getTable(s)
        if T is NIL then
            T ← new Table(s)
            for all Attribute a ∈ d.getDataAttribute(s) do
                T.addAttribute(a)
            end for
            g.addNode(new Node(T))
            if father is not NIL then
                r.add(new Relationship(T, father, '1 to n'))
                g.addEdge(T, father)
            end if
            VISITONTOLOGYREC(onto, s, g, d, r)
        else
            if father is not NIL then
                r.add(new Relationship(T, father, '1 to n'))
                g.addEdge(T, father)
            end if
        end if
    end for
end procedure

Fig. 4. Main algorithms of the Data Store Generator component.

B. From ontology to Database

The ontology has to be divided into two interconnected parts, namely a static and a dynamic part. In the static part, the ontology should contain a description of the observed physical system including entities for each relevant (sub)system and relationships among them. This part, once populated with the actual systems to be observed, does not require further updates while monitoring. On the other hand, the dynamic part describes events, including both the ones generated by the observed system and its components, and those output by the DDSS. An event is always associated with a time-stamp, i.e., the time at which the event happens. Data associated with events can be of heterogeneous types, but we always distinguish between two kinds of events, i.e., those incoming to the DDSS from the observed system, and those outgoing from the DDSS. This distinction is fundamental, because DiSeGnO must know which events have to be associated with input and output web services, respectively. Furthermore, both kinds of events should be associated with the data sources, i.e., the elements of the static part which generate events or influence the generation of a diagnostic event.

As mentioned in the Introduction, the creation of a relational database from the ontology, i.e., the Tbox $\mathcal{T}$, allows efficient storage of the corresponding Abox $\mathcal{A}$. The knowledge base $\mathcal{K} = \langle \mathcal{T}, \mathcal{A} \rangle$ can still be queried seamlessly, e.g., by using the mapping techniques described in [RMC12]. The algorithm used by DiSeGnO to encode an OWL 2 QL ontology into the structure of a relational database reads the ontology model from an OWL file into the internal representation onto; then it parses onto and extracts the map dataMap between concepts and datatype properties and the map relMap between concepts and object properties (roles). At this point, it can visit the ontology by traversing the concept hierarchy with the function VISITONTOLOGY (see Figure 4), and it creates the graph dbGraph containing part of the relational model corresponding to onto, using dataMap as d and relMap as r. Finally, it builds the relational model by considering all the relationships, and translates it into a database, considering all the nodes of dbGraph and building corresponding tables and constraints.

In more detail, VISITONTOLOGY and its sister procedure VISITONTOLOGYREC (see Figure 4) perform a visit of the concept hierarchy contained in onto to create a corresponding graph stored in dbGraph. Since the concept hierarchy forms, by definition, a directed acyclic graph, a simplified implementation of depth-first search is sufficient to explore onto exhaustively. Inside VISITONTOLOGYREC a new table T, and a corresponding node in the graph g, is created for each concept contained in onto. Furthermore, all the datatype properties corresponding to the concept of T are retrieved from the map d and added to T. These will become attributes of the entity corresponding to the concept in the final relational database. Notice that a one-to-many relationship corresponding to the inheritance relation is added to r, the set of relationships extracted considering object properties in onto. As long as d is implemented with a constant-time access structure, the running time of VISITONTOLOGY is linear in the size of onto.
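The visit can be sketched in Python as follows. This is a simplified illustration rather than the DiSeGnO code: the ontology is assumed to be given as a plain dictionary from each concept to its direct sub-concepts, tables are reduced to attribute lists, and the names `visit_ontology`, `sub_concepts`, `data_attributes` and the sample attributes are hypothetical.

```python
THING = "Thing"  # root of the concept hierarchy

def visit_ontology(sub_concepts, data_attributes):
    """Build tables and '1 to n' relationships from a concept hierarchy.

    sub_concepts: dict mapping a concept to the list of its direct sub-concepts.
    data_attributes: dict mapping a concept to its datatype properties (the map d).
    """
    tables = {}          # concept -> attribute list (the nodes of the graph g)
    relationships = []   # (child table, father table, '1 to n') triples (the set r)

    def visit(concept):
        for sub in sub_concepts.get(concept, []):
            father = None if concept == THING else concept
            is_new = sub not in tables
            if is_new:
                tables[sub] = list(data_attributes.get(sub, []))
            if father is not None:
                relationships.append((sub, father, "1 to n"))
            if is_new:
                visit(sub)  # recurse only the first time a concept is met

    visit(THING)
    return tables, relationships

# Static part of the HVAC ontology, with made-up attributes.
hierarchy = {THING: ["System", "DataSource"], "System": ["SystemComponent"]}
attributes = {"System": ["name"], "DataSource": ["unit"]}
tables, relationships = visit_ontology(hierarchy, attributes)
```

Because a concept with several super-concepts is expanded only once, each concept yields one table while every inheritance edge still yields its own relationship.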

C. Rules Engine in Ptolemy

Database connection is guaranteed by a DatabaseManager actor that opens a connection and passes it to all actors accessing the database. Data are collected using generic DatabaseQuery actors that query the database via the specified DatabaseManager and provide results as arrays of records. Collected data are provided to other actors in the rule models according to their time stamp. Fault-detection rules are implemented in PTOLEMY II models using data-driven techniques. In particular, both rules 1 and 3 leverage Artificial Neural Networks (ANNs) trained with the Encog [Hea14] software. In the rule detecting fault 1, the ANN is used to estimate the value (in percentage) of air which is getting through the warmer. In case of fault 3, the ANN estimates the value of gas pressure in the compressor circuit. Both estimations are required because no physical sensors are available to measure those quantities directly: ANNs act as virtual sensors as in [KPJ06]. Rule 2 is based on statistical outlier detection on the population of time intervals between thermostat switching cycles. Outliers are identified by a finite state machine that assesses whether or not they fall within a set of numerical boundaries called fences. If the time interval between two consecutive warmer “on” statuses is bigger than the corresponding fence value, a thermostat stuck-at fault is recorded.

Fig. 5. PTOLEMY II model to detect air filter obstruction and generate corresponding alarms and faults.
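The fence check behind rule 2 can be illustrated with a short sketch. The text does not say how fences are computed, so Tukey's upper fence (third quartile plus 1.5 times the interquartile range) is used here purely as an assumption, and a list comprehension replaces the finite state machine of the PTOLEMY II model; the function names and the sample data are hypothetical.

```python
import statistics

def upper_fence(reference_intervals, k=1.5):
    """Tukey-style upper fence: Q3 + k * (Q3 - Q1), from fault-free intervals."""
    q1, _, q3 = statistics.quantiles(reference_intervals, n=4, method="inclusive")
    return q3 + k * (q3 - q1)

def stuck_at_faults(on_times, fence):
    """Return the (start, end) pairs of consecutive 'on' switches farther apart than fence."""
    return [(a, b) for a, b in zip(on_times, on_times[1:]) if b - a > fence]

# Intervals (in seconds) between warmer "on" events observed in normal conditions.
reference = [280, 290, 300, 310, 320]
fence = upper_fence(reference)

# Time stamps of warmer "on" events; the gap between 610 and 1510 is suspicious.
on_times = [0, 300, 610, 1510, 1800]
faults = stuck_at_faults(on_times, fence)
```

Only the 900-second gap exceeds the fence in this example, so a single stuck-at fault would be recorded.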

As an example, in Figure 5 the PTOLEMY II model related to the air filter obstruction is shown. The main model components are the DiagnosticComputationalModel, DecisionStateMachine, AlarmToDatabase and FaultToDatabase subsystems; data collection is not detailed in the figure. Bold arrows on the left of Figure 5 represent incoming data. The DiagnosticComputationalModel subsystem contains actors capable of organizing raw data in vectors to be fed to an ANN to estimate the percentage of air coming through the warmer (0% fully obstructed, 100% no obstruction). The moving average of the estimated value is used as input of the DecisionStateMachine subsystem, where the proper event (i.e., fault or alarm) is determined. The decision is based on two thresholds $t_1 = \sigma$ and $t_2 = 2\sigma$, where $\sigma$ is the standard deviation of the estimated percentage flow in normal conditions. If an alarm or a fault is detected, the corresponding event is inserted in the database by the AlarmToDatabase and FaultToDatabase subsystems. In the plot of Figure 6 an example behavior of the HVAC system leading to identification of an air filter obstruction is shown. To generate the profile shown in the figure, a fault is injected into a simulated HVAC filter for a specified time interval. The onset and the end of the faulty behavior are marked by arrows in the plot. The profile of the fault is assumed in this case to be trapezoidal, i.e., starting with no obstruction the air flow is gradually reduced to 70% of the capacity and then it is gradually restored. The behavior of Figure 6 corresponds to the generation of several alarm events as soon as threshold $t_1$ is exceeded due to the initial drift with respect to the normal behavior, and then fault events when threshold $t_2$ is exceeded due to persistent anomalous behavior.
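A minimal sketch of the two-threshold decision is given below. It assumes that the quantity compared against $t_1$ and $t_2$ is the absolute drift of the moving average from a nominal 100% flow, which the text does not state explicitly; the function name `classify_flow`, the window length and the sample values are likewise illustrative.

```python
from collections import deque

def classify_flow(estimates, sigma, nominal=100.0, window=3):
    """Label each estimated air-flow percentage as 'normal', 'alarm' or 'fault'."""
    recent = deque(maxlen=window)   # samples covered by the moving average
    events = []
    for value in estimates:
        recent.append(value)
        drift = abs(nominal - sum(recent) / len(recent))
        if drift > 2 * sigma:       # threshold t2 = 2 * sigma
            events.append("fault")
        elif drift > sigma:         # threshold t1 = sigma
            events.append("alarm")
        else:
            events.append("normal")
    return events

# Estimated flow dropping from 100% to 70%, as in a gradual filter obstruction.
events = classify_flow([100, 100, 90, 90, 70, 70], sigma=5, window=2)
```

As in the behavior described for Figure 6, the drift first crosses the lower threshold, producing an alarm, and then the higher one, producing fault events.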

D. Web Services

Data coming from physical systems are collected in XML files and sent to the DDSS input web service through the Internet. Because of potential security threats, files are digitally signed combining a message-digest algorithm with public-key cryptography. Encryption uses the symmetric-key algorithm available in the Java security API. The code that implements web services consists of a manually developed skeleton, which is invariant across applications, and application-dependent metadata. These are stored in tables inside DiSeGnO

[Figure 6: plot titled "Fault 1 output"; x axis: time [s]; y axis: temperature [°C]; the instants "Fault begins" and "Fault ends" are marked.]

Fig. 6. An example of air filter obstruction. The x axis reports time (seconds) and the y axis reports temperature (degrees Celsius). The blue profile is the normal behavior, whereas the red profile is obtained by injecting the fault in the system.

TABLE I
LOAD RESULTS OF THE DDSS FOR DIFFERENT CONFIGURATIONS.

#    Hits    Mean service time [s]   Mean SQL time [s]   SQL hits/min   HTTP hits/min   Mean client time [h:m:s]
5    3600    2.611                   0.597               236712         101             0:33:40
10   7200    4.073                   0.871               264612         113             1:02:01
15   10800   6.557                   1.330               309150         132             1:19:31
20   14400   9.070                   1.774               302727         129             1:49:49
25   18000   10.894                  2.123               315289         134             2:11:54

data store and contain all the information related to the queries that input and output web services have to execute. Metadata are leveraged by the skeleton to implement domain-specific behavior.
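The split between an invariant skeleton and application-dependent metadata can be sketched as follows. This is only an illustration of the idea, not the DiSeGnO implementation, which is written in Java: the table names `ws_metadata` and `sample`, their columns, and the function `call_service` are hypothetical, and SQLite is used only to keep the example self-contained.

```python
import sqlite3


def make_store():
    """Create an in-memory store with a metadata table and a data table."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE ws_metadata (service TEXT PRIMARY KEY, query_text TEXT NOT NULL);
        CREATE TABLE sample (system_id INTEGER, minute INTEGER, temperature REAL);
        """
    )
    # Application-dependent part: the queries each web service has to execute.
    db.executemany(
        "INSERT INTO ws_metadata VALUES (?, ?)",
        [
            ("input",
             "INSERT INTO sample VALUES (:system_id, :minute, :temperature)"),
            ("output",
             "SELECT minute, temperature FROM sample "
             "WHERE system_id = :system_id ORDER BY minute"),
        ],
    )
    return db


def call_service(db, service, params):
    """Invariant skeleton: look up the query registered for `service` and run it."""
    row = db.execute(
        "SELECT query_text FROM ws_metadata WHERE service = ?", (service,)
    ).fetchone()
    if row is None:
        raise KeyError(f"no metadata for service {service!r}")
    return db.execute(row[0], params).fetchall()


db = make_store()
call_service(db, "input", {"system_id": 1, "minute": 0, "temperature": 21.5})
rows = call_service(db, "output", {"system_id": 1})
```

In this sketch, moving to a different application domain only requires different rows in `ws_metadata` (and a different data schema); `call_service` stays unchanged.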

V. EXPERIMENTAL EVALUATION

The purpose of our experiments is to assess whether the DiSeGnO framework can be used in a real case, both in terms of absolute performance, i.e., time to store data coming from clients, and in terms of scalability, i.e., growth of computation time related to the quantity of data to be handled by the rules. Synthetic data are generated by PTOLEMY II models of the HVAC system shown in Figure 2. Data from temperature sensors and warmer apparatus are sampled (1 sample per simulation minute), collected in a file, and sent to the DDSS input web services through an HTTP connection every 60 simulation minutes. Experiments were performed on a set of six identical Intel-based PCs, each featuring a Core2Duo 2.13 GHz CPU and 4 GB of RAM, and running Ubuntu Linux 12.04 (64 bit edition). Five “client” PCs run household simulations, and one “server” runs the DDSS server generated by DiSeGnO. Each client PC simulates 1 to 5 HVAC systems running for 30 simulated days. Server performance is monitored using JavaMelody (footnote 2), an open-source tool to profile Java server applications.
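The sampling scheme above fixes the number of files each configuration sends to the server. The following short calculation, with hypothetical function and constant names, reproduces the "Hits" column of Table I from the stated parameters (one file per simulated hour, 30 simulated days).

```python
SAMPLES_PER_FILE = 60  # one sample per simulated minute, one file per simulated hour
FILES_PER_DAY = 24
SIMULATED_DAYS = 30


def expected_hits(n_systems, days=SIMULATED_DAYS):
    """Number of files that `n_systems` HVAC systems send to the DDSS server."""
    return n_systems * days * FILES_PER_DAY


def samples_per_signal(n_systems, days=SIMULATED_DAYS):
    """Number of samples received for one signal across all systems."""
    return expected_hits(n_systems, days) * SAMPLES_PER_FILE


for n in (5, 10, 15, 20, 25):
    print(n, expected_hits(n))
```

For 5 systems this gives 5 × 30 × 24 = 3600 files, and for 25 systems 18000 files, in agreement with Table I.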

Table I shows load results for different configurations obtained varying the number of HVAC systems (“#”)

2 https://code.google.com/p/javamelody


TABLE II
REAL-TIME PERFORMANCES OF THE DDSS.

Number of rules   Wall clock time [h:m:s]   User time [s]   System time [s]
1                 6:29:32                   10539           1665
2                 9:56:07                   24340           1652
3                 12:06:08                  31600           1628

connected to the DDSS server over a time span of 30 days. For each row, the table shows the number of files sent to the DDSS server (“Hits”), the mean time to serve each file (“Mean service time”), the mean time to execute SQL queries related to a single file (“Mean SQL time”), the number of SQL queries per minute (“SQL hits/min”), the number of HTTP requests per minute (“HTTP hits/min”), and the mean time required by the client to send all the data (“Mean client time”). Notice that the figures for mean service time and mean database access time refer to cumulative performance averaged over the number of hits. On the other hand, mean client time refers to cumulative performance averaged over the number of systems. For instance, the last line of the table refers to loading data from 25 systems running for 30 (simulated) days. Since the simulation on the clients is accelerated, it takes only about two hours (on average) for a client to send all the data it generates in this case. Clearly, if the number of systems to monitor grows, the throughput of the DDSS decreases and client time increases, linearly in all the experiments we consider. However, two hours is about 2 orders of magnitude less than 30 days, indicating that the DDSS generated by DiSeGnO could easily support more systems or more signals.

Table II shows the performance of the generated DDSS when varying the number of rules (1 to 3) applied to the biggest configuration loaded (25 systems running). Here, one can observe that the total wall clock time required on the server side is much less than 30 days, indicating that, even in its prototypical stage, the DDSS generated by DiSeGnO could run in real time. On the other hand, the CPU time required to process the diagnostic rules (“User time” plus “System time”), albeit a fraction of total wall clock time (from 45% in the case of one rule up to 73% in the case of three rules, counting user time alone), indicates that the current implementation is apt to scale better in the number of systems than in the number of rules.
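The CPU fractions quoted above can be recomputed from Table II. The sketch below uses the table values as read from the flattened source; the function names are hypothetical. It reports, for each number of rules, the share of wall clock time taken by user time alone and by user plus system time, together with the ratio between the simulated time span (30 days) and the wall clock time.

```python
def to_seconds(hms):
    """Convert an 'h:m:s' string into seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return 3600 * h + 60 * m + s


# Number of rules -> (wall clock time [h:m:s], user time [s], system time [s])
TABLE_II = {
    1: ("6:29:32", 10539, 1665),
    2: ("9:56:07", 24340, 1652),
    3: ("12:06:08", 31600, 1628),
}
SIMULATED_SECONDS = 30 * 24 * 3600  # 30 simulated days


def cpu_shares(wall, user, system):
    """Return (user share, user + system share) of the wall clock time."""
    wall_s = to_seconds(wall)
    return user / wall_s, (user + system) / wall_s


def real_time_factor(wall):
    """How many times faster than real time the data were processed."""
    return SIMULATED_SECONDS / to_seconds(wall)


for rules, (wall, user, system) in TABLE_II.items():
    user_share, total_share = cpu_shares(wall, user, system)
    print(rules, f"{user_share:.1%}", f"{total_share:.1%}",
          f"{real_time_factor(wall):.0f}x")
```

Under this reading of the table, user time alone accounts for about 45% of the wall clock time with one rule and about 73% with three rules, while user plus system time accounts for about 52% and 76%, respectively. Even with three rules the wall clock time is more than 50 times shorter than the simulated time span.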

VI. CONCLUSIONS

Summing up, this paper shows that it is possible to combine ontology-based system descriptions and actor-based rule computation models in the DiSeGnO framework to generate efficient DDSS software in a push-button way. In the current prototype implementation, DiSeGnO still relies heavily on PTOLEMY II to run the rules engine, potentially requiring more computation time than an equivalent, manually coded DDSS. However, even in its present prototypical stage, the system is usable in practice to diagnose small-to-medium scale systems with acceptable performance, as shown in Table I and Table II. One of the issues left for future extensions is to automatically compile the model of rules in order to improve performance, e.g., by generating code independent from PTOLEMY II. A practical implementation on a real industrial case study will be the final testing ground.

REFERENCES

[Agh85] G.A. Agha. Actors: A Model Of Concurrent Computation In Distributed Systems. PhD thesis, University of Michigan, 1985.

[CGL+05] D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, and R. Rosati. DL-Lite: Tractable Description Logics for Ontologies. In Proceedings of the National Conference on Artificial Intelligence, volume 20, page 602. Menlo Park, CA; Cambridge, MA; London; AAAI Press; MIT Press; 1999, 2005.

[CM94] Sharma Chakravarthy and Deepak Mishra. Snoop: An expressive event specification language for active databases. Data & Knowledge Engineering, 14(1):1–26, 1994.

[EJL+03] Johan Eker, Jorn W Janneck, Edward A. Lee, Jie Liu, Xiaojun Liu, Jozsef Ludvig, Sonia Sachs, Yuhong Xiong, and Stephen Neuendorffer. Taming heterogeneity - the Ptolemy approach. Proceedings of the IEEE, 91(1):127–144, 2003.

[FMMV16] Francesco Flammini, Stefano Marrone, Nicola Mazzocca, and Valeria Vittorini. Fuzzy decision fusion and multiformalism modelling in physical security monitoring. In Recent Advances in Computational Intelligence in Defense and Security, pages 71–100. Springer, 2016.

[GI13] Marco Gribaudo and Mauro Iacono. Theory and Application of Multi-Formalism Modeling. IGI Global, 2013.

[GMF+03] J.H. Gennari, M.A. Musen, R.W. Fergerson, W.E. Grosso, M. Crubezy, H. Eriksson, N.F. Noy, and S.W. Tu. The Evolution of Protege: An Environment for Knowledge-Based Systems Development. International Journal of Human-Computer Studies, 58(1):89–123, 2003.

[Gru95] T.R. Gruber. Toward principles for the design of ontologies used for knowledge sharing. International Journal of Human-Computer Studies, 43(5):907–928, 1995.

[Hea14] Jeff Heaton. Encog machine learning framework, 2014. https://github.com/encog.

[Kaz08] Y. Kazakov. RIQ and SROIQ are Harder than SHOIQ. In Description Logics, 2008.

[KPJ06] Sanem Kabadayi, Adam Pridgen, and Christine Julien. Virtual sensors: Abstracting data from physical sensors. In Proceedings of the 2006 International Symposium on a World of Wireless, Mobile and Multimedia Networks, pages 587–592. IEEE Computer Society, 2006.

[LTSS11] Edward A Lee, Stavros Tripakis, Christos Stergiou, and Chris Shaver. A modular formal semantics for Ptolemy. Technical report, University of California at Berkeley, Dept. of Electrical Engineering and Computer Science, 2011.

[MPSP+09] B. Motik, P.F. Patel-Schneider, B. Parsia, C. Bock, A. Fokoue, P. Haase, R. Hoekstra, I. Horrocks, A. Ruttenberg, U. Sattler, et al. OWL 2 Web Ontology Language: Structural Specification and Functional-Style Syntax. W3C Recommendation, 27, 2009.

[NAL+07] Setu Madhavi Namburu, Mohammad S Azam, Jianhui Luo, Kihoon Choi, and Krishna R Pattipati. Data-driven modeling, fault diagnosis and optimal sensor selection for HVAC chillers. IEEE Transactions on Automation Science and Engineering, 4(3):469–473, 2007.

[RMC12] M. Rodríguez-Muro and D. Calvanese. Quest, an OWL 2 QL Reasoner for Ontology-based Data Access. OWLED 2012, 2012.

[RWLF04] Kurt W Roth, Detlef Westphalen, Patricia Llana, and Michael Feng. The Energy Impact of Faults in US Commercial Buildings. In International Refrigeration and Air Conditioning Conference, 2004.