
[Lecture Notes in Computer Science] Multi-Agent Systems and Applications IV Volume 3690 || The Role of Ontologies in a Multi-agent Based Data Integration System


M. Pěchouček, P. Petta, and L.Z. Varga (Eds.): CEEMAS 2005, LNAI 3690, pp. 661 – 664, 2005. © Springer-Verlag Berlin Heidelberg 2005

The Role of Ontologies in a Multi-agent Based Data Integration System

Rahee Ghurbhurn1, Philippe Beaune1, and Hugues Solignac2

1 Génie Industriel et Informatique, Ecole des Mines de St Etienne, 158 cours Fauriel 42000 St Etienne, France {ghurbhurn, beaune}@emse.Fr

2 STMicroelectronics, zi Peynier Rousset 13790 Rousset [email protected]

Abstract. In this paper, we present a flexible architecture that allows applications to access heterogeneous, distributed manufacturing data. The objective is to eliminate data duplication and, with it, the need for synchronization and complex updates. We propose a multi-agent architecture based on ontologies for integrating the different data sources and retrieving the desired data. Web services are used for agent-application communication. This is ongoing work.

1 Introduction

Consider an information system composed of ERPs, data warehouses, databases and applications that exploit the data found in these repositories for statistical analysis, production planning, etc. Changes in the technological or business environment may lead to modifications of a data source’s physical or logical structure, or to the replacement of a data source or an application. Because the elements of the system are tightly coupled, through the use of ERPs or hard-coded queries, such changes affect the whole information system.

Moreover, data may be organised into two layers. Second layer data sources collect, aggregate and transform subsets of data from several first layer (master) data sources. Their update may be triggered manually, by the system administrator, or automatically at regular intervals. The applications forming the information system may be linked to first layer data sources, to second layer data sources, or to both. The problem with second layer data sources is that they must be synchronized with the first layer data sources. This synchronization is made more complex by the fact that the repositories may differ in storage (relational schema, table names) and in data structure (primary key, data type, data size), which makes it difficult to establish a mapping.

The objective of this paper is twofold. First, we show how a multi-agent system (MAS) and ontologies can help to achieve greater flexibility in the information system’s architecture, that is, reduce the impact on the architecture of adding, removing or modifying applications or data sources. In other words, we loosen the links between the applications and the data sources while providing a single, knowledge-based point of entry for information retrieval from multiple heterogeneous data sources. Second, we show how a MAS and ontologies can help to achieve flexible semantic integration of applications. The idea is to devise web services representing the business functions of the applications we want to integrate, and to use an ontology to convert the output of one application into an input format that another application can understand.

We propose, in this paper, a multi-agent system (MAS) [2] [3] [4] [7] that allows applications to retrieve data directly from the first layer data sources. Thus no update of second layer data sources is needed and users can retrieve all the desired attributes. The knowledge contained in the data sources and the relationships between them are defined in an ontology [1][5]. An application uses this ontology to formulate user queries in terms of concepts. The application sends these queries as messages to the MAS, which locates the corresponding data and retrieves it.

This paper is organized as follows. Section 2 briefly presents the subset of an information system’s architecture used in our research. Section 3 describes the proposed architecture, based on a data integration ontology [8] and a MAS. Section 4 concludes and outlines future work.

2 Information System’s Context

Let us consider a maintenance-planning problem in an integrated circuit manufacturing company. To illustrate our problem, let us suppose that we have three data sources (Maintenance, Human Resource and Equipment) and an application.

The Human Resource data source (HRDS) stores personal and training data. The training data is frequently updated because the environment evolves rapidly: the manufacturing staff is regularly trained on new processes, new equipment and new products to ensure a certain level of competence. Each training is valid for a limited period. Beyond this period, the staff members concerned are no longer authorized to work on the machines.

The manufacturing data source (MDS) is fed with data coming from the different pieces of equipment, whether production, testing or control equipment. The stored data is aggregated before being dispatched to more specific applications for monitoring and production planning tasks.

The equipment data source (EDS) stores information about past maintenance actions as well as documentation on the maintenance actions that apply to each piece of equipment.

The application is responsible for providing a list of machines to be serviced, together with the corresponding maintenance actions to be performed. It also provides a list of technicians authorized to perform these actions. To provide this information, the application has to access the three data sources.

The critical point here is that the manufacturing and human resource data are manually fed into the EDS. Thus there is no direct access to the original MDS and HRDS. This poses the problem of data synchronization and data update between the EDS and the other data sources. This synchronization is made more complex by the fact that the data stored in the EDS may differ in storage (relational schema, table names) and in data structure (primary key, data type, data size), which makes it difficult to establish a mapping.


We propose to use an ontology to model the knowledge contained in the data sources and relationships between them. This ontology is used by a MAS to retrieve the appropriate data before communicating it to the requesting application. Our proposal is explained in the following section.

3 Knowledge Retrieval and MAS

Our proposal consists in building an ontology, expressed in OWL, that models the domain knowledge of the targeted users. For each property of the model, we define the location of the corresponding data in the data sources. This association is done manually by the user or the administrator and consists in associating the data source attributes, retrieved by the resource agents, with the ontology’s properties. This approach is less tedious than the one followed by the Museum of Finland [6] as it requires less human intervention. Indeed, in our case, human intervention is limited to ontology building and a simple concept/attribute association, whereas in the case of the Museum of Finland the administrator has to build the ontology, write the XML rules corresponding to the concepts’ structure in the data sources, instantiate the rules with XQuery and choose the appropriate concept when there are multiple results.
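The property/attribute association described above can be pictured as a simple lookup table. The following is a minimal sketch, not the authors' implementation: the property names, table names and column names are hypothetical, and a plain Python dictionary stands in for the OWL ontology and its annotations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    """Where the data for one ontology property lives (all names hypothetical)."""
    source: str  # data source, e.g. "HRDS" or "EDS"
    table: str
    column: str


# Hypothetical associations, entered by hand by the user or administrator.
PROPERTY_LOCATIONS = {
    "Employee.hasId": [Location("HRDS", "employee", "emp_id")],
    "Training.validUntil": [Location("HRDS", "training", "valid_until")],
    "Equipment.hasId": [Location("EDS", "equipment", "equip_id")],
}


def locate(prop):
    """Return the data source locations associated with an ontology property."""
    try:
        return PROPERTY_LOCATIONS[prop]
    except KeyError:
        raise KeyError(f"no data source attribute associated with {prop!r}") from None


def sources_for(props):
    """Return the set of data sources needed to answer a query over these properties."""
    return {loc.source for prop in props for loc in locate(prop)}
```

With such a table, a query phrased in terms of `Employee.hasId` and `Equipment.hasId` is known to involve both the HRDS and the EDS before any data source is contacted.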

To link the applications to the MAS, web service connectors are defined for all the applications, which provides a standard means of plugging a new application into the MAS for data retrieval. During initialisation, the application sends a SOAP message to a query database and retrieves the available predefined queries. These queries are proposed to users, who compose and validate their own queries. A validated query is embedded in a SOAP message and sent to the MAS for data retrieval.
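To make the message exchange concrete, here is a minimal sketch of how a validated query might be wrapped in a SOAP envelope and decoded on the receiving side. It uses only the Python standard library and the SOAP 1.1 envelope namespace. The `urn:example:mas-query` namespace, the `Query` and `Param` element names and the query identifier are assumptions made for illustration; the paper does not specify the message schema.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
QUERY_NS = "urn:example:mas-query"  # hypothetical namespace for the payload


def build_query_message(query_id, params):
    """Wrap a predefined query and its parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    query = ET.SubElement(body, f"{{{QUERY_NS}}}Query", {"id": query_id})
    for name, value in params.items():
        param = ET.SubElement(query, f"{{{QUERY_NS}}}Param", {"name": name})
        param.text = value
    return ET.tostring(envelope, encoding="unicode")


def parse_query_message(xml_text):
    """Decode the envelope, as the receiving agent would, into (query id, parameters)."""
    root = ET.fromstring(xml_text)
    query = root.find(f"{{{SOAP_NS}}}Body/{{{QUERY_NS}}}Query")
    if query is None:
        raise ValueError("SOAP body carries no Query element")
    params = {p.get("name"): p.text for p in query.findall(f"{{{QUERY_NS}}}Param")}
    return query.get("id"), params


message = build_query_message(
    "is_authorized",  # hypothetical identifier of a predefined query
    {"employee": "ID145", "task": "158", "equipment": "xv156gt"},
)
query_id, params = parse_query_message(message)
```

A real connector would normally be generated from a WSDL description by a SOAP toolkit; building the XML by hand here only shows what travels between the application and the MAS.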

Fig. 1. The Proposed Architecture. Components shown: Application, WS Connector, Task Agent, Middle Agent, Ontology Agent, three Resource Agents, Monitoring Agent, Admin Agent and QueryDB; data sources: H. Resource, Maintenance and Equipment; functional areas: System Administration and Query Answering.

For example, in our context, one predefined query may be “Is employee having ID145 authorized to perform task number 158 on equipment xv156gt?” The task agent receives and decodes a SOAP message, locates the ontology agent by means of the middle agent, and sends it the query. The ontology agent automatically converts the query into appropriate SQL queries by means of the conversion matrix. The SQL queries are then dispatched to the appropriate resource agents, which retrieve the data and send it back to the task agent. The task agent sends the results, in a structured form, back to the web service via a SOAP message. A monitoring agent that computes performance indicators monitors all the agents.
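The query flow just described can be sketched in a few lines. This is an illustration under stated assumptions rather than the system itself: the conversion matrix contents, the table layouts and the function names are hypothetical, in-memory SQLite databases stand in for the real data sources, and plain function calls replace agent messaging and the middle agent lookup.

```python
import sqlite3

# Hypothetical conversion matrix: ontology property -> (source, table, column).
CONVERSION_MATRIX = {
    "Training.employeeId": ("HRDS", "training", "emp_id"),
    "Training.taskId": ("HRDS", "training", "task_id"),
    "Training.validUntil": ("HRDS", "training", "valid_until"),
    "Action.taskId": ("EDS", "maintenance_action", "task_id"),
    "Action.equipmentId": ("EDS", "maintenance_action", "equip_id"),
}


def to_sql(select, where):
    """Ontology agent role: build one parameterized SQL query per (source, table)."""
    plans = {}
    for prop in select:
        source, table, column = CONVERSION_MATRIX[prop]
        plans.setdefault((source, table), ([], [], []))[0].append(column)
    for prop, value in where.items():
        source, table, column = CONVERSION_MATRIX[prop]
        _, conds, args = plans.setdefault((source, table), ([], [], []))
        conds.append(f"{column} = ?")
        args.append(value)
    queries = {}
    for (source, table), (cols, conds, args) in plans.items():
        sql = f"SELECT {', '.join(cols) or '*'} FROM {table}"
        if conds:
            sql += " WHERE " + " AND ".join(conds)
        queries.setdefault(source, []).append((sql, tuple(args)))
    return queries


class ResourceAgent:
    """Wraps one data source; an in-memory SQLite database stands in for it here."""

    def __init__(self, setup_sql):
        self.conn = sqlite3.connect(":memory:")
        self.conn.executescript(setup_sql)

    def run(self, sql, args):
        return self.conn.execute(sql, args).fetchall()


def answer(agents, select, where):
    """Task agent role: dispatch each query to its resource agent and collect the rows."""
    results = {}
    for source, queries in to_sql(select, where).items():
        for sql, args in queries:
            results.setdefault(source, []).extend(agents[source].run(sql, args))
    return results


agents = {
    "HRDS": ResourceAgent(
        "CREATE TABLE training (emp_id TEXT, task_id INTEGER, valid_until TEXT);"
        "INSERT INTO training VALUES ('ID145', 158, '2099-12-31');"
    ),
    "EDS": ResourceAgent(
        "CREATE TABLE maintenance_action (task_id INTEGER, equip_id TEXT);"
        "INSERT INTO maintenance_action VALUES (158, 'xv156gt');"
    ),
}

rows = answer(
    agents,
    select=["Training.validUntil", "Action.equipmentId"],
    where={
        "Training.employeeId": "ID145",
        "Training.taskId": 158,
        "Action.taskId": 158,
        "Action.equipmentId": "xv156gt",
    },
)
```

In this sketch the example question is answered positively when both data sources return a row and the returned validity date has not passed. Values are passed as SQL parameters rather than spliced into the query text, which keeps the generated SQL independent of user input.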

The system is administered through a special agent, the admin agent, which allows the system administrator to build and maintain the ontology. Whenever the data structure of a data source changes (removal or addition of an attribute or table), the resource agent concerned sends a message to the ontology agent, via the admin agent, so that the impact on the ontology can be evaluated. For simple changes (addition or removal of attributes), we may allow the ontology agent to update the ontology itself; in more complex cases the agent sends an alert message to the ontology administrator. Another task of the admin agent is to allow new queries to be tested before they are proposed to the users via the query database (QueryDB). This function may prove useful when adding new data sources.
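The distinction between simple and complex changes can be expressed as a small decision rule. The sketch below is one possible reading of that paragraph, not the authors' design: the change kinds, the returned action labels and the idea of keeping newly added attributes in a list awaiting manual association are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional

# Changes the ontology agent may handle alone (an assumption based on the text).
SIMPLE_KINDS = {"add_attribute", "remove_attribute"}


@dataclass(frozen=True)
class SchemaChange:
    """A structural change reported by a resource agent (hypothetical message format)."""
    source: str
    kind: str  # "add_attribute", "remove_attribute", "add_table" or "remove_table"
    table: str
    attribute: Optional[str] = None


def assess_change(change, mapping, unassociated):
    """Decide how to react to a change.

    mapping: ontology property -> (source, table, column).
    unassociated: list of new attributes waiting for a manual association.
    Returns (action, affected ontology properties).
    """
    if change.kind not in SIMPLE_KINDS:
        # Table-level change: report what is touched and leave it to the administrator.
        affected = sorted(
            prop for prop, (src, tbl, _) in mapping.items()
            if (src, tbl) == (change.source, change.table)
        )
        return "alert_administrator", affected
    location = (change.source, change.table, change.attribute)
    if change.kind == "add_attribute":
        unassociated.append(location)  # no property refers to it yet
        return "ontology_updated", []
    affected = sorted(prop for prop, loc in mapping.items() if loc == location)
    for prop in affected:
        del mapping[prop]  # drop associations that point at the removed attribute
    return "ontology_updated", affected
```

Returning the list of affected properties alongside the action gives the administrator, or a query test run against QueryDB, a starting point for checking which predefined queries still work.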

4 Conclusion

In this paper we presented a data retrieval architecture based on a multi-agent system and ontologies. This architecture offers an alternative to data duplication and therefore avoids the need for data synchronization and the associated integrity controls. We are currently implementing our method for linking the ontology to the data sources. After this implementation phase, performance tests will be carried out and the results compared with the first method.

References

[1] T.R. Gruber. “Towards Principles for the Design of Ontologies Used for Knowledge Sharing”. International Workshop on Formal Ontology, N. Guarino & R. Poli (Eds.), Padova, Italy, 1993.

[2] P.M. Hatch. “Multiagent System Infrastructures for Information Integration on the Web”. 2001.

[3] N.R. Jennings, K. Sycara, M. Wooldridge. “A Roadmap of Agent Research and Development”. International Journal of Autonomous Agents and Multi-Agent Systems, 1(1), 1998, pp. 7-38.

[4] N.R. Jennings, M. Wooldridge. “Intelligent Agents: Theory and Practice”. The Knowledge Engineering Review, 10(2), 1995, pp. 115-152.

[5] N.F. Noy, D.L. McGuinness. “Ontology Development 101: A Guide to Creating Your First Ontology”. Stanford University, Stanford, CA, USA, 2001.

[6] V. Raatikka, E. Hyvönen. “Ontology-Based Semantic Metadata Validation”. HIIT Publications number 2002-03, Helsinki Institute for Information Technology (HIIT), Helsinki, Finland, 2002.

[7] K. Sycara, K. Decker, A. Pannu, M. Williamson, D. Zeng. “Distributed Intelligent Agents”. IEEE Expert, 1996.

[8] H. Wache. “Ontology-Based Integration of Information - A Survey of Existing Approaches”. IJCAI-01 Ontologies and Information Sharing Workshop, 2001.