Object Orientation in Multidatabase Systems

EVAGGELIA PITOURA, OMRAN BUKHRES, AND AHMED ELMAGARMID

Purdue University, West Lafayette, Indiana 47907-1398 ( [email protected]. edu)

A multidatabase system (MDBS) is a confederation of preexisting distributed, heterogeneous, and autonomous database systems. There has been a recent proliferation of research suggesting the application of object-oriented techniques to facilitate the complex task of designing and implementing MDBSs. Although this approach seems promising, the lack of a general framework impedes any further development. The goal of this paper is to provide a concrete analysis and categorization of the various ways in which object orientation has affected the task of designing and implementing MDBSs. We identify three dimensions in which the object-oriented paradigm has influenced this task: the general system architecture, the schema architecture, and the heterogeneous transaction management. Then we provide a classification and a comprehensive analysis of the issues related to each of the above dimensions. To demonstrate the applicability of this analysis, we conclude with a comparative review of existing multidatabase systems.

Categories and Subject Descriptors: C.2.4 [Computer-Communication Networks]: Distributed Systems—distributed applications, distributed databases; D.1.5 [Programming Techniques]: Object-oriented Programming; H.2.1 [Database Management]: Logic Design—data models, schema and subschema; H.2.3 [Database Management]: Languages—query languages; H.2.4 [Database Management]: Systems—distributed systems, query processing, transaction processing; H.2.5 [Database Management]: Heterogeneous Databases—data translation, program translation

General Terms: Design, Languages, Management, Standardization

Additional Key Words and Phrases: Distributed objects, federated databases, integration, multidatabases, views

Permission to make digital/hard copy of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication and its date appear, and notice is given that copying is by permission of ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. © 1995 ACM 0360-0300/95/0600-0141 $03.50

ACM Computing Surveys, Vol. 27, No. 2, June 1995

CONTENTS

1. INTRODUCTION
   1.1 Research Directions in Object-Oriented Multidatabase Systems
   1.2 A Reference Programming-Based Object Model
   1.3 Organization of This Paper
2. OBJECT-BASED ARCHITECTURES FOR DISTRIBUTED HETEROGENEOUS SYSTEMS
   2.1 MDBSs in Object-Based Architectures
   2.2 Standardization Efforts in Object-Based Architectures
3. MDBSs WITH AN OBJECT-ORIENTED COMMON DATA MODEL
   3.1 Object-Oriented Data Models Used as CDMs
   3.2 Multidatabase Languages
   3.3 Schema Translation
   3.4 Schema Integration
   3.5 Advantages of Adopting an Object-Oriented CDM
4. OBJECT-ORIENTATION AND TRANSACTION MANAGEMENT
   4.1 Trends in Transaction Management
   4.2 Object-Oriented Transaction Management
   4.3 Transaction Management in Multidatabases
   4.4 Bringing the Two Concepts Together
5. CASE STUDIES
   5.1 Pegasus
   5.2 ViewSystem
   5.3 OIS
   5.4 CIS
   5.5 The EIS/XAIT Project
   5.6 DOMS
   5.7 UniSQL/M
   5.8 Carnot
   5.9 Thor
   5.10 The InterBase Project
   5.11 The FIB Project
   5.12 Conclusions
6. SUMMARY

1. INTRODUCTION

The computing environment in most contemporary organizations contains distributed, heterogeneous, and autonomous hardware and software systems. There is an increasing need for technology to support cooperation of the services this software and hardware provides. The requirement for building systems that combine heterogeneous resources and services can be met at the low level of interconnectivity or at the higher level of interoperability [Manola et al. 1992; Soley 1992]. Interconnectivity simply supports system communication, while interoperability additionally allows systems to cooperate in the joint execution of tasks.

In this paper we focus on the special case in which the goal is to use and combine information and services provided by database systems. A multidatabase system (MDBS) [Elmagarmid and Pu 1990] is a confederation of preexisting, autonomous, and possibly heterogeneous database systems. The preexisting database systems that participate in the confederation are called local or component database systems. The high-level architecture of a multidatabase system is depicted in Figure 1. The “global” layer provides access to the underlying local systems. The complexity of this layer varies from allowing direct access to each local system to providing sophisticated seamless access with control flow facilities.

Creating a MDBS is complicated by the heterogeneity and autonomy of its component systems. Heterogeneity manifests itself through differences at the operating, database, hardware, or communication level of the component systems [Sheth and Larson 1990]. Here we concentrate on the types of heterogeneities caused by differences at the database system level, including discrepancies among data models and query languages and variability in system-level support for concurrency, commitment, and recovery. Building multidatabase systems is further complicated by the fact that the component databases are autonomous, have been independently designed, and operate under local control.

Recently, many researchers have suggested using object-oriented techniques to facilitate building multidatabase systems. Object-oriented techniques, which originated in the area of programming languages, have been widely applied in all areas of computer science including software design, database technology, artificial intelligence, and distributed systems. Although using them in building multidatabases seems promising, the lack of a common methodology impedes any further development.

This survey analyzes the various ways in which object-oriented techniques have influenced the design and operation of multidatabases. Our goal is to classify the approaches proposed and provide a comprehensive analysis of the issues involved. Although this survey is self-contained, a familiarity with basic database concepts (e.g., database textbooks such as Özsu and Valduriez [1991]) and with the basic principles of object orientation (e.g., the survey paper by Wegner [1987]) will facilitate understanding the issues involved.

1.1 Research Directions in Object-Oriented Multidatabase Systems

In this section we classify the ways in which object technology has influenced the design and implementation of multidatabase systems. First, the application of object-oriented concepts in system architectures provided a natural model for heterogeneous, autonomous, and distributed systems. According to this architectural model, called Distributed Object Architecture, the resources of the various systems are modeled as objects, while the services provided are modeled as the methods of these objects. Methods constitute the interface of the objects. In the special case in which the systems are database systems, the resources are the information stored in the database and the facilities provided are efficient methods of retrieving and updating this information.

Figure 1. The high-level architecture of a multidatabase system.

Second, object technology has been used in multidatabase systems at a finer level of granularity. The information stored in a database is structured according to a data model. When a component database participates in a multidatabase system, its data model is mapped to a data model that is the same for all participating systems, the Common (or Canonical) Data Model (CDM). Several researchers have recently advocated the use of an object-oriented data model as the CDM. The objects of the database model are of a finer granularity than the distributed objects: at one extreme, an entire component database may be modeled as a single distributed complex object [Li and McLeod 1991].

In a multidatabase system, multiple users simultaneously access various component systems. Heterogeneous transaction management deals with maintaining the consistency of each component system individually and of the multidatabase system as a whole. Object technology has also influenced a number of aspects of heterogeneous transaction management. It offers an efficient method of modeling and implementation, facilitates the use of semantic information, and has independently introduced the notion of local transaction management.

Summarizing, we can identify the following three dimensions in which the object-oriented paradigm has influenced the design and implementation of multidatabase systems:

(1) system architectures have been influenced by the introduction of distributed object-based architectures;

(2) schema architectures have been influenced by the use of an object-oriented common data model; and

(3) transaction management has been influenced by the application of techniques from object-oriented transaction management.

The above dimensions are orthogonal in the sense that systems may support object-orientation on one dimension but not necessarily on others. For example, a database system having a relational common data model can participate in a distributed-object architecture by being considered a (large) distributed object. Analogously, database systems with object-oriented common data models can participate in nonobject-based system architectures. Moreover, systems that do not support objects can use object-oriented techniques in developing their transaction management schemes. In the following sections, the relationships among the above dimensions will be further clarified.

Although all the above combinations are viable, a fully object-oriented multidatabase should support the same object model at all dimensions to avoid confusions, incompatibilities, or errors and repetitions in implementation. However, different requirements are placed on each one of these dimensions, resulting in data models that emphasize different features. Thus, different object-oriented data models have been introduced for the architecture, schema, and transaction management level. At the level of system architectures, models tend to be programming-based and focus on such issues as efficient ways of implementing remote procedure calls and naming schemas. At the level of schema architectures, models tend to be database-oriented, support persistency and database functionality, and have extended view-definition facilities. Finally, at the transaction-management level, models usually support active objects appropriate for modeling transactions and their interaction. In addition, at the transaction-management level, different approaches utilize different features of the object model in the pursuit of efficiency. In this paper we first provide a reference programming-based model and then highlight variations of this model appropriate for each of the above dimensions.

1.2 A Reference Programming-Based Object Model

Object orientation [Wegner 1987] is an abstraction mechanism in which the world is modeled as a collection of independent objects that communicate with each other by exchanging messages. An object is characterized by its state and behavior and has a unique identifier assigned to it upon its creation. The state of an object is defined as the set of values of instance variables. The value of an instance variable is also an object. The behavior of an object is modeled by the set of operations or methods that are applicable to it. Methods are invoked by sending messages to the appropriate object. The state of an object can be accessed only through messages; thus, the implementation of an object is hidden from other objects.

Each object is an instance of a class. A class is a template (cookie-cutter) from which objects may be created. All objects of a class have the same kind of instance variables, share common operations, and therefore demonstrate uniform behavior. Classes are also objects. The instance variables of a class are called class variables and the methods of a class are called class methods. Class variables represent properties common to all instances of the class. A typical class method is new, which creates an instance of the class.

Classes are organized in a class hierarchy. When a class B is defined as a subclass of a class A, class B inherits all the methods and variables of A. A is called a superclass of B. Class B may include additional methods and variables. Furthermore, class B may redefine (overwrite) any method inherited from A to suit its own needs. Inheritance from a single superclass is called single inheritance; inheritance from multiple superclasses is called multiple inheritance. Some systems also consider classes to be instances of classes called metaclasses. Metaclasses define the structure and behavior of their instance classes. The metaclass concept is a very powerful one, since it provides systems with the ability to redefine or refine their class mechanism.
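The notions of class variable, inherited and redefined method, and metaclass can be made concrete with a minimal Python sketch. The names `CountingMeta`, `Account`, and `SavingsAccount` are hypothetical and serve only as illustration; Python's metaclass facility is used here as one possible realization of the reference model, not as the model itself.

```python
class CountingMeta(type):
    """Hypothetical metaclass: fixes behavior shared by its instance classes."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        cls.created = 0                      # one counter per class

    def __call__(cls, *args, **kwargs):
        # Refine the class mechanism itself: count every instance created.
        # Calling the class plays the role of the class method "new".
        cls.created += 1
        return super().__call__(*args, **kwargs)


class Account(metaclass=CountingMeta):
    rate_percent = 0                         # class variable, common to all instances

    def __init__(self, balance):
        self.balance = balance               # instance variable (object state)

    def yearly_interest(self):
        return self.balance * self.rate_percent // 100

    def describe(self):
        return "account"


class SavingsAccount(Account):               # subclass B of class A (single inheritance)
    rate_percent = 5

    def describe(self):                      # redefines the inherited method
        return "savings " + super().describe()
```

In this sketch `SavingsAccount` inherits `yearly_interest` unchanged, redefines `describe`, and is itself an instance of `CountingMeta`, so the metaclass decides how instance creation behaves for both classes.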

The relations typically supported by the object-oriented model are: the classification or instance-of relation between an object and the class (typically one) of which it is an instance, the generalization/specialization or is-a relation between a class and its superclasses, and the aggregation relation between an object and its instance variables. We discuss briefly below some design alternatives of the basic model.

Delegation versus Inheritance. Inheritance is a mechanism for incremental sharing and definition in class hierarchies. An alternative mechanism, independent of the concept of class, is delegation. Delegation [Stein 1987; Lieberman 1986; Wegner 1987] is a mechanism that allows objects to delegate responsibility for performing an operation to one or more designated ancestors. A key feature is that when an object delegates an operation to one of its ancestors, the operation is performed in the environment (scope) of the ancestor.
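A small sketch may help to contrast delegation with inheritance. It assumes a hypothetical `Delegator` object that forwards any operation it cannot perform itself to a designated ancestor object; the names are illustrative only. Because the forwarded operation is a method bound to the ancestor, it runs against the ancestor's state, which mirrors the key feature stated above.

```python
class Printer:
    """An object that knows how to perform the operation itself."""

    def __init__(self, name):
        self.name = name

    def render(self, text):
        return f"[{self.name}] {text}"


class Delegator:
    """An object with no class relation to Printer that delegates to one."""

    def __init__(self, name, ancestor):
        self.name = name
        self.ancestor = ancestor

    def __getattr__(self, operation):
        # Reached only when normal lookup fails on this object.
        if operation == "ancestor":
            raise AttributeError(operation)  # avoid recursion before __init__ runs
        # Hand the operation to the ancestor; the bound method that comes
        # back executes in the ancestor's scope (its own `name`).
        return getattr(self.ancestor, operation)


office = Printer("office")
laptop = Delegator("laptop", office)
```

Here `laptop.render("hi")` is answered by `office`, using the name stored in `office`, although `Delegator` and `Printer` share no class hierarchy.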

Method Resolution. Since a class may provide a different implementation for an inherited method, methods are overloaded in object-oriented systems. The selection of the appropriate method is called method resolution. With single inheritance (where the class hierarchy is a tree), when a message is sent to an object of a class A the most specific method is used; that is, the method defined in the nearest ancestor class of A. This resolution method is also applied in the case of multiple inheritance, although the problem there is complicated by the fact that the same method may be defined in more than one of A’s superclasses. In such an instance, there is no default resolution method for specifying which of the multiply-defined methods A should inherit. Some systems support multimethods, which are methods that involve, as arguments, more than one object and in which the classes of all the arguments are considered in selecting the appropriate method during resolution [Dayal 1989; Heiler and Zdonik 1990].

Subtyping versus Subclassing. Subtyping rules are rules that determine which objects are acceptable in a specific context. Every object of a subtype can be used in any context where an object of any of its supertypes could be used. Although some systems relate subtyping and subclassing, subtyping should be based on the behavior of objects in order to increase flexibility [Snyder 1986]. If instances of type A meet the external specification of class B, A should then be a subtype of B, irrespective of whether A is a subclass of B. Conformance [Blair et al. 1989; Raj et al. 1991] is a mechanism for implicitly deriving subtyping relations based on behavioral specifications.

1.3 Organization of This Paper

The remainder of this paper is organized as follows. In Section 2, we discuss the first direction, namely distributed object-based architectures. Since the focus of this paper is on the integration of database systems, the influence of the architecture on multidatabase systems is stressed. Thus, this section does not exhaust this very important research area (for a review on this topic, see for example Nicol et al. [1993]). The following two sections discuss the other two directions in detail. In Section 3, we describe how the object-oriented model has been adapted to serve as the data model of the multidatabase and its role in facilitating the tasks of schema translation and integration. In Section 4, we discuss the impact of object-oriented transaction management in multidatabases. Finally, Section 5 is a comparative review of existing multidatabase systems that adopt object-oriented techniques in one or more of the above directions.

2. OBJECT-BASED ARCHITECTURES FOR DISTRIBUTED HETEROGENEOUS SYSTEMS

A popular way [Manola et al. 1992; Nicol et al. 1993] of modeling a distributed heterogeneous system is as a distributed collection of interacting objects that represent the distributed system resources. Each component system defines an interface of services and provides an implementation for these services. A client interacts with the heterogeneous system by issuing requests expressed in a common language. Distributed object managers are responsible for translating the client’s requests in terms of the available services, for directing these requests to the appropriate systems, and for providing the response expressed in the same common language.

The use of objects to model distributed components accommodates both the heterogeneity and the autonomy requirements. The modeling of distributed resources as objects supports heterogeneity because the messages sent to a distributed component depend only on its interface and not on the internal implementation of the method or the component. This approach also respects autonomy because the components may operate independently and transparently, provided that their interfaces remain unchanged. The ultimate goal of distributed object-based architectures is the construction of a heterogeneous system in which all system resources may be treated as a commonly accessible collection of objects that can be recombined in arbitrary ways to provide new information accessing capabilities.
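The following toy sketch illustrates why dependence on the interface alone accommodates heterogeneity. It is an assumption-laden illustration, not an architecture taken from any surveyed system: `ComponentObject`, `ObjectManager`, and the `price` service are hypothetical names, and the "common language" is reduced to a message name plus arguments.

```python
class ComponentObject:
    """A distributed resource; clients see only its named services."""

    def __init__(self, services):
        self._services = dict(services)      # message name -> implementation

    def interface(self):
        return frozenset(self._services)

    def send(self, message, *args):
        if message not in self._services:
            raise LookupError(f"service not in interface: {message}")
        return self._services[message](*args)


class ObjectManager:
    """Directs requests in the common language to the right component."""

    def __init__(self):
        self._components = {}

    def register(self, name, component):
        self._components[name] = component

    def request(self, target, message, *args):
        return self._components[target].send(message, *args)


# Two heterogeneous implementations behind the same interface.
prices_as_dict = {"bolt": 3, "nut": 1}
prices_as_rows = [("bolt", 4), ("nut", 2)]


def lookup_in_rows(item):
    return next(price for name, price in prices_as_rows if name == item)


manager = ObjectManager()
manager.register("site_a", ComponentObject({"price": prices_as_dict.__getitem__}))
manager.register("site_b", ComponentObject({"price": lookup_in_rows}))
```

A client issues the same request, `manager.request(site, "price", "bolt")`, to either site. Each site may change its storage freely, which reflects the autonomy argument, as long as the `price` service stays in its interface.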

2.1 MDBSs in Object-Based Architectures

Object-based architectures offer means of integrating applications across technology domains, including the domains of GUIs, file systems, database systems, and programming languages. MDBSs focus on issues within the database system domain. As a consequence, an object model for a heterogeneous system that includes components that are database systems should be a specialization of the object model of the general object-based system architecture that will support database functionality such as persistence, querying, transactions, and concurrent sharing.

In the special case where an object-oriented component database system participates in a heterogeneous system with an object-based architecture, there are two types of objects, the local objects supported by the component database and the distributed objects of the heterogeneous system. The component object-oriented database system supports millions of fine-grained objects. Providing clients of the heterogeneous system with direct access to these objects may involve significant overhead or may violate the autonomy or security of the database. Instead, the heterogeneous system may provide access to a containing object, which in the extreme case may be the whole database. Then, the containing object can handle the requests. In this case, the local, fine-grained objects are hidden from the client of the heterogeneous system. In other words, the distributed objects of the heterogeneous system may not correspond directly to the local objects of the component database system.
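As a rough illustration of the containing-object idea, the sketch below wraps a set of fine-grained local objects in a single coarse object. All names (`Employee`, `PersonnelDatabase`, `names_earning_at_least`) are hypothetical, and the in-memory dictionary merely stands in for a component database.

```python
class Employee:
    """Fine-grained local object, private to the component database."""

    def __init__(self, oid, name, salary):
        self.oid, self.name, self.salary = oid, name, salary


class PersonnelDatabase:
    """Containing object: the only object the heterogeneous system sees."""

    def __init__(self, employees):
        self._employees = {e.oid: e for e in employees}

    def names_earning_at_least(self, threshold):
        # Answer with plain values; no reference to a local object escapes.
        return sorted(e.name for e in self._employees.values()
                      if e.salary >= threshold)

    def count(self):
        return len(self._employees)


personnel = PersonnelDatabase([
    Employee(1, "Ada", 120), Employee(2, "Bob", 80), Employee(3, "Eve", 100),
])
```

Clients send requests to `personnel` only. The `Employee` objects never cross the boundary, so the component database keeps control over how, and whether, they are exposed.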

Finally, research in architectures for distributed systems has concentrated on interconnectivity issues and has not yet addressed interoperability aspects. Thus, most research on integrating information resources is expressed in terms of database integration of the schemas of the component databases.

2.2 Standardization Efforts in Object-Based Architectures

The impact of object-orientation in the architecture of heterogeneous distributed systems is also evident in the fact that most standardization efforts in this direction are based on the object model. In the following, we describe the prevailing such approach, namely the OMG, and report briefly on some others. In the long run, future compliance of a database system with the standards will ease the task of building multidatabase systems. In the short run, new multidatabase systems should take into consideration such standards in defining their interfaces. Standardization efforts towards defining a standard object model are also being made in the database community. We discuss two of these efforts, SQL3 and ODMG, in Section 3.1.1, since they are pertinent to database modeling. These efforts define the standard services that should be provided by each component database system and in this regard are extensions of the object models defined for distributed object-based systems.

Figure 2. (a) A request being sent through the object request broker; (b) object manager architecture.

Object Management Group. The Object Management Group (OMG) [Soley 1992] is developing a suite of standards addressing the integration of distributed applications through object technology. The architecture proposed by OMG, the Object Management Architecture (OMA), is depicted in Figure 2. In the OMA model, every piece of software is represented as an object. Objects communicate with other objects via the Object Request Broker (ORB), which is the key communication element. ORB provides the mechanisms by which objects transparently make requests and receive responses. Figure 2a shows a request being sent by a client to an object implementation: the client is the entity that wishes to perform an operation on an object and the object implementation is the code and data that actually implement the object.

The OMG categorizes objects into three broad categories: Application Objects, Object Services, and Common Facilities (Figure 2b). Application Objects is a placeholder for objects that belong to the specific applications that are being integrated. The Common Facilities comprise general facilities useful in many applications that will be made available through OMA-compliant interfaces. The Object Services provide the main functions for implementing basic object functionality using the ORB, e.g., the logical modeling and the physical storage of objects. The proposed standard for the ORB, CORBA (Common Object Request Broker Architecture) [OMG 1991], supports a general Interface Definition Language (IDL) that may be mapped to any implementation language. ORB provides for location transparency and permits the integration of applications via wrappers to implement CORBA-compliant behavior.
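The role of a broker and of wrappers can be sketched in a few lines. The code below is a toy model written for this discussion; it does not use or imitate the CORBA API or IDL, and every name in it (`ToyBroker`, `LegacyInventory`, `InventoryWrapper`, the `COUNT:` command format) is a hypothetical assumption.

```python
class ToyBroker:
    """Toy stand-in for a request broker: routes requests by object reference."""

    def __init__(self):
        self._implementations = {}

    def bind(self, object_ref, implementation):
        self._implementations[object_ref] = implementation

    def invoke(self, object_ref, operation, *args):
        # The client names an object and an operation; it never learns
        # where or how the object implementation is realized.
        implementation = self._implementations[object_ref]
        return getattr(implementation, operation)(*args)


class LegacyInventory:
    """Preexisting application with its own non-standard entry point."""

    _stock = {"bolt": 40, "nut": 12}

    def run(self, command):
        verb, item = command.split(":")
        if verb == "COUNT":
            return self._stock.get(item, 0)
        raise ValueError(command)


class InventoryWrapper:
    """Wrapper that presents the legacy code as an object with operations."""

    def __init__(self, legacy):
        self._legacy = legacy

    def count(self, item):
        return self._legacy.run(f"COUNT:{item}")


broker = ToyBroker()
broker.bind("inventory", InventoryWrapper(LegacyInventory()))
```

The client's request `broker.invoke("inventory", "count", "bolt")` is expressed purely in terms of the wrapper's operations; the legacy command string stays hidden behind the wrapper.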

All objects are expressed in a common Object Model [OMG 1992]. In this model, subtyping is based not on subclassing (Section 1.2) but rather on the behavior of objects. Inheritance is defined between interfaces. An interface is a description of a set of possible operations that a client may request of an object. An interface type is the type that is satisfied by any object that complies with a particular interface. An interface can be derived from another interface, which is then called a base interface of the derived interface. A derived interface inherits all elements (variables and methods) of the base interface and may redefine them or define new elements.
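Interface types that are satisfied by behavior rather than by subclassing can be approximated with Python's `typing.Protocol`. The sketch below is an analogy under stated assumptions, not the OMG Object Model: the interfaces `Readable` and `ReadWritable` and the two store classes are hypothetical, and the runtime check only tests that the required methods exist, not their signatures.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Readable(Protocol):                    # base interface
    def read(self, key: str) -> str: ...


@runtime_checkable
class ReadWritable(Readable, Protocol):      # derived interface: inherits read
    def write(self, key: str, value: str) -> None: ...


class MemoryStore:                           # no subclass relation to either interface
    def __init__(self):
        self._data = {}

    def read(self, key):
        return self._data[key]

    def write(self, key, value):
        self._data[key] = value


class ReadOnlyStore:
    def read(self, key):
        return "constant"


def fetch(source: Readable, key: str) -> str:
    # Any object that complies with the Readable interface is acceptable.
    return source.read(key)
```

`MemoryStore` satisfies both interface types and `ReadOnlyStore` satisfies only the base interface, even though neither class names the interfaces in its definition.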

Other Efforts. In addition to the OMG standardization efforts, ISO and CCITT are also working on a joint standardization effort known as Open Distributed Processing (ODP) [Taylor 1992]. ODP’s goal is the development of a reference model to integrate a wide range of future ODP standards for distributed systems. The support of object-orientation in commercial systems and in standardization efforts for heterogeneous processing is examined in Nicol et al. [1993]. Support in commercial systems for heterogeneous processing includes OSF’s DCE (a set of tools and services to support distributed applications) and BBN’s Cronus system (a system that provides operating system and communication services). Finally, X3H7, a new ANSI/X3 technical committee, has the mission of harmonizing the object-oriented aspects of standards developed by other committees [Kent 1993].

3. MDBSs WITH AN OBJECT-ORIENTED COMMON DATA MODEL

The traditional 3-level architecture [Tsichritzis and Klug 1978] used to describe the schema architecture of a centralized database system has been expanded [Devor et al. 1982] to describe the architecture of a MDBS. In this 5-level architecture (see Figure 3, adapted from Sheth and Larson [1990]), the conceptual schema of each component database is called the local schema. The local schema is expressed in the native data model of the component database; thus, the local schemas of different component databases may be expressed in different data models. To facilitate access to the system, most approaches translate the local schemas into a common data model, called the canonical or common data model (CDM). The schema derived by this translation is called the component schema. Each database participates in the federation by exporting a part of its component schema, called the export schema. A federated or global schema is created by the integration of multiple export schemas. Finally, for customization or access control reasons, an external schema is created to meet the needs of a specific group of users or applications.
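The five levels can be traced with a deliberately simplified sketch. It assumes that a schema is just a mapping from class (or relation) names to attribute names, that translation into the CDM is a renaming, and that integration is a naive union by name; real translation and integration must resolve semantic conflicts, which this sketch ignores. All function and schema names are hypothetical.

```python
def translate(local_schema, to_cdm):
    """Local schema -> component schema: re-express names in the CDM."""
    return {
        to_cdm.get(name, name): frozenset(to_cdm.get(a, a) for a in attrs)
        for name, attrs in local_schema.items()
    }


def export(component_schema, shared):
    """Component schema -> export schema: keep only what the site shares."""
    return {n: a for n, a in component_schema.items() if n in shared}


def integrate(*export_schemas):
    """Export schemas -> federated schema (naive union by name)."""
    federated = {}
    for schema in export_schemas:
        for name, attrs in schema.items():
            federated[name] = federated.get(name, frozenset()) | attrs
    return federated


def external(federated, visible):
    """Federated schema -> external schema for one group of users."""
    return {name: federated[name] & frozenset(attrs)
            for name, attrs in visible.items() if name in federated}


local_a = {"EMP": {"ENAME", "SAL"}, "AUDIT": {"TS"}}
local_b = {"staff": {"name", "dept"}}

component_a = translate(local_a, {"EMP": "Employee", "ENAME": "name",
                                  "SAL": "salary", "AUDIT": "Audit",
                                  "TS": "timestamp"})
component_b = translate(local_b, {"staff": "Employee"})

federated = integrate(export(component_a, {"Employee"}),
                      export(component_b, {"Employee"}))
payroll_view = external(federated, {"Employee": {"name", "salary"}})
```

In this example the `Audit` class stays private to site A because it is not exported, the federated `Employee` class combines attributes from both sites, and the payroll view exposes only two of them.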

Different types of multidat abase sys-tems are created bv different levels ofintegration of the e~port schemas of thecomponent databases [Bright et al. 1992;Sheth and Larson 19901. The nonfeder-ated approach [Litwin et al. 1990] as-sumes no integration of the exportschemas. An important component of anonfederated multidatabase svstem is the.multidatabase language, that allows uni-form access to all component databases.In contradistinction to the multidatabaseapproach, the federated approach [ Shethand Larson 1990] assumes the integra-tion of the ex~ort schemas to create aglobal schema: Federated database sys-tems (FDBSS) can be further categorizedbased on the distribution of integration.Centralized FDBSS [Bertino 1991] sup-port a single federated schema system-wise. This federated schema is built bythe selective intimation of the ex~ortschemas of the co-mponent sites. In’thecase of decentralized FDBSS [Czedjo andTaylor 1991; Li and McLeod 1991] eachcomponent site builds its own federatedschema by integrating its local schemawith the export schemas of some othercomponent sites. Decentralized FDBSSare further characterized by the degree


[Figure 3 schematic, top to bottom: external schemas (any data model); federated (global) schemas (common data model); export schemas (common data model); component schemas (common data model); pre-existing component database systems.]

Figure 3. The 5-level schema architecture.

of consistency that they maintain among the different federated schemas.

The translation of local schemas into the CDM is essential to both the federated and nonfederated approaches. The CDM should be both rich enough [Salter et al. 1991; Batini et al. 1986] to capture the semantics expressed or implicit in the local schemas and simple enough to facilitate the creation of the federated schema in the federated approach and the multidatabase queries in the nonfederated approach. Most systems [Bright et al. 1992; Thomas et al. 1990; Schaller 1993] use a relational data model as the CDM. Recently, many systems [Bukhres et al. 1992] have been introduced that use an object-oriented data model as their CDM. In this section, we attempt to evaluate the rationale behind this approach and discuss its effectiveness.

Introduction to Integration. Schema translation alleviates the problems that occur due to the use of different data models. If there were no relations among the concepts represented in each component schema, then the federated schema would simply be the union of the component schemas. Unfortunately, the same concepts may be represented in different databases and furthermore, due to heterogeneity, these concepts may be represented differently.

Type of Conflicts. The following is a general classification of the possible conflicts between component schemas. This classification is independent of the type of the CDM used.

(1) Identity conflicts occur when the same concept is represented by different objects in different component databases.

(2) Schema conflicts occur when the component schemas that represent the same concept are not identical.

    (a) Naming conflicts occur when the same name is used for different concepts (homonyms) or when the same concept is described by different names (synonyms).

    (b) Structural conflicts occur when

        (i) the same concept is represented by different constructs of the data model, or

        (ii) although the same concept is modeled by the same constructs, the constructs used have either different structure (missing or different relations/dependencies) or different behavior (different or missing operations).

(3) Semantic conflicts occur when the same concept is interpreted differently in different component databases. This category includes scale or rate differences.

(4) Data conflicts occur when the data values of the same concept are different at different component databases.

Similar taxonomies are presented in Batini et al. [1986]; Dayal and Hwang [1984]; Kim and Seo [1991]; Kim et al. [1993]. In Batini et al. [1986], two types of conflicts are described: naming and structural conflicts. These largely correspond to the naming and structural conflicts as defined above, except for the key structural conflict which, in our taxonomy, is considered a special case of the identity problem. In Dayal and Hwang [1984], a taxonomy is proposed of conflicts that might occur when all component schemas are expressed in an extended functional model with three basic constructs: functions, objects, and types. This taxonomy differentiates between schema and data conflicts. Our definition of schema conflicts is an extension of this formulation with the exception of scale differences, which we classify as semantic rather than as schematic differences. The classification presented in Kim and Seo [1991] is similar to Dayal and Hwang [1984], but is tailored for the case of relational schemas. Kim et al. [1993] provides a comprehensive taxonomy of conflicts which arise when the common data model is a relational-object model, called SQL/M.

Table 1(a) summarizes the different types of conflicts along with some examples in the case of an object-oriented CDM.

Interschema Relations. In order to perform integration, it is crucial to identify not only the set of common concepts but also the set of different concepts in different schemas that are mutually related by some semantic properties. These properties are called interschema properties [Batini et al. 1986]. They are semantic relationships which hold between a set of objects in one schema and a different set of objects in another schema. For reasons of completeness, these relations should be represented in the federated schema. Interschema relations that arise when the common data model is a relational model have been extensively studied (e.g., Larson et al. [1989]) and are most commonly expressed in terms of the inclusion relationships among the domains of the related entities. To the best of our knowledge, there has been no comprehensive study of the different interschema relations that can exist when an object-oriented common data model is used. The problem is complicated by the fact that object-oriented models express semantic relations that are difficult to capture by such simple relations as the set-inclusion relation.

Table 1(b) lists some types of interschema relations that correspond directly to the relations supported by the reference object-oriented model along with some examples.

Organization of the Remainder of this Section. The tasks of translation and integration are strongly influenced by the data model used to represent the component schemas. In the remainder of this section, we discuss how object-oriented data models facilitate both tasks. The basic object-oriented model as introduced in Section 1.2 lacks some concepts necessary to a common data model. In Section 3.1, we discuss how it can be augmented to serve as a common data model. In Section 3.2, under the general title multidatabase languages, we present some issues related to the languages used. Issues related to schema translation where the target of the translation is an object-oriented model are discussed in Section 3.3. In Section 3.4, we focus on issues germane to the creation of the global (federated) schema. Section 3.5 concludes this section with an overview of some of the advantages of using an object-oriented common data model.

Table 1. (a) Taxonomy of the possible conflicts; (b) Interschema relations*

(a)

| Type | Definition | Example |
|---|---|---|
| Identity conflict | Same concept represented by different objects in different local databases | Copies of same book stored in both CSLibrary and MathLibrary with different local identifiers |
| Schema conflict: Naming | Homonyms: Same name used for different concepts | In MathLibrary "media" refers to magazines and newspapers; in CSLibrary to videotapes |
| Schema conflict: Naming | Synonyms: Same concept described by different names | In MathLibrary "references" is used; in CSLibrary "bibliography" is used |
| Schema conflict: Structural | Same concept represented by different constructs of the model (i.e., by a method in one database and a class in the other) | In CSLibrary number of citations is an instance variable of the book; in MathLibrary it is a method recomputed upon invocation |
| Schema conflict: Structural | Same concept represented by same construct, but classes have different methods, or methods have different parameters or return values | Book has an instance variable "keywords" in one library but not in the other |
| Semantic conflict | Same concept interpreted differently in different databases | Conference is a refereed conference in one, not in the other |
| Data conflict | Data values of the same entity are different in different component databases | Same book appears to have different authors |

(b)

| Type | Example |
|---|---|
| Aggregation | Edited book in one library has as parts articles stored in the other |
| Specialization | Article of a specific mathematical journal is a special case of a journal |
| Generalization | Books in the Math and CS libraries are all books |
| Arbitrary | Some books and articles of interest to a particular scientific area |

*We consider two local database schemas, one that describes the library of the Computer Science Dept. (CSLibrary) and the other, the library of the Dept. of Mathematics (MathLibrary).

3.1 Object-Oriented Data Models Used as CDMs

The object-oriented model as defined in Section 1.2 lacks some concepts pertinent to multidatabase systems. Various research approaches have resulted in different extensions of the basic data model. We first describe efforts in ODMG and ANSI SQL3 in terms of defining a standard object-oriented data model. Then, we describe extensions of the model that facilitate integration and translation.


Since there is no standard object-oriented data model, in this section we discuss the most prevailing of the proposed extensions.

3.1.1 Standardization Efforts in Object-Oriented Database Models. SQL3 is a new database language standard developed by both ANSI X3H2 and ISO DBL committees targeted for completion in 1997 [Krishna 1993; Krishna 1994]. SQL3 is upwards compatible with SQL-92, the current ANSI/ISO database language standard [Gallagher 1992]. The major extension is the addition of an extensible object-oriented type system based on Abstract Data Types (ADTs). However, SQL3 still maintains the restriction that tables are the only persistent structures [Krishna 1993]. ADT definitions include a set of attributes and routines. Using the terminology of our reference model, ADTs correspond to classes, attributes to instance variables and routines to methods. Routines can either be implemented using SQL3 procedural extensions or using code written in external languages. ADTs are related by subtype relationships, where an ADT can be a subtype of multiple supertypes. Resolution of overloaded routines is based on all arguments in a routine invocation.

The Object Database Management Group (ODMG) is a consortium of object-oriented vendors that have developed a standard interface for their products called ODMG-93 [Cattell 1993]. The ODMG members are also members of OMG task forces, and OMG has adopted the ODMG-93 interface as part of the Persistence Service, which is one of OMG's Object Services. Unlike SQL3, ODMG chose not to extend SQL but rather to extend existing programming languages to support persistent data and database functionality. ODMG combines SQL syntax with the OMG object model extensions to allow declarative queries.

3.1.2 Extensions for Object-Oriented Common Data Models. In this section we discuss a number of proposed extensions of the reference object model for providing database interoperability.

Types and Classes. A class, as defined in the reference model, is a template for creating objects with a specific behavior and structure. A class is not directly related to the real objects whose structure and behavior it models. In a database system we need a language construct to model a set of objects. In this section we discuss how this construct should be defined and related to the notion of a class, so that integration and translation are facilitated.

To express sets of objects and queries on these objects, a new concept, called the extent [Banerjee et al. 1987; Bertino 1991] of a class, is defined as the set of all objects that belong to the class. The extent of a class defines how a class is populated. To differentiate between the extent of a class and the class itself, many researchers [Gagliardi 1990] term these aspects the extensional and the intensional parts of a class, respectively.

We have informally defined the extent of a class as the set of objects that belong to the class. A natural way to define the “belong-to a class” relation is as the set of all objects that are instances of that class. This approach proves to be restrictive. For example, assume a simple library database where the books in each department's library are modeled as a class; for instance two such classes could be CSLibrary_Book and MathLibrary_Book. All these classes are subclasses of the class UnivLibrary_Book, which has no instances. To find a book, a user must name all existing libraries, though the intuitive way to accomplish that is to designate the extent of UnivLibrary_Book as the target of his query. This leads us to the following definition of the belong-to relation: an object “belongs-to a class” if it is an instance of that class or of any of its subclasses. This is also called the member-of relation [Bertino 1991; Papazoglou and Marines 1990]. Under this definition, the extent of the class UnivLibrary_Book is the union of the extents of all its subclasses and one can express the above request as a query with the extent of UnivLibrary_Book as its target. This is a valid definition since an instance of a subclass has at least the behavior of its superclass.

The implication of the above definition is to impose a hierarchy of the extents that parallels the hierarchy of their classes. If a class A is a subclass of a class B then the extent of class A is a subset of the extent of class B. We should stress that the class hierarchy is of a semantic nature, whereas the extent hierarchy is an inclusion hierarchy between sets of objects. We should also mention that, although the definition of a class remains the same, the extent of a class changes with time as new instances are created or deleted.
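The member-of definition and the extent hierarchy it induces can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the mechanism of any system cited here: the `extent` helper, the `_instances` bookkeeping, and the book titles are hypothetical, and the class names are Python spellings of the library example above.

```python
class UnivLibraryBook:
    """Superclass from the library example; it has no direct instances."""
    _instances = []  # direct instances of exactly this class

    def __init__(self, title):
        self.title = title
        # Register the object only with the class it is an instance of.
        type(self)._instances.append(self)

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        cls._instances = []  # every subclass keeps its own list


class CSLibraryBook(UnivLibraryBook):
    pass


class MathLibraryBook(UnivLibraryBook):
    pass


def extent(cls):
    """Member-of extent: instances of cls and of all of its subclasses."""
    members = list(cls._instances)
    for sub in cls.__subclasses__():
        members.extend(extent(sub))
    return members


cs_book = CSLibraryBook("Hypothetical CS title")
math_book = MathLibraryBook("Hypothetical math title")

# One query over the superclass extent replaces naming every library.
all_titles = sorted(book.title for book in extent(UnivLibraryBook))
```

Because `extent` is recomputed on each call, it reflects instances created later, which matches the remark that the extent changes over time while the class definition does not.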

Many researchers go beyond that and fully differentiate the structure of objects from the real objects having that structure [Scholl et al. 1992; Scholl and Schek 1990; Geller et al. 1991; Geller et al. 1992]. In this case, types are defined as templates and classes as sets of typed objects. Inheritance of structure and behavior is supported in the subtype hierarchy, whereas the subclass hierarchy, if such exists, is based on set-inclusion relations. A class may have an associated type that defines the structure and behavior of its members. An object may belong to more than one class and to more than one type (or, more precisely, to more than one type extent). Furthermore, a class may contain objects belonging to different types but related by some common property.

It is very difficult to evaluate which is the best choice for a canonical model. Each of the proposed models is accompanied by a related methodology that resolves some types of conflicts and expresses some interschema relations. In general, the distinction between sets and types adds flexibility to the model. Integration may then be supported at two different levels, at a type (structural) level and at a class (set-based) level. At a structural level, global types abstract commonalities in the structure and behavior of the component types. At a set-based level, objects (or parts of objects) belonging to more than one component class are brought together in some global class. On the other hand, this distinction complicates the maintenance of relations among classes, among types, and between classes and their associated types.

Finally, we should mention that all the above are not necessarily different models, but can be implemented as extensions of the basic object model using the metaclass mechanism. For example, classes representing sets of objects may be considered a special kind of class (e.g., collective classes). ORION [Banerjee et al. 1987], for instance, offers an elegant implementation of the concept of class extent.

Schema Evolution Operations. Many systems [Banerjee et al. 1987; Li and McLeod 1991; Czedjo and Taylor 1991] support schema evolution operations, that is, operations for dynamically defining and modifying the database schema, e.g., the class definitions and the inheritance structure. These operations play an important role in restructuring the schema resulting from the merging of component schemas.

Semantic Extensions. Many object-oriented models used as CDM are extended to support additional relations which can capture the semantics of the local schemas and their interrelationships. These extensions can be implemented using the metaclass mechanism of the basic model. The relations added are either specializations of existing relations or correspond to relations explicitly supported by other kinds of data models (e.g., relational). One typical example of the latter case is the part-of relation. The basic object-oriented data model is sufficient to represent a collection (aggregation) of related objects by allowing an object to have other objects as its instance variables. However, it fails to represent the notion of dependency between objects, since an object does not own the value of its instance variables but simply keeps references to them. Many database models add the notion of dependency by defining a composite object [Papazoglou and Marines 1990; Banerjee et al. 1987; Gagliardi 1990] as an object with a hierarchy of exclusive component objects. These component objects are dependent objects, in that their existence depends upon the existence of the composite object that they are part-of.
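The difference between a plain reference and a dependent, exclusively owned component can be made concrete with a small sketch. The `ObjectStore` class, its method names, and the object identifiers below are hypothetical and do not reproduce the composite-object design of any cited system; they only illustrate exclusivity and existence dependency.

```python
class ObjectStore:
    """Toy object store that distinguishes references from part-of links."""

    def __init__(self):
        self.objects = {}  # oid -> payload (may mention other oids freely)
        self.parts = {}    # composite oid -> set of exclusive component oids
        self.owner = {}    # component oid -> the composite it is part of

    def create(self, oid, payload):
        self.objects[oid] = payload

    def add_part(self, composite, component):
        # Exclusivity: a component belongs to at most one composite.
        if component in self.owner:
            raise ValueError("component already belongs to a composite")
        self.owner[component] = composite
        self.parts.setdefault(composite, set()).add(component)

    def delete(self, oid):
        # Dependency: deleting a composite deletes its components first.
        for part in list(self.parts.pop(oid, ())):
            self.delete(part)
        parent = self.owner.pop(oid, None)
        if parent is not None and parent in self.parts:
            self.parts[parent].discard(oid)
        del self.objects[oid]


store = ObjectStore()
store.create("author1", {"name": "hypothetical author"})
store.create("book1", {"title": "edited volume", "author": "author1"})
store.create("chapter1", {"heading": "first chapter"})
store.create("chapter2", {"heading": "second chapter"})
store.add_part("book1", "chapter1")
store.add_part("book1", "chapter2")

store.delete("book1")
remaining = sorted(store.objects)  # the merely referenced author survives
```

The author object is only referenced from the book's payload, so it outlives the book, whereas the chapters are part-of the book and disappear with it.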

State and Behavior. Most object-oriented data models used as CDMs do not distinguish between the state and the behavior of an object but use the same construct, usually called function, to model both instance variables and methods. An instance variable is modeled by a pair of set and get functions [Ungar and Smith 1987], where set assigns a value to the variable and get returns its value. This approach leads to a model with fewer constructs and thus minimizes the number of possible structural conflicts. More importantly, it offers increased flexibility to the integrator by permitting the state of an object to be redefined in the global schema. For example, take an object of a class named employee. Let us say that an employee's salary is represented in dollars in one component database, in drachmas in another, and in marks in the global database. Then, if salary is represented as a function, we can define an appropriate function in the global schema that performs the necessary transformations based on the daily rate of exchange between these monetary units. In contrast, if salary is represented as an instance variable, there is no straightforward way to solve the above conflict. Alternatively, schema evolution operators may be applied to the component database schemas prior to their integration, to restructure them appropriately.
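The salary example can be sketched as follows. The class names, the currency codes, and in particular the exchange rates are assumptions made up for illustration; the point is only that a global `get_salary` function can wrap the local one and convert on each call.

```python
# Hypothetical daily exchange rates into marks; the figures are invented.
RATES_TO_MARKS = {"USD": 1.5, "GRD": 0.006}


class LocalEmployee:
    """Employee as stored in one component database, in its local currency."""

    def __init__(self, salary, currency):
        self._salary = salary
        self.currency = currency

    def get_salary(self):
        return self._salary

    def set_salary(self, value):
        self._salary = value


class GlobalEmployee:
    """Global-schema employee: salary is a function, so it can be redefined."""

    def __init__(self, local_employee, rates):
        self._local = local_employee
        self._rates = rates  # looked up at call time, so daily updates apply

    def get_salary(self):
        # Convert the local value into marks using the current rate.
        rate = self._rates[self._local.currency]
        return self._local.get_salary() * rate


us_employee = GlobalEmployee(LocalEmployee(1000, "USD"), RATES_TO_MARKS)
gr_employee = GlobalEmployee(LocalEmployee(250000, "GRD"), RATES_TO_MARKS)
salaries_in_marks = [us_employee.get_salary(), gr_employee.get_salary()]
```

Had salary been exposed as a stored instance variable, the global schema would have had no place to put the conversion without copying and rewriting the data.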

Upwards Inheritance. The reference model suffers from an asymmetry. While a subclass constructor is provided and inheritance from a superclass is defined, there is no superclass constructor. Suggested extensions provide such a construct and also define inheritance from a subclass to a superclass, called generalization or upwards inheritance [Pedersen 1989; Schrefl and Neuhold 1988]. Resolution problems related to upwards inheritance are discussed later in this section (Section 3.4.2).

3.2 Multidatabase Languages

There are two fundamental approaches to the design of object-oriented database languages [Kim 1990]. The first extends a query language (usually SQL) to support the manipulation of object-oriented databases and then embeds the extended query language in the application language. We call this type of language query-based. The second approach extends an object-oriented programming language to support database operations. In this case, the application and query languages are the same and no impedance mismatch problem exists [Pitoura 1995]. We call this type of language programming-based. For the purposes of this paper, we further characterize query languages as (1) language-oriented when they allow operations (messages) to be sent to single objects or as (2) set-oriented when they permit queries to sets (or collections) of objects other than class extents.

In a multidatabase system, a Data Definition Language (DDL) is used to define the global schema while a Data Manipulation Language (DML) is used to manipulate data. Most object-oriented systems use the same language for both purposes. The language is extended [Krishnamurthy et al. 1991] (a) to support queries (or methods) that access data stored in different component databases and (b) to allow the definition of the global schema by integrating the component schemas. The definition of the global schema is usually accomplished by using the view-definition facilities of the language. Those facilities are described in Section 3.4.1. When a global schema is not provided, uniform access to the component schemas is accomplished only through the language.

Furthermore, object-oriented languages defined for multidatabases have additional constructs to support the extensions of the data model described in the previous section. These may include declarations for defining types and classes. Some languages [Ahmed et al. 1991] also provide constructs for defining the mapping between local and component schemas.

Finally, some multidatabase systems allow the user to specify the flow of control of his interactions with the database system at a finer level of detail. This specification is expressed using an extended transaction model (see Section 4). Some systems extend their DML or DDL with constructs for defining and using extended transaction models [Chen 1993]. Others offer a special language for defining transaction models [Woelk et al. 1992].

3.3 Schema Translation

Schema translation is performed when a schema (schema A) represented in one data model is mapped to an equivalent schema (schema B) represented in a different data model. This task generates the mappings that correlate the schema constructs in one schema (schema B) to the schema constructs in another schema (schema A). The task of command transformation [Sheth and Larson 1990] entails using these mappings to translate commands involving the schema constructs of one schema (schema B) into commands involving the schema constructs of the other schema (schema A). In the multidatabase context, schema translation occurs (see Figure 3):

• when translating from the local model to the common data model, and

• when translating from the federated (global) model to the external model.

When the target schema B is expressed in an object-oriented data model, roughly speaking, relations are mapped to classes and tuples to objects. The inclusion relationship between two relations in schema A may be used to determine the semantic (e.g., subclassing) relationships between the corresponding classes in schema B [Castellanos and Salter 1991].
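A rough sketch of this mapping is given below. It assumes that relation schemas arrive as plain attribute lists and that inclusion relationships have already been detected and are acyclic; the `translate` and `tuple_to_object` helpers and the sample relations are hypothetical, not the procedure of the cited work.

```python
def translate(relations, inclusions):
    """Map each relation to a class; an inclusion (sub, super) becomes subclassing.

    relations:  dict mapping relation name -> list of attribute names
    inclusions: list of (included_relation, including_relation) pairs,
                assumed to be acyclic
    """
    parent_of = dict(inclusions)
    classes = {}

    def build(name):
        if name in classes:
            return classes[name]
        # Build the superclass first so it can be used as the base.
        base = build(parent_of[name]) if name in parent_of else object
        classes[name] = type(name, (base,), {"attributes": tuple(relations[name])})
        return classes[name]

    for name in relations:
        build(name)
    return classes


def tuple_to_object(cls, row):
    """Map one tuple of the source relation to an object of the target class."""
    obj = cls()
    for attribute, value in zip(cls.attributes, row):
        setattr(obj, attribute, value)
    return obj


relations = {
    "Publication": ["isbn", "title"],
    "Book": ["isbn", "title", "publisher"],
}
classes = translate(relations, inclusions=[("Book", "Publication")])
book = tuple_to_object(classes["Book"], ("isbn-1", "Sample title", "Sample press"))
```

A real translator would also have to carry keys, foreign keys, and constraints across, which is where semantic enrichment comes in.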

In addition, during translation, semantic information is collected and represented in the common data model. This process is called semantic enrichment [Castellanos and Salter 1991] or semantic refinement [Mannino et al. 1988].

Some multidatabase languages (such as HOSL [Ahmed et al. 1991]) provide constructs that support procedural mappings of schemas expressed in other models to their object-oriented model.

Bertino et al. [1988] introduces a new approach to schema translation, called the Operational Mapping Approach. Instead of defining the correspondence between the data elements of the schemata A and B (Structural Mapping Approach), the correspondence is defined between operations of the different schemata. A number of basic operations of the schema B (called abstract operations) are defined in terms of a number of primitive operations of the schema A. All other operations of B are implemented using these abstract operations, possibly automatically by the integration system. The primitive operations provided by A must be an appropriate minimal set so that the corresponding abstract operations provide the necessary functionality. The use of an object-oriented CDM facilitates schema translation by operational mapping. The operational mapping approach is based on the same principle as the object-based architectures, that is, each component system provides a specific interface consisting of a set of primitive operations.
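The layering described above (primitive operations of A, abstract operations of B, and further operations of B built on the abstract ones) might look as follows. The component system, its two primitives, and all operation names are invented for this sketch and are not taken from Bertino et al. [1988].

```python
class ComponentA:
    """Component system A: exposes only a small set of primitive operations."""

    def __init__(self, records):
        self._records = list(records)

    def scan(self):
        """Primitive: iterate over every stored record."""
        return iter(self._records)

    def append(self, record):
        """Primitive: store one more record."""
        self._records.append(record)


class SchemaB:
    """Target schema B, defined through operations rather than data layout."""

    def __init__(self, component):
        self._a = component

    # Abstract operations: the only ones written against A's primitives.
    def all_objects(self):
        return list(self._a.scan())

    def insert(self, obj):
        self._a.append(obj)

    # Other operations of B: expressed purely through abstract operations,
    # so they need no knowledge of how A stores its data.
    def select(self, predicate):
        return [obj for obj in self.all_objects() if predicate(obj)]

    def count(self):
        return len(self.all_objects())


b = SchemaB(ComponentA([{"title": "first", "year": 1990}]))
b.insert({"title": "second", "year": 1994})
recent = b.select(lambda obj: obj["year"] > 1992)
```

Swapping in a different component system only requires rewriting the two abstract operations; `select` and `count` stay untouched.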

3.4 Schema Integration

Schema integration is defined as the activity of integrating the schemas of existing or proposed databases into a global, unified schema [Batini et al. 1986]. In the case of FDBSs, schema integration occurs in two contexts (see Figure 3):

(1) when integrating the export schemas of (usually existing) component systems into a single federated schema; and

(2) during database design, as view integration of the multiple user views of a proposed federated database (federated schema) into a single conceptual description of this database (external schema).

In many applications there is a need to integrate nontraditional component databases that do not support schemas. It is necessary to generalize the concept of schema integration to include the integration of such systems. Object-oriented data models can be very useful, since they permit the definition of the conceptual schemas of nondatabase systems in terms of the operations they support, thus completely hiding the structure of their data.

Batini et al. [1986] identifies four main steps in the process of integration: preintegration, comparison of schemas, conforming of schemas, and merging and reconstructing. Translation is considered as part of the preintegration step. In general, a data model to facilitate all steps of the integration task should be semantically rich; it should provide mechanisms for expressing not only the semantics expressed at the local databases but also additional semantics relevant to schema integration (schema enrichment). Furthermore, it should ideally be capable of expressing the semantics of any new local database that might be added to the system in possible future expansions. From this perspective, object-oriented models are especially appropriate.

During the comparison step, the component schemas are compared to detect conflicts in their representation and to identify the interschema relations. The comparison of the schema objects is primarily based on their semantics, not on their syntax. The CDM should be semantically rich to facilitate comparison and should also support abstraction mechanisms to permit comparisons to be made at a higher level of abstraction. The objective of the conformation step is to bring the component schemas into compatibility for later integration. Comparison and conforming activities are usually performed in layers. These layers correspond to the different semantic constructs supported by the model. The fewer the basic constructs supported by the model the fewer the conflicts and the easier the conformation activity. For this reason, object-oriented models which support a single construct (function) for both instance variables and methods are preferable. When only functions are supported, comparison and conformation are performed first for classes (structural conformation) and then for functions (behavioral conformation [Bertino 1991]).

The search for identifying relations or possible conflicts may be guided by the class hierarchy. Instead of comparing all classes in a random manner, classes may be compared following the class hierarchy in a top-down fashion [Garcia-Solaco et al. 1993].

As in the translation phase, relations between different classes must be identified. The difference is that now these classes may belong to different databases. Subclassing relations may be specified based on inclusion relations between the extents of the corresponding classes [Mannino et al. 1988]. Assertions may be used to express these relations. The assertions should be checked for consistency and completeness [Mannino et al. 1988]. The identification of relations between classes can also be made by comparing the definitions of classes [Savasere et al. 1991] rather than their actual extensions.
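As a simplified illustration of deriving candidate assertions from extents, the sketch below compares two extents as sets. It assumes that identity conflicts have already been resolved, so that members of both extents carry comparable global identifiers (the ISBN-like strings here are invented); the function name and the returned labels are hypothetical.

```python
def extent_relation(extent_a, extent_b):
    """Suggest an interschema assertion from the current extents of A and B."""
    a, b = set(extent_a), set(extent_b)
    if a == b:
        return "A equivalent to B"
    if a < b:
        return "A subclass of B"
    if b < a:
        return "B subclass of A"
    if a & b:
        return "A overlaps B"
    return "A disjoint from B"


# Extents given as sets of global identifiers (hypothetical values).
cs_textbooks = {"isbn-1", "isbn-2"}
cs_books = {"isbn-1", "isbn-2", "isbn-3"}
math_books = {"isbn-3", "isbn-4"}

suggestions = [
    extent_relation(cs_textbooks, cs_books),
    extent_relation(cs_books, math_books),
    extent_relation(cs_textbooks, math_books),
]
```

Since extents change over time, such a result is evidence for a candidate assertion rather than proof of a semantic relation, which is one reason the assertions still need to be checked for consistency and completeness.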

Most systems use view definition facilities for defining the global schema, during the last step of integration. The creation of the global view is usually performed in two phases. In the first phase, the classes of the component schema are imported or connected, that is, they are mapped to corresponding global classes. In the second phase, classes are combined based on their interschema relations. View definition facilities are described in the following section.

3.4.1 Object-Oriented Views. A view is a way of defining a virtual database on top of one or more existing databases. Views are not stored, but are recomputed for each query that refers to them. The definition of a view is dependent upon the data model and the facilities of the language used to specify the view. In a relational model, a view is defined as a set of relations, each populated by a query over the relations of the existing databases and sometimes over already-defined view relations. There are as many different approaches to defining an object-oriented view as there are object-oriented data models. In general, an object-oriented view is defined as a set of virtual classes that are populated by existing objects or by imaginary objects constructed from existing objects. The set of virtual classes defines a schema and the objects that populate them define a (virtual) database described by the schema [Heiler and Zdonik 1990]. Once a virtual class is created, it should be treated like any other class. The classes used in the definition of a virtual class are called base classes.

The reference object-oriented model, though rich in facilities for structuring new objects, lacks some necessary mechanisms for grouping already-existing objects. Classes are defined as templates for creating new objects and no mechanism for grouping existing objects is supported. The most common view facilities added to object-oriented systems are:

(i) facilities for importing classes from the local databases into the view. Virtual classes created in this manner correspond directly to existing classes; and

(ii) facilities for defining new classes, called derived classes, that do not directly correspond to an existing class.

Importation. A view can incorporate data from other databases via import statements. Once a class is imported, its definition and instances become visible to the user of the view. Part of the imported data can be hidden either by explicit hide commands [Abiteboul and Bonner 1991] or by specifying in the import command only the visible functions [Dayal and Hwang 1984]. Import mechanisms differ in whether the importation of a class results in an implicit importation of all its subclasses.

Other types of importation statements import in a single statement classes or entire hierarchies belonging to more than one component database. In essence, these statements combine the importation phase with the class derivation phase. The virtual class may be defined either as the supertype of the top superclasses of each component database [Scholl et al. 1992] or by combining these top classes based on their interschema relation [Mannino et al. 1988]. During importation of this sort, basic types, such as integers and strings, can be implicitly unified [Scholl et al. 1992].

Other approaches distinguish between the importation of behavior (functions) and the importation of objects [Fang et al. 1992]. By doing so, local functions may be executed on imported objects and imported functions may be executed on local objects.

Derived Classes. The definition of a virtual class includes the specification of the following three components: (1) the initial members of the class (class extension); (2) the structure and behavior, that is, the functions of the virtual class (class intention); and (3) the relation between the new class and the other virtual classes.

As we have already pointed out, some systems provide both classes and types. In such systems, a virtual class may have no intensional part. Furthermore, the relations between the derived class and the base classes in such systems are purely set-oriented (for example, relations such as union, difference, etc.). Finally, in such systems, the relation between the associated types of the derived class and its base classes must also be specified.

Different methodologies provide different language constructs for specifying the above three components of a virtual class. Most of these constructs define one component directly and leave the other two to be derived by the system. There are three general methodologies:

(1) The language provides a variety of class constructors that correspond to the relations between classes supported by the model. These constructors are applied to existing base classes to create new derived classes that have the corresponding relation with the base classes [Dayal and Hwang 1984; Motro 1987; Kaul et al. 1991; Mannino et al. 1988; Sheth et al. 1993]. This methodology in effect defines explicitly the third component of a virtual class and then implies the other two, namely its population and intention. The most common such constructors are the generalization or superclass constructor and the specialization or subclass constructor.

A derived class defined as a subclass inherits all the functions of its superclasses. In the subclass, functions may be redefined and new functions may be defined. There is no standard definition of the extension of the subclass. It is generally defined as a subset of the extensions of the superclasses. In Dayal and Hwang [1984] and Motro [1987] (subclassing is called join in this framework), the extension is defined as the intersection of the extensions of the superclasses.

A derived class defined as a superclass inherits the common functions of its subclasses (upwards inheritance). Functions in a superclass may be redefined. The extension of the superclass is defined as the union of the extensions of its subclasses.

(2) Derived classes are defined by specifying their population. The population of the derived class is defined as the result of a query on existing base classes. This is the most commonly used approach [Bertino 1991; Manola et al. 1992; Kim et al. 1993; Heiler and Zdonik 1990; Abiteboul and Bonner 1991]. The class intention and the position in the hierarchy may or may not be implied automatically by the system. This methodology includes mechanisms for defining classes populated by imaginary objects. Functions defined in a derived class can typically use the functions defined in the base classes.

A complete methodology for inferring both the position of the derived class and its intention is presented in Abiteboul and Bonner [1991]. A class whose population is defined by a selection predicate on some function of the base classes is considered their subclass. A class whose population includes the population of the base classes is considered their superclass. Abiteboul and Bonner [1991] also introduce the notion of behavioral and parameterized subclassing. Behavioral subclassing allows a superclass to include all classes that have a specific property (function). Parameterized subclassing allows the partition of a superclass into subclasses based upon the value of one of its functions. A mechanism similar to parameterized subclassing, called type schemas, is presented in Chomicki and Litwin [1992].

One important research issue [Manola et al. 1992] concerning classes defined in this way is the definition of a query algebra with a minimum number of operators for creating arbitrary derived classes and imaginary objects. This algebra should also support efficient query optimization.

(3) The structure (intention) of the derived class is explicitly defined.

Combinations of the above methodologies are also possible, especially in the form of queries that involve class constructors.
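The first two methodologies can be illustrated with a minimal Python sketch. It is not taken from any of the cited systems: the names VClass, superclass, subclass, and derived_by_query are hypothetical, objects are represented only by integer identifiers, and the subclass extension follows the intersection convention mentioned above. Reusing the first base's definition for a common function is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


@dataclass
class VClass:
    """A (virtual) class: an extension of object ids plus named functions."""
    name: str
    extension: FrozenSet[int]
    functions: Dict[str, Callable]


def superclass(name, *bases):
    """Generalization: union of extensions, common functions only."""
    ext = frozenset().union(*(b.extension for b in bases))
    common = set(bases[0].functions)
    for b in bases[1:]:
        common &= set(b.functions)
    # Assumption: a common function keeps the definition of the first base.
    funcs = {f: bases[0].functions[f] for f in common}
    return VClass(name, ext, funcs)


def subclass(name, *bases):
    """Specialization: intersection of extensions, all inherited functions."""
    ext = frozenset.intersection(*(b.extension for b in bases))
    funcs = {}
    for b in bases:
        funcs.update(b.functions)
    return VClass(name, ext, funcs)


def derived_by_query(name, base, predicate):
    """Population given by a selection query over one base class."""
    ext = frozenset(oid for oid in base.extension if predicate(oid))
    return VClass(name, ext, dict(base.functions))


# Illustrative data: three objects spread over two base classes.
STORE = {
    1: {"name": "Ann", "salary": 50, "gpa": 3.5},
    2: {"name": "Bob", "salary": 70},
    3: {"name": "Cy", "gpa": 3.9},
}
employee = VClass("Employee", frozenset({1, 2}), {
    "name": lambda oid: STORE[oid]["name"],
    "salary": lambda oid: STORE[oid]["salary"],
})
student = VClass("Student", frozenset({1, 3}), {
    "name": lambda oid: STORE[oid]["name"],
    "gpa": lambda oid: STORE[oid]["gpa"],
})

person = superclass("Person", employee, student)
working_student = subclass("WorkingStudent", employee, student)
well_paid = derived_by_query(
    "WellPaid", employee, lambda oid: employee.functions["salary"](oid) > 60
)
```

In this sketch the constructors fix the relation to the base classes and derive extension and functions from it, whereas derived_by_query fixes the population and simply copies the functions of its base.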

When subclassing is used for subtyping purposes (see Section 1.2), some restrictions must be placed on the type of arguments and on the type of the return values of all functions defined in the virtual class by inheritance. These restrictions should be such that every object of a subclass could be used in any context where an object of any of its superclasses could be used. We call these restrictions subtyping restrictions. Dayal and Hwang [1984] define superfunctions as functions that satisfy appropriate subtyping restrictions. In this framework, a virtual class that is defined as a superclass can include only superfunctions of the functions defined at its subclasses.
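The text does not spell out the restrictions themselves. One common formulation, assumed here for illustration only, is that a subclass function may widen its argument types and narrow its result type. The Signature type and the satisfies_subtyping check below are hypothetical names, and the sketch is not claimed to be the exact definition of superfunctions in Dayal and Hwang [1984].

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Signature:
    args: Tuple[type, ...]
    result: type


def satisfies_subtyping(sub: Signature, sup: Signature) -> bool:
    """True if a function with signature `sub` can stand in for one with `sup`."""
    if len(sub.args) != len(sup.args):
        return False
    # Arguments: `sub` must accept at least everything `sup` accepts.
    args_ok = all(
        issubclass(a_sup, a_sub) for a_sub, a_sup in zip(sub.args, sup.args)
    )
    # Result: whatever `sub` returns must be usable where `sup`'s result is.
    return args_ok and issubclass(sub.result, sup.result)


class Person:
    pass


class Student(Person):
    pass


general = Signature(args=(Student,), result=Person)
widened = Signature(args=(Person,), result=Student)   # safe replacement
narrowed = Signature(args=(Student,), result=object)  # result too general
```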

3.4.2 Issues in Integration. In the following, we discuss several subtle issues concerning object-oriented integration.

Resolution Problems and Behavioral Sharing. In an object-oriented system, a method defined in a class may be redefined in its subclasses, resulting in method overloading. The default resolution method adopted by object-oriented systems states that, when a method is applied to an object, the most specific method from those applicable to the object is selected. The introduction of virtual classes complicates the resolution problem when a virtual class A is defined as a superclass of existing classes. In that case, the default resolution method always selects the most specific method, i.e., one defined in one of A’s subclasses, even when the user wants a more general method defined in A to be selected. The straightforward solution is to allow the user to explicitly specify which of the applicable functions should be used [Abiteboul and Bonner 1991]. Schrefl and Neuhold [1988] introduce the concept of object coloring; the color of an object specifies the class from which the search for a function should begin. If the function is not defined in this class, it is searched for in the appropriate subclasses. This method also identifies the different semantic relations that may hold between the subclasses being generalized and their attributes. These relations are utilized to produce different default treatments of function resolution.

The above discussion refers to behavioral sharing along the inheritance hierarchy and specifies how the correct function is chosen either implicitly by the default resolution mechanism or explicitly by the user. An alternative method of behavioral sharing is introduced in Heiler and Zdonik [1990]. In this framework, the functions of the virtual class may invoke functions from the base classes, but these functions will be applied not to the new objects but to the objects of the appropriate base class. This corresponds to using the delegation rather than the inheritance method for sharing behavior (see Section 1.2).
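The difference between default resolution, explicit selection by the user, and delegation can be sketched in a few lines of Python. The classes Person, Employee, and Badge and the helper resolve are hypothetical; Person merely stands in for a virtual superclass, and the sketch does not reproduce the mechanisms of the cited systems.

```python
class Person:
    """Stands in for a virtual class A defined as a superclass."""

    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"person {self.name}"


class Employee(Person):
    """An existing class that redefines describe."""

    def describe(self):
        return f"employee {self.name}"


def resolve(obj, method, start=None):
    """Default resolution picks the most specific method; `start` lets the
    caller name the class whose definition should be used instead."""
    if start is None:
        return getattr(obj, method)()
    return getattr(start, method)(obj)


class Badge:
    """A virtual class that shares behavior by delegation."""

    def __init__(self, base):
        self._base = base  # the base-class object this object is built from

    def describe(self):
        # The base function runs on the base object, not on the Badge itself.
        return "badge for " + self._base.describe()


emp = Employee("Ann")
default_result = resolve(emp, "describe")
explicit_result = resolve(emp, "describe", start=Person)
delegated_result = Badge(emp).describe()
```

With inheritance the function of the superclass is applied to the subclass object itself, whereas Badge forwards the call to the object it was constructed from.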

Assigning Identifiers to Imaginary Objects. Object-oriented systems associate a unique identifier with each object upon its creation. Accordingly, upon the creation of an imaginary object, an identifier must be associated with it. An imaginary object must be assigned the same identifier at each invocation. Moreover, if an imaginary object is defined as a composite object (see Section 3.1.2), then its identity should be modified when the real objects that constitute it are updated. The most common solution [Abiteboul and Bonner 1991; Heiler and Zdonik 1990; Kifer et al. 1992] is to define the identifier of the imaginary object as a function of the identifiers of the real objects upon which the imaginary object depends.
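A minimal sketch of such a function is given below. The name imaginary_oid, the string form of the base identifiers, and the choice to ignore the order of the base objects are all assumptions made for illustration.

```python
from typing import Iterable, Tuple


def imaginary_oid(view_class: str, base_oids: Iterable[str]) -> Tuple[str, Tuple[str, ...]]:
    """Derive the identifier of an imaginary object from its base objects.

    The result depends only on the virtual class and on the identifiers of
    the real objects, so every recomputation of the view yields the same oid.
    Assumption: the order in which the base objects are listed is irrelevant.
    """
    return (view_class, tuple(sorted(base_oids)))


first = imaginary_oid("Couple", ["db1:7", "db2:3"])
again = imaginary_oid("Couple", ["db2:3", "db1:7"])
other = imaginary_oid("Couple", ["db1:7", "db2:4"])
```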

Resolving Conflicts and Expressing Interschema Relationships. The following outlines the most common ways of resolving conflicts based upon the taxonomy presented in the introduction of this section:

● Identity conflicts in object-oriented data models are in general very difficult to resolve since the identity of an object is not based upon the value of some of its attributes but is characterized by an identifier (oid) assigned to the object upon creation. Most systems [Ahmed et al. 1991; Huhns 1992] allow the user to specify which objects are equivalent and should share the same oid. Scholl et al. [1992] use the metaclass mechanism to define a function, called the same-function, which is applicable to all objects and specifies object equivalences. The user may appropriately define the same-function so that equivalent objects are treated as the same object.

● Naming conflicts are handled by defining renaming operators.

● Structural conflicts correspond to restructuring of the class hierarchy or to modifications of the aggregate relations. There is no standard method of resolving structural conflicts; examples are presented in Ahmed et al. [1991], Kim et al. [1993], and Dayal and Hwang [1984]. Klas et al. [1995] propose a method of resolving structural conflicts by applying graph operations, called augmenting transformations. These transformations are performed on the graphs that represent the conflicting component schemata so that these graphs become isomorphic.

● Semantic and data conflicts are resolved by defining an appropriate function in the virtual class [Dayal and Hwang 1984; Ahmed et al. 1991; Kim et al. 1993].
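Three of these techniques can be illustrated on a small, invented example: a user-declared equivalence in the spirit of a same-function, a renaming operator, and a function of the virtual class that hides a unit mismatch. The databases, attribute names, and helper functions below are all hypothetical.

```python
# Two component databases that describe the same book differently.
DB1 = {"b1": {"title": "Dune", "price_usd": 10.0}}
DB2 = {"x9": {"name": "Dune", "price_cents": 1200}}

# Naming conflict: DB2 calls the attribute "name" where DB1 uses "title".
RENAME = {"name": "title"}

# Identity conflict: the user declares which objects are equivalent.
SAME = {("DB2", "x9"): ("DB1", "b1")}


def same(a, b):
    """User-defined equivalence over (database, oid) pairs."""
    def canonical(obj):
        return SAME.get(obj, obj)
    return canonical(a) == canonical(b)


def rename(record):
    """Renaming operator applied when a record enters the view."""
    return {RENAME.get(key, key): value for key, value in record.items()}


def price_usd(record):
    """Function of the virtual class that resolves a data (unit) conflict."""
    if "price_usd" in record:
        return record["price_usd"]
    return record["price_cents"] / 100
```

The view user only sees title and price_usd; which component database a value came from, and in which unit it was stored, is hidden inside the functions.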

There is no systematic way of expressing interschema relations in the global schema. The most common relations expressed are those that correspond to semantic relations directly supported by the object-oriented model. The most common such relations are the subclass, superclass, and aggregation relations [Kaul et al. 1991; Abiteboul and Bonner 1991].

3.5 Advantages of Adopting an Object-Oriented CDM

We enumerate below some of the distinctive characteristics of the object-oriented data model that render it suitable to serve as the CDM. The usefulness of these characteristics has been demonstrated throughout this section, and the following listing serves as a final recapitulation.

(1) The object-oriented data model is semantically rich, in that it provides a variety of type and abstraction mechanisms. It supports a number of relations between its basic constructs which are not expressed in traditional models.

(2) The object-oriented data model permits the behavior of objects to be captured through the concept of methods. Methods are very powerful because they enable arbitrary combinations of information stored in local databases. For example, if books with similar topics exist in different databases, a method can be defined in the global schema that eliminates duplicates, sorts different editions, and translates titles to a common natural language (e.g., English).

(3) The object-oriented model makes it possible to integrate nontraditional databases through behavioral mapping.

(4) Since the actual storage and retrieval of data is supported by the underlying local systems, the overhead of supporting objects in the conceptual CDM causes no significant performance degradation.

(5) Finally, the metaclass mechanism adds flexibility to the model, since it allows arbitrary refinements of the model itself, e.g., additions of new relationships.

4. OBJECT-ORIENTATION AND TRANSACTION MANAGEMENT

This section discusses the impact of object-orientation on transaction management in multidatabase systems. In Section 4.1, the current trends in transaction management are reviewed to provide a perspective on multidatabases and object-oriented transaction management methods. Section 4.2 relates object-oriented approaches to the reviewed literature. Section 4.3 introduces the specific challenges of transaction management in multidatabases, many of which motivated the new trends. In Section 4.4, the object-oriented approach is applied to transaction management in multidatabases. In this section, we will use the term method rather than function since this is the term most commonly used in the transaction management literature. Also, the terms class and type will be used interchangeably, since the subtle differences between the two terms, although central to database modeling, do not affect transaction management techniques.

4.1 Trends in Transaction Management

A database consists of a set of named data items, while a database state is an assignment of values to these data items [Papadimitriou 1986]. Not all possible combinations of values represent a legal database state. For example, a state that represents a negative balance in a bank database or an overbooked flight in an airline database is not a legal state. These real-world restrictions are called integrity constraints of the database. A database state that satisfies the integrity constraints is a consistent state.

A transaction is an execution of a program that consists of a sequence of operations that access and manipulate database items. In the traditional model, transactions consist of simple read and write operations and have a single begin and commit point. Individual transactions maintain database consistency; that is, if they are applied to a consistent state, they result in a consistent state. Transactions are executed concurrently and their operations execute in an interleaved fashion, potentially creating an inconsistent database state. Furthermore, transactions may fail while executing. The objective of transaction management is to ensure that the concurrent execution of transactions leaves the database in a consistent state, even in the event of failures.

Classical transaction management deals with executions rather than with specifications of programs (as in concurrent program proofs). We call the execution of several transactions a history. A history is correct if it leaves the database in a consistent state. The approach taken to prove the correctness of a history is based on the observation that a serial history (a history that corresponds to a serial execution of transactions) is correct, by induction on the number of transactions involved. That leads us to the following correctness criterion: a history is correct if it is serializable; that is, if it is equivalent to a serial history. Failures are accounted for by including in the definition of serializability only committed transactions, which are transactions that have successfully completed their operations. Thus, the definition of correctness may be restated as follows: a history is correct if its committed projection is serializable. In practice, a more restrictive notion of serializability, called conflict-serializability, is used because there is an efficient graph-based algorithm for testing it. Two operations conflict if the result of their execution depends upon the order in which they are processed. Two histories are conflict-equivalent if they consist of the same transactions and order conflicting operations of committed transactions in the same way. Papadimitriou [1986] and Bernstein et al. [1987] offer an excellent treatment of the above issues.
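The graph-based test mentioned above can be sketched as follows for the traditional read/write model. The encoding of a history as (transaction, operation, item) triples with "r", "w", and "c" operations, and the function name conflict_serializable, are assumptions of this sketch. It builds the precedence graph of the committed projection and reports whether that graph is acyclic.

```python
from itertools import combinations


def conflict_serializable(history):
    """history: list of (txn, op, item) with op in {"r", "w", "c"}."""
    committed = {t for t, op, _ in history if op == "c"}
    # Committed projection: keep only reads/writes of committed transactions.
    ops = [(t, op, x) for t, op, x in history if t in committed and op in ("r", "w")]

    # Edge Ti -> Tj when an operation of Ti precedes a conflicting one of Tj.
    edges = {t: set() for t in committed}
    for (t1, op1, x1), (t2, op2, x2) in combinations(ops, 2):
        if t1 != t2 and x1 == x2 and "w" in (op1, op2):
            edges[t1].add(t2)

    state = {}

    def has_cycle(node):
        state[node] = "visiting"
        for nxt in edges[node]:
            if state.get(nxt) == "visiting":
                return True
            if nxt not in state and has_cycle(nxt):
                return True
        state[node] = "done"
        return False

    return not any(has_cycle(t) for t in sorted(committed) if t not in state)


# r1(x) w2(x) w1(x) c1 c2: T1 -> T2 and T2 -> T1, hence a cycle.
interleaved = [("T1", "r", "x"), ("T2", "w", "x"), ("T1", "w", "x"),
               ("T1", "c", None), ("T2", "c", None)]
# The same operations executed serially.
serial = [("T1", "r", "x"), ("T1", "w", "x"), ("T2", "r", "x"),
          ("T2", "w", "x"), ("T1", "c", None), ("T2", "c", None)]
# As `interleaved`, but T2 never commits and is therefore ignored.
uncommitted = interleaved[:4]
```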

Current research has called the above assumptions regarding transaction correctness and database consistency into question and new approaches to the problem of maintaining consistent databases are under development. We identify the following directions in the development of new transaction mechanisms [Ramamritham and Chrysanthis 1992; Buchmann et al. 1992; Barghouti and Kaiser 1991; Elmagarmid 1992]:

New database consistency requirements. The requirement that database correctness be preserved, which can be alternatively characterized as the preservation of integrity constraints, has been relaxed in various ways [Ramamritham and Chrysanthis 1992]. For example, while consistency has traditionally been defined with respect to all items in the database, new approaches require consistency only for parts of the database.

New transaction models. In the traditional model, transactions were sequences of simple read and write operations with a single begin and commit point. New transaction models introduce transactions with an extended structure, that is, transactions that consist of a number of subtransactions related by model-specific relations. Nested transactions [Weikum 1991; Beeri et al. 1989] are an example of this type. Additionally, new models provide complex operations on complex data items instead of read and write operations on single-valued items. Furthermore, to model the execution of complex tasks at various heterogeneous systems, workflow models have been proposed. Workflows extend transaction models by adding even more elaborate structure to transactions.

The above directions are by no means orthogonal.

As a result of the above advances, new correctness criteria for database consistency and transaction correctness are under development to provide alternatives to serializability and to the atomicity, consistency, isolation, and durability (ACID) properties of transactions.

4.2 Object-Oriented Transaction Management

Conventional database concurrency control can be used in object-oriented database systems (ORION [Kim 1990], GemStone [Bretl et al. 1989]). In this case, each object is treated as a data item and the methods as a set of read and write operations on this data item. Locking algorithms can use objects as the granularity of the lock, and hierarchical-based locking algorithms can take advantage of the class hierarchy. However, many researchers have utilized the particular characteristics of object-oriented systems to introduce new approaches to transaction management. This section discusses these approaches.

4.2.1 Modular Concurrency Control. According to this approach, each object is responsible for the correct execution of the transactions applied to it [Hadjilacos and Hadjilacos 1991; Weihl 1989; Schwartz and Spector 1984]. More than one transaction is allowed to execute in an object simultaneously and transactions may be executed in more than one object. The correct execution of the transactions applied to an object is referred to as intraobject synchronization [Hadjilacos and Hadjilacos 1991]. To ensure global database consistency, in addition to intraobject synchronization, there is a need to control the correct execution of transactions not only locally at each object but also globally over all involved objects; this is referred to as interobject synchronization [Hadjilacos and Hadjilacos 1991]. The advantage of separating the intra- from the interobject synchronization is that each object can individually select the most suitable definition of correctness and algorithm for its preservation. The concurrency properties of each object are defined according to the semantics of its type and operations. These properties can be specified in different ways, for example, in terms of acceptable histories by using state machines [Weihl 1989] or in terms of dependencies between the methods of the type [Schwartz and Spector 1984].

Most models employ serializability as their correctness criterion. Each object ensures the serializability of the transactions submitted to it. Then, roughly speaking, global serializability (serializability of all transactions executed at all objects) is ensured if there is a serialization order compatible with the serialization order at each object.
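The compatibility condition can be made concrete with a small sketch. It assumes that each object reports the serialization order of its transactions as a list; the function name global_order and the object names are hypothetical. The sketch uses graphlib from the Python standard library (Python 3.9 or later) to look for one total order that extends every local order.

```python
from graphlib import CycleError, TopologicalSorter


def global_order(local_orders):
    """local_orders: {object_name: [txn, ...]} in local serialization order.

    Returns a global order compatible with every local order, or None if the
    local orders contradict each other.
    """
    predecessors = {}
    for order in local_orders.values():
        for txn in order:
            predecessors.setdefault(txn, set())
        for earlier, later in zip(order, order[1:]):
            predecessors[later].add(earlier)
    try:
        return list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return None


compatible = global_order({"account": ["T1", "T2"], "ledger": ["T2", "T3"]})
incompatible = global_order({"account": ["T1", "T2"], "ledger": ["T2", "T1"]})
```

In the second call the two objects serialize T1 and T2 in opposite orders, so no global serialization order exists even though each object is locally serializable.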

Weihl [1989] identifies a property P, called the local atomicity property, such that if every object in the system satisfies P, then every history is globally serializable. In that case, there is no need for interobject synchronization. Any local atomicity property must be such that, in satisfying the property, the objects agree on at least one (global) serialization order for the committed transactions. Weihl [1989] identifies three optimal local properties such that no strictly weaker local constraint on objects suffices to ensure global atomicity for transactions. These three properties are dynamic atomicity (a generalization of two-phase locking), static atomicity (a generalization of timestamp ordering) and hybrid atomicity (a combination of the other two). In cases of dynamic and hybrid atomicity, global control may still be needed to resolve deadlocks. Finally, combining objects with different atomicity properties does not guarantee global serializability. Schwartz and Spector [1984] identify groups of types called cooperative types consisting of types whose objects produce compatible serial histories. No interobject synchronization is needed for objects belonging to cooperative types.

4.2.2 Semantic Serializability and Type-specific Operations. In an object-oriented system, information is represented by typed objects that can be accessed only through a number of predefined type-specific methods. Systems typically exploit this property to enhance concurrency [Skarra and Zdonik 1989] by:

(1) Modifying the definition of history equivalence. A history is equivalent to another history when a difference between their results cannot be detected by data operations. Thus, two histories are semantically indistinguishable, even though they may leave objects in slightly different states, as long as these differences cannot be detected by operations.

(2) Permitting operations to be characterized in finer detail than simply as reads and writes. Most approaches also consider arguments and results as part of operations. This offers more concurrency because it allows for more precise definitions of conflicts. The most common definitions of conflicts in the object-oriented database context are: (1) Commutativity-based: Under this definition, two operations conflict if they do not commute. In Badrinath and Ramamritham [1988] commutativity for complex objects is defined based on the structure of objects. Two operations on an object O commute if they do not affect the same component objects (e.g., instance variables) of O. Weihl [1988] defines commutativity as forward commutativity and backward commutativity. The definition is given in terms of state automata that describe the acceptable sequence of operations by each object, depending on its type. Let S be a sequence of operations and s a state, then let S(s) stand for the state reached after applying S on s. Two sequences of operations S and T commute forward if, for every state s of the object in which T and S are both acceptable, T(S(s)) = S(T(s)) and S(T(s)) is an acceptable state; and commute backward if T(S(s)) = S(T(s)). The definition of commutativity chosen for the conflict relation determines the recovery algorithm used. An intention list algorithm is used with a conflict relation based on forward commutativity, while an undo log algorithm is used with backward commutativity. (2) Dependency-based: Under this definition, two operations conflict if one depends on the other. A binary relation R on operations is a dependency relation [Herlihy and Weihl 1991] if for all sequences of operations T and S and all operations p, such that S and p are acceptable after T and no operation q in S depends on p (i.e., (p, q) ∉ R), it should be acceptable to do S after p. An example of a dependency relation is the invalidated-by relation [Herlihy and Weihl 1991], where operation p invalidates operation q if there are sequences of operations T and S such that T ∘ p ∘ S and T ∘ S ∘ q are acceptable but T ∘ p ∘ q is not (∘ is the concatenation operation).

Most algorithms used for ensuring concurrency are locking schemes where nonconflicting operations are assigned locks with compatible modes.
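The forward-commutativity condition can be checked mechanically for a small type. The sketch below is an illustration under simplifying assumptions: it models a bank account whose state is its balance, treats single operations rather than sequences, ignores operation results, and examines only a finite range of states. The names deposit, withdraw, and commute_forward are hypothetical.

```python
def deposit(n):
    return lambda balance: balance + n


def withdraw(n):
    # Not acceptable (None) when the balance is insufficient.
    return lambda balance: balance - n if balance >= n else None


def apply(op, state):
    """Apply op, propagating 'not acceptable' as None."""
    return None if state is None else op(state)


def commute_forward(s_op, t_op, states):
    """Check T(S(s)) = S(T(s)) for every state where S and T are acceptable."""
    for s in states:
        if s_op(s) is None or t_op(s) is None:
            continue  # only states in which both are acceptable matter
        t_after_s = apply(t_op, s_op(s))  # T(S(s))
        s_after_t = apply(s_op, t_op(s))  # S(T(s))
        if t_after_s is None or s_after_t is None or t_after_s != s_after_t:
            return False
    return True


STATES = range(20)
deposits_commute = commute_forward(deposit(5), deposit(3), STATES)
mixed_commute = commute_forward(deposit(5), withdraw(3), STATES)
withdrawals_commute = commute_forward(withdraw(5), withdraw(3), STATES)
```

Under these assumptions two withdrawals conflict: with a balance of 5 each is acceptable alone, but not one after the other. A locking scheme built on this relation would therefore give the two withdrawals incompatible lock modes.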

The above description does not clearly specify which are the primitive operations of a transaction. Traditionally, the primitive operations are read and write operations, but object-oriented transaction management techniques differ on how they define them. A primitive operation must be implemented atomically by the system (for example, using mechanisms such as semaphores or monitors). Some researchers [Weihl 1989; Skarra and Zdonik 1989] define methods as the primitive operations, implying that methods are performed atomically by the system. This is not a realistic assumption, especially when methods invoke methods in other objects, which is necessary when complex objects (objects that have other objects as instances) are supported. In addition, it provides less concurrency than the concurrency provided by approaches [Hadjilacos and Hadjilacos 1991] that define methods as transactions of low-level primitive operations.

4.2.3 Nested Transactions. There is an implicit nesting in object-oriented transactions imposed by the way methods are invoked. Methods may invoke other methods, leading to a nested transaction structure. The objective is to allow methods to exhibit internal parallelism by exploiting their semantics. A variation of nesting is introduced in Skarra [1991] and Skarra and Zdonik [1989] as a conceptual framework for modeling design applications. Here nesting is not the result of the invocation order of methods but is based on the application requirements. A design task is modeled as a nested structure of transactions, where the root is an atomic transaction and the nodes are either atomic transactions or Transaction Groups (TGs). The members of a TG are called cooperating transactions. While atomic transactions are consistent, members of a cooperative transaction may or may not be individually consistent. Synchronization is modular and is controlled at two levels, by TGs and by objects. Objects synchronize atomic transactions. TGs produce consistent histories of their members, where consistency is defined in terms of semantic patterns. A pattern is a sequence of operations that represent semantic actions, and it preserves consistency within or among objects. A history of cooperating transactions is consistent if it satisfies the semantic patterns that apply to the group.

4.3 Transaction Management in Multidatabases

Transaction management in a multidatabase system is performed at two levels, at a local level by the preexisting transaction managers of the local databases (LTMs) and at a global level by a global transaction manager (GTM) superimposed on them. There are two types of transactions, local and global transactions. Local transactions access data managed by a single local database system and are submitted to the appropriate LTM outside the control of the GTM. Global transactions may access data in more than one local database system, and are submitted to the GTM, where they are parsed into a number of global subtransactions, each of which accesses data in a single local database. These subtransactions are then submitted for execution to the appropriate LTM. Global subtransactions are viewed by an LTM as ordinary local transactions. The GTM retains no control over the execution of global subtransactions after their submission to the LTMs and can make no assumptions about the LTMs.

4.3.1 Multidatabase Consistency Criteria. In a multidatabase, a local history includes all local and global subtransactions executed at a local site, and a global history includes all local and global transactions. The most commonly used criterion for ensuring consistency is based on conflict-serializability. The goal is to ensure global serializability, that is, serializability of global histories, under the assumption that all local histories are serializable. In general, to ensure global serializability it suffices to ensure that there is a serialization order for the global transactions that is consistent with all local serialization orders. Ensuring global serializability is difficult because the GTM has no knowledge or control over the serialization order of local histories. Local transactions can be the cause of indirect conflicts between global transactions that do not conflict directly.

One practical solution to the problem of maintaining global serializability was proposed in Georgakopoulos et al. [1991] and is based on the concept of a ticket. A ticket is a special data item stored at each local database. Each global transaction, before it starts executing at any site, must read the ticket at this site, increment it, and write back the incremented value. These operations force direct conflicts between all subtransactions at the site, and thus the ticket value read by a subtransaction indicates its serialization order at this site and can be used by the GTM to ensure global serializability. Zhang and Elmagarmid [1993] discuss the properties that global transactions must have such that their serialization order at each local site can be determined without using any knowledge about the local transaction management. These properties are realized by enforcing additional conflicts between global transactions.
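The role of the ticket can be illustrated with a small in-process simulation. It is a sketch under strong assumptions: the LocalSite class and order_from_tickets function are hypothetical, and a real ticket is a data item read and written by the subtransaction itself under the local concurrency control, not a Python attribute. The GTM side then checks whether the ticket values observed at all sites admit one common order; graphlib requires Python 3.9 or later.

```python
from graphlib import CycleError, TopologicalSorter


class LocalSite:
    """A local database reduced to its ticket data item."""

    def __init__(self, name):
        self.name = name
        self.ticket = 0

    def take_ticket(self):
        value = self.ticket      # read the ticket
        self.ticket = value + 1  # increment it and write it back
        return value


def order_from_tickets(tickets):
    """tickets: {site: {global_txn: ticket_value}}.

    Returns a global serialization order consistent with the ticket order at
    every site, or None if the sites disagree.
    """
    predecessors = {}
    for per_site in tickets.values():
        ordered = sorted(per_site, key=per_site.get)
        for txn in ordered:
            predecessors.setdefault(txn, set())
        for earlier, later in zip(ordered, ordered[1:]):
            predecessors[later].add(earlier)
    try:
        return list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return None


site_a, site_b = LocalSite("A"), LocalSite("B")
observed = {"A": {}, "B": {}}
observed["A"]["G1"] = site_a.take_ticket()
observed["B"]["G1"] = site_b.take_ticket()
observed["A"]["G2"] = site_a.take_ticket()
observed["B"]["G2"] = site_b.take_ticket()

# G1 precedes G2 at site A but follows it at site B.
crossed = {"A": {"G1": 0, "G2": 1}, "B": {"G2": 0, "G1": 1}}
```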

Both of the above approaches require the enforcement of conflicting operations among global subtransactions at each local site. However, enforcing conflicts may result in poor performance if most global transactions would not naturally conflict. Relaxing global serializability is thus a significant issue for concurrency control in multidatabases. Alternative definitions of consistency criteria that are less restrictive than global serializability have been introduced. Examples include two-level [Blair et al. 1992], quasi- [Du and Elmagarmid 1992], and view-based [Zhang and Pitoura 1993] serializability.

Many researchers have studied different properties of local histories and how they affect global serializability [Blair et al. 1992]. One interesting such property of histories is rigorousness [Breitbart et al. 1991]. Rigorous histories disallow conflicts between uncommitted transactions. The commitment order of the transactions in a rigorous history determines their relative serialization order; thus the GTM can ensure global serializability by ensuring that the order of commitment of transactions at all local databases is compatible.

4.3.2 Atomicity and Failures. An even greater challenge than satisfying global serializability is maintaining the atomicity of a global transaction in the presence of failures; that is, ensuring that either all its subtransactions commit or they all abort. It has been proven that there is no algorithm that can ensure atomic global transactions in the absence of a prepare-to-commit state [Mullen et al. 1992]. Two techniques are used for handling nonatomic commitment. The first technique attempts to ensure atomicity of a global transaction by trying to commit its aborted subtransactions. To do so, either an aborted subtransaction is resubmitted for execution as a new subtransaction of the original transaction (retry approach) or only the write operations of the aborted transaction are resubmitted (redo approach). The second technique attempts to undo the results of a committed subtransaction by executing a compensating transaction at the corresponding local site that undoes, from a semantic point of view, the effects of the subtransaction [Breitbart et al. 1992].
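The two techniques can be contrasted in a minimal sketch. The function `recover`, the outcome labels, and the two callbacks are hypothetical; in particular, the sketch assumes that a resubmitted subtransaction eventually commits, which a real system cannot take for granted. The `resubmit` callback stands in for either the retry or the redo approach.

```python
def recover(outcomes, strategy, resubmit, compensate):
    """Restore atomicity after a nonatomic commit.

    outcomes maps site -> "committed" or "aborted" for one global transaction.
    resubmit(site) reruns an aborted subtransaction (retry or redo approach).
    compensate(site) runs a compensating transaction at a committed site.
    """
    aborted = [site for site, state in outcomes.items() if state == "aborted"]
    committed = [site for site, state in outcomes.items() if state == "committed"]
    if not aborted:
        return "committed"
    if strategy == "retry":
        # First technique: push the aborted subtransactions forward.
        for site in aborted:
            resubmit(site)  # assumed to succeed eventually
        return "committed"
    if strategy == "compensate":
        # Second technique: semantically undo the committed subtransactions.
        for site in committed:
            compensate(site)
        return "aborted"
    raise ValueError(f"unknown strategy: {strategy}")
```

Either way the global transaction ends in a uniform state: all subtransactions take effect, or none does from a semantic point of view.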

4.3.3 Extended Transaction Models. The traditional transaction concept is very difficult to maintain in a multidatabase system. As we have already pointed out, it is very hard to ensure global serializability. Moreover, due to local autonomy, a local database is entitled to delay or even reject a global transaction. Thus global transactions tend to be long-lived and error-prone, and the traditional transaction model seems inadequate to model them efficiently. Moreover, global transactions often model complex tasks with arbitrary dependencies among their subcomponents. Many researchers have proposed extended transaction models and flow-of-control structures that take into account the above considerations [Elmagarmid et al. 1990; Georgakopoulos et al. 1993].

4.4 Bringing the Two Concepts Together

Although research in object-oriented transaction management and research in multidatabase transaction management are being performed in parallel, it seems that they both lead to what we call layered transaction management. Layered transaction management is introduced in Section 4.4.1, where the common concepts between object-oriented and multidatabase transaction management are discussed. In Section 4.4.2 we describe how ideas pertinent to object-oriented transaction management can be adapted for multidatabase transaction management. Since this is a new research direction, the following sections are only a first attempt to address the issues involved.

4.4.1 Layered Transaction Management. Traditional transaction management coped with the problem of maintaining consistency in a single database. Research in transaction management in multidatabase systems and research in transaction management in object-oriented database systems have independently introduced the notion of layered databases, where each object (local database) is a single database responsible for its own consistency, and where transactions span more than one single database. The structure of these systems has led to layered transaction management: local or intraobject transaction management, and global or interobject transaction management.

The traditional treatment of layered transaction management is based on the sole assumption that each local database accepts correct histories, e.g., histories that maintain local consistency. Then, properties of histories or local transaction managers are identified such that global consistency is maintained. Research to identify properties of local histories sufficient for maintaining global consistency is being pursued in both the multidatabase and the object-oriented database communities [Breitbart et al. 1991; Weihl 1989].

4.4.2 Adapting Object-Oriented Techniques. In this section we discuss the effect of object-orientation on multidatabase transaction management. We identify two important impacts. The first is the use of object-oriented techniques to implement extended transaction models. The second is the use of semantic information. In the following, we elaborate on them.

One application of object-orientation in multidatabase systems is the use of object-oriented techniques to implement extended transaction models. Under this implementation, transactions are modeled as objects and their interactions as the methods of these objects [Heiler et al. 1992; Buchmann et al. 1992]. Flat transactions (transactions with a single begin and commit point) correspond to simple objects and extended transactions correspond to complex objects. Implementing transactions as objects requires neither an object-oriented common data model nor an object-based architecture.

In particular, most systems that support extended transactions model their transactions as active objects [Buchmann et al. 1992; Manola et al. 1992]. Active objects [Buchmann et al. 1992; Dayal et al. 1988] are defined as objects capable of responding to events by triggering the execution of actions when certain conditions are satisfied. An event-condition-action (ECA) rule specifies the events that are to be monitored, the conditions that must be fulfilled, and the actions that are to be executed. Transactions involving active objects are intrinsically nested, since events may be detected while a transaction is being executed in that object, and thus the corresponding rule is spawned as a nested transaction.
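An ECA rule can be pictured with a minimal sketch. The classes `ECARule` and `ActiveObject` and the stock example are hypothetical and are not taken from any of the cited systems; the sketch fires actions directly and does not model the nested transaction in which a real system would run them.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ECARule:
    event: str                          # event to be monitored
    condition: Callable[[dict], bool]   # condition that must be fulfilled
    action: Callable[[dict], None]      # action to be executed


class ActiveObject:
    """Object that reacts to signalled events through its ECA rules."""

    def __init__(self, **state):
        self.state = dict(state)
        self.rules = []

    def signal(self, event):
        # Fire every rule whose event matches and whose condition holds.
        fired = 0
        for rule in self.rules:
            if rule.event == event and rule.condition(self.state):
                rule.action(self.state)
                fired += 1
        return fired


stock = ActiveObject(quantity=12, reorder=False)
stock.rules.append(
    ECARule(
        event="withdraw",
        condition=lambda s: s["quantity"] < 10,
        action=lambda s: s.update(reorder=True),
    )
)
stock.state["quantity"] -= 5
stock.signal("withdraw")  # quantity is now 7, so the reorder flag is set
```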

Research in object-oriented transaction management has advanced the use of semantic information in transaction management (see Section 4.2.2). Principles and techniques from this area may be used to produce more efficient transaction management in multidatabase systems that support an object-oriented common data model. Since each database object comes with a specific set of methods, semantic serializability can be employed as the correctness criterion. Moreover, type-specific operations may be utilized to provide more appropriate definitions of the conflict relation.

In the special case where the operational mapping approach is used in translation, each object in a local system provides a number of predefined primitive operations. In that case, the interface of a local system is a set of methods [Klas et al. 1995] rather than read and write operations as in traditional multidatabase transaction management. These methods are assumed to always ensure local consistency constraints if executed in a (locally) serializable way.

As regards the maintenance of atomicity by employing the undo approach, multidatabase systems with an object-oriented common data model allow compensating actions to be defined at the method level. Since each class has only a specified set of methods, it is easy to define a compensation method for each one of them.
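Method-level compensation can be sketched as follows. The `Account` class, its `COMPENSATION` table, and the helpers `run` and `compensate` are hypothetical illustrations; the sketch assumes that each method has an exact inverse that accepts the same arguments, which is the simplest possible case.

```python
class Account:
    # Each method of the class names the method that compensates for it.
    COMPENSATION = {"deposit": "withdraw", "withdraw": "deposit"}

    def __init__(self, balance=0):
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount

    def withdraw(self, amount):
        self.balance -= amount


def run(obj, calls):
    """Execute (method, args) pairs on obj and return a log of what was run."""
    log = []
    for method, args in calls:
        getattr(obj, method)(*args)
        log.append((method, args))
    return log


def compensate(obj, log):
    """Undo a committed subtransaction by compensating its calls in reverse."""
    for method, args in reversed(log):
        getattr(obj, obj.COMPENSATION[method])(*args)
```

Because the set of methods per class is fixed, the compensation table is finite and can be written once per class.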

Finally, semantic information can be used during query processing to produce subtransactions with known interdependencies based on the way the target class is defined. For example, to support parallelization within a transaction, if a query has as its target a superclass whose extension is defined as the disjoint union of the extensions of two subclasses in component databases, the extensions of the two subclasses can be computed concurrently. This approach of combining query processing and transaction management, though promising, has not yet been fully explored.
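The disjoint-union example can be sketched with a thread pool. The function `superclass_extension` and the two stand-in subclass queries are hypothetical; in a real multidatabase each callable would be a subtransaction submitted to a different component database.

```python
from concurrent.futures import ThreadPoolExecutor


def superclass_extension(subclass_queries):
    """Compute a superclass extension as the union of its subclass extensions.

    subclass_queries is a non-empty list of zero-argument callables, each
    returning the extension (a set) of one subclass. Since the extensions
    are disjoint, the queries are independent and can run concurrently.
    """
    with ThreadPoolExecutor(max_workers=len(subclass_queries)) as pool:
        futures = [pool.submit(query) for query in subclass_queries]
        parts = [future.result() for future in futures]
    extension = set()
    for part in parts:
        extension |= part
    return extension


def students():
    # Stand-in for a subquery against one component database.
    return {"s1", "s2"}


def employees():
    # Stand-in for a subquery against another component database.
    return {"e1"}


persons = superclass_extension([students, employees])
```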

5. CASE STUDIES

We conclude this review with a comprehensive study of existing object-oriented multidatabase projects. The projects presented in this section serve as a means to demonstrate how the issues analyzed in the previous sections are being handled in practice. Their inclusion does not, by any means, imply that there are no other systems at least as important as those presented here. The previous analysis applies to other systems as well.

Pegasus provides an interesting, though not yet well-formalized, approach to integration and a practical treatment of query optimization. The ViewSystem provides a comprehensive and consistent approach to performing integration by using constructors. OIS and CIS serve as representatives of the operational mapping approach. EIS/XAIT proposes an object algebra for performing integration by queries and also an application-specific transaction management model. DOMS is the only system whose architecture corresponds directly to the object-based architecture and the proposed standards. In addition, DOMS supports customized transaction management. UniSQL/M provides a detailed classification of conflicts and a number of conflict resolution techniques. Carnot offers a knowledge-based approach to integration. Although Thor is not a multidatabase system, it is included in this section to show how sharing of information from nonpreexisting systems raises new research issues. FBASE offers an example of the use of class hierarchies in integration, while Interbase* provides a complete transaction specification language for its extended transaction model. Finally, FIB introduces the notion of classification, which is based on the structural properties of its proposed data model, and can lead to the automation of the integration process.

The presentation of the systems is structured as follows. The "System Architecture" section describes the structure of the system. The CDM is described in the section titled "Common Data Model." In the "Translation and Integration" section, we describe the processes of translation and integration following the analysis in Section 3. In the "Query Processing" section we describe issues relevant to the execution of global queries. Transaction management is studied in the "Transaction Management" section. Finally, the section entitled "Important Features" emphasizes the main characteristics of the system reviewed.

5.1 Pegasus

Pegasus [Ahmed et al. 1991; Ahmed et al. 1991(a); Ahmed et al. 1993] is a multidatabase system being developed by the Database Technology Department at Hewlett-Packard Laboratories. Pegasus provides access to native and external autonomous databases. A native database is created in Pegasus and both its schema and data are managed by Pegasus. External databases are accessible through Pegasus, but are not directly controlled by it.

[Figure: layered block diagram. Recoverable labels: Information Mining, Browsers, Intelligent Information Access, Parser, Cooperative Information Management, Executive, Query Processor, Schema Manager, Global Interpreter, Object Manager, Storage Services, Local Translator, Mapper, Transaction Manager, Communication Services, Pegasus Agent, Local DBMS, Local Data Access.]

Figure 4. Pegasus system architecture.

System Architecture. The functional layers provided by Pegasus, shown in Figure 4, are adapted from Ahmed et al. [1991(a)]:

• The intelligent information access layer provides such services as information mining, browsers, schema exploration, and natural language interfaces.

• The cooperative information management layer deals with schema integration, global query processing, local query translation, and transaction management.

• The local data access layer manages schema and command translation, local system invocation, network communications, data conversion, and routing.

Common Data Model. The model, based on the Iris object-oriented model, consists of three basic constructs: objects, types, and functions. "Types" is the Iris term for classes. Types are organized in a directed acyclic graph that supports generalization and specialization and provides multiple inheritance. Iris uses functions to model both instance variables and methods. Properties of objects, relationships among objects, and computations on objects are expressed in terms of functions. Pegasus uses the same language, called HOSQL (Heterogeneous Object SQL), both as a data definition and as a data manipulation language. HOSQL is a superset of OSQL, which is the language of Iris, and is query-oriented. HOSQL provides nonprocedural statements to manipulate multiple databases and also supports statements for creating types, functions, and objects both in Pegasus and in the component databases. Furthermore, HOSQL provides for attachment and mapping of schemas of local databases. Specifications of types and functions can also be imported from underlying databases.

Translation and Integration. An external database is represented in Pegasus by its imported schema (which corresponds to the component schema of the extended architecture presented in Section 3). The translation from the native schema to the imported schema and the importation of the external schemas are performed in a single step, using the view mechanism of the HOSQL language. The importation facilities are provided in a modular fashion. For each external data model a separate module can be developed, sold, and installed independently. A technique for automatically importing an external relational schema is described in Albert et al. [1993].

The imported classes are called producer types. Their extension is defined by a special kind of query, called producer expression, over a base type called producer set. The producer expression defines the instances of the producer type based on some stable and identifying literal-valued property for each entity in an external database. Oids are fabricated for instances of the producer types and have a suffix value taken from the producer set and a prefix unique to the producer type. The functions of the producer type are called imported functions and are mapped to properties or relations in external data sources. A characteristic of the function importation mechanism is that a program written in some general-purpose programming language and compiled outside Pegasus may be linked to an Iris function, called foreign function [Chomicki and Litwin 1992].
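The oid fabrication scheme can be illustrated with a short sketch. The string layout (prefix, `#`, suffix) and the names `fabricate_oid` and `producer_type_instances` are assumptions made for illustration; the source only states that the prefix is unique to the producer type and the suffix is taken from the producer set.

```python
def fabricate_oid(type_prefix, key):
    """Build an oid from a prefix unique to the producer type and a suffix
    taken from the producer set (a stable, identifying literal value)."""
    return f"{type_prefix}#{key}"


def producer_type_instances(type_prefix, producer_set):
    """Fabricate one oid for every identifying value in the producer set."""
    return {fabricate_oid(type_prefix, key) for key in producer_set}


employee_numbers = {17, 42}  # identifying property values in an external database
employees = producer_type_instances("Employee", employee_numbers)
```

Since the suffix is stable and identifying, importing the same entity twice yields the same oid, and the type prefix keeps oids of different producer types apart.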

Pegasus' approach to integration distinguishes between the views of the data administrator and of the end user. This distinction is supported by defining two types: unifying types and underlying types. Administrators see both kinds whereas users see only underlying types. Producer types are underlying types. Each underlying type has a unifying type. The initial assumption is that every type is its own unifier, but unifying types and their instances can also be defined by combining more than one underlying type using HOSQL statements. Pegasus supports unifying inheritance; that is, every function defined for a type is also defined for its unifying type. Resolution problems are handled explicitly by the administrator, who defines a reconciler algorithm for each overloaded function. The reconciler algorithm specifies which of the applicable functions will be used.

Pegasus handles the following types of conflicts:

• Semantic conflicts (called domain mismatch) are handled by defining appropriate functions at the unifying type.

• Schema conflicts. Naming conflicts referring to function synonyms are solved by defining aliases. Additionally, names of functions and types can be prefixed by their database names to prevent ambiguities. Structural conflicts (called schema mismatch) can be handled by defining adequate imported functions. An example is presented [Ahmed et al. 1991] for handling the conflicts introduced when the same concept is represented by a function in one local schema and by a class in another. In this example, a function is created in the global schema that returns the value of the local function or the associated object of the local class.

• Identity conflicts (called object identification) are resolved by allowing the user to specify equivalences among objects.

Query Processing. The user issues HOSQL queries against the unified schema of Pegasus. These queries are decomposed into a set of possibly parametric subqueries, each one of which refers to data residing in a single external database. In a parametric subquery some predicates include parameters whose values are received from other subqueries at evaluation time. Query optimization in Pegasus is either cost-based or heuristic-based, depending on the availability of statistical data. The process of cost-based query optimization is modeled by three characteristics: an execution space, a cost model, and a search strategy. For the case of heterogeneous environments, the traditional execution space is extended to include multisite joins and the traditional search strategy. Because of the autonomy of the external databases, Pegasus may not be able to control physical parameters such as page I/O and CPU time. Thus, instead of using cost formulas based on physical parameters, it has developed a set of cost formulas based on logical parameters such as data cardinality and selectivity. A set of calibrating databases has been designed to estimate the values of the coefficients in the cost formulas. If the external databases do not provide adequate information for cost-based query optimization, a number of heuristics are used to generate the evaluation plan. One simple heuristic is to put functions that reference one another and belong to the same external database in the same subquery. This minimizes invocations of an external database. For query evaluation, each decomposed subquery, after being translated into the DML of its external database, is sent to its external DBMS for evaluation. The result is returned to the central query processor of Pegasus, and is used to drive other subqueries which make reference to it.
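The way a result drives a parametric subquery can be pictured with a small sketch. It is not HOSQL and does not reproduce the Pegasus query processor; the two lists of dictionaries stand in for two external databases, and all function and field names are hypothetical.

```python
def subquery_departments(db_a, city):
    """Subquery evaluated at external database A."""
    return [dept["dept_id"] for dept in db_a if dept["city"] == city]


def subquery_employees(db_b, dept_id):
    """Parametric subquery at external database B: dept_id is a parameter
    whose value is received from another subquery at evaluation time."""
    return [emp["name"] for emp in db_b if emp["dept_id"] == dept_id]


def global_query(db_a, db_b, city):
    """Central query processor: results from A drive the subquery sent to B."""
    names = []
    for dept_id in subquery_departments(db_a, city):
        names.extend(subquery_employees(db_b, dept_id))
    return names


db_a = [{"dept_id": 1, "city": "Palo Alto"}, {"dept_id": 2, "city": "Boston"}]
db_b = [{"name": "Ann", "dept_id": 1}, {"name": "Bob", "dept_id": 2},
        {"name": "Cy", "dept_id": 1}]
result = global_query(db_a, db_b, "Palo Alto")
```

Each subquery touches a single external database, which matches the decomposition rule stated above.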

Important Features

• Treatment of conflicts [Ahmed et al. 1991];

• Implementation of foreign functions [Connors and Lyngbaek 1988];

• Cost-based or heuristic-based query optimization, depending on the availability of statistical data [Ahmed et al. 1993].

5.2 ViewSystem

The KODIM (Knowledge Oriented Distributed Information Management) [Kaul et al. 1991] project at GMD-IPSI is mainly concerned with the dynamic integration of heterogeneous and autonomously administered information bases. An object-oriented environment, called ViewSystem, has been developed as a first prototype. The ViewSystem provides an object-oriented query language with extensive view facilities for defining virtual classes from base classes. The ViewSystem is implemented in an object-oriented environment, namely the Smalltalk environment, and in this way benefits from a large set of tools and reusable software.

Common Data Model. The CDM, called the VODAK data model [Duchene et al. 1988], consists of four basic constructs: instances (or objects), types, classes, and methods. Classes are not templates for creating objects, but rather an abstraction for naming collections of objects and associating a number of methods with each collection. Types are templates for defining the structure and behavior of their instances and are organized in a subtype hierarchy (see Section 3.1.2). Classes are related by a number of semantic relationships, such as specialization, generalization, grouping, and aggregation. These semantic relationships have a set-theoretic counterpart, indicating how the objects of semantically related classes correspond to each other; namely, specialization corresponds to subsetting, category generalization to disjoint union, role generalization to overlapping union, grouping to power set, and aggregation to Cartesian product. The query language, called DML, is programming-based and set-oriented. Queries are directed against classes of interest and return the set of instances satisfying a qualification predicate. Queries may be nested to arbitrary depth, i.e., a query may occur at any place in the qualification part where a set-valued term is allowed.
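The set-theoretic counterparts listed above can be written down directly with Python sets. This is only an illustration of the correspondences; the function names are hypothetical and are not VODAK constructors.

```python
from itertools import combinations, product


def specialization(base, predicate):
    """Specialization corresponds to subsetting."""
    return {obj for obj in base if predicate(obj)}


def category_generalization(a, b):
    """Category generalization corresponds to disjoint union."""
    if a & b:
        raise ValueError("category generalization requires disjoint classes")
    return a | b


def role_generalization(a, b):
    """Role generalization corresponds to overlapping union."""
    return a | b


def grouping(a):
    """Grouping corresponds to the power set."""
    items = list(a)
    return {
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    }


def aggregation(a, b):
    """Aggregation corresponds to the Cartesian product."""
    return set(product(a, b))
```

A grouping of a class with n objects therefore contains 2^n candidate groups, which is one reason a real system would not enumerate it eagerly.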

Integration. To support semantic integration the VODAK model allows the definition of virtual classes called intensional classes. There are two kinds of intensional classes, external and derived classes. An external class is the VODAK representation of an information unit imported from an underlying database. Derived classes are constructed using a repertoire of class constructors from a number of base classes. Derived classes in some sense correspond to relational views. The only difference is that derived classes can have methods attached to them that are written in an object-oriented programming language. There is a one-to-one correspondence between class constructors and semantic relationships; when an operator is applied to a number of classes, it establishes a new (derived) class having the corresponding semantic relationship with the argument classes. The methods of the derived class are computed using the methods of the argument classes. The ViewSystem also offers a concept to support the modularization of views. Different views are organized in different modules. A module M consists of a number of classes, an input interface, which is a list of all imported methods of all classes imported to M, and an output interface that consists of all methods exported from M.

Query Processing. The ViewSystem identifies two different ways of performing query processing in the presence of derived classes. One way is to materialize all derived classes that are affected by the query. The other way is to eliminate derived classes by transforming the query into an equivalent set of subqueries that refer to external and base classes only. The ViewSystem takes a hybrid approach; it lets the kind of derived class determine whether materialization or query transformation is more appropriate. Aggregation operators, such as role generalization, grouping, and aggregation, raise the problem of identifying the corresponding objects, and for that reason materialization is favored over query transformation. On the other hand, query transformation is preferable in the case of specialization and category generalization because the objects of the derived classes have a unique counterpart in the related classes, and thus a query can be split into a number of subqueries such that no subquery is faced with the identification problem.

Important Features

• It is embedded in an object-oriented programming environment and benefits from reusable software;

• Provides a concrete methodology for creating virtual classes based on a set of class constructors;

• Offers a hybrid approach to query processing; and

• Allows for organizing views in different modules.

5.3 OIS

The Operational Integration System (OIS) [Gagliardi 1990] is a generalized integration tool that provides the application environments with a uniform interface for accessing data managed by heterogeneous systems. These systems are expected to be file systems, DBMSs, information retrieval systems, remote databank services, or ad hoc applications. OIS has been partially developed in the framework of the Esprit Project 2109 (TOOTSI). OIS is similar to CIS, and both are described together in the following section (Section 5.4).

5.4 CIS

The Comandos Integration System (CIS) [Bertino et al. 1989; Bertino et al. 1988] has been implemented as part of the ESPRIT project COMANDOS. It has been used for integrating several different application environments, including relational DBMSs, graphical databases, and public databanks.

System Architecture. The system architecture is depicted in Figure 5. A client is an application based on CIS that accesses services provided by one or more servers. A server is the abstraction of (part of) a preexisting application. The role of a server is to make available to clients a uniform object-oriented interface on top of the preexisting application.

[Figure: the only recoverable label is "Object-Oriented Interface".]

Figure 5. CIS system architecture.

Common Data Model. The CDM, called abstract data model in the CIS framework (or integration data model (IDM) in the OIS framework), consists of the following basic constructs: objects, classes, methods (called operations), and instance variables (called definitional properties). The notion of a class is both intensional and extensional (see Section 3.1); a class is considered both as a set of objects and as a pattern for creating them. The model supports a number of additional features (see Section 3.1) such as independent and dependent objects, constant properties, the possibility of defining a property as the inverse of another property, and the notion of key. A property can be defined as optional or mandatory. Furthermore, properties can be variant, that is, they can be instances of several different classes. The query language, QL, is logic- and query-based. Queries are issued against all instances (not members) of a class. Thus, the objects resulting from the query belong to the same class, and the type of the result is known at compile time.

Translation. Bertino et al. [1989] introduce the concept of operational mapping. The operational mapping approach (Section 3.3) is based on defining correspondences between operations instead of defining correspondences between data elements. An object-oriented abstract view is defined on top of each local database system using the abstract data model (the abstract view corresponds to the component schema of the extended architecture, see Section 3). The abstract view is defined as a set of operations on a set of abstract data. The operational mapping is defined as the implementation of these abstract operations in terms of primitive operations of the underlying systems. The abstract operations are a set of predefined generic operations classified as: (i) operations on classes, which include insertion and deletion of an object from a class and inspection of the instances of a class, and (ii) operations on objects, which deal with object property manipulations. More complex operations can be built using the primitive ones. Some local systems may not be able to implement some of the generic operations. The drop clause specifies which operations cannot be applied to the related classes.
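The operational mapping idea can be sketched for one hypothetical local system. Here the underlying system is a list of comma-separated text lines, standing in for a file; the class `CsvPersonServer`, its `DROPPED` set (a stand-in for the drop clause), and all method names are assumptions made for illustration and are not the CIS or OIS interfaces.

```python
class UnsupportedOperation(Exception):
    """Raised when a generic operation has been dropped for this server."""


class CsvPersonServer:
    """Abstract view of a Person class over raw 'name,age' text lines."""

    FIELDS = ("name", "age")
    DROPPED = {"delete"}  # this local system cannot implement deletion

    def __init__(self, lines):
        self.lines = lines  # primitive storage of the underlying system

    def _check(self, operation):
        if operation in self.DROPPED:
            raise UnsupportedOperation(operation)

    # Operations on classes
    def insert(self, **properties):
        self._check("insert")
        self.lines.append(",".join(str(properties[f]) for f in self.FIELDS))
        return len(self.lines) - 1  # handle of the new object

    def delete(self, handle):
        self._check("delete")  # always raises here because delete is dropped
        del self.lines[handle]

    def instances(self):
        self._check("instances")
        return list(range(len(self.lines)))

    # Operations on objects
    def get_property(self, handle, name):
        self._check("get_property")
        return self.lines[handle].split(",")[self.FIELDS.index(name)]
```

A client sees only the generic operations; how each one is realized with the primitives of the local system (here, list and string handling) stays inside the server.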

An implementation of the generic operations must be provided for each local system. The implementation of these operations realizes the OIS object-at-a-time interface, that is, an operation always returns a single object. The object-at-a-time interface uses oids. The validity of an oid is bound to the duration of a specific client/server interaction. A special data structure is allocated by the server whenever an object is activated. An object is active if there is at least one client which has a reference to it. An active-object-table contains references to the data structures of all active objects. An oid is the address of an entry in this table. The restriction that oids are valid only during a single client/server interaction makes the implementation of oids possible even when the component systems do not support the identification of objects directly (e.g., the component systems include file or graphical systems).
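A minimal sketch of an active-object table follows. The class and method names are hypothetical, and the sketch simplifies the description above in one respect that should be noted: it releases all entries at the end of an interaction instead of tracking which clients still hold references.

```python
class ActiveObjectTable:
    """Server-side table of active objects; an oid is the key of an entry."""

    def __init__(self):
        self._entries = {}
        self._next_slot = 0

    def activate(self, data):
        """Allocate an entry for an object and return its oid."""
        oid = self._next_slot
        self._next_slot += 1
        self._entries[oid] = data
        return oid

    def dereference(self, oid):
        """Return the data structure of an active object (KeyError if stale)."""
        return self._entries[oid]

    def end_interaction(self):
        """Oids are valid only for one client/server interaction."""
        self._entries.clear()


table = ActiveObjectTable()
oid = table.activate({"name": "ann"})   # e.g. a record read from a file
record = table.dereference(oid)
table.end_interaction()                 # the oid is now invalid
```

Because the oid only names a table entry, the underlying system never has to supply object identifiers of its own.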

Query Processing. The CIS query processor (QP) at the client parses queries written in QL. Since no integration is supported, queries access data only at one server; thus the resulting parse tree is sent to the QP of that server. The QP at the server, after completing the type checking, generates the query evaluation tree according to optimization rules. Finally, the query evaluation is performed using the object-at-a-time interface. The optimization adopts heuristic techniques that use the information stored in the data dictionary at the server.

Important Features

• Introduction of the concept of operational mapping along with an implementation [Bertino et al. 1989; Gagliardi et al. 1990].

5.5 The EIS/XAIT Project

The object management system (OMS) [Pathak et al. 1991; Heiler and Zdonik 1990] is an object-based interoperability framework for engineering information systems (EIS) designed at Xerox Advanced Information Technology (XAIT).

Common Data Model. The CDM (called FUGUE) is an object/function model that consists of three basic constructs: objects, functions, and classes (called types). The query language is set-oriented and comprises a set of built-in functions that apply to collection objects. These functions take instances of specific set types as input arguments and produce set types as output. Built-in functions include functions for predicate-based selection of objects (select), collection manipulation (union, intersection, difference, and flatten, i.e., unnest collections), and creation of new types and instances. They also include an object join (OJoin) function that, when applied to classes A and B, produces a class populated by pairs of objects (a, b) where a ∈ A and b ∈ B (this operation is similar to a form of subclassing). Built-in functions are mapped to local functions.
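Three of the built-in functions can be approximated with Python sets to show their set-in, set-out character. This is not FUGUE syntax; the function names mirror the text, and the optional `predicate` argument of `ojoin` is an assumption added for illustration (without it, all pairs are produced).

```python
def select(collection, predicate):
    """Predicate-based selection of objects from a collection."""
    return {obj for obj in collection if predicate(obj)}


def flatten(collections):
    """Unnest a collection of collections into a single set."""
    return set().union(*collections)


def ojoin(a, b, predicate=lambda x, y: True):
    """Object join: pairs (x, y) with x from a and y from b.

    The predicate is a hypothetical extension; by default every pair qualifies.
    """
    return {(x, y) for x in a for y in b if predicate(x, y)}
```

Each function returns a set, so the results can be fed to further built-in functions, for example `select(flatten(groups), predicate)`.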

Integration. The global schema is defined through a view mechanism. The population of the virtual classes (called derived types) is defined by a query over the base classes. The objects that populate the virtual class are always assigned new oids. The functions of a derived class may invoke functions from the base classes, but these functions will be executed in the scope of the class where they were originally defined; that is, they will be applied not to the new objects but to the objects of the appropriate base class (delegation). The procedure that implements a function has its own view. Each client that requests the application of a function is assigned a view that provides the context in which it will operate.

Transaction Management. In Heiler et al. [1992] the idea of cooperation between transactions in the context of engineering environments (see Section 4.2.3) is further pursued and a framework is developed for coordinating the different groups in an integrated organization. The model supports a hierarchy of groups. The topmost group represents the whole organization. Transaction management is implemented modularly, as a transaction manager hierarchy, which consists of a set of local transaction managers and a global transaction manager that coordinates the local managers. The algorithm employed by the global transaction manager is fixed, and is delivered with the framework. Each group provides its own local transaction manager. These local transaction managers have two parts: a group-specific protocol and a uniform capability (over all groups) for coordinating with neighboring transaction managers. The protocols can be written in any convenient specification language. Each group provides its own correctness criterion relative to its protocol. Global correctness is relative to these individual protocols but the relation has not yet been formalized.

The long-term goal is to provide a toolkit for building customized transaction managers. The toolkit will include the algorithm for the global transaction and a number of commonly used protocols. Organizations will describe their structure and will either write their own protocol or select one from the ones provided. Note that the above transaction model relaxes the requirement for autonomy of local sites in two ways. First, the algorithm employed by local sites (groups) is known to the global transaction manager, and second, transactions of different groups can cooperate and share intermediate results.

Finally, since the transaction model is designed to be used with an object-based system, it provides high-level operations that correspond to the functions supported by the system classes. Transactions, functions, protocols, and transaction managers are modeled as objects.

Important Features

● Definition of view facilities [Heiler and Zdonik 1990]; note also that sharing is implemented by delegation;

● Extended transaction model that supports cooperation between transactions and user-specified correctness [Heiler et al. 1992].

5.6 DOMS

The distributed object management system (DOMS) [Buchmann et al. 1992; Manola et al. 1992], which is being developed at GTE Laboratories, is an object-oriented environment in which autonomous and heterogeneous local systems can be integrated and native objects can be implemented. The local systems are not limited to database systems but may be conventional systems, hypermedia systems, application programs, etc. A prototype DOMS was implemented connecting Apple Macintosh HyperCard applications, the Sybase relational DBMS, and the ONTOS object DBMS. The prototype supports a simplified version of the data model and language and does not currently support concurrency control and recovery facilities but supports a limited form of “distributed commit.”

System Architecture. The DOMS architecture is depicted in Figure 6, as adapted from Manola et al. [1992]. The architecture is built on the general principles of the distributed object-based architectures described in Section 2. DOMs serve as object managers. A local application interface (LAI) provides an interface between a DOM and a local system that allows the DOM to access local data and the local system to make requests to access objects from other local systems or to use DOM services.

Figure 6. DOMS system architecture.

Common Data Model. The CDM, called FROOM (functional/relational object-oriented model), consists of three basic constructs: objects, functions, and types. Functions model both state and behavior. The subtype relation is determined implicitly; any type that supplies the interface required by a type T is a subtype of T. FROOM distinguishes between implementation and interface; thus objects of the same type may support different implementations of the same function. FROOM supports event-condition-action (ECA) rules (see Section 4.4.2). Rules, events, conditions, and actions are defined as object types. The definition of FROOM includes an object algebra that resembles an extended relational algebra. The object algebra includes a set of high-level functions, which, as in FUGUE, are defined for collections of objects, and create new collections as results. The functions provided by the algebra include functions that correspond to operations of the relational algebra (select, project, join), standard set operators (union, intersect, difference), functions for creating new oids, and other miscellaneous functions. Current research is pursuing the definition of a more primitive “RISC” object model [Manola and Heiler 1992]. This model provides the definition of a small set of fundamental concepts to allow the definition of other object models (including FROOM) in terms of this single set. The basic approach involves incorporating at the object level constructs that are representation or metalevel constructs at other models, for example, types for describing methods, state, and object identifiers.
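The implicit subtype rule of FROOM (any type that supplies the interface required by T is a subtype of T) can be sketched as a structural check. The representation of an interface as a dictionary from function names to signatures is an assumption of this sketch, and requiring identical signatures is a simplification; the source does not say how FROOM compares signatures.

```python
def is_subtype(candidate, required):
    """Return True if candidate supplies every function that required declares.

    Both arguments map function names to (argument types, result type).
    Signatures must match exactly here, which is a simplifying assumption.
    """
    return all(candidate.get(name) == sig for name, sig in required.items())

# Hypothetical interfaces.
Shape = {"area": ((), "float")}
Circle = {"area": ((), "float"), "radius": ((), "float")}
Point = {"x": ((), "float")}
```

Under this check Circle is a subtype of Shape without any declaration saying so, while Point is not, because it does not supply area.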

Integration. Integration is accomplished by defining views through queries. When objects involved in the query belong to local attached systems, DOMS maps these queries through object algebra expressions into expressions in the local query languages of the attached systems. Future work will tackle the difficult issue of providing general facilities for creating arbitrary objects and functions using algebra expressions. It will also address the problem of determining an optimum set of algebra functions for use in query optimization.

Transaction Management. The approach taken by DOMS is to identify an extended transaction model that would capture the capabilities of most extended transaction models, to provide the basis for a programmable transaction management facility. In this context, a transaction specification and management environment (TSME) has been suggested in Georgakopoulos et al. [1993] and Georgakopoulos et al. [1994]. The TSME is a toolkit that supports the definition and construction of specific extended transaction models corresponding to application requirements. It provides a transaction specification language and a programmable transaction management mechanism that configures a run-time environment to support the specified transaction model.

DOMS transactions consist of a set of flat transactions, together with a set of transaction dependencies among them. Transactions support operations on the following types of objects:

● Local objects that represent data and functionality in local systems that support transactions.

● Local objects that represent data and functionality in local systems that do not support transactions (transactionless systems) but instead provide only primitive atomic operations.

● Native objects whose state and behavior are maintained by DOMS.

DOMS models transactions as objects. A simple transaction object is a flat transaction that issues operations only on native objects, or issues operations only on objects that are in the same local database system, or issues only a single atomic operation on a transactionless system. DOMS supports two general classes of extended transactions: multisystem and multidatabase transactions. Multisystem transactions are extended transactions that have constituent flat transactions. Multidatabase transactions are a special case of multisystem transactions in which flat transactions are submitted only to native objects or to objects supported by local database systems. In addition, local transactions can be executed autonomously at the local sites. Multidatabase transactions follow the traditional model of heterogeneous transactions presented in Section 4.3.
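The three cases in the definition of a simple transaction object can be written down as a small predicate. This is only an illustrative sketch: the Operation record, its field names, and the kind labels are hypothetical and do not correspond to any DOMS interface.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Operation:
    target_kind: str   # "native", "database", or "transactionless" (labels assumed)
    system: str        # hypothetical name of the system that holds the object

def is_simple(ops):
    """Check whether a flat transaction with these operations is a simple transaction object."""
    kinds = {op.target_kind for op in ops}
    if kinds == {"native"}:
        return True                                   # only native objects
    if kinds == {"database"}:
        return len({op.system for op in ops}) == 1    # one local database system
    if kinds == {"transactionless"}:
        return len(ops) == 1                          # a single atomic operation
    return False
```

A flat transaction that touches two database systems, or mixes native and local objects, fails the check and would have to be expressed as a complex transaction object.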

Multisystem transactions are modeled as complex transaction objects. Complex transaction objects are defined from simple transaction objects using dependency descriptors (DDs). DDs are FROOM functions that describe the interrelations between transaction objects in terms of transaction dependencies. DDs will be implemented using ECA rules. The implementation of DDs for multidatabase transactions is complicated by the fact that the serialization orders of the transactions at each local site are not known (see Section 4.3). If the local histories are rigorous, DOMS will control the commitment order of the transactions; otherwise DOMS will use the ticket method (see Section 4.3).

Important Features

● Complete framework in the context of distributed object architecture;

● Support of transaction management that includes operations at transactionless systems; however, transaction correctness and recovery are not formalized, especially in the presence of local transactions;

● An attempt to apply state-of-the-art knowledge at all parts of the system and include most of the features that appear in the literature.

5.7 UniSQL/M

UniSQL/M [Kim et al. 1993; Kim 1992] is a heterogeneous database system, being developed at UniSQL, that allows the integration of SQL-based relational database systems and the UniSQL/X unified relational and object-oriented database system.

System Architecture. UniSQL/M is a full database system in that it supports a database definition language, a database manipulation language, automatic query processing, access authorization, and distributed transaction management.

Common Data Model: Translation and Integration. The query language, called SQL/M, is an extension of ANSI SQL that incorporates object-oriented data modeling concepts. SQL/M supports view-definition facilities and conflict-resolution techniques. No translation is necessary since the data model of SQL/M is a superset of the relational data model. Tables and classes are uniformly called entities, columns and instance values are called attributes, and tuples and objects are called instances. The population of a class is defined using the member-of relation.

The integration of multiple entities (tables and classes) is accomplished by defining a virtual class. The population of the virtual class is defined by a query on the base entities. The attributes and the methods are defined by explicitly enumerating them along with their domain. It is the responsibility of the user to define both methods and attributes in conformity with the subtyping restrictions. No special import operation is defined. An entity A may be imported as the virtual class V(A), by defining the population of V(A) as equal to the population of A and the attributes and methods of V(A) as equal to the attributes and methods of A. Hiding of attributes and methods can be achieved by not enumerating them.
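The following Python sketch mimics the idea of a virtual class whose population is given by a query and whose attributes are explicitly enumerated. It is not SQL/M syntax; the helper virtual_class, the sample table, and the attribute names are all hypothetical.

```python
def virtual_class(query, attributes):
    """Build the instances of a virtual class.

    query returns the population (an iterable of dicts); only the enumerated
    attributes are exposed on each instance.
    """
    return [{name: row[name] for name in attributes} for row in query()]

# Hypothetical base entity A.
employee_table = [
    {"id": 1, "name": "ann", "salary": 100},
    {"id": 2, "name": "bob", "salary": 300},
]

# Importing A as V(A): same population, all attributes enumerated.
v_employee = virtual_class(lambda: employee_table, ["id", "name", "salary"])

# Hiding: the salary attribute is simply not enumerated.
v_public = virtual_class(lambda: employee_table, ["id", "name"])
```

Import and hiding thus need no dedicated operation; both fall out of how the population query and the attribute list are chosen.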

Kim et al. [1993] provides a taxonomy of possible conflicts along with the resolution techniques used by SQL/M. We present them using the framework introduced in Section 3. V(A) stands for the virtual class that results from the importation of A.

(1) Identity conflicts are not discussed. Actually, it is not clear whether the model supports object identity or not.

(2) Schema conflicts:

(i) Naming conflicts are handled by using renaming operations.

(ii) Structural conflicts. Since the different constructs supported by the model are entities and attributes (methods are not considered in Kim et al. [1993]), the basic taxonomy can be adjusted as follows.

(a) When the same concept is represented by different constructs of the data model, namely by an entity and an attribute, then an entity may be split into multiple parts, or two entities may be integrated into one by performing a vertical join.

(b) When the same concept is modeled by the same constructs, then:

● If the constructs are classes (entities) and have the same intentions, a union compatible join is applied. If, in addition, there is an inclusion relation between the extents of the classes then one can be defined as the subclass of the other. In the special case where a missing attribute has an implicit value, a special expression is provided for determining this value. When the extent of a class A is a subset of the extent of a class B and the attributes and methods of A subsume those of B (meaning they respect the subtype restrictions), then A and B are called extended union compatible and an extended union compatible join can be used; that is, V(A) may be defined as a subclass of V(B). If both entities are tables, then to integrate multiple tables into a single class, vertical join is used.

● When the same construct is an attribute, then, if one attribute is of a primitive type and the other is a complex object, the projection of the aggregation hierarchy is used; that is, only one of the components of the complex attribute is selected in the virtual class. Default coercion operations (e.g., from integer to real) are provided for resolving conflicts between attributes having different primitive types. In addition, a concatenation operation is provided for attributes of the primitive type string. If the attributes of two classes A and B are instances of classes related by a subclass relation, that may imply the same subclass relation between A and B. Resolution techniques for resolving conflicts between attributes that are complex objects of different classes are not discussed.

(3) Semantic conflicts discussed in Kim et al. [1993] refer to different representations for the same data. These different representations include different units, different precision, or different expressions. They are handled by homogenizing representations, that is, by defining the correspondence between different representations and using arithmetic expressions or look-up tables to convert from one to the other.

(4) Data conflicts are not discussed.
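The homogenization of representations described under item (3) can be illustrated with a short Python sketch that uses one arithmetic expression and one look-up table. The attribute names, the choice of units, and the grade scale are hypothetical examples, not taken from Kim et al. [1993].

```python
# Arithmetic expression: a local database stores weight in pounds,
# while the global view (by assumption) uses kilograms.
def pounds_to_kg(pounds):
    return pounds * 0.45359237

# Look-up table: a local database stores letter grades,
# while the global view (by assumption) uses grade points.
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}

def homogenize(row):
    """Convert one local row into the representation of the global view."""
    return {
        "weight_kg": round(pounds_to_kg(row["weight_lb"]), 2),
        "grade_points": GRADE_POINTS[row["grade"]],
    }

converted = homogenize({"weight_lb": 100, "grade": "B"})
```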


Figure 7. Carnot system architecture. The figure shows five sets of services:

● Access Services: 2D and 3D Graphical Interaction Environment, Application Frameworks, etc.

● Semantic Services: Integration, Knowledge Discovery, etc.

● Distribution Services: Relaxed Transaction Management, Declarative Resource Constraint Base, Communication Agents, etc.

● Support Services: ESS, RDA, TP, IRDS, ORB, X.500, etc.

● Communication Services: OSI, Internet, Atlas, SMDS, etc.

Transaction Management. The first release of UniSQL/M presumes two-phase commit support in local database systems but it supports concurrency control, global deadlock detection/resolution, and distributed database recovery [Kim 1992].

Important Features

● Systematic treatment of conflicts;

● Commercial database system already released.

5.8 Carnot

The Carnot project at MCC [Woelk et al. 1993; Huhns et al. 1992; Woelk et al. 1992; Tomlinson et al. 1992] addresses the problem of logically unifying physically distributed, enterprise-wide heterogeneous information, coming from a variety of systems including database systems, database applications, expert systems, and knowledge bases, business workflows, and the business organization itself.

System Architecture. Carnot has developed and assembled a large number of generic facilities. These facilities are organized into five sets of services (see Figure 7, adapted from Woelk et al. [1993]):

(1) Communication services provide the user with a uniform method for interconnecting heterogeneous equipment and resources.

(2) Support services implement basic network-wide utilities. An important component of the support services is a distributed shell environment called Extensible Service Switch (ESS).

(3) Distribution services support relaxed transaction processing and a distributed agent facility.

(4) Semantic services provide a global (enterprise-wide) view of all the resources integrated within a Carnot-supported system.

(5) Access services provide mechanisms for manipulating the other four Carnot services.


The Extensible Service Switch (ESS). ESS [Tomlinson et al. 1992] provides interpretive access to communication resources, local information resources, and applications at a local site. The switch can be thought of as a component of a distributed command interpreter that is used to implement heterogeneous distributed transaction execution and generalized workflow control. It may also be viewed as a high-level programmable communication front-end for applications in a distributed information system. ESS is constructed on top of Rosette. Rosette is a high-performance implementation of an interpreter for an Actor-based model [Gul 1986] enhanced with object-oriented mechanisms for inheritance.

Common Data Model: Translation and Integration. In addition to database schemas, Carnot [Huhns et al. 1992] considers the integration of knowledge-base systems and process models. Instead of translating the schemas (models) of the local resources into a common data model, Carnot compares and merges them with Cyc [Collet et al. 1991], a common-sense knowledge base. Cyc, besides its common-sense knowledge of the world, has knowledge about most data models and about the relationships among them. The common language is called Global Context Language (GCL) and is based on extended first-order logic.

A resource is integrated by specifying a syntax and a semantics translation between the resource and the global context. The syntax translation provides a bidirectional translation between the local resource management language and GCL. The semantics translation is a mapping between two expressions in GCL that have equivalent meaning. This is accomplished by a set of articulation axioms. An articulation axiom has the form $ist(G, \phi) \Leftrightarrow ist(C_i, \psi)$, where $\phi$ and $\psi$ are logical expressions and $ist$ is a predicate that means “is true in the context.” This axiom says that the meaning of $\phi$ in the global context $G$ is the same as that of $\psi$ in the local context $C_i$. After integration, one can access the resources through the global view using GCL. Carnot provides a graphical tool, the Model Integration Software Tool (MIST), that automates some of the routine aspects of model integration.
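As a rough illustration of how a set of articulation axioms relates one global expression to expressions in several local contexts, consider the Python sketch below. Expressions are kept as opaque strings; the context names, the predicates, and the helper local_equivalents are invented for the example and are not part of Carnot or GCL.

```python
from collections import namedtuple

# One record per axiom ist(G, phi) <=> ist(C_i, psi):
# global_expr plays the role of phi, context of C_i, local_expr of psi.
Axiom = namedtuple("Axiom", ["global_expr", "context", "local_expr"])

# Hypothetical axioms for two local resources.
AXIOMS = [
    Axiom("Employee(x)", "payroll_db", "EMP(x)"),
    Axiom("Employee(x)", "hr_kb", "staff-member(x)"),
    Axiom("Salary(x, s)", "payroll_db", "EMP_PAY(x, s)"),
]

def local_equivalents(global_expr, axioms=AXIOMS):
    """Map a global expression to its equivalent in every context that has an axiom for it."""
    return {a.context: a.local_expr for a in axioms if a.global_expr == global_expr}

employee_sources = local_equivalents("Employee(x)")
```

A real implementation would reason over logical expressions rather than match strings; the sketch only shows the bookkeeping role that the axioms play.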

Query Processing and Transaction Management. Carnot’s Distributed Semantic Query Manager (DSQM) [Woelk et al. 1992] executes queries and/or updates against integrated information resources. DSQM has been implemented as an ESS actor object. DSQM’s query graph generator module accepts an SQL string and generates a query graph using information from a global dictionary. The query graph is passed to the semantic augmentation module. This module uses articulation axioms to expand the query graph to include other sources that contain relevant information. If the original query included modifications in a database, the query graph is passed to the relaxed transaction augmentation module, which uses the information stored in a declarative resource constraint base to determine which other databases should be modified and creates separate query graphs for each one of the modifications. The separate query graphs are then related to each other using a transaction graph that defines the relaxed transaction semantics to be used. A language, based on the ACTA formalism [Chrysanthis and Ramamritham 1994], is being designed and implemented that will be used to specify the relationships among subtransactions in the transaction graph. An optimal query plan is then generated for each query graph. Finally, the query plans and the transaction graph are passed to the ESS script generator, which generates a script to be executed at each site.

Important Features

● Use of a common-sense knowledge base as the global schema instead of a data model [Huhns et al. 1992; Collet et al. 1991]; and

● Implementation of the ESS.


Object-Orientation in Carnot. Carnot does not follow any of the three dimensions of object-orientation introduced in Section 1.1. It is included in this survey because it takes advantage of the development of object technology in the implementation of its various tools. Such tools include the ESS, which is an actor object; an object-oriented deductive environment called LDL++ used for application development (this can be viewed as providing an object-oriented external schema on top of the global schema; see Section 3); and a 3D visualization tool.

5.9 Thor

Thor [Liskov et al. 1992] is an object-oriented distributed DBMS being implemented at MIT. Thor is intended to be used in heterogeneous distributed systems to allow programs written in different programming languages to share a universe of persistent objects in a convenient manner. Thor is not a multidatabase system since it does not support the integration of preexisting systems, but rather a distributed database system that allows different systems to share information by means of objects of the Thor universe. A prototype of Thor, called TH, has been implemented in Argus [Liskov 1988].

System Architecture. Thor is intended to run in a distributed environment. Some of the nodes are Thor servers, which store the objects in the Thor universe. Others are client nodes where users of Thor run their programs. The Thor system runs frontends (FEs) at the client nodes, and backends (BEs) and object repositories (ORs) at the servers. Every resilient object resides at one of the ORs. A user always interacts with Thor via an FE, which typically resides at the user’s workstation. Each FE acts on behalf of a single principal client. An FE is authenticated to the ORs with which it interacts using the Kerberos authentication service. The client program interacts with Thor by executing Thor commands such as start or terminate transactions and run operations. An FE makes use of BEs and ORs to carry out these client requests. FEs and BEs perform operations and understand types. ORs are concerned only with managing the resilient storage for objects. Delays in accessing objects from several different ORs are handled by caching objects at the FEs. Caching also reduces the load at the OR servers. Another method used to reduce delay is to combine all calls that can be performed at one OR into a larger “combined operation.”
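The two delay-reducing techniques mentioned above, caching at the FE and combining calls per OR, can be sketched as follows. The classes, method names, and request counter are hypothetical and do not reflect the actual Thor interfaces; the sketch only shows why both techniques reduce the number of round trips.

```python
from collections import defaultdict

class ObjectRepository:
    """Stand-in for an OR: stores objects and counts round trips."""

    def __init__(self, objects):
        self.objects = objects
        self.requests = 0

    def fetch_many(self, oids):
        self.requests += 1   # one round trip, however many objects are requested
        return {oid: self.objects[oid] for oid in oids}

class FrontEnd:
    """Stand-in for an FE with an object cache."""

    def __init__(self, repositories, location):
        self.repositories = repositories   # OR name -> ObjectRepository
        self.location = location           # oid -> OR name
        self.cache = {}

    def fetch(self, oids):
        missing = defaultdict(list)
        for oid in oids:
            if oid not in self.cache:
                missing[self.location[oid]].append(oid)
        for or_name, batch in missing.items():
            # Combined operation: one call per OR for all objects missing from the cache.
            self.cache.update(self.repositories[or_name].fetch_many(batch))
        return [self.cache[oid] for oid in oids]

or1 = ObjectRepository({1: "a", 2: "b"})
or2 = ObjectRepository({3: "c"})
fe = FrontEnd({"or1": or1, "or2": or2}, {1: "or1", 2: "or1", 3: "or2"})
first = fe.fetch([1, 2, 3])   # one combined call to each OR
second = fe.fetch([1, 2])     # served entirely from the FE cache
```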

Common Data Model. The Thor data model is language-independent in that it is not embedded in a particular programming language. It provides a type system that allows programs written in different programming languages to share data. A type is defined by a specification; specifications are independent of the programming language used to access the type’s objects and of the language used to implement the type. Thor also provides access to objects through both navigation and queries. It supports full indexing for queries over sets of abstract objects. Thor does not support the integration of preexisting database systems but provides for information sharing among heterogeneous applications through a number of persistent objects that are shared among the applications.

Query Processing. Queries run at ORs. When a set object is used at an FE, metadata about the set is sent to the FE, but the elements of the set are not. The metadata includes information about indexes. Using this information the FE can make decisions about how to carry out the query most effectively.

Transaction Management. Thor is not a heterogeneous DBMS but a homogeneous distributed object-oriented DBMS. Transaction management in Thor is similar to the traditional transaction management in a distributed database system, with the difference that the basis of concurrency control is objects instead of pages or segments. Concurrency control and recovery are provided for individual objects. Objects become persistent only when the transaction that made them persistent commits. Two-phase commit is used as the commitment protocol. The coordination of concurrent transactions from different FEs at the ORs will be accomplished by using an optimistic scheme. A primary copy scheme will be used for replication.

Important Features

● A different approach to the problem of handling heterogeneous information, that allows the heterogeneous system to conveniently share information stored in the form of Thor’s objects;

● Addressing performance issues, such as object-caching and combined operations, as well as physical storage issues that are not considered by MDBSs because such issues are handled by the local systems.

5.10 The InterBase Project

In this section we describe two prototype systems developed at Purdue University as part of the InterBase project [Bukhres et al. 1993]. FBASE [Mullen 1992] concentrates on data modeling issues, while InterBase* [Mullen and Elmagarmid 1993] provides complete transaction support. Currently, work is underway to integrate the data model of FBASE in the InterBase* system.

5.10.1 FBASE. FBASE [Mullen 1992] is an object-oriented multidatabase system that can be characterized as decentralized.

Common Data Model. The FBASE model uses the core characteristics of the object-oriented model, i.e., classes, objects, and methods (functions) to define a class hierarchy appropriate for modeling the schemas of the component systems. The FBASE class hierarchy is depicted in Figure 8, adapted from Mullen [1992]. Each component database system is considered an instance of the predefined class Database. Commands such as create, insert, update, and select are predefined methods of the objects of the Database class. Each object of the Database class is considered to be a collection of instances of the class Relation. Relations have several predefined methods such as project, Cartesian product, union, minus. The FBASE query language, called Federated SQL (FSQL), is an extension of SQL. It extends SQL in the following ways:

● It allows the specification of remote system data. Remote system data are specified by prepending the name of the remote system to the relation name being accessed.

● Complex objects can be defined and referenced.

● Object methods can be defined and referenced.
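The first extension, naming remote data by prepending the remote system name to the relation name, can be pictured with a tiny resolver. The dot separator and the function name are assumptions made for this sketch; the source does not specify the concrete FSQL syntax.

```python
def resolve(relation_ref, local_system):
    """Split a relation reference into (system, relation).

    A reference of the assumed form 'system.relation' names remote data;
    an unqualified reference is taken to name a relation of the local system.
    """
    system, separator, relation = relation_ref.partition(".")
    if not separator:
        return local_system, relation_ref
    return system, relation

remote = resolve("payroll.emp", local_system="here")
local = resolve("emp", local_system="here")
```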

Translation and Integration. Each component has a private schema that describes the data available to its local users (corresponds to the local schema of the 5-level architecture, see Section 3), an export schema that describes the data that other systems may import from it (corresponds to the export schema), and an import schema that describes the data at other component systems that the system knows about (corresponds to the export schema of those systems). No federated schema is created. The data structures used to represent the three schemas are stored at each component system as relations. In addition, a special data structure, called directory, contains a list of remote sites that the component system knows about. A component system can be integrated as an importer (i.e., it can execute global queries), exporter (i.e., global queries can access its data), or both, and various degrees of integration can be supported. Special FBASE servers may perform query language translation and data format translation for each different exporter system.

Important Features

● The class hierarchy provides a uniform way of mapping different data models to the object-oriented model.


Figure 8. FBASE class hierarchy.

5.10.2 InterBase. Currently, InterBase* runs on an interconnected network with a variety of hosts such as Unix workstations and IBM mainframes. It supports global applications accessing many local systems, including SAS, Sybase, Ingres, DB2, and Unix utilities.

System Architecture. InterBase* consists of four types of components (see Figure 9, adapted from Mullen and Elmagarmid [1993]):

● InterBase* servers maintain the data dictionary and are responsible for processing global queries.

● InterBase* clients connect to InterBase* servers and issue global queries.

● Component database systems (CDBSs) are the systems being integrated.

● Component system interfaces (CSIs) act as an interface for the InterBase* servers to the component systems. The CSIs are responsible for translating global queries to the native query language of the local systems and for translating data from the native format to the global format.

Common Data Model. InterBase* uses the same language, called InterSQL, as both a query language and a transaction specification language. InterSQL combines IPL [Chen et al. 1993], the transaction specification language of InterBase, with FSQL used in FBASE (see previous section on FBASE) and adds high-level support for atomic commitment. Translation and integration are not currently supported. To access data stored in a local system, the user must issue queries expressed in the native language. The characteristics of InterSQL related to transaction specification are described in the next section.

Figure 9. InterBase system architecture (labels: InterBase* server, computer network, component system interface, component database system).

Transaction Management. InterBase* supports the Flex transaction model [Elmagarmid et al. 1990], which is an extended transaction model. A Flex transaction is composed of a set of tasks. For each task, the model allows the user to specify a set of functionally equivalent subtransactions, each of which, when completed, will accomplish the desired task. The model also allows the specification of dependencies on subtransactions that might take the form of internal or external dependencies. Internal dependencies define the execution order of subtransactions, while external dependencies define the dependency of a subtransaction execution on events that do not belong to the transaction (such as the start/end events). Finally, it allows the user to control the isolation granularity of a transaction through the use of compensating transactions.
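The notion of functionally equivalent subtransactions can be illustrated with a minimal Python sketch. It is not InterSQL and not the InterBase* implementation: the function `run_flex`, the modeling of a subtransaction as a callable returning a Boolean, and the travel-booking names are all assumptions made for illustration. Dependencies and compensation are deliberately left out.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical modeling choice: a subtransaction is a callable that
# returns True when it completes successfully.
Subtransaction = Callable[[], bool]


def run_flex(tasks: Dict[str, List[Subtransaction]]) -> Optional[Dict[str, str]]:
    """Try the functionally equivalent alternatives of each task in order.

    Returns the name of the alternative that accomplished each task, or
    None when some task cannot be accomplished by any alternative.
    """
    chosen: Dict[str, str] = {}
    for task, alternatives in tasks.items():
        for sub in alternatives:
            if sub():
                chosen[task] = sub.__name__
                break
        else:
            return None  # no alternative completed this task
    return chosen


def book_airline_a() -> bool:
    return False  # simulate a failure at the first component system


def book_airline_b() -> bool:
    return True


def rent_car() -> bool:
    return True


result = run_flex({
    "flight": [book_airline_a, book_airline_b],
    "car": [rent_car],
})
print(result)
```

In this sketch the task "flight" is accomplished by its second alternative after the first one fails, which is the flexibility the model is meant to provide.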

Commitment in InterBase* is specified at the subtransaction level. The desired commitment method is specified for each subtransaction, and a subtransaction may use more than one commitment method. The three basic commitment methods allowed are as follows:

(1) Prepare. The subtransaction is executed to a prepare-to-commit state, where it is guaranteed to be committable, but can still be aborted. This method will be provided for systems that support a visible prepare-to-commit state.

(2) Reservation (Redo). The subtransaction is redone (or reexecuted) until it successfully commits.

(3) Compensation (Undo). The subtransaction is committed independently of the global transaction, and if the global transaction ultimately aborts, the subtransaction is undone.
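The difference between the three methods lies in what happens locally before and after the global outcome is known. The following Python sketch records the local actions for one subtransaction under each method. The names `Method` and `settle`, the log strings, and the assumption that a redo subtransaction is reexecuted only once the global decision is to commit are illustrative choices, not part of InterBase*.

```python
from enum import Enum
from typing import List


class Method(Enum):
    PREPARE = "prepare"
    REDO = "reservation"
    UNDO = "compensation"


def settle(name: str, method: Method, global_commit: bool,
           failures_before_success: int = 0) -> List[str]:
    """Return the sequence of local actions taken for one subtransaction."""
    log: List[str] = []
    if method is Method.PREPARE:
        # Wait in the prepare-to-commit state for the global decision.
        log.append(f"{name}: prepare")
        log.append(f"{name}: commit" if global_commit else f"{name}: abort")
    elif method is Method.REDO:
        # Assumption: reexecution happens after a global commit decision.
        if global_commit:
            for _ in range(failures_before_success):
                log.append(f"{name}: execute failed, redo")
            log.append(f"{name}: commit")
    else:  # Method.UNDO
        # Commit locally first; compensate if the global transaction aborts.
        log.append(f"{name}: commit")
        if not global_commit:
            log.append(f"{name}: compensate")
    return log


print(settle("t1", Method.PREPARE, global_commit=False))
print(settle("t2", Method.REDO, global_commit=True, failures_before_success=2))
print(settle("t3", Method.UNDO, global_commit=False))
```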

Figure 10. FIB system architecture.

InterSQL provides transaction specification facilities to allow the user to define Flex tasks and appropriate commitment methods. An InterSQL program consists of the following fundamental components: objects and types, subtransaction definitions, dependency descriptions among subtransactions, preference descriptions, and a reservation list. Objects serve as results of, and as arguments to, subtransactions. Objects are categorized by types and are capable of participating in a specific set of subtransactions. A subtransaction definition specifies the name of the subtransaction, the system at which it executes, its arguments, the commands to be executed, and the commitment methods that can be used to commit it. In addition, the definition of a subtransaction may include guards (described below) and time options such as the valid time period of its execution and its maximum execution time. The dependency description part of the InterSQL program allows the user to define the execution dependencies between subtransactions. When the dependency conditions and the time constraints for a subtransaction are satisfied, its guard is evaluated, and, if it holds, the execution of the subtransaction is granted; otherwise the execution is delayed until after another evaluation of the guard takes place. The preference description allows the user to specify the conditions under which a subtransaction is preferred over another. Finally, the reservation list represents explicit reservation actions to be taken in an attempt to ensure that the subtransaction can be committed.
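The interplay of dependencies and guards described above can be sketched in Python as a small scheduling loop. This is an illustration under stated assumptions, not InterSQL syntax or semantics: the class `Sub`, the function `run`, the logical clock passed to each guard, and the example subtransaction names are hypothetical, and time options are not modeled.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Set, Tuple


@dataclass(frozen=True)
class Sub:
    name: str
    depends_on: FrozenSet[str]      # subtransactions that must finish first
    guard: Callable[[int], bool]    # hypothetical guard over a logical clock


def run(subs: List[Sub], max_ticks: int = 10) -> List[Tuple[int, str]]:
    """Execute each subtransaction once its dependencies and guard hold."""
    finished: Set[str] = set()
    trace: List[Tuple[int, str]] = []
    pending = list(subs)
    for tick in range(max_ticks):
        for sub in list(pending):
            if sub.depends_on <= finished and sub.guard(tick):
                finished.add(sub.name)
                trace.append((tick, sub.name))
                pending.remove(sub)
            # Otherwise execution is delayed until the next evaluation.
    return trace


trace = run([
    Sub("reserve", frozenset(), lambda tick: True),
    Sub("pay", frozenset({"reserve"}), lambda tick: tick >= 2),
    Sub("ship", frozenset({"pay"}), lambda tick: True),
])
print(trace)
```

Here "pay" has its dependency satisfied at tick 0 but is delayed until its guard holds at tick 2, after which "ship" becomes eligible.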

Important Features

● Use of an extended transaction model;

● Use of a Transaction Specification Language to support the extended transaction model; and

● Treatment of commitment.

5.11 The FIB Project

The Federated Information Bases (FIB) project at Georgia Tech [Navathe et al. 1994] focuses mainly on the semantic interoperability problems encountered in multidatabase systems.

System Architecture. The architecture of FIB is shown in Figure 10, adapted from Navathe et al. [1994]. The flow of control between the components is explained below in the query processing section.

Common Data Model. FIB’s CDM, called CANDIDE, is a terminological knowledge representation model that supports classes, attributes (instance variables), instances, and disjoint classes (classes whose subclasses cannot have any common instances). So far it does not provide methods or any other form of behavioral support. The database schema consists of two partially ordered lattices, one for the class taxonomy and one for the attribute hierarchy. In the attribute hierarchy, each attribute can have at most one parent attribute along with an associated domain. This domain is either an instance, or a set of instances of a class described in the class taxonomy. The domain of an attribute must be a subclass of the domain of its parent attribute. An attribute appearing in a class definition can be qualified by additional value constraints on its domain. These constraints must logically imply the constraints on each attribute of its superclasses.

The same language is used as both a DDL and a DML. Querying is based on the notions of subsumption and classification. A class A subsumes a class B if and only if every instance of B is also an instance of A, i.e., A is a superclass of B. The subsumption relationship is computed on the basis of whether the attribute constraints for class B logically imply the attribute constraints for class A, consistent with the requirement that the constraints of a class imply those of its superclasses. Classification is a search technique which correctly places new classes into an existing lattice. The correct location for a class A is immediately below the most specific classes which subsume A, and immediately above the most general classes subsumed by A. Querying by classification is the process of specifying a query object and then searching for objects which are structurally related to this object using classification.
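Subsumption and classification can be made concrete with a small Python sketch. It is not CANDIDE: a class is modeled, as an assumption, by a dictionary from attribute names to sets of allowed values, and the names `subsumes`, `classify`, and the example classes are hypothetical. The sketch also assumes that no two classes in the lattice are equivalent.

```python
from typing import Dict, FrozenSet, List, Tuple

# Assumed representation: attribute name -> set of allowed values.
ClassDef = Dict[str, FrozenSet[str]]


def subsumes(a: ClassDef, b: ClassDef) -> bool:
    """True if every constraint of `a` is implied by the constraints of `b`."""
    return all(attr in b and b[attr] <= allowed for attr, allowed in a.items())


def classify(new: ClassDef, lattice: Dict[str, ClassDef]) -> Tuple[List[str], List[str]]:
    """Return the immediate parents and immediate children of `new`."""
    subsumers = [n for n, c in lattice.items() if subsumes(c, new)]
    subsumees = [n for n, c in lattice.items() if subsumes(new, c)]
    # Most specific subsumers: no other subsumer lies below them.
    parents = [p for p in subsumers
               if not any(q != p and subsumes(lattice[p], lattice[q]) for q in subsumers)]
    # Most general subsumees: no other subsumee lies above them.
    children = [c for c in subsumees
                if not any(q != c and subsumes(lattice[q], lattice[c]) for q in subsumees)]
    return parents, children


lattice: Dict[str, ClassDef] = {
    "Thing": {},
    "Vehicle": {"wheels": frozenset({"2", "4"})},
    "Car": {"wheels": frozenset({"4"}), "fuel": frozenset({"gas", "electric"})},
    "ElectricCar": {"wheels": frozenset({"4"}), "fuel": frozenset({"electric"})},
}
query: ClassDef = {"wheels": frozenset({"4"})}
parents, children = classify(query, lattice)
print(parents, children)
```

Treating `query` as a query object, the classes found at and below its position in the lattice are the candidates structurally related to it, which is the idea behind querying by classification.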

Translation. The mapping of a relational schema into the CANDIDE data model is described in Navathe et al. [1994]. Each relation is mapped to a class, and tuples are mapped to instances of a class. The key of a relation is associated with the class name. The values within an attribute domain become instances of a class representing the attribute.

Integration. The schema integration process is divided into two distinct phases [Sheth et al. 1993]. In the first phase, the user gives as input to the integration module a set of attribute relationships. Two attributes a1 and a2 can be related as follows: a1 is-equivalent-to a2, a1 is-above or is-below a2 in the attribute hierarchy, or they may be unrelated. In the second phase, the classification and the subsumption techniques are employed to deduce the class relationships automatically. The relationships that are identified between two classes are: subsume, equivalent (each class subsumes the other), disjoint (the classes have no common instances), overlap (neither class subsumes the other, and they are not disjoint), and unrelated. Schema merging operators are automatically applied to pairs of classes according to the nature of relationships between them to generate the global schema. Any changes in the underlying component schemas require only reclassification of the global schema along with any new attribute correspondence. A schema integration tool based on this approach is described in Savasere et al. [1991].
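The five class relationships listed above can be told apart mechanically once a subsumption test is available. The Python sketch below shows one way to do so under explicit assumptions: classes are modeled as dictionaries from attribute names to sets of allowed values, two classes are called unrelated when they constrain no common attribute, and disjoint when some common attribute admits no common value. These rules and the names `subsumes` and `relationship` are illustrative and are not taken from the FIB integration tool.

```python
from typing import Dict, FrozenSet

# Assumed representation: attribute name -> set of allowed values.
ClassDef = Dict[str, FrozenSet[str]]


def subsumes(a: ClassDef, b: ClassDef) -> bool:
    """True if every constraint of `a` is implied by the constraints of `b`."""
    return all(attr in b and b[attr] <= allowed for attr, allowed in a.items())


def relationship(a: ClassDef, b: ClassDef) -> str:
    """Name the relationship between two classes."""
    a_over_b, b_over_a = subsumes(a, b), subsumes(b, a)
    if a_over_b and b_over_a:
        return "equivalent"
    if a_over_b or b_over_a:
        return "subsume"
    shared = a.keys() & b.keys()
    if not shared:
        return "unrelated"
    if any(not (a[attr] & b[attr]) for attr in shared):
        return "disjoint"
    return "overlap"


employee = {"status": frozenset({"full", "part"})}
manager = {"status": frozenset({"full"}), "level": frozenset({"m1", "m2"})}
part_timer = {"status": frozenset({"part"})}
contractor = {"status": frozenset({"full", "temp"}), "dept": frozenset({"x"})}
site_staff = {"status": frozenset({"full", "part"}), "site": frozenset({"A"})}

print(relationship(employee, manager))
print(relationship(manager, part_timer))
print(relationship(site_staff, contractor))
print(relationship({"site": frozenset({"A"})}, {"dept": frozenset({"x"})}))
```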

Query Processing. The flow of control among the FIB’s components is shown in Figure 10. The user interface allows a user to browse the global schema and formulate queries. The query processor is responsible for accepting the user queries and generating the subqueries with the help of a global catalog. The RCM scheduler detects any dependencies and parallelism and generates a schedule for the subqueries. The results of the subqueries must be combined to generate the final results. The result combination is performed by one of the component databases. The DEC scheduler generates a schedule for the result combination operations. The program generator generates the program to be executed by the transaction execution supervisor. The execution order of subqueries and the sequencing information is encoded in the generated program using constructs adapted from DOL [Elmagarmid et al. 1990] (a transaction specification language that is a predecessor of IPL described in the previous section).

The execution supervisor coordinates the execution of the subqueries as specified by the DOL program. First it sets up connections with the local access managers (LAMs). Next, it sends the subrequests to the specified LAMs in the specified order. The subrequests are translated into the local model using information from a local catalog. The subqueries are executed by the local database system and the results are translated back to the CDM. The result combination is performed at a suitable local site. The LAM at this site executes the result combination as it would execute a subquery. First, it waits for the partial results from subqueries at other local sites. Then it creates temporary tables for these results and executes result combination operations. The final result is sent to the user interface.
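The result combination step, in which partial results are loaded into temporary tables and then combined by the local database system, can be illustrated with Python's standard `sqlite3` module. The sketch is not FIB code: the function `combine`, the table and column names, and the use of a join as the combination operation are assumptions chosen for the example.

```python
import sqlite3
from typing import List, Tuple


def combine(names: List[Tuple[int, str]],
            salaries: List[Tuple[int, int]]) -> List[Tuple[str, int]]:
    """Load two partial results into temporary tables and join them."""
    conn = sqlite3.connect(":memory:")
    try:
        # One temporary table per partial result received from another site.
        conn.execute("CREATE TEMP TABLE part_a (emp_id INTEGER, name TEXT)")
        conn.execute("CREATE TEMP TABLE part_b (emp_id INTEGER, salary INTEGER)")
        conn.executemany("INSERT INTO part_a VALUES (?, ?)", names)
        conn.executemany("INSERT INTO part_b VALUES (?, ?)", salaries)
        # The combination itself runs as an ordinary local query.
        return conn.execute(
            "SELECT a.name, b.salary FROM part_a AS a "
            "JOIN part_b AS b ON a.emp_id = b.emp_id ORDER BY a.name"
        ).fetchall()
    finally:
        conn.close()


rows = combine([(1, "Ada"), (2, "Bob")], [(2, 50), (1, 70), (3, 10)])
print(rows)
```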

Important Features

● Use of classification to perform query processing and schema integration; and

● Automation of the schema integration process.

5.12 Conclusions

Tables 2, 3, 4, and 5 present a comparative analysis of the systems. In Table 2, we characterize as complete systems those systems that, in addition to providing an integration framework and a transaction model, support network communication and various operating system facilities. Thor is different from the other systems described in that it does not support the integration of preexisting systems.

System Architecture. DOMS, EIS, and OIS/CIS support an object-based architecture. DOMS in particular, being the most recent of them, provides the functionality and adopts the terminology of most of the proposed architectural standards. In an object-based architecture, all resources are modeled as objects and all provided services are modeled as object methods. Object managers handle objects and the communication between them.

Translation and Integration. Of the systems described, OIS, CIS, FBASE, and InterBase* can be characterized as nonfederated since they do not support the creation of a global schema. All other systems fall in the category of federated databases. OIS and CIS propose an operational mapping approach. Following this approach, each local database must provide a minimum interface, in the form of an implementation for a predefined set of generic operations. These operations are basic operations such as operations for accessing the instance variables (components) of an object or operations for creating new objects. More complex operations are built on top of the generic operations.
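The operational mapping approach amounts to a small contract that every local system implements, with richer operations written once against that contract. A minimal Python sketch follows. The interface `GenericOperations`, its three methods, the dictionary-backed implementation, and the helper `copy_object` are hypothetical and do not reproduce the actual operation sets of OIS or CIS.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterable


class GenericOperations(ABC):
    """Assumed minimum interface each local database must implement."""

    @abstractmethod
    def create(self, **components: Any) -> int:
        """Create a new object and return its identifier."""

    @abstractmethod
    def get(self, oid: int, component: str) -> Any:
        """Read one component (instance variable) of an object."""

    @abstractmethod
    def put(self, oid: int, component: str, value: Any) -> None:
        """Write one component of an object."""


class DictBackedSystem(GenericOperations):
    """Toy local system that keeps its objects in a dictionary."""

    def __init__(self) -> None:
        self._objects: Dict[int, Dict[str, Any]] = {}

    def create(self, **components: Any) -> int:
        oid = len(self._objects) + 1
        self._objects[oid] = dict(components)
        return oid

    def get(self, oid: int, component: str) -> Any:
        return self._objects[oid][component]

    def put(self, oid: int, component: str, value: Any) -> None:
        self._objects[oid][component] = value


def copy_object(source: GenericOperations, oid: int,
                components: Iterable[str], target: GenericOperations) -> int:
    """A more complex operation built only from the generic operations."""
    return target.create(**{c: source.get(oid, c) for c in components})


site_a, site_b = DictBackedSystem(), DictBackedSystem()
original = site_a.create(name="pump", weight=12)
clone = copy_object(site_a, original, ["name", "weight"], site_b)
print(site_b.get(clone, "name"), site_b.get(clone, "weight"))
```

Because `copy_object` touches the local systems only through the generic operations, it works unchanged for any system that implements the interface.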

All federated systems define the global schema by using the view definition facilities of their query language. FIB bases integration on classification, a technique that explores the structure of a class to automatically place it in a given class taxonomy by applying appropriate merging constructors. The ViewSystem provides the most comprehensive set of class constructors. EIS and DOMS define an object algebra for their model and pose the question of optimality for the set of class constructors in terms of query optimization and expressive power. An interesting issue is how virtual classes share the functionality of their base classes. Inheritance is the most popular method, but the classical definition of inheritance from a subclass to a superclass must be formally generalized to provide for classes constructed by methods other than subclassing. EIS introduces the use of delegation as a means for information sharing. The usefulness of all approaches to sharing needs to be evaluated by performance studies. Other interesting issues in object-oriented integration include method resolution and the treatment of identifiers for imaginary objects.

Transaction Management. Most systems that discuss transaction management (DOMS, Carnot, EIS, InterBase*) support extended transaction models. The general trend is for “customized” or “programmable” transaction management. According to this approach, the user will specify the criterion of correctness in terms of desired relationships between different subtransactions. DOMS and Carnot propose ACTA- and DOL-based transaction specification languages, respectively. This work is based on earlier work on extended transaction models done in InterBase [Elmagarmid et al. 1990] and Omnibase [Rusinkiewicz and Sheth 1991]. In addition, InterBase* provides an elaborate treatment of the commitment problem.

Table 3. Data Models and Translation

Pegasus
  Data model: Iris data model
  DD/DM language (*): HOSQL; extension of SQL; query-based
  Translation: During importation; supports automatic translation of relational models

ViewSystem
  Data model: VODAK data model
  DD/DM language: DML; programming-based and set-oriented
  Translation: During importation; relations mapped to ...

CIS/OIS
  Data model: Abstract data model (CIS); integration data model (OIS)
  DD/DM language: QL; extension of a logic-based query language; query-based
  Translation: Special Smalltalk library routines (classes) support translation to most common information sources; operational mapping

OMS
  Data model: FUGUE model
  DD/DM language: Extension of a functional-based query language; set-oriented
  Translation: Not discussed

DOMS
  Data model: FROOM
  DD/DM language: Extension of a functional-based query language; set-oriented
  Translation: Not discussed

UniSQL/M
  Data model: Unified relational and object-oriented model
  DD/DM language: SQL/M; query-based
  Translation: Not necessary; the CDM is a superset of the relational model

Carnot
  Data model: Instead of a CDM, uses a common-sense knowledge base, called Cyc
  DD/DM language: GCL (Global Context Language); based on extended first-order logic
  Translation: Special frames defined for common information sources

Thor
  Data model: Based on Argus
  DD/DM language: Based on Argus; programming-based
  Translation: Not applicable

FBASE
  Data model: Object-oriented; defines a class hierarchy to model the integrated systems
  DD/DM language: FSQL; extension of SQL; query-based
  Translation: Performed by special FBASE servers

InterBase*
  Data model: Object-oriented
  DD/DM language: InterSQL; based on FSQL; provides transaction specification facilities
  Translation: Performed by special servers, called CSIs

FIB
  Data model: CANDIDE; terminological knowledge-based; emphasis on structure rather than on behavior
  DD/DM language: Based on classification
  Translation: Performed at runtime by special translation modules

(*) The characterization of languages is based on the definitions given in Section 2.2.

Table 4. Integration

Pegasus
  Importation (*): By queries; virtual classes called producer types, together with the query that defines them (the producer expression)
  Derived classes (*): By queries; virtual classes called unifying types; functions inherited from base classes by unifying inheritance
  Conflicts: Domain mismatch (semantic); naming and schema mismatch (schema); object identification (identity)

ViewSystem
  Importation: Maps external information sources to methods; virtual classes called external classes
  Derived classes: By applying constructors; constructors supported: specialization, generalization, grouping, aggregation; virtual classes called derived classes
  Conflicts: Not discussed

CIS/OIS
  Integration: Not supported

OMS
  Integration: By queries and functions (constructors); virtual classes called derived classes; object algebra defined with a set of functions that produce new sets of objects from existing ones
  Conflicts: Not discussed

DOMS
  Integration: By queries and functions (constructors); object algebra defined with a set of functions that produce new sets of objects from existing ones
  Conflicts: Not discussed

UniSQL/M
  Integration: By queries
  Conflicts: Comprehensive treatment

Carnot
  Integration: Uses articulation axioms to express mappings between two expressions that have equivalent meaning
  Conflicts: Not discussed

Thor
  Integration: Not applicable; provides for information sharing among heterogeneous systems through a universe of objects

FBASE
  Integration: Not supported

InterBase*
  Integration: Not supported

FIB
  Integration: Relationships between the base classes induced by a classification method; class constructors are then applied automatically
  Conflicts: Not discussed

(*) The description follows the methodology introduced in Section 2.4.

Table 5. Query and Transaction Processing

Pegasus
  Transaction management: Not supported

ViewSystem
  Transaction management: Not supported

CIS/OIS
  Transaction management: Not supported

OMS
  Transaction management: Nested (cooperative) transactions; goal: customize transaction management; no formal definition of correctness

DOMS
  Transaction management: Goal: programmable transaction management; two types of transactions: (i) multisystem transactions (extended transactions defined using ECA rules) and (ii) multidatabase transactions (traditional heterogeneous transactions)

UniSQL/M
  Transaction management: Supports concurrency control; assumes a prepare-to-commit state

Carnot
  Transaction management: Goal: user-specified correctness; provides a language based on ACTA for defining relationships between subtransactions

Thor
  Transaction management: Not a multidatabase, but a (homogeneous) distributed DBMS; objects are the basis of concurrency control; commitment protocol: 2PC; optimistic concurrency control algorithm; primary copy scheme for replication

FBASE
  Transaction management: Not supported

InterBase*
  Transaction management: Supports the Flex extended transaction model; provides a transaction specification language for defining the model; provides an elaborate treatment of commitment at the subtransaction level

FIB
  Transaction management: Not supported; constructs based on the DOL transaction specification language may be used to specify dependencies among subqueries

6. SUMMARY

Using object-oriented techniques to build heterogeneous databases is a promising approach. Objects provide a natural model of a heterogeneous environment. Modeling resources as objects and their services as methods hides the heterogeneity of their implementation and respects their autonomy. At a lower level, providing an object-oriented model for the data in the heterogeneous database facilitates the expression of relations and the resolution of conflicts that exist between entities at different component database systems. Finally, object technology offers an efficient method for modeling and implementing heterogeneous transaction management and for supporting the use of semantic information to allow more concurrency.

Unfortunately, the abundance of models and techniques makes the study and evaluation of object-oriented approaches difficult. In this paper we have presented a unifying analysis of the process of building object-oriented heterogeneous database systems. Various methods have been examined, and a number of real-life systems have been compared. We believe that this comprehensive review will enhance our understanding of these issues, substantiate the use of object-oriented techniques, and help put into perspective existing and future projects.


REFERENCES

ABITEBOUL, S. AND BONNER, A. 1991. Objects and views. In Proceedings of the ACM SIGMOD. ACM Press, New York, 238-247.

AGHA, G. 1986. Actors. The MIT Press, Cambridge, Mass.

AHMED, R., ALBERT, J., DU, W., KENT, W., LITWIN, W., AND SHAN, M.-C. 1993. An overview of Pegasus. In Proceedings of the RIDE-IMS (April), 273-277.

AHMED, R., DESMEDT, P., KENT, W., KETABCHI, M., LITWIN, W., RAFII, A., AND SHAN, M.-C. 1991. Pegasus: A system for seamless integration of heterogeneous information sources. In COMPCON 91 (March), 128-136.

AHMED, R., DESMEDT, P., DU, W., KENT, W., KETABCHI, M., LITWIN, W., RAFII, A., AND SHAN, M.-C. 1991. The Pegasus heterogeneous multidatabase system. IEEE Computer 24, 12 (Dec.), 19-27.

ALBERT, J., AHMED, R., KETABCHI, M., KENT, W., AND SHAN, M.-C. 1993. Automatic importation of relational schemas in Pegasus. In Proceedings of the RIDE-IMS (April), 105-113.

BADRINATH, B. R. AND RAMAMRITHAM, K. 1988. Synchronizing transactions on objects. IEEE Trans. Computers 37, 5 (May), 541-547.

BANERJEE, J., CHOU, H.-T., GARZA, J. F., KIM, W., WOELK, D., AND BALLOU, N. 1987. Data model issues for object-oriented applications. ACM Trans. Office Inf. Syst. 5, 4 (Jan.), 3-26.

BARGHOUTI, N. S. AND KAISER, G. E. 1991. Concurrency control in advanced database applications. ACM Comput. Surv. 23, 3 (Sept.), 269-317.

BATINI, C., LENZERINI, M., AND NAVATHE, S. B. 1986. Comparison of methodologies for database schema integration. ACM Comput. Surv. 18, 4 (Dec.), 323-364.

BEERI, C., BERNSTEIN, P. A., AND GOODMAN, N. 1989. A model for concurrency in nested transaction systems. J. ACM 36, 2 (April), 230-269.

BERNSTEIN, P. A., HADJILACOS, V., AND GOODMAN, N. 1987. Concurrency Control and Recovery in Database Systems. Addison-Wesley, Reading, Mass.

BERTINO, E. 1991. Integration of heterogeneous data repositories by using object-oriented views. In Proceedings of the First International Workshop on Interoperability in Multidatabase Systems (April), 22-29.

BERTINO, E. 1992. A view mechanism for object-oriented databases. In Advances in Database Technology - EDBT ’92, C. Delobel and G. Gottlob, Eds., Springer Verlag, New York, 136-151.

BERTINO, E., NEGRI, M., PELAGGATI, G., AND SBATELLA, L. 1988. The comandos integration system: An object-oriented approach to the interconnection of heterogeneous applications. In Proceedings of the Second International Workshop on Object-Oriented Database Systems (Sept.), 213-218.

BERTINO, E., NEGRI, M., PELAGGATI, G., AND SBATELLA, L. 1989. Integration of heterogeneous database applications through an object-oriented interface. Inf. Syst. 14, 5, 407-420.

BLAIR, G. S., GALLAGHER, J. J., AND MALIK, J. 1989. Genericity vs inheritance vs delegation vs conformance vs. . . . JOOP (Sept./Oct.), 11-17.

BREITBART, Y., GARCIA-MOLINA, H., AND SILBERSCHATZ, A. 1992. Overview of multidatabase transaction management. VLDB Journal 1, 2, 181-239.

BREITBART, Y., GEORGAKOPOULOS, D., AND SILBERSCHATZ, A. 1991. On rigorous transaction scheduling. IEEE Trans. Softw. Eng. 17, 9 (Sept.), 954-960.

BRETL, R. ET AL. 1989. The GemStone data management system. In Object-Oriented Concepts, Databases, and Applications, W. Kim and F. H. Lochovsky, Eds., ACM Press, New York, 283-308.

BRIGHT, M. W., HURSON, R., AND PAKZARD, S. H. 1992. A taxonomy and current issues in multidatabase systems. IEEE Computer (March), 50-60.

BUCHMANN, A., OZSU, M. T., HORNICK, M., GEORGAKOPOULOS, D., AND MANOLA, F. A. 1992. A transaction model for active distributed systems. In Database Transaction Models for Advanced Applications, A. K. Elmagarmid, Ed., Morgan Kaufmann, San Mateo, Calif., 123-158.

BUKHRES, O. A., ELMAGARMID, A. K., AND MULLEN, J. G. 1992. Object-oriented multidatabases: Systems and research overview. In Proceedings of the International Conference on Information and Knowledge Management (Baltimore, MD, Nov.), 27-34.

BUKHRES, O. A., CHEN, J., DU, W., ELMAGARMID, A. K., AND PEZZOLI, R. 1993. InterBase: An execution environment for heterogeneous software systems. IEEE Computer (Aug.), 57-69.

CASTELLANOS, M. AND SALTOR, F. 1991. Semantic enrichment of database schemas: An object-oriented approach. In Proceedings of the First International Workshop on Interoperability in Multidatabase Systems (April), 71-78.

CATTELL, R. G. G., Ed. 1993. The Object Database Standard: ODMG-93. Morgan Kaufmann, San Mateo, Calif.

CHEN, J., BUKHRES, O., AND ELMAGARMID, A. K. 1993. IPL: A multidatabase transaction specification language. In Proceedings of the 1993 International Conference on Distributed Computing.

CHOMICKI, J. AND LITWIN, W. 1992. Declarative definition of object-oriented multidatabase mappings. In Proceedings of the International Workshop on Distributed Object Management (Edmonton, Canada, Aug.), 307-325.


CHRYSANTHIS, P. K. AND RAMAMRITHAM, K. 1994. Synthesis of extended transaction models using ACTA. ACM Trans. Database Syst. 19, 3 (Sept.), 450-491.

COLLET, C., HUHNS, M. N., AND SHEN, W.-M. 1991. Resource integration using a large knowledge base in Carnot. IEEE Computer 24, 12 (Dec.), 55-62.

CONNORS, T. AND LYNGBAEK, P. 1988. Providing uniform access to heterogeneous information bases. In Proceedings of the Second International Workshop on Object-Oriented Database Systems (Sept.), 162-173.

CZEDJO, B. AND TAYLOR, M. 1991. Integration of database systems using an object-oriented approach. In Proceedings of the First International Workshop on Interoperability in Multidatabase Systems (April), 30-37.

DAYAL, U. 1989. Queries and views in an object-oriented data model. In Database Programming Languages, Proceedings of the 2nd International Workshop. Morgan Kaufmann, San Mateo, Calif.

DAYAL, U., BUCHMANN, A. P., AND MCCARTHY, D. R. 1988. Rules are objects too: A knowledge model for an active, object-oriented database system. In Advances in Database Technology - EDBT ’88, Springer Verlag, New York, 127-143.

DAYAL, U. AND HWANG, H. 1984. View definition and generalization for database integration in a multidatabase system. IEEE Trans. Softw. Eng. 10, 6, 628-645.

DEVOR, C., ELMASRI, R., LARSON, J., RAHIMI, S., AND RICHARDSON, J. 1982. Five-schema architecture extends DBMS to distributed applications. Electron. Des. (March 18), 27-32.

DU, W. AND ELMAGARMID, A. K. 1989. Quasi serializability: A correctness criterion for global concurrency correctness in Interbase. In Proceedings of the International Conference on Very Large Databases (Amsterdam).

DUCHENE, H., KAUL, M., AND TURAU, V. 1988. VODAK kernel data model. In Proceedings of the Second International Workshop on Object-Oriented Database Systems (Sept.), 174-192.

ELMAGARMID, A. K. (Ed.) 1992. Database Transaction Models for Advanced Applications. Morgan Kaufmann, San Mateo, Calif.

ELMAGARMID, A. K., LEU, Y., LITWIN, W., AND RUSINKIEWICZ, M. 1990. A multidatabase transaction model for InterBase. In Proceedings of the 16th International Conference on Very Large Data Bases (Aug. 1990), 507-518.

ELMAGARMID, A. AND PU, C. (Eds.) 1990. Special issue on heterogeneous databases. ACM Comput. Surv. 22, 3 (Sept.).

FANG, D., HAMMER, J., AND MCLEOD, D. 1992. An approach to behavior sharing in federated database systems. In Proceedings of the International Workshop on Distributed Object Management (Edmonton, Canada, Aug.), 66-80.

GAGLIARDI, R., CANEVE, M., AND OLDANO, G. 1990. An operational approach to the integration of distributed heterogeneous environments. In Proceedings of the PARBASE-90 Conference (Miami Beach, Fla., March), 368-377.

GALLAGHER, L. J. 1992. Object SQL: Language extensions for object data management. In Proceedings of the 1st International Conference on Information and Knowledge Management.

GARCIA-SOLACO, M., CASTELLANOS, M., AND SALTOR, F. 1993. Discovering interdatabase resemblance of classes for interoperable databases. In Proceedings of the 2nd International Workshop on Interoperability in Multidatabase Systems, 26-33.

GELLER, J., PERL, Y., AND NEUHOLD, E. J. 1991. Structure and semantics in OODM class specification. SIGMOD Rec. 20, 4 (Dec.), 40-43.

GELLER, J., PERL, Y., NEUHOLD, E., AND SHETH, A. 1992. Structural schema integration with full and partial correspondence using the dual model. Inf. Syst. 17, 6, 443-464.

GEORGAKOPOULOS, D., HORNICK, M., AND KRYCHNIAK, P. 1993. An environment for the specification and management of extended transactions in DOMS. In Proceedings of the RIDE-IMS (April), 253-257.

GEORGAKOPOULOS, D., HORNICK, M., KRYCHNIAK, P., AND MANOLA, F. 1994. Specification and management of extended transactions in a programmable transaction environment. In Proceedings of the 10th International Conference on Data Engineering (Feb.).

GEORGAKOPOULOS, D., RUSINKIEWICZ, M., AND SHETH, A. 1991. On serializability of multidatabase transactions through forced local conflicts. In Proceedings of the 7th International Conference on Data Engineering (Kobe, Japan, April), 314-323.

HADJILACOS, T. AND HADJILACOS, V. 1991. Transaction synchronization in object bases. J. Comput. Syst. Sci. 43, 2-24.

HEILER, S. AND ZDONIK, S. 1990. Object views: Extending the vision. In Proceedings of the 6th International Conference on Data Engineering, 86-93.

HEILER, S., HARADHVALA, S., ZDONIK, S., BLAUSTEIN, B., AND ROSENTHAL, A. 1992. A flexible framework for transaction management in engineering environment. In Database Transaction Models for Advanced Applications, A. K. Elmagarmid, Ed., Morgan Kaufmann, San Mateo, Calif., 88-121.

HERLIHY, M. P. AND WEIHL, W. E. 1991. Hybrid concurrency control for abstract data types. J. Comput. Syst. Sci. 43, 25-61.

HUHNS, M. N., JACOBS, N., KSIEZYK, T., SHEN, W.-M., SINGH, M. P., AND CANNATA, P. E. 1992. Enterprise information modeling and model integration in Carnot. In Enterprise Integration Modeling, Proceedings of the First International Conference, The MIT Press, Cambridge, Mass., 290-299.

KAUL, M., DROSTEN, K., AND NEUHOLD, E. J. 1991.Viewsystem: Integrating heterogeneous infor-

mation bases by object-oriented views. In IEEEInternat~onal Conference on Data Engmeermg,

2-1o.

KENT, W. 1993. The objects are coming! Comput.Standards Interfaces 15.

KIFER, M., KIM, W., AND SAGIV, Y. 1992. Query-

ing object-oriented databases. In Proceedings

of the 1992 ACM SIGMOD Conference,

392-402.

KIM, W. 1990. Introduction to Object-OrientedDatabases. MIT Press, Cambridge, Mass.

KM, W. 1992. The UniSQL\M system. Personal

communication, Sept.

KIM, W. AND SEO, J. 1991. Classifying schematic

and data heterogeneity in multidatabase sys-tems. IEEE Computer 24, 12 (Dec.), 12-17.

KIM, W., CHOI, I., Gw, S., AND SCHEEVEL, M.1993. On resolving schematic heterogeneityin multidatabase systems. Int. J. Parallel Dis-

trib. Databases 1, 251–279.

KLAS, W., FANKHAUSER, P., MUTH, P., RAKOW, T.,

AND NEUHOLD, E. J. 1995. Database integra-

tion using the open object-oriented databasesystem VODAK. In Ob~ect-Oriented Multz-

clatabases, Prentice Hall, Englewood Cliffs,N.J., 1995, to appear.

KRISHNAMURTHY, R., LITWIN, W., AND KENT, W.1991. Language features for interoperability

of databases with schematic discrepancies. InProceedings of the ACM SZGMOD, 40-49.

KULKARNI, K. G. 1993. Object orientation and the

SQL standard. Comput. Standards Interfaces15, 287-301.

KULKARNI, K. G. 1994. Object-oriented exten-sions in SQL3: A status report. In Proceedingsof the 1994 ACM SIGMOD Conference (May),478.

LARSON, J., NAVATHE, S., AND ELMARSI, R. 1989.

A theory of attribute equivalence in databaseswith applications to schema integration. IEEETrans. Sof3w. Eng. 15, 4 (April), 449-463.

LI, Q. AND MCLEOD, D. 1991. An object-oriented approach to federated databases. In Proceedings of the First International Workshop on Interoperability in Multidatabase Systems (April), 64-70.

LIEBERMAN, H. 1986. Using prototypical objects to implement shared behavior in object-oriented systems. In Proceedings of OOPSLA ’86 (Sept.), 214-223.

LISKOV, B. 1988. Distributed programming in Argus. Commun. ACM 31, 3 (March), 300-312.

LISKOV, B., DAY, M., AND SHRIRA, L. 1992. Distributed object management in Thor. In Proceedings of the International Workshop on Distributed Object Management (Edmonton, Canada, Aug.), 1-15.

LITWIN, W., MARK, L., AND ROUSSOPOULOS, N. 1990. Interoperability of multiple autonomous databases. ACM Comput. Surv. 22, 3 (Sept.), 267-293.

MANNINO, M. V., NAVATHE, S., AND EFFELSBERG, W. 1988. A rule-based approach for merging generalization hierarchies. Info. Syst. 13, 3, 257-272.

MANOLA, F. AND HEILER, S. 1992. An approach to interoperable object models. In Proceedings of the International Workshop on Distributed Object Management (Edmonton, Canada, Aug.), 326-330.

MANOLA, F., HEILER, S., GEORGAKOPOULOS, D., HORNICK, M., AND BRODIE, M. 1992. Distributed object management. Int. J. Intell. Cooperative Info. Syst. 1, 1 (June).

MOTRO, A. 1987. Superviews: Virtual integration of multiple databases. IEEE Trans. Softw. Eng. 13, 7 (July), 785-798.

MULLEN, J. G. 1992. FBASE: A federated objectbase system. Int. J. Comput. Syst. Sci. Eng. 7, 2 (April), 91-99.

MULLEN, J. G. AND ELMAGARMID, A. 1993. InterSQL: A multidatabase transaction programming language. In Proceedings of the 1993 Workshop on Database Programming Languages.

MULLEN, J. G., KIM, W., AND SHARIF-ASKARY, J. 1992. On the impossibility of atomic commitment in multidatabase systems. In Proceedings of the 2nd International Conference on System Integration (Morristown, N.J.).

NAVATHE, S., SAVASERE, A., ANWAR, T., BECK, H., AND GALA, S. 1994. Object modeling using classification in CANDIDE and its application. In Advances in Object-Oriented Database Systems, Springer Verlag, New York.

NICOL, J. R., WILKES, C. T., AND MANOLA, F. A. 1993. Object orientation in heterogeneous distributed computing systems. IEEE Computer 26, 6 (June), 57-67.

OBJECT MANAGEMENT GROUP. 1991. The common object request broker: Architecture and specification. OMG Doc. 91.12.1, Dec.

OBJECT MANAGEMENT GROUP. 1992. Object management architecture guide. OMG Doc. 92.11.1, Sept.

OZSU, M. T. AND VALDURIEZ, P. 1991. Principles of Distributed Database Systems, Prentice-Hall, Englewood Cliffs, N.J.

PAPADIMITRIOU, C. 1986. The Theory of Database Concurrency Control. Computer Science Press, Rockville, Md.

PAPAZOGLOU, M. P. AND MARINOS, L. 1990. An object-oriented approach to distributed data management. J. Syst. Softw. 11, 2 (Feb.), 95-109.

PATHAK, G., STACKHOUSE, B., AND HEILER, S. 1991. EIS/XAIT project: An object-based interoperability framework for heterogeneous systems. Comput. Standards Interfaces 13, 315-319.

PEDERSEN, C. 1989. Extending ordinary inheritance schemes to include generalization. In Proceedings of OOPSLA ’89 (Oct.), 407-417.

PITOURA, E. 1995. Extending an object-oriented programming language to support the integration of database systems. In 28th Annual Hawaii International Conference on System Sciences (HICSS-28) (Maui, Hawaii, Jan.), 707-716.

RAJ, R. K., TEMPERO, E., LEVY, H. M., BLACK, A. P., HUTCHINSON, N. C., AND JUL, E. 1991. Emerald: A general-purpose programming language. Softw. Pract. Exper. 21, 1 (Jan.).

RAMAMRITHAM, K. AND CHRYSANTHIS, P. K. 1992. In search of acceptability criteria: Database consistency requirements and transaction correctness properties. In Proceedings of the International Workshop on Distributed Object Management (Edmonton, Canada, Aug.), 120-140.

RUSINKIEWICZ, M. AND SHETH, A. 1991. Multitransaction for managing interdependent data. IEEE Data Eng. Bull. 14, 1 (March).

SALTOR, F., CASTELLANOS, M., AND GARCIA-SOLACO, M. 1991. Suitability of data models as canonical models for federated databases. ACM SIGMOD Rec. 20, 4, 44-48.

SAVASERE, A., SHETH, A., GALA, S., NAVATHE, S., AND MARKUS, H. 1991. On applying classification to schema integration. In Proceedings of the First International Workshop on Interoperability in Multidatabase Systems (April), 258-261.

SCHALLER, T., BUKHRES, O. A., CHEN, J., AND ELMAGARMID, A. K. 1993. A taxonomic and analytical survey of multidatabase systems. Tech. Rep. CSD-TR-93-040, Purdue Univ., West Lafayette, Ind.

SCHOLL, M. H. AND SCHEK, H.-J. 1990. A relational object model. In Proceedings of the International Conference on Database Theory, ICDT ’90 (Dec.), 89-105.

SCHOLL, M. H., SCHEK, H.-J., AND TRESCH, M. 1992. Object algebra and views for multiobjectbases. In Proceedings of the International Workshop on Distributed Object Management (Edmonton, Canada, Aug.), 336-359.

SCHREFL, M. AND NEUHOLD, E. J. 1988. Object class definition by generalization using upward inheritance. In Proceedings of the IEEE International Conference on Data Engineering, 4-13.

SCHWARTZ, P. M. AND SPECTOR, A. Z. 1984. Synchronizing shared abstract types. ACM Trans. Comput. Syst. 2, 3 (Aug.), 223-250.

SHETH, A. P., GALA, S. K., AND NAVATHE, S. B. 1993. On automatic reasoning for schema integration. Int. J. Intell. Cooperative Inf. Syst. 2, 1 (March).

SHETH, A. AND LARSON, J. 1990. Federated database systems. ACM Comput. Surv. 22, 3 (Sept.), 183-236.

SHETH, A. P., LARSON, J. A., CORNELIO, A., AND NAVATHE, S. B. 1988. A tool for integrating conceptual schemas and user views. In Proceedings of the 4th International Conference on Data Engineering (Feb.), 176-183.

SKARRA, A. H. 1991. Localized correctness specification for cooperating transactions in an object-oriented database. IEEE Bull. Office Knowl. Eng. 4, 1, 79-106.

SKARRA, A. H. AND ZDONIK, S. 1989. Concurrency control and object-oriented databases. In Object-Oriented Concepts, Databases, and Applications, ACM Press, New York, 359-421.

SNYDER, A. 1986. Encapsulation and inheritance in object-oriented programming languages. In Proceedings of OOPSLA ’86 (Sept.), 38-45.

SOLEY, R. M. 1992. Using object technology to integrate distributed applications. In Enterprise Integration Modeling, Proceedings of the First International Conference, MIT Press, Cambridge, Mass., 446-454.

STEIN, L. A. 1987. Delegation is inheritance. In Proceedings of OOPSLA ’87 (Oct.), 138-146.

TAYLOR, C. J. 1992. A status report on open distributed processing. First Class (Object Manage. Group Newsl.) 2, 2 (June/July), 11-13.

THOMAS, G., THOMPSON, G. R., CHUNG, C.-W., BARKMEYER, E., CARTER, F., TEMPLETON, M., FOX, S., AND HARTMAN, B. 1990. Heterogeneous distributed database systems for production use. ACM Comput. Surv. 22, 3 (Sept.), 237-265.

TOMLINSON, C., LAVENDER, G., MEREDITH, G., WOELK, D., AND CANNATA, P. 1992. The Carnot extensible service switch (ESS): Support for service execution. In Enterprise Integration Modeling, Proceedings of the First International Conference, MIT Press, Cambridge, Mass., 493-502.

TSICHRITZIS, D. AND KLUG, A. 1978. The ANSI/X3/SPARC DBMS framework. Inf. Syst. 3, 4.

UNGAR, D. AND SMITH, R. B. 1987. Self: The power of simplicity. In Proceedings of OOPSLA ’87 (Oct.), 227-242.

WEGNER, P. 1987. Dimensions of object-based language design. In Proceedings of OOPSLA ’87 (Oct.), 168-182.

WEIHL, W. E. 1988. Commutativity-based concurrency control for abstract data types. IEEE Trans. Computers 37, 12, 1488-1505.

WEIHL, W. E. 1989. Local atomicity properties: Modular concurrency control for abstract data types. ACM Trans. Program. Lang. Syst. 11, 2 (April), 249-282.

WEIKUM, G. 1991. Principles and realization of multilevel transaction management. ACM Trans. Database Syst. 16, 1 (March), 132-180.

WOELK, D., SHEN, W.-M., HUHNS, M., AND CANNATA, P. 1992. Model driven enterprise information in Carnot. In Enterprise Integration Modeling, Proceedings of the First International Conference, MIT Press, Cambridge, Mass., 301-309.

WOELK, D., CANNATA, P., HUHNS, M., SHEN, W.-M., AND TOMLINSON, C. 1993. Using Carnot for enterprise information integration. In Proceedings of the Second International Conference on Parallel and Distributed Information Systems (Jan.), 133-136.

ZHANG, A. AND ELMAGARMID, A. K. 1993. A theory of global concurrency control in multidatabase systems. VLDB J. (July).

ZHANG, A. AND PITOURA, E. 1993. A view-based approach to relaxing global serializability in multidatabase systems. Tech. Rep. CSD-TR-93-082, Purdue Univ., West Lafayette, Ind.

Received January 1994; final revision accepted November 1994; accepted April 1995.