
Object-Oriented Databases: Design and Implementation

John V. Joseph Satish M. Thatte

Craig W. Thompson David L. Wells

Reprinted from PROCEEDINGS OF THE IEEE

Vol. 79, No. 1, January 1991

PROC/79/1/ /41186


JOHN V. JOSEPH, MEMBER, IEEE, SATISH M. THATTE, SENIOR MEMBER, IEEE, CRAIG W. THOMPSON, SENIOR MEMBER, IEEE, AND DAVID L. WELLS, MEMBER, IEEE

Invited Paper

Object-oriented database systems aim at meeting the data modeling, performance, cooperative design, and version management requirements of next-generation applications, such as CAD, CAM, CASE, hypermedia, and expert systems. These needs cannot be met with conventional database systems, which have been developed primarily for business and financial applications. Object-oriented database (OODB) systems represent the confluence of ideas from object-oriented programming languages and database management. The paper presents key features of OODB's, provides a taxonomy of approaches to OODB's, and discusses key OODB architectural and implementation issues, design alternatives, and tradeoffs. It provides a brief summary of a variety of OODB systems, both research prototypes and commercial systems. Finally, it discusses industry efforts to accelerate a consensus that can lead to standards in the OODB area.

I. INTRODUCTION

Many "next-generation" applications, such as computer-aided design and manufacturing systems, computer-aided software engineering, multimedia and hypermedia information systems, and artificial intelligence expert systems, require databases that can support objects of a wide variety of types with the ability to express complex relationships among objects. Conventional database systems (network, hierarchical, or relational models) are not adequate to support the requirements of these applications. Some of the key requirements are: modeling power, performance, long-term and cooperative "transactions" necessary for design activities, and version and configuration management.

Object-oriented database (OODB) systems represent the confluence of ideas from object-oriented programming languages and database management. Object-oriented programming languages, such as Smalltalk, Common Lisp Object System, and C++, provide rich data abstraction capabilities including the powerful modeling capabilities based on the ability to define abstract data types and construct type hierarchies that permit property inheritance. Database systems provide long-term reliable data storage, multiuser access, concurrency control, query, recovery, and security capabilities. OODB's combine the advantages of object-oriented programming languages with those of database systems.

Manuscript received August 17, 1990; revised October 12, 1990. The authors are with the Information Technologies Laboratory, Texas Instruments Inc., Dallas, TX 75265. IEEE Log Number 9041186.

For instance, an OODB is not just a repository of "passive" data, such as numbers and character strings, but can contain a rich variety of complex objects such as arrays, vectors, graphics bit maps, and pictures along with procedures that operate on objects in response to the messages sent to them.

In the past few years, two schools within the database research community have emerged to respond to the needs of these next-generation applications. The first school has advocated the approach of extending relational database technology with object extensions. The Third Generation Database Manifesto [1] articulates the position of this school. The second school has advocated a ground-up approach of developing object-oriented database technology, as articulated in the Object-Oriented Database Manifesto [2]. Several research prototypes and a few commercial products representing both schools have been developed in the past five years. Both schools have vigorously debated their positions [3]. It is not the primary purpose of this paper to compare or evaluate the two distinct approaches taken by the schools. A brief comparison of the approaches can be found in [4]. This paper is concerned with characterizing OODB's. Since the field of OODB's is rapidly evolving, some of our statements may be controversial and reflect our biases; we believe this is unavoidable.

We do not expect OODB's to replace conventional databases in commercial data processing applications for several years. The strength of OODB's will be their ability to support next-generation applications. Conventional databases and their extensions and OODB's will coexist to support different activities of an enterprise. Although OODB technology has great potential, there are still many questions to be answered and challenges to be met before the technology can be considered mature. The variety of commercial products and research prototypes indicates that consensus and standards may be several years away. Performance benchmarks are just beginning to emerge. Feedback from application developers and resulting improvements are just beginning.

This paper will provide the reader with the current status of the rapidly emerging field of OODB's. Section II provides background information on object-oriented concepts and may be skipped by readers with this background. Section III describes the requirements of next-generation applications and how conventional databases fail to meet these needs. The section introduces the terminology used to describe OODB's and presents an introduction to OODB's in terms of the features they are expected to have; a taxonomy of various OODB approaches is also included. The first three sections of the paper are of a tutorial nature. Section IV also has a tutorial style but is geared more toward a reader interested in the state of the art in OODB research. This section discusses key OODB implementation issues, design alternatives, and trade-offs. Section V provides short sketches of a variety of OODB systems, both research prototypes and commercial systems. Section VI discusses some industry efforts to accelerate consensus that can lead to standards in the OODB area. Section VII concludes with a summary of the paper along with a discussion of open issues.

0018-9219/91/0100-0042$01.00 © 1991 IEEE

II. OBJECT-ORIENTATION

Object-orientation is becoming increasingly popular in the design and implementation of computer-based systems. This is because the object metaphor provides a natural way to map real-world objects and their relationships directly to computer representations. In this section, we will define and illustrate the object-oriented concepts and examine how these concepts provide a natural way to model real-world entities and their relationships. For further elaboration and related ideas, see [5]-[8].

The fundamental concepts of object-orientation are objects, object identity, classes, and inheritance. These concepts first appeared in programming languages. Many of the ideas of object-oriented programming date back to Simula-67 [9], [10]. Research in object-oriented programming has produced new programming languages, notably Smalltalk [11], Eiffel [5], and Trellis/Owl [12]. Object-oriented programming has also been supported by extensions to existing languages; some notable examples are: C++ [13] and Objective-C [14] as extensions of C; and Flavors [15], LOOPS [16], and Common Lisp Object System (CLOS) [17] as extensions of Lisp. The field is now sufficiently mature that there are ANSI standardization efforts for the CLOS [17] and C++ [18] languages. Excellent introductions to object-oriented programming can be found in [19]-[21].

Frames [22] is an object-oriented knowledge representation scheme adopted in languages such as FRL [23] and KRL [24]. Commercial products like KEE from IntelliCorp and ART from Inference use frame-based knowledge representation. In the area of databases, there is an overlap between concepts in the object-oriented models and semantic data models [25], [26]. Most semantic data models do not take advantage of the power of the class concept in that the models do not support methods or inheritance. Object-oriented database research attempts to bring to databases the full power of the object-oriented concepts of objects and object identity, classes, and inheritance.

Productivity gains from the use of object-oriented concepts in programming, system design, and databases are widely claimed. Experiments by the customers of Productivity Products International, Inc. claim that for 30 graphics, CAD, and spreadsheet applications the amount of code using the Objective-C language was four times smaller than the code using the C language [27]. This 4X reduction in program code size also implies an attendant reduction in development and maintenance effort and cost. The developers of a program to balance airline over-booking and no-shows estimate that using the object-oriented approach allowed two programmers to complete in three years a task that normally would have required 20 people and a multimillion dollar budget [28]. In addition, the use of object-oriented concepts in program design and implementation appears to provide significant improvements in programmer productivity by facilitating software reuse [5], [6], [28].

A. Concepts and Definitions

Object-oriented systems share a common set of concepts. These concepts are described in this section. As yet, there is no consensus on a qualifying set of core requirements for a system to be called object-oriented.

1) Objects and Object Identity:

a) Objects: The one thing that is common to all object-oriented systems is the notion of "object." Objects are entities (data structures in a computer) that are used to represent abstract or concrete real-world things in the application domain being modeled. Examples of real-world entities include: a boiler in a computer-aided manufacturing (CAM) application, an adder in a computer-aided design (CAD) application, and a battle ship in a military application. Objects are also used to model entities that are purely artifacts of a computer-based system, such as a window in a window system or an input/output buffer.

An object is an entity that has a local state and an ability to manipulate its local state in response to external requests. The local state of an object is the set of values of its attributes (variously called instance variables, properties, data members, or slots). The external requests are called messages; and the program code that operates on the state to change it in response to messages is called a method. The collection of all messages defined for an object constitutes its abstract interface or type. The collection of all methods of an object defines its behavior. The principle of data abstraction or encapsulation states that the local state and the methods of an object are not visible to users of the object; they may only interact with the object by making requests to the object through messages. This principle promotes modularity and maintainability. Since the user of an object cannot make assumptions about the implementation and internal representations of the object, the underlying implementations can be changed without affecting users.

b) Object identity: Each object is associated with a unique identifier, regardless of its current state [29]. The idea is that an object has an existence which is independent of its value. Identity is a stronger concept than simply a value describing an object; it cannot be changed in the same way other values describing the object can be changed. Although not supported by most object-oriented systems, it might even be possible for an object to change its type and retain its identity.

Since identity is a stronger concept than value, it is possible to distinguish between two objects that have the same value. To see the distinction, consider a relation SHIP (NAME, CAPTAIN) in a relational database consisting of "name-captain" tuples. We may ask what happens if two ships named "Sea Breeze" also happen to have captains named "Long John Silver." Considered as two ship objects, we can distinguish between them based on their object IDs regardless of their values. We cannot distinguish them as tuples in a relation, since a relation is a set and a tuple cannot be a member of a set more than once. The object identifiers in object-oriented databases and the record pointers in hierarchical databases are similar; the major difference is that object identifiers are logical pointers, whereas record pointers are physical pointers. A consequence of this is that object pointers, but not record pointers, can be used for referential integrity [30], [31]. It is possible to simulate object identity in a value-based system (in a relational database, for example) by introducing object identifiers explicitly. This, however, places the responsibility of ensuring identifier uniqueness and referential integrity on the user.
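The distinction between identity and value can be sketched in a few lines of Python. This is an illustration only, not taken from any OODB: the `Ship` dataclass and the replacement captain name are assumptions, and Python's `is` operator and `id()` function merely stand in for object identifiers (in CPython they behave like in-memory pointers, not the logical, persistent identifiers an OODB would provide).

```python
from dataclasses import dataclass


@dataclass
class Ship:
    """Hypothetical ship object: '==' compares values, 'is' compares identity."""
    name: str
    captain: str


# Two distinct ships that happen to have identical attribute values.
ship_a = Ship("Sea Breeze", "Long John Silver")
ship_b = Ship("Sea Breeze", "Long John Silver")

same_value = ship_a == ship_b      # value comparison generated by @dataclass
same_identity = ship_a is ship_b   # identity comparison

# Viewed as tuples in a relation (a set), the two ships collapse into one member.
relation = {(s.name, s.captain) for s in (ship_a, ship_b)}

# Changing an object's value does not change its identity.
oid_before = id(ship_a)
ship_a.captain = "Jim Hawkins"     # illustrative new value
oid_after = id(ship_a)
```

Here `same_value` is true while `same_identity` is false, and `relation` holds a single tuple, which mirrors the argument that a purely value-based system cannot tell the two ships apart.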


2) Class: An object has attributes and behavior. Class is a means of grouping together objects that share the same attributes and behavior. A class is implemented by choosing a collection of attributes, or instance variables, in which to store the internal state of the instances, and by writing a method for each message defining the abstract interface of the instances of the class. A class is, therefore, the implementation of the abstract interface or type of an object; and an object's structure and behavior are defined by its type and its class. Members of the class are called instance objects or instances. In object-oriented systems, each object is an instance of some class. Class implements the data modeling concept instance-of. The distinction between "type" and "class" is important for the discussion of architectural choices and implementation issues in Section IV and is further elaborated there.

Class composition hierarchy: A class consists of a set of attributes. The domain of an attribute may be a class that, in turn, may have attributes with domains as classes. This nested structure of a class can give rise to a directed, possibly cyclic, graph representing the composition relationship between a class and its attributes. In object-oriented systems, the composition relationship is the equivalent of the data modeling concept of aggregation. The class composition hierarchy is orthogonal to the class inheritance hierarchy discussed below. The composition hierarchy provides a way to model rich, complex data structures without first flattening out the structure. Examples of aggregation relationships are is_part_of and is_owned_by.

3) Inheritance: Object-oriented systems allow the user to derive a new class from an existing class. The new class inherits all attributes and methods of the original class and may define additional attributes and methods and redefine inherited methods. The new class, a subclass, specializes the original class. The original class is called a superclass of the derived class and is its generalization. Inheritance realizes the data modeling concept is_a. It reduces the need to specify redundant information and hence simplifies updating and modification. It is also used to create objects that are almost like other objects with a few incremental changes. Mechanisms like this are important because they make it possible to declare that certain specifications are shared by multiple parts of a program. Inheritance helps to keep programs shorter and more tightly organized. The power of inheritance is in the economy of expression that results when a class shares description with its superclasses. The common use of class libraries is a good example of software reuse through the inheritance mechanism. Programmers build upon basic objects provided by class libraries by specializing the classes in the libraries.

When a class can have only one immediate superclass, it is called single inheritance; when a class can inherit from multiple classes, we have multiple inheritance. Multiple inheritance increases sharing by making it possible to combine descriptions from several classes. The graph resulting from the subclass-superclass relationship among classes is referred to as the inheritance hierarchy.

B. Illustration of the Concepts

In this subsection, the concepts discussed above are illustrated with an example. Figure 1 shows a Ship class. Instances of this class may be used to model real-world ships. Ship has attributes Next_Port, Position, Speed, and Heading. Next_Port points to an object of class Port, defined elsewhere, using an object identifier; Position is a pair of real numbers representing longitude and latitude; Speed and Heading are real numbers representing knots and compass heading, respectively. Ship's abstract interface consists of messages Get_Next_Port, Set_Next_Port, Get_Speed, Set_Speed, Get_Heading, Set_Heading, Get_Position, Set_Position, and Estimate_Time_Of_Arrival. The Get_ and Set_ messages get or set the values of the corresponding attributes. The method implementing the Estimate_Time_Of_Arrival message computes its result by sending a message to the Next_Port object asking where it is, finding the current time, and then using Position, Heading, and Speed to determine when the port will be reached (an error is signaled if the ship is not going in the right direction to reach the port). Note that since the method is hidden from the user it is possible to change the way Estimate_Time_Of_Arrival is computed without affecting the user's code. It is also possible to reimplement the class with a different set of attributes without affecting the user; for example, the attribute Position may be replaced by two attributes Latitude and Longitude. Ship's methods cannot directly read or change the state of Next_Port; only Port's methods can do so. Because all interactions with Next_Port are through Port's abstract interface, changes to the implementation of Port do not affect Ship.

Fig. 1. Ship class. [Figure: the messages of Ship's abstract interface, the attributes Next_Port, Position, Speed, and Heading, and the methods to handle messages.]
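The Ship example can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the minimal `Port` class, the flat-plane distance formula (one degree treated as 60 nautical miles, which is only accurate for latitude), and the one-degree heading tolerance are all assumptions introduced here. Only some of the Get_ and Set_ messages are shown.

```python
import math
import time


class Port:
    """Hypothetical Port class; Ship only ever sends it get_position."""

    def __init__(self, longitude, latitude):
        self._position = (longitude, latitude)

    def get_position(self):
        return self._position


class Ship:
    def __init__(self, next_port, position, speed, heading):
        self._next_port = next_port  # reference to a Port object
        self._position = position    # (longitude, latitude) in degrees
        self._speed = speed          # knots
        self._heading = heading      # compass degrees: 0 = north, 90 = east

    def get_speed(self):
        return self._speed

    def set_speed(self, knots):
        self._speed = knots

    def get_heading(self):
        return self._heading

    def set_heading(self, degrees):
        self._heading = degrees

    def estimate_time_of_arrival(self, now=None):
        """Return the estimated arrival time in seconds since the epoch."""
        now = time.time() if now is None else now
        # Ask the port where it is through its abstract interface.
        port_lon, port_lat = self._next_port.get_position()
        lon, lat = self._position
        # Assumed flat-plane approximation: 1 degree = 60 nautical miles.
        east = (port_lon - lon) * 60.0
        north = (port_lat - lat) * 60.0
        distance = math.hypot(east, north)
        bearing = math.degrees(math.atan2(east, north)) % 360.0
        # Smallest angle between the bearing to the port and the heading.
        off_course = abs((bearing - self._heading + 180.0) % 360.0 - 180.0)
        if off_course > 1.0:  # assumed tolerance of one degree
            raise ValueError("ship is not heading toward its next port")
        if self._speed <= 0:
            raise ValueError("ship is not moving")
        return now + distance / self._speed * 3600.0
```

Because callers use only the messages, the private `_position` pair could be replaced by separate latitude and longitude attributes, or the distance formula by a more accurate one, without changing any code that uses `Ship`.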

The attribute Next_Port is an instance of class Port; suppose that Port has an attribute called Located_In which is an instance of class City. The same Port object could be pointed to by other objects; e.g., a State object could point to the Port object as My_Port. Also, a City object could point to a State object by an attribute In_State. This, as illustrated in Fig. 2, is how composition hierarchies are formed. It is natural to model application entities as objects and construct a graph of these objects, with relationships among them realized as a composition hierarchy.

Fig. 2. Composition hierarchy. [Figure: objects such as Port and State linked by references such as Located_In and My_Port.]
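The composition hierarchy just described can be sketched as a graph of object references. The class and attribute names follow the text; the constructor signatures and the object names ("Harbor City" and so on) are assumptions made for illustration.

```python
class State:
    def __init__(self, name):
        self.name = name
        self.my_port = None           # will reference a Port


class City:
    def __init__(self, name, in_state):
        self.name = name
        self.in_state = in_state      # references a State


class Port:
    def __init__(self, name, located_in):
        self.name = name
        self.located_in = located_in  # references a City


class Ship:
    def __init__(self, next_port):
        self.next_port = next_port    # references a Port


# Build the graph: Ship -> Port -> City -> State -> Port (a cycle).
state = State("Coastal State")
city = City("Harbor City", state)
port = Port("Harbor City Port", city)
state.my_port = port
ship = Ship(port)

# Navigate the composition hierarchy by following references.
state_of_next_port = ship.next_port.located_in.in_state
```

The same `Port` object is shared by `ship` and `state`, and following `my_port` from the state returns to it, which shows why the composition graph is directed and possibly cyclic rather than a strict tree.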

To illustrate inheritance, consider a new class Cargo_Ship as a subclass of Ship (see Fig. 3). A Cargo_Ship has all the properties of Ship, but can also carry cargo. To support this, Cargo_Ship adds the attribute Items and the following messages to the abstract interface of Ship: Load_Item and Unload_Item, which take as an argument an Item object to be loaded or unloaded, and List_Items, which returns a list of all Items currently on board. At the Cargo_Ship level, it might be necessary to reimplement the Set_Speed method by restricting the maximum Speed of the Ship when carrying certain kinds of Items. For example, it might be desirable to require a maximum speed of 10 knots for Cargo_Ships carrying oil when within 20 nautical miles of a coast.

Fig. 3. Inheritance hierarchy. [Figure: the Ship class (messages, attributes, and methods to handle messages) and its subclass Cargo_Ship (additional messages Load_Item, Unload_Item, and List_Items; additional attribute Items; methods to handle additional messages and redefined methods).]
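A minimal Python sketch of this subclass follows. Several details are assumptions made to keep the example self-contained: the base `Ship` is reduced to its speed attribute, items are plain strings rather than Item objects, and the hypothetical `distance_to_coast` attribute stands in for whatever computation a real system would derive from Position.

```python
class Ship:
    """Reduced base class: only the speed attribute and its messages."""

    def __init__(self, speed=0.0):
        self._speed = speed  # knots

    def get_speed(self):
        return self._speed

    def set_speed(self, knots):
        self._speed = knots


class CargoShip(Ship):
    OIL_SPEED_LIMIT = 10.0  # knots, figure taken from the text
    COASTAL_ZONE = 20.0     # nautical miles, figure taken from the text

    def __init__(self, speed=0.0, distance_to_coast=float("inf")):
        super().__init__(speed)
        self._items = []                             # additional attribute
        self._distance_to_coast = distance_to_coast  # hypothetical attribute

    # Additional messages.
    def load_item(self, item):
        self._items.append(item)

    def unload_item(self, item):
        self._items.remove(item)

    def list_items(self):
        return list(self._items)

    # Redefined method: same message, more restrictive behavior.
    def set_speed(self, knots):
        near_coast = self._distance_to_coast <= self.COASTAL_ZONE
        if "oil" in self._items and near_coast and knots > self.OIL_SPEED_LIMIT:
            raise ValueError("oil carriers are limited to 10 knots near a coast")
        super().set_speed(knots)
```

`get_speed` is inherited unchanged, while `set_speed` is redefined and delegates to the superclass once the restriction has been checked, so the subclass states only what differs from `Ship`.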

III. OBJECT-ORIENTATION AND DATABASES

Computer-based solutions are being sought for tasks that were previously thought to be too complex to be fully automated. Tasks like factory automation, command and control applications, multimedia information systems, and very large scale integration (VLSI) design involve processes dealing with a large number of objects, with different structures and complex behavior, interconnected in intricate networks. We refer to computer-based solutions to such tasks as next-generation applications. As object-oriented technology becomes more accepted, it is being applied to next-generation applications. In these applications, object bases are much larger than a machine's virtual memory, represent information that must be shared by multiple users at many sites, and must be preserved after the processes which created the object bases terminate. The field of "object-oriented database systems" (also called object data bases, object data management, and persistent object bases) has emerged primarily as a result of research into supporting these object bases. A variety of related technologies like "object-oriented analysis and design," "object-oriented communications," "object-oriented application integration frameworks," and "object-oriented interfaces" are also important for the success of next-generation applications; however, discussion of these topics is outside the scope of this paper. Some of these are discussed in [5] and [6]; also many excellent textbooks are available on conventional database systems [32], [33].

In Section III-A, we describe the database requirements of next-generation applications; in Section III-B, we examine why conventional database systems fail to satisfy some of these requirements. In Section III-C, we present the features and characteristics of OODB's. These features and characteristics define a database model that addresses the deficiencies of conventional database systems. Relational database researchers try to remedy the deficiencies by extending or enhancing the relational model [1]. Their approach has a certain appeal because it is an evolutionary migration path and there is a large installed base of relational databases. However, we believe that the ability to make programming language structures directly persistent, without flattening them into tables, has a bigger appeal to next-generation applications. There are lively debates on the relative technical merits of the two approaches, for example [3]. When the debates move from academia to the users, the market place will determine the relative merits of the two approaches.

A. Database Requirements of Next-Generation Applications

Data management requirements of next-generation applications will be increasingly typified by the following requirements [1], [2], [34], [35]:

Rich Data Modeling: As described in Section II, object-oriented concepts can be used to model application data and relationships in a natural manner. This modeling is realized in the transient memory of the computer. Programming languages support object-oriented concepts by means of a rich system of user-defined classes. Many applications need to deal with persistent data (data that outlive the processes that created them) and share the data among multiple users. These applications will benefit from extending the powerful modeling capabilities of the transient environment to persistent, shared data.

Navigational as well as Query Access to Objects: Large applications typically create complex graphs of objects; it must be possible to efficiently navigate through these graphs even after they have been stored in a database. This is particularly important for applications such as hypermedia, which require efficient interactive performance. However, as applications become very large, it is often inconvenient to retrieve every object by name. It is also quite natural to retrieve objects by a query, e.g., in a software engineering application, retrieve definitions of all functions that call a given function. Thus, associative retrieval based on predicates needs to be supported, though not necessarily for all persistent objects.
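The two access styles can be contrasted with a small in-memory sketch of the software engineering example. The `FunctionDef` class, the `select` helper, and the function names are assumptions for illustration; a real OODB would evaluate the predicate inside the database, possibly with index support, rather than scan a Python list.

```python
class FunctionDef:
    """Hypothetical object describing one function in a program."""

    def __init__(self, name):
        self.name = name
        self.calls = []  # direct references to the FunctionDef objects it calls


def select(objects, predicate):
    """Associative retrieval: return every object that satisfies the predicate."""
    return [obj for obj in objects if predicate(obj)]


# A small call graph.
main = FunctionDef("main")
parse = FunctionDef("parse")
report = FunctionDef("report")
lex = FunctionDef("lex")
main.calls = [parse, report]
parse.calls = [lex]
report.calls = [lex]
all_functions = [main, parse, report, lex]

# Navigational access: follow references from a known starting object.
reached = main.calls[0].calls[0]

# Query access: retrieve the definitions of all functions that call a given one.
callers_of_lex = select(all_functions, lambda f: lex in f.calls)
caller_names = sorted(f.name for f in callers_of_lex)
```

Navigation reaches `lex` without naming it, while the query finds `parse` and `report` without knowing where they sit in the graph; next-generation applications need both.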

Sharing of Objects Among Application Systems: Next-generation applications consist of different subsystems that may be implemented in different programming languages. The subsystems need to communicate with each other by exchanging data. It is, therefore, imperative that the database be able to handle data generated by different programming languages. Since object implementation is hidden in the object-oriented style, the language in which a particular object is implemented should be irrelevant to users of that object; this is not of much use if the database cannot support multiple language objects. Since adaptability to change is crucial to databases, it must be possible to use old data in new environments. By supporting multiple languages, it becomes easier to adapt to new languages or variants of old ones.

Seamlessness: We refer to the integration of a database with the rest of the programming environment in a nonobtrusive manner as seamlessness. The differences in the data models supported by databases and programming environments create a seam. Programming environments support a rich data model. The objects created in application programs are, typically, richly interconnected to reflect real-world relationships such as is_part_of, is_a, and is_owned_by. They have rich types: numbers, characters, strings, lists, arrays, vectors, bit maps, and procedures. Much of this information could be in multiple media, such as numbers, text, graphics, images, video, and audio. If this large collection of types and interconnectivity must be mapped into a different data model (simple types, poor support for relationships like is_part_of, is_a, and is_owned_by) for storage, much of the data abstraction provided by the object-oriented approach is lost. For this reason, many next-generation database system developers believe that it is imperative that the object model of the application program and the database be as similar as possible. Seamlessness requires that the database support as rich a data model as those found in programming languages.

Transactions Appropriate for Cooperative Design Work: Database systems for next-generation applications must clearly support at least conventional concurrency and transaction mechanisms. In addition, design applications, like CAD, operate on data for much longer durations (days to months) than conventional database applications (seconds to minutes). Long-duration transactions need to deal with the fact that locking data for the duration of the entire transaction is undesirable for several types of applications. The idea that a transaction loses all its work if it cannot commit atomically is also inappropriate for transactions that run for a long time. There should be additional mechanisms to support cooperative work, "partial commit," and visibility of transaction data outside the transaction. Researchers in CAD databases have studied several techniques (workspaces, nested transactions, versioning) [36], [37] to deal with long-duration transactions in cooperative design work.
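One way to picture the workspace and versioning ideas is a check-out/check-in store in which designers work on private copies and every check-in adds a version instead of overwriting shared data. The sketch below is an assumption-laden illustration, not the mechanism of [36] or [37]: the `VersionedStore` class and its method names are hypothetical, and merging divergent versions is left out entirely.

```python
import copy


class VersionedStore:
    """Hypothetical store that keeps every checked-in version of an object."""

    def __init__(self):
        self._versions = {}  # name -> list of (parent_version, object) pairs

    def create(self, name, obj):
        self._versions[name] = [(None, copy.deepcopy(obj))]
        return 0

    def check_out(self, name, version=None):
        """Copy a version into a private workspace; no lock is held."""
        versions = self._versions[name]
        if version is None:
            version = len(versions) - 1
        _, obj = versions[version]
        return version, copy.deepcopy(obj)

    def check_in(self, name, parent_version, obj):
        """Record the workspace copy as a new version derived from its parent."""
        versions = self._versions[name]
        versions.append((parent_version, copy.deepcopy(obj)))
        return len(versions) - 1

    def parent_of(self, name, version):
        return self._versions[name][version][0]
```

Two designers can check out the same design, work for days without blocking each other, and check in separately; both results survive as sibling versions of the same parent, so neither loses work the way an aborted conventional transaction would.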

Support for Evolution of Object Instances and Classes: Objects in a database are created using "templates"; these templates are relations in a relational system and classes in an object system. Application programs are not static entities; they undergo change, especially in large applications. It is not possible to know the correct structure of all templates when the application is first launched. The templates undergo change based on better information, change in the environment, or changes in requirements. Such changes, discussed in Section IV-E-2), are referred to as schema evolution. Persistent objects also undergo version changes. Objects evolve through their entire life-cycle. It is necessary to provide support for managing this life cycle evolution. Change management aspects of next-generation applications are elaborated in Section IV-E.

Distributed, Platform Independent, Object Storage: Next-generation applications deal with large numbers of objects of varying sizes. For example, a VLSI CAD application, modeling a million transistor VLSI chip, deals with gigabytes of information. A single 1,024 by 1,024 color bit map with 4 bits/pixel represents 4 megabits of information; there may be tens of thousands of such images, along with text, graphics, video, and audio in a large multimedia application. The volume of such information and the need to share it effectively among widely separated users require that storage be distributed. Because such an environment is likely to be heterogeneous, storage must be platform independent. Information maintained in database systems is vital to the organization that created it and is of value beyond the lifetime of the platform on which it was created. As a result, data must be able to migrate gracefully from one generation of hardware and operating system platforms to the next.
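The bit map figure can be checked with a short calculation. The image count of 10,000 is an assumption chosen as the low end of "tens of thousands", and binary units (2^20 bits per megabit, 2^30 bytes per gigabyte) are assumed.

```python
width, height, bits_per_pixel = 1024, 1024, 4

bits_per_image = width * height * bits_per_pixel  # 4,194,304 bits
megabits_per_image = bits_per_image / 2**20       # 4.0 megabits
bytes_per_image = bits_per_image // 8             # 524,288 bytes (512 KB)

image_count = 10_000                              # assumed: low end of the range
total_gigabytes = image_count * bytes_per_image / 2**30
```

Under these assumptions the images alone occupy roughly 4.9 gigabytes before any text, graphics, video, or audio is counted, which supports the case for distributed storage.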

Other Requirements: As database systems continue to evolve, better understanding will emerge regarding the requirements in areas like triggers (alerters, demons), rule systems, and constraint languages [38], [39]. No matter what functionality is added, some baseline requirements like adequate performance, reliability, robustness, and easy-to-use interfaces will remain applicable.

B. Conventional Databases are Inadequate for Next-Generation Applications

Database management systems provide efficient access to large amounts of persistent data. They also provide 1) transaction management for correct, efficient, and concurrent access by multiple users, 2) access control for limiting data access to authorized users only, 3) long-term reliable storage of data and recovery from media and system failures, and 4) support for one or more query languages for data definition and data manipulation. Database management systems for next-generation applications should, of course, have all these provisions that conventional databases, such as relational databases, support.

Conventional databases, however, are often inadequate to serve the needs of next-generation applications for five principal reasons: 1) lack of expressive data modeling power, 2) the so-called "impedance mismatch" between programming languages and database systems, 3) inadequate interactive performance to support next-generation applications, 4) lack of appropriate mechanisms for supporting long transactions, and 5) lack of appropriate mechanisms for supporting schema evolution and version management.

Lack of Expressive Data Modeling Power: Conventional databases have largely met the demands of business applications such as payroll, accounting, inventory control, airline reservations, and electronic fund transfers. These applications are typified by very large amounts of well-structured information, limited types and structures, and transactions that last for short lengths of time (usually a few seconds). The success of relational database systems in meeting the demands of business applications is primarily because of the mathematical simplicity of the relational data model, founded on set theory [40], and on simple, powerful declarative query languages, such as SQL. However, this simplicity is a hindrance when it comes to supporting next-generation applications, since 1) conventional relational systems cannot support complex data types (arrays, objects, class definitions, functions) and interobject references (as between the Ship and the Port in the example of Section II) and 2) their data manipulation primitives do not include programming language control structures (conditional clauses, procedure calls, selection, iteration, and recursion). Next-generation applications require databases to support the same level of expressive power provided by programming languages.

Impedance Mismatch: Conventional databases and programming languages support different data models and different paradigms for manipulating objects. These differences are usually referred to as an impedance mismatch. Impedance mismatch decreases application programmer productivity in two ways. First, programmers with complex problems (represented in the rich data model of the programming language) cannot easily map these problems to the simpler data model of the conventional databases. Second, even if the database uses some rich data model, if that model does not correspond closely to the data model in a host programming language, programmers would have to use different languages and modeling paradigms in the two environments. For many applications, this amounts to 30% or more of application code just to do translation between application language data structures and database structures [41], [42].
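The translation burden described above can be illustrated with a minimal sketch. It assumes the Ship and Port example referred to in the text, uses Python's standard sqlite3 module as a stand-in for a conventional relational database, and introduces hypothetical helper names (save_ship, load_ship) and a hypothetical two-table schema. Nearly all of the code exists only to translate between the two data models.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Port:
    port_name: str


@dataclass
class Ship:
    name: str
    next_port: Port  # direct in-memory reference to another object


def save_ship(conn: sqlite3.Connection, ship: Ship) -> None:
    # Translation, application -> database: flatten the object graph into rows.
    conn.execute("INSERT OR IGNORE INTO port(port_name) VALUES (?)",
                 (ship.next_port.port_name,))
    conn.execute("INSERT INTO ship(name, next_port) VALUES (?, ?)",
                 (ship.name, ship.next_port.port_name))


def load_ship(conn: sqlite3.Connection, name: str) -> Ship:
    # Translation, database -> application: rebuild objects from a joined row.
    row = conn.execute(
        "SELECT s.name, p.port_name FROM ship AS s "
        "JOIN port AS p ON s.next_port = p.port_name WHERE s.name = ?",
        (name,),
    ).fetchone()
    return Ship(row[0], Port(row[1]))


# Hypothetical schema, chosen only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE port(port_name TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE ship(name TEXT PRIMARY KEY, next_port TEXT)")

save_ship(conn, Ship("USS Rendezvous", Port("Los Angeles")))
restored = load_ship(conn, "USS Rendezvous")
```

The application logic here is a single object with one reference; the remainder is mapping code of the kind the cited estimates refer to.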

Performance Problems: One important reason why conventional databases fail to meet the needs of data-intensive object-oriented applications is lack of performance. Commercial database systems are not fast enough to support simulators and interactive design tools. As a consequence, most CAD systems, for example, perform their own data management on top of the file system. If a relational system is used at all, it is only as an index package to support associative access. A typical CAD task is unlike most data processing transactions, which involve getting tuples from a relation and updating them, or selecting large groups of tuples from one or more relations and performing similar operations on them (e.g., taking a join to generate a report or updating a salary field in each employee tuple to post a raise). The CAD task also starts with a selection to pull out the pieces of a design, but then continues with many dissimilar fetch and store operations, as it navigates through a web of CAD objects. The access paths on the selected data follow connectivity of the real-world entities, not the logical structures of the database.

The cost of a relational query to fetch single, already identified objects is necessarily excessive for the following reasons [43]:

• Each fetch or store incurs the cost of a procedure call from the application program to the database. That overhead is insignificant on a data processing transaction that accesses a field in all tuples in a relation, but is a burden when accessing a single tuple. A procedure call cannot compete with simple offset addressing for accessing a field.

• Connections between entities in a relational system are through keys. At least one address translation is required to get from a key to the location of a tuple.

• Normalization and other encodings of complex design structures impose additional levels of indirection between an entity and its components.

• Invoking a query processor to optimize a join of just a few tuples is very expensive.

• The common strategies for transactions and recovery that work well in data processing systems are locking and logging. Both put a lot of overhead on transactions that do individual updates to tuples. Neither has been validated as the optimal approach in an environment with long transactions and data fields that may change many times before commit.

• Each tuple in a relational database is in some set. Insertions, deletions, and access of set elements require maintaining indices associated with the set. Persistent objects do not need to share the burden of set maintenance if they are never going to be accessed as elements in a set.

The activities in a conventional data processing system and a next-generation application system are of different nature. As a consequence, the demands made by these systems on a database are very different. There is a distinction between performance measures that are appropriate for conventional data processing applications and next-generation applications. Several benchmarks have been developed for measuring performance for conventional databases, such as the Wisconsin benchmark [44], and the TP1 benchmark [45]. These benchmarks are closely tied to relational database usages, which emphasize operations specific to the relational model and high-volume transaction processing, respectively. Recently, benchmarks for next-generation database applications have been developed [46]-[49]. These include measures for navigation ("pointer chasing" performance), traversal across multiple aggregation and generalization relationships, clustering, "blobs" (binary large objects, multikilobyte values), versions, and interactive response time.

Lack of Mechanisms for Supporting Long Transactions: As we stated in Section III-A, next-generation applications need support for long-duration transactions and cooperative transactions. Conventional database management systems assume that transactions are short duration and lock very little data. Under this assumption, conflicting lock requests are infrequent and the cost of redoing an aborted transaction is minimal. In next-generation applications, the cost of redoing an aborted transaction and the cost of an operation being blocked because of conflicting lock requests may be prohibitive. Also, conventional transaction mechanisms assume that all transactions started in a computing session terminate (commit or abort) in the same session; this assumption is not always valid in applications supporting design activities. A new model of transaction to support next-generation applications needs additional implementation mechanisms. Some of these mechanisms are discussed in Section IV-C-3).

Lack of Support for Schema Evolution and Versioning: Traditional databases have provided no support for version management. A database is thought of as having a single state, namely the current state. Even if historical or evolutionary data were present, such data were for the use of the database system's own recovery mechanisms. The database management system did not provide any way for applications to access the historical information. Versioning tools to manage the life-cycle evolution of entities do not get support from the conventional database management system.

Schema evolution, in traditional databases, is considered the venue of the systems administrator. This is inappropriate in next-generation applications where the evolution of the schema is as much a part of the application semantics as the creation and evolution of objects modeled by the schema. The important role of schema evolution in object-oriented programming and OODB's is further clarified in Section IV-E-2).

C. Object-Oriented Database (OODB) Data Models

Object-oriented databases (OODB's) offer solutions to meet the requirements of next-generation applications. As characterized in Atkinson et al. [2], OODB's have object-oriented data models similar to those found in programming languages. OODB's offer support for sharing of large object bases by multiple users. They provide support for long transactions and for versioning and configuration management of instances and types. They provide support for queries. OODB's have the potential for providing performance adequate for next-generation applications.

At the current time, there are a number of commercial and research prototype OODB's. Some of these systems are surveyed in Section V. These systems take different approaches to meeting next-generation application requirements. They have different features and strengths. The field of OODB's is too new for a general agreement on a "definitive list" of features and characteristics, but OODB's are beginning to exhibit similar functionality.

1) Features and Characteristics: Ullman [50] defines the term "object-oriented database management system" as a class of programming systems with the capability of a DBMS (management of large amounts of persistent data, transaction-based concurrent access, data model, and query language), along with the object-oriented features (object identity, encapsulation, inheritance, object composition, and complex objects) discussed in Section II-A. Several more detailed characterizations of OODB's exist in the literature [2], [35], [51]. Table 1 lists features and characteristics of an OODB; most of these features are discussed in various sections of this paper. Database features like distribution and security, which are orthogonal to object-orientedness, are not listed in the table. Not all current OODB's support all these features.

The remainder of this section provides a taxonomy of OODB's based on their data model.

2) A Taxonomy of OODB's: The response of the research community to meeting the database needs of next-generation applications can be categorized into two schools, based on the data model of choice. The first school has advocated value-oriented database systems; the second has advocated object-oriented database systems. In this section we briefly compare value-oriented and object-oriented data models. This is followed by a description of different approaches to object-oriented data modeling. A taxonomy based on the different approaches is illustrated in Fig. 4; some of the systems listed in the illustration are discussed briefly in Section V. A detailed exposition of value-oriented databases is outside the scope of this paper.

Table 1. OODB Features and Characteristics.

| Feature | Description |
|---|---|
| Complex Objects | The ability to define data types with a nested structure and to manage composition hierarchies; Sections II-A-2), III-C, II-A-2), IV-A-1) |
| Object Identity | The ability to distinguish two objects independent of the values of attributes; Section II-A-1) |
| Types and Classes | The ability to organize similar objects and their implementation into an abstraction; Section II-A-2) |
| Encapsulation | The clear separation between the visible semantics and the implementation of objects; Section III-C |
| Inheritance | The ability to derive new classes from existing classes; Sections II-A-3), II-B |
| Dynamic Binding and Polymorphism | The ability to bind messages to different methods depending on type; methods may be bound at compile time or run time |
| Seamlessness | Integration of the database with the rest of the programming environment in a nonobtrusive manner; Sections III-B, III-C-2) |
| Persistence | Existence of objects beyond the lifetime of the processes that created them; Section IV-B |
| Secondary Storage Management | Efficient data access by supporting clustering, indexes, buffering, and query optimizations; Sections IV-B-1), IV-B-3) |
| Transactions and Concurrency Control | Concurrent access to data by means of atomicity, controlled sharing via locks, serializability; Section IV-C-1) |
| Recovery | Ability to recover from software and media failures |
| Query Facility | Efficient high-level declarative access to objects in addition to navigational programmatic access; Sections III-C-2), IV-A-3), IV-D |
| Design Transactions | Long running and nested transactions; Section IV-C-3) |
| Change Management | Database support for managing the evolutionary life cycle of objects and classes; Section IV-E |

a) Value-oriented versus OODB's: In value-oriented data models like the relational model, relationships between different objects are stored implicitly, by comparison of values of attributes. For instance, Ship(USS Rendezvous, ..., Los Angeles, ...) and Port(Los Angeles, ...) "match" by value in the Next_Port attribute of Ship and the Port_Name attribute of Port. There are many value-oriented data models, ranging from extensions to the relational model to logic as a data model. The nested relational model relaxes the constraint that relations be in first normal form [32]. A relation could then contain another relation as the value of a field of a tuple. However, such an extended relational model does not allow a relation to share, or "point to" another relation, i.e., there is no support for the notion of object identity. Other extensions allow more complex data types than the primitive types of numbers, characters, and strings [52]. A main advantage of the extensions approach is its relative simplicity, based on incremental evolution of the well-known relational model.
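The difference between matching by value and referring by identity can be made concrete with a small sketch. It reuses the Ship and Port example; the in-memory lists and the helper next_port_by_value are assumptions made for illustration and do not model any particular database system.

```python
# Value-oriented: the Ship-to-Port relationship is implicit in equal attribute values.
ships = [{"Name": "USS Rendezvous", "Next_Port": "Los Angeles"}]
ports = [{"Port_Name": "Los Angeles"}, {"Port_Name": "San Diego"}]


def next_port_by_value(ship):
    # The "join": compare Next_Port of Ship with Port_Name of Port.
    return [p for p in ports if p["Port_Name"] == ship["Next_Port"]]


# Object-oriented: the relationship is a reference to the Port object itself.
class Port:
    def __init__(self, port_name):
        self.port_name = port_name


class Ship:
    def __init__(self, name, next_port):
        self.name = name
        self.next_port = next_port


la = Port("Los Angeles")
ship = Ship("USS Rendezvous", la)

matched_before = next_port_by_value(ships[0])
ports[0]["Port_Name"] = "Port of Los Angeles"  # rename the port tuple
matched_after = next_port_by_value(ships[0])   # the value match is now lost

la.port_name = "Port of Los Angeles"           # rename the port object
still_same_port = ship.next_port is la         # identity is unaffected by the rename
```

In the value-oriented half, changing an attribute value silently breaks the relationship unless every matching tuple is updated as well; in the object-oriented half, the reference survives any change to the referenced object's state.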

Over the last decade, many new data models have been proposed. They include the entity relationship model [53], many semantic data models [25], [54], [55], and many object-oriented data models [43], [56]. In fact some researchers [50] have argued that the earliest database systems (hierarchical and network model databases) were object-oriented in the limited sense of supporting object identity, even though they had no user-defined types and no notion of state or behavior encapsulation. The object-oriented models do not have the simplicity of the relational model, and many challenging research issues related to data modeling, query languages, and query optimization are the subject of active research.

Among OODB's, we distinguish two further categories: persistent language-based OODB's and query-based OODB's.

[Figure 4: tree diagram. The universe of data models divides into a value-oriented branch and an object-oriented branch. Legible node labels include Relational, Entity-relationship, Semantic Data Models, Extended-relational, Nested Relational, Logic, Persistent Language Based, Query-based, New Object-Oriented Languages, Existing Languages, Dual Type Systems, Single Type System, and Object-Oriented Data Models. Legible example systems include ORACLE, UNIFY, DAPLEX, SDM, Berkeley Postgres, DEC Trellis/Owl, HP Iris, MCC Orion, Datalog (Stanford), LDL (MCC), and Ontologic Ontos.]

Fig. 4. Taxonomy of object-oriented databases.

b) Persistent language based OODB's: This category includes object-oriented data models that are the same or very close to the abstract data typing capabilities of an object-oriented programming language. The objective is that a database should be integrated with the rest of the programming environment as seamlessly as possible. A truly seamless integration of computational storage (based on main memory or virtual memory) and data storage environments (based on secondary storage) requires that both use the same language and data model. This means that the same data types should exist in both transient and persistent environments and that the instances of these data types should be manipulated by the same operators. The lifetime of objects should not matter to programs manipulating them. It is also important that the same model of sharing and object identity be supported in both environments. This goal is consistent with the notion of "orthogonal persistence," as articulated in [57].

Even within this category there are different approaches. Some researchers have developed a new object-oriented programming language to support persistent objects [12], [58]; others have extended existing object-oriented programming languages to deal with persistence [56], [59], [60]; many have stayed close to an existing object-oriented programming language and accommodated persistence. In this last category there are a number of commercial OODB's and research prototypes that support a single mainstream programming language well; examples are Smalltalk in Servio Corporation's Gemstone [59], Common Lisp in TI Zeitgeist [61] and Symbolics Statice [60], and C/C++ in Ontologic Ontos [62].

Although these efforts have met the goal of seamless persistence to a large extent, no system has achieved the goal with respect to multiple languages. It may be argued that the goal of seamless persistence of multiple languages is unachievable. The goal of seamlessness may be relaxed to mean that persistence is achievable with minimal effort; for example, the language data model remains the same, but additional information may have to be supplied by a programmer for making certain classes of objects persistent. Achieving seamless persistence (in the relaxed sense) with respect to multiple host languages that share persistent objects is still a research issue.

To support seamlessness, programs must be able to interact with the database either by sending messages to objects held by the database (using the same syntax as they would for nonpersistent objects) or by explicitly retrieving objects and acting on them directly. Applications written in object-oriented programming languages such as CLOS and C++ typically use both approaches. These two ways of accessing the database are essentially navigational, since they tend to make use of embedded interobject references.

The goal of a seamless integration of database and programming environments is ambitious. There are concepts critical to the database domain, such as concurrency control and transaction atomicity, that are supported poorly, if at all, in programming languages. Programming languages deal with the current state of objects and operations; there is no natural way to deal with past states. In contrast, databases are primarily for keeping of historical records. We are learning that computational environments can benefit from many of the database amenities like transactions, concurrency, and history and that databases can benefit from programming language amenities like user-defined types and a rich data model. The way to realize such benefits will probably require augmenting the programming language or data model in cases where a relevant construct does not exist, and will require the programmer or user to perform operations to use the database that would not be required if all data were transient. These concerns and issues are the domain of persistent programming language research [61], [63], [64].

c) Query-language-based OODB's: Persistent programming languages support all programming language entities as first-class objects. Query-based OODB's follow the relational database heritage and support sets as the only kind of first-class (persistent) objects. Examples of this approach are HP Iris [65] and Berkeley Postgres [52], [66].

In this approach, applications are developed using a variety of host programming languages, such as Lisp or C. Applications interact with the OODB using a set-oriented query language to retrieve or update objects. The type system and control structures of the host programming language(s) are significantly different from those of the object-oriented query language. There is no claim of seamlessness; there is an "impedance mismatch" between the host language(s) and the query language. However, the query language makes allowances for the object-orientedness of the environment; methods are allowed in selection predicates and path expressions are permitted, as in navigational persistent language OODB's. Path expressions are implemented via queries instead of "pointers." The extent of a class (the collection of all instances of the class) is implicitly a set that can be queried; users may also define other explicit sets.
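A rough sketch of these query features follows. It models the extent of a class as a list maintained by the constructor and uses a hypothetical select function in place of a real set-oriented query language; the attributes draft_m and depth_m, the method can_enter, and the ships other than USS Rendezvous are invented for the illustration.

```python
class Port:
    extent = []  # the extent: implicitly, the set of all Port instances

    def __init__(self, port_name, depth_m):
        self.port_name = port_name
        self.depth_m = depth_m
        Port.extent.append(self)


class Ship:
    extent = []  # the extent of Ship

    def __init__(self, name, draft_m, next_port):
        self.name = name
        self.draft_m = draft_m
        self.next_port = next_port
        Ship.extent.append(self)

    def can_enter(self, port):
        # A method that may appear inside a selection predicate.
        return self.draft_m < port.depth_m


def select(extent, predicate):
    # Stand-in for a set-oriented query over a class extent.
    return [obj for obj in extent if predicate(obj)]


la = Port("Los Angeles", depth_m=16.0)
sd = Port("San Diego", depth_m=11.0)
Ship("USS Rendezvous", draft_m=9.0, next_port=la)
Ship("Deep Hauler", draft_m=14.0, next_port=sd)
Ship("Harbor Tug", draft_m=4.0, next_port=sd)

# Path expression in the predicate: ship.next_port.port_name
bound_for_la = select(Ship.extent, lambda s: s.next_port.port_name == "Los Angeles")

# Method call in the predicate
can_dock = select(Ship.extent, lambda s: s.can_enter(s.next_port))
```

Because can_enter is opaque code, an optimizer can no longer reason about the predicate as it could about a purely declarative comparison.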

One of the main attractions of a query language like SQL is its declarative nature; declarativeness is important in relational systems since it allows them to have simple, yet powerful query languages whose queries can be optimized to yield acceptable performance. Some of this declarative nature is sacrificed when methods are allowed in queries. On the other hand, there is a concern that implementation of path expressions as queries is counter-intuitive and slow.

A synthesis between persistent language-based and query-based approaches to OODB's is likely. Such a synthesis allows first-class persistent objects but also optionally supports sets and a declarative query language. Section IV-D provides more detail on object queries.

This section of the paper examined the database requirements of next-generation applications and how these requirements are not adequately met by conventional databases. One approach to addressing the deficiencies of the conventional databases leads us to the area of OODB's. We listed the features and characteristics of OODB's and presented a taxonomy of OODB's based on data models. The next section goes into the architectural and implementation considerations in building an OODB.

IV. OODB ARCHITECTURE AND IMPLEMENTATION

This section provides a look at the key design decisions in OODB's. OODB's built or proposed to date differ in their 1) object models, 2) mechanisms for storing and retrieving persistent objects, 3) concurrency control and transaction management, 4) query models and methods of processing queries, and 5) management of evolving objects and class definitions. These aspects of OODB's are discussed in detail. Particular emphasis is placed on the key design choices and their implications on the remainder of the system components and the user/application interface functions. Representative interfaces and internal organizations of a number of OODB's are discussed. Performance improvement techniques are covered throughout the section. Other aspects of database management systems (DBMS's) such as access control, user-friendly interfaces, report generation, access to data held in other databases, and the ability to share information across heterogeneous platforms have so far received very little treatment in the OODB community. We discuss these aspects only briefly.

A. Object Model

Applications interact with an OODB through an object model specifying the OODB's functionality. Since different object models provide different capabilities or have different performance objectives, the object model has major implications on the OODB's internal organization and implementation. Although all OODB's manipulate objects, their object models vary widely. Our discussion of OODB architecture begins by clarifying 1) treatment of types and classes, 2) how objects are identified by applications, and 3) how applications interact with objects stored in the OODB.

1) Treatment of Types and Classes: Recalling our discussion from Section II, objects are defined by interface (type) and implementation (class), with each object instance having a unique object identity (OID). A class definition augments a type definition with information about attribute names, attribute types, and internal methods. Thus in a sense, a class definition comprehends its corresponding type definition. However, since many class definitions can be compatible with a single type definition, this relationship need not be one-to-one. The result is that, in principle, a type can be implemented by more than one class. However, most popular object-oriented languages do not make a clear distinction between type and class, with the result that the relationship between types and classes becomes one-to-one. As we shall see, this has profound impact on the ability of the OODB to manage type and class evolution (Section IV-E-2)). In our subsequent discussions, we shall use the term type to mean the abstract interface and the term class to mean the additional information added as part of the specification of the implementation. This seems in accordance with the actual usage in the field. The reader is referred to Wegner and Cardelli [67] and Moss and Wolf [68] for further discussions of type theory and the distinction between type and class. We now examine how OODB's treat type and class, and the implications of these differences.
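The statement that one type can be implemented by more than one class can be sketched as follows. The example uses Python's typing.Protocol to stand for the abstract interface; the Stack type and the two implementing classes are hypothetical and are not drawn from any of the systems discussed.

```python
from typing import Optional, Protocol, Tuple


class Stack(Protocol):
    """The type: an abstract interface with no attributes or method bodies."""

    def push(self, item: int) -> None: ...

    def pop(self) -> int: ...


class ListStack:
    """One class for the type: state is held in a contiguous list."""

    def __init__(self) -> None:
        self._items: list = []

    def push(self, item: int) -> None:
        self._items.append(item)

    def pop(self) -> int:
        return self._items.pop()


class LinkedStack:
    """A second class for the same type: state is a chain of (item, rest) pairs."""

    def __init__(self) -> None:
        self._head: Optional[Tuple] = None

    def push(self, item: int) -> None:
        self._head = (item, self._head)

    def pop(self) -> int:
        item, self._head = self._head
        return item


def push_then_pop(stack: Stack) -> list:
    # Written against the type only; either class may be supplied.
    stack.push(1)
    stack.push(2)
    return [stack.pop(), stack.pop()]


from_list = push_then_pop(ListStack())
from_linked = push_then_pop(LinkedStack())
```

The two classes differ in attribute names, attribute types, and internal representation, which is exactly the information a class definition adds to a type definition.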

Class describes the physical structure of object instances. By comprehending the physical definition alone, OODB's can materialize an object's state (recreate it or restore it from secondary storage) and store that state when requested. However, this does not give the OODB the ability to interact with the object's interface by invoking its methods (which are part of the type definition).

Even though all OODB's comprehend class, there is much variation as to what classes can look like. A primary question is whether the attributes of a class instance can be complex nonobjects (arrays, lists, structures, etc.), or are constrained to be only primitive types (numbers, characters, etc.) and references to other objects. If complex constructs are not supported, much flexibility is lost to the class developer, with a probable loss of efficiency in the implementation of the methods of the class. OODB's that attempt to extend existing programming languages typically support the wider interpretation. In some cases (notably Exodus/E [64] and ObjectStore [69]), non-objects are supported as first-class entities by the OODB to ensure compatibility with a programming language that combines object-oriented and conventional constructs.

2) Identifying Objects: Each separately persistent object must have a unique OID within the OODB. The OID space must be large enough to provide unique OID's for all persistent objects over the OODB's lifetime. To preserve information hiding, the OID (as presented to the application) should be independent of the location where the object is stored. Location independence does not preclude lower levels of the system from having local, location-specific OID's or from assigning OID's based on expected scope to minimize OID size. There must be one or more name managers in the system to map OID's to locations. The name space of OID's may be partitioned among the various name managers. If this is done, a multipart OID (similar to a phone number consisting of an area code, exchange, and number) is generally used to allow piecewise determination of which name manager to use. This scheme (used in Ford et al. [61] and Moss and Sinofski [70]) takes advantage of the locality of interobject references while still supporting nonlocal references and location independence.
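A minimal sketch of a multipart OID resolved through partitioned name managers is given below. The three-field layout follows the phone-number analogy in the text; the class names, the location strings, and the choice to let the first field alone select the name manager are assumptions made for illustration, not the scheme of [61] or [70].

```python
from typing import Dict, NamedTuple, Tuple


class OID(NamedTuple):
    area: int      # selects a name manager, like an area code
    exchange: int  # narrows the scope within that manager
    number: int    # identifies the object within the exchange


class NameManager:
    """Maps the local part of an OID to the object's current storage location."""

    def __init__(self) -> None:
        self._locations: Dict[Tuple[int, int], str] = {}

    def bind(self, oid: OID, location: str) -> None:
        self._locations[(oid.exchange, oid.number)] = location

    def locate(self, oid: OID) -> str:
        return self._locations[(oid.exchange, oid.number)]


# The OID name space is partitioned among name managers by the first field.
managers: Dict[int, NameManager] = {1: NameManager(), 2: NameManager()}


def manager_for(oid: OID) -> NameManager:
    return managers[oid.area]


ship_oid = OID(area=1, exchange=20, number=7)
manager_for(ship_oid).bind(ship_oid, "disk-A:page-12")  # hypothetical location
first_location = manager_for(ship_oid).locate(ship_oid)

# Moving the object changes only the binding; the OID seen by applications is unchanged.
manager_for(ship_oid).bind(ship_oid, "disk-B:page-3")
second_location = manager_for(ship_oid).locate(ship_oid)
```

Only the name manager's table changes when an object moves, which is what keeps the application-visible OID location independent.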

Modularity in object-oriented applications is achieved by partitioning an application's world into logically cohesive objects and allowing these objects to exchange information with other objects via messages. To do this, objects must have some way to refer to each other through inter-object references. There are several ways in which interobject references may be specified, as follows.

• The simplest approach is to embed the references within the referencing objects. This is typically implemented in object-oriented programming by memory pointers and in OODB's by replacing the memory pointers with OID's. This closely models the OOP paradigm, and is generally used by OODB's that provide a persistent programming model. Drawbacks of this approach are that embedding the references makes object-oriented queries much more difficult (Section IV-D), complicates object translation (Section IV-B), and makes it difficult to dynamically add unanticipated references in statically typed languages.

• A second approach is borrowed from the entity-relationship [53] model where all interobject references are outside the scope of the objects themselves. In this case, relationships based on OID's can be added at will, and queries over relationships can be resolved without recourse to the objects themselves. The disadvantage is that resolving references is slower, since they are not directly accessible from the objects.

• A third approach is to force all inter-object references to take the form of queries embedded within the referencing object [66]. This provides good support for optimized object-oriented queries over relationships, and since queries can specify single objects, subsumes the embedded OID approach. This approach requires a query processor and appears to be somewhat slower because of the query processor's overhead.
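The three ways of specifying interobject references listed above can be contrasted in a short sketch. A dictionary keyed by OID stands in for the object store, and all helper names, the integer OID's, the home_port relationship, and the ship named Sea Breeze are hypothetical.

```python
# A stand-in object store: OID -> object state.
objects = {
    1: {"kind": "Port", "name": "Los Angeles"},
    2: {"kind": "Ship", "name": "USS Rendezvous", "next_port": 1},  # embedded OID
}


# 1) Embedded reference: the referencing object holds the target's OID.
def deref_embedded(oid, attribute):
    return objects[objects[oid][attribute]]


# 2) External relationship: (relationship, source OID, target OID) triples
#    kept outside the objects, so new relationships can be added at will.
relationships = {("next_port", 2, 1)}
relationships.add(("home_port", 2, 1))  # unanticipated relationship, no object changed


def deref_external(relationship, source_oid):
    return [objects[target] for (name, source, target) in relationships
            if name == relationship and source == source_oid]


# 3) Embedded query: the referencing object stores a predicate, evaluated on demand.
objects[3] = {
    "kind": "Ship",
    "name": "Sea Breeze",
    "next_port": lambda o: o["kind"] == "Port" and o["name"] == "Los Angeles",
}


def deref_query(oid, attribute):
    query = objects[oid][attribute]
    return [o for o in objects.values() if query(o)]


by_embedded = deref_embedded(2, "next_port")
by_external = deref_external("next_port", 2)
by_query = deref_query(3, "next_port")
```

The embedded form resolves with one lookup, the external form scans the relationship set, and the query form scans the object store, which mirrors the relative costs described in the text.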

3) Interacting with Persistent Objects: There are two fundamental issues with respect to interaction models; namely, who controls actions (the application or the database) and whether the interactions are via navigation or queries.

a) Active and passive object models: In the passive object model, object state is materialized in the application's workspace by the OODB and operated on in the workspace by the application. Once the objects are in the application's workspace, the application may send messages to them, or in the case of non-objects (unencapsulated data), operate on them directly. By contrast, in the active object model, activity takes place under control of the OODB and takes the form only of messages sent to objects.

The active object model has advantages in that it allows the OODB to restrict the operations from the abstract interface that may be executed by a particular application (presenting an object view similar to a schema view in a relational database), and allows the OODB and the object to decide jointly where a given method will execute. Remote execution under OODB control is particularly valuable if there is a disparity between the sizes of the objects involved in the message operation and the size of the result. Small objects producing large results can choose to execute their methods on the application's machine, while large objects producing small results can choose to execute wherever they currently reside. Some objects, such as windows or printer queues, may have logical or environmental reasons for executing at a particular place. The active object model also allows the use of specialized hardware, as long as the object can be materialized in that environment, and facilitates scheduling based on network load.
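The placement decision described above might be sketched as a simple rule. The function name, the site labels, the byte-count parameters, and the tie-breaking choice are assumptions for illustration; a real OODB would also weigh factors such as specialized hardware and network load.

```python
from typing import Optional


def choose_execution_site(object_bytes: int,
                          result_bytes: int,
                          pinned_site: Optional[str] = None) -> str:
    """Pick where a method runs so that the smaller item crosses the network."""
    # Objects such as windows or printer queues must run at a particular place.
    if pinned_site is not None:
        return pinned_site
    # Small object, large result: move the object to the application's machine.
    if object_bytes < result_bytes:
        return "application"
    # Large object, small result (or a tie, by assumption): run where the object resides.
    return "object_home"


thumbnail_site = choose_execution_site(object_bytes=50_000_000, result_bytes=2_000)
expansion_site = choose_execution_site(object_bytes=500, result_bytes=10_000_000)
printer_site = choose_execution_site(1_000, 1_000, pinned_site="print_server")
```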

In the passive object model, a restriction to only class instances is still possible, and is often done since it eases the task of materializing/storing objects, and provides a purely object-oriented interface to the OODB. However, restricting only class instances to be first-class objects introduces a seam with respect to popular languages such as C++ and CLOS, which allow mixed use of encapsulated (class instances) and unencapsulated data. There is an active debate as to whether this is a restriction that not only simplifies the OODB developer's task, but also forces better program design on the application programmers. ObjectStore and E allow unencapsulated data; Zeitgeist, Orion, and O2 do not.

b) Navigation versus queries: The Persistent Programming Language approach [63], [64] allows the user to make programming language objects (encapsulated, and possibly unencapsulated, data structures) seamlessly persistent without changing the data model that the programmer sees. Normal programming language operations then "navigate" through persistent data following interobject references. This approach does not necessarily support sets, unless they are already supported in the language being made persistent. A criticism of this approach is that it does not preserve data independence any better than older network or hierarchical data models. Many applications (such as CAD) naturally access their data one object at a time. For these applications, navigational access may be more important than the support of sets. Seamless persistence provides several advantages, including ease of application development, type checking support at the database interface, and first-class persistence for all data types rather than just sets.

Another approach [52], [56], [65] extends the relational model to provide better support for user-defined types, sub-types, and functions. In this approach, database objects are always members of sets. Sets are the only first-class persistent type. An odd consequence is that a persistent array must be wrapped in a one-row one-column table, and accessed with a query. This approach does not preclude database types mirroring program types; a data model independent of any programming language is defined. It adds object-oriented properties to the database language but does not solve the impedance mismatch problem (Section III-B) between programming languages and the database. Since the new type system does not span both languages, type checking across the programming language/OODB interface is not supported. A major advantage of this approach is that it may lead to upward compatible, object-oriented extensions to industry standard SQL.

Between these two extremes is a third approach, which orthogonally extends object-oriented programming languages with both sets and persistence [58], [59], [61]. In these systems, all queries are against sets (collections), but navigation is also supported and not all instances of a type need be stored in a set. This is permitted since it is recognized that many applications do not need set-oriented queries and it is inefficient to force all instances to reside in a set. A consequence of this approach is that not only persistent sets but also transient sets can be supported [71].
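The difference between navigational and set-oriented access can be sketched in a few lines. This is only an illustration in Python, not the interface of any system surveyed here; the `Ship` and `Port` classes, their attributes, and the sample data are hypothetical stand-ins loosely modeled on the Ships example.

```python
from dataclasses import dataclass


@dataclass
class Port:
    name: str


@dataclass
class Ship:
    name: str
    speed: int
    home_port: Port


# Hypothetical data; in an OODB these objects would be persistent.
rotterdam = Port("Rotterdam")
ships = [
    Ship("Aurora", 30, rotterdam),
    Ship("Borealis", 55, rotterdam),
]

# Navigational access: start from one object and follow interobject references.
port_of_first = ships[0].home_port.name

# Set-oriented access: select members of a collection by a predicate on value.
fast_ships = [s.name for s in ships if s.speed > 50]
```

In the third approach described above, both styles coexist: `rotterdam` is reachable by navigation even though it was never placed in a collection, while `ships` is a collection that can be queried.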

B. Storing and Retrieving Persistent Objects

The most fundamental function of a database is to provide a way for objects to persist beyond the scope of an individual program execution. Persistence may be provided using a single-level storage (persistent memory approach) or a two-level storage (storage server approach). In this section, we discuss the two-level storage architecture in some detail; a detailed discussion of the single-level storage architecture is beyond the scope of this paper.

To achieve persistence, an OODB developer has two choices.

o Extend the conventional programming environment into a persistent programming environment in which the results of a program remain in the program's memory after the creating program terminates. This approach, referred to as the persistent memory approach, is based on a large, shared persistent virtual memory in which all programs execute. Objects are persistent if they can be addressed within this (potentially garbage collectable) space [72], [73].

o Create a separate storage server to which objects are written from the application program's workspace before the program terminates and from which objects can be retrieved by subsequent program executions. This approach, referred to as the storage server approach, requires a transfer of objects to the storage server; however, depending on the interface, this transfer may be invisible to the application.

The persistent memory approach provides the ultimate in seamlessness between programming environment and database since they are, in fact, one and the same. However, no existing programming language provides all the capabilities required for a truly unified persistent programming environment. PS-ALGOL [63] is an attempt to design and implement the "ideal" persistent programming language.

Storage servers can be implemented in several ways. The choice of server style is determined by the object model to be supported and by the access characteristics of target applications. In the following, we discuss several classes of storage servers and how they move objects to/from an application's computational memory and how these servers can be used to support the OODB's object model. Some performance issues related to storage servers are also discussed.

1) Storage Server Models: Storage servers can be classified by their unit of transfer and control, and by the level of semantics associated with the objects managed by the server. Using these criteria, storage servers can be classified into domains of increasing complexity and power:

o typeless page servers (Exodus [64] and ObjectStore [69]);

o typeless object servers (Zeitgeist [61], Mneme [70], and ObServer [74]);

o class-based object servers (Postgres [52], Iris [65], and a proposal by Wiederhold [75]);

o type-based object servers (Orion [56] and O2 [76]).

Of course, it is possible to augment any of these server types by adding a layer of software, or to increase system modularity by suppressing some capabilities; however, for our purposes, we shall consider only the basic levels of functionality of each model.

a) Typeless page servers: Page servers do not directly manipulate objects; instead, they manipulate virtual memory pages on which objects (or portions of objects) are known to reside. This is accomplished by creating a persistent virtual memory parallel to the target machine's (transient) virtual memory. Pages in the persistent virtual memory have the same format as pages in the transient virtual memory (thus page servers tend to be hardware architecture specific because of page format differences between platforms). When programs execute, they reference their virtual memory in the normal fashion. However, some of their virtual memory pages are identified as overlapping a portion of the persistent virtual memory. Accesses to these pages cause a page to be copied from the persistent virtual memory of the page server into the transient virtual memory. The contents of the page are then operated on by the program in the normal way. When the program is finished with one of the persistent virtual memory pages, it is released and, if modified, copied back to the page server.

Since the page formats are the same in both memories, transfer costs are lower in page servers than in other storage server models that must do substantial work to materialize objects. However, this apparent simplicity and speed is offset by the need to have a way to trigger the page transfers and to resolve addresses properly after a page has been loaded. In Exodus/E, which supports objects that are not class instances, pointer following is the critical operation. Under the page server scheme, pointers cannot be virtual memory addresses. Instead, they are represented as offsets within the page, or as references into another page. Therefore, any pointer following operations in the application language must be redefined to compute a real virtual memory pointer from the offset and buffer address. This involves a preprocessor to replace all pointer references with the appropriate computation and to ensure that the buffer is resident and its start address is known. As a result of all this overhead, pointer following is somewhat more expensive in page server schemes. More serious issues are the optimization of calls to the page server (it is inefficient to repeatedly pin an already resident page when an object is referenced many times in a tight program loop), and buffer management policies to avoid thrashing. On the other hand, if the OODB supports a pure object model (all interobject references are by object ID for transient as well as persistent objects), pointer dereferencing is not an issue because all such references must go through a level of indirection to map OID to virtual memory address anyway. This is the case in SmallTalk-like languages used by Servio Corporation's Gemstone OODB.
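The address computation that a preprocessor would substitute for a pointer dereference can be sketched as follows. This is a toy model written in Python for brevity, not the Exodus/E mechanism itself; `PersistentPointer`, `BufferPool`, the page size, and the base addresses are all assumptions made for illustration.

```python
from typing import Dict, NamedTuple


class PersistentPointer(NamedTuple):
    """Hypothetical on-page pointer: a page number plus an offset in that page."""
    page_id: int
    offset: int


class BufferPool:
    """Toy client-side buffer; maps resident pages to made-up base addresses."""

    PAGE_SIZE = 4096  # assumed page size

    def __init__(self) -> None:
        self.base_address: Dict[int, int] = {}
        self.fetches = 0            # counts simulated transfers from the server
        self._next_base = 0x10000   # arbitrary starting address

    def pin(self, page_id: int) -> int:
        """Return the page's base address, fetching it first if not resident."""
        if page_id not in self.base_address:
            self.fetches += 1
            self.base_address[page_id] = self._next_base
            self._next_base += self.PAGE_SIZE
        return self.base_address[page_id]

    def resolve(self, ptr: PersistentPointer) -> int:
        """Compute a virtual memory address from buffer address and offset."""
        return self.pin(ptr.page_id) + ptr.offset


pool = BufferPool()
a = pool.resolve(PersistentPointer(page_id=7, offset=128))
b = pool.resolve(PersistentPointer(page_id=7, offset=256))
```

Both pointers fall on the same page, so only the first `resolve` call triggers a fetch. Every dereference still pays for the residence check in `pin`, which is the per-reference overhead the text describes.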

To improve performance, pages may be buffered at the server and/or the client. Another performance consideration is clustering. When a page containing a requested object is brought in, if other objects on the page are also required, the total number of pages to be moved/buffered is reduced. Since the unit of concurrency control is the page, it is possible for incidentally coresident objects to cause unnecessary concurrency conflicts; proper clustering reduces this effect.

b) Typeless object servers: Object servers move and control access to individual objects or groups of objects. Typeless object servers understand only some notions of "objectness": namely, identity, the fact that an object has a type (without knowing what the type is), and that objects may be related to other objects by embedded OID, externally specified relationship, or query. The type or class of individual objects is uninterpreted by typeless object servers; these servers cannot execute methods or access the states of objects they manage.

For storage into typeless object servers, object state is translated into a string of bits or bit-buckets; the process is reversed during object materialization. Translation preserves the structure of the object graph and decomposes the graph according to a set of persistent object boundary rules (discussed below). Translating an object graph requires translation routines for each primitive and constructor data type in the supported language. To preserve sharing semantics, visited cells must be so marked to prevent processing on successive visits. Object boundary rules define the extent of a persistent object. The ability to change these rules and to select different translation primitives for objects provides a degree of independence between logical and physical representations.

When translating an object graph for storage, the persistent object boundary rules determine what goes into a bit-bucket. One rule could be to make every object instance into a separate persistent object. Since objects tend to be small, this is inefficient in both space (management overhead per object) and time (many separate disk accesses). It is more efficient to partition the object graph into somewhat larger subgraphs, and store each of these as a separate persistent object. On the other hand, to increase incrementalism and improve concurrency granularity, it is desirable to increase the number of objects. For this reason, it is probably desirable to allow the application programmer some control over where object boundaries lie. Object boundary rules also have implications on concurrency and access control processes, since in an object server, both must know the scope of the controlled object.

What should constitute the state of an independently persistent object? In the example of Section II-B, assume that a ship references a port. Is the port part of the ship? If the port is directly reachable from more than one ship, it must be a separate object, since otherwise the port's state would be copied into the state of more than one ship. Then if some information about the port is changed from one ship, there is no effective way to ensure that the changes propagate to the other ship. However, consider an oil bunker object that is never referenced except as part of the ship. Is it a separate object, or is it part of the state of the ship object? The root cause of this confusion is that in nonpersistent languages such as C++ and CLOS, interobject references are generally implemented by memory pointers. Thus no clear distinction is made between being "part of" an object and being "referenced by" an object. This distinction must be made in the OODB world (see Section IV-A), and appears to be the cause of an unavoidable seam. However, it can be argued that this forces application programmers to think clearly about object identity.
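A small sketch can make the interplay of visited marking and boundary rules concrete. It is written in Python under strong simplifying assumptions: objects are reduced to a hypothetical `Node` with a unique label and a list of references, the label doubles as the OID, and the "bit-bucket" is just a list of cells. None of these names come from the systems surveyed.

```python
from typing import Any, Callable, Dict, List


class Node:
    """Hypothetical in-memory object: a unique label plus references."""

    def __init__(self, label: str, *refs: "Node") -> None:
        self.label = label
        self.refs: List[Node] = list(refs)


def translate(root: Node,
              is_separate: Callable[[Node], bool]) -> Dict[str, List[Any]]:
    """Flatten the graph under `root` into one record per persistent object.

    `is_separate` plays the role of a persistent object boundary rule: a node
    for which it returns True starts its own persistent object and is
    referenced by OID (here, its label) from everywhere else.
    """
    stored: Dict[str, List[Any]] = {}
    pending = [root]
    while pending:
        top = pending.pop()
        if top.label in stored:
            continue
        cells: List[Any] = []
        index: Dict[int, int] = {}   # visited marks: id(node) -> cell number

        def visit(node: Node) -> Any:
            if node is not top and is_separate(node):
                pending.append(node)          # translated later, on its own
                return ("oid", node.label)
            if id(node) in index:             # already visited: keep sharing
                return ("cell", index[id(node)])
            index[id(node)] = len(cells)
            cell: List[Any] = [node.label]
            cells.append(cell)
            cell.extend(visit(r) for r in node.refs)
            return ("cell", index[id(node)])

        visit(top)
        stored[top.label] = cells
    return stored


port = Node("port")
gauge = Node("gauge")
bunker = Node("bunker", gauge)
ship = Node("ship", port, bunker, gauge)   # gauge is shared inside the ship

stored = translate(ship, lambda n: n.label == "port")
```

With this rule the port becomes a separately persistent object referenced by OID, while the bunker and the shared gauge are stored once each inside the ship's record. Changing the predicate passed as `is_separate` moves the boundary without touching the objects themselves.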

c) Class-based object servers: Like typeless object servers, class-based object servers manipulate individual objects or groups of objects. However, they are also able to interpret and use the objects' state to provide additional service, particularly queries based on object state (see Section IV-D). Since the ability to manipulate object state directly in the server is critical for class-based object servers, they are typically built on top of relational databases.

Class-based object servers map object state directly into relational tuples. Each class definition defines a relation, and each instance of the class becomes a tuple in the relation. If an object gets some of its attributes by inheritance, the implementor can flatten out the inheritance graph and define a single relation for all the attributes of the class. This is called horizontal partitioning [56]. In vertical partitioning [77], a relation is defined per inherited definition, with the object state spread out through all these relations. In the Ships example (Section II-B, Fig. 3), with horizontal partitioning there would be a single relation Cargo_Ship whose attributes correspond to all attributes from both the Cargo_Ship class and its parent Ship class; with vertical partitioning there would be two relations Cargo_Ship and Ship, with the object identifier as a key and with an individual object having its slot values stored as a tuple in each relation. In the latter case, reconstructing a cargo ship requires a join of the Ship and Cargo_Ship relations. In either case, constructing the computational state of an object is straightforward using relational queries. Vertical partitioning makes it costly to construct an object from several tuples in different relations; horizontal partitioning makes class evolution more complicated.
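The two mappings can be compared directly with a small relational sketch. It uses Python's standard `sqlite3` module purely as a convenient relational engine. The column names (`name`, `speed`, `capacity`), the sample values, and the table name `Cargo_Ship_Flat` are assumptions made for illustration, since the attributes of Fig. 3 are not reproduced here.

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Vertical partitioning: one relation per class in the inheritance chain,
# each keyed by the object identifier.
db.executescript("""
    CREATE TABLE Ship       (oid INTEGER PRIMARY KEY, name TEXT, speed INTEGER);
    CREATE TABLE Cargo_Ship (oid INTEGER PRIMARY KEY, capacity INTEGER);
    INSERT INTO Ship       VALUES (1, 'Aurora', 30);
    INSERT INTO Cargo_Ship VALUES (1, 900);
""")

# Horizontal partitioning: a single flattened relation holding inherited and
# locally defined attributes together.
db.executescript("""
    CREATE TABLE Cargo_Ship_Flat
        (oid INTEGER PRIMARY KEY, name TEXT, speed INTEGER, capacity INTEGER);
    INSERT INTO Cargo_Ship_Flat VALUES (1, 'Aurora', 30, 900);
""")

# Reconstructing the object under vertical partitioning needs a join.
vertical = db.execute(
    "SELECT s.oid, s.name, s.speed, c.capacity "
    "FROM Ship AS s JOIN Cargo_Ship AS c ON c.oid = s.oid WHERE s.oid = ?",
    (1,),
).fetchone()

# Under horizontal partitioning a single-relation lookup suffices.
horizontal = db.execute(
    "SELECT oid, name, speed, capacity FROM Cargo_Ship_Flat WHERE oid = ?",
    (1,),
).fetchone()
```

Both queries yield the same object state. The trade-off shows up elsewhere: adding an attribute to Ship touches one relation in the vertical scheme, but every flattened subclass relation in the horizontal scheme.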

The approach is limited by the data types that the relational system can store in tuple attributes. For example, separate handling must be provided for types such as arrays and methods that are not supported by the relational database. Also, if the object contains other objects (as opposed to referencing them by name), these contained objects must be stored in separate relations and, again, the costly operation of constructing object state from multiple tuples arises. There are advantages to this approach for systems that stress a query interface, and most systems using this approach, in fact, emphasize queries over navigation. Also, because a mature database technology is used as an underpinning, existing database services such as SQL queries, backup and recovery, access control, and concurrency control can be used.

d) Type-based object servers: Type-based object servers not only operate on individual objects, but have the ability to execute the object's methods. The ability to execute methods allows computation to be moved from the application to the storage server, allows the server to execute type-based object-oriented queries, and allows these queries to be further optimized.

Method execution requires the storage server to first materialize a computational representation of the objects; this can in fact be supported by page servers or either of the other two classes of object servers. Thus the extension of one of the other servers to a type-based object server can be accomplished by means of a software layer similar to that which supports the materialization of objects in the application's computational memory. The choice of whether to tightly bundle the added capabilities of type-based object servers with an underlying server or to make them separate modules in the OODB manager is a trade-off between performance and modularity.

2) Supporting the OODB's Object Model: Systems that strive for seamlessness must have a transparent way to get objects from the OODB into computational memory. For example, if a Get_Port message is sent to a ship object, the application ought not to be concerned whether the port is actually a separately persistent object; to do so would break the abstraction provided by the object-oriented model. In fact, it is possible that when the application sending the message was written, the port was really part of the ship object, but was later separated out because of the needs of other applications. Ideally, the Get_Port message ought to behave the same in both cases. Thus one could argue that, from a programmer's perspective, an object consists of everything reachable from the object. The object's root must be explicitly fetched using its OID; most systems provide a name server to map user-friendly names to OID's.

Once the root is identified, objects that it references should appear without further application intervention. This is known as object faulting, and can be supported in a number of ways. In page servers, transparent retrieval (and saving) is implemented by a compiler preprocessor or compiler modifications to generated code to pin pages before they are required and unpin/flush them after use [64]. In OODB's employing object servers, faulting is achieved by either augmenting the message dispatch procedure [59] to perform the residence check and retrieval if required, or by forcing references to unfetched objects to cause a trap to the OODB (typically by an illegal memory address or memory contents), which can then materialize the object [61].
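The first of these techniques, a residence check folded into message dispatch, can be approximated in Python by a placeholder object that materializes its target on first use. This is a sketch under stated assumptions rather than the mechanism of any cited system: `ObjectStore`, `Ref`, and the convention that a stored value of the form `{"oid": n}` denotes an interobject reference are all invented for the example.

```python
from typing import Any, Dict, Optional


class ObjectStore:
    """Toy storage server: maps OIDs to stored attribute dictionaries."""

    def __init__(self, records: Dict[int, Dict[str, Any]]) -> None:
        self.records = records
        self.fetch_count = 0

    def fetch(self, oid: int) -> Dict[str, Any]:
        self.fetch_count += 1
        return self.records[oid]


class Ref:
    """Placeholder for an unfetched object; materializes it on first use."""

    def __init__(self, store: ObjectStore, oid: int) -> None:
        self._store = store
        self._oid = oid
        self._state: Optional[Dict[str, Any]] = None

    def __getattr__(self, name: str) -> Any:
        # Python calls this only when normal lookup fails, so it intercepts
        # exactly the accesses to the stored object's own attributes.
        if self._state is None:                          # residence check
            self._state = self._store.fetch(self._oid)   # object fault
        try:
            value = self._state[name]
        except KeyError:
            raise AttributeError(name) from None
        if isinstance(value, dict) and "oid" in value:
            # Replace a stored reference by a placeholder for its target.
            value = self._state[name] = Ref(self._store, value["oid"])
        return value


store = ObjectStore({
    1: {"name": "Aurora", "port": {"oid": 2}},
    2: {"name": "Rotterdam"},
})
ship = Ref(store, 1)            # only the root is named explicitly
port_name = ship.port.name      # the port faults in without application help
fetches_after_first_use = store.fetch_count
port_name_again = ship.port.name
```

The application code reads the same whether the port is stored inside the ship or as a separate persistent object. Only the fetch count reveals the difference, and repeated accesses cause no further fetches.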

3) Storage Server Performance Issues:

a) Clustering and prefetching: One school of the OODB community believes that following interobject references is the dominant way of accessing objects from OODB's. If application experience with the first generation of OODB's shows this to be true, these interobject references may be used to improve performance. Since the cost to retrieve a group of objects that are physically colocated on the disk is essentially the same as the cost to retrieve one of these objects, disk access time can be reduced by clustering together objects likely to be accessed in the same session. Clustering schemes have long been used in databases; the major difference in OODB's is that the stored objects themselves provide a rich source of information that can be used to drive the clustering mechanisms. Some of the open issues are the following: 1) to what extent interobject references actually drive access patterns, 2) how to cluster with respect to multiple applications with dissimilar access patterns, and 3) whether clustering based on data type or clustering based on interobject references is better. Performance improvements of over 60% because of clustering have been reported by CACTIS [54]. ObServer [74] is a storage server that dynamically clusters according to reference patterns.

Prefetching refers to physically moving certain objects off of disk into a buffer or cache before they are requested, expecting that they will be requested soon. Prefetching has the advantage of reducing communication bottlenecks and has a positive impact on application performance. It is based on much the same strategy as clustering. A potential problem with prefetching is illustrated by the claim that the disparity in processor and disk speeds will make it impossible to determine far enough in advance which objects are likely to be needed [78].

b) Using the storage server's object materialization capabilities to support parallelism: The encapsulation of objects reduces the degree to which objects are dependent on their environment. This facilitates moving objects around a network to evaluate their methods where it can be done most efficiently. Encapsulation appears to simplify the parallel execution of object-oriented code because of the minimal dependence on environment. The object translation capability of OODB's can be used to implement the actual movement of objects between machines. This is an area that presents opportunities to improve performance, particularly with respect to object-oriented queries, since queries tend to have a great deal of parallelism.

c) Measuring storage server performance: DeWitt et al. [79] simulated the performance characteristics of page and object servers. The results indicate that the object server approach will perform poorly with read-only applications that tend to scan large data sets, but will perform generally better than a page or file server for applications performing many updates. The simulation did not investigate possible similar benefits in the object server model from prefetching related objects.

C. Concurrency Control and Transaction Management

One of the primary purposes of a database is to allow sharing of information. As such, an OODB must regulate access to information by multiple concurrent users, each of whom is potentially unaware of the existence of the other users. This section 1) summarizes concurrency control and transaction management as understood and applied in conventional databases, 2) presents OODB-specific techniques that promise a higher degree of concurrent access than achievable using conventional techniques, and 3) discusses how OODB's can support cooperative design environments.

1) Transactions: Any database system must support the notion of atomic, recoverable, and serializable transactions [80]. Atomicity means that a series of operations has an all or nothing effect on the database; either all operations succeed or all fail. This is necessary so that applications see a consistent state. Recoverability can be provided at many levels, and OODB's do not require or impose anything special in this area. Serializability means that if the operations of two atomic transactions are interleaved, the result is as if one ran to completion before the other started. Serializability is considered to be sufficient (but not necessary) for ensuring the proper behavior of independent transactions in a multiuser system.

A transaction tree in which a transaction may contain subtransactions is called a nested transaction [81]. Nested transactions are applicable to database systems other than OODB's. The results of committed subtransactions are visible only to their parents. The results of subtransactions may or may not be recoverable to any given resiliency; this is purely an efficiency issue.

In an OODB, it makes sense for a method invocation to be treated as a transaction, since the actions that implement the method appear atomic to the message sender. Since methods often send messages to other objects, a natural nesting of transactions occurs. Also, because one of the goals of object-oriented programming is to facilitate code reuse, it must be possible to use existing objects that use transactions as part of larger transactions. If object abstraction is not to be broken, this must be possible without modifying the methods of existing object classes. Thus nested transactions are extremely important in OODB's. To date, nested transactions have been primarily a research topic with no commercially available implementations, to our knowledge. A simpler approach often used in practice involves using one global, single-level transaction to wrap a collection of OODB operations.
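The visibility rule for nested transactions (a committed subtransaction's results are seen by its parent, not by the database at large) can be illustrated with a deliberately small sketch. It is written in Python, models the database as a dictionary, and omits locking, recovery, and concurrency entirely; the `Transaction` class and its method names are assumptions for illustration only.

```python
from typing import Any, Dict, Optional


class Transaction:
    """Toy nested transaction over a dictionary-valued database."""

    def __init__(self, database: Dict[str, Any],
                 parent: Optional["Transaction"] = None) -> None:
        self.database = database
        self.parent = parent
        self.writes: Dict[str, Any] = {}

    def begin_sub(self) -> "Transaction":
        """Start a subtransaction nested inside this one."""
        return Transaction(self.database, parent=self)

    def read(self, key: str) -> Any:
        # Look through own writes, then ancestors' writes, then the database.
        txn: Optional[Transaction] = self
        while txn is not None:
            if key in txn.writes:
                return txn.writes[key]
            txn = txn.parent
        return self.database[key]

    def write(self, key: str, value: Any) -> None:
        self.writes[key] = value

    def commit(self) -> None:
        # A subtransaction's results become visible only to its parent;
        # only a top-level commit changes the database itself.
        target = self.parent.writes if self.parent else self.database
        target.update(self.writes)
        self.writes = {}

    def abort(self) -> None:
        # All-or-nothing: discard everything this (sub)transaction wrote.
        self.writes = {}


db = {"fuel": 100}
top = Transaction(db)

sub = top.begin_sub()           # e.g. one method invocation
sub.write("fuel", 80)
sub.commit()
seen_by_parent = top.read("fuel")
seen_in_database = db["fuel"]   # unchanged until the top level commits

failed = top.begin_sub()        # a second invocation that aborts
failed.write("fuel", 0)
failed.abort()

top.commit()
```

Because each subtransaction only buffers writes for its parent, an existing method that begins and commits its own transaction can be called from inside a larger one without modification, which is the reuse property the text argues for.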

If each method invocation is a transaction, it is essential that the cost of a subtransaction be minimal, as they will be so frequent. This implies that subtransactions should not be recoverable. However, it is often desirable for an object to have its methods create recoverable results. If these methods are then invoked from inside another method (itself a transaction), there is now a need for subtransactions to be recoverable, else the behavior of the object changes. For these competing reasons, it is desirable to separate the notion of recoverability from that of atomicity. It should be possible to specify how recoverable a transaction will be. There must also be rules to ensure that results read by a recoverable transaction are recoverable. This is still a research area and, to our knowledge, no implementation of a "variable weight" transaction mechanism exists.

2) Type-Specific Transaction Mechanisms: An object-oriented approach to concurrency control is the notion of type-specific concurrency control [82], [83]. Typically, concurrency control is obtained by examining the read/write behavior of transactions. However, since the behavior of a type is completely defined by its interface, it is possible to construct atomic objects that can be used concurrently in ways not possible in schemes based on read/write behavior only. For example, it is legal to both enqueue and dequeue from a queue object simultaneously, but this cannot be constructed using conventional protocols that depend on read/write behavior.
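The queue example can be restated as two compatibility tests, one based on read/write behavior and one based on the type's own operations. The sketch below is in Python; the operation names and the particular conflict table are assumptions chosen to match the claim in the text, not a table taken from [82] or [83].

```python
from typing import FrozenSet, Set

# Conventional view: classify each operation by how it touches the queue's
# representation. Both enqueue and dequeue modify it, so both are writes.
READ_WRITE_MODE = {"enqueue": "write", "dequeue": "write", "length": "read"}

# Type-specific view: an assumed conflict table stated in terms of the queue
# interface. A one-element set means the operation conflicts with itself.
CONFLICTS: Set[FrozenSet[str]] = {
    frozenset(["enqueue"]),             # two enqueues compete for the tail
    frozenset(["dequeue"]),             # two dequeues compete for the head
    frozenset(["enqueue", "length"]),
    frozenset(["dequeue", "length"]),
}


def compatible_read_write(op_a: str, op_b: str) -> bool:
    """Two operations may run together only if both are reads."""
    return READ_WRITE_MODE[op_a] == "read" and READ_WRITE_MODE[op_b] == "read"


def compatible_type_specific(op_a: str, op_b: str) -> bool:
    """Two operations may run together unless the type says they conflict."""
    return frozenset([op_a, op_b]) not in CONFLICTS


rw_allows = compatible_read_write("enqueue", "dequeue")
type_allows = compatible_type_specific("enqueue", "dequeue")
```

Under the read/write test the enqueue/dequeue pair is rejected, while the type-specific table admits it. The extra concurrency comes entirely from knowledge of the queue's interface, which a scheduler that sees only reads and writes does not have.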

Griffeth, Moss, and Graham [84] extend the notion of atomic objects with the concepts of abstract and concrete atomicity. A transaction is thought of as a movement from one abstract state to another, by means of a sequence of abstract actions. A very flexible concurrency scheme can be built by considering layers of abstractions, in which an abstract action at one level is implemented as a sequence of concrete actions at the next lower level. As long as the actions at each individual level produce a serializable schedule at that level, the total schedule will also be serializable. Different criteria for serializability (two-phase locking, type-specific, etc.) can be employed at the various levels. This allows a much larger class of legal schedules than would be possible without the levels of abstraction. For example, if simple two-phase locking is used at each level, it would be impossible to release locks anywhere in the schedule if additional locks were required later. This restriction does not hold in this model, since two-phasedness is required to hold only within a single abstract operation. As a result, concurrency is increased. This model is particularly well suited for object-oriented databases, since composition of objects provides natural levels of abstraction and encourages the use of type-specific concurrency control for individual objects. Without the notion of abstraction levels, concurrency would either have to be enforced at the object level based on read/write behavior or encapsulation of the subobjects would have to be broken to allow a higher level type-specific concurrency control scheme to be implemented.

3) Long and Cooperative Transactions: Since one of the initial uses of OODB's is expected to be support of next-generation applications (see Section III), there has been considerable attention paid to the concurrency control requirements of such applications, in particular, in computer-aided design (CAD). Design tasks generally involve a team of designers cooperating for a period of days to months. The long duration of these tasks (transactions) means that the concurrency control strategies used in conventional databases are not appropriate [36]. Traditional databases enforce serializable schedules of transactions, with the major differences being in the size of the information grain whose access is individually controlled. Since the traditional transactions are of short duration (seconds to minutes), traditional schemes rely on the blocking transactions terminating quickly, thus enabling other transactions to continue execution. When the transactions execute for long durations, this scheme does not work; a long-running transaction may block out other transactions for days or months.

Support for long transactions is crucial in an OODB. The first issue is to ensure that long transactions can save their intermediate state. This can be accomplished by checkpointing (traditional), nested transactions (see above), or piggy-back transactions (which allow a long transaction to be split into a series of shorter transactions that run sequentially, passing their locks directly to their successor to prevent the intervention of another transaction from outside the sequence). These techniques do not allow increased concurrency.

Sagas [85] relax the restriction that the subtransactions of the saga execute without external interference by requiring that each subtransaction be supplied with a compensating transaction that can undo the subtransaction's effects should the saga abort, even when another unrelated transaction has already executed. Additional concurrency can also be gained by type-specific concurrency control (see above). This allows schedules whose individual operations are not read/write serializable. However, the behavior of the resulting schedule is still serializable in the sense that transactions only see a globally consistent database state.
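The pairing of each subtransaction with a compensating transaction can be sketched as a short driver loop. This is a minimal Python illustration, not the protocol of [85]: the `run_saga` function, the `Step` type, and the inventory example are hypothetical, and real sagas must also persist their progress so that compensation survives a crash.

```python
from typing import Callable, Dict, List, Tuple

# Each step pairs a subtransaction with the compensation that undoes it.
Step = Tuple[Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> bool:
    """Run subtransactions in order; on failure, compensate in reverse order."""
    completed: List[Callable[[], None]] = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


stock: Dict[str, int] = {"steel": 10}


def reserve_steel() -> None:
    stock["steel"] -= 4


def release_steel() -> None:
    stock["steel"] += 4      # semantic undo, not a restore of an old snapshot


def failing_check() -> None:
    raise RuntimeError("design rule check failed")


saga_committed = run_saga([
    (reserve_steel, release_steel),
    (failing_check, lambda: None),
])
```

Note that `release_steel` adds back four units rather than resetting the stock to its earlier value. That is what lets compensation remain meaningful even if an unrelated transaction changed the stock between the two steps.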

In a design team, designers often look at each others' incomplete or inconsistent results to guide their own work. The concept of cooperative transactions relaxes the database consistency requirements to allow transactions to view each others' partial results under certain conditions.

One mechanism to support cooperative work is a check-in/check-out system [86]. Designers wishing to cooperate "check out" design objects from the global database into a private workspace. In the private workspace, designers operate on the objects outside the database's concurrency control. When the designers are done, the objects are "checked in" to the database. To the global database, which enforces minimal concurrency control, the entire collection of operations in the private workspace appears as a single transaction. The check-in and check-out operations are normal (short) transactions against the global database. The check-in/check-out scheme supports flexible concurrency control while not requiring changes to the database's concurrency manager; but it delegates much of the responsibility for data integrity and concurrency control to the designers.

OODB's can also support cooperative work more directly by providing and enforcing a wider variety of lock types than the customary read and write locks [87]. An example is the notify lock, which causes the holder of the lock to be notified in the event that another transaction modifies the locked object. This notification can then be used by the lock holder to either trigger a reread of the object, or a negotiation with the modifying transaction to resolve any inconsistencies. This scheme provides services by which consistency can be maintained, but does not actually enforce consistency.
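A notify lock behaves much like a callback registration on an object. The following Python sketch shows the idea under simplifying assumptions: the `NotifyLockTable` class, its method names, and the string transaction identifiers are invented for illustration, and ordinary read and write locks are left out entirely.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict, List

# A callback receives the object id and the name of the modifying transaction.
Callback = Callable[[str, str], None]


class NotifyLockTable:
    """Toy lock table supporting only notify locks (no read/write locks)."""

    def __init__(self) -> None:
        self.objects: Dict[str, Any] = {}
        self.notify_locks: DefaultDict[str, List[Callback]] = defaultdict(list)

    def notify_lock(self, oid: str, callback: Callback) -> None:
        """Register interest in `oid`; does not block any other transaction."""
        self.notify_locks[oid].append(callback)

    def write(self, oid: str, value: Any, writer: str) -> None:
        """Apply the write, then notify every holder of a notify lock."""
        self.objects[oid] = value
        for callback in self.notify_locks[oid]:
            callback(oid, writer)


table = NotifyLockTable()
inbox: List[str] = []
table.notify_lock("wing", lambda oid, who: inbox.append(f"{oid} changed by {who}"))

table.write("wing", {"span": 34}, writer="T2")
table.write("tail", {"height": 5}, writer="T2")   # nobody holds a lock here
```

The write to `wing` always succeeds; the lock holder is merely told about it. What the holder does next (reread, negotiate, ignore) is left to the application, which is why the scheme supports consistency without enforcing it.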

Read-only or multiversion databases also can provide additional concurrency [88]. Each write creates a new version of the object; thus it is possible for several transactions to be creating new versions simultaneously. This has the drawback that there is no uniform way to reconcile or merge competing versions. The database could provide some tools to identify conflicts in need of resolution when merging versions.

When the set of possible transactions is known in advance, it is possible to define a transaction group [89] whose members are known not to conflict and thus can be interleaved in any order. The transaction groups are determined by examining the semantics of the individual transactions. A similar approach was used in the System for Distributed Databases (SDD-1) [90]. Korth and Speegle [91] define local consistency constraints based on pre and post conditions for individual transactions. In this case, if a transaction has a less rigid requirement for concurrency control, it can run under more relaxed conditions, perhaps allowing the use of uncommitted or only partially consistent objects. Skarra [92] combines the concepts of local consistency and transaction groups into a scheme allowing dynamic transaction groups.

D. Object-Oriented Queries

Database queries retrieve or manipulate information that satisfies some predicate. In other words, information is accessed based on its value, rather than its identity. In relational databases, this corresponds to the retrieval of a relation of tuples via a query language such as SQL. In relational database systems, the targets and results of queries are relations; in object-oriented systems, the targets and results of queries are sets of objects. Object-oriented queries differ from relational queries in three main respects: 1) allowable predicates and response sets, 2) semantics of relationships and inheritance, and 3) query optimization techniques.

1) Allowable Predicates and Response Sets: Objects selected in an object-oriented query can be determined by a predicate involving either the object's abstract interface (type) or the object's implementation (class). Queries over type are purely object-oriented; however, to support them, the OODB must be able to execute the object's methods. Thus queries over type are restricted to OODB's supporting an active object model. Queries over class can be implemented by OODB's supporting only a passive object model. When supporting queries over types, the OODB may choose to allow only non-side-effecting methods to be used in a query predicate. This avoids the necessity of deciding what to do about an unintended side effect caused by the execution of a query. Similarly, restricting queries over class to apply only to certain attributes is reasonable when the OODB uses a relational database to implement its storage server. In such systems, some attribute values (primitive values) are stored in a way understandable by the relational database, while others (bit-buckets, see Section IV-B) are not. The SQL engine of a relational database can support queries over primitive attributes, while additional work would have to be done to support queries over packed attributes. By restricting queries to primitive attributes only, a simple query capability can be added easily.

In a relational database, heterogeneous responses (response sets containing more than one type of tuple) are not possible. However, in OODB's heterogeneous responses are possible (though not always supported) for object-oriented queries. For example, Postgres [52] supports heterogeneous sets, and thus it would be possible to send a List_Items message to a set of both Ships and Cargo_Ships in the example of Section II-B; other systems would consider this to be an error since Ship does not have List_Items in its interface.

2) Semantics of Relationships and Inheritance: Objects have a much higher level of semantics than relational tuples. The additional semantics arise in two ways, namely, 1) rich interobject reference semantics expressible in multiple ways and 2) the use of inheritance to express relationships between classes. Queries over objects must comprehend these additional semantics.

Interobject reference may mean relationship, containment, connectivity, or ownership. For example, assume that we have a part object containing a list of references to other part objects. If a query asks for the list of all sub-parts of the part, it is necessary for the query processor to know if this list of part objects is a list of sub-parts or a list of other parts to which this part is attached. For this reason, OODB's that define a new type system often define explicit types of known interobject references. In persistent language based systems, where references are embedded within the objects as OID's, there must be a way to communicate similar information to the query processor.

Inheritance is used for many purposes [68], [93]; two examples are subtyping and code sharing. It is important for the query processor to know the meaning of the inheritance in a particular use. This is illustrated by the following example. If there is a class Android that inherits from class Person to share code, it is not meaningful for a query asking for persons with some characteristic to return any androids. However, in the example of Section II-B, a retrieval of Ships with speed greater than 50 could return Cargo_Ships. Again, OODB's that define a new data model are free to restrict the meaning of inheritance, while persistent language systems must retrofit to the meaning of inheritance in the language.
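The distinction can be sketched in Python. This is only an illustration under stated assumptions: the `CODE_SHARING_ONLY` registry, the `is_subtype` test, and the `query` helper are hypothetical names invented here, not features of any system surveyed in this paper.

```python
class Ship:
    def __init__(self, name, speed):
        self.name, self.speed = name, speed

class CargoShip(Ship):      # inheritance used for subtyping
    pass

class Person:
    def __init__(self, name, age):
        self.name, self.age = name, age

class Android(Person):      # inheritance used only to share code
    pass

# Hypothetical registry telling the query processor which inheritance
# links exist for code sharing rather than subtyping.
CODE_SHARING_ONLY = {(Android, Person)}

def is_subtype(cls, target):
    """True if cls is target or reaches it through subtyping links only."""
    if cls is target:
        return True
    return any(
        (cls, base) not in CODE_SHARING_ONLY and is_subtype(base, target)
        for base in cls.__bases__
    )

def query(extent, target, predicate):
    """Select objects whose class is a true subtype of target."""
    return [o for o in extent if is_subtype(type(o), target) and predicate(o)]

extent = [Ship("a", 40), CargoShip("b", 60), Person("p", 30), Android("r", 30)]
fast_ships = query(extent, Ship, lambda s: s.speed > 50)    # includes the CargoShip
thirty = query(extent, Person, lambda p: p.age == 30)       # excludes the Android
```

A system that defines its own data model could build such a distinction into the model; a persistent language system would need some side channel of this kind.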

3) Optimization of Object-Oriented Queries: Query optimization in relational databases is accomplished by mapping a query into a graph of algebraic operators like join, semi-join, project, and select and then transforming the graph to a more efficient execution graph. To know "how" the graph can be rearranged, the optimizer must know which operations commute, distribute, and associate. To know which ordering is "better," the optimizer must have knowledge of such things as estimated cost to perform an operation, expected result size, and existence of indexes. These are known for relational algebra on which relational query languages are based. Since object-oriented queries allow arbitrary methods to be used as part of the predicate of a query, neither their algebra nor performance characteristics can be known at the time the optimizer is written.

Optimizing queries in the presence of arbitrary methods is an open issue. Several systems allow database methods to be implemented with arbitrary code. This promotes seamlessness, but allowing arbitrary methods blocks the usual query optimization strategies. Also, side-effecting methods may cause iterating over a collection in different orders to give different results (Andrews [58] presents a scheme for using blocks with identifiable end-markers to avoid this problem). Graefe and Maier [94] explore a mechanism to make the implementation of methods visible to the query optimizer. This is a violation of encapsulation if the optimizer is seen as an application; it is a reasonable extension if the optimizer is seen as a system module that may break encapsulations in a disciplined manner. Systems like Postgres restrict methods to contain only data manipulation language (DML) commands that can be optimized. Other systems do not permit methods in queries but do allow methods to be executed against objects retrieved by class-based queries.

Relational extensions [95], [96] showed how abstract data types (ADT) from a programming language, restricted to not containing pointers, could be imported/exported into relational database fields and how operations on ADT's could be used in queries. This work also showed how to extend standard indexes to provide fast access paths for ADT's and how to define entirely new index types like KWIC and R-trees. These extensions were made by registering ADT's, operators, and abstract indices with the query optimizer. We see a trend toward providing open query optimizers to support not only abstract indexes but also semantic query optimization, cooperative response, incremental view update, incremental query reformulation, and/or parallelism and distributed queries.

A generally useful optimization technique is caching, i.e., saving the result of a computation so that it can be reused rather than recalculated. This becomes very useful if method results are allowed in query predicates. A generic cache management system implemented in Lisp is described in [97]. Postgres uses caching to compute derived representations of complex objects, like forms, so they are immediately available when needed. If the data from which they were derived are modified, a trigger recomputes the cached representation. Similar data caching mechanisms can be used to support view materialization [98], [99] and could be used to maintain consistency of computed indexes.
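As a rough illustration of caching method results, the following Python sketch stores a derived value and discards it when the underlying data change. It is a simplification under stated assumptions: the `Ship` class and its `deck_area` method are hypothetical, and the cache is invalidated lazily on update rather than recomputed eagerly by a trigger as described for Postgres.

```python
class Ship:
    def __init__(self, length, beam):
        self._length, self._beam = length, beam
        self._cache = {}          # cached method results for this object
        self.computations = 0     # counts real computations, for illustration only

    def deck_area(self):
        """Derived value; computed once, then served from the cache."""
        if "deck_area" not in self._cache:
            self.computations += 1
            self._cache["deck_area"] = self._length * self._beam
        return self._cache["deck_area"]

    def set_length(self, length):
        """Update the source data and drop the stale cached result."""
        self._length = length
        self._cache.pop("deck_area", None)

ship = Ship(10, 3)
ship.deck_area()      # computed
ship.deck_area()      # served from the cache
ship.set_length(20)   # invalidates the cached value
ship.deck_area()      # recomputed
```

A query predicate that calls `deck_area` on many objects would then pay the computation cost only for objects whose source data changed since the last evaluation.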

E. Change Management

There is general agreement in the research and industrial community [71], [100] that change management is an important service to be provided in an object-oriented environment. Change management (CM) may be provided as an integral part of the data model or as a distinct layer decoupled from the data model. Some OODB architectures have embedded CM as an intrinsic part of the model [65]; others implement it as a distinct layer [61]. Most existing software change management systems [101]-[103] interact closely with file systems and only support the management of change at the file level. An object-oriented environment needs a change management system that provides a unified environment supporting the evolution and configuration of arbitrary objects. Systems like PIE [104] and Common Lisp Framework [105] address this need in specific domains by providing an object-oriented framework to manage change at the level of granularity indicated by the semantics of the application. A reference model for a generic CM system is described in Joseph et al. [106].

1) Change Management Definitions: CM may be defined as a consistent set of techniques that aid in evolving the design and implementation of an abstraction. These techniques may be applied at many levels to record history and explore alternatives (versions), manage a layered design (configurations), and maintain consistency during evolution and across multiple representations (transformations). The operational model of change management is tightly related to the transaction model in the system. A model supporting nesting of transactions, exceptions, and notifications is needed for a full implementation of a CM system. Since inheritance is an important part of object-orientedness, a CM system also needs to understand the additional constraints that the inheritance model imposes on the environment. It is important that the system support objects of arbitrary types at different levels of granularity. Users are also concerned about interfaces to the change management system; for example, graphical presentations and query facilities. CM should document the evolution of objects for purposes of validation [107], traceability and reuse. Change management systems exist to assist the management of data; this support role implies that a CM system that is obtrusive, low performance, or unfriendly may not be used.

a) Versions: The life cycle of a system is a set of discrete activities occurring during its development and use. The objects in the system evolve during this life cycle. A snapshot of an object during this evolution is called a version of the object. This snapshot may be distinguished from others by its creation time or by some other quantitative or qualitative attributes. An object is represented by its many versions during its life cycle. The version derivation sequence of an object reflects how the object evolves. The simplest way an object could evolve is linear: changes to the object always occur on the current version. In many applications, however, changes often occur in a nonlinear fashion. Therefore, a change management system needs to support alternatives or branching versions. Also, the desired object may be selected from a combination of several possibilities, implying the system needs to support merging of versions. Since objects are usually structured hierarchically, the creation of versions of an object can trigger versions of other objects in the hierarchy. It is often desirable to allow the application to control the triggered creation of versions in a hierarchy [108].

A version of an object may be represented as a delta from a "previous" version. Such differential representations save space when the object is large or when the changes are small and frequent. Differential representation trades off computation time for storage space.
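A differential representation can be sketched as follows. This is a minimal Python illustration, assuming objects are flat attribute dictionaries and versions evolve linearly; the `VersionedObject` class and its method names are hypothetical, and branching and merging are not modeled.

```python
class VersionedObject:
    """Stores version 0 in full and every later version as a delta."""

    def __init__(self, initial):
        self._base = dict(initial)
        self._deltas = []   # one (changed attributes, removed attribute names) per version

    def commit(self, new_state):
        """Record new_state as the next version; returns its version number."""
        current = self.get(len(self._deltas))
        changed = {k: v for k, v in new_state.items()
                   if k not in current or current[k] != v}
        removed = set(current) - set(new_state)
        self._deltas.append((changed, removed))
        return len(self._deltas)

    def get(self, version):
        """Rebuild a version by replaying deltas: computation traded for space."""
        state = dict(self._base)
        for changed, removed in self._deltas[:version]:
            state.update(changed)
            for key in removed:
                state.pop(key, None)
        return state

part = VersionedObject({"material": "steel", "mass": 2})
v1 = part.commit({"material": "steel", "mass": 3, "finish": "matte"})
v2 = part.commit({"material": "steel", "finish": "matte"})
```

Retrieving a late version replays every earlier delta, which shows the time cost that the space saving is bought with.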

b) Configurations: It is frequently necessary to design an object by composing it from other objects (see Section II). A configuration of an object is its specification as a composition of other objects. The specification may be bound either statically or dynamically. In static binding, the exact versions of the components that make up the configuration are specified; in dynamic binding one may use various schemes to delay the binding of versions of components [104], [109]. It is sometimes appropriate to bind the component objects only to their interfaces; this way a change management system can select the appropriate implementation of the object dynamically. A configuration structure is, in general, a directed acyclic graph. The links in this graph may be adorned with properties to associate with an object and its components. One such property is selective inheritance of some properties of the composite object. Properties common to many components of an object may be attached to the object and inherited by these components. This kind of inheritance promotes performance and consistency. Another link property is that of ownership [110]; this property asserts that a component exists only by virtue of its being a part of another object. Physical hierarchies can be simulated in object-oriented design by means of ownership links.
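Static and dynamic binding of component versions can be contrasted with a small Python sketch. The "latest version" policy used for dynamic binding is just one assumed scheme among the many possible, and the `resolve` function and the data layout are hypothetical.

```python
def resolve(config, versions):
    """Bind each component of a configuration to a concrete version."""
    bound = {}
    for component, binding in config.items():
        available = versions[component]
        if binding == "latest":
            # Dynamic binding: the version is chosen when the configuration is used.
            bound[component] = available[max(available)]
        else:
            # Static binding: the exact version was fixed in the specification.
            bound[component] = available[binding]
    return bound

versions = {"engine": {1: "engine-v1", 2: "engine-v2"}, "hull": {1: "hull-v1"}}
static_config = {"engine": 1, "hull": 1}
dynamic_config = {"engine": "latest", "hull": 1}

before = resolve(dynamic_config, versions)
versions["engine"][3] = "engine-v3"       # a new engine version appears
after = resolve(dynamic_config, versions)  # picks it up; static_config would not
```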

c) Transformations: Operations applied to objects during their life cycle are transformations. There are transformations such as editing, simulation, and analysis applied to particular views of an object. There are also transformations, variously called translations, compilations, expansions, or synthesis, to bring an object from one view to another. In heterogeneous or multiuser environments, there are transformations to transport objects across machines or development environments. As a result of transformations, new objects are created. These new objects are different from the original objects in versions, configurations, or representations. The transformation aspect of change management addresses issues such as change notification, change propagation, dependency tracking, and constraint maintenance. What is required, in general, is a constraint specification and management component. A body of work exists in the area of constraint specification and enforcement [111]-[115]. A transformational design paradigm for software development and application of the paradigm to VLSI design are studied in Mostow and Balzer [116].

2) Schema Evolution: The objects in an OODB are defined by a set of type and class definitions. The inheritance structure of an object-oriented system defines a relation is-a-subtype/subclass on this set of types/classes. Under this relation, the set of types/classes is a directed acyclic graph called the schema. The schema evolves by changes to the set of behaviors associated with a type, the structure of the type/class hierarchy, the physical organization of the class instances, or the methods implementing the behavior [59], [117].

Schema evolution can be seen as an application of the change management system where type/class definitions are the objects to be versioned. Inheritance imposes a configuration structure on the types/classes. The user is concerned with the effect that a versioning of a type/class has on existing instances; this is the domain of the transformation aspect of change management. Conceptually, therefore, a change management system covers all aspects; but there are difficult and interesting practical issues because of the extensional semantics (the associated instances and methods) of a class [87]. When a type/class evolves, the semantics may require that existing instances of the type/class change to conform to the new definition. The operational details and policies are provided by the change management system. The policies about when (immediate or at access time) and how (versioning or overwrite) the instances are updated for conformity have a major influence on performance and functionality. To assure structural consistency of old objects with new programs and to assure that existing programs continue to work, it is necessary to keep versions of the types, classes, instances, and programs. The programs need to know the particular versions of the types/classes that are correct for them. A fully functional change management system is required at all stages of software development and use to guarantee the correct schema evolution semantics. Schema evolution in object-oriented systems is inherently more complex than schema evolution in relational databases; this is because of the additional semantics associated with objects, namely, inheritance and behavior. Most existing object-oriented systems (programming and database) have very limited schema evolution capabilities. CLOS supports a limited form of schema evolution [17]. The reader is referred to Joseph et al. [71] for a discussion of schema evolution support in their systems by the designers of some OODB's.

Schema evolution presents one of the most dramatic rationales for the separation of type and class. The promotion of existing instances from one class definition to another, and the recompilation of programs to correspond to new definitions is a major expense. By separating the concepts of type and class, many of these operations can be avoided for certain kinds of schema evolution. For example, allowing several classes to implement the same type and then requiring programs to interact with objects by their type allows instances of old and new classes implementing the same type to coexist in memory, thus eliminating the need for instance promotion or program recompilation.
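The coexistence of old and new classes behind one type can be illustrated in Python, using `typing.Protocol` to stand in for the type (abstract interface). The `Point` type and both classes are hypothetical examples; the sketch shows only the programming-language side and assumes nothing about how an OODB would store the instances.

```python
import math
from typing import Protocol

class Point(Protocol):
    """The type: an abstract interface that programs depend on."""
    def x(self) -> float: ...
    def y(self) -> float: ...

class CartesianPoint:
    """The old class implementing Point."""
    def __init__(self, x, y):
        self._x, self._y = x, y
    def x(self):
        return self._x
    def y(self):
        return self._y

class PolarPoint:
    """A new class implementing the same type with a different representation."""
    def __init__(self, r, theta):
        self._r, self._theta = r, theta
    def x(self):
        return self._r * math.cos(self._theta)
    def y(self):
        return self._r * math.sin(self._theta)

def distance_from_origin(p: Point) -> float:
    """Written against the type, so it works for old and new instances alike."""
    return math.hypot(p.x(), p.y())

old_instance = CartesianPoint(3.0, 4.0)
new_instance = PolarPoint(5.0, 0.7)
distances = [distance_from_origin(p) for p in (old_instance, new_instance)]
```

Neither the existing `CartesianPoint` instances nor `distance_from_origin` had to change when `PolarPoint` was introduced, which is the saving the paragraph above describes.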

F. Other Database Issues

Given the immaturity of the OODB field, issues such as access control, remote database access, interlanguage sharing of objects, and user interfaces have not received adequate attention. However, these are areas that are likely to receive more attention in the next few years.

1) Access Control: Because OODB's store active objects, it is possible for the objects to perform their own access control. For example, an object could demand authentication before it performed some service. An interesting issue is how access control will interact with querying, since queries are generally performed under control of the database rather than the application. The question then becomes: who needs authentication, the application or the OODB? Also, would this preclude some optimization strategies, since some reorganizations of the query graph might be precluded by the need to maintain the access checks in the same relative positions in the graph?

2) Remote Database Access: Because OODB's present an abstract view of information that is representation independent, OODB's may form an excellent way to access heterogeneous databases, presenting the user the impression that there is a single OODB being accessed. A problem with this approach is to coordinate the transaction/commit mechanisms of the "foreign" databases to ensure atomicity.

3) Interlanguage Sharing: It is desirable to be able to share objects across hardware/software platforms, and also between different programming languages. When sharing objects across platforms/compilers, the data types' representations may be different based on different hardware word length, byte ordering, formats of structures like arrays and records, etc. This implies that the OODB must translate objects; systems that copy uninterpreted byte strings will not be able to support such sharing. There has been substantial effort [118]-[121] in other contexts to provide translation of primitive types (such as integers and reals) between machine classes. For OODB work, this must be extended to complex data structures. Some applicable work is the MIT Mercury system [122], but Mercury does not preserve sharing within a graph structure.
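The need for translation, as opposed to copying uninterpreted bytes, can be seen in a small Python sketch using the standard `struct` module. The record layout (a 32-bit integer followed by a 64-bit float) and the function names are assumptions made for illustration; real translation must also handle word length, alignment, and references between objects.

```python
import struct

def _fmt(byteorder):
    # "<" and ">" select little- and big-endian with standard sizes and no padding.
    return ("<" if byteorder == "little" else ">") + "id"

def pack_record(part_id, weight, byteorder):
    """Encode a (part_id, weight) record in the given byte order."""
    return struct.pack(_fmt(byteorder), part_id, weight)

def translate(raw, src, dst):
    """Decode a record stored in src byte order and re-encode it for dst."""
    part_id, weight = struct.unpack(_fmt(src), raw)
    return pack_record(part_id, weight, dst)

stored = pack_record(1, 2.5, "big")                  # written on a big-endian machine
translated = translate(stored, "big", "little")      # usable on a little-endian machine
misread = struct.unpack("<i", stored[:4])[0]         # copying bytes unchanged misreads the id
```

Here `misread` is not 1, since the four bytes of the identifier are interpreted in the wrong order.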

Sharing across language boundaries raises the additional complication that not all data types are supported in all languages. Even for data types that appear to be universally supported, the semantics vary from language to language. For example, some languages require arrays to be homogeneous (all data elements of the same type) while others do not. Further, some languages allow an array to grow in size as needed, while others require its size to be declared at compile time. A system wishing to share representations between languages must decide whether it wishes to take a "least common denominator" approach that supports the least powerful features of the supported languages, or perform some runtime checking that will prevent a data structure from being moved to an inappropriate environment. Each approach has the disadvantage of requiring a programmer to know the potential usages of objects. Additionally, since methods may be arbitrary programs, they cannot be translated between languages. This would either require coding presumably identical methods in the various languages supported (with the resulting uncertainty of equality), or destroy the object-orientedness of the system entirely. While limited sharing is likely in a few years, a total and transparent ability to share across language boundaries remains a distant hope.

In OODB's such as Trellis/Owl [12] that support a separate data model, access from "conventional" programming languages is by means of messages sent from the conventional language to objects in the data model. This requires a dual type system, but does allow objects in the data model to be used from multiple languages in much the same way relational databases can be used from multiple languages today.

4) User Interfaces: OODB's are likely to use the same user interface technology as other applications, since user interface technology is becoming increasingly independent of backend applications. General-purpose object-oriented user interface toolkits like Motif [123] or Interviews [124] will provide application program interfaces including libraries for tables, forms, and other presentation styles. Higher level User Interface Management Systems (UIMS's) and User Interface Builders (UIB's) will hide even these programming interface details from users.

Data structure inspectors and schema browsers/editors, which OODB's will use, will be needed in ordinary programming environments independent of whether OODB's supply the persistence. In fact, it will be most desirable if the same editors will operate on persistent and transient objects.

On the other hand, OODB's will make it easier to store and manipulate multimedia data (image, video, audio), as well as graphics and text data, making it easier to construct much larger multiuser, distributed hypermedia systems and spatial database applications. OODB's make it possible to use object queries to query mail, document structure, and other semistructured data to build rich views of information.

V. A SURVEY OF OODB SYSTEMS

In this section, we present a brief survey of a few typical OODB systems. This is not a comprehensive listing. Interesting systems that we do not cover include DEC Trellis/OWL™ [12], Symbolics Statice [60], Object Design ObjectStore [69], Versant OBJECT-Base [125], and Objectivity Objectivity/DB [126]. The information below is based on published material and may not be the most up to date as these systems are undergoing rapid evolution.

Of the seven systems surveyed here, the first two (Ontos, Gemstone) are commercial products and the third (MCC Orion) has a commercial variant; the others are research prototypes.

Figure 4 shows how these systems fit into the taxonomy of databases discussed in Section III-C-2).

™Trellis is a trademark of Digital Equipment Corporation.

A. Ontologic Ontos

Ontos [62] is a commercial object-oriented database system developed by Ontologic. It provides persistence to C++ programs. Instances of classes that inherit from an Object class can be made persistent. Such classes have some restrictions to guarantee that the size of instances can be determined by Ontos. Additional methods to support persistence may also need to be defined for such classes. Ontos interfaces directly to programs written in C++ via a client library of C++ functions and classes. The library, which is linked directly into application programs, provides applications access to the database by means of persistent classes, schema classes, Aggregates, and exception classes. Schema classes are used to manage the class information (data dictionary) and provide run-time type information; Aggregate classes form the basis of arrays, lists and sets and deal efficiently with groups of objects via associated Iterators; exception classes are used for consistent detection and handling of run-time error conditions.

Ontos supports concurrent users through a transaction mechanism based on locks. Transactions have a number of options to support short and long transactions. Ontos provides both pessimistic and optimistic concurrency control and checkpointing. A programmatic SQL (embedded in C++) is provided for associative access to data. An associative query may include methods or functions in any of its clauses; any side effects are visible in the database. The query may iterate over instances of classes or of Aggregates.

Ontos can be distributed over a network of nodes, each of the same hardware family and operating system. The currently supported families are Sun/Unix, Apollo/Unix, HP/Unix, VAX/VMS, and PC/OS2. Ontologic customers are mostly in the engineering and CAD/CAM markets. A previous OODB product from Ontologic was called Vbase [58].

B. Servio Corporation Gemstone

Gemstone™ [59] is a commercial object-oriented database system developed by Servio Corporation. A language, OPAL, based on Smalltalk [11] is provided for data definition, data manipulation, server access, and general computation. Secondary storage management, concurrency control, transactions, and work spaces are managed by the STONE subsystem built on top of a file system. The GEM subsystem supports the OPAL language and libraries of OPAL classes and methods. Gemstone also provides a module callable from multiple languages to link with applications running on a PC. Associative access to objects is provided through a calculus limited so that queries are viewed as OPAL procedures. A detailed discussion of indexing in object-oriented databases and the particular implementation choices in Gemstone appears in Maier et al. [43]. Gemstone is designed as a multiuser system and supports transactions, replication of data, and multilevel authorization control. It also provides the capability to extract data from SQL relational database systems. The OPAL programming environment that runs on the local workstation includes a class browser, workspace manager, inspector, and debugger. There is no support for version control or configuration management.

Gemstone is written in C and runs on DEC (VAX and DECstation), Sun (Sun-3 and Sun-4), and IBM (RS6000) computers. Client applications may be written in several languages and run on various IBM PC's and Apple Macintoshes in addition to the machines listed above. Servio Corporation's customers are in CAD/CAM, CASE, and text-oriented applications including configuration management and documentation systems.

™Gemstone is a registered trademark of Servio Corporation.

C. MCC Orion

Orion [56], [127] is a database system developed in the Advanced Computer Architecture Program at MCC. The major objective of Orion is the integration of a programming language with a database system. Orion does this by adding persistence and sharing to objects created and manipulated in object-oriented applications. The Orion data model supports multiple inheritance, schema evolution, versioning of objects, composite objects, associative queries, and transaction management. Queries are done on members of a class. Only objects that are instances of Orion classes can be made persistent. The application interface to Orion is an object-oriented extension to Common Lisp. The query language allows user-defined functions in the selection predicates; there are some restrictions on the objects the queries can return. Orion provides programming level control over physical clustering of objects, maintenance of secondary indexes, and transaction management. Details of transaction management in Orion appear in Garza and Kim [37]; implementation details of the buffer management scheme are given in Kim et al. [128].

Orion-1SX is a multiuser version of Orion in which a single server provides persistent object management to multiple workstations. Orion-2 is a fully distributed version of Orion-1SX. Orion was implemented in Common Lisp on Symbolics 3600 workstations and then ported to SUN workstations running UNIX. A commercial product called ITASCA, based on Orion technology, is marketed by Itasca Systems Incorporated.

D. HP Iris

Iris [65], [129] is a research prototype of an object-oriented database system developed by Hewlett-Packard. The Iris system consists of 1) a query processor, or object manager, that implements the object model, 2) a storage manager subsystem providing access paths, concurrency control, backup, and recovery, and 3) a collection of programmatic and interactive interfaces. The data model supports structural and behavioral abstractions. In Iris, information about objects is modeled using relationships. Attributes are modeled via functions whose values are derived from the relationships. The query processor translates Iris queries and operations into an internal relational algebra format. The Iris storage manager is similar to the RSS in System R [130]. The storage manager provides for the dynamic creation and deletion of relations, transactions with check-pointing, and indexing. One of the interfaces is an object-oriented extension to SQL. The two main extensions are that direct references to objects rather than keys are used and that functions defined by the user or Iris can appear in SELECT and WHERE clauses. A second interactive interface to Iris is a structure browser to view the Iris schema data. Version control is coupled with the data model; schema versioning is implemented by a schema version identifier that is associated with each object.

The Iris prototype is implemented in C on HP-9000/320 UNIX workstations. The storage is on HP's ALLBASE Relational DBMS. There is a version of Object SQL embedded in Lisp to access Iris from Lisp applications.

E. TI Zeitgeist

The Zeitgeist [61] OODB system under development at Texas Instruments is designed to support design applications and large-scale object-oriented applications by providing a seamless interface to persistent objects from programming languages. The Zeitgeist architecture is modular and is composed of the persistent object store, the object management system, the set-oriented query interface, the change management system, and the user interface system. The persistent object store provides storage for objects, concurrency and control primitives, and atomic transactions. The object management system oversees the translation of objects between computational and stored formats [131], and provides a transparent, on-demand retrieval mechanism called object-faulting. The programmatic interface is via messages to objects. Zeitgeist does not introduce a new data model; instead, it supports the data model of the programming environment. A change management system implemented as an abstract machine on top of Zeitgeist provides versioning and configuration support. Schema evolution is supported using the change management system. A hypermedia system provides database browsing.
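The general idea of on-demand retrieval can be sketched in Python as a placeholder that loads its target on first use. This illustrates the technique only and is not Zeitgeist's implementation; the `Store` and `ObjectFault` classes and the `load` method are hypothetical.

```python
from types import SimpleNamespace

class Store:
    """Stand-in for a persistent object store, keyed by object identifier."""
    def __init__(self, objects):
        self._objects = objects
        self.loads = 0            # counts retrievals, for illustration only

    def load(self, oid):
        self.loads += 1
        return self._objects[oid]

class ObjectFault:
    """Placeholder that fetches the real object the first time it is touched."""
    def __init__(self, store, oid):
        self._store, self._oid, self._target = store, oid, None

    def __getattr__(self, name):
        # Invoked only for attributes the placeholder itself lacks,
        # i.e., the attributes of the stored object.
        if self._target is None:
            self._target = self._store.load(self._oid)
        return getattr(self._target, name)

store = Store({7: SimpleNamespace(name="rudder", mass=120)})
ref = ObjectFault(store, 7)   # nothing retrieved yet
part_name = ref.name          # first access triggers the retrieval
part_mass = ref.mass          # later accesses reuse the loaded object
```

The application code that reads `ref.name` does not need to know whether the object was already in memory, which is the sense in which the retrieval is transparent.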

There are two implementations of Zeitgeist: one running on Unix workstations and supporting applications written in C++; another running on Unix workstations and TI Explorer Lisp machines and supporting CLOS applications. Both implementations currently support single server, multiclient configurations. Zeitgeist is being used by computer-aided design and manufacturing applications within TI.

F. Altair O2

O2 [76] is an object-oriented database system being developed by the Altair consortium in France. It has the functionality of a DBMS (persistence, disk management, sharing, and query language) and that of an object-oriented system (complex objects, object identity, encapsulation, typing, inheritance, overriding, extensibility, and completeness). O2 supports a set of database programming languages, CO2 (a combination of the O2 data model and the C programming language) and BasicO2 (a combination of the O2 data model and the Basic programming language); a set of user interface generation tools (LOOKS); and a programming environment (OOPE). O2 consists of eight functional modules: OOPE, the alphanumeric interface, LOOKS, the language processor, the query interpreter, the schema manager, the object manager, and the disk manager.

The disk manager takes care of I/O, data placement, indexing, and buffering. The object manager maps the abstract object data model onto the disk representation. The schema manager deals with schema information such as types and programs. The language processor manages the data definition language commands and the compilation of programs. It also populates the schema by sending orders to the schema manager. The query interpreter is responsible for interpreting queries using the object manager and the schema manager. LOOKS manages the screen, displays objects and values, and handles their interaction with the object manager. OOPE is the programming environment; it uses LOOKS to display and manage data on the screen. The alphanumeric interface provides direct access to the various languages of the system, without using graphical facilities.

G. Berkeley Postgres

Postgres [52], [66] is a prototype database system developed at the University of California at Berkeley. It is, from different points of view, an "extended relational" DBMS, a "nested relational" DBMS, and an OODB. It extends relational databases in several ways. Where most relational systems provide only a few built-in types and operations on them, Postgres provides three kinds of user-defined types and three kinds of user-defined operations on them. The system supports a limited number of base types. Any user can define new abstract data types (ADT's) by specifying functions to convert instances of the type to and from the character string base data type. These ADT's cannot contain pointers. Finally, Postgres supports constructed types and structural inheritance. The user can register functions, written in C or Lisp, that operate on ADT's and pose queries using these ADT's and ADT-specific operations. To permit optimization in queries, the user can define one- and two-operand operators that define equality and ordering so that B-trees can be used as indices. The user can even define new index types to Postgres using a registration protocol. Finally, the user can define POSTQUEL functions, which allow a field in a relation to be defined as a query (nested relation). As in HP Iris, stand-alone data structures cannot be made separately persistent; the only first class persistent objects are sets.

Postgres also implements a rules system. Triggers can be defined as "always" or "never" running Postgres data manipulation commands that maintain consistency relationships. A complex marking scheme is used to implement Postgres rules efficiently, supporting eager and lazy evaluation. Postgres implements a "no-overwrite" storage manager instead of a traditional write-ahead log. In this scheme, modified or deleted records remain in the database, making transaction aborts instantaneous. Since old records as well as current ones are available in the database, queries on past states of the database are supported. However, queries that span past database states are hard to specify.
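The no-overwrite idea can be sketched in a few lines of Python. This is a simplified illustration and not the Postgres storage format: the class `NoOverwriteStore`, the fields `created_by` and `ended_by`, and the logical commit clock are all assumptions made for the example, and transactions are assumed to run one at a time. Updates append a new version and only mark the old one as ended, so an abort has nothing to undo, and a read can be evaluated against any past commit time.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Version:
    key: str
    value: object
    created_by: int                 # transaction id that wrote this version
    ended_by: Optional[int] = None  # transaction id that replaced or deleted it


class NoOverwriteStore:
    """Append-only store: old versions stay in place and are never modified."""

    def __init__(self) -> None:
        self.versions: List[Version] = []
        self.commit_time: Dict[int, int] = {}  # transaction id -> logical commit time
        self.clock = 0
        self._next_xid = 1

    def begin(self) -> int:
        xid = self._next_xid
        self._next_xid += 1
        return xid

    def commit(self, xid: int) -> int:
        self.clock += 1
        self.commit_time[xid] = self.clock
        return self.clock

    def abort(self, xid: int) -> None:
        """Nothing to undo: versions touched by xid never receive a commit time."""

    def _committed(self, xid: Optional[int]) -> bool:
        return xid is not None and xid in self.commit_time

    def _end_live_version(self, xid: int, key: str) -> None:
        for v in self.versions:
            if v.key != key:
                continue
            written = v.created_by == xid or self._committed(v.created_by)
            ended = v.ended_by == xid or self._committed(v.ended_by)
            if written and not ended:
                v.ended_by = xid  # mark only; the old value itself is kept

    def put(self, xid: int, key: str, value: object) -> None:
        self._end_live_version(xid, key)
        self.versions.append(Version(key, value, created_by=xid))

    def delete(self, xid: int, key: str) -> None:
        self._end_live_version(xid, key)

    def read(self, key: str, as_of: Optional[int] = None):
        """Return the committed value of key as of a logical commit time (default: now)."""
        t = self.clock if as_of is None else as_of
        for v in self.versions:
            if v.key != key:
                continue
            born = self.commit_time.get(v.created_by)
            died = self.commit_time.get(v.ended_by)
            if born is not None and born <= t and (died is None or died > t):
                return v.value
        return None


store = NoOverwriteStore()

t1 = store.begin()
store.put(t1, "part", "rev A")
first_commit = store.commit(t1)

t2 = store.begin()
store.put(t2, "part", "rev B")
store.abort(t2)  # instantaneous: "rev B" simply never becomes visible

t3 = store.begin()
store.put(t3, "part", "rev C")
store.commit(t3)

current = store.read("part")
historical = store.read("part", as_of=first_commit)
```

The cost of this design is visible in `store.versions`: the aborted and superseded versions remain stored, so space grows with every update unless old versions are archived or reclaimed separately.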

Postgres was implemented originally in C and Lisp and later reimplemented in C. A Postgres prototype is available from Berkeley.

VI. OODB STANDARDIZATION EFFORTS

As Sections III-V have indicated, there is a large diversity in approaches for OODB's. The need for experimentation in approaches continues. On the other hand, there are now several OODB products and substantial research prototypes. Several researchers have stated requirements for OODB's and have offered surprisingly similar definitions and initial specifications for consideration [1], [2], [35].

There is a real need, driven by industry and government [132], for reaching consensus on OODB functionality. The value of a standard OODB would be interoperability and interchangeability. A standard would insulate applications using OODB technology from incidental differences between different OODB systems. A number of industry groups are working to accelerate the convergence process toward OODB standardization. A "standard" OODB may be far in the future; however, it appears that in several areas, such as Persistent (X), where X is an object-oriented language, there are good prospects for consensus leading to standards.

A. X3/SPARC/DBSSG/OODBTG

In January 1989, the Database Systems Study Group (DBSSG), one of the advisory groups to the Accredited Standards Committee X3 (ASC/X3), Standards Planning and Requirements Committee (SPARC), operating under the procedures of the American National Standards Institute (ANSI), established a task group on object-oriented databases (OODBTG).

OODBTG seeks to facilitate further development and use of OODB technology by defining a common reference model for an OODB, based on object-oriented programming and database management systems models. OODBTG is assessing whether and where standardization on OODB's is possible and useful. Some areas of possible standardization include glossary, reference model, operational model, interfaces, and data exchange. In mid-1991, OODBTG will issue a final report recommending how ASC/X3 should pursue standards in the OODB area, and how OODB standards would be related to existing standards like X3H4 (SQL), X3J16 (C++), X3J13 (Common Lisp), and others.

B. Object Management Group and Other Application Integration Frameworks

An industrial consortium called Object Management Group (OMG) was formed in April 1989 to build an object-oriented application integration framework. The objective of OMG is to accelerate the formation of complementary technologies that can provide the basis for improved application portability. OMG now has over 70 members. It is one of several industrial consortia that aim at integration frameworks. Others are Portable Common Tools Environment, Engineering Information System, CASE Integrated Services, and CAD Framework Initiative.

A unifying theme of all these frameworks is to view future computer systems as collections of applications and services. Examples of common services include a common user interface, a common help system, and a common database system. In object-oriented frameworks, applications invoke services by sending messages. When common services are available, application designers do not have to reinvent the service for every application, which enhances reuse, and end users get the benefit of consistent semantics.

Many framework groups view an object-oriented database as a backbone of their system and are working in different ways towards a consensus-based solution. Some, like OMG, are planning to issue proposals to industry for a common OODB system. Others, like PCTE and CIS, have fairly detailed specifications in progress for enterprise-wide OODB's.

C. Benchmarks and Conformance Testing

As mentioned in Section III-B, there is a growing interest in the OODB community in developing OODB performance benchmarks that will help to quantify and tune different OODB architectures [132]. As yet there are no efforts to build benchmarks for standards conformance testing to certify OODB interoperability compliance, since formal standards do not exist.

VII. SUMMARY AND CONCLUSION

Object-oriented database systems aim at meeting the data modeling, performance, cooperative design support, and version management requirements of next-generation applications, such as CAD, CAM, CASE, hypermedia, and expert systems. We began this paper with background information on object-oriented concepts, such as objects and object identity, object classes, inheritance, and message passing. We then described the requirements of next-generation applications. These requirements are: rich data modeling, distributed and platform-independent object storage, navigational as well as query access to objects, transactions appropriate for cooperative design work, sharing of objects among application systems, seamlessness, support for evolution of object instances and object schema, and adequate performance. Conventional databases were designed to support commercial data processing applications that are characterized by simple data types, short-duration operations, and set-oriented associative access to data. These databases fail to meet the data modeling and performance requirements of next-generation applications.

We described the characteristics and features of an OODB; the primary novel features are the support for complex objects, the notion of object identity, inheritance, and encapsulation of object behavior. The characteristics and features are chosen so that OODB's can support the needs of next-generation applications with acceptable performance. There are different approaches to designing an OODB; we have presented a taxonomy of these approaches.

We presented key OODB architectural and implementation issues, design alternatives, and trade-offs in the areas of object data models, persistent object storage and retrieval, concurrency control and transaction management, query processing, version management, and schema evolution. We presented a brief survey of seven OODB systems: two commercial products (Ontologic Ontos, Servio Gemstone), and five research prototypes (MCC Orion, HP Iris, TI Zeitgeist, Altaïr O2, Berkeley Postgres). It is clear that there is a substantial divergence on many key issues within the OODB community, indicating that consensus and standards on some of the key issues are several years away. However, since there are several areas where technical consensus is possible and there is strong pull from industry for standards, there are good prospects that standardization activity (already begun) can succeed. We also expect to see further work and consensus on performance benchmarks and continued performance improvements.

Interestingly, both the OODB and conventional database schools seem to be heading in the same direction, i.e., toward the use of an object-oriented data model to permit richer data modeling. The two schools differ in how this will be achieved. Not surprisingly, today's established database vendor community likes to retain the basic relational architecture and accommodate object extensions to it. Today's OODB vendors feel that the basic database architecture itself should change, while supporting queries as a necessary capability. The OODB school and the extended relational database school are vigorously debating their approaches. We believe that the approaches taken by the OODB school have an advantage over an extended relational database when "seamless" or "low-impedance" integration of object-oriented programming languages (such as CLOS, C++, Smalltalk) with database amenities becomes an important ingredient for software productivity and reliability. On the other hand, extended relational database systems have an advantage over OODB's when an evolutionary migration path from a well-established relational database technology base is an important consideration. Eventual synthesis between these schools is likely; however, both schools face the same challenging research questions, such as how to permit query optimization in the presence of encapsulation, how to support cooperative design work, and finally how to meet demanding performance needs. The first act played out over the last five years has been exciting, but the drama will continue to unfold for several more years.

ACKNOWLEDGMENT

The authors greatly appreciate the critique and comments made by anonymous referees on an earlier manuscript of this paper, as well as comments made by the members of the Zeitgeist OODB project under way at Texas Instruments. Many of the ideas presented in this paper came out of discussions with the Zeitgeist OODB project members, as well as feedback received from Zeitgeist users within Texas Instruments.


REFERENCES

[1] The Committee for Advanced DBMS Function, "Third generation database system manifesto," UC Berkeley Tech. Rep. UCB/ERL M90/28, Apr. 1990.

[2] M. Atkinson et al., "The object-oriented system manifesto," in Proc. DOOD '89, Dec. 1989.

[3] M. Stonebraker et al., "Panel: Database systems debate," presented at 1990 ACM SIGMOD Conf., Atlantic City, NJ, May 1990.

[4] C. Stone and D. Hentchel, "Database wars revisited," BYTE, vol. 10, pp. 233-242, Oct. 1990.

[5] B. Meyer, Object-Oriented Software Construction. Englewood Cliffs, NJ: Prentice-Hall, 1988.

[6] G. Booch, Object-Oriented Design with Applications. Redwood City, CA: Benjamin-Cummings, 1990.

[7] G. E. Peterson, Object-Oriented Computing. Washington, DC: Computer Soc. IEEE, 1987.

[8] W. Kim, "Object-oriented databases: Definitions and research directions," IEEE Trans. Knowledge Data Eng., vol. 2, pp. 327-341, Sept. 1990.

[9] O. J. Dahl and K. Nygaard, "SIMULA-an Algol-based simulation language," Commun. ACM, vol. 9, pp. 671-678, Sept. 1966.

[10] G. Birtwistle et al., Simula Begin. Berlin, Germany: Auerbach, 1973.

[11] A. Goldberg and D. Robson, Smalltalk-80: The Language and Its Implementation. Reading, MA: Addison-Wesley, 1983.

[12] C. Schaffert, T. Cooper, B. Bullis, M. Killian, and C. Wilpolt, "An introduction to Trellis/OWL," in OOPSLA '86 Conf. Proc., 1986.

[13] B. Stroustrup, The C++ Programming Language. Reading, MA: Addison-Wesley, 1987.

[14] B. Cox, Object-Oriented Programming: An Evolutionary Approach. Reading, MA: Addison-Wesley, 1986.

[15] S. Keene and D. Moon, Common Lisp Classes: A Draft Object-Oriented Standard. Cambridge, MA: Symbolics, Inc., 1986.

[16] D. Bobrow and M. Stefik, The LOOPS Manual. Xerox PARC, Palo Alto, CA, 1983.

[17] D. Bobrow, L. DeMichiel, R. Gabriel, S. Keene, G. Kiczales, and D. Moon, "Common Lisp object system specification," X3J13 Tech. Rep. 88-002R, June 1988.

[18] M. Ellis and B. Stroustrup, The Annotated C++ Reference Manual. Reading, MA: Addison-Wesley, 1990.

[19] M. Stefik and D. Bobrow, "Object oriented programming: Themes and variations," The AI Mag., vol. 6, 1986.

[20] S. Danforth and C. Tomlinson, "Type theories and object-oriented programming," ACM Computing Surveys, vol. 20, pp. 29-72, Mar. 1988.

[21] P. Wegner, "Conceptual evolution of object-oriented programming," Brown Univ., Providence, RI, Tech. Rep. CS-89, Dec. 1989.

[22] M. Minsky, "A framework for representing knowledge," in The Psychology of Computer Vision, P. Winston, Ed. New York, NY: McGraw-Hill, 1975.

[23] B. R. Roberts and I. P. Goldstein, "The FRL primer," Mass. Inst. Technol., Cambridge, MA, Tech. Rep. AIM 408, Nov. 1977.

[24] D. Bobrow and T. Winograd, "An overview of KRL, a knowledge representation language," Cognitive Science, vol. 1, 1977.

[25] M. Hammer and D. McLeod, "Database description with SDM: A semantic database model," ACM Trans. Database Syst., vol. 6, pp. 351-386, Sept. 1981.

[26] D. Shipman, "The functional data model and the data language DAPLEX," ACM Trans. Database Syst., vol. 6, Mar. 1981.

[27] E. van Orden, OOPSLA '86 Tutorial Notebook. New York, NY: Association for Computing Machinery, Sept. 1986.

[28] D. Stamps, "Taking an objective look," Datamation, vol. 5, pp. 45-48, May 1989.

[29] S. Khoshafian and G. Copeland, "Object identity," in OOPSLA '86 Conf. Proc., pp. 406-416.

[30] E. F. Codd, "Extending the relational database model to capture more meaning," ACM Trans. Database Syst., vol. 4, pp. 377-387, Dec. 1979.

[31] C. J. Date, "Referential integrity," in Proc. 7th Int. Conf. on Very Large Databases, Sept. 1981.

[32] J. D. Ullman, Principles of Database Systems, 2nd ed. Rockville, MD: Computer Science Press, 1982.


[33] C. J. Date, An Introduction to Database Systems, 4th ed. Reading, MA: Addison-Wesley, 1986.

[34] W. Kim and F. H. Lochovsky, Object-Oriented Concepts, Databases and Applications. New York, NY: ACM Press, 1989.

[35] J. Zdonik and D. Maier, "Fundamentals of object-oriented databases," in Readings in Object-Oriented Databases, J. Zdonik and D. Maier, Eds. Morgan-Kaufman, 1990, ch. 1, pp. 1-32.

[36] F. Bancilhon, W. Kim, and H. Korth, "A model of CAD transactions," in Proc. Int. Conf. on Very Large Data Bases, 1985, pp. 25-33.

[37] J. F. Garza and W. Kim, "Transaction management in an object-oriented database system," in Proc. ACM SIGMOD Int. Conf. on Management of Data, 1988, pp. 37-45.

[38] The Laguna Beach Participants, "Future directions in DBMS research," SIGMOD RECORD, vol. 18, pp. 17-26, Mar. 1989.

[39] D. McLeod, "1988 VLDB panel on future directions in DBMS research," SIGMOD RECORD, vol. 18, pp. 27-30, Mar. 1989.

[40] E. F. Codd, "A relational model for large shared databanks," Commun. ACM, vol. 13, pp. 377-387, June 1970.

[41] M. P. Atkinson, P. J. Bailey, K. J. Chisholm, P. W. Cockshott, and R. Morrison, "An approach to persistent programming," Comput. J., vol. 26, pp. 360-365, Dec. 1983.

[42] S. B. Zdonik and K. Smith, "Intermedia: A case study of the differences between relational and object-oriented database systems," in OOPSLA '87 Conf. Proc., 1987, pp. 452-465.

[43] D. Maier, J. Stein, A. Otis, and A. Purdy, "Development of an object-oriented DBMS," in OOPSLA '86 Conf. Proc., 1986, pp. 472-482.

[44] D. Bitton, D. J. DeWitt, and C. Turbyfil, "Benchmarking database systems: A systematic approach," in Proc. of the Ninth Int. Conf. on Very Large Data Bases, 1983, pp. 8-19.

[45] Anon et al., "A measure of transaction processing power," CMU Tech. Rep., Apr. 1985.

[46] R. Cattel, W. Rubenstein, and M. Kubicar, "Benchmarking simple database operations," in ACM SIGMOD Int. Conf. on Management of Data, May 1987, pp. 387-394.

[47] L. Anderson, A. Berre, M. Mallison, H. Porter, and B. Schneider, "The Tektronix Hypermodel benchmark," Tektronix Tech. Rep., Aug. 1989.

[48] D. Dewitt, P. Futtersack, D. Maier, and F. Velez, "A study of three alternative workstation-server architectures for object oriented database systems," Altair Tech. Rep. 42-90, Jan. 1990.

[49] R. Cattel and J. Skeen, "Engineering database benchmark," Sun Microsystems Database Eng. Group Tech. Rep., Apr. 1990.

[50] J. D. Ullman, "Database theory: past and future," Keynote speech presented at Principles of Database Syst. Conf., Mar. 1987.

[51] A. Otis, "Reference Model for Object Data Management," National Institute of Standards and Technology, May 1990. Available from E. Fong, NIST, Tech. Bldg. A266, Gaithersburg, MD 20899.

[52] M. Stonebraker and L. Rowe, "The design of Postgres," in Proc. 1986 ACM SIGMOD Int. Conf. on Management of Data, 1986, pp. 340-355.

[53] P. P. Chen, "The entity-relationship model: Toward a unified view of data," ACM Trans. on Database Syst., vol. 1, pp. 9-36, Mar. 1976.

[54] S. E. Hudson and R. King, "Cactis: A self-adaptive, concurrent implementation of an object-oriented database management system," ACM Trans. Database Syst., to be published.

[55] R. Hull and R. King, "Semantic database modeling: Survey, applications and research issues," ACM Computing Surveys, pp. 201-260, Sept. 1987.

[56] J. Banerjee, H. T. Chou, J. F. Garza, W. Kim, D. Woelk, N. Ballou, and H. J. Kim, "Data model issues for object-oriented applications," ACM Trans. Office Information Syst., vol. 5, pp. 3-26, Jan. 1987.

[57] M. Atkinson and P. Buneman, "Types and persistence in database programming languages," ACM Computing Surveys, pp. 105-190, June 1987.

[58] T. Andrews and C. Harris, "Combining language and database advances in an object-oriented development environment," in OOPSLA '87 Conf Proc., 1987, pp. 430-440.

[59] A. Purdy, B. Schuchardt, and D. Maier, "Integrating an object server with other worlds," ACM Trans. Office Info. Syst., vol. 5, pp. 27-47, Jan. 1987.

[60] D. Weinreb, N. Feinberg, D. Gerson, and C. Lamb, "An object-oriented database system to support an integrated programming environment," Cambridge, MA, Symbolics Tech. Rep., 1988.

[61] S. Ford, J. Joseph, D. Langworthy, D. Lively, G. Pathak, E. Perez, R. Peterson, D. Sparacin, S. Thatte, D. Wells, and S. Agarwal, "Zeitgeist: Database support for object-oriented programming," in Proc. Second Int. Workshop on Object-Oriented Database Syst., 1988, pp. 23-42.

[62] Ontologic Incorporated, Ontos System Documentation. Billerica, MA: Ontologic, Inc., Mar. 1990.

[63] M. P. Atkinson, K. J. Chisholm, and P. W. Cockshott, "PSAlgol: An Algol with a persistent heap," SIGPLAN Notices, vol. 17, pp. 24-31, July 1982.

[64] D. DeWitt and M. Carey, "Object and file management in the EXODUS extensible database system," in Proc. Int. Conf. Very Large Data Bases, 1986, pp. 91-100.

[65] D. Fishman, D. Beech, H. Cate, E. Chow, T. Connors, J. Davis, N. Derrett, C. Hoch, W. Kent, P. Lyngbaek, B. Mahbod, M. Neimat, T. Ryan, and M. Shan, "Iris: An object-oriented database management system," ACM Trans. Office Information Syst., vol. 5, pp. 48-69, Jan. 1987.

[66] M. Stonebraker, L. Rowe, and M. Hirohama, "The implementation of POSTGRES," IEEE Trans. Knowledge Data Eng., vol. 2, pp. 125-141, Mar. 1990.

[67] P. Wegner and L. Cardelli, "On understanding types, data abstraction, and polymorphism," Computing Surveys, vol. 17, pp. 472-522, Dec. 1985.

[68] J. E. B. Moss and A. L. Wolf, "Towards principles of inheritance and subtyping in programming languages," Univ. of Massachusetts, Amherst, MA, COINS Tech. Rep. 88-95, 1988.

[69] Object Design, An Introduction to Object-Store, Release 1.0. Burlington, MA: Object Design Inc., Mar. 1990.

[70] J. E. Moss and S. Sinofsky, "Managing persistent data with Mneme: Designing a reliable, shared object interface," in Proc. Second Int. Workshop on Object-Oriented Database Syst., 1988, pp. 298-316.

[71] J. Joseph, S. Thatte, C. Thompson, and D. Wells, "Report on the Object-Oriented Databases Workshop," SIGMOD Record, Sept. 1989.

[72] S. Thatte, "Persistent memory: Storage architecture for object-oriented databases," in Proc. Int. Workshop on Object-Oriented Database Systems, Pacific Grove, CA, Sept. 1986.

[73] R. Greenblatt, "MOBY address space," Seminar report on research in progress, Aug. 1985.

[74] S. Reiss, A. Skarra, and S. Zdonik, "An object server for an object-oriented database system," in 1986 Int. Workshop on Object-Oriented Database Syst., 1986, pp. 196-205.

[75] G. Weiderhold, "Views, objects, and databases," Comput., vol. 19, pp. 37-43, Dec. 1986.

[76] O. Deux et al., "The story of O2," IEEE Trans. Knowledge Data Eng., vol. 2, pp. 91-108, Mar. 1990.

[77] J. Smith and D. Smith, "Database abstractions: Aggregation and generalization," ACM Trans. Database Syst., vol. 2, June 1977.

[78] J. Duhl and C. Damon, "A performance comparison of object and relational databases using the Sun benchmark," in OOPSLA '88 Conf. Proc., 1988, pp. 153-163.

[79] D. J. Dewitt, S. Ghandeharizadeh, D. A. Schneider, A. Bricker, H. I. Hsiao, and R. Ramussen, "The gamma database machine project," IEEE Trans. Knowledge Data Eng., vol. 2, pp. 44-62, Mar. 1990.

[80] K. Eswaran, J. Gray, R. Lorie, and I. Traiger, "The notions of consistency and predicate locks in a database system," Commun. ACM, vol. 19, pp. 624-633, Nov. 1976.

[81] J. E. Moss, "Nested transactions: An approach to reliable distributed computing," Ph.D. dissertation, Mass. Inst. Technol., Cambridge, MA, 1981.

[82] W. Weihl and B. Liskov, "Implementation of resilient, atomic data types," ACM Trans. Programming Languages and Syst., vol. 7, pp. 244-269, Apr. 1985.

[83] S. Schwartz, "Synchronizing shared abstract types," CMU Tech. Rep., 1983.

[84] N. Griffeth, J. E. Moss, and M. Graham, "Abstraction in concurrency control and recovery management," Univ. of Massachusetts, Amherst, MA, COINS Tech. Rep. 86-20, 1986.

[85] H. Garcia-Molina and K. Salem, "SAGAS," Princeton Univ., Princeton, NJ, Tech. Rep. CS-TR-070-87, Jan. 1987.

[86] R. Katz and S. Weiss, "Design transaction management," in Proc. 19th ACM/IEEE Design Automation Conf., June 1984.

[87] S. Zdonik and A. Skarra, "The management of changing types in an object-oriented database," in OOPSLA '86 Conf. Proc., pp. 483-495.

[88] Randy H. Katz, "Towards a unified framework for version modeling," Univ. of California, Berkeley, Tech. Rep. UCB/CSD 88/484, Dec. 1988.

[89] M. Fernandez and S. Zdonik, "Transaction groups: A model for controlling cooperative transactions," in Proc. Workshop on Persistent Object Systems: Their Design, Implementation, and Use, The Univ. of Newcastle, N.S.W., Australia, 1989.

[90] J. Rothney et al., "An introduction to a system for distributed database (SDD-1)," Trans. Database Syst., vol. 5, pp. 1-17, Mar. 1980.

[91] H. Korth and G. Speegle, "Formal model of correctness without serializability," in Proc. ACM SIGMOD Int. Conf. on Management of Data, June 1988.

[92] A. Skarra, "Localized correctness specifications for cooperating transactions in an object-oriented database," Office Knowledge Engineering, vol. 4, to be published.

[93] A. Synder, "Encapsulation and inheritance in object-oriented programming languages," in Proc. Conf. on Object-Oriented Programming Systems, Languages, and Applications, 1986, pp. 38-45.

[94] G. Graefe and D. Maier, "Query optimization in object-oriented database systems: A prospectus," in Proc. Second Int. Workshop on Object-Oriented Database Syst., 1988, pp. 359-363.

[95] M. Stonebraker and A. Guttman, "Using a relational database management system for computer aided design data-an update," IEEE Database Eng., vol. 7, pp. 56-60, June 1984.

[96] Texas Instruments Incorporated, RTMS: Relational Table Management System Reference Manual. Austin, TX: Texas Instruments Data Systems Group, 1984.

[97] J. Eisen, "A software cache management system," Texas Instruments CRL-Comp. Sci. Lab., Austin, TX, Tech. Rep., 1985.

[98] J. Blakeley, P.-A. Larson, and F. Tompa, "Efficiently updating materialized views," in Proc. ACM SIGMOD Int. Conf. on Management of Data, 1986, pp. 61-71.

[99] R. Hanson, "Toward hypertext publishing: Issues and choices in database design," presented at Hypertext87 Conf., Chapel Hill, NC, 1987.

[100] W. Kent, "Panel: An overview of the versioning problem," in Proc. ACM SIGMOD Int. Conf. on Management of Data, May 1989.

[101] S. Feldman, "Make-A program for maintaining computer programs," Software-Practice and Experience, vol. 9, pp. 255-265, Apr. 1979.

[102] D. Moon, R. Stallman, and D. Weinreb, Lisp Machine Manual, 6th ed. Cambridge, MA: M.I.T. Press, 1984.

[103] M. J. Rochkind, "The source code control system," IEEE Trans. Software Eng., vol. SE-1, pp. 364-370, Dec. 1975.

[104] I. Goldstein and D. G. Bobrow, "A layered approach to software design," in Interactive Programming Environments, D. R. Barstow, H. E. Shrobe, and E. Sandwall, Eds. New York, NY: McGraw-Hill, ch. 19, 1984, p. 387.

[105] CLF Project, Introduction to the CLF Environment. Marina Del Ray, CA: USC Information Sciences Institute, 1986.

[106] J. Joseph, M. Shadowens, J. Chen, and C. Thompson, "Strawman reference model for change management," in Proc. OODBTG Workshop on Object-Oriented Database (NIST Tech. Rep. available from E. Fong, Bldg. A266, Gaithersburg, MD 20899), May 1990.

[107] R. Bhateja and R. H. Katz, "A validation subsystem of a version server for computer-aided design data," in Proc. 24th ACM/IEEE Design Automation Conf., 1987.

[108] G. S. Landis, "Design evolution and history in an object-oriented CAD/CAM database," in Proc. 31st IEEE Computer Society Int. Conf. on Applications of Computers, 1986, pp. 297-303.

[109] H. T. Chou and W. Kim, "A unifying framework for versions in a CAD environment," in Proc. Int. Conf. on Very Large Data Bases, 1986, pp. 336-344.

[110] W. Kim, J. Banerjee, H. T. Chou, J. F. Garza, and D. Woelk, "Composite object support in an object-oriented database system," in Proc. Object-Oriented Programming Systems and Languages Conf., 1987, pp. 118-125.

[111] G. L. Steele, "The definition and implementation of a computer programming language based on constraints," Mass. Inst. Technol., Cambridge, MA, Tech. Rep. AI-TR-595, 1980.


[112] Wm. Leler, Constraint Programming Languages-Their Specification and Generation. Reading, MA: Addison-Wesley, 1988.

[113] A. Borning, "ThingLab-A constraint-oriented simulation laboratory," Ph.D. dissertation, Stanford Univ., Stanford, CA, 1979.

[114] A. Borning and R. Duisberg, "Constraint-based tools for building user interfaces," ACM Trans. Graphics, vol. 5, Oct. 1986.

[115] A. Borning, R. Duisberg, B. Freeman-Benson, A. Kramer, and M. Woolf, "Constraint hierarchies," in Proc. Object-Oriented Programming Systems and Languages Conf., 1987, pp. 48-60.

[116] J. Mostow and R. Balzer, "Application of a transformational software development methodology for VLSI design," J. Syst. Software, vol. 4, pp. 51-61, 1984.

[117] J. Banerjee, W. Kim, H. Kim, and H. Korth, "Semantics and implementation of schema evolution in object-oriented databases," in Proc. 1987 ACM-SIGMOD Int. Conf. on Management of Data, 1987.

[118] P. H. Stanford, Electronic Design Interchange Format Version 2 0 0, Electronic Industries Association, 1986.

[119] Sun Microsystems Incorporated, External Data Representation (XDR). Mountain View, CA: Sun Microsystems, Inc., Jan. 1985.

[120] A. Birrel and B. Nelson, "Implementing remote procedure calls," ACM Trans. Computer Syst., vol. 2, pp. 39-59, Feb. 1983.

[121] R. Jones, "Mach and Matchmaker: Kernel and language support for object-oriented distributed systems," CMU Tech. Rep., Sept. 1986.

[122] B. Liskov, T. Bloom, D. Gifford, R. Scheifler, and W. Weihl, "Communications in the Mercury System," in Proc. Twenty-First Annual Hawaii Int. Conf. on System Science, 1988.

[123] Open Software Foundation, OSF/Motif Series. Englewood Cliffs, NJ: Prentice Hall, vols. 1-5, 1990.

[124] M. Linton, J. Vlissides, and P. Calder, "Composing user interfaces with Interviews," IEEE Comput., vol. 22, pp. 65-84, Feb. 1989.

[125] Versant Object Technology Corporation, Object Today! (quarterly newsletter). Menlo Park, CA: Versant Object Technology Corp., June 1990.

[126] Objectivity Incorporated, Objectivity/DB System Overview. Menlo Park, CA: Objectivity Inc., Mar. 1990.

[127] W. Kim, J. F. Garza, N. Ballou, and D. Woelk, "Architecture of the ORION next-generation database system," IEEE Trans. Knowledge Data Eng., vol. 2, pp. 109-124, Mar. 1990.

[128] W. Kim, N. Ballou, H. T. Chou, J. F. Garza, and D. Woelk, "Integrating an object-oriented programming system with a database system," in Proc. Object-Oriented Programming Systems and Languages Conf., 1988, pp. 142-152.

[129] K. Wilkinson, P. Lyngbaek, and W. Hasan, "The Iris architecture and implementation," IEEE Trans. Knowledge Data Eng., vol. 2, pp. 63-75, Mar. 1990.

[130] M. M. Astrahan, "System R: A relational database management system," ACM Trans. Database Syst., vol. 1, pp. 97-137, June 1976.

[131] G. Pathak, J. Joseph, and S. Ford, "Object Exchange Service for an object-oriented database system," in Proc. Fifth Int. Conf. on Data Eng., 1989, pp. 27-34.

[132] F. Bancilhon et al., "Final report on DARPA-NSF-ESPRIT workshop on US/EC collaboration in information technology: Session on OODBs," National Science Foundation, Tech. Rep., Aug. 1990. (Available from E. Fong, NIST, Tech. Bldg. A266, Gaithersburg, MD 20899.)

John V. Joseph (Member, IEEE) received the Ph.D. degree in mathematics from Purdue University and the M.S. in computer science from the University of North Carolina, Chapel Hill, in 1977 and 1983, respectively.

From 1977 to 1983 he taught at the University of North Carolina, Greensboro. He joined Texas Instruments, Dallas, TX, in 1983 as the project manager for VHSIC software tools. Since then his research has centered on design tools, object-oriented databases, change management systems, and software engineering. He is a developer of the Zeitgeist Object-Oriented Database. He has published in the areas of change management and object-oriented databases and has two patents pending in these areas.

Dr. Joseph is a member of the Association for Computing Machinery.

Satish M. Thatte (Senior Member, IEEE) received the B.E. (Hons.) degree in electronics engineering with Gold Medal for highest scholastic achievement from the Birla Institute of Technology and Science, Pilani, India, in 1975. He received the M.S. and Ph.D. degrees in electrical engineering from the University of Illinois, Urbana-Champaign, IL, in 1977 and 1979, respectively.

He joined Texas Instruments, Dallas, TX, in 1979 and played a leading role in formulating and initiating TI's VLSI Design for Testability effort in the VLSI Design Laboratory. He was the Principal Technical Investigator of the "Design Test Technology for VHSIC," VHSIC Phase III contract from 1980 to 1983. At Texas Instruments, he was elected a Senior Member of Technical Staff in 1983. From 1983 to 1985 he worked on advanced computer architectures for symbolic computing and artificial intelligence, involving research on memory management (virtual memories, cache management, garbage collection techniques), and database systems architecture. From 1986 to 1988 he was manager of the Database Systems branch in TI's Artificial Intelligence Laboratory. He is Director of the Information Technologies Laboratory in the Computer Science Center, where he leads research on object-oriented database systems, hypermedia systems, and advanced information delivery technologies. He has published twenty-seven technical papers, and holds eight U.S. and one European patents.

Dr. Thatte is a member of the Association for Computing Machinery.

Craig W. Thompson (Senior Member, IEEE) received the B.A. degree in mathematics from Stanford University in 1971 and the M.A. and Ph.D. degrees in computer science from The University of Texas at Austin in 1977 and 1984, respectively.

From 1977 to 1981 he taught at the University of Tennessee, Knoxville. He joined Texas Instruments in 1981. He is currently manager of the Zeitgeist Open OODB project in the Information Technologies Laboratory, Computer Science Center, Texas Instruments, Dallas, TX. His research has centered on engineering databases, object-oriented databases, hypermedia systems, and user interfaces. He has published twenty-five technical papers and holds two U.S. patents with three patents pending.

Dr. Thompson is an active member of X3/SPARC/DBSSG/OODB Task Group, Object Management Group, and the Association for Computing Machinery.

David L. Wells (Member, IEEE) received the B.S. degree in applied mathematics and physics, the M.S. in computer science, and the Doctor of engineering degree in computer science in 1975, 1976, and 1980, respectively, from the University of Wisconsin, Milwaukee.

From 1980 to 1986 he was an Assistant Professor of Computer Science at Southern Methodist University in Dallas, TX, performing research in computer security, computer graphics, and database systems. Since 1986, he has been a Member of Technical Staff in the Information Technologies Laboratory, Computer Science Center, at Texas Instruments, Dallas, TX, where he is a developer of the Zeitgeist object-oriented database.