
Computer Standards & Interfaces 13 (1991) 287-292, North-Holland

Object-oriented data modeling in rule-based software development environments

Naser S. Barghouti a and Michael H. Sokolsky b

a Columbia University, Computer Science Department, 450 Computer Science Building, New York, NY 10027, USA
b Interactive Development Environments, San Francisco, CA, USA

Abstract

Barghouti, N.S., and M.H. Sokolsky, Object-oriented data modeling in rule-based software development environments, Computer Standards & Interfaces 13 (1991) 287-292

Object-oriented databases (OODBs) were developed to support advanced applications for which traditional databases are not sufficient. The data management requirements of these new applications are significantly different from those of more traditional data processing applications. More light needs to be shed on these requirements in order to identify the aspects of OODBs that can lead to standards. We have studied the data management requirements of one class of advanced database applications: rule-based software development environments (RBDEs). RBDEs store project components in an object database and control access to these objects through a rule-based process model, which defines each development activity as a rule. The components are abstracted as instances of classes, which are defined by the project's data model. In this paper we discuss the constructs that a data modeling language for RBDEs should provide, and then explore some of the data management requirements of RBDEs. We use the Marvel system we developed at Columbia as an example.

Keywords: Object management systems; consistency management; concurrency control; composite objects; long transactions; cooperative transactions; software development environments; rule-based development environments.

1. Introduction

The motivation for developing object-oriented databases (OODBs) has been to support advanced applications for which traditional databases are not sufficient. The data management requirements of these new applications are significantly different from those of more traditional applications. The primary thesis of this short paper is that more light needs to be shed on these requirements in order to identify the aspects of OODBs that can lead to standards. The goal is to make sure that any candidate for standards meets these requirements.

Barghouti is supported in part by the Center for Telecommunications Research. This research was conducted while Sokolsky was at Columbia University, where he was supported by the Center for Advanced Technology.

We have studied the data management requirements of one class of advanced database applications: rule-based software development environments (RBDEs). We have been involved with the design and development of the Marvel RBDE at Columbia University for the past couple of years. Marvel, like other software development environments (SDEs), generates and manipulates large amounts of data in the form of source code, object code, documentation, test suites, etc. Following the recent trend in the software development environments community, we have attempted to utilize database technology to uniformly manage all the objects belonging to a project. The aim was to benefit from the advantages of DBMSs such as data integration, application orientation, data integrity, convenient access, and data independence. We thus built Marvel on top of an object management system (OMS) that stores all project components as objects in an object database. The OMS provides access to the objects and maintains the database consistency. We built our own OMS because none of the existing OODBs at the time satisfied our needs.

0920-5489/91/$03.50 © 1991 - Elsevier Science Publishers B.V. All rights reserved

Through our experience with Marvel, we have characterized key requirements that an OODB must meet in order to support software development, as well as some requirements that are germane to rule-based systems such as Marvel. In this paper we briefly describe the Marvel system and enumerate its data management requirements. We believe that other advanced database applications, such as CAD/CAM systems and office automation systems, have similar requirements. These requirements should be taken into consideration when OODB standards are formulated.

2. The Marvel system

Marvel is a rule-based development environment kernel that stores software artifacts in a project database, and defines each software development activity that manipulates these artifacts as a rule. Forward and backward chaining on the rules is applied to automate some of the chores that developers would have otherwise done manually, to ensure consistency in the project database, and to monitor and enforce a particular model of development. The long-term goal of the Marvel project is to develop a kernel for generating multi-user development environments that use knowledge about the software development process of large-scale projects to support the needs of multiple developers cooperating on these projects.

The kernel is a controlled automation engine that uses a rule-based process model specification and an object-oriented data model specification. These specifications are written in a language called the Marvel Strategy Language (MSL). MSL specifications are divided into organizational modules called strategies. We envision that libraries of MSL strategies will be built, maintained and shared by project administrators. The Marvel kernel has facilities to load and merge strategies to produce a target Marvel environment that understands the data model and process model of the project.

The data model is specified in terms of classes, each of which consists of a set of typed attributes that can be inherited from multiple superclasses. Attribute types include simple types, files, sets and directed links. Set attributes contain instances of other classes as their values, thus implementing composite objects, and giving the Marvel object management system (OMS) a hierarchical traversal capability. Links can be generic, or point to specific attribute types or classes, thus giving the Marvel OMS arbitrary graph traversal capability. Existing software systems can be immigrated into Marvel using the Marvelizer tool [3]. The Marvel OMS supports creation and deletion of objects according to the data model.
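The data model described above can be pictured with a small Python sketch. This is not MSL and not the Marvel OMS; the class names (`Entity`, `Versioned`, `Compilable`, `CFile`, `Module`, `Project`), the attribute names, and the two traversal helpers are all hypothetical, chosen only to show typed attributes, multiple inheritance, set attributes for composite objects, and directed links.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entity:
    """Root class: every object in this toy objectbase has a name."""
    name: str


@dataclass
class Versioned(Entity):
    version: int = 1                     # simple-typed attribute


@dataclass
class Compilable(Entity):
    source_path: str = ""                # file-typed attribute (stored as a path)
    compiled: bool = False


@dataclass
class CFile(Versioned, Compilable):      # attributes inherited from two superclasses
    includes: List["CFile"] = field(default_factory=list)   # directed links


@dataclass
class Module(Entity):
    parts: List[CFile] = field(default_factory=list)        # set attribute


@dataclass
class Project(Entity):
    parts: List[Module] = field(default_factory=list)       # set attribute


def contained(obj):
    """Hierarchical traversal: the object, then everything nested in its sets."""
    yield obj
    for child in getattr(obj, "parts", []):
        yield from contained(child)


def reachable(start):
    """Graph traversal over links; visited names are tracked so cycles terminate."""
    seen, order, stack = set(), [], [start]
    while stack:
        node = stack.pop()
        if node.name in seen:
            continue
        seen.add(node.name)
        order.append(node.name)
        stack.extend(node.includes)
    return order


a, b = CFile("a.c"), CFile("b.c")
a.includes.append(b)
b.includes.append(a)                     # links may form a cycle
project = Project("demo", parts=[Module("core", parts=[a, b])])
names = [obj.name for obj in contained(project)]
```

In this sketch the set attributes give a tree that can be walked top-down, while the links form an arbitrary graph that needs cycle detection, which mirrors the distinction the text draws between hierarchical and graph traversal.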

The process model defines rules that specify the behavior of the tailored Marvel environment in terms of what commands are available and what kind of automation is provided. Marvel supports a model of automation called opportunistic processing, which employs backward and forward chaining to automatically initiate activities. The set of rules that are loaded into a Marvel environment form a network of possible forward and backward chains. Marvel rules are more complicated than their expert systems ancestors; each rule contains a condition that must be true for the rule to fire, an activity, which is a general mechanism to execute arbitrary, existing tools, and multiple effects that assert the results of the tool into the Marvel objectbase.
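As a rough illustration of the rule structure just described, the following Python sketch models a rule as a condition, an activity, and several effects, and shows forward chaining only. It is a simplification under stated assumptions: the `Rule` class, the `forward_chain` function, the idea of keying effects by the tool's outcome string, and the flat dictionary standing in for the objectbase are hypothetical and are not taken from Marvel.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, bool]                  # stand-in for attribute values in the objectbase


@dataclass
class Rule:
    name: str
    condition: Callable[[State], bool]   # must hold for the rule to fire
    activity: Callable[[State], str]     # wraps an existing tool, returns its outcome
    effects: Dict[str, State]            # one effect per possible outcome (assumption)


def forward_chain(state: State, rules: List[Rule], first: Rule) -> List[str]:
    """Fire `first`, then any rule whose condition became true as a result."""
    fired, agenda = [], [first]
    while agenda:
        rule = agenda.pop(0)
        if rule.name in fired or not rule.condition(state):
            continue
        satisfied_before = {r.name for r in rules if r.condition(state)}
        outcome = rule.activity(state)
        state.update(rule.effects[outcome])          # assert the tool's result
        fired.append(rule.name)
        agenda += [r for r in rules
                   if r.condition(state) and r.name not in satisfied_before]
    return fired


compile_rule = Rule(
    name="compile",
    condition=lambda s: s["edited"],
    activity=lambda s: "ok",                         # pretend the compiler succeeded
    effects={"ok": {"compiled": True}, "error": {"compiled": False}},
)
test_rule = Rule(
    name="test",
    condition=lambda s: s["compiled"],
    activity=lambda s: "ok",
    effects={"ok": {"tested": True}, "error": {"tested": False}},
)

state = {"edited": True, "compiled": False, "tested": False}
fired = forward_chain(state, [compile_rule, test_rule], compile_rule)
```

Here the effect of a successful compile satisfies the condition of the test rule, so the test activity is initiated without an explicit request. Backward chaining, which would try to satisfy a failed condition by firing other rules first, is omitted from the sketch.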

When multiple developers cooperate on a project within Marvel, they share a common database that contains all the objects of the project. These developers start concurrent sessions in order to complete their specific tasks. During their sessions, the developers concurrently request operations that access objects in the shared project database. These concurrent operations might violate the consistency of the objects they access if they concurrently change either the same attribute of the same object or dependent attributes in conflicting ways.

Since most operations correspond to rules and since chaining might lead to firing other rules that perform more conflicting operations on the database, more inconsistencies might be introduced in the database. More generally, the overall behavior of cooperating developers in Marvel can be modeled as multiple sets of rules, where multiple rules from each set are fired concurrently to perform operations on the shared project database. This situation is depicted in Fig. 1.

In the rest of this paper, we investigate the data management requirements of Marvel-like software development environments and what facilities are needed to make OODBs meet these requirements. We feel that making these facilities standard features of OODBs would enable OODBs to support the data management needs of advanced applications such as software development environments.

Fig. 1. The multi-agent problem in Marvel. (Figure not reproduced; legible labels: Task 1, Task 2, Task 3; Rule (subtask); Data model (object classes); Multi-Agent RBDE; Conflict detection; Conflict resolution.)

3. Data modeling requirements

Marvel, like other SDEs, must understand how a project's data is organized in order to provide assistance. In the software development domain, different projects might impose different organization on the data, which in this case consists of the project's components. Thus, Marvel needs to acquire knowledge about the structure of data in the project under development as well as how this data is accessed. Although OODBs typically support an object-oriented data model, most of them do not fully exploit the features of object-oriented programming. Specifically, they only utilize the structural aspects of object-oriented modeling since they support only aggregation hierarchies (i.e. objects that have the same attributes are grouped together in one class), generalization and specialization hierarchies (i.e. inheritance), and the unique identification of objects.

Existing OODBs do not support the modeling of the behavioral aspects of objects such as the operations that can be performed on each object (called methods in object-oriented languages). Instead, the behavioral aspects of the system are modeled in the application, which treats objects as passive pieces of data. Thus, rather than providing a uniform and integrated model of data and the operations that can be performed on it, existing OODBs separate the two issues, forgoing the expressive power that can be gained by extending object-oriented features such as multiple inheritance, overriding, and late binding of objects to the development activities.

The project database in Marvel is defined by a structurally object-oriented data model, in the sense of an object data model that defines the organization and structure of the object hierarchy that constitutes the object-oriented database. Our primary innovation in Marvel is to treat the rules themselves as the 'methods' of these objects, and employ behavioral concepts from object-oriented programming to integrate the rule system with the shared project database.

In particular, we define the shared project database using an object-oriented data model and further define the methods of these objects via rules, where the rules specify the objects manipulated by each development activity, the condition that must be satisfied to initiate the activity, the tools or other facilities that are employed in carrying out the activity, and the effects of completing the activity with respect to the status of the software project. Developers are assisted by Marvel's forward and backward chaining on the rules, to automate certain activities, ensure consistency in the project database, and/or monitor developer actions to determine conformance to the designated process and detect divergence from this process.

These behavioral aspects of the data model are typically defined by methods in standard object-oriented languages. Methods prescribe the operations that can be performed on instances of each class. Each method applies only to instances of the class in which it was defined, and is invoked by sending a message to an object requesting its invocation. Multi-methods are methods that apply to instances of more than one class. Thus, while attributes describe the structure of objects, methods describe how they behave when they are sent messages from the human developer or other objects. Methods, like attributes, can be inherited from a class by its subclasses.

Marvel unifies the behavioral aspects of data modeling with the process model by defining the methods that operate on objects via rules. Each rule has a set of typed formal parameters that are instantiated with objects of the same type (i.e. class) when the rule is invoked. The rule's condition is a logical clause that is essentially a read-only query on the values of the attributes of the objects that are passed as parameters, while the effects change the values of the objects' attributes. Thus, each rule is equivalent to a multi-method that applies to the classes that are the types of the formal parameters, and, like methods, rules can be inherited. Thus, if a rule has a formal parameter of a certain class C, the objects that are passed as actual parameters can be instances of either C or any of its subclasses.
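The correspondence between rules and multi-methods can be sketched in Python as follows. The `TypedRule` class, the example classes `File`, `CFile` and `Module`, and the `archive` rule are hypothetical; the sketch only shows typed formal parameters, a read-only condition, an effect that changes attributes, and acceptance of subclass instances as actual parameters.

```python
class File:
    def __init__(self, name):
        self.name = name
        self.compiled = False


class CFile(File):                       # subclass: inherits the rules that accept File
    pass


class Module:
    def __init__(self, name):
        self.name = name
        self.archived = []


class TypedRule:
    """A rule whose formal parameters are typed by class, like a multi-method."""

    def __init__(self, name, param_types, condition, effect):
        self.name = name
        self.param_types = param_types   # one class per formal parameter
        self.condition = condition       # read-only query on the parameters
        self.effect = effect             # changes attribute values

    def applies_to(self, *args):
        return (len(args) == len(self.param_types)
                and all(isinstance(a, t) for a, t in zip(args, self.param_types)))

    def invoke(self, *args):
        if not self.applies_to(*args):
            raise TypeError(f"rule {self.name} does not apply to these objects")
        if not self.condition(*args):
            return False                 # condition not satisfied, nothing changes
        self.effect(*args)
        return True


archive = TypedRule(
    name="archive",
    param_types=(Module, File),
    condition=lambda module, f: f.compiled,
    effect=lambda module, f: module.archived.append(f.name),
)

core = Module("core")
parser = CFile("parser.c")               # an instance of a subclass of File
first_try = archive.invoke(core, parser) # condition fails: not yet compiled
parser.compiled = True
second_try = archive.invoke(core, parser)
```

Because `applies_to` uses `isinstance`, a rule declared on `File` accepts a `CFile`, which is the inheritance behavior described for a formal parameter of class C.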

Based on our experience in Marvel, we feel that OODBs must provide for both structural and behavioral aspects of data modeling in order to realistically meet the data management needs of applications similar to SDEs.

4. Consistency maintenance requirements

The consistency problem has been addressed in traditional database management domains such as banking. In these domains, there is a lack of knowledge about the application-specific semantics of database operations, and a need to design general mechanisms that cut across many potential applications. Thus, the best a database management system (DBMS) can do is to abstract all operations on a database to read and write operations. All computations are then programmed into transactions that consist of a sequence of read and write operations. Each transaction, if executed atomically (i.e. either all of its operations are performed in order or none are), transforms the database from one consistent state to another. When multiple transactions run concurrently, the DBMS can guarantee that the database is transformed to a consistent state with respect to reads and writes by allowing only serializable executions of the concurrent transactions [1].

Several existing OODBs use serializability to synchronize concurrent transactions by isolating them, preventing the sharing of data and/or knowledge with other concurrent transactions. Isolation guarantees serializability, and thus strict maintenance of consistency, since it makes each agent's work appear as an atomic transaction. Unfortunately, isolation between concurrent transactions in the software development domain unnecessarily obstructs cooperation [2].

Our conjecture is that an OODB can use knowledge about the structural organization of the project's data, the meaning of data consistency for this project, and the semantics of operations performed by agents on the database in order to provide an extended transaction model. These pieces of information are different for different projects or different phases of the project. For example, the consistency specification of a small project with a couple of developers who are familiar with all the components of the project might permit those developers to access the same object at the same time. The consistency specification of a large project with hundreds of developers, however, might require strict isolation between different groups of developers, and might not allow interaction except through a strictly-defined interface.
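The small-project and large-project examples can be made concrete with a minimal Python sketch in which the consistency specification is a pluggable policy consulted on every access request. Everything here is an assumption made for illustration: the `Developer` and `ObjectBase` classes, the two policy functions, and the choice to express a specification as a predicate over pairs of developers.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Developer:
    name: str
    group: str


# A consistency specification, reduced here to one question:
# may these two developers hold the same object at the same time?
Policy = Callable[[Developer, Developer], bool]


def small_project_policy(a: Developer, b: Developer) -> bool:
    return True                          # everyone may share any object


def large_project_policy(a: Developer, b: Developer) -> bool:
    return a.group == b.group            # strict isolation between groups


class ObjectBase:
    def __init__(self, policy: Policy):
        self.policy = policy
        self.holders: Dict[str, List[Developer]] = {}

    def request(self, dev: Developer, obj_name: str) -> bool:
        """Grant access only if the policy allows sharing with all current holders."""
        current = self.holders.setdefault(obj_name, [])
        if all(self.policy(dev, other) for other in current):
            current.append(dev)
            return True
        return False


ann = Developer("ann", group="ui")
bob = Developer("bob", group="kernel")
carl = Developer("carl", group="ui")

small = ObjectBase(small_project_policy)
small_grants = [small.request(d, "parser.c") for d in (ann, bob)]

large = ObjectBase(large_project_policy)
large_grants = [large.request(d, "parser.c") for d in (ann, bob, carl)]
```

The same object base code serves both projects; only the policy changes, which is the kind of project-specific knowledge the conjecture refers to.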

Given the flexible consistency maintenance requirements of SDEs, we feel that OODBs should provide an extended transaction mechanism that supports the following:


1. Long transactions: Long-lived operations on objects in design environments (such as compiling and printing) imply that the transactions, in which these operations may be embedded, are also long-lived. Long transactions need different support than short transactions. In particular, blocking a transaction until another commits is rarely acceptable for long transactions. It is worthwhile noting that the problem of long transactions has also been addressed in traditional data processing applications (bank audit transactions, for example).

2. Composite objects: The complexity of the structure and the size of objects strongly suggest the appropriateness of concurrency control mechanisms that combine multiversion and multiple granularity mechanisms. For example, objects in a software project might be organized in a nested object system (projects consisting of modules that contain procedures), where individual objects are accessed hierarchically.

3. User control: In order to support database operations that are nondeterministic and interactive in nature, the concurrency control mechanism should provide the user with the ability to start a transaction, interactively execute operations within it, dynamically restructure it, and commit or abort it at any time. The nondeterministic nature of transactions implies that the concurrency control mechanism will not be able to determine whether or not the execution of a transaction will violate database consistency, except by actually executing it and validating its results against the changed database. This might lead to situations in which the user might have invested many hours running a transaction, only to find out later when he wants to commit it that some of the operations he performed within the transaction have violated some consistency constraints; he would definitely oppose the deletion of all his work (by rolling back the transaction) in order to prevent the violation of consistency. He might, however, be able to reverse the effects of some operations in order to regain consistency. Thus, what is needed is the provision of more user control over transactions.

4. Synergistic cooperation: Cooperation among programmers to develop versions of project components has significant implications on concurrency control. In CAD/CAM systems and SDEs, several users share knowledge collectively and through this knowledge, they are able to continue their work. Furthermore, the work of two or more users working on shared objects may not be serializable. They may pass the shared objects back and forth in a way that is not equivalent to doing it serially. Also, two users might be modifying two components of the same complex object concurrently with the intent of integrating these components to create a new version of the complex object, and thus they might need to look at each other's work to make sure that they are not modifying the two components in ways that would make integration difficult. To insist on serializable concurrency control in design environments might thus decrease concurrency or actually disallow desirable forms of cooperation among developers.
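The user-control requirement (item 3 above) can be sketched in Python as a long transaction that validates at commit time and lets the user reverse individual operations instead of rolling back everything. The `LongTransaction` class, its method names, and the representation of the database as a dictionary with named constraint predicates are hypothetical and are not Marvel's transaction mechanism.

```python
class LongTransaction:
    """Operations are logged, replayed over the current database at commit,
    and can be reversed one at a time by the user."""

    def __init__(self, db, constraints):
        self.db = db                     # shared database; others may change it meanwhile
        self.constraints = constraints   # list of (name, predicate over a state dict)
        self.ops = []                    # ordered (label, key, new_value)

    def do(self, label, key, value):
        self.ops.append((label, key, value))

    def undo(self, label):
        """Reverse only the named operation, keeping the rest of the work."""
        self.ops = [op for op in self.ops if op[0] != label]

    def preview(self):
        state = dict(self.db)            # start from the database as it is now
        for _, key, value in self.ops:
            state[key] = value
        return state

    def violations(self):
        state = self.preview()
        return [name for name, holds in self.constraints if not holds(state)]

    def commit(self):
        """Install the changes if consistent; otherwise report what is violated."""
        bad = self.violations()
        if not bad:
            self.db.update(self.preview())
            self.ops.clear()
        return bad


db = {"parser.c": "v1", "parser.h": "v1"}
constraints = [
    ("header matches source", lambda s: s["parser.c"] == s["parser.h"]),
]

txn = LongTransaction(db, constraints)
txn.do("edit-source", "parser.c", "v2")
txn.do("edit-docs", "README", "v2")

first_commit = txn.commit()              # rejected: source and header disagree
txn.undo("edit-source")                  # the user gives up one operation only
second_commit = txn.commit()             # accepted: the documentation work survives
```

The design choice is that a failed validation returns the violated constraints to the user rather than aborting, so the user decides which operations to reverse. A fuller treatment would also cover multiple-granularity locking for composite objects and controlled sharing between cooperating transactions, which this sketch leaves out.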

5. Conclusions

Object-oriented databases were introduced because of the shortcomings of traditional databases in supporting advanced applications. Software development environments are an example of advanced applications that benefit from using an underlying OODB. We presented Marvel, a rule-based SDE that uses an underlying OODB to store the software artifacts of projects under development. We presented some of the data management and data consistency requirements of SDEs that we have characterized based on our experience with Marvel. We believe that these requirements are broad enough to be applicable to many advanced database applications, and thus they have to be considered in any effort leading to the formal standardization of OODBs.

References

[1] P.A. Bernstein, V. Hadzilacos and N. Goodman, Concurrency Control and Recovery in Database Systems (Addison-Wesley, Reading, MA, 1987).

[2] I. Greif, ed., Computer-Supported Cooperative Work: A Book of Readings (Morgan Kaufmann, San Mateo, CA, 1988).

[3] M.H. Sokolsky, Data migration in an object-oriented software development environment, Master's thesis, Columbia University Department of Computer Science, April 1989, Technical Report CUCS-424-89.


Naser S. Barghouti is a PhD candidate in the computer science department at Columbia University, where he was an IBM Fellow in 1986-87. He expects to receive his degree in 1991. His research interests include software development environments, rule-based systems, and advanced database applications. Barghouti's thesis addresses the problems of concurrency and cooperation in multi-user rule-based development environments. He received the BS (honours with distinction) and MS degrees, both in computer science, from Columbia University in 1985 and 1987 respectively. He is a member of the ACM, the Computer Society, and Tau Beta Pi.

Michael H. Sokolsky is currently a senior software engineer and project leader at Interactive Development Environments, SF, CA. Previously, he was at Hewlett Packard's Design Technology Laboratory, and then at Columbia University as a Research Associate in the Computer Science Department. His research interests include tool integration, software engineering, and software development environments. Sokolsky received the BS degree in computer science from the University of California at Santa Barbara in 1985, and the MS degree also in computer science from Columbia University, NY in 1989. He is a member of the ACM, and the IEEE Computer Society.