Napier University Edinburgh Coursework Object-Oriented and Databases CO42009 Subject: Object-Oriented Query Languages

Napier University, Edinburgh

Coursework

Object-Oriented and Databases

CO42009

Subject: Object-Oriented Query Languages

May 2002


By: Omar Alasam, Julien Aubry, Julien Fontaine, Bénédicte Gruss


Table of contents

INTRODUCTION
1 Background
2. Specification and requirements of an OOQL
    2.1 General requirements and characteristics of an OQL
    2.2 Desirable properties of an OQL and examples
        2.2.1 Simple access to objects
        2.2.2 Access to complex objects
        2.2.3 Join request
        2.2.4 Identical treatment of attributes and methods
        2.2.5 Sets of objects
        2.2.6 Consequences of class hierarchy
        2.2.7 Database modifications
    2.3 Differences between OOQLs and SQL
        2.3.1 Object Ids
        2.3.2 Members
        2.3.3 Methods
        2.3.4 Derived Data
        2.3.5 Class Hierarchy
        2.3.6 Joins
        2.3.7 Recursion
        2.3.8 Extendibility
        2.3.9 Comparing semantics
3 Current Object-Oriented Query Languages
    3.1 LIFOO
    3.2 O2QUERY
    3.3 OQL
    3.4 SQL 3
    3.5 GemStone
    3.6 Iris
4. Query optimisation and future work
    4.1 Optimisation Issues
        4.1.1 Additional Data Types
        4.1.2 Complex Objects
        4.1.3 Methods and Encapsulation
        4.1.4 Identity and Equivalence
    4.2 Algebras and Calculi Optimisation
        4.2.1 Algebraic Optimisation
        4.2.3 Extensible Optimisers
    4.3 Query Optimisation Methodology
    4.4 OOQL Future Directions
        4.4.1 Research in OOQL
    4.6 Future scope
CONCLUSION
References
    Books
    Web sites


INTRODUCTION

During the past three decades, database technology for information systems has gone through several generations. The transition from one generation to the next has always been driven by the increasing complexity of database applications and by the cost of implementing, maintaining and extending these applications.

The latest generation of databases introduced the notion of objects, derived from object-oriented programming languages. These new databases were designed to meet the requirements of new applications in which large and complex data structures, multiple data versions, and nested and interrelated data are used. Such applications are found in domains such as:
- Manufacturing and real time design
- Office automation
- Scientific and medical information
- Computer aided design.

Object-oriented databases appeared in the mid-eighties, providing the ability to store objects whose state, behaviour and relationships are defined in accordance with an object data model. Object-Oriented Database Management Systems (OODBMS) are thus one of the most promising technologies for the next generation of Database Management Systems (DBMS), although they still lack a common data model and formal foundations comparable to those of traditional (relational) databases.

ODMG, the Object Database Management Group, a grouping of object-oriented database vendors, addresses the standardisation problem through its published specifications. Although most areas related to OODBs are now nearly stabilized (data model, Object Definition Language, programming language bindings), the query language is still not clearly defined. OQL (Object Query Language) has been proposed but has not yet been accepted as a standard.

This paper investigates the query languages suited to an object-oriented system. Part 1 introduces the object-oriented database world, so that the requirements for a query language can be defined in part 2. Part 3 gives an overview of the languages currently used by database vendors, and the last part deals with query optimisation and the future directions for an OOQL.


1 Background

Of all the database systems currently available, object-oriented ones are a promising way of meeting the demands of the most advanced applications, in situations where traditional relational systems are inadequate. Object-Oriented Database Management Systems (OODBMS) have become popular as a result of the increased popularity of object-oriented languages, and because relational databases are poorly suited to these new applications. Such applications need complex data structures (an extensible type system in which new types can be defined), complex data access for navigation and querying, and high performance for real-time answers to queries.

There is not yet a real standard for object-oriented databases, but the Object Database Management Group (ODMG), a consortium of database vendors, addresses the problem and has published specifications. The proposed standard mainly involves an Object Model, an Object Definition Language and an Object Query Language.

Even if this standard is not totally accepted, the data model should at least support the following object-oriented concepts:

Objects and Identity, where each real world entity is modelled as an object and each object is associated with a unique identifier.

Complex Objects, meaning that a set of attributes is associated with each object. The value of an attribute can be an object or a set of objects.

Encapsulation, which provides data independence and abstraction. Each object contains and defines both the procedures (methods) and the interface with which it can be accessed and manipulated by other objects. The interface is a set of operations that can be invoked on the object. The state of the object is manipulated by methods invoked by the corresponding operations.

Classes are used to group objects which share the same set of attributes and methods. Each object is an instance of some class.

Inheritance, which allows a class to be defined as a specialisation of one or more other classes, and to inherit the attributes and methods of those classes. A sub-class therefore inherits from its super-classes.

Overloading, overriding and late binding, which allow different methods to be associated with a single operation name, leaving the system to determine which method should be used in order to execute a given program.
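The concepts listed above can be illustrated with a minimal Python sketch. It is not ODMG syntax: the classes Person and Employee echo the example schema used later in this report, while the `oid` counter and the `describe` method are assumptions introduced purely for illustration.

```python
import itertools

# Hypothetical OID generator, for illustration only.
_next_oid = itertools.count(1)


class Person:
    """A class groups objects that share the same attributes and methods."""

    def __init__(self, name, age):
        self.oid = next(_next_oid)  # identity: unique and independent of the state
        self._name = name           # state, reached only through the interface
        self._age = age

    # Interface: the operations that other objects may invoke.
    def name(self):
        return self._name

    def describe(self):
        return f"{self._name}, aged {self._age}"


class Employee(Person):
    """Inheritance: Employee is a subclass of Person and inherits name()."""

    def __init__(self, name, age, salary):
        super().__init__(name, age)
        self._salary = salary

    def describe(self):  # overriding: same operation name, different method
        return f"{super().describe()}, earning {self._salary}"


people = [Person("Anna", 61), Employee("Peter Smith", 45, 30000)]

# Late binding: the method that runs is chosen at run time from each object's class.
descriptions = [p.describe() for p in people]
```

Two objects with identical state would still receive different `oid` values, which is the point of object identity.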

The standard also includes an Object Definition Language (ODL), likewise specified by ODMG. ODL is not intended to be a full programming language; it is a definition language for object specifications. Database management systems traditionally provide facilities that support data definition, using a Data Definition Language (DDL). The DDL allows users to define their data types and interfaces, while the Data Manipulation Language (DML) allows instances of those data types to be created, deleted, read and updated.

ODL is a DDL for object types. It defines the characteristics of types, including their properties and operations. ODL defines only the signatures of operations and does not address the definition of the methods that implement those operations. ODL is intended to define object types that can be implemented in a variety of programming languages, and it is therefore not tied to the syntax of a particular programming language.

The last component is the object-oriented query language. OQL (Object Query Language) is proposed by ODMG but is not yet a standard. In the Object-Oriented Database System Manifesto [Atk89], some of the early proponents of Object-Oriented Database Management Systems stated that such systems should provide an ad hoc query facility, which need not take the form of a query language. This perception has changed since the manifesto was written, and it is now recognized that a good query language is one of the cornerstones of an OODBMS, alongside an object-oriented model with unique object identifiers. This is the result of two conclusions reached by the OODBMS community:

Alternatives to a query language are throwbacks to queries of the first generation DBMS, which were representation specific queries, like the ones used in CODASYL.

Users really demand an SQL-like declarative query language, because relational databases and SQL are the most widely used.

Early attempts to develop query languages for OODBMS centred on making particular object-oriented programming languages persistent. The O++ database programming language (an extension of C++ that supports persistent objects) is an example. These languages supported user queries through the programming language itself. Queries were therefore completely non-declarative, specific to the internal representation of objects, and usually involved complex pointer traversal through object members. Many thought that this was a sufficient querying capability for an OODBMS.

Later on, when the first generation of OODBMS had stabilized, the OODBMS community realized that better querying capabilities were needed.


2. Specification and requirements of an OOQL

The aim of a query language is to allow the user to search a database and retrieve any record it contains. In relational databases this is achieved with the standard query language SQL (Structured Query Language). With the move from relational databases to Object-Oriented Databases (OODBs), SQL can no longer cope, and a new query language able to search among objects has to be defined. This section discusses the requirements and properties of an Object-Oriented Query Language (OOQL).

2.1 General requirements and characteristics

The Object-Oriented Database System Manifesto [Atk89] sets out three criteria that a query language should satisfy:

It should be high level. A non-expert user should be able to express a request in a few words or mouse clicks.

It should be efficient. The language should allow complicated requests to be expressed while still lending itself to some form of query optimisation.

It should be application independent. The language should not be specific to a particular database, but should be a standard able to cope with any possible database. This is the principle of universality or genericity.

These were the requirements of the first manifesto. Research on OODBs has progressed since then, and further requirements have been added:

Descriptiveness: As a high-level language, the OOQL should not depend on the structure of the database. It should provide a collection of descriptive operators rather than navigational ones that depend on the type and structure of the data. It should emphasize the “what” and not the “how”.

Closure: It is not enough to display the result of a query as some kind of table; every result should itself be described in the object data model. This matters for many database features such as composite queries, query optimisation and views. The result of a query should be expressed in such a way that it can be used as input for other queries.

Completeness: The language should be computationally complete (with constructs such as loops and conditionals) to widen the range of requests that can be expressed.

Expressive power: The language should fill the gaps left by relational languages, such as the recursive traversal of objects.


Extensibility: The language should support both language-defined types and user-defined types.

Orthogonality: The query language must be orthogonal to the structure of the data and to persistence. Every object constructor has equal importance, and transient and persistent data should be queried in the same way.

The aim of an object query language is to give access to all the objects stored within a database, as SQL does for the records of a relational database. In addition, OOQLs have some further characteristics:

Access to objects: In an OODB there are two ways to access objects. Each object is uniquely identified by an OID (Object Identifier). The first way to access data is therefore navigational and uses the aggregation properties of objects: starting from a given OID, the request is executed by searching through the objects referred to by the attributes. The second way is to access a set of objects through declarative queries. A combination of both methods can be useful: a declarative query works out a set of objects, and the navigational method is then used to access the objects.

Aggregation hierarchy. The OQL must be able to query nested objects, which means moving between objects via the relationships that connect them. Otherwise, the result of a query would be limited to objects that do not refer to anything.

Inheritance hierarchy. Since OODBs support inheritance, a request must specify whether or not the query includes the objects of the subclasses of a class. If it does, the language must provide a way to query the subclasses.

Join. In object-oriented query languages, two types of join exist. The first is the explicit join: as in the relational model, objects are compared using variables or OIDs. The second is the implicit join, which is made by comparing object paths. A good OQL should allow both types of join query to be performed.

Recursive queries. A recursive query is a query run on an object and then run again on the resulting objects until the final result is found. An OOQL must support recursive queries because objects are often structured recursively.

Object equality. Different types of equality exist in an OODB: equality of OIDs, shallow equality and deep equality. The query language should be able to distinguish between them.


Method invocation. Queries that use object methods or class methods should be possible, but they should not change the database (the problem of side effects).

Result behaviour. The result behaviour defines what a request is allowed to do to the database. Basically, a query should not modify the database, except when the request operates on a previous result. Nevertheless, some changes to the data could be allowed; for instance, insertions might be accepted but not deletions or modifications.
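The three kinds of object equality mentioned among these characteristics can be made concrete with a small Python sketch. The helper names `identical`, `shallow_equal` and `deep_equal` and the sample data are hypothetical, and Python references stand in for the OIDs that a real OODBMS would compare.

```python
import copy
from dataclasses import dataclass


@dataclass(eq=False)  # eq=False keeps Python's default identity-based ==
class Address:
    street: str
    location: str


@dataclass(eq=False)
class Company:
    name: str
    head_office: Address


def identical(a, b):
    """Equality of OIDs: both names denote the very same object."""
    return a is b


def shallow_equal(a, b):
    """Same atomic values, and referenced objects must be identical."""
    return type(a) is type(b) and all(
        (x is y) if hasattr(x, "__dict__") else (x == y)
        for x, y in zip(vars(a).values(), vars(b).values())
    )


def deep_equal(a, b):
    """Same values after recursively following every reference."""
    if hasattr(a, "__dict__") and hasattr(b, "__dict__"):
        return type(a) is type(b) and all(
            deep_equal(x, y) for x, y in zip(vars(a).values(), vars(b).values())
        )
    return a == b


original = Company("Alpha", Address("Via Appia 1", "Rome"))
shallow = copy.copy(original)   # new Company sharing the same Address object
deep = copy.deepcopy(original)  # new Company with its own copy of the Address
```

Here `shallow` is shallow-equal and deep-equal to `original` but not identical to it, while `deep` is only deep-equal, because its head office is a distinct object with the same values.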


2.2 Desirable properties of an OQL and examples

This part deals with the desirable properties of an OQL, illustrated with example queries. The examples use an SQL-like language but are not written in any existing OQL. The database to be queried is represented in figure 1.

Figure 1: Example of OODB (class diagram; the original diagram also marks the inheritance links between classes)

Automobile: Drive, Carbody (String)
Vehicle: Model (String), Manufacturer, Colour (String)
Company: Name (String), Headoffice, Subsidiaries, President
Address: Street (String), Location (String)
Subsidiary: Name (String), Office, Manager, Employees
Employee: Qualifications (String), Salary (Integer), FamilyMember
Person: Name (String), Age (Integer), Domicile, Fleet
VehicleDrive: Engine, Gearing (String)
OttoEngine: HP (Integer), CC (Integer)

2.2.1 Simple access to objects

A query may want to access either an object itself or the attributes of the object. There must be a linguistic distinction between accessing an object and accessing its attributes.

Example 1
select f
from f in Vehicle
where Model = 'Tipo'

Example 2
select *
from f in Vehicle
where Model = 'Tipo'

In the first example, the use of the variable f in both the from-clause and the select-clause indicates that one wants to access the object itself. To access the attributes of the object, the select-clause names the required attributes instead of the object variable. Example 2 illustrates a request for the values of all the attributes of object f.

2.2.2 Access to complex objects

Objects are not always simple and may refer to other objects. Querying an aggregated object should be possible.

Example 3
select President.Salary
from Company
where HeadOffice.Location = 'Rome'

The previous example uses path expressions within the query to make a selection on a complex object. The result is the salaries of all the presidents of companies whose head office is located in Rome.
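How a path expression such as the one in example 3 might be evaluated can be sketched in Python. This is only an illustration under stated assumptions: the helper `follow`, the lower-case attribute names and the sample companies are hypothetical, and the president is assumed to be an Employee so that a salary is available.

```python
from dataclasses import dataclass


@dataclass
class Address:
    street: str
    location: str


@dataclass
class Employee:
    name: str
    salary: int


@dataclass
class Company:
    name: str
    head_office: Address
    president: Employee


def follow(obj, path):
    """Evaluate a dotted path by successive attribute look-ups."""
    for step in path.split("."):
        obj = getattr(obj, step)
    return obj


companies = [
    Company("Alpha", Address("Via Appia 1", "Rome"), Employee("Rossi", 90000)),
    Company("Beta", Address("High Street 2", "Edinburgh"), Employee("Smith", 80000)),
    Company("Gamma", Address("Via Veneto 3", "Rome"), Employee("Bianchi", 95000)),
]

# select President.Salary from Company where HeadOffice.Location = 'Rome'
salaries = [
    follow(c, "president.salary")
    for c in companies
    if follow(c, "head_office.location") == "Rome"
]
```

Each step of the path is a navigation from one object to the object it references, which is why no explicit join appears in the query.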

2.2.3 Join request

As with relational databases, an OODB needs to process requests on objects that are not in the same class. This means that relationships between objects need to be expressed in the query so that the request can be evaluated.

Example 4
select p, f
from p in Person, f in Vehicle
where p.Name = f.Manufacturer.President.Name

The purpose of this request is to find the persons and vehicles such that the name of the person is the same as the name of the president of the company that manufactures the vehicle. Two classes (Person and Vehicle) are involved in the request and therefore need to be joined. Not all relationships between objects are defined in the schema, which explains the need for join operations.
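The join of example 4 can be approximated in Python as a nested iteration over two collections. The classes are cut down to the attributes the query needs, and the sample data is invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str


@dataclass
class Company:
    name: str
    president: Person


@dataclass
class Vehicle:
    model: str
    manufacturer: Company


persons = [Person("Rossi"), Person("Smith")]
alpha = Company("Alpha", Person("Rossi"))
beta = Company("Beta", Person("Jones"))
vehicles = [Vehicle("Tipo", alpha), Vehicle("Uno", alpha), Vehicle("Kappa", beta)]

# Explicit join: every person is paired with every vehicle, and a pair is kept
# when the two values match. The right-hand side of the comparison is itself
# a path expression through Manufacturer and President.
pairs = [
    (p.name, f.model)
    for p in persons
    for f in vehicles
    if p.name == f.manufacturer.president.name
]
```

Note that the comparison is on values: the Person named "Rossi" in `persons` and the president of `alpha` are distinct objects that happen to carry the same name.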

2.2.4 Identical treatment of attributes and methods

The possibility of querying an object's methods as well as its attributes was mentioned earlier. The benefit is that a method can report the state of an object at the moment of the request. For example, if one wants to know everything about the red cars, a method could return the number of cars stored at the moment of the query. For that purpose, the query language should treat the attributes and the methods of an object in the same way.

Example 5
select Nb_Vehicle()
from f in Vehicle
where Colour = 'red'

Nb_Vehicle() is a method that returns the number of vehicles at the moment of the query. In this case the method does not take any argument, but it should be possible to pass one if required.
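One way to picture this uniform treatment is a small Python evaluator that does not care whether a name denotes a stored attribute or a method. The functions `evaluate` and `select`, the `price` attribute and the `price_with_tax` method are all hypothetical and serve only as a sketch.

```python
class Vehicle:
    def __init__(self, model, colour, price):
        self.model = model
        self.colour = colour
        self.price = price

    def price_with_tax(self, rate_percent=20):
        """Derived value computed from the current state (hypothetical method)."""
        return self.price * (100 + rate_percent) // 100


def evaluate(obj, name, *args):
    """Return the attribute value, or the method result if the member is callable."""
    member = getattr(obj, name)
    return member(*args) if callable(member) else member


def select(objects, names, where=lambda obj: True):
    """Project the named attributes or methods of every object satisfying `where`."""
    return [tuple(evaluate(o, n) for n in names) for o in objects if where(o)]


fleet = [
    Vehicle("Tipo", "red", 10000),
    Vehicle("Uno", "blue", 8000),
    Vehicle("Punto", "red", 9000),
]

# The select list mixes a stored attribute and a method without distinction.
red_rows = select(
    fleet,
    ["model", "price_with_tax"],
    where=lambda v: evaluate(v, "colour") == "red",
)

# Passing an argument to the method, as the text says should be possible.
reduced = evaluate(fleet[0], "price_with_tax", 10)
```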

2.2.5 Sets of objects

The result of a query can be not just one object but a collection (or set) of objects. This brings out the two roles of a class: it defines the type (structure and behaviour) of an object, but it can also denote the set of all objects of that type.

Example 6
select Name
from Person
where Age > 50

The query of example 6 returns the names of the persons who are over fifty years old. Here Person is both the class of the objects and the set of those objects. A clearer way to express such a query is to give a name to the resulting set of objects, as example 7 illustrates.

Example 7
MyFamily := select p
            from p in Person
            where Name = MyName

MyFamily is the set of objects that answer the request, and it can be used in further requests like a new class.
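The reuse of a named result, which is also what the closure requirement asks for, can be sketched in Python. The `select` helper and the sample persons are assumptions made for illustration; the family name "Smith" plays the role of MyName.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen makes instances hashable, so they can live in a set
class Person:
    name: str
    age: int


def select(source, where):
    """Return the set of objects in `source` that satisfy `where`."""
    return {obj for obj in source if where(obj)}


persons = {
    Person("Smith", 72),
    Person("Smith", 34),
    Person("Jones", 55),
    Person("Smith", 51),
}

# MyFamily := select p from p in Person where Name = MyName
my_family = select(persons, lambda p: p.name == "Smith")

# The named result is queried again exactly as if it were a class extent.
seniors = select(my_family, lambda p: p.age > 50)
```

Because the result of `select` has the same form as its input, queries compose without any conversion step.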


2.2.6 Consequences of class hierarchy

The inheritance properties of an object are very important. Objects of a subclass inherit the properties of their superclass unless the properties are redefined in the subclass. These properties have to be exploited by the query language.

Example 8
select Qualifications
from Employee
where Name = 'Peter Smith'

This query asks for the qualifications of the employee named Peter Smith. Note that the attribute Name is not defined in the class Employee but is inherited from the superclass Person. The problem with inheritance is that the result of a query may not be consistent enough to be used in further queries. When a request obtains results through inheritance, the result can be a heterogeneous set of objects: objects of a subclass may have more attributes than those of their superclass, so further queries on the result risk a type error. For instance, Salary is an attribute of the class Employee and not of the class Person, yet a query over Person may return objects coming from both classes. A good query language should have a means of preventing this problem.
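The type problem described above can be reproduced in a few lines of Python. The `extent` function and its `include_subclasses` flag are hypothetical; they sketch one possible way for a language to let the user say whether subclass objects belong to the result.

```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age


class Employee(Person):
    def __init__(self, name, age, salary):
        super().__init__(name, age)
        self.salary = salary


def extent(cls, objects, include_subclasses=True):
    """Objects of `cls`, optionally together with the objects of its subclasses."""
    if include_subclasses:
        return [o for o in objects if isinstance(o, cls)]
    return [o for o in objects if type(o) is cls]


database = [Person("Anna", 61), Employee("Peter Smith", 45, 30000)]

# Querying Person with its subclasses gives a heterogeneous result.
everyone = extent(Person, database)

# A follow-up query that assumes every object has a salary fails.
try:
    salaries = [p.salary for p in everyone]
except AttributeError:
    salaries = None  # the type error the text warns about

# Restricting the result to Employee first makes the follow-up query safe.
safe_salaries = [e.salary for e in extent(Employee, everyone)]

# Excluding subclasses yields a homogeneous set of plain Person objects.
only_persons = extent(Person, database, include_subclasses=False)
```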

2.2.7 Database modifications

Finally, the query language should be able to modify the database. This means the capacity to create new objects, but also new classes, instances and methods. The ability to modify the database is natural to the programming language; nevertheless, it should also be allowed by the query language as a way of creating views of the database.

2.3 Differences between OOQLs and SQL

The main differences between relational and object-oriented databases are listed below. They stem from the data model underlying each kind of database.

2.3.1 Object Ids

In an RDBMS, the atomic data element is a value, with no concept of identity. In an OODBMS, every object is given an Object ID, and the query language therefore has to deal with more complex semantics of equivalence and copying.

2.3.2 Members

An OODBMS object can have members that contain references, or collections of references, to other objects. In an RDBMS this is achieved by adding relations and performing numerous joins, and the concepts of sharing and ownership need to be enforced by an external rule.

2.3.3 Methods

Traditional RDBMS do not support methods at all. Although some third-generation RDBMS (POSTGRES) provide query expressions as a data type, these are difficult to typecheck and optimise.

2.3.4 Derived Data

It is useful to accommodate multiple viewpoints on information or to maintain frequently referenced or computed data. This is fully supported through methods in an OODBMS, and partially supported in an RDBMS through views, which are a form of query.

2.3.5 Class Hierarchy

OODBMS fully support inheritance relationships between classes, whereas RDBMS do not support this notion at all.

2.3.6 Joins

OODBMS queries can perform joins on complex objects using several types of equivalence, while RDBMS can only perform joins on simple atomic attributes.

2.3.7 Recursion

Recursion is not supported in SQL-92 and most other RDBMS query languages, whereas OODBMS query languages support at least simple linear recursion through a simple notation.
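As a sketch of what a linear recursive query computes, the following Python function repeatedly follows one reference until no new object is found. It assumes, hypothetically, that a subsidiary can itself own subsidiaries, which the example schema does not state; the class and the data are invented for illustration.

```python
class Company:
    def __init__(self, name, subsidiaries=()):
        self.name = name
        self.subsidiaries = list(subsidiaries)


def all_subsidiaries(company):
    """Apply the same step to each result until nothing new is found."""
    seen = set()
    result = []
    frontier = list(company.subsidiaries)
    while frontier:
        current = frontier.pop(0)
        if id(current) in seen:  # identity check guards against shared or cyclic references
            continue
        seen.add(id(current))
        result.append(current)
        frontier.extend(current.subsidiaries)
    return result


c = Company("C")
b = Company("B", [c])
d = Company("D")
a = Company("A", [b, d])

names = [company.name for company in all_subsidiaries(a)]
```

Expressing the same request with a fixed number of joins is not possible when the depth of the hierarchy is unknown in advance, which is why the notation has to be built into the language.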

2.3.8 Extendibility

OOQLs are extensible in the number of types they can support. This is possible through inheritance, through the predicates supported and through member functions. SQL and RDBMS are not extensible in this way.

Finally, it can safely be said that an OODBMS with a good query language should be able to express all queries expressible in a relational query language. The added semantic information in the data model and the extensibility of the query language will make these queries conceptually clearer when they are expressed in an OODBMS.


2.3.9 Comparing semantics

OQL is the language proposed as a standard by the ODMG, and it has been designed so that anyone familiar with SQL can use it. The following table shows the syntax differences between SQL and OQL for the same query.

SQL                                      OQL
select Model.name, sales.quantity        select c.sale
from Manufacturer, Model, sales          from c in the_manufacturers
where sales.Fabric = Manufacturer.ID     where c.name = "Mercedes"
  and sales.Mod = Model.ID                 and c.sale.country = "Italy"
  and Manufacturer.name = "Mercedes"
  and sales.country = "Italy"

Because SQL is not adapted to OODBs, early OODB systems had no query language: all access to the database went through the programming language. This had the disadvantage of keeping OODBs away from a wider public, since the user had to be a programmer, and it is considered a step backwards in the evolution of DBMSs.


3 Current Object-Oriented Query Languages

As stated earlier, the ODMG has defined a standard specification that an object-oriented query language should fulfil. Just as many OODBS have been developed from the standard provided by ODMG, many query languages have been elaborated from it. Some of them are completely integrated within the OODBS in which they operate, while others have been derived from an object-oriented programming language and could fit into different environments. This section presents some of the most commonly used query languages and gives their main characteristics.

3.1 LIFOO

The LIFOO language is a functional language that allows O2 databases to be queried. A LIFOO request is a combination of functional operators.

The main characteristics of LIFOO are:

LIFOO is a pure query language; update operations are not supported. It supports quantification, aggregation and ordering functions, but it does not support recursion. The answer to a request is either an already existing object or a NIL value.

Object encapsulation is respected. The language manipulates objects through their methods; it does not reach the data directly.

LIFOO is strongly and statically typed; the type checking of a request is done at compilation time. The only type declarations the user has to provide are those at the start of the functions that he or she defines. The other types are determined automatically.

3.2 O2QUERY

O2Query is another functional language used to query O2 systems. It can be used either in an interactive mode or in a programming mode. In the interactive mode, O2Query can freely access the value of an object without going through predefined methods. In doing so, the language violates the encapsulation of the object. This is allowed only for interactive requests, as their answers depend only on the current value of the objects they manipulate. When integrated into a program, requests are not allowed to access values in this way; they must respect the protection provided by encapsulation. O2Query uses a high-level language based on SQL.


Generic methods are available, for example for copying (copy and deep_copy) or testing for equality (equal, deep_equal…). User-defined collections of objects can be formed.

3.3 OQL

OQL is a language directly built from the ODMG definition, and conforms to the framework provided by the Object Definition Language.

OQL is close to SQL-92, with the following additional characteristics:

It handles complex objects; OQL is an Object-Oriented language.

OQL uses the identity (OID) of objects to perform queries.

It also supports path expressions. One can enter a database through a named object but, more generally, once an object has been reached, the system needs a way to navigate from it to the data it needs. OQL uses the “.” notation to go inside complex objects.

In OQL, a collection in the FROM part can be built from a previous one by following a path which starts from it; for example:

The WHERE clause defines a predicate that is used to select the data matching it.

In the FROM clause, collections of objects which are not directly related can be declared. This is possible because OQL supports joins.
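As a rough illustration of how a FROM clause can range over a collection reached through a path, how a WHERE predicate filters, and how a join relates two collections, the following Python sketch mimics these constructs with comprehensions. It is not OQL; the classes, the `Office` collection and the sample data are assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    man_years: int


@dataclass
class Employee:
    name: str
    city: str
    tasks: list = field(default_factory=list)


@dataclass
class Office:  # hypothetical collection, not related to Employee by any reference
    city: str


employees = [
    Employee("Ann", "Edinburgh", [Task("design", 25), Task("test", 5)]),
    Employee("Bob", "Glasgow", [Task("audit", 30)]),
]
offices = [Office("Edinburgh")]

# Dependent collection: t ranges over e.tasks, a path starting from e.
# The condition plays the role of the WHERE predicate.
big_tasks = [
    (e.name, t.name)
    for e in employees
    for t in e.tasks
    if t.man_years > 20
]

# Join: two collections that are not directly related, matched on a value.
colocated = [
    (e.name, o.city)
    for e in employees
    for o in offices
    if e.city == o.city
]
```

Here `big_tasks` pairs each employee with the tasks reached through the path `e.tasks`, while `colocated` relates employees and offices only through the equality test.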


This query shows the necessity of optimization in a query language. The inner subquery must not be recomputed for every evaluation of the outer query: since its result is always the same, it must be computed only once and reused.

Unlike SQL, OQL must manipulate complex values. OQL must therefore be able to create temporary complex values leading to the result. To build a complex value, OQL uses the basic constructors.
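The effect of computing an uncorrelated inner subquery once can be sketched in Python. This is only an illustration under assumed data: the `salaries` table, the `average_salary` subquery and the call counter are hypothetical and stand in for whatever inner query is being evaluated.

```python
# Hypothetical data: employee name -> salary.
salaries = {"Ann": 30, "Bob": 50, "Cid": 40}
calls = {"inner": 0}  # counts how often the inner subquery is evaluated


def average_salary():
    """Inner subquery: its result does not depend on the outer row."""
    calls["inner"] += 1
    return sum(salaries.values()) / len(salaries)


def naive():
    # The inner subquery is re-evaluated for every outer row.
    return [name for name, s in salaries.items() if s > average_salary()]


def hoisted():
    # The inner subquery is evaluated once and its result reused.
    avg = average_salary()
    return [name for name, s in salaries.items() if s > avg]
```

Both functions return the same answer, but `naive` evaluates the inner subquery once per row while `hoisted` evaluates it a single time.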

OQL allows method calls in the queries as long as this does not generate an error in the type checking. For example:

Finally, OQL handles polymorphism. Polymorphism is the ability to invoke an operation on any of several different objects and have that object determine what to do at run-time. A polymorphic function is one that can be applied in the same way to a variety of data objects. Support for polymorphism involves technical decisions concerning early or late binding among objects and the procedures that invoke their methods.

3.4 SQL 3

SQL is a standard language that has been commonly used in relational databases over the last years. With the appearance of object-oriented databases, this language had to adapt to this new paradigm.


Version 3 of SQL is the answer to this problem, published by the ANSI X3H2 group and by ISO.

To ease the move from the traditional relational query language architecture to an object-oriented one, the notion of Abstract Data Type (ADT) has been introduced. An ADT stands for a class definition: it specifies a set of attributes and routines.

The other parts of SQL3 that provide the primary basis for supporting object-oriented structures are:

Encapsulation is managed by setting public and private tags on the attributes and the methods.

Constructors are used to initialize new instances of an ADT.

Destructors are used to destroy an instance of an ADT and so release the resources it uses.

Actors are used to perform any other action on those instances (get, set, …).

A unique OID is associated with every new instance of an ADT. The OID value is stored in an attribute that cannot be assigned or updated by users.

For upward compatibility with SQL92, the definition of a table is still necessary, even if the table has a single column holding one ADT.

When a query is performed, the same syntax as SQL92 can be used, but queries can now include ADT attributes and some actor functions.
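The ADT features listed above can be approximated in Python as a minimal sketch. This is not SQL3 syntax: the `Money` type, its actor functions and the OID counter are assumptions chosen for illustration, and destructors are left out because Python does not destroy objects deterministically.

```python
import itertools

_next_oid = itertools.count(1)  # hypothetical system-wide OID generator


class Money:
    """Hypothetical ADT with a private attribute and public actor functions."""

    def __init__(self, amount):      # constructor: initializes a new instance
        self._oid = next(_next_oid)  # OID assigned by the system, not the user
        self._amount = amount        # attribute treated as private

    @property
    def oid(self):
        # Read-only: there is no setter, so users cannot assign the OID.
        return self._oid

    def get_amount(self):            # actor function (observer)
        return self._amount

    def set_amount(self, amount):    # actor function (mutator)
        self._amount = amount


# A "table" with a single column holding ADT instances.
price_table = [Money(10), Money(250)]

# A query whose predicate calls an actor function.
expensive = [m.oid for m in price_table if m.get_amount() > 100]
```

Trying to assign `price_table[0].oid` raises `AttributeError`, which mirrors the rule that the OID attribute cannot be updated by users.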

3.5 GemStone

A query in GemStone can be performed only if the collection of objects on which the query is requested has a class defined.

GemStone uses a combination of Boolean predicates to perform a query. For example:

Tasks select: {:t | t.man_years > 20}

The result of a query is a set of the same class as the queried collection. The result of the query above is a set of tasks.

Queries can also combine Boolean predicates and path expressions. For example:

Tasks select: {:t | (t.man_years > 20) & (t.leader.specialization = 'DB')}

In addition to the select message, other query protocols such as reject or detect can be used. Reject is the logical complement of the select query.
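The three query protocols can be mimicked in Python to show how they relate. This is a sketch, not GemStone code: the helper functions and sample data are assumptions, and `detect` is modelled after the usual Smalltalk meaning (the first element satisfying the predicate), returning `None` here when nothing matches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Leader:
    specialization: str


@dataclass(frozen=True)
class Task:
    name: str
    man_years: int
    leader: Leader


tasks = [
    Task("design", 25, Leader("DB")),
    Task("test", 5, Leader("DB")),
    Task("audit", 30, Leader("UI")),
]


def select(collection, predicate):
    """Elements for which the predicate holds."""
    return [x for x in collection if predicate(x)]


def reject(collection, predicate):
    """Logical complement of select."""
    return [x for x in collection if not predicate(x)]


def detect(collection, predicate):
    """First element for which the predicate holds, or None."""
    return next((x for x in collection if predicate(x)), None)


def big(t):
    return t.man_years > 20


def big_db(t):
    # Boolean predicate combined with a path expression (t.leader.specialization).
    return t.man_years > 20 and t.leader.specialization == "DB"
```

For any predicate, `select` and `reject` together partition the collection, which is what "logical complement" means here.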

3.6 Iris


The query language used by Iris is known as OSQL. Of all the systems presented earlier, this is the one whose architecture is the closest to SQL. For example, OSQL also uses a system of recursive cursors to query data sequentially.

OSQL allows user-defined functions to be invoked as queries. The system checks the arguments passed to queries in a way very close to the type checking performed in many other systems.

Thus, the query which selects all tasks with a number of man_years greater than 20 is:

Select t
For each Task t
Where t.man_years > 20;

Attributes and methods are represented by functions in the Iris data model. A function defines the attribute and an access interface. The interface is public and cannot be hidden from public use, and encapsulation is not enforced.

Operations of aggregation and ordering are not supported by OSQL, but foreign functions can be used. OSQL is an extensible language, but the data model is multi-rooted. Creating generic operations over all object classes is thus very difficult.


4. Query optimisation and future work

One of the difficulties in Object-Oriented Database Management Systems (OODBMS) is the optimisation of declarative queries. Several issues specific to the object-oriented data model make the optimisation of OODBMS queries more complex. This section gives an overview of these issues, discusses the methods proposed to deal with them, and finally outlines future work and research scope.

4.1 Optimisation Issues:

OODBMSs have aspects that add more complexity to query optimisation.

4.1.1 Additional Data Types

The user definition of new types and classes through inheritance can both assist and hinder the optimisation of queries. An example where it helps is a query involving the intersection of Employees and Supervisors. If Employee is a super class of Supervisor, the optimiser can assume that Supervisors are a subset of Employees and simplify the intersection to the set of Supervisors.

An example where it hinders optimisation involves the union of Students and Employees, with Person being a super class of both. If we want to find all supervisors of students and employees, we cannot perform the union first and then apply supervisor() to the result, since it may not be defined for Person, the type of the union. We may be forced to apply the method to each set by itself and then form the union of the results, which can be very inefficient.

We can see that every member of the union of Students and Employees is guaranteed to have supervisor() defined, so the method could safely be applied to the union, but this requires the optimiser to provide powerful type-inference mechanisms.
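Both situations can be sketched in Python. The class bodies, the sample objects and the two checking functions below are assumptions for illustration; a real optimiser would reason over the schema rather than over live classes.

```python
class Person:
    def __init__(self, name):
        self.name = name


class Employee(Person):
    def __init__(self, name, boss=None):
        super().__init__(name)
        self.boss = boss

    def supervisor(self):
        return self.boss


class Supervisor(Employee):
    pass


class Student(Person):
    def __init__(self, name, tutor=None):
        super().__init__(name)
        self.tutor = tutor

    def supervisor(self):
        return self.tutor


sue = Supervisor("Sue")
ann = Employee("Ann", boss=sue)
tom = Student("Tom", tutor=sue)

employees = [ann, sue]   # the extent of Employee includes its subclass instances
supervisors = [sue]
students = [tom]

# Helpful case: Supervisors are a subset of Employees, so the intersection
# can be replaced by the set of Supervisors without scanning Employees.
intersection = [e for e in employees if e in supervisors]


def union_type_supports(method):
    """Coarse check: is the method defined on Person, the type of the union?"""
    return hasattr(Person, method)


def each_member_supports(classes, method):
    """Finer inference: is the method defined on every class in the union?"""
    return all(hasattr(c, method) for c in classes)


# With the finer inference, supervisor() can be applied to the union directly.
supervisor_names = [
    p.supervisor().name
    for p in students + employees
    if p.supervisor() is not None
]
```

The coarse check rejects `supervisor()` on the union, while the finer check accepts it, which is the kind of type inference the text asks of the optimiser.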

4.1.2 Complex Objects

The path expressions and query closure of OODBMS query languages complicate the processing of queries in several ways. One of these difficulties is the building of indices for path expressions, especially in the face of arbitrary methods in the path. This is in general a very hard problem, and nearly unsolvable if methods can have side effects.

Another problem with path expressions is that they suggest an execution order of the path methods, which may well be a very inefficient order. When methods are involved in queries this is further complicated since the optimiser may have no idea what the method execution times and return values are.


4.1.3 Methods and Encapsulation

The difficulty of optimizing methods arises if there is no way of determining the cost of evaluating methods, and the size of their results.

There are several proposals for the solution of this issue. Some systems allow the optimizer to break the encapsulation of methods and examine them to determine cost, which forces the methods to be written in a language understood by the optimizer. Other systems declare the cost of methods as a part of the definition. Yet other systems optimize queries under several assumptions, and then determine at query run-time what assumptions are most valid, based on current cost statistics, and execute the version optimized for those assumptions.
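The second proposal, declaring the cost of a method as part of its definition, can be sketched as follows. The `cost` decorator, the two predicates and the cost units are hypothetical; the sketch orders conjuncts by declared cost only and ignores selectivity, which a real cost model would also weigh.

```python
def cost(units):
    """Attach a declared evaluation cost to a method (assumed cost units)."""
    def mark(fn):
        fn.declared_cost = units
        return fn
    return mark


@cost(1)
def is_recent(doc):
    return doc["year"] >= 2000


@cost(100)
def mentions_oql(doc):
    return "oql" in doc["text"].lower()


def plan(predicates):
    # The optimiser reads the declared costs without inspecting method bodies,
    # so encapsulation is preserved.
    return sorted(predicates, key=lambda p: p.declared_cost)


def run(docs, predicates):
    ordered = plan(predicates)
    # all() stops at the first failing predicate, so cheap tests run first
    # and expensive ones are skipped whenever a cheap one already fails.
    return [d for d in docs if all(p(d) for p in ordered)]
```

For example, `run(docs, [mentions_oql, is_recent])` evaluates `is_recent` first, whatever order the predicates were written in.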

4.1.4 Identity and Equivalence

Another factor in query optimisation is whether the query language allows the creation of new objects and, if so, whether those objects have object identities (OIDs). This influences query optimisation in that it can introduce the need for duplicate elimination, for deep equality comparisons where OID comparisons would otherwise be sufficient, or for dealing with objects without OIDs.
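The difference between comparing identities and comparing values during duplicate elimination can be sketched in Python. Using `id()` as a stand-in for an OID is an assumption made only for this illustration, as are the `Point` type and the sample result list.

```python
from dataclasses import dataclass


@dataclass
class Point:
    x: int
    y: int


a = Point(1, 2)
b = Point(1, 2)   # a newly created object: same value as a, different identity
results = [a, b, a]

# Duplicate elimination by identity: one cheap comparison per object.
by_identity = list({id(o): o for o in results}.values())

# Duplicate elimination by value: needs a field-by-field (deep) comparison,
# which the dataclass provides through ==.
by_value = []
for o in results:
    if o not in by_value:
        by_value.append(o)
```

Identity-based elimination keeps both `a` and `b`, whereas value-based elimination keeps only one of them, at the price of deeper comparisons.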

4.2 Algebras and Calculi Optimisation

Several different formal query languages, both algebras and calculi, have been proposed for OODBMSs. Algebras and calculi differ in several respects, such as expressiveness and support for optimising rewrite rules. Most of these algebras are variable based, i.e. they use variables for temporary results. The lack of a standard algebra and calculus has hindered research into OODBMS query optimisation, preventing general conclusions from being drawn from research results.

4.2.1 Algebraic Optimisation

Algebraic optimisation has the advantage that a query can be transformed using well-defined operators. If the algebra is equivalence-preserving, which it should be, each query has a large number of equivalent queries making up the optimisation search space.

On the other hand, exhaustive evaluation of the search space is usually infeasible because of its large size. Even so, dynamic programming methods such as memoisation can make this approach feasible. Methods using heuristics are another alternative, examples of which are randomised search and hill climbing. These methods only examine a part of the search space, trying to evaluate those parts of it which look promising in one way or another. They therefore cannot guarantee an optimal solution, but that may not always matter if the search is based on incomplete information.
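How memoisation keeps an exhaustive search manageable can be sketched on a toy join-ordering problem. Everything here is an assumption for illustration: the collection sizes, the fixed selectivity of 1/100 per join, the restriction to left-deep plans and the cost measure (sum of intermediate result sizes).

```python
import math
from functools import lru_cache

# Hypothetical collection sizes.
SIZES = {"A": 1000, "B": 10, "C": 100}


def result_size(rels):
    """Estimated size of joining a set of collections (selectivity 1/100 per join)."""
    return math.prod(SIZES[r] for r in rels) // (100 ** (len(rels) - 1))


@lru_cache(maxsize=None)
def best_plan(rels):
    """Cheapest left-deep plan for a frozenset of collections.

    Returns (cost, plan). Each subset is solved once and then reused,
    which is the dynamic-programming part.
    """
    if len(rels) == 1:
        (only,) = rels
        return 0, only
    best = None
    for last in sorted(rels):
        sub_cost, sub_plan = best_plan(rels - {last})
        total = sub_cost + result_size(rels)
        if best is None or total < best[0]:
            best = (total, (sub_plan, last))
    return best
```

Calling `best_plan(frozenset(SIZES))` joins the two small collections first and the large one last; the cached results for shared subsets are what stop the enumeration from repeating work.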


4.2.3 Extensible Optimisers

Since OODBMSs still have no standard data model, any optimiser system needs to be able to change with time. A lot of work has focused on building “Optimiser Generators”, which build optimisers based on information on the data model, query language, algebra, calculi and cost model. The systems going furthest in this direction define all aspects of the OODBMS and optimiser as objects and allow extensions through inheritance and method overloading.

4.3 Query Optimisation Methodology

An object-oriented database model supports features such as abstract data types, methods, encapsulation, subtyping (or inheritance), complex structures, and object identity. The processing of queries in such a model also requires support for these features. Query optimisation will require new techniques for supporting the object-oriented features. Although many of the problems that must be solved by an object-oriented query optimiser are similar to problems solved by relational and extensible optimisers, there are also many problems that are unique to the object-oriented model.

In a number of papers [1], T. Özsu and co-authors define a query processing methodology. In it, query-rewrite optimisation is a high-level process where general-purpose heuristics drive the application of rewrite rules, and plan optimisation is a lower-level process which generates execution plans based on knowledge of relative costs, statistics and physical structure. A declarative query is optimised as follows:

The calculus expression is first reduced to a normalized form by eliminating duplicate predicates, applying identities and rewriting.

The normalized expression is then converted to an equivalent object algebra expression. The algebra form of the query is a nested expression, which can be viewed as a tree whose nodes are algebra operators and whose leaves represent extents of classes in the database.

The algebra expression is next checked for type consistency to ensure that predicates and methods are not applied to objects that do not support them. This is complicated by the potentially heterogeneous nature of query results.

Next the type-checked expression is rewritten using an equivalence-preserving rule system.

Lastly execution plans, or perhaps several different execution plans, are generated, which take into account the actual object implementations.
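The five steps above can be sketched as a tiny pipeline in Python. This is a heavily simplified illustration and not the methodology's actual algorithms: queries are reduced to a list of selection predicates over one class extent, and the schema, the data, the single rewrite rule and all function names are assumptions.

```python
import operator

# Hypothetical schema and stored extent.
SCHEMA = {"Task": {"name", "man_years"}}
DATA = {"Task": [{"name": "design", "man_years": 25},
                 {"name": "test", "man_years": 5}]}
OPS = {">": operator.gt, "=": operator.eq}


def normalise(preds):
    """Step 1: eliminate duplicate predicates, keeping the original order."""
    return list(dict.fromkeys(preds))


def to_algebra(extent, preds):
    """Step 2: build a tree of select operators over a class-extent leaf."""
    tree = ("extent", extent)
    for p in preds:
        tree = ("select", p, tree)
    return tree


def type_check(tree):
    """Step 3: reject predicates on attributes the objects do not support."""
    if tree[0] == "extent":
        return SCHEMA[tree[1]]
    attrs = type_check(tree[2])
    if tree[1][0] not in attrs:
        raise TypeError(f"unknown attribute {tree[1][0]!r}")
    return attrs


def rewrite(tree):
    """Step 4: equivalence-preserving rule; selections commute, so equality
    tests are moved innermost (assumed here to be the most selective)."""
    preds = []
    while tree[0] == "select":
        preds.append(tree[1])
        tree = tree[2]
    preds.sort(key=lambda p: p[1] != "=")  # stable sort, "=" predicates first
    for p in preds:
        tree = ("select", p, tree)
    return tree


def execute(tree):
    """Step 5: a trivial execution plan that scans and filters in memory."""
    if tree[0] == "extent":
        return list(DATA[tree[1]])
    attr, op, value = tree[1]
    return [row for row in execute(tree[2]) if OPS[op](row[attr], value)]


def run(extent, preds):
    tree = to_algebra(extent, normalise(preds))
    type_check(tree)
    return execute(rewrite(tree))
```

A query with a repeated predicate is normalised to a single selection before evaluation, and a predicate on an attribute missing from the schema is rejected at the type-checking step, before any data is touched.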

The simplest implementation of this approach to query optimisation would be to consider all equivalent algebra expressions from the next-to-last step, evaluate the cost of each of them in the plan generator, and choose the one with the least cost. However, this is infeasible since it requires the generation of a very large number of alternatives. One solution here is to use heuristics, combined with memoisation of optimal sub-expressions, to generate a single top-level query expression as final input to the plan generator. Another solution is to combine the last two steps into one, which is the approach most often taken in cost-based optimisation.

The optimisation of OODBMS query languages is inherently much more complicated than the optimisation of conventional query languages. There are numerous issues involving types and method evaluations which are much harder to optimise due to the amount of inference the optimiser needs to make in order to be able to efficiently optimise the query.

The future work on query optimisation will address the following issues:

A Standard Extensible Optimiser Framework: Much of the work in OODBMS query optimisation involves continuous experimentation and construction of models, which a standard extensible framework would support.

Better Cost Models: There need to be better ways of judging the expected cost of evaluating an expression, as well as better ways of finding queries of minimal cost within the set of equivalent queries.

Better Algebras and Rewrite Systems: The most popular optimisation methods today involve algebras and rewrite rule systems. The systems available today operate using heuristics, which are known to be sub-optimal, and are of provably limited power.

Eliminating Method Evaluation: As a rule of thumb we would like to do without evaluating expensive methods, e.g. through selective short-circuiting of conditionals.

Indexing of Path Expressions: The evaluation of path expressions can be very time consuming and is a very good candidate for optimisation.

Pre-computation and Caching: This is a major candidate for massive speedup in query processing time. It is highly likely that the intermediate results of queries, sub queries and path expressions will be the same for many queries. The identification, pre-computation and/or caching of these results, especially when combined with better indexing methods, would speed up queries by orders of magnitude.

It is hard to provide a scheme for evaluating the success of research into query optimisation, the only real criterion being faster average execution time of queries.

4.4 OOQL Future Directions

There is little incentive for innovation in the area of OODBMS query languages, at least as of yet, since OODBMS vendors are still working on the basic issues in their implementations and have not even all fully implemented ODMG-93 OQL yet. Therefore ODMG-93 OQL and slight variants of it are likely to be the query language for quite some time to come, just as SQL has been the virtually unchanged query language for RDBMSs for a long time now.

4.4.1 Research in OOQL

The area which may become important in research on OODBMS query languages is that of queries by end-users, people who may not even be really aware that they are accessing a database. As object-oriented techniques continue to grow in all areas of computing, the need to query large collections of objects will become a major issue for end users. Research in this area will be closely tied to that of research in user interfaces, since the query language will be a part of the user interface for the end users. The following is a list of possible work to be undertaken in OOQL research:

A New Super OODBMS Query Language: The construction and universal acceptance of a new theoretically sound and complete OODBMS query language with more expressive power.

Polishing ODMG-93 OQL: including a standard for query optimisation.

Connection to Rule Systems: Queries and updates are bound to trigger rules, and vice-versa. The research aims to define a better synergy between the two.

Graphical Query Construction: Construction of queries via a graphical interface, like a drag-and-drop Query-by-Example associative graphs format of some sort, is likely to be successful as an ad-hoc query builder for end users.

4.6 Future scope

The results that are expected are characterized by the demand for the research, the possible benefits from the research, the likelihood of major discoveries and finally the likely time span of the research.

The work on ODMG-93 OQL and rule systems is considered to be progressing automatically and would be done in the next 5 years as OODBMSs are extended. The resources and backing required for it are merely the continued survival of OODBMS vendors and an active OODBMS user community. However, the construction of a completely new query language lacks motivation in the OODBMS community. There isn't much demand for the research, the benefits for actual users are not major, the possibility of major breakthroughs is very small and the time span of the research is probably long, since any new language has to be implemented and used before it can be fully evaluated. These issues would have been fully resolved in 10 years' time. The development of a good end-user OODBMS query mechanism will not resolve itself through incremental changes to existing technology. Its development requires strong industrial backing and very possibly paradigm shifts from existing query mechanisms.


The demand for research on Graphical Query Construction may not be overwhelming at this time. In the near future, however, millions of users worldwide will be working with most of their data in terms of objects, be it within application software like CAD systems or spreadsheets, or just in the file storage system. As computers become networked onto the World Wide Web they will have access to more and more information, most of which can be viewed as objects of different types and interconnections.

CONCLUSION

It has been seen that, as in the case of relational databases, an object-oriented DBMS should provide a powerful, descriptive, generic language. There is not yet a standard that all database vendors have accepted, but OQL from ODMG and SQL3 from ISO/ANSI are the two trying to become that standard.

Neither of those query languages satisfies all the requirements that an object-oriented database involves. Nevertheless, the requirements have been agreed by most of the OODB community, so we may think that the future OOQL standard will be the first query language released that meets the requirements. Because of that, the eventual standard may not be the one specified by ODMG, but the one that is most used (a de facto standard).


References:

Books:

[Atk89] M. Atkinson, F. Bancilhon, D. DeWitt, K. Dittrich, D. Maier, and S. Zdonik. The Object-Oriented Database System Manifesto. 1989. http://www-2.cs.cmu.edu/afs/cs.cmu.edu/user/clamen/OODBMS/

[Lau97] G. Lausen, G. Vossen. Models and Languages of Object-Oriented Databases. 1997. Addison-Wesley Publishers.

[Ber93] E. Bertino, L. Martino. Object-Oriented Database Systems, Concepts and Architectures. 1993. Addison-Wesley Publishers.

[Wom90] K. Wom. Introduction to Object-Oriented Databases. 1990. MIT Press.

[Cat94] R.G.G. Cattell. Object Data Management, Object-Oriented and Extended Relational Database Systems. 1994. Addison-Wesley Publishers.

Web sites:

[1] http://www.cs.brown.edu/publications/techreports/reports/CS-91-41.html

http://www.cs.cornell.edu/home/ulfar/

http://www.avalon.net/~wbachman/OOFAQ/oo-faq-S-8.2.html -> current commercial/uni systems

http://aurora.rg.iupui.edu/doc/OO/papers.html -> articles

http://www.cai.com/products/jasmine/analyst/idc/14821Eat.htm -> Obj. DB VS Obj-Relational DB

http://www.dis.port.ac.uk/~chandler/OOLectures/database/database.htm -> Intro to OODB

http://www.cis.ohio-state.edu/~doug/Cis671/Transparencies/ORDBS.pdf -> OQL /SQL3

http://www.cs.nmsu.edu/~tson/classes/cs582/note6.pdf -> OQL

http://www.odbmsfacts.com/articles/why_use_sql_instead_of_an_oodbms.html -> Why use SQL instead of an OODBMS

http://www.comp.nus.edu.sg/~cs2102s/recitation/OODB/p25.presentation.ppt -> oodb

http://dbserver.iie.ncku.edu.tw/~tsengsm/COURSE/webdb-7-query/sld001.htm -> query languages

http://lbdwww.epfl.ch/f/teaching/courses/SlidesBDA/BDobjets/BDAOO_vcourtPourCours.ppt 

http://www.mm.di.uoa.gr/~toobis/seminar/OQL/sld001.htm

http://www.ipipan.waw.pl/~subieta/papers/Adbis2000Tut.ppt
