12
Evaluating DBOWL: A Non-materializing OWL Reasoner based on Relational Database Technology Maria del Mar Roldan-Garcia, Jose F. Aldana-Montes University of Malaga, Departamento de Lenguajes y Ciencias de la Computacion Malaga 29071, Spain, (mmar,jfam)@lcc.uma.es, WWW home page: http://khaos.uma.es Abstract. DBOWL is a scalable reasoner for OWL ontologies with very large Aboxes (billions of instances). DBOWL supports most of the frag- ment of OWL covering OWL-DL. DBOWL stores ontologies and clas- sifies instances using relational database technology and combines rela- tional algebra expressions and fixed-point iterations for computing the closure of the ontology, called knowledge base creation. In this paper we describe and evaluate DBOWL. For the evaluation both the standard datasets provided in the context of the ORE 2012 workshop and the UOBM (University Ontology Benchmark) are used. A demo of DBOWL is available at http://khaos.uma.es/dbowl. 1 Introduction With the explosion of Linked Data 1 , some communities are making an effort to develop formal ontologies for annotating their databases and are publishing these databases as RDF triples. Examples of this are biopax 2 in the field of Life Sci- ence and LinkedGeoData 3 in the field of Geographic Information Systems. This means that formal ontologies with a large number (billions of) instances are now available. In order to manage these ontologies, current platforms need a scalable, high-performance repository offering both light and heavy-weight reasoning ca- pabilities. The majority of current ontologies are expressed in the well-known Web Ontology Language (OWL) that is based on a family of logical formalisms called Description Logic (DL). Managing large amounts of OWL data, includ- ing query answering and reasoning, is a challenging technical prospect, but one which is increasingly needed in numerous real-world application domains from Health Care and Life Sciences to Finance and Government. In order to solve these problems, we have developed DBOWL, a scalable rea- soner for very large OWL ontologies. DBOWL supports most of the fragment of OWL covering OWL 1 DL. DBOWL stores ontologies and classifies instances 1 http://linkeddata.org/ 2 http://www.biopax.org/ 3 http://linkedgeodata.org/About

Evaluating DBOWL: A Non-materializing OWL Reasoner based ...ceur-ws.org/Vol-858/ore2012_paper3.pdf · Relational Algebra operations enable a user to specify basic retrieval re-quests

  • Upload
    others

  • View
    6

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Evaluating DBOWL: A Non-materializing OWL Reasoner based ...ceur-ws.org/Vol-858/ore2012_paper3.pdf · Relational Algebra operations enable a user to specify basic retrieval re-quests

Evaluating DBOWL: A Non-materializing OWLReasoner based on Relational Database

Technology

Maria del Mar Roldan-Garcia, Jose F. Aldana-Montes

University of Malaga, Departamento de Lenguajes y Ciencias de la ComputacionMalaga 29071, Spain,

(mmar,jfam)@lcc.uma.es,WWW home page: http://khaos.uma.es

Abstract. DBOWL is a scalable reasoner for OWL ontologies with verylarge Aboxes (billions of instances). DBOWL supports most of the frag-ment of OWL covering OWL-DL. DBOWL stores ontologies and clas-sifies instances using relational database technology and combines rela-tional algebra expressions and fixed-point iterations for computing theclosure of the ontology, called knowledge base creation. In this paper wedescribe and evaluate DBOWL. For the evaluation both the standarddatasets provided in the context of the ORE 2012 workshop and theUOBM (University Ontology Benchmark) are used. A demo of DBOWLis available at http://khaos.uma.es/dbowl.

1 Introduction

With the explosion of Linked Data1, some communities are making an effort todevelop formal ontologies for annotating their databases and are publishing thesedatabases as RDF triples. Examples of this are biopax2 in the field of Life Sci-ence and LinkedGeoData3 in the field of Geographic Information Systems. Thismeans that formal ontologies with a large number (billions of) instances are nowavailable. In order to manage these ontologies, current platforms need a scalable,high-performance repository offering both light and heavy-weight reasoning ca-pabilities. The majority of current ontologies are expressed in the well-knownWeb Ontology Language (OWL) that is based on a family of logical formalismscalled Description Logic (DL). Managing large amounts of OWL data, includ-ing query answering and reasoning, is a challenging technical prospect, but onewhich is increasingly needed in numerous real-world application domains fromHealth Care and Life Sciences to Finance and Government.

In order to solve these problems, we have developed DBOWL, a scalable rea-soner for very large OWL ontologies. DBOWL supports most of the fragmentof OWL covering OWL 1 DL. DBOWL stores ontologies and classifies instances

1 http://linkeddata.org/2 http://www.biopax.org/3 http://linkedgeodata.org/About

Page 2: Evaluating DBOWL: A Non-materializing OWL Reasoner based ...ceur-ws.org/Vol-858/ore2012_paper3.pdf · Relational Algebra operations enable a user to specify basic retrieval re-quests

2

using relational database technology. The state-of-the-art algorithm for achiev-ing soundness and completeness in reasoning with expressive DL ontologies isthe so-called Tableau procedure. Current Tableau-based implementations suchas Pellet, Racer and HermiT show very good behavior in practice, but are com-pletely memory-based and thus cannot cope with ontologies that have a largeABox. Several alternative approaches using disk-oriented implementations havebeen presented. These proposals can be classified into three categories. (1) Thosewhich combine a DL main-memory based reasoner with a database, (2) Thosewhich translate the ontology to Datalog and use a deductive database to eval-uate it, and (3) Those which extend the database with reasoning capabilities.Our proposal follows a different approach: The state-of-the-art OWL reasonerPellet4 is currently used to classify the ontology Tbox. Information returned byPellet is stored in a relational database. Class and property instances are alsostored in the relational database as relation tuples. An algorithm which combinesrelational algebra expressions with fixed-point iterations is used to compute theclosure of the of the ontology, called knowledge base creation. The use of an OWLreasoner, like Pellet, to classify the Tbox is crucial in our approach. This allowsthe capture of some Tbox inferences that cannot be obtained by other similarproposals such as those based on disjunctive datalog [1].

This paper presents a description and an evaluation of DBOWL. In orderto evaluate DBOWL we use the standard datasets provided in the context ofthe ORE 2012 workshop5 and the UOBM (University Ontology Benchmark) [2],the extremely well known benchmark for comparing ontology repositories in theSemantic Web. The rest of the paper is organized as follows. Section 2 introducesthe theoretical concepts on which DBOWL is based. Section 3 describes thetheoretical foundation of DBOWL, presenting the process for computing theontology closure. Section 4 discusses the advantages and limitations of DBOWL.The evaluation of DBOWL is presented in Section 5. Finally Section 6 concludesthe paper.

2 Preliminaries

2.1 The Relational Model

The relational model was first introduced by Ted Codd of IBM Research in 1970in a classic paper [3]. The relational model is characterized by its simplicityand mathematical foundation. The relational model represents the database asa collection of relations.

A domain D is a set of atomic values. Atomic means that each value inthe domain is indivisible as far as the relational model is concerned. A commonmethod of specifying a domain is to specify a data type from which the datavalues forming the domain are drawn. It is also useful to specify a name forthe domain, to help in interpreting its values. A relation schema R, denoted

4 http://clarkparsia.com/pellet5 http://www.cs.ox.ac.uk/isg/conferences/ORE2012/

Page 3: Evaluating DBOWL: A Non-materializing OWL Reasoner based ...ceur-ws.org/Vol-858/ore2012_paper3.pdf · Relational Algebra operations enable a user to specify basic retrieval re-quests

3

by R(A1, A2, . . . , An) is made up of a relation name R and a list of attributesA1, A2, . . . , An. Each attribute Ai is the name of a role payed by some do-main D in the relation schema R. D is called the domain of Ai and is denotedby dom(Ai). The degree (or arity) of a relation is the number of attributes nof its relation schema. A relation (or relation state) r of the relation schemaR(A1, A2, . . . , An), also denoted by r(R), is a mathematical relation of degreen on the domains A1, A2, . . . , An), which is a subset of the cartesian productof the domains that define R:

r(R) ⊆ (dom(A1)× dom(A2)× . . .× dom(An))

r(R) is defined more informally as a set of n-tuples r = {t1, t2, . . . , tm}. Eachn-tuple t is an ordered list of n values t =< v1, v2, . . . , vn >, where each value vi,1 ≤ i ≤ n, is an element of dom(Ai), or is a special NULL value. NULL is usedto represent the values of attributes that may be unknown or may not apply toa tuple. This notation is used in the rest of the paper.

The terms relation intension for the schema R and relation extensionfor a relation state r(R) are also commonly used. A relation is defined as a setof tuples. Mathematically elements of a set have no order among them.

2.2 The Relational Algebra

The basic set of operations for the relational model has an algebraic topology,and is known as the Relational Algebra. Operands in the Relational Algebraare Relations. Relational Algebra is closed with respect to the relational model:Each operation takes one or more relations and returns a relation. Given closureproperty, operations can be composed.

Relational Algebra operations enable a user to specify basic retrieval re-quests. The result of a retrieval is a new relation, which may have been formedfrom one or more relations. A sequence of relational algebra operations formsa relational algebra expression, the result of which will also be a relationthat represents the result of a database query (or retrieval request). Therefore,it is possible to assign a new relation name to a relational algebra expression, inorder to simplify its use by other relational algebra expressions. Such relationsare called idb (Intensional) relations, unlike the relations in R, which are callededb (Extensional) relations.

Operations in relational algebra can be divided into two groups: Set opera-tions from mathematical set theory (UNION (∪), INTERSECTION (∩), SETDIFFERENCE (\) and CARTESIAN PRODUCT (×)), and operations devel-oped specifically for relational databases (SELECT (σ), which selects a subsetof the tuples from a relation that satisfied a selection condition, PROJECT(π), which selects certain attributes from the relation and discard the other at-tributes, JOIN (◃▹), which combines related tuples from two relations into singletuples) among others.

Page 4: Evaluating DBOWL: A Non-materializing OWL Reasoner based ...ceur-ws.org/Vol-858/ore2012_paper3.pdf · Relational Algebra operations enable a user to specify basic retrieval re-quests

4

2.3 DBOWL Ontologies

In order to simplify the implementation of the reasoner, some restrictions areimposed on the OWL ontologies supported by DBOWL. Even so, the ontolo-gies supported by DBOWL are expressive enough for real application in theSemantic Web. DBOWL covers all of OWL 1 DL including inverse, transitiveand symmetric properties, cardinality restrictions, simple XML schema defineddatatypes and instance assertions. Enumerate classes (a.k.a, nominals) are onlypartially supported.

Let P and Q be properties, x be an individual and n be a positive number,class descriptions in DBOWL ontologies are formed according to the followingsyntax rule:

C, D → A (NamedClass) | ¬A (complementOf NamedClass) |C ⊓D (intersectionOf ClassDescriptions) |C ⊔D (unionOf ClassDescriptions) | ∀P.C (allV aluesFrom) |∃P.C (someV aluesFrom) | ∃P.{x} (hasV alue) |{x1, . . . , xn} (oneOf) | >= nP (minCardinality) | <= nP (maxCardinality)

Tbox Axiom DL syntax

SubClassOf A ⊑ BequivalentClasses A ≡ BSubPropertyOf P ⊑ QequivalentProperty P ≡ QdisjointWith A ⊑ ¬BinverseOf P ≡ Q−

transitiveProperty P+ ⊑ PsymmetricProperty P ≡ P−

functionalProperty ⊤ ⊑≤ 1PinverseFunctionalProperty ⊤ ⊑≤ 1P−

domain ≥ 1P ⊑ Arange ⊤ ⊑ ∀P.AAbox Axiom DL syntax

class instance A(x)property instance P (x, y)sameAs x1 ≡ x2

Table 1. DBOWL ontologies axioms

Table 1 shows the Tbox and Abox axioms for DBOWL ontologies. A andB are used for specifying Named Classes and C and D for specifying ClassDescriptions. DBOWL assumes that all individuals are different unless the on-tology includes an owl:sameAs assertion or you inferred it. This is important inreal applications with a large number of individuals where usually it is easierto specify if two individuals represent the same resource than which individualsare different to others. The following restrictions are imposed on the DBOWL

Page 5: Evaluating DBOWL: A Non-materializing OWL Reasoner based ...ceur-ws.org/Vol-858/ore2012_paper3.pdf · Relational Algebra operations enable a user to specify basic retrieval re-quests

5

ontologies. These restrictions are related more to the ontology syntax than tothe ontology expressivity:

1. In the Tbox, all OWL constructors are supported. However, class descrip-tions always appear in the ontology as an equivalence or as a superclass of aNamed Class. In an RDF/XML OWL ontology, class descriptions are alwaysinvolved in the definition of a Named Class.

2. In the Abox, only assertions of Named Classes and Property Names aresupported.

3. Only negation of Named Classes is allowed. Nevertheless, a negation of aclass description could be included in the ontology defining a Named Classas equivalent to a Class Description and negating this Named Class.

4. Properties’ domain and range must be Named Classes. In the same way, forasserting a complex property’s domain or range we must define a NamedClass as equivalent to a Class Description and use this Named Class asProperty domain or range.

5. Only disjointness of Named Classes is allowed. As in the previous cases, adisjointness of a class description could be included in the ontology defining aNamed Class as equivalent to a Class Description and disjoining this NamedClass.

3 DBOWL Theoretical Foundations

In this section we present the theoretical foundations of our approach to scal-able OWL reasoning. Although DBOWL is basically a Description Logic rea-soner, it has been designed as an OWL reasoner. This implies that not allthe DL inferences are supported. The main objective of DBOWL is to clas-sify instances in Named Classes and Properties. In order to do this, for eachNamed Class and Property in the ontology, a edb relation RA1(id), . . . , RAn(id),RP1(subject, object), . . . , RPm(subject, object) is defined, being n and m thenumber of Named Classes and Properties in the ontology respectively. Theserelations contain one tuple for each individual or pair of individual asserted asmember of such Named Class or Property.

3.1 Classification Function

In order to classify instances in Names Classes and Properties, we define a clas-sification function F (see table 2). This function takes as input a DBOWLproperty axiom, a DBOWL domain or range axiom (see table 1), or an axiom(A ≡ C), where C is a DBOWL class description and A is a Named Class in theontology or an auxiliary name. The function define a new idb relation by meansof a relational algebra expression, depending on the input type, or invoke thefunction with a new input.

For each Named Class in the ontology, a set of idb relations SAi0(id), . . . ,

SAik(id), i : 1 . . . n are defined. In the same way, for each Property in the ontology

Page 6: Evaluating DBOWL: A Non-materializing OWL Reasoner based ...ceur-ws.org/Vol-858/ore2012_paper3.pdf · Relational Algebra operations enable a user to specify basic retrieval re-quests

6

a set of idb relations SPj0(subject, object), . . . , SPjl

(subject, object), j : 1 . . .mare defined. The values of k and l depend on the number of axioms in the ontologyevolving Ci and Pj respectively.

Each SAix, x : 0 . . . k has the following features (similarly for each SPjx

):

– SAix= QAix

, where QAixis a relational algebra expression,

– SAi0= RAi ,

– SAi(x−1)always occurs in Qix , for x : 1 . . . k, and

– if SAjror SPjs

occur in Qix , they represent the last idb relation defined forAj and Pj respectively.

3.2 Knowledge base Creation

In order to create the DBOWL knowledge base, function F is evaluate iteratively,defining the corresponding idb relations, until no new tuples are generated, i.e.until a fixed-point is reached. (SAix

= SAi(x−1), i : 1, . . . , n, and SPjx

= SPj(x−1),

j : 1, . . . ,m).

In order to improve the efficiency of the evaluation, F is expressed as a com-position of four functions, i.e.

F = F1 ◦ F2 ◦ F3 ◦ F4, where,

– F1 takes as input only axioms such as P ⊑ Q, P ≡ Q, P ≡ Q−, P+ ⊑ P ,P ≡ P−

– F2 takes as input only axioms such as ≥ 1P ⊑ A, ⊤ ⊑ ∀P.A– F3 takes as input only axioms such as A ⊑ B, A ≡ B, A ≡ C⊓D, A ≡ C⊔D,

A ⊑ ∀P.C, A ≡ ∃P.C, A ≡ ¬B, A ≡ {v1, . . . , vn}, A ≡ ∃P.{v}– F4 takes as input only axioms such as A ≡ ∃P.{v}

The algorithm proceeds as follows:

1. F1 is evaluated iteratively, defining the corresponding idb relations, until nonew tuples are generated, i.e. until a fixed-point is reached. (SPjx

= SPj(x−1),

j : 1, . . . ,m).

2. F2 is evaluated defining the corresponding idb relations (SAix, i : 1, . . . , n).

3. F3 is evaluated iteratively defining the corresponding idb relations, until nonew tuples are generated, i.e. until a fixed-point is reached. (SAi(x+1)

= SAix,

i : 1, . . . , n).

4. F4 is evaluated defining the corresponding idb relations (SPj(x+1), j : 1, . . . ,m).

5. Steps from 1 to 4 are repeated until no new tuples are generated by step 4,i.e. until a fixed-point is reached (SPj(x+1)

= SPjx, j : 1, . . . ,m).

Page 7: Evaluating DBOWL: A Non-materializing OWL Reasoner based ...ceur-ws.org/Vol-858/ore2012_paper3.pdf · Relational Algebra operations enable a user to specify basic retrieval re-quests

7

F(P

⊑Q)

Anew

idbrelationSQ

iis

defi

ned

asπsubject,object(S

Q(i−

1))∪

πsubject,object(S

Pj).

F(P

≡Q)

Anew

idbrelationSQ

iis

defi

ned

asπsubject,object(S

Q(i−

1))∪

πsubject,object(S

Pj).

F(P

≡Q

−)

Anew

idbrelationSPiis

defi

ned

asπsubject,object(S

P(i−

1))∪

πobject,subject(S

Qj).

F(P

+⊑

P)

If(x,y)is

atuple

inSP(i−

1)and(y,z)is

alsoatuple

inSP(i−

1),then

anew

idbrelationSPiis

defi

ned

as

πsubject,object(S

P(i−

1))∪

(πsubject,object((SP(i−

1))◃▹

object=

subject(S

P(i−

1)))).

F(P

≡P

−)

Anew

idbrelationSPiis

defi

ned

asπsubject,object(S

P(i−

1))∪

πobject,subject(S

P(i−

1)).

F(≥

1P

⊑A)

Anew

idbrelationSA

iis

defi

ned

asπid((SA

(i−

1)))

∪πsubject((SPj)).

F(⊤

⊑∀P

.A)

Anew

idbrelationSA

iis

defi

ned

asπid(S

A(i−

1))∪

πobject(S

Pj).

F(A

⊑B)

Anew

idbrelationSB

iis

defi

ned

asπid(S

B(i−

1))∪

πid(A

j).

F(A

≡B)

Anew

idbrelationSB

iis

defi

ned

asπid(S

B(i−

1))∪

πid(A

j).

F(A

≡C

⊓D)

Anew

idbrelationSA

iis

defi

ned

asπid(S

A(i−

1))∪

(πid(F

(B≡

C))

∩πid(F

(B≡

D))).

F(A

⊑C

⊓D)

F(A

≡C),

F(A

≡D)

F(A

≡C

⊔D)

Anew

idbrelationSA

iis

defi

ned

asπid(S

A(i−

1))∪

πid(F

(B≡

C))

∪πid(F

(B≡

D)).

F(A

≡B

⊔C)

IfI≡

¬B,anew

idbrelationSX

isdefi

ned

asπid(S

Ai)∩

πid(S

Ij),

F(X

≡C)

F(A

⊑∀P

.C)

Anew

idbrelationSX

isdefi

ned

asπobject(S

Ai◃▹

id=subjectSPj).

F(X

≡C)

F(A

≡∃P

.C)

Anew

idbrelationSA

iis

defi

ned

asπid(S

A(i−

1))∪

πid(F

(X≡

C))

◃▹id

=objectSPj).

F(A

⊑∃P

.C)

IfP

isafunctionalproperty,anew

idbrelationSX

isdefi

ned

asπsubject(S

Ai◃▹

id=subjectSPj).

F(X

≡C)

F(A

≡¬B)

IfB

≡¬I,anew

idbrelationSA

iis

defi

ned

asπid(S

A(i−

1))∪

πid(S

Ij)

F(A

≡¬B)

IfI≡

B⊔C),

anew

idbrelationSX

isdefi

ned

asπid(S

Ai)∩

πid(S

Ij),

F(X

≡C)

F(A

≡¬B)

IfI≡

¬A,anew

idbrelationSB

iis

defi

ned

asπid(S

B(i−

1))∪

πid(S

Ij)

F(A

≡{v

1,...,v

n})

Anew

edbrelationT(id)is

defi

ned

wherer(T)=

{t1,...,tn}andt i

=v i,i:1...n

.Then

anew

idbrelation

SA

iis

defi

ned

asπid(S

A(i−

1))∪

πid(T

).F(A

≡≤

nP)

if(x,y

i),

i:1..n

are

instancesofP,andtheyiare

alldifferen

t,anew

edbrelationT(id)is

defi

ned

where

r(T)=

{t}andt=

x.Then

anew

idbrelationSA

iis

defi

ned

asπid(S

A(i−

1))∪

πid(T

).F(A

≡∃P

.{v})

Anew

idbrelationSA

iis

defi

ned

asπid(S

A(i−

1))∪

πsubject(σ

object=

v(S

Pj)).

F(A

≡∃P

.{v})

Anew

idbrelationSPjis

defi

ned

asπsubject,object(S

P(j−

1))∪

πid

,v(π

id(S

Aj)).

Table

2.DBOW

LClassificationFunction

Page 8: Evaluating DBOWL: A Non-materializing OWL Reasoner based ...ceur-ws.org/Vol-858/ore2012_paper3.pdf · Relational Algebra operations enable a user to specify basic retrieval re-quests

8

4 DBOWL Advantages and Limitations

DBOWL is implemented using Oracle 10g as Relational Database ManagementSystems. edb relations are tables in the database while idb relations are SQLviews. A view in SQL terminology is a single table that is derived from othertables [4]. A view does not necessarily exists in physical form; it is considered avirtual table (non-materialized). The query defining the view is evaluated whenneeded. Then, once the knowledge base is created, for each Named Class andProperty in the ontology there is a SQL view which defines the set of tuples(asserted and inferred) belonging to such Named Class or Property.

In order to query the knowledge base, SPARQL queries are re-written interms of the SQL views and evaluated on the database. The names of the NamedClasses and Properties involved in the query are changed by the correspondingSQL view name. Note that the queries defining the views are evaluated whenthe SPARQL query is evaluated. Thus, the inferred instances are not material-ized in the database. Only some intermediate results are physically stored in thedatabase, like the results of the transitive function. This non-materialized ap-proach allows us to deal with billions of instances without the need of very largestorage repositories. However, the main advantage of this approach is regardingupdates. The non-materialization of the inferred instances permits the supportof low-cost updates, as well as the possibility of implementing incremental rea-soning algorithms.

Another important feature of DBOWL is the management of the owl:sameAsstatement. At the end of each loop of the algorithm for the knowledge basecreation, those individuals related by the owl:sameAs statement are includedin the SQL views. DBOWL obtains the individuals related by the owl:sameAsstatement as: (1) Those individuals explicitly asserted as (x sameAs y); (2)By means of functional and inverse functional properties; (3) By means of themaxCardinality to 1 restriction.

DBOWL is complete with respect to the DBOWL knowledge base and theimplemented functions, classifying all instances in Named Classes and Propertiescorrectly. However, it presents some limitations:

As DBOWL separates Tbox and Abox reasoning, some inferences with nom-inals are lost. Fortunately, this information is not relevant for DBOWL becausethe objective of DBOWL is to classify instances in Named Classes and theseinferences do not generate additional information for classification of instances.

DBOWL presents a problem regarding the open-world semantics of a DLAbox, which implies that an Abox has several models. The problem of exploringall the possible models in DBOWL is not trivial, even so, as DBOWL supportsa large number of instances, it is logical to think that it could be very inefficient.Nevertheless, we plan to study how to provide a (partial) solution to this problemin the future.

Currently updates are not efficiently supported in DBOWL.Finally, consistency checking of the knowledge base is not completely sup-

ported. Currently only the inconsistency caused by the classification of the sameinstance into (or the assertion of the same instance as member of) two disjoint

Page 9: Evaluating DBOWL: A Non-materializing OWL Reasoner based ...ceur-ws.org/Vol-858/ore2012_paper3.pdf · Relational Algebra operations enable a user to specify basic retrieval re-quests

9

Fig. 1. Number of instances for each UOBM query

classes, and by the classification of one instance in a unsatisfiable class are im-plemented.

5 DBOWL Evaluation

In order to demonstrate practically the completeness of DBOWL we use theUOBM (University Ontology Benchmark) [2], a well known benchmark to com-pare repositories in the Semantic Web. This benchmark is intended to evaluatethe performance of OWL repositories with respect to extensional queries overa large data set that commits to a single realistic ontology. Furthermore, thebenchmark evaluates the system completeness and soundness with respect tothe queries defined. This benchmark provides three OWL-DL ontologies, i.e. a20, 100 and 200 Megabytes ontologies and the query results for each one. Thisexperiment is conducted on a VMWARE virtual machine (one for each tool)with 8192 MB memory, running on a Windows XP 64 bits professional and javaruntime environment build 1.6.0 14− b08.

We evaluated the UOBM-DL queries for the 20, 100 and 200 Megabytesontologies in DBOWL and obtained the correct results for all queries. Figure1 presents the results for each ontology and for each query. As we can see,some DBOWL results are marked in a different color. This is because DBOWLand UOBM return different results for queries 11, 13 and 15. We checked theUOBM results for these queries and we believe that they are incorrect. Forquery 11 DBOWL returns more results than UOBM. In the case of queries 11and 15, it is because several owl:sameAs relationships between some UOBMindividuals can be inferred. Therefore, these individuals should be in the result.In the case of query 13, it is because the UOBM result includes instances ofall departments, but query 13 asks only for instances in department0. Figure 2presents the response times for the UOBM-DL 200 Megabytes ontology.

We have also evaluated DBOWL using the standard datasets provided inthe context of the ORE 2012 workshop. These datasets include a set of state

Page 10: Evaluating DBOWL: A Non-materializing OWL Reasoner based ...ceur-ws.org/Vol-858/ore2012_paper3.pdf · Relational Algebra operations enable a user to specify basic retrieval re-quests

10

Fig. 2. Response times for UOBM-DL 200M ontology

of the art ontologies in OWL 2 language, both in RDF/XML and Functionalsyntax. and they are organised by reasoning services, i.e. Classification, Classsatisfiability, Ontology satisfiability, Logical entailment and non entailment andInstance retrieval. DBOWL uses Pellet in order to classify the ontology Tboxand to check the class satisfiability. Therefore, datasets corresponding to thesereasoning services are not included in our evaluation. As the main objective ofDBOWL is to classify instances in Named Classes and Properties, we evalu-ate DBOWL using the Instance Retrieval test cases. Some of the ontologies inthese datasets present unsatisfiable classes. We use these ontologies to test thebehavior of DBOWL in such cases. However, the total time taken to load andtest the satisfiability of one ontology and the satisfiability result is reported byPellet. DBOWL only stores in the database the classes that Pellet returns asunsatisfiable. When DBOWL classifies an instance in a unsatisfiable class, it re-turns that the ontology is inconsistent, via a simple SQL query to the relationaldatabase. Thus, the performance of the ontology classification reasoning servicefalls on Pellet. Finally, as DBOWL is an OWL-DL reasoner, we use the OWL-DLInstance Retrieval test case for the evaluation.

This experiment has been carried out in two phases. In the first step, weloaded the nine ontologies in DBOWL. Most of the ontologies could not beloaded due to different problems: (1) Some ontologies are not valid DBOWLontologies (see section 2.3). Information 397.owl, minswap.owl, and people.owldefine complex property’s domains or range (different from named Classes). In-formation 397.owl and people2.owl contain complex Abox assertions (differentfrom Named Classes assertions). Finally, obi.owl cannot be classified by Pelletbecause of memory problems. (2) DBOWL presented some problems dealing withunsatisfiable classes, because the storage and management of the class Nothingwas not completely implemented. (3) DBOWL presented some problems regard-ing the management of the namespaces.

Page 11: Evaluating DBOWL: A Non-materializing OWL Reasoner based ...ceur-ws.org/Vol-858/ore2012_paper3.pdf · Relational Algebra operations enable a user to specify basic retrieval re-quests

11

Fig. 3. Results for OWL-DL Instance Retrieval dataset

In the second step, we solved the aforementioned problems and we loadedeight of the nine ontologies in DBOWL (obi.owl could not be loaded becauseit presented a problem with Pellet) and we obtained the corrected result for allof them. We followed the guidelines outlined in Section 2.3 in order to convertthe ontologies into DBOWL ontologies. Figure 3 summarizes the results of theevaluation. Load time includes Tbox classification (Pellet), database creation,ontology storage and knowledge base creation (instances classification).

6 Conclusions

From the evaluation we extract some general conclusions. To the best of ourknowledge, DBOWL is the only OWL reasoner able to deal with the threeUOBM-DL ontologies obtaining the correct results for all queries in all cases.Furthermore, this allows us to check the UOBM results for queries 11, 13 and15 and to conclude that they are incorrect. Finally, DBOWL response times arevery good the highest one being 0.328 seconds for the UOBM 200MB ontol-ogy. The results obtained with both evaluations suggest that DBOWL is a realcomplement to current OWL reasoners. Currently, DBOWL supports ontologieswith much bigger Aboxes than traditional systems based on description logicand satisfiability. This is especially important for some applications such as lifesciences, where particularly large ontologies are used. The datasets provided inthe context of the ORE 2012 workshops have allowed us to improve DBOWLin several ways. Thus, the latest version of DBOWL is able to deal with alltypes of namespaces, to control when a class is non-satisfiable and to check theontology consistency in such a case. Furthermore, we empirically test that therestrictions imposed on the DBOWL ontologies are not a problem for developingreal ontologies, because any ontology can be converted to a DBOWL ontology,keeping the ontology expressivity. With respect to instance retrieval, DBOWL is

Page 12: Evaluating DBOWL: A Non-materializing OWL Reasoner based ...ceur-ws.org/Vol-858/ore2012_paper3.pdf · Relational Algebra operations enable a user to specify basic retrieval re-quests

12

able to obtain the same results as the expected result provided by the OWL-DLInstance Retrieval dataset, suggesting that the DBOWL classification functionsand the algorithm for creating the knowledge base work well.

The use of a relational database to store the ontologies implies that the timefor loading an ontology in DBOWL can be longer than the load time in main-memory reasoners. The advantage of our approach is that, once the knowledgebase is created, the query time is really small. Furthermore, as the knowledgebase is persistent, you can query it at any moment without creating it again.Although other approaches also provide solutions for instance retrieval, theypresent some problems regarding reasoning expressivity or response query times.SHER 6, is a platform developed by IBM which supports sound and completereasoning for the fragment of OWL 1 DL without nominals. SHER adopts amodularisation-based approach in which the ontology breaks into small partsand is reasoned with a DL reasoner in the main memory. After the reasoningprocedure is finished, the corresponding axioms are stored in the database. Rea-soning with instances is performed at query time. Oracle 11g 7 is the lasterversion of the extremely well known RDMS Oracle. Oracle 11g includes a nativeinference engine able to handle a subset of OWL called OWLPrime which coverspart of OWL Lite and a little part of OWL 1 DL. It also supports querying ofRDF/OWL data using SPARQL-like graph patterns embedded in SQL.

As for future work, we are studying some optimisation techniques (such asdatabase indexes, parallel computation and incremental reasoning) in order toimprove the response times of the queries. We also are studying the possibilityof incorporating other OWL reasoners different from Pellet, in DBOWL. Theidea is to select the most convenient OWL reasoner depending on the ontologyexpressivity and size.

7 Acknowledgements

This work is supported by the Project Grant TIN2011-25840 (Spanish Ministryof Education and Science) and P11-TIC-7529 (Innovation, Science and Enter-prise Ministry of the regional government of the Junta de Andalucıa).

References

1. Ullrich Hustadt , Boris Motik , Ulrike Sattler. Reasoning in Description Logics bya Reduction to Disjunctive Datalog. Journal of Automated Reasoning, v.39 n.3,p.351-384, October 2007.

2. Ma, L; Yang, Y; Qiu, Z; Xie, G; Pan, Y. Towards A Complete OWL OntologyBenchmark. In. Proc. of the 3rd European Semantic Web Conference (ESWC 2006).

3. Codd, E. A relational Model for Large Shared Data Banks, CACM, 13:6, june 1970.4. Abiteboul, S., Hull, R., Vianu, V. Foundations of Databases. Addison-Wesley Pub-

lishing Company. 1995.

6 http://domino.research.ibm.com/comm/research projects.nsf/pages/iaa.index.html7 http://www.oracle.com