Page 1: Query Translation for Ontology-extended Data Sources

July 23, 2007, Semantic e-Science Workshop @AAAI 2007, Vancouver, Canada

Iowa State University, Department of Computer Science, Artificial Intelligence Research Laboratory

Query Translation for Ontology-extended Data Sources

Jie Bao (1), Doina Caragea (2), Vasant Honavar (1)

(1) Artificial Intelligence Research Laboratory, Department of Computer Science, Iowa State University, Ames, IA 50011-1040, USA
    {baojie, honavar}@cs.iastate.edu

(2) Department of Computing and Information Sciences, Kansas State University, Manhattan, KS 66506, USA
    {dcaragea}@ksu.edu

Page 2: Query Translation for Ontology-extended Data Sources

INDUS Group

Vasant Honavar, Jie Bao, Doina Caragea

Jyotishman Pathak, Neeraj Koul

Page 3: Query Translation for Ontology-extended Data Sources

Outline

• Ontology-Extended Data Source
  – Schema, Data, and Ontology

• Query Translation for OEDS
  – Ontology mapping, query translation / soundness / completeness

• Implementation and Optimization
  – The INDUS system

• Conclusion

Page 4: Query Translation for Ontology-extended Data Sources

Background

Data revolution
• Bioinformatics
  – Over 200 data repositories of interest to molecular biologists alone
• Environmental Informatics
• Enterprise Informatics
• Medical Informatics
• Social Informatics ...

Connectivity revolution (Internet and the web)

Integration revolution
• Need to understand the elephant as opposed to examining the trunk, the tail, etc.

Needed – infrastructure to support collaborative, integrative analysis of data

Page 5: Query Translation for Ontology-extended Data Sources

Solution: INDUS for Learning from Semantically Heterogeneous Distributed Autonomous Data Sources

Page 6: Query Translation for Ontology-extended Data Sources

(Relational) Data Source

D: Data Set (Extensional Definition, Facts)

Student
name    status
Alice   First-year
Bob     MSc

Classes
code    name
CS103   data structure
CS511   algorithm

Registers
instructor  class
Alice       CS103
Bob         CS511

S: Schema (Intensional Definition)

[Diagram: entities Faculty (name:String, rank:String), Classes (code:String, name:String) and Student (name:String, status:String), connected by the relationships Teaches and Registers]

Page 7: Query Translation for Ontology-extended Data Sources

Semantic Extensions of Data Sources

• Return classes that graduate students are registered in?
• Return all people in the database?

[Figure: question marks link the two queries to the data set D (tables Student, Classes, Registers) and schema S shown on the previous slide]

Page 8: Query Translation for Ontology-extended Data Sources

Ontology-Extended Data Source

[Figure: schema with entities Instructor (name:String, rank:String), Classes (code:String, name:String) and Student (name:String, status:String), connected by the relationships Teaches and registers]

People
  Student
  Instructor

Student
name    status
Alice   First-year
Bob     MSc

student
  Undergrad
    First-year
    …
    Fourth-year
  Graduate
    MSc
    MA
    PhD

Page 9: Query Translation for Ontology-extended Data Sources

Ontology-Extended Data Source

[Diagram: components D (Data Set), S (Schema), $O_S$ (Schema Ontology) and $O_D$ (Data Content Ontology), with further ontologies $O'_S$ and $O'_D$]

Page 10: Query Translation for Ontology-extended Data Sources

Ontology-Extended Data Source

• Relational Model (Reiter, 1982)
  – Schema S: a first-order language with predicate symbols $R_S$, one for each relational table (e.g. Classes, Faculty)
  – Data Set D: a first-order interpretation of S with domain $\Delta$

• Ontology-Extended (Relational) Data Source (Caragea et al. 2004): extends the relational model with
  – Schema Ontology: a first-order language $L_{O_S}$ with predicate symbols $R_{O_S}$, where $R_S \subseteq R_{O_S}$
  – Data Content Ontology: $O_D = (L_{O_D}, D_{O_D})$
    • $L_{O_D}$: a first-order language with predicate symbols $R_{O_D}$, where $R_{O_D} \cap R_S = \emptyset$
    • $D_{O_D}$: a first-order interpretation of $L_{O_D}$ with domain $\Delta'$, where $\Delta \subseteq \Delta'$

Page 11: Query Translation for Ontology-extended Data Sources

OEDS: Example

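S: Instructor(x,y); Classes(x,y), Student(x,y) …

[Figure: schema diagram with entities Instructor (name:String, rank:String), Classes (code:String, name:String) and Student (name:String, status:String), connected by the relationships Teaches and registers]

D:
Student
name    status
Alice   First-year
Bob     MSc

$L_{O_S}$:
  $\forall x, y,\ \text{Student}(x,y) \lor \text{Instructor}(x,y) \rightarrow \text{People}(x)$

$O_D$ ($L_{O_D}$, $D_{O_D}$):
  $\text{isa}(x,y) \land \text{isa}(y,z) \rightarrow \text{isa}(x,z)$
  isa(First-year, Undergraduate)
  isa(Undergraduate, Student)
  isa(MSc, Graduate)
  …

see survey [Shvaiko & Euzenat 2005]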

Page 12: Query Translation for Ontology-extended Data Sources

Outline

• Ontology-Extended Data Source
  – Schema, Data, and Ontology

• Query Translation for OEDS
  – Ontology mapping, query translation / soundness / completeness

• Implementation and Optimization
  – The INDUS system

• Conclusion

Page 13: Query Translation for Ontology-extended Data Sources

Query

• Tuple Relational Calculus (TRC)
  – Tuple: a multiset of attributes
  – TRC $\equiv$ Relational Algebra
  – $q(t) := \text{Student}(t) \land (t.\text{status} = \text{"Graduate"})$

• Ontology-Extended Tuple Relational Calculus
  – $q(t) := \text{Student}(t) \land \text{isa}(t.\text{status}, \text{Graduate})$

We focus on data content ontologies in this talk
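
To make the difference between the plain and the ontology-extended query concrete, here is a minimal Python sketch that evaluates both over the Student table from the earlier slides. The names `ISA_PARENT`, `isa`, `plain_query` and `ontology_query` are illustrative assumptions, as is the choice to treat isa as reflexive; none of this is taken from the INDUS code base.

```python
# Student table from the slides (name, status).
STUDENT = [
    {"name": "Alice", "status": "First-year"},
    {"name": "Bob", "status": "MSc"},
]

# Direct isa edges (child -> parent) of the data content ontology.
# ASSUMPTION: a tree-shaped hierarchy, so each term has at most one parent.
ISA_PARENT = {
    "First-year": "Undergrad",
    "Fourth-year": "Undergrad",
    "MSc": "Graduate",
    "MA": "Graduate",
    "PhD": "Graduate",
    "Undergrad": "Student",
    "Graduate": "Student",
}


def isa(term, ancestor):
    """Reflexive, transitive isa test: walk up the hierarchy from term."""
    while term is not None:
        if term == ancestor:
            return True
        term = ISA_PARENT.get(term)
    return False


def plain_query(rows):
    """q(t) := Student(t) and t.status = "Graduate" (plain TRC)."""
    return [t for t in rows if t["status"] == "Graduate"]


def ontology_query(rows):
    """q(t) := Student(t) and isa(t.status, Graduate) (ontology-extended TRC)."""
    return [t for t in rows if isa(t["status"], "Graduate")]
```

With this data the plain query returns no rows, because no tuple stores the literal value "Graduate", while the ontology-extended query returns Bob, whose status MSc lies below Graduate in the hierarchy.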

Page 14: Query Translation for Ontology-extended Data Sources

Query Translation

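[Diagram: the user poses query q in terms of the User Ontology $O_1$; the Ontology Mapping M relates $O_1$ to the Data Source Ontology $O_2$ and is used to translate q into q', which is posed against the data source (D, S)]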

Page 15: Query Translation for Ontology-extended Data Sources

Ontology Mapping

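$\text{isa}_1(x, c_1) \land \text{into}(c_1, c_2) \rightarrow \text{isa}_2(x, c_2)$

$\text{isa}_1(c_1, x) \land \text{onto}(c_1, c_2) \rightarrow \text{isa}_2(c_2, x)$

……

[Figure: two status hierarchies, one under $\text{isa}_1$ and one under $\text{isa}_2$, linked by mapping edges labelled into, onto and equ]

Student
  Undergrad
    First-year
    …
    Fourth-year
  Graduate
    MSc
    MA
    PhD

Student
  Undergrad
    Freshman
    …
  Postgraduate
    Master
    Doctoral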
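
The two mapping rules on this slide can be read as a small inference step: given isa facts in the first ontology plus into/onto mapping edges, they derive isa facts in the second ontology. The Python sketch below is a direct transcription of the two rules over sets of pairs. The function name `apply_mapping_rules` and the concrete mapping edges in the example are hypothetical, chosen only for illustration, since the edges of the original figure are not recoverable.

```python
def apply_mapping_rules(isa1, into, onto):
    """Apply both mapping rules once and return the derived isa2 facts.

    isa1, into and onto are sets of (left, right) pairs.
    Rule 1: isa1(x, c1) and into(c1, c2)  ->  isa2(x, c2)
    Rule 2: isa1(c1, x) and onto(c1, c2)  ->  isa2(c2, x)
    """
    derived = set()
    for x, c1 in isa1:
        for source, c2 in into:
            if source == c1:
                derived.add((x, c2))  # rule 1
    for c1, x in isa1:
        for source, c2 in onto:
            if source == c1:
                derived.add((c2, x))  # rule 2
    return derived


# Hypothetical example: one isa1 fact and two assumed mapping edges.
ISA1 = {("Master", "Postgraduate")}
INTO = {("Postgraduate", "Graduate")}  # ASSUMPTION: Postgraduate maps into Graduate
ONTO = {("Master", "MSc")}             # ASSUMPTION: Master maps onto MSc

DERIVED = apply_mapping_rules(ISA1, INTO, ONTO)
```

Under these assumed edges the sketch derives isa2(Master, Graduate) from rule 1 and isa2(MSc, Postgraduate) from rule 2.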

Page 16: Query Translation for Ontology-extended Data Sources

Query Translation

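[Diagram: query q over ontology $O_1$ is translated through mapping M into query q' over ontology $O_2$ of the data source (D, S)]

q:  $\text{Student}(t) \land \text{isa}_1(t.\text{status}, \text{Master})$

q': $\text{Student}(t) \land \text{isa}_2(t.\text{status}, \text{Graduate})$
    $\text{Student}(t) \land \text{isa}_2(t.\text{status}, \text{MSc})$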

Page 17: Query Translation for Ontology-extended Data Sources

Soundness, Completeness and Exactness

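Sound Translation:    $\{q'\} \subseteq \{q\}$
Complete Translation: $\{q'\} \supseteq \{q\}$
Exact Translation:    $\{q\} = \{q'\}$

q := $\text{Student}(t) \land \text{isa}_1(t.\text{status}, \text{Master})$

Sound:    q' := $\text{Student}(t) \land \text{isa}_2(t.\text{status}, \text{MSc})$
Complete: q' := $\text{Student}(t) \land \text{isa}_2(t.\text{status}, \text{Graduate})$
Exact:    Non-existent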
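
The three notions compare answer sets, so they can be checked mechanically once the intended answers are known. The Python sketch below does this for the Master example. It rests on loudly labelled assumptions: the rows for Carol and Dave are hypothetical, and `MASTER_DENOTES` is an invented ground truth stating which data source statuses the user's term Master is meant to cover. The helper names `isa2`, `answers` and `classify` are likewise illustrative.

```python
# Data source ontology O2: direct isa2 edges (child -> parent).
ISA2_PARENT = {
    "First-year": "Undergrad", "Fourth-year": "Undergrad",
    "MSc": "Graduate", "MA": "Graduate", "PhD": "Graduate",
    "Undergrad": "Student", "Graduate": "Student",
}

# Rows (name, status). Carol and Dave are hypothetical additions.
STUDENT = [("Alice", "First-year"), ("Bob", "MSc"),
           ("Carol", "MA"), ("Dave", "PhD")]

# ASSUMPTION: the O2 statuses that the user's O1 term "Master" really denotes.
MASTER_DENOTES = {"MSc", "MA"}


def isa2(term, ancestor):
    """Reflexive, transitive isa test in O2."""
    while term is not None:
        if term == ancestor:
            return True
        term = ISA2_PARENT.get(term)
    return False


def answers(condition):
    """Names of the students whose status satisfies the condition."""
    return {name for name, status in STUDENT if condition(status)}


def classify(translated, original):
    """Compare the answer set of a translation with that of the original query."""
    if translated == original:
        return "exact"
    if translated <= original:
        return "sound"
    if translated >= original:
        return "complete"
    return "neither"


q = answers(lambda s: s in MASTER_DENOTES)            # intended answers {q}
q_msc = answers(lambda s: isa2(s, "MSc"))             # q' with isa2(t.status, MSc)
q_graduate = answers(lambda s: isa2(s, "Graduate"))   # q' with isa2(t.status, Graduate)
```

Under these assumptions the MSc translation misses Carol but returns nothing wrong, so it is sound; the Graduate translation also returns Dave, so it is complete but not sound; neither is exact.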

Page 18: Query Translation for Ontology-extended Data Sources

Most Informative Translation

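[Figure: term $c_1$ in $O_1$ is mapped onto terms $d_1$ and $d_2$ in $O_2$]

q := $\text{isa}_1(x, c_1)$

Find its sound translation(s):
  $\text{isa}_2(x, d_1)$
  $\text{isa}_2(x, d_2)$

LUB (least upper bound):
  $\text{isa}_2(x, d_1) \lor \text{isa}_2(x, d_2)$   Most informative sound translation!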
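
One way to read this slide is that every onto edge leaving c1 gives a sound translation, and their disjunction is the broadest sound translation available. The Python sketch below computes the disjuncts and drops any that lie below another disjunct, since those add no answers. It considers only onto edges that leave c1 directly, as in the figure; the function names and the extra term `d3` in the example are hypothetical.

```python
def is_descendant(term, ancestor, parent):
    """True if term lies strictly below ancestor in the hierarchy."""
    node = parent.get(term)
    while node is not None:
        if node == ancestor:
            return True
        node = parent.get(node)
    return False


def most_informative_sound(c1, onto, isa2_parent):
    """Return the O2 terms whose disjunction translates isa1(x, c1) soundly.

    onto is a set of (o1_term, o2_term) pairs; isa2_parent maps each O2
    term to its parent. Only maximal targets are kept, because a disjunct
    below another disjunct is redundant.
    """
    targets = {d for (c, d) in onto if c == c1}
    return sorted(
        d for d in targets
        if not any(is_descendant(d, other, isa2_parent) for other in targets)
    )


# Example from the slide, plus a hypothetical d3 placed below d1.
ONTO = {("c1", "d1"), ("c1", "d2"), ("c1", "d3")}
ISA2_PARENT = {"d3": "d1"}

DISJUNCTS = most_informative_sound("c1", ONTO, ISA2_PARENT)
```

Here the result is the pair d1, d2, corresponding to the translation isa2(x, d1) or isa2(x, d2); d3 is dropped because it is already covered by d1.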

Page 19: Query Translation for Ontology-extended Data Sources

Query Translation Rules

For hierarchical ontologies

(similarly for complete translation of complex queries)

Atomic conditions

Complex conditions

GLB=greatest lower bound, LUB=least upper bound

Page 20: Query Translation for Ontology-extended Data Sources

Outline

• Ontology-Extended Data Source
  – Schema, Data, and Ontology

• Query Translation for OEDS
  – Ontology mapping, query translation / soundness / completeness

• Implementation and Optimization
  – The INDUS system

• Conclusion

Page 21: Query Translation for Ontology-extended Data Sources

INDUS Tools

Ontology Editor

Schema Editor

Mapping Editor

Data Editor

Query Engine and Interface

Page 22: Query Translation for Ontology-extended Data Sources

INDUS – Mapping Editor

http://sourceforge.net/projects/indus-project/

Page 23: Query Translation for Ontology-extended Data Sources

INDUS – Data Editor

http://sourceforge.net/projects/indus-project/

Page 24: Query Translation for Ontology-extended Data Sources

INDUS – Query Editor

http://sourceforge.net/projects/indus-project/

Page 25: Query Translation for Ontology-extended Data Sources

Optimization for Scalability

• Database storage for ontologies
• Using transitive closure for fast inference with hierarchies
• Server-side caching
  – Using temporary tables on the data source server
• Client-side caching
  – Of remote ontologies and ontology mappings
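
The transitive-closure idea can be sketched in a few lines: precompute every (descendant, ancestor) pair once, so that each later isa test is a single set lookup instead of a walk up the hierarchy. The Python version below works in memory; the slides describe storing ontologies in a database, so this shows only the principle, and the function name `transitive_closure` is an assumption.

```python
def transitive_closure(edges):
    """Return all (descendant, ancestor) pairs implied by direct isa edges.

    edges is an iterable of (child, parent) pairs. A term may have several
    parents, and the visited set guards against cycles.
    """
    parents = {}
    for child, parent in edges:
        parents.setdefault(child, set()).add(parent)

    closure = set()
    for start in parents:
        stack = list(parents[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            closure.add((start, node))
            stack.extend(parents.get(node, ()))
    return closure


EDGES = [
    ("MSc", "Graduate"), ("Graduate", "Student"),
    ("First-year", "Undergrad"), ("Undergrad", "Student"),
]
CLOSURE = transitive_closure(EDGES)

# After precomputation, an isa test is a membership check.
bob_is_student = ("MSc", "Student") in CLOSURE
```

The trade-off is space for time: the closure can be much larger than the edge list for deep hierarchies, which is one reason to keep it in database tables rather than in memory.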

Page 26: Query Translation for Ontology-extended Data Sources

Performance

• O1: Enzyme Classification (EC) hierarchy (4,564 terms)
• M: SCOP to EC mapping [Richard George et al.] with 15,765 rules
• O2: SCOP (Structural Classification of Proteins) hierarchy (86,766 terms)

[Figure: Client and Server connected over the Internet, with data set D]

Page 27: Query Translation for Ontology-extended Data Sources

Performance

Page 28: Query Translation for Ontology-extended Data Sources

Outline

• Ontology-Extended Data Source
  – Schema, Data, and Ontology

• Query Translation for OEDS
  – Ontology mapping, query translation / soundness / completeness

• Implementation and Optimization
  – The INDUS system

• Conclusion

Page 29: Query Translation for Ontology-extended Data Sources

Conclusion

• We have studied the query translation process for relational data sources extended with context-specific data content ontologies:
  – how to exploit ontologies and mappings for flexibly querying semantically rich data sources;
  – a query translation strategy that works for hierarchical ontologies;
  – the conditions under which the soundness and completeness of such a procedure can be guaranteed.

• Ongoing Work
  – More expressive ontologies, e.g., Description Logics
  – Schema ontology + data content ontology
  – Statistical learning from OEDS

Page 30: Query Translation for Ontology-extended Data Sources

Thank You!

Page 31: Query Translation for Ontology-extended Data Sources

Semantics Preserving Translation

• Conservative Extension