Date : 2012/09/20 Author : Sina Fakhraee, Farshad Fotouhi Source : KEYS’12 Speaker : Er-Gang Liu Advisor : Dr. Jia-ling Koh 1


Date : 2012/09/20

Author : Sina Fakhraee, Farshad Fotouhi

Source : KEYS’12

Speaker : Er-Gang Liu

Advisor : Dr. Jia-ling Koh


Outline

• Introduction
• DBSemSXplorer’s System Architecture
• Relational database to RDF
• Knowledge Base
• Extracting an ontology from the database schema
• Query Keyword to Knowledgebase Resource Mapper
• Keyword Mapping Techniques
• SPARQL Query Construction
• Experiment
• Conclusion


Introduction

Information need: the movie(s) both directed by and starring George Clooney.

Keyword query (AND): “Movie” AND “directed” AND “starred” AND “George” AND “Clooney”

Keyword query (OR): “Movie” OR “directed” OR “starred” OR “George” OR “Clooney”

Database (e.g. IMDB): Not found


Introduction

Information need: the movie(s) both directed by and starring George Clooney.

Database (e.g. IMDB): Director / Actor

Search Result:

Movie: Leatherheads

Actor: George Clooney

Director: George Clooney


The UI of the System


System Architecture

1. Knowledge Base Generation
2. Keyword to Knowledgebase Resource Mapper
3. SPARQL Query Construction


Knowledge Base

RDB → RDF


Extracting an ontology from the database schema

Movie

Movie ID | Title | DirectorID
1 | Ocean’s Twelve | 001
2 | Star Trek | 004
3 | Leatherheads | 003
4 | The Terminator | 002

Directors

DirectorID | Direct
001 | Soderbergh
002 | James Cameron
003 | George Clooney
004 | J.Abrams

Scan each table row by row. Each row, which is uniquely identified by its primary key, becomes the subject of a triple; each remaining column becomes a predicate, and the cell value becomes the object.

Subject | Predicate | Object
Movie, 1 | Movie:Title | Ocean’s Twelve
Movie, 1 | Movie:DirectorID | Director, 001
Director, 001 | Directors:Direct | Soderbergh
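The row-by-row scan can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system's actual implementation: the function name `rows_to_triples`, the `foreign_keys` parameter, and the string form of subjects (`"Movie,1"`) are all hypothetical. A foreign-key value is emitted as a reference to the row it points at, mirroring the way the DirectorID of movie 1 points at director 001.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]


def rows_to_triples(
    table: str,
    primary_key: str,
    rows: List[Dict[str, str]],
    foreign_keys: Optional[Dict[str, str]] = None,
) -> List[Triple]:
    """Emit one triple per non-key cell: (table,key) -- table:column --> value."""
    foreign_keys = foreign_keys or {}
    triples: List[Triple] = []
    for row in rows:
        # The primary key identifies the row, so it names the subject.
        subject = f"{table},{row[primary_key]}"
        for column, value in row.items():
            if column == primary_key:
                continue
            # Assumption: a foreign key becomes a link to the referenced row.
            if column in foreign_keys:
                value = f"{foreign_keys[column]},{value}"
            triples.append((subject, f"{table}:{column}", value))
    return triples


movies = [
    {"ID": "1", "Title": "Ocean's Twelve", "DirectorID": "001"},
    {"ID": "2", "Title": "Star Trek", "DirectorID": "004"},
    {"ID": "3", "Title": "Leatherheads", "DirectorID": "003"},
    {"ID": "4", "Title": "The Terminator", "DirectorID": "002"},
]
directors = [
    {"DirectorID": "001", "Direct": "Soderbergh"},
    {"DirectorID": "002", "Direct": "James Cameron"},
    {"DirectorID": "003", "Direct": "George Clooney"},
    {"DirectorID": "004", "Direct": "J.Abrams"},
]

triples = rows_to_triples("Movie", "ID", movies, {"DirectorID": "Directors"})
triples += rows_to_triples("Directors", "DirectorID", directors)
```

Each Movie row yields two triples (title and director link) and each Directors row yields one, so the two example tables produce twelve triples in total.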


Keyword Mapping Techniques

Q = {Terminator, Stars}

• Terminator → “The Terminator”
• Stars → “Star Trek”, “Actors”


Keyword Mapping Techniques

Q = {Terminator, Stars}

Terminator / Stars → The Terminator / Actors (by similarity)

• Semantic mapping (WordNet)
• Levenshtein distance (edit distance)

• Compute the syntactic similarity between each (1) term and (2) phrase of the query and each resource in the knowledgebase
• Tokenize the query string into N individual terms (unigrams)
• Generate a set of 2-term phrases/nouns from the query string (bigrams)
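The tokenization into unigrams and bigrams can be illustrated as follows. This is a minimal sketch that assumes whitespace tokenization; the function names are hypothetical and the slides do not say how the system actually splits the query.

```python
from typing import List


def unigrams(query: str) -> List[str]:
    """Split the query string into its N individual terms."""
    return query.split()


def bigrams(query: str) -> List[str]:
    """Build every phrase made of two adjacent query terms."""
    terms = query.split()
    return [f"{left} {right}" for left, right in zip(terms, terms[1:])]


query = "scary movie stars"
terms = unigrams(query)
phrases = bigrams(query)
```

Both the terms and the phrases are then compared against the knowledgebase resources, so a query can match either three separate concepts or a two-word title.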


Keyword Mapping Techniques

• Syntactic mapping: 0.4 is chosen as the similarity threshold
• Semantic mapping: each query keyword is looked up in a dictionary (e.g. WordNet) to find the semantically similar resources for that keyword, which, if found, are added with a score of 0.5
• Q = {scary movie stars}
• Unigrams: “scary”, “movie”, “stars”: finding the “stars” who have starred in scary movies
• Bigrams: “scary movie”, “movie stars”: finding the “stars” of the movie called “Scary Movie”
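The dictionary lookup with its fixed score of 0.5 might look like the sketch below. The tiny `SYNONYMS` table is a hypothetical stand-in for WordNet, and `semantic_matches` is an illustrative name; only the score of 0.5 comes from the slide.

```python
from typing import Dict, List, Set

SEMANTIC_SCORE = 0.5  # fixed score given on the slide

# Hypothetical stand-in for a WordNet lookup: keyword -> related words.
SYNONYMS: Dict[str, Set[str]] = {
    "stars": {"actors"},
}


def semantic_matches(keyword: str, resources: List[str]) -> Dict[str, float]:
    """Return the resources that the dictionary relates to the keyword."""
    related = SYNONYMS.get(keyword.lower(), set())
    return {r: SEMANTIC_SCORE for r in resources if r.lower() in related}


resources = ["The Terminator", "Star Trek", "Actors"]
matches = semantic_matches("Stars", resources)
```

A keyword with no dictionary entry simply contributes no semantic candidates and relies on the syntactic comparison alone.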


Similarity

“Terminator” → “The Terminator” (insertion of ‘T’, ‘H’, ‘E’ at the front)

LevenshteinDist(“Terminator”, “The Terminator”) = 3

Similarity(“Terminator”, “The Terminator”) = 1 − (normalized edit distance) > 0.4
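A possible way to compute this score is sketched below. Two points are assumptions rather than facts from the slides: the distance is divided by the length of the longer string, and both strings are lowercased with whitespace removed before comparison. The second choice reproduces the slide's distance of 3, which counts only the inserted letters and not the space.

```python
THRESHOLD = 0.4  # syntactic similarity threshold from the slides


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def normalize(text: str) -> str:
    """Assumption: ignore case and whitespace when comparing."""
    return "".join(text.lower().split())


def similarity(keyword: str, resource: str) -> float:
    """Assumption: 1 - distance / length of the longer normalized string."""
    a, b = normalize(keyword), normalize(resource)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1 - levenshtein(a, b) / longest


distance = levenshtein(normalize("Terminator"), normalize("The Terminator"))
score = similarity("Terminator", "The Terminator")
is_match = score > THRESHOLD
```

Under these assumptions “Stars” also clears the threshold against “Star Trek”, which is consistent with the mapping shown for Q = {Terminator, Stars}.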


SPARQL Query Construction

Q = {Terminator , Stars}


• Type 1 keywords: the names of the actors, actresses, studios, etc., which are indicated by name_i in the table.

• Type 2 keywords: the actual concepts and relationships (e.g. actor, director, starred in).


Experiment


Precision, Recall

\[ \text{Precision} = \frac{TP}{TP + FP} = \frac{400}{400 + 100} = 0.8 \]

\[ \text{Recall} = \frac{TP}{TP + FN} \]

Cell counts: 100, 400, 300, 200

Total: 1000
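The two measures can be checked with a short Python sketch. It uses the standard definitions; the reading of 400 as true positives and 100 as false positives is inferred from the stated precision of 0.8, since no other pair of the four counts gives that value. The slide does not preserve which count is the false negatives, so recall is defined but not evaluated on the slide's numbers.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of returned answers that are relevant."""
    return true_positives / (true_positives + false_positives)


def recall(true_positives: int, false_negatives: int) -> float:
    """Fraction of relevant answers that were returned."""
    return true_positives / (true_positives + false_negatives)


# Inferred from the slide: 400 relevant answers returned, 100 irrelevant ones.
slide_precision = precision(400, 100)
```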


Experiment


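Mean Reciprocal Rank (MRR)

Each query returns a ranked list of five results (ranks 1 to 5). The reciprocal rank of a query is 1 divided by the rank of its first relevant result.

Query | Reciprocal Rank
Query 1 | 1/2
Query 2 | 1
Query 3 | 1
Query 4 | 1/4
Query 5 | 1/3

\[ \text{MRR} = \frac{1}{5}\left(\frac{1}{2} + 1 + 1 + \frac{1}{4} + \frac{1}{3}\right) \approx 0.617 \]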
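The MRR arithmetic can be reproduced exactly with Python's `fractions` module. In this sketch the input is the rank of the first relevant result for each of the five queries (2, 1, 1, 4, 3, the inverses of the reciprocal ranks on the slide); the function name is illustrative.

```python
from fractions import Fraction
from typing import List


def mean_reciprocal_rank(first_relevant_ranks: List[int]) -> Fraction:
    """Average of 1/rank over all queries, kept exact with Fraction."""
    total = sum(Fraction(1, rank) for rank in first_relevant_ranks)
    return total / len(first_relevant_ranks)


# Reciprocal ranks on the slide: 1/2, 1, 1, 1/4, 1/3.
mrr = mean_reciprocal_rank([2, 1, 1, 4, 3])
mrr_rounded = round(float(mrr), 3)
```

The exact value is 37/60, which rounds to the 0.617 reported on the slide.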


Conclusion

• This paper proposed an approach, and implemented a system based on it (DBSemSXplorer), to answer keyword searches over RDBs by:
• Transforming the RDB into an RDF knowledgebase
• Mapping the query terms to the most semantically and syntactically similar knowledgebase resources
• Constructing an equivalent SPARQL query from the mapped terms

• The paper’s approach outperformed previous approaches in finding the most relevant answers to the keyword query.