GlobeNet 2013
WEB 2013, The First International Conference on Building and Exploring Web Based Environments
January 27 - February 1, 2013 - Seville, Spain

Translating Natural Language Competency Questions into SPARQL Queries: A Case Study

Authors:
Leila ZEMMOUCHI-GHOMARI, [email protected]
Abdessamed Réda GHOMARI, [email protected]

LMCS Laboratory
National Superior School of Computer Science, Algiers, Algeria
www.esi.dz


OUTLINE

1. MOTIVATION

2. RELATED WORK

3. PROPOSED TRANSLATION APPROACH

4. CASE STUDY

5. CONCLUSION AND FUTURE WORK


1. MOTIVATION

The context of this research work is a PhD thesis focused on an ontology engineering process.

Competency questions are a well-known technique for determining the requirements or needs that the ontology should fulfill.

Translation: competency questions must be expressed in a formal language in order to allow automatic evaluation.


2. RELATED WORK


To the best of our knowledge, the automatic translation of competency questions into SPARQL queries, with the aim of validating an ontology, has not yet been tackled by researchers. From a more general perspective, however, several approaches exist in the area of web Question Answering (QA).


CNL: Ontology-based Controlled Natural Language Editor

OWLPATH: OWL Ontology-guided Query Editor

PANTO: Portable Natural Language Interface to Ontologies

DEANNA: Deep Answers for Naturally Asked Questions

Ben Abacha & Zweigenbaum Approach, 2012: Translating Medical Questions into SPARQL Queries

Limitations:

Scalability: their test ontologies are relatively small.

Preliminary work is necessary to apply these approaches, such as a mapping set between the concepts of the questions and the queried knowledge bases, which is difficult to carry out and to maintain.

Some of them focus on certain types of questions and certain knowledge domains.

There is no consensus in the web QA community on a single approach.


3. PROPOSED TRANSLATION APPROACH (1/3)

A variation of the [Ben Abacha & Zweigenbaum, 2012] approach

WHY?

Their approach is specific to the medical field.

It is limited to a particular set of questions: WH questions, except complex ones (why and when).

HOW?

Their approach | Our approach
1. Identifying Question Type | 1. Identifying Question Type
2. Determining the Expected Answer(s) Type(s) for WH questions | 2. Determining the expected answer
3. Constructing the question’s affirmative and simplified form |
4. Medical Entity Recognition (treatment, disease…) | 3. Entity Extraction
5. Relation Extraction | 4. Identifying answer entity type and entity location in the ontology
6. SPARQL Query Construction | 5. SPARQL Query Construction


3. PROPOSED TRANSLATION APPROACH (2/3)

Phase I: Identifying the categories of competency questions according to the types of their expected answers:

a) Definition questions: begin with “What is/are” or “What does … mean”

b) Boolean or Yes/No questions

c) Factual questions: the answer is a fact or a precise piece of information

d) List questions: the answer is a list of entities

e) Complex questions: begin with “How” or “Why”

The query result clause (for example SELECT or ASK) specifies the result form.
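The Phase I categories could be approximated with a few surface rules. The sketch below is only an illustration in Python: the function name `classify_question`, the list of yes/no starter words and the regular expression are assumptions, not part of the presented approach. The slides' own examples show that opening words alone are ambiguous (a question starting with "What are" may be a list question), so the rules here are a heuristic.

```python
import re

# Hypothetical surface patterns; the slides only name the categories.
_YES_NO_STARTERS = ("is", "are", "does", "do", "can", "must", "should", "has", "have")
_DEFINITION = re.compile(
    r"^what (is|are) (an? |the )?[\w-]+\s*\?*$|^what does .+ mean\s*\?*$"
)


def classify_question(question: str) -> str:
    """Return one of: definition, yes_no, factual, list, complex."""
    q = " ".join(question.lower().split())  # normalise case and whitespace
    first = q.split(" ", 1)[0] if q else ""
    if first in ("why", "how"):
        return "complex"
    if first in _YES_NO_STARTERS:
        return "yes_no"
    if _DEFINITION.match(q):
        # "What is a X?" with a single-word X, or "What does X mean?"
        return "definition"
    if q.startswith(("what are", "which", "list")):
        return "list"
    # Anything else is treated as a request for a precise fact.
    return "factual"
```

A question that fits none of the patterns falls through to "factual", which is a design choice of this sketch rather than something the slides prescribe.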


3. PROPOSED TRANSLATION APPROACH (3/3)

Phase II: Determining the expected (perfect or ideal) answer

Phase III: Extracting the entity or entities from the questions and their corresponding expected answers identified in Phase II

Phase IV: Identifying the answer entity type (class, data property, object property, annotation, axiom, instance) and the entity location in the ontology

Phase V: Constructing the SPARQL query based on the question type identified in Phase I, the question/answer entity extracted in Phase III, and its corresponding entity type and entity location in the ontology from Phase IV

The approach relies on a mapping between question/answer entities and ontology entities.


Example query:

    SELECT * WHERE { ?Teacher rdf:type HERO:Teacher . }
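As a rough illustration of how Phase V can combine the question category with a graph pattern, the Python sketch below picks the SPARQL query form from the category: ASK for yes/no questions, SELECT otherwise. The helper names `query_form` and `build_query` and the category labels are hypothetical; the slides do not describe an implementation.

```python
def query_form(category: str) -> str:
    """Yes/No questions need a boolean result (ASK); others return bindings (SELECT)."""
    return "ASK" if category == "yes_no" else "SELECT"


def build_query(category: str, pattern: str, variables=("*",)) -> str:
    """Wrap a triple pattern in the query form that suits the question category.

    `pattern` is the graph pattern obtained from Phases III and IV,
    written with prefixed names (prefix declarations are not added here).
    """
    pattern = pattern.strip()
    if query_form(category) == "ASK":
        return f"ASK {{ {pattern} }}"
    return f"SELECT {' '.join(variables)} WHERE {{ {pattern} }}"
```

For instance, `build_query("list", "?Teacher rdf:type HERO:Teacher .")` reproduces the example query shown on the slide.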


4. CASE STUDY: HERO

Translation of the competency questions of the HERO ontology (Higher Education Reference Ontology) into SPARQL queries.

HERO describes several aspects of the university domain, such as organizational structure, administration, staff, roles, incomes, etc.

HERO aims to be a valuable tool for researchers and institutional employees interested in analyzing the system of higher education as a whole.

The HERO ontology is available at: http://sourceforge.net/projects/heronto/?source=directory

The competency questions (81) and their corresponding queries are available at: http://herontology.esi.dz/content/downloads


4. CASE STUDY

Phase I: Identifying the categories of competency questions according to the types of their expected answers

CQ category | Example (from the 81 CQs)
Definition questions | CQ59. What is a Credit?
Yes/No questions | CQ3. Must a university teacher be a researcher?
Factual questions | CQ44. What average size and duration have governing board?
List questions | CQ1. What are the possible academic ranks of a teacher?
Complex questions | CQ41. Why universities are organized into departments?


4. CASE STUDY

Phase II: Determining the expected answer

CQ example | Corresponding answer
CQ59. What is a Credit? | Each course bears a specified number of credits. In general, the number of credits a course carries is determined by the number of class hours the course meets each week.
CQ3. Must a university teacher be a researcher? | Nearly all faculty members are expected to engage in research.
CQ44. What average size and duration have governing board? | The average size of public boards is approximately 10 people and the average size among independent (private) institutions is 30. The length of board members’ terms varies from three years to as long as 12 years.
CQ1. What are the possible academic ranks of a teacher? | Assistant Professor, Associate Professor, Full Professor, Professor Emeritus.
CQ41. Why universities are organized into departments? | The basic unit of academic organization in most institutions is the department (e.g., chemistry, political science). Every department belongs to an academic field.

Answer sources are: academic reports, governmental websites, expert interviews, ...


4. CASE STUDY

Phase III: Extracting the entity or entities from the competency questions and their corresponding expected answers identified in Phase II. This extraction is based on a mapping between relevant terms in the question/answer pairs and their equivalent terms in the ontology.

Extracted terms from CQs | Extracted terms from answers
CQ59. What is a Credit? | Each course bears a specified number of credits. In general, the number of credits a course carries is determined by the number of class hours the course meets each week.
CQ3. Must a university teacher be a researcher? | Nearly all faculty members are expected to engage in research.
CQ44. What average size and duration have governing board? | The average size of public boards is approximately 10 people and the average size among independent (private) institutions is 30. The length of board members’ terms varies from three years to as long as 12 years.
CQ41. Why universities are organized into departments? | The basic unit of academic organization in most institutions is the department (e.g., chemistry, political science). Every department belongs to an academic field.
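The term mapping described for Phase III is performed manually in the presented approach. Purely as an illustration, a dictionary lookup in Python could mimic it. The `LEXICON` below is hypothetical: the entity names are taken from the HERO entities shown in the case study, but the pairing of each English term with an entity is an assumption made for this sketch.

```python
import re

# Hypothetical term-to-entity lexicon (assumed pairings, not from the slides).
LEXICON = {
    "teacher": "Teacher",
    "researcher": "Researcher",
    "credit": "CourseCreditsNumber",
    "course": "Course",
    "governing board": "GoverningBoard",
    "department": "Department",
}


def extract_entities(text: str, lexicon=LEXICON) -> list:
    """Return the ontology entities whose term occurs in `text`,
    ordered by first appearance in the text."""
    lowered = text.lower()
    hits = []
    for term, entity in lexicon.items():
        # Whole-word match that also tolerates a plural "s".
        match = re.search(r"\b" + re.escape(term) + r"s?\b", lowered)
        if match:
            hits.append((match.start(), entity))
    return [entity for _, entity in sorted(hits)]
```

Applied to CQ3, this returns `Teacher` and `Researcher`; terms missing from the lexicon (such as "faculty members" in the expected answer) are silently skipped, which is one reason the authors treat this phase as dependent on user knowledge.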


4. CASE STUDY

Phase IV: Identifying the answer entity type (class, data property, object property, annotation, axiom, instance) and the entity location in the ontology

Entity types | Entity locations in the ontology
Class: Course; Data Property: CourseCreditsNumber | CourseCreditsNumber Domain Course
Classes: Teacher, Researcher | Teacher SubClassOf Researcher
Class: Governing Board; Data Properties: Size, Duration | GoverningBoardSize Domain GoverningBoard; GoverningBoardDuration Domain GoverningBoard
Class: Teacher; Data Property: Rank, Assistant Professor, Associate Professor, Full Professor, Professor Emeritus | TeacherRank Domain Teacher; AssistantProfessor SubPropertyOf TeacherRank; AssociateProfessor SubPropertyOf TeacherRank; FullProfessor SubPropertyOf TeacherRank; ProfessorEmeritus SubPropertyOf TeacherRank
Classes: Higher Education Organization, Department | Department SubClassOf Faculty; Faculty SubClassOf Role; Role SubClassOf HigherEducationOrganization; Department Definition
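The entity locations above are written as short axioms ("Teacher SubClassOf Researcher"). A small Python sketch can show how such an axiom corresponds to a SPARQL triple pattern using the standard RDFS predicates. The function name `axiom_to_triple` is hypothetical, and the `HERO` prefix is reused from the slides' queries without knowing the namespace it stands for.

```python
# Keywords as written on the slide, mapped to standard RDFS predicates.
KEYWORD_TO_PREDICATE = {
    "SubClassOf": "rdfs:subClassOf",
    "SubPropertyOf": "rdfs:subPropertyOf",
    "Domain": "rdfs:domain",
}


def axiom_to_triple(axiom: str, prefix: str = "HERO") -> str:
    """Turn 'Subject Keyword Object' into a prefixed triple pattern."""
    parts = axiom.split()
    if len(parts) != 3 or parts[1] not in KEYWORD_TO_PREDICATE:
        # Two-word entries such as "Department Definition" are not handled.
        raise ValueError(f"unsupported axiom: {axiom!r}")
    subject, keyword, obj = parts
    return f"{prefix}:{subject} {KEYWORD_TO_PREDICATE[keyword]} {prefix}:{obj} ."
```

For example, `axiom_to_triple("Teacher SubClassOf Researcher")` gives the pattern that appears inside the ASK query for CQ3.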


4. CASE STUDY

Phase V: Construction of SPARQL queries

CQ59. What is a Credit?

    SELECT ?comment WHERE { HERO:CourseCreditsNumber rdfs:comment ?comment }

CQ3. Must a university teacher be a researcher?

    ASK { HERO:Teacher rdfs:subClassOf HERO:Researcher . }

CQ44. What average size and duration have governing board?

    SELECT ?university ?size WHERE { ?university rdf:type HERO:HigherEducationOrganization . ?y rdfs:subClassOf ?university . ?y HERO:GoverningBoardSize ?size }

    SELECT ?university ?duration WHERE { ?university rdf:type HERO:HigherEducationOrganization . ?y rdfs:subClassOf ?university . ?y HERO:GoverningBoardDuration ?duration }

CQ1. What are the possible academic ranks of a teacher?

    SELECT ?a ?b ?c ?d WHERE { ?a rdfs:subPropertyOf HERO:TeacherRank . ?b rdfs:subPropertyOf ?a . ?c rdfs:subPropertyOf ?b . ?d rdfs:subPropertyOf ?c . }

These queries can be checked using available online SPARQL endpoints or offline tools such as TWINKLE.
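The queries are shown without prefix declarations, while SPARQL requires every prefix used in a query to be declared (some tools predeclare common ones). A minimal Python sketch for adding them before submitting a query is given below. The `rdf` and `rdfs` IRIs are the standard W3C namespaces; the `HERO` IRI is a placeholder, since the real namespace of the ontology is not given on the slides, and `with_prefixes` is a hypothetical helper name.

```python
import re

PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    # Placeholder only: replace with the actual HERO namespace IRI.
    "HERO": "http://example.org/hero#",
}


def with_prefixes(query: str, prefixes=PREFIXES) -> str:
    """Prepend PREFIX lines for the prefixes that the query actually uses."""
    used = [
        name
        for name in prefixes
        # The lookbehind keeps "rdf:" from matching inside a longer prefix.
        if re.search(rf"(?<![\w-]){re.escape(name)}:", query)
    ]
    header = "\n".join(f"PREFIX {name}: <{prefixes[name]}>" for name in used)
    return f"{header}\n{query}" if header else query
```

Calling `with_prefixes("ASK { HERO:Teacher rdfs:subClassOf HERO:Researcher . }")` yields a query headed by the `rdfs` and `HERO` declarations, ready to paste into an endpoint or an offline tool once the placeholder IRI is replaced.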


5. CONCLUSION AND FUTURE WORK

• Summary

Intended users: ontology developers, i.e., people who are familiar with the ontology language, the ontology structure and the query language. This helps in entity location (Phase IV) and query construction (Phase V).

Intended use: ontology validation. Since competency questions are the starting point for extracting the relevant terms that later become ontology entities, the CQs translated into SPARQL queries directly target ontology entities. This helps in entity extraction (Phase III).


• Limitations

Two phases of the proposed approach are manual and depend on the user's background knowledge: entity extraction from question/answer pairs, and the mapping between relevant question/answer terms and ontology entities.

Weak treatment of complex questions.

• Future Work

The best way to tackle the issue of the manual phases is to integrate natural language processing tools such as GATE into the term extraction phase, and automatic matching systems such as COMA 3.0, whose efficiency has already been proven.


SOME REFERENCES

1. CQs: M. Gruninger and M. S. Fox, “Methodology for the design and evaluation of ontologies”, IJCAI-95 Workshop on Basic Ontological Issues in Knowledge Sharing, Montreal, 1995, pp. 6.1–6.10.

2. Web QA approach: A. Ben Abacha and P. Zweigenbaum, “Medical Question Answering: Translating Medical Questions into SPARQL Queries”, Proceedings of the 2nd ACM SIGHIT International Health Informatics Symposium, Miami, Florida, USA, 2012, pp. 41-50.

3. SPARQL: E. Della Valle and S. Ceri, “Querying the Semantic Web: SPARQL”, in Handbook of Semantic Web Technologies, Springer, 2011, pp. 299-363.

THANK YOU FOR YOUR ATTENTION