GlobeNet 2013
WEB 2013, The First International Conference on Building and Exploring Web Based Environments
January 27 - February 1, 2013 - Seville, Spain
Translating Natural Language Competency Questions into SPARQL
Queries: A Case Study
Authors:
Leila ZEMMOUCHI-GHOMARI, [email protected]
Abdessamed Réda GHOMARI, [email protected]
LMCS Laboratory, National Superior School of Computer Science, Algiers, Algeria
www.esi.dz
WEB 2013 January 27 - February 1, 2013 - Seville, Spain
OUTLINE
1. MOTIVATION
2. RELATED WORK
3. PROPOSED TRANSLATION APPROACH
4. CASE STUDY
5. CONCLUSIONS AND FUTURE WORK
1. MOTIVATION
The context of this research is a PhD thesis focused on an ontology engineering process.
Competency questions are a well-known technique for determining the requirements or needs the ontology should fulfill.
Translation: competency questions need to be expressed in a formal language in order to allow automatic evaluation.
2. RELATED WORK
To the best of our knowledge, the automatic translation of competency questions into SPARQL queries, with the aim of validating an ontology, has not been tackled by researchers. From a more general perspective, however, several approaches exist in the area of web Question Answering (QA).
CNL: Ontology-based Controlled Natural Language Editor
OWLPATH: OWL Ontology-guided Query Editor
PANTO: Portable Natural Language Interface to Ontologies
DEANNA: Deep Answers for Naturally Asked Questions
Ben Abacha & Zweigenbaum Approach, 2012: Translating Medical Questions into SPARQL Queries
Limitations:
Scalability: their test ontologies are relatively small.
Preliminary work is necessary to apply these approaches, such as a mapping set between the concepts of the questions and the queried knowledge bases, which is difficult to carry out and to maintain.
Some of them focus on certain types of questions and certain knowledge domains.
There is no consensus in the web QA community on a single approach.
3. PROPOSED TRANSLATION APPROACH (1/3)
A variation of the [Ben Abacha & Zweigenbaum, 2012] approach
WHY? Their approach is:
Specific to the medical field
Limited to a particular set of questions: WH questions, except complex ones (why and when).
HOW?
Their approach | Our approach
1. Identifying Question Type | 1. Identifying Question Type
2. Determining the Expected Answer(s) Type(s) for WH questions | 2. Determining the expected answer
3. Constructing the question’s affirmative and simplified form | (none)
4. Medical Entity Recognition (treatment, disease…) | 3. Entity Extraction
5. Relation Extraction | 4. Identifying answer entity type and entity location in the ontology
6. SPARQL Query Construction | 5. SPARQL Query Construction
3. PROPOSED TRANSLATION APPROACH (2/3)
Phase I: Identifying the categories of competency questions according to the types of expected answers:
a) Definition questions: begin with “What is/are” or “What does ... mean”
b) Boolean or Yes/No questions
c) Factual questions: the answer is a fact or a precise piece of information
d) List questions: the answer is a list of entities
e) Complex questions: begin with “How” or “Why”
The query result clause specifies the result form.
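Phase I can be sketched as a small rule-based classifier. This is only an illustration: the slides name the five categories but give no detection rules, so the regular expressions, the rule order and the `QUERY_FORM` mapping (ASK for yes/no questions, SELECT otherwise, nothing for complex questions) are assumptions.

```python
import re

# Hypothetical surface-cue rules, checked in order; the first match wins.
RULES = [
    ("complex", r"^(how|why)\b"),
    # "What is a Credit?" style (a single term) or "What does ... mean"
    ("definition",
     r"^what\s+(is|are)\s+(an?\s+|the\s+)?\w+\s*\?$|^what\s+does\b.*\bmean\b"),
    ("yes/no",
     r"^(is|are|was|were|do|does|did|can|could|must|may|should|will|would|has|have)\b"),
    ("list", r"^(what\s+are|which|list)\b"),
]

# Assumed mapping from question category to SPARQL query form.
QUERY_FORM = {
    "definition": "SELECT",
    "yes/no": "ASK",
    "factual": "SELECT",
    "list": "SELECT",
    "complex": None,  # no query form is assumed for complex questions
}


def classify(question: str) -> str:
    """Return the category of a competency question; default is 'factual'."""
    text = question.strip().lower()
    for category, pattern in RULES:
        if re.search(pattern, text):
            return category
    return "factual"
```

Such rules are brittle (for instance, every “How many ...” question lands in the complex category), so a manual check of each assigned category remains advisable.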
3. PROPOSED TRANSLATION APPROACH (3/3)
Phase II: Determining the expected (perfect or ideal) answer
Phase III: Extracting the entity or entities from the questions and their corresponding expected answers identified in Phase II
Phase IV: Identifying the answer entity type (class, data property, object property, annotation, axiom, instance) and the entity location in the ontology
Phase V: Constructing the SPARQL query based on the question type identified in Phase I, the question/answer entity extracted in Phase III, and its corresponding entity type and entity location in the ontology from Phase IV
Mapping between question/answer entity and ontology entity
Example of a constructed query: SELECT * WHERE { ?Teacher rdf:type HERO:Teacher . }
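Phase V can be pictured as filling a query template chosen from the question category (Phase I) and the entity type (Phase IV). The sketch below is an assumption about how this could be automated, not the authors' implementation: the three query shapes are taken from queries shown in these slides, while the dictionary keys, the variable `?x` and the function name `build_query` are illustrative.

```python
from string import Template

# Templates keyed by (question category, entity type). Keys are hypothetical.
TEMPLATES = {
    ("list", "class"): Template("SELECT * WHERE { ?x rdf:type $entity . }"),
    ("definition", "data property"): Template(
        "SELECT ?comment WHERE { $entity rdfs:comment ?comment }"
    ),
    ("yes/no", "class"): Template("ASK { $entity rdfs:subClassOf $other . }"),
}


def build_query(category, entity_type, entity, other=None):
    """Fill the template for this category and entity type.

    Raises ValueError when no template is registered, and KeyError when a
    template needs a second entity that was not supplied.
    """
    template = TEMPLATES.get((category, entity_type))
    if template is None:
        raise ValueError(f"no template for {category!r} / {entity_type!r}")
    values = {"entity": entity}
    if other is not None:
        values["other"] = other
    return template.substitute(values)


print(build_query("yes/no", "class", "HERO:Teacher", "HERO:Researcher"))
```

Keeping the templates in a table makes the unsupported combinations explicit, which matches the observation that complex questions have no query counterpart yet.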
4. CASE STUDY: HERO
Translation of the competency questions of the HERO ontology (Higher Education Reference Ontology) into SPARQL queries
HERO describes several aspects of the university domain, such as organizational structure, administration, staff, roles, incomes, etc.
HERO aims to be a valuable tool for researchers and institutional employees interested in analyzing the system of higher education as a whole.
The HERO ontology is available at: http://sourceforge.net/projects/heronto/?source=directory
The competency questions (81) and their corresponding queries are available at: http://herontology.esi.dz/content/downloads
Phase I: Identifying the categories of competency questions according to the types of expected answers
CQ category | CQ example (from the 81 CQs)
Definition questions | CQ59. What is a Credit?
Yes/No questions | CQ3. Must a university teacher be a researcher?
Factual questions | CQ44. What average size and duration have governing board?
List questions | CQ1. What are the possible academic ranks of a teacher?
Complex questions | CQ41. Why universities are organized into departments?
Phase II: Determining the expected answer
Answer sources are: academic reports, governmental websites, experts’ interviews, ...
CQ examples, each followed by its corresponding answer:

CQ59. What is a Credit?
Each course bears a specified number of credits. In general, the number of credits a course carries is determined by the number of class hours the course meets each week.

CQ3. Must a university teacher be a researcher?
Nearly all faculty members are expected to engage in research.

CQ44. What average size and duration have governing board?
The average size of public boards is approximately 10 people and the average size among independent (private) institutions is 30. The length of board members’ terms varies from three years to as long as 12 years.

CQ1. What are the possible academic ranks of a teacher?
Assistant Professor, Associate Professor, Full Professor, Professor Emeritus.

CQ41. Why universities are organized into departments?
The basic unit of academic organization in most institutions is the department (e.g., chemistry, political science). Every department belongs to an academic field.
Phase III: Extracting the entity or entities from the competency questions and their corresponding expected answers identified in Phase II. This extraction is based on a mapping between relevant terms in the question/answer pairs and their equivalent terms in the ontology.
Terms are extracted from each CQ and from its answer:

CQ59. What is a Credit?
Each course bears a specified number of credits. In general, the number of credits a course carries is determined by the number of class hours the course meets each week.

CQ3. Must a university teacher be a researcher?
Nearly all faculty members are expected to engage in research.

CQ44. What average size and duration has governing board?
The average size of public boards is approximately 10 people and the average size among independent (private) institutions is 30. The length of board members’ terms varies from three years to as long as 12 years.

CQ41. Why universities are organized into departments?
The basic unit of academic organization in most institutions is the department (e.g., chemistry, political science). Every department belongs to an academic field.
Phase IV: Identifying the answer entity type (class, data property, object property, annotation, axiom, instance) and the entity location in the ontology
Entity types | Entity locations in the ontology
Class: Course; Data Property: CourseCreditsNumber | CourseCreditsNumber Domain Course
Classes: Teacher, Researcher | Teacher SubClassOf Researcher
Class: Governing Board; Data Properties: Size, Duration | GoverningBoardSize Domain GoverningBoard; GoverningBoardDuration Domain GoverningBoard
Class: Teacher; Data Property: Rank, Assistant Professor, Associate Professor, Full Professor, Professor Emeritus | TeacherRank Domain Teacher; AssistantProfessor SubPropertyOf TeacherRank; AssociateProfessor SubPropertyOf TeacherRank; FullProfessor SubPropertyOf TeacherRank; ProfessorEmeritus SubPropertyOf TeacherRank
Classes: Higher Education Organization, Department | Department SubClassOf Faculty; Faculty SubClassOf Role; Role SubClassOf HigherEducationOrganization; Department Definition
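The manual mapping of Phases III and IV can be recorded as a simple lookup table from extracted terms to ontology entities. In the sketch below, the entity names, types and locations come from the Phase IV table, whereas the lowercase term keys, the `OntologyEntity` record and the `locate` helper are hypothetical, since the slides do not list the exact terms that were extracted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OntologyEntity:
    name: str      # entity name in the ontology
    kind: str      # "class", "data property", ...
    location: str  # axiom that situates the entity in the ontology


# Hand-written mapping; the term keys are assumptions for illustration.
TERM_TO_ENTITY = {
    "credit": OntologyEntity(
        "CourseCreditsNumber", "data property", "CourseCreditsNumber Domain Course"),
    "teacher": OntologyEntity(
        "Teacher", "class", "Teacher SubClassOf Researcher"),
    "size": OntologyEntity(
        "GoverningBoardSize", "data property", "GoverningBoardSize Domain GoverningBoard"),
    "duration": OntologyEntity(
        "GoverningBoardDuration", "data property",
        "GoverningBoardDuration Domain GoverningBoard"),
    "rank": OntologyEntity(
        "TeacherRank", "data property", "TeacherRank Domain Teacher"),
    "department": OntologyEntity(
        "Department", "class", "Department SubClassOf Faculty"),
}


def locate(term: str) -> Optional[OntologyEntity]:
    """Return the ontology entity mapped to a term, or None if unmapped."""
    return TERM_TO_ENTITY.get(term.strip().lower())
```

An unmapped term returns `None`, which can serve as a signal that the ontology does not yet cover a term used in a competency question.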
Phase V: Construction of SPARQL queries
Competency questions, each followed by its SPARQL query or queries:

CQ59. What is a Credit?
SELECT ?comment WHERE { HERO:CourseCreditsNumber rdfs:comment ?comment }

CQ3. Must a university teacher be a researcher?
ASK { HERO:Teacher rdfs:subClassOf HERO:Researcher . }

CQ44. What average size and duration have governing board?
SELECT ?university ?size WHERE { ?university rdf:type HERO:HigherEducationOrganization . ?y rdfs:subClassOf ?university . ?y HERO:GoverningBoardSize ?size }
SELECT ?university ?duration WHERE { ?university rdf:type HERO:HigherEducationOrganization . ?y rdfs:subClassOf ?university . ?y HERO:GoverningBoardDuration ?duration }

CQ1. What are the possible academic ranks of a teacher?
SELECT ?a ?b ?c ?d WHERE { ?a rdfs:subPropertyOf HERO:TeacherRank . ?b rdfs:subPropertyOf ?a . ?c rdfs:subPropertyOf ?b . ?d rdfs:subPropertyOf ?c . }

These queries can be checked using available online SPARQL endpoints or offline tools such as TWINKLE.
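The queries above are shown without PREFIX declarations, which an endpoint generally needs before it can resolve `rdf:`, `rdfs:` and `HERO:`. A minimal sketch of prepending them follows. The `rdf` and `rdfs` IRIs are the standard W3C namespaces; the HERO namespace IRI is a placeholder, because the slides do not state it, and the helper name `with_prefixes` is illustrative.

```python
import re

PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "HERO": "http://example.org/hero#",  # hypothetical; replace with the real IRI
}


def with_prefixes(query: str) -> str:
    """Prepend a PREFIX line for every known prefix used in the query."""
    used = [
        prefix for prefix in PREFIXES
        # The lookbehind keeps "rdf" from matching inside a longer name.
        if re.search(rf"(?<![\w:]){re.escape(prefix)}:", query)
    ]
    header = "\n".join(f"PREFIX {p}: <{PREFIXES[p]}>" for p in used)
    return f"{header}\n{query}" if header else query


print(with_prefixes("ASK { HERO:Teacher rdfs:subClassOf HERO:Researcher . }"))
```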
5. CONCLUSION AND FUTURE WORK
• Summary
Intended users: ontology developers, i.e., users who are familiar with the ontology language, the ontology structure and the query language. This helps in entity location (Phase IV) and query construction (Phase V).
Intended use: ontology validation, since competency questions are the starting point for extracting the relevant terms that later become ontology entities. This helps in entity extraction (Phase III).
CQs translated into SPARQL queries directly target ontology entities.
• Limitations
Two phases of the proposed approach are manual and depend on the user's background knowledge: entity extraction from question/answer pairs, and the mapping between relevant question/answer terms and ontology entities.
Weak treatment of complex questions.
• Future Work
The best way to tackle the issue of the manual phases is to integrate natural language processing tools such as GATE into the term extraction phase, and automatic matching systems such as COMA 3.0, whose efficiency has already been proven.
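To give an idea of what automating the mapping step could look like, the sketch below suggests candidate ontology entities for an extracted term by plain string similarity. It is a deliberately simple stand-in built on Python's standard `difflib` module, not GATE or COMA 3.0; the function names and the 0.6 cutoff are assumptions.

```python
import difflib
import re


def split_camel_case(name: str) -> str:
    """Turn 'GoverningBoardSize' into 'governing board size'."""
    return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", name).lower()


def suggest_entities(term, entity_names, cutoff=0.6):
    """Return up to three entity names whose labels resemble the term."""
    labels = {split_camel_case(name): name for name in entity_names}
    hits = difflib.get_close_matches(
        term.lower(), list(labels), n=3, cutoff=cutoff
    )
    return [labels[hit] for hit in hits]


names = ["GoverningBoard", "GoverningBoardSize", "Teacher", "Researcher", "Department"]
print(suggest_entities("governing board", names))
```

The suggestions are ranked from most to least similar, so a developer can confirm or reject the top candidate instead of searching the ontology by hand.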
SOME REFERENCES
1. CQs: M. Gruninger and M. S. Fox, “Methodology for the design and evaluation of ontologies”, IJCAI-95 Workshop on Basic Ontological Issues in Knowledge Sharing, Montreal, 1995, pp. 6.1–6.10.
2. Web QA approach: A. Ben Abacha and P. Zweigenbaum, “Medical Question Answering: Translating Medical Questions into SPARQL Queries”, Proceedings of the 2nd ACM SIGHIT International Health Informatics Symposium, Miami, Florida, USA, 2012, pp. 41–50.
3. SPARQL: E. Della Valle and S. Ceri, “Querying the Semantic Web: SPARQL”, in Handbook of Semantic Web Technologies, Springer, 2011, pp. 299–363.
THANK YOU FOR YOUR ATTENTION