From Question-Answering to Information-Seeking Dialogs Jerry R. Hobbs USC/ISI Marina del Rey, CA with Douglas Appelt, David Israel, Peter Jarvis, David Martin, Mark Stickel, and Richard Waldinger of SRI Chris Culy SRI International Menlo Park, CA


From Question-Answering to Information-Seeking Dialogs

Jerry R. Hobbs

USC/ISI

Marina del Rey, CA

with Douglas Appelt, David Israel, Peter Jarvis, David Martin, Mark Stickel, and Richard Waldinger of SRI

Chris Culy

SRI International

Menlo Park, CA

12/04/02 Chris Culy, SRI, and Jerry Hobbs, USC/ISI, PIs

Decomposing Questions

Could Mohammed Atta have met with an Iraqi official between 1998 and 2001?

Logical Form:

meet(a,b,t) & 1998 ≤ t ≤ 2001

Question Decomposition via Logical Rules (SNARK):

at(a,x1,t) & at(b,x2,t) & near(x1,x2) & official(b,Iraq)

go(a,x1,t), go(b,x2,t)

Resources Attached to Reasoning Process: IE Engine, Geographical Reasoning, Temporal Reasoning
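As a rough illustration of decomposition via logical rules, the sketch below expands the question's goals into leaf subgoals that could each be handed to an attached resource. The rule table, the string encoding of literals, and the `decompose` function are assumptions made for this example; it is not SNARK's actual inference procedure.

```python
# Hypothetical rule table: each goal maps to subgoals that would establish
# it. The rules loosely mirror the slide; the encoding is invented here.
RULES = {
    "meet(a,b,t)": ["at(a,x1,t)", "at(b,x2,t)", "near(x1,x2)"],
    "at(a,x1,t)": ["go(a,x1,t)"],
    "at(b,x2,t)": ["go(b,x2,t)"],
}

def decompose(goal):
    """Expand a goal into leaf subgoals that no rule rewrites further."""
    if goal not in RULES:
        return [goal]          # leaf: hand off to an attached resource
    leaves = []
    for subgoal in RULES[goal]:
        leaves.extend(decompose(subgoal))
    return leaves

# The question as a conjunction of top-level goals.
QUESTION = ["meet(a,b,t)", "official(b,Iraq)"]
leaves = [leaf for goal in QUESTION for leaf in decompose(goal)]
print(leaves)
```

In this toy version the temporal constraint on t is left out; it would be one more leaf, answered by the temporal reasoner.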

The Problem

Inference in large knowledge bases is required for competent question-answering

Many rich but heterogeneous knowledge bases exist today

How do we make use of them in a single system?

Outline

Three Resources:

1. The Semantic Web: Teknowledge’s search engine ASCS

2. An Information Extraction Engine: SRI’s TextPro

3. An Ontology of Time: DAML-Time

DAML Search Engine

Teknowledge has developed ASCS:

pred: capital [namespace]

arg1: ?x [namespace]

arg2: Indonesia [namespace]

Searches the entire (soon to be exponentially growing) Semantic Web

Also conjunctive queries: population of capital of Indonesia

Problem: you have to know logic and RDF to use it.
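To make the conjunctive query concrete, here is a minimal sketch over a toy in-memory triple store. The `(pred, arg1, arg2)` layout follows the query form on the slide, but the store, the `query` helper, and the placeholder population value are assumptions for illustration and not the ASCS interface.

```python
# Toy triple store; entries are illustrative stand-ins, not ASCS data.
# The population value is deliberately a placeholder string.
TRIPLES = [
    ("capital", "Jakarta", "Indonesia"),
    ("population", "<population of Jakarta>", "Jakarta"),
]

def query(pred, arg2):
    """Return every arg1 such that pred(arg1, arg2) is in the store."""
    return [a1 for (p, a1, a2) in TRIPLES if p == pred and a2 == arg2]

def population_of_capital(country):
    """Conjunctive query: capital(?c, country) & population(?n, ?c)."""
    return [n for c in query("capital", country) for n in query("population", c)]

capitals = query("capital", "Indonesia")
answers = population_of_capital("Indonesia")
```

The second query reuses the binding found by the first, which is what makes the query conjunctive rather than two independent lookups.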

DAML Search Engine as AQUAINT Web Resource

AQUAINT System: capital(?x,Indonesia)

Procedural attachment in SNARK passes the query to the DAML search engine:

pred: capital [namespace]

arg1: ?x [namespace]

arg2: Indonesia [namespace]

Searches the entire (soon to be exponentially growing) Semantic Web

Solution: You only have to know English to use it. Makes the entire Semantic Web accessible to AQUAINT users.

Also: Can use it for subqueries.
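The idea of procedural attachment can be sketched as a table that routes certain predicates to an external procedure instead of further inference. Everything named below (`ascs_lookup`, `ATTACHMENTS`, `solve`) is hypothetical and only illustrates the control flow; it does not reproduce SNARK's attachment mechanism.

```python
# Hypothetical stand-in for a remote search-engine lookup.
def ascs_lookup(pred, arg2):
    facts = {("capital", "Indonesia"): ["Jakarta"]}
    return facts.get((pred, arg2), [])

# Attachment table: literals with these predicates are answered by calling
# an external procedure rather than by further inference.
ATTACHMENTS = {"capital": ascs_lookup}

def solve(pred, var, const):
    """Bind `var` in pred(var, const) using an attached procedure, if any."""
    procedure = ATTACHMENTS.get(pred)
    if procedure is None:
        return []              # no attachment: a real prover would keep reasoning
    return [{var: value} for value in procedure(pred, const)]

bindings = solve("capital", "?x", "Indonesia")
```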

Namespace Problem

Where to find the right predicates?

In QUARK (in order of decreasing precision):

Subtheories linking predicates to namespaces

Subtheories linking topics to namespaces

In DAML/ASCS (in order of decreasing precision):

EQUIVALENT statements

Standardized ontologies

Use WordNet and SUMO to expand query

Any namespace
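One way to read the "decreasing precision" ordering is as a fallback chain: try the most precise source of namespaces first and fall through only when it yields nothing. The strategy functions and namespace names below are invented stubs, shown only to illustrate that ordering.

```python
# Strategies in order of decreasing precision. Each is a hypothetical stub;
# real ones would consult EQUIVALENT statements, standardized ontologies,
# or WordNet/SUMO query expansion.
def from_equivalent_statements(pred):
    return {"capital": ["ns:geo"]}.get(pred, [])

def from_standard_ontologies(pred):
    return {"population": ["ns:demographics"]}.get(pred, [])

def any_namespace(pred):
    return ["*"]               # last resort: search every namespace

STRATEGIES = [from_equivalent_statements, from_standard_ontologies, any_namespace]

def find_namespaces(pred):
    """Return namespaces from the most precise strategy that yields any."""
    for strategy in STRATEGIES:
        namespaces = strategy(pred)
        if namespaces:
            return namespaces
    return []
```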

Information Extraction Engine as a Resource

Document retrieval for pre-processing

TextPro: top-of-the-line information extraction engine; recognizes subject-verb-object and coreference relations

Analyze NL query with GEMINI and SNARK

Bottom out in a pattern for TextPro to seek

Keyword search on very large corpus

TextPro runs over documents retrieved

Linking SNARK with TextPro

TextSearch(EntType(?x), Terms(p), Terms(c), WSeq)
  & Analyze(WSeq, p(?x,c))
  --> p(?x,c)

TextSearch: call to TextPro

EntType(?x): type of questioned constituent

Terms(p), Terms(c): synonyms and hypernyms of word associated with p or c

WSeq: answer, an ordered sequence of annotated strings of words

Analyze: match pieces of annotated answer strings with pieces of query

p(?x,c): subquery generated by SNARK during analysis of query
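The rule above can be read as a two-step pipeline: retrieve candidate word sequences, then analyze them to bind ?x. The sketch below imitates that shape with a two-sentence corpus and a regular expression; `text_search`, `analyze`, the corpus, and the synonym list are all assumptions standing in for TextPro and SNARK.

```python
import re

# Tiny invented corpus standing in for the retrieved documents.
CORPUS = ["Samuel Palmisano heads IBM.", "IBM reported quarterly results."]

def text_search(terms_p, terms_c):
    """Stand-in for the TextPro call: passages with a p-term and a c-term."""
    return [s for s in CORPUS
            if any(t in s for t in terms_p) and any(t in s for t in terms_c)]

def analyze(wseq, terms_p, const):
    """Match 'X <p-term> <const>' and return the binding for ?x, if any."""
    for passage in wseq:
        for term in terms_p:
            m = re.match(rf"(.+?) {re.escape(term)} {re.escape(const)}\b", passage)
            if m:
                return m.group(1)
    return None

terms_p = ["is CEO of", "heads"]   # assumed synonym list for the predicate CEO
wseq = text_search(terms_p, ["IBM"])
answer = analyze(wseq, terms_p, "IBM")
```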

Three Modes of Operation for TextPro

Example question: Where does the CEO of IBM live?

1. Search for predefined patterns and relations (ACE-style: ACE Role and AT Relations) and translate relations into SNARK's logic

2. Search for subject-verb-object relations in processed text that match the predicate-argument structure of SNARK's logical expression: "Samuel Palmisano is CEO of IBM."

3. Search for the passage with the highest density of relevant words and an entity of the right type for the answer: "Samuel Palmisano .... CEO .... IBM."

Use coreference links to get the most informative answer

First Mode

TextSearch(Person, Terms(CEO), Terms(IBM), WSeq)
  & Analyze(WSeq, Role(?x,Management,IBM,CEO))
  --> CEO(?x,IBM)

CEO(Samuel Palmisano,IBM)

Analyze

Entity1: {Samuel Palmisano, Palmisano, head, he}
Entity2: {IBM, International Business Machines, they}
Relation: Role(Entity1,Entity2, Management,CEO)

<relation TYPE=Role SUBTYPE=Management>
  <rel_entity_arg ID="Entity1" ARGNUM="1"/>
  <rel_entity_arg ID="Entity2" ARGNUM="2"/>
  <rel_attribute ATTR="POSITION">CEO</rel_attribute>
</relation>
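A minimal sketch of the translation step in this mode: read the relation annotation and emit a literal such as CEO(?x,IBM). It assumes the annotation is well-formed XML (so TYPE and SUBTYPE are quoted here, unlike on the slide) and that each entity set is represented by its first mention; `relation_to_literal` and `ENTITIES` are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Attribute values are quoted so the fragment parses as XML; the slide
# shows TYPE and SUBTYPE unquoted.
RELATION_XML = """
<relation TYPE="Role" SUBTYPE="Management">
  <rel_entity_arg ID="Entity1" ARGNUM="1"/>
  <rel_entity_arg ID="Entity2" ARGNUM="2"/>
  <rel_attribute ATTR="POSITION">CEO</rel_attribute>
</relation>
"""

# Assumed canonical name for each coreference set (its first mention).
ENTITIES = {"Entity1": "Samuel Palmisano", "Entity2": "IBM"}

def relation_to_literal(xml_text):
    """Turn a Role relation into a literal such as CEO(person,org)."""
    rel = ET.fromstring(xml_text.strip())
    args = sorted(rel.findall("rel_entity_arg"), key=lambda e: int(e.get("ARGNUM")))
    position = rel.find("rel_attribute[@ATTR='POSITION']").text
    names = [ENTITIES[a.get("ID")] for a in args]
    return f"{position}({names[0]},{names[1]})"

literal = relation_to_literal(RELATION_XML)
```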

Three Modes of Operation for TextPro

Example question: Where does the CEO of IBM live?

1. Search for predefined patterns (MUC-style) and translate template into SNARK's logic

2. Search for subject-verb-object relations in processed text that match the predicate-argument structure of SNARK's logical expression: "Samuel Palmisano heads IBM."

3. Search for the passage with the highest density of relevant words and an entity of the right type for the answer: "Samuel Palmisano .... CEO .... IBM."

Use coreference links to get the most informative answer

Second Mode

TextSearch(Person, Terms(CEO), Terms(IBM), WSeq)
  & Analyze(WSeq, CEO(?x,IBM))
  --> CEO(?x,IBM)

"<subj> Samuel Palmisano </subj> <verb> heads </verb> <obj> IBM </obj>"

CEO(Samuel Palmisano,IBM)

Analyze
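The second mode can be sketched as matching a tagged subject-verb-object clause against the predicate-argument structure of the query. The tag format follows the slide; the `VERBS_FOR` lexicon and both helper functions are assumptions for illustration, not TextPro's output format or API.

```python
import re

TAGGED = "<subj> Samuel Palmisano </subj> <verb> heads </verb> <obj> IBM </obj>"

# Assumed lexicon: verbs taken to express each predicate.
VERBS_FOR = {"CEO": {"heads", "leads", "runs"}}

def svo(tagged):
    """Extract the subject, verb and object strings from the tagged clause."""
    fields = {}
    for tag in ("subj", "verb", "obj"):
        m = re.search(rf"<{tag}>\s*(.*?)\s*</{tag}>", tagged)
        fields[tag] = m.group(1) if m else None
    return fields

def match_literal(tagged, pred, const):
    """Return the binding for ?x in pred(?x, const), or None if no match."""
    f = svo(tagged)
    if f["verb"] in VERBS_FOR.get(pred, set()) and f["obj"] == const:
        return f["subj"]
    return None

binding = match_literal(TAGGED, "CEO", "IBM")
```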

Three Modes of Operation for TextPro

Example question: Where does the CEO of IBM live?

1. Search for predefined patterns (MUC-style) and translate template into SNARK's logic

2. Search for subject-verb-object relations in processed text that match the predicate-argument structure of SNARK's logical expression: "Samuel Palmisano is CEO of IBM."

3. Search for the passage with the highest density of relevant words and an entity of the right type for the answer: "Samuel Palmisano .... CEO .... IBM."

Use coreference links to get the most informative answer

Third Mode

TextSearch(Person, Terms(CEO), Terms(IBM), WSeq)
  & Analyze(WSeq, CEO(?x,IBM))
  --> CEO(?x,IBM)

"<person> He </person> has recently been rumored to have been appointed Lou Gerstner's successor as <CEOword> CEO </CEOword> of the major computer maker nicknamed <co> Big Blue </co>"

CEO(Samuel Palmisano,IBM)

Analyze

"<person> Samuel Palmisano </person> ...."

coref
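For the third mode, a crude density score plus a coreference lookup is enough to show the idea. The scoring function, the term list, and the `COREF` table below are simplifications invented for this sketch; in particular, taking the first word of the passage as the answer mention is a shortcut, not how an entity of the right type would really be selected.

```python
PASSAGES = [
    "He has recently been rumored to have been appointed Lou Gerstner's "
    "successor as CEO of the major computer maker nicknamed Big Blue",
    "The company reported results on Tuesday",
]

# Assumed term list; "Big Blue" stands in for a synonym of IBM.
RELEVANT = ["CEO", "IBM", "Big Blue", "successor"]

# Hypothetical coreference chain linking the pronoun to a full name.
COREF = {"He": "Samuel Palmisano"}

def density(passage):
    """Fraction of relevant terms that occur in the passage."""
    return sum(term in passage for term in RELEVANT) / len(RELEVANT)

def best_answer(passages):
    """Pick the densest passage and resolve its leading mention via coref."""
    best = max(passages, key=density)
    mention = best.split()[0]
    return COREF.get(mention, mention)

answer = best_answer(PASSAGES)
```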

Challenges for IE

Cross-document identification of individuals

Document 1: Osama bin Laden

Document 2: bin Laden

Document 3: Usama bin Laden

Do entities with the same or similar names represent the same individual?

Metonymy

Text: Beijing approved the UN resolution on Iraq.

Query involves "China", not "Beijing"
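Both challenges can be illustrated with small lookup tables, with the caveat that name similarity alone cannot settle whether two mentions denote the same person. The variant table, the metonym table, and the suffix-matching heuristic below are assumptions for this sketch only.

```python
# Hand-written tables for illustration; a real system would need far
# richer evidence than string similarity to merge identities.
SPELLING_VARIANTS = {"usama": "osama"}
METONYMS = {"Beijing": "China"}

def name_key(name):
    """Normalize case and known transliteration variants, token by token."""
    return tuple(SPELLING_VARIANTS.get(t, t) for t in name.lower().split())

def may_corefer(name_a, name_b):
    """True if one normalized name is a suffix of the other (e.g. surname only)."""
    shorter, longer = sorted((name_key(name_a), name_key(name_b)), key=len)
    return longer[len(longer) - len(shorter):] == shorter

def expand_query_term(term):
    """Return the term plus any place names that can stand in for it."""
    return [term] + [m for m, target in METONYMS.items() if target == term]
```

With these tables, a query about "China" would also be matched against text mentioning "Beijing", and the three document mentions above would be flagged as candidates for the same individual.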

Temporal Reasoning: Structure

Topology of Time: start, end, before, between

Measures of Duration: for an hour, ...

Clock and Calendar: 3:45pm, Wednesday, June 12

Temporal Aggregates: every other Wednesday

Deictic Time: last year, ...
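A few of these layers can be sketched with Python's standard `datetime` module: two topological relations and one temporal aggregate. The function names are invented here and are not DAML-Time vocabulary; the year 2002 for "Wednesday, June 12" is an assumption.

```python
from datetime import date, timedelta

def before(interval_a, interval_b):
    """Topology: a is before b if a ends no later than b starts."""
    return interval_a[1] <= interval_b[0]

def between(instant, start, end):
    """Topology: the instant lies within [start, end]."""
    return start <= instant <= end

def every_other(weekday, start, count):
    """Temporal aggregate: `count` dates on `weekday`, two weeks apart."""
    first = start + timedelta(days=(weekday - start.weekday()) % 7)
    return [first + timedelta(weeks=2 * i) for i in range(count)]

WEDNESDAY = 2                       # date.weekday(): Monday is 0
# Start from the slide's "Wednesday, June 12"; the year 2002 is assumed.
dates = every_other(WEDNESDAY, date(2002, 6, 12), 3)
```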

Temporal Reasoning: Goals

Develop temporal ontology (DAML)

Reason about time in SNARK (AQUAINT, DAML)

Link with Temporal Annotation Language TimeML (AQUAINT)

Answer questions with temporal component (AQUAINT)

Nearly complete

In progress

Convergence

DAML Annotation of Temporal Information on Web (DAML-Time)

Annotation of Temporal Information in Text (TimeML)

Most information on Web is in text

The two annotation schemes should be intertranslatable

TimeML Annotation Scheme (An Abstract View)

[Figure: timeline diagram. Example labels: 2001, 6 mos, Sept 11, warning. Element types: clock & calendar intervals & instants, intervals, inclusion, before, durations, instantaneous events.]

TimeML Example

The top commander of a Cambodian resistance force said Thursday he has sent a team to recover the remains of a British mine removal expert kidnapped and presumed killed by Khmer Rouge guerrillas two years ago.

[Figure: temporal graph over the events and times in the sentence: resist, command, said, sent, recover, remove, kidnap, presumed, killed, remain, Thursday, now, 2 years.]

Vision for Time

Manual DAML temporal annotation of web resources

Manual temporal annotation of large NL corpus

Programs for automatic temporal annotation of NL text

Automatic DAML temporal annotation of web resources

Spatial and Geographical Reasoning: Structure

Topology of Space: Is Albania a part of Europe?

Dimensionality: How long/big is Chile?

Measures: How large is North Korea?

Orientation and Shape: What direction is Monterey from SF?

Latitude and Longitude: Alexandria Digital Library Gazetteer

Political Divisions: CIA World Fact Book, ...
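The orientation question can be answered from latitude and longitude with the standard initial great-circle bearing formula. The coordinates below are approximate city-center values assumed for illustration; in the system described here they would come from a gazetteer.

```python
import math

# Approximate city-center coordinates in degrees (assumed for illustration).
SAN_FRANCISCO = (37.77, -122.42)
MONTEREY = (36.60, -121.89)

def initial_bearing(origin, target):
    """Initial great-circle bearing from origin to target, in degrees
    clockwise from north."""
    lat1, lon1 = map(math.radians, origin)
    lat2, lon2 = map(math.radians, target)
    dlon = lon2 - lon1
    y = math.sin(dlon) * math.cos(lat2)
    x = (math.cos(lat1) * math.sin(lat2)
         - math.sin(lat1) * math.cos(lat2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360

bearing = initial_bearing(SAN_FRANCISCO, MONTEREY)
```

With these coordinates the bearing falls in the south-east quadrant, that is, Monterey lies roughly south to south-southeast of San Francisco.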

Spatial and Geographical Reasoning: Goals

Develop spatial and geographical ontology (DAML)

Reason about space and geography in SNARK (AQUAINT, DAML)

Attach spatial and geographical resources (AQUAINT)

Answer questions with spatial component (AQUAINT)

Some capability now

Status and Future Directions

Basic architecture essentially complete

A good sampling of web and other resources has been incorporated

Focus on bulking up knowledge base relevant to domain (nonproliferation)

Focus on dialogue structure