1
Engineering a Knowledge Base for
an Intelligent Personal Assistant
Vinay K. Chaudhri, Adam Cheyer, Richard Guili,
Bill Jarrold, Karen L. Myers, John Niekrasz
2
Outline
Problem
KB Development
Knowledge Engineering Challenges
Deploying the Knowledge Base
Future Work
Summary and Conclusions
3
Problem
Cognitive Assistant that Learns and Organizes (CALO)
Learn from experience
Be told what to do
Explain what it is doing
Reflect on experience
Respond robustly to surprises
Situated in an office environment
4
CALO Functions
Organize & Manage Information
Schedule & Organize in Time
Acquire, Allocate Resources
Prepare Information Products
Monitor & Manage Tasks
Observe & Mediate Interactions
[Figure: CALO architecture diagram. Component labels: Perception Manager, Meeting Activity Recognition, Knowledge Manager, Timeline Mgr, Update Mgr, Query Mgr, Memory Mgr, Task Manager, Interaction Manager, Interpretation, NL/Speech, IRIS, Explanation, Knowledge Base, Timeline Database (Episodic Memory), Collaborative Problem Solver, Plan Reasoner, Task Exec, Coordination Mgr, Participant Tracking, Cyber Manager, Remote Cyber Environment, Local Cyber Environment]
5
An Ontology is needed to permit sharing of data and knowledge across these various subcomponents
6
Learning in Context
Provides greater value, needs fewer examples
[Figure: an email thread is fed to a learning algorithm that must answer questions such as "Important?", "Meeting?", "…?"]
Email 1
  Subj: fMRI meeting
  We need to meet soon to discuss the paper deadline.
Email 2
  To: Sue@sri.com
  Subj: Re: fMRI meeting
  Ok, I suggest Wednesday at 4pm.
Email 3
  To: Bob@sri.com
  Subj: Re: fMRI meeting
  See you then. Attached is the current draft.
Context available to the learner
  Entities: Projects, Meetings, Files
  Relations: Leader of, Manager of, Relevant to, Works on, Meeting for
7
8
Example Functionality
CALO will automatically put together a portfolio of information (e.g., mail, files, web pages) relevant to your projects and to upcoming meetings
CALO will summarize, prioritize, and classify an email.
CALO will identify the action items and produce an annotated meeting record.
9
Test Questions: PQs and IQs (Parameterized Questions & Instantiated Questions)
What |sc:%Meeting| is being discussed or suggested in |io:%EmailMessage|?
What is the duration suggested for the meeting discussed in |io:%EmailMessage|?
What is the time suggested for the meeting discussed in |io:%EmailMessage|?
What date is mentioned in |io:%Email|?
What location is mentioned in |io:%Email|?
What time is mentioned in |io:%Email|?
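As a rough illustration of how a parameterized question (PQ) might become an instantiated question (IQ), the sketch below fills each `|prefix:%Class|` slot with a concrete identifier. The slot syntax is copied from the questions above. The function names `slots` and `instantiate`, the binding format, and the identifier `email-042` are assumptions made for illustration, not CALO's actual mechanism.

```python
import re

# Matches slots such as |io:%EmailMessage| or |sc:%Meeting|.
SLOT = re.compile(r"\|(\w+):%(\w+)\|")

def slots(pq):
    """Return the (prefix, class_name) pairs found in a parameterized question."""
    return SLOT.findall(pq)

def instantiate(pq, bindings):
    """Replace each slot with the value bound to its class name.

    Slots without a binding are left untouched, so a partially
    instantiated question is still recognizable as a PQ.
    """
    def fill(match):
        class_name = match.group(2)
        return bindings.get(class_name, match.group(0))
    return SLOT.sub(fill, pq)

pq = "What is the time suggested for the meeting discussed in |io:%EmailMessage|?"
iq = instantiate(pq, {"EmailMessage": "email-042"})  # hypothetical identifier
print(iq)
```

Keeping the class name inside the slot means the same template can be checked against the ontology before any instance is bound to it.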
10
Outline
Problem
KB Development
Knowledge Engineering Challenges
Deploying the Knowledge Base
Future Work
Summary and Conclusions
11
KB Development
Knowledge Representation Framework
Development Process
Overview of Knowledge Content
12
Knowledge Representation Framework
The “core” ontology
The Component Library (CLIB)
  Barker, Porter, Clark, KCAP 2001
  CLIB is written in KM (Knowledge Machine)
Re-usable, Composable, Domain-Independent Library
Richly axiomatized event classes (e.g. Move, Attach)
13
Knowledge Representation Framework
We used OWL for sharing the knowledge with modules that needed to load the ontology
We developed a KM-to-OWL translator
We limited the translation to the subset of KM that could be translated into OWL
14
Knowledge Representation Framework
SPARK procedure language for representing knowledge about performing automated tasks
Expressiveness of SPARK was essential for representing complex process structures necessary for accommodating office tasks
We represent uncertain knowledge using weighted rules
Weights are necessary to capture the output from learning methods
We are still able to expose a deterministic interface to the rest of the system
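One way to picture weighted rules behind a deterministic interface is sketched below. The slides do not say how weights are combined, so summing weights per conclusion and returning the top-scoring one is an assumption, as are the names `WeightedRule`, `score`, `decide`, and the keyword-based rule bodies. Only the general idea (weights inside, a single answer outside) comes from the text above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class WeightedRule:
    conclusion: str   # e.g. "Meeting" or "NotMeeting"
    weight: float     # stands in for a value produced by a learning method
    keywords: tuple   # hypothetical trigger words for this toy example

def score(rules, text):
    """Sum the weights of all rules whose keywords appear in the text."""
    totals = {}
    lowered = text.lower()
    for rule in rules:
        if any(k in lowered for k in rule.keywords):
            totals[rule.conclusion] = totals.get(rule.conclusion, 0.0) + rule.weight
    return totals

def decide(rules, text, default="Unknown"):
    """Deterministic interface: callers get one answer and never see weights."""
    totals = score(rules, text)
    if not totals:
        return default
    # Iterating in sorted order makes ties resolve alphabetically.
    return max(sorted(totals), key=lambda c: totals[c])

RULES = [
    WeightedRule("Meeting", 0.8, ("meet", "agenda")),
    WeightedRule("Meeting", 0.3, ("wednesday",)),
    WeightedRule("NotMeeting", 0.5, ("newsletter",)),
]

print(decide(RULES, "We need to meet soon to discuss the paper deadline."))
```

Because `decide` hides the scores, a learning method can adjust the weights without changing what other components call.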
15
Development Process
Distributed team with over 20 different research groups
We solicited requirements
  List of classes and relations
  Formal axioms
Large-scale reuse of ontologies
  iCalendar
  Work of Radarnetworks
Ontology simplification
  Eliminate unneeded constructs
  Simplify representation
Distributed development
  Use Protégé for knowledge authoring
16
Overview of Knowledge Content - The “Office Ontology”
People: first name, last name
Contacts: postal address, home address, work address
Emails: sender, receiver, etc.
Calendars: start, end, repetition
Projects/Tasks
Meetings: meeting types, discussion topics, meeting roles
Organizations: organizational roles
Learning Methods: capability of learning methods, data needed
Provenance: source of a piece of information
17
Overview of Knowledge Content - Example Class ChatSessionMessage
Comment: "Instances of #$ChatSessionMessage are complete messages passed between chat participants during a #$ChatSession. For example, if Bob and Fred are involved in CALO Online Chat, Bob might send the chat message 'Hi Fred' to Fred. Such a message is a #$ChatSessionMessage. More specifically, it is a #$ChatTextMessage. Please see the subclasses of #$ChatSessionMessage because developers are more likely to be referencing its subclasses. A negative example of #$ChatSessionMessage would be a portion of the message sent from Bob to Fred such as 'Hi Fr'."
Superclasses: ElectronicMessage, ComputerEncodedInformation
18
Overview of Knowledge Content
Process model system (PTIME) has approx 50 process models
In Core plus Office Ontology
  Approx 1000 classes
  Approx 500 relations
19
Outline
Problem
KB Development
Knowledge Engineering Challenges
Deploying the Knowledge Base
Future Work
20
Knowledge Engineering Challenges
Reusing iCalendar
Representing Meetings
Representing Tasks
Ensuring Interoperability
21
Reusing iCalendar
Prune the relations
  Not all of the relations were needed
  We did not want to bloat the ontology
  We retained only what was needed
Define symbol name mappings
  We renamed the relations to fit our standard naming convention
  But we retained the mappings to the original names
Link to the rest of the ontology
  We needed to define iCalendar relations using the existing vocabulary of People and Time
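The pruning and renaming steps above can be sketched as a two-way name map. The right-hand names (`DTSTART`, `DTEND`, `RRULE`, `LOCATION`, `ATTENDEE`) are standard iCalendar properties. The left-hand names, the choice of which properties to retain, and the sample event are hypothetical, since the slides do not list the actual mappings.

```python
# Renamed relation -> original iCalendar property name.
# Left-hand names are hypothetical examples of a local naming convention.
NAME_MAP = {
    "startTime": "DTSTART",
    "endTime": "DTEND",
    "repetitionRule": "RRULE",
    "location": "LOCATION",
    "attendee": "ATTENDEE",
}

# Reverse lookup, so the original names are never lost.
ORIGINAL_TO_LOCAL = {orig: local for local, orig in NAME_MAP.items()}

def import_event(ical_props):
    """Keep only the retained properties and rename them to local names."""
    return {
        ORIGINAL_TO_LOCAL[name]: value
        for name, value in ical_props.items()
        if name in ORIGINAL_TO_LOCAL
    }

def export_event(local_props):
    """Map local relation names back to the original iCalendar names."""
    return {NAME_MAP[name]: value for name, value in local_props.items()}

raw = {
    "DTSTART": "20060614T160000",
    "DTEND": "20060614T170000",
    "LOCATION": "Room 12",
    "SEQUENCE": "0",  # pruned here: not in the retained set
}
event = import_event(raw)
print(event)
```

Keeping the reverse map is what allows data to round-trip to tools that only understand the original iCalendar names.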
22
Representing Meetings
Communication Model
Modeling multi-modal communication
Modeling Discourse Structure
Modeling Meeting Activity
23
Representing Meetings: Model of Communication
24
Representing Meetings: Modeling Discourse Structure
Modeling Dialog Structures
  Define Communicate subclasses such as Statement, Question, BackChannel, etc.
Modeling Argument Structure
  Define coarser-level actions such as Raising an Issue, Proposal, Acceptance, etc.
25
Representing Meetings: Modeling the Meeting Activity
Provide ways to segment a meeting
  Physical state of participants
    Sitting, standing, etc.
  Agenda state of participants
    Position within a previously defined meeting structure
26
Representing Tasks
Tasks are modeled in terms of
  A task type
  A set of input and output parameters
  Whether a parameter is required or optional
  Allowed constraints
Task instances are used throughout the system
  Descriptive properties
    Priority, documentation, source, location, resource allocation and usage
  Temporal properties
    Creation time, start time (adapted from iCalendar)
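A minimal sketch of the task model described above, using plain Python dataclasses. All class, field, and task names (`Parameter`, `TaskType`, `TaskInstance`, `ScheduleMeeting`, and so on) are illustrative assumptions. The slides name the ingredients (task type, input and output parameters, required or optional, descriptive and temporal properties) but not their concrete representation, and allowed constraints are left out here for brevity.

```python
from dataclasses import dataclass, field

@dataclass
class Parameter:
    name: str
    direction: str         # "input" or "output"
    required: bool = True  # False means the parameter is optional

@dataclass
class TaskType:
    name: str
    parameters: list = field(default_factory=list)

    def missing_inputs(self, arguments):
        """Names of required input parameters absent from `arguments`."""
        return [
            p.name for p in self.parameters
            if p.direction == "input" and p.required and p.name not in arguments
        ]

@dataclass
class TaskInstance:
    task_type: TaskType
    arguments: dict
    priority: int = 0        # descriptive property
    creation_time: str = ""  # temporal property
    start_time: str = ""     # temporal property

    def is_ready(self):
        """A task instance is ready once every required input is bound."""
        return not self.task_type.missing_inputs(self.arguments)

# Hypothetical task type for illustration.
schedule_meeting = TaskType("ScheduleMeeting", [
    Parameter("participants", "input"),
    Parameter("duration", "input"),
    Parameter("location", "input", required=False),
    Parameter("meeting", "output"),
])

task = TaskInstance(schedule_meeting, {"participants": ["Bob", "Sue"]}, priority=2)
print(task.is_ready(), schedule_meeting.missing_inputs(task.arguments))
```

Separating the task type from its instances lets many running tasks share one parameter specification.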
27
Ensuring Interoperability
OWL does not allow n-ary relationships
Task representation requires representing position of an argument in a list
Needed to reify each argument so that the position could be specified
OWL does not allow specialization of primitive data types
Special kinds of strings such as Postal Code, Telephone Number are of interest
Needed to define a hierarchy of “pseudo ranges”
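The two workarounds above can be sketched in a few lines. The first function reifies each argument as its own node so that its position can be stated with binary relations only. The second models a "pseudo range" hierarchy as a lookup table kept alongside the ontology. The predicate names (`hasArgument`, `argumentPosition`, `argumentValue`) and the table layout are assumptions for illustration, not the vocabulary used in CALO.

```python
def reify_arguments(task_id, args):
    """Turn an ordered argument list into (subject, predicate, object) triples.

    Each argument becomes its own node carrying an explicit position,
    because a binary relation cannot encode order by itself.
    """
    triples = []
    for position, value in enumerate(args, start=1):
        node = f"{task_id}_arg{position}"
        triples.append((task_id, "hasArgument", node))
        triples.append((node, "argumentPosition", position))
        triples.append((node, "argumentValue", value))
    return triples

# Pseudo ranges: special kinds of strings arranged under a parent kind.
PSEUDO_RANGE_PARENT = {
    "PostalCode": "String",
    "TelephoneNumber": "String",
}

def is_pseudo_subrange(kind, ancestor):
    """Walk up the pseudo-range hierarchy looking for `ancestor`."""
    while kind is not None:
        if kind == ancestor:
            return True
        kind = PSEUDO_RANGE_PARENT.get(kind)
    return False

for triple in reify_arguments("task1", ["Bob", "Wednesday 4pm"]):
    print(triple)
```

The cost of reification is three triples per argument instead of one, which is the price of keeping the representation within binary relations.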
28
Outline
Problem
KB Development
Knowledge Engineering Challenges
Deploying the Knowledge Base
Future Work
29
Deploying the Knowledge Base
Querying the knowledge base
  Uniform point of access provided by a query manager
Updating the knowledge base
  Methods to update the instance data if the ontology changes
  Mechanism to propagate additions to the ontology at runtime
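To make "update the instance data if the ontology changes" concrete, the sketch below handles one simple kind of change: a relation that has been renamed. The function name `migrate_instances`, the dictionary-based instance format, and the example relation names are assumptions. The slides do not describe the actual update methods, and real ontology changes (for example, class splits or deleted relations) would need more than a rename table.

```python
def migrate_instances(instances, renamed_relations):
    """Return copies of the instances with renamed relations applied.

    `instances` is a list of dicts mapping relation name -> value.
    `renamed_relations` maps old relation name -> new relation name.
    Relations that were not renamed are copied through unchanged.
    """
    return [
        {renamed_relations.get(rel, rel): value for rel, value in inst.items()}
        for inst in instances
    ]

# Hypothetical instance data and rename.
old_data = [
    {"sender": "Bob", "receiver": "Sue", "subject": "fMRI meeting"},
    {"sender": "Sue", "receiver": "Bob", "subject": "Re: fMRI meeting"},
]
new_data = migrate_instances(old_data, {"receiver": "recipient"})
print(new_data[0])
```

Returning new dictionaries instead of editing in place keeps the old data available until the migrated version has been checked.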
30
Documentation (via Owldoc)
31
Outline
Problem
KB Development
Knowledge Engineering Challenges
Deploying the Knowledge Base
Future Work
32
Future Work
Align different OWL files
Ontology to help “Stacked Learning”
  A software engineer can solve a new learning problem by writing its specification in the ontology
  An end-user can specify a goal, and CALO can compute how to learn to meet that goal
  CALO can infer a user’s goal and learn how to achieve that goal