CSC 9010 Spring, 2011. Paula Matuszek
Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt
CSC 9010: Knowledge-Based Systems
Ontologies and OWL
Paula Matuszek
Spring, 2011
What Is An Ontology
• An ontology is an explicit description of a domain:
  – concepts
  – properties and attributes of concepts
  – constraints on properties and attributes
  – individuals (often, but not always)
• An ontology defines
  – a common vocabulary
  – a shared understanding
Ontology Examples
• Taxonomies on the Web
  – Google Directory, Google Recipes
• Catalogs for on-line shopping
  – Amazon.com product catalog
• Domain-specific standard terminology
  – Unified Medical Language System (UMLS) and MeSH
• Broad general ontologies
  – Cyc
Cyc Ontology
Why Develop an Ontology?
• To share common understanding of the structure of information
  – among people
  – among software agents
• To enable reuse of domain knowledge
  – to avoid “re-inventing the wheel”
  – to introduce standards to allow interoperability
More Reasons
• To make domain assumptions explicit
  – easier to change domain assumptions (consider a genetics knowledge base)
  – easier to understand and update legacy data
• To separate domain knowledge from the operational knowledge
  – re-use domain and operational knowledge separately (e.g., configuration based on constraints)
Consider Questions Like:
• Which wine should I serve with seafood today?
• What wines should I buy next Monday for my reception in Villanova, PA?
• Is there a market for the products of another small winery in this area?
• What online source is the best for wines for my party in Texas next fall?
What Do We Need to Know?
[Figure labels: French wines and wine regions; California wines and wine regions; “Which wine should I serve with seafood today?”; a shared ONTOLOGY of wine and food]
Wines and Wineries
An Ontology Is Often Just the Beginning
[Figure labels: Ontologies; Software agents; Problem-solving methods; Domain-independent applications; Databases (declare structure); Knowledge bases (provide domain description)]
What Does an Ontology Look Like?
• Can be any knowledge representation
• Typically a description logic or associative network of some kind
[Figure labels: Facts; English Representation; Internal Representation; Reasoning Programs]
Description Logics
• Modeling-based representations reflect the structure of the domain, and then reason based on the model.
  – Semantic Nets
  – Frames
  – Scripts
• Sometimes called associative networks
Basics of Description Logics
• All include
  – Concepts
  – Various kinds of links between concepts
    • “has-part” or aggregation
    • “is-a” or specialization
    • More specialized depending on domain
• Typically also include
  – Inheritance
  – Some kind of procedural attachment or restrictions or constraints
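The ingredients listed above (concepts, is-a and has-part links, inheritance) can be sketched in a few lines of Python. This is only an illustration: the concept names, the two dictionaries, and the helper functions are assumptions made for the example, not part of any ontology tool.

```python
# Hypothetical mini associative network; all names are illustrative only.
IS_A = {"RedWine": "Wine", "Chianti": "RedWine"}                 # specialization links
HAS_PART = {"Wine": ["Grape"], "Chianti": ["SangioveseGrape"]}   # aggregation links


def ancestors(concept):
    """Follow is-a links upward and return every more general concept."""
    chain = []
    while concept in IS_A:
        concept = IS_A[concept]
        chain.append(concept)
    return chain


def parts(concept):
    """Collect has-part links declared on the concept or inherited via is-a."""
    found = list(HAS_PART.get(concept, []))
    for parent in ancestors(concept):
        found.extend(HAS_PART.get(parent, []))
    return found
```

Here parts("Chianti") returns the part declared on Chianti itself followed by the part it inherits from Wine.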
Ontology-Development Process
General approach:
determine scope → consider reuse → enumerate terms → define classes → define properties → define constraints → create instances
Usually a highly iterative process. (Sound familiar?)
Preliminaries - Tools
• All screenshots in this tutorial are from Protégé, which:
  – is a graphical ontology-development tool
  – supports a rich knowledge model
  – is open-source and freely available (http://protege.stanford.edu)
• Some other available tools:
  – Ontolingua and Chimaera
  – OntoEdit
  – OilEd
  – OpenCyc
Determine Domain and Scope
• What is the domain that the ontology will cover?
• What are we going to use the ontology for?
• What types of questions should the information in the ontology answer (competency questions)?
Competency Questions
• Which wine characteristics should I consider when choosing a wine?
• Is Bordeaux a red or white wine?
• Does Cabernet Sauvignon go well with seafood?
• What is the best choice of wine for grilled meat?
• Which characteristics of a wine affect its appropriateness for a dish?
• Does the flavor or body of a specific wine change with vintage year?
• What were good vintages for Napa Zinfandel?
Consider Reuse
• Why reuse other ontologies?
  – to save effort
  – to interact with the tools that use other ontologies
  – to use ontologies that have been validated through use in applications
What to Reuse?
• Ontology libraries
  – Protégé ontology library (protege.cim3.net/cgi-bin/wiki.pl?ProtegeOntologiesLibrary)
  – Biomedical ontologies (bioontology.org)
  – Many others: try Googling for your topic
  – http://www.dmoz.org/
• Upper ontologies
  – IEEE Standard Upper Ontology (suo.ieee.org)
  – Cyc (www.cyc.com)
Enumerate Important Terms
• What are the terms we need to talk about?
• What are the properties of these terms?
• What do we want to say about the terms?
Enumerating Terms - The Wine Ontology
wine, grape, winery, location,
wine color, wine body, wine flavor, sugar content
white wine, red wine, Bordeaux wine
food, seafood, fish, meat, vegetables, cheese
Define Classes and the Class Hierarchy
• A class is a concept in the domain
  – a class of wines
  – a class of wineries
  – a class of red wines
• A class is a collection of elements with similar properties
• Instances of classes
  – a glass of California wine you’ll have for lunch
Class Inheritance
• Classes usually constitute a taxonomic hierarchy (a subclass-superclass hierarchy)
• A class hierarchy is usually an IS-A hierarchy: an instance of a subclass is an instance of a superclass
• If you think of a class as a set of elements, a subclass is a subset
Class Inheritance - Example
• Apple is a subclass of Fruit
  Every apple is a fruit
• Red wine is a subclass of Wine
  Every red wine is a wine
• Chianti wine is a subclass of Red wine
  Every Chianti wine is a red wine
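The IS-A reading of these examples maps directly onto class inheritance in a programming language. A minimal Python sketch follows; the class names mirror the slide, and the instance name is made up for illustration.

```python
class Wine:
    """Top of the example hierarchy."""


class RedWine(Wine):
    """Every red wine is a wine."""


class Chianti(RedWine):
    """Every Chianti wine is a red wine."""


# A hypothetical individual: its direct type is Chianti,
# and every superclass of Chianti is also a type of it.
bottle = Chianti()
is_red = isinstance(bottle, RedWine)
is_wine = isinstance(bottle, Wine)
```

The converse does not hold: Wine is not a subclass of RedWine, just as the set of all wines is not a subset of the red wines.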
Levels in the Hierarchy
[Figure: class hierarchy with the top level, middle level, and bottom level marked]
Modes of Development
• top-down – define the most general concepts first and then specialize them
• bottom-up – define the most specific concepts and then organize them in more general classes
• combination – define the more salient concepts first and then generalize and specialize them
Documentation
• Classes (and properties) usually have documentation
  – Describing the class in natural language
  – Listing domain assumptions relevant to the class definition
  – Listing synonyms
• Documenting classes and properties is as important as documenting computer code!
Define Properties of Classes
• Properties in a class definition describe attributes of instances of the class and relations to other instances
  Each wine will have color, sugar content, producer, etc.
Properties
• Types of properties
  – “intrinsic” properties: flavor and color of wine
  – “extrinsic” properties: name and price of wine
  – parts: ingredients in a dish
  – relations to other objects: producer of wine (winery)
• Simple and complex properties
  – simple properties (attributes): contain primitive values (strings, numbers)
  – complex properties: contain (or point to) other objects (e.g., a winery instance)
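The simple/complex distinction can be illustrated with Python dataclasses. The field names and sample values below are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Winery:
    name: str            # simple property: a primitive string value


@dataclass
class Wine:
    name: str            # simple, "extrinsic"
    price: float         # simple, "extrinsic"
    color: str           # simple, "intrinsic"
    flavor: str          # simple, "intrinsic"
    producer: Winery     # complex: points to another object


# Hypothetical individuals used only to show the two kinds of values.
example_winery = Winery(name="Example Winery")
example_wine = Wine(name="Example Red", price=15.0, color="red",
                    flavor="strong", producer=example_winery)
```

The producer field holds a reference to a Winery instance rather than a copy of its name, which is what makes it a complex property.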
Properties for the Class Wine
(in Protégé-2000)
Property and Class Inheritance
• A subclass inherits all the properties from the superclass
  If a wine has a name and flavor, a red wine also has a name and flavor
• If a class has multiple superclasses, it inherits properties from all of them
  Port is both a dessert wine and a red wine. It inherits “sugar content: high” from the former and “color: red” from the latter
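The Port example can be mimicked with multiple inheritance in Python. This is a sketch only: class attributes stand in for property values, and the attribute names are assumptions.

```python
class Wine:
    name = None            # every wine has a name and a flavor
    flavor = None


class DessertWine(Wine):
    sugar_content = "high"


class RedWine(Wine):
    color = "red"


class Port(DessertWine, RedWine):
    """Inherits from both superclasses without declaring anything itself."""
```

Port.sugar_content comes from DessertWine and Port.color from RedWine, while name and flavor arrive from Wine through either path.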
Property Constraints
• Property constraints (restrictions) describe or limit the set of possible values for a property
  The name of a wine is a string
  The wine producer is an instance of Winery
  A winery has exactly one location
Restrictions for Properties at the Wine Class
Common Restrictions
• Property cardinality – the number of values a property has
• Property value type – the type of values a property has
• Minimum and maximum value – a range of values for a numeric property
• Default value – the value a property has unless explicitly specified otherwise
Common Restrictions: Property Cardinality
• Cardinality
  – Cardinality N means that the property must have exactly N values
• Minimum cardinality
  – Minimum cardinality 1 means that the property must have a value (required)
  – Minimum cardinality 0 means that the property value is optional
• Maximum cardinality
  – Maximum cardinality 1 means that the property can have at most one value (single-valued property)
  – Maximum cardinality greater than 1 means that the property can have more than one value (multiple-valued property)
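A cardinality restriction amounts to a bound on how many values a property holds. The function below is a hypothetical checker written for illustration; it is not the API of Protégé or any other tool.

```python
def check_cardinality(values, minimum=0, maximum=None):
    """Return True when the number of values lies between minimum and maximum.

    maximum=None means no upper bound (a multiple-valued property).
    """
    count = len(values)
    if count < minimum:
        return False
    if maximum is not None and count > maximum:
        return False
    return True
```

"A winery has exactly one location" corresponds to minimum=1 and maximum=1, an optional property to minimum=0, and a single-valued property to maximum=1.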
Common Restrictions: Value Type
• String: a string of characters (“Château Lafite”)
• Number: an integer or a float (15, 4.5)
• Boolean: a true/false flag
• Enumerated type: a list of allowed values (high, medium, low)
• Complex type: an instance of another class
  – Specify the class to which the instances belong
    The Wine class is the value type for the property “produces” at the Winery class
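The value types above have close Python counterparts, so a value-type restriction can be sketched as an isinstance check. The property table and its names are assumptions made for this example.

```python
from enum import Enum


class SugarLevel(Enum):
    """Enumerated type: a fixed list of allowed values."""
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


class Wine:
    """Stand-in class used as a complex value type."""


# Hypothetical property table: property name -> allowed Python type(s).
VALUE_TYPES = {
    "name": str,                  # String
    "price": (int, float),        # Number
    "is_sparkling": bool,         # Boolean
    "sugar_content": SugarLevel,  # Enumerated type
    "produces": Wine,             # Complex type (property at the Winery class)
}


def conforms(prop, value):
    """Return True when value has the type declared for prop."""
    # Caveat: bool is a subclass of int in Python, so True passes as a Number.
    return isinstance(value, VALUE_TYPES[prop])
```

With this table, the bare string "high" is rejected for sugar_content, while SugarLevel.HIGH is accepted.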
Domain and Range of Property
• Domain of a property – the class (or classes) that have the property
  – More precisely: the class (or classes) whose instances can have the property
• Range of a property – the class (or classes) to which property values belong
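Read in the frame style used here, domain and range act as checks on who may carry a property and what it may point to. The sketch below assumes a made-up property table and helper; note that OWL reasoners typically use domain and range to infer types rather than to reject statements.

```python
class Wine:
    pass


class RedWine(Wine):
    pass


class Winery:
    pass


# Hypothetical table: property name -> (domain class, range class).
PROPERTIES = {"produces": (Winery, Wine)}


def statement_ok(subject, prop, value):
    """Return True when subject is in the domain and value is in the range."""
    domain, range_ = PROPERTIES[prop]
    return isinstance(subject, domain) and isinstance(value, range_)
```

Because RedWine is a subclass of Wine, a winery that produces a red wine satisfies a range of Wine.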
Restrictions and Class Inheritance
• A subclass inherits all the properties from the superclass
• A subclass can override the restrictions to “narrow” the list of allowed values
  – Make the cardinality range smaller
  – Replace a class in the range with a subclass
[Figure: French wine is-a Wine; French winery is-a Winery; the producer of a Wine is a Winery, and the producer of a French wine is a French winery]
Create Instances
• Create an instance of a class
  – The class becomes a direct type of the instance
  – Any superclass of the direct type is a type of the instance
• Assign property values for the instance frame
  – Property values should conform to the facet constraints
  – Knowledge-acquisition tools often check that
Creating an Instance: Example
Defining Classes and a Class Hierarchy
• The things to remember:
  – There is no single correct class hierarchy
  – But there are some guidelines
• The question to ask:
  “Is each instance of the subclass an instance of its superclass?”
Disjoint Classes
• Classes are disjoint if they cannot have common instances
• Disjoint classes cannot have any common subclasses either
  – Red wine, White wine, Rosé wine are disjoint
  – Dessert wine and Red wine are not disjoint
[Figure: Wine has subclasses Red wine, Rosé wine, White wine, and Dessert wine; Port is a subclass of both Dessert wine and Red wine]
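The rule that disjoint classes cannot share a subclass can be checked mechanically. The sketch below assumes the hierarchy is stored as a dictionary from class name to its direct superclasses; the data layout and function names are illustrative only.

```python
# Hypothetical encoding of the example: class -> set of direct superclasses.
HIERARCHY = {
    "RedWine": {"Wine"},
    "WhiteWine": {"Wine"},
    "RoseWine": {"Wine"},
    "DessertWine": {"Wine"},
    "Port": {"RedWine", "DessertWine"},
}
DISJOINT_GROUPS = [{"RedWine", "WhiteWine", "RoseWine"}]


def superclasses(cls, hierarchy):
    """Return every direct or indirect superclass of cls."""
    seen = set()
    stack = [cls]
    while stack:
        for parent in hierarchy.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def disjointness_violations(cls, hierarchy, disjoint_groups):
    """List the mutually disjoint classes that cls wrongly falls under together."""
    lineage = superclasses(cls, hierarchy) | {cls}
    return [sorted(group & lineage)
            for group in disjoint_groups
            if len(group & lineage) > 1]
```

Port passes because Dessert wine and Red wine are not declared disjoint; a class placed under both Red wine and White wine would be reported.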
Classes and Their Names
• Classes represent concepts in the domain, not their names
• The class name can change, but it will still refer to the same concept
• Synonym names for the same concept are not different classes
  – Many systems allow listing synonyms as part of the class definition
Back to the Properties: Domain and Range
• When defining a domain or range for a property, find the most general class or classes
• Consider the flavor property
  – Rather than Domain: Red wine, White wine, Rosé wine
  – Use Domain: Wine
• Consider the produces property for a Winery:
  – Rather than Range: Red wine, White wine, Rosé wine
  – Use Range: Wine
[Figure labels: class (DOMAIN); property; allowed values (RANGE)]
Inverse Properties
Maker and Producer are inverse properties
Inverse Properties (II)
• Inverse properties contain redundant information, but
  – Allow acquisition of the information in either direction
  – Enable additional verification
  – Allow presentation of information in both directions
• The actual implementation differs from system to system
  – Are both values stored?
  – When are the inverse values filled in?
  – What happens if we change the link to an inverse property?
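One possible answer to these implementation questions is to store both directions and update them together. The sketch below makes that choice for illustration; the attribute names maker and produces and the helper set_maker are assumptions, and other systems may compute the inverse on demand instead.

```python
class Wine:
    def __init__(self, name):
        self.name = name
        self.maker = None        # Wine -> Winery


class Winery:
    def __init__(self, name):
        self.name = name
        self.produces = []       # Winery -> list of Wine (inverse of maker)


def set_maker(wine, winery):
    """Set wine.maker and keep the inverse winery.produces consistent."""
    if wine.maker is not None:
        wine.maker.produces.remove(wine)   # drop the stale inverse link
    wine.maker = winery
    winery.produces.append(wine)
```

Changing the maker of a wine removes it from the old winery's list and adds it to the new one, so the two directions never disagree.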
Limiting the Scope
• An ontology should not contain all the possible information about the domain
  – No need to specialize or generalize more than the application requires
  – No need to include all possible properties of a class
    • Only the most salient properties
    • Only the properties that the applications require
Limiting the Scope (II)
• Ontology of wine, food, and their pairings probably will not include
  – Bottle size
  – Label color
  – My favorite food and wine
• An ontology of biological experiments will contain
  – Biological organism
  – Experimenter
• Is the class Experimenter a subclass of Biological organism?
Ontology Languages
• RDF and RDFS
• DAML+OIL
• OWL
• CycL
• Others
• OWL has a special status within the semantic web community: it is a W3C Recommendation
OWL Representation
• An OWL file is a text file
  – XML
  – Uses RDFS
  – Plus OWL-specific extensions
• If you are using a tool like Protégé to create your ontology, you don’t need to know the underlying representation
• If you’re going to do something with it, you do.
OWL Example
• This ontology
• Becomes this text file: wine.txt
OWL: Classes and a Class Hierarchy
• Classes
  – Each class is an instance of owl:Class
• Class hierarchy
  – Defined by rdfs:subClassOf
• More ways to specify organization of classes
  – Disjointness (owl:disjointWith)
  – Equivalence (owl:equivalentClass)
• Predefined: owl:Thing, owl:Nothing
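Because an OWL file in RDF/XML syntax is ordinary XML, the class declarations can be read with a standard XML parser. The sketch below uses Python's xml.etree.ElementTree on a tiny hand-written fragment; the fragment and the read_hierarchy helper are assumptions for illustration and are not taken from the wine.txt file mentioned earlier. A real application would more likely use an RDF library, since the same ontology can be serialized in many equivalent ways.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# Hypothetical two-class fragment in RDF/XML syntax.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Class rdf:ID="Wine"/>
  <owl:Class rdf:ID="RedWine">
    <rdfs:subClassOf rdf:resource="#Wine"/>
  </owl:Class>
</rdf:RDF>"""


def read_hierarchy(text):
    """Map each top-level named owl:Class to its declared superclasses."""
    root = ET.fromstring(text)
    hierarchy = {}
    for cls in root.findall(f"{{{OWL}}}Class"):
        name = cls.get(f"{{{RDF}}}ID")
        parents = [sub.get(f"{{{RDF}}}resource").lstrip("#")
                   for sub in cls.findall(f"{{{RDFS}}}subClassOf")]
        hierarchy[name] = parents
    return hierarchy
```

This simple reader only handles classes named with rdf:ID and superclasses given as rdf:resource references, which is enough for the fragment above.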
More Ways To Define a Class in OWL
• Union of classes: owl:unionOf
  A class Person is a union of classes Male and Female
• Intersection of classes: owl:intersectionOf
  A class Red Wine is an intersection of Wine and Red Thing
• Complement of a class: owl:complementOf
  Carnivores are all the animals that are not herbivores
• Enumeration of elements: owl:oneOf
  A class Wine Color contains the following instances: red, white, rosé
• Restriction on properties: owl:Restriction
  A class Red Thing is a collection of things with color: Red
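If each class is pictured as the set of its instances, these constructors behave like ordinary set operations. The Python sketch below uses made-up individuals; it illustrates the intended meaning only and is not how an OWL reasoner works internally.

```python
# Hypothetical individuals; each class is modeled as the set of its instances.
animals = {"lion", "wolf", "cow", "sheep"}
herbivores = {"cow", "sheep"}
# Complement, taken relative to animals ("animals that are not herbivores").
carnivores = animals - herbivores

male = {"adam"}
female = {"eve"}
person = male | female                               # union

colors = {"merlot": "red", "chablis": "white", "fire_truck": "red"}
wine = {"merlot", "chablis"}
# Restriction: the things whose color property has the value "red".
red_thing = {thing for thing, color in colors.items() if color == "red"}
red_wine = wine & red_thing                          # intersection

wine_color = frozenset({"red", "white", "rosé"})     # enumeration of elements
```

Here red_wine contains only the merlot, since the fire truck is a red thing but not a wine.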
Properties in OWL
• Two kinds of property
  – owl:ObjectProperty. Object properties relate objects to other objects. Example: Goes-well-with.
  – owl:DatatypeProperty. Datatype properties relate objects to datatype values. Example: Has-cost.
• Properties can be equivalent
  – owl:equivalentProperty. Example: bottled-by and produced-by
Some Special Properties in OWL
• owl:TransitiveProperty
  – Is more expensive than
• owl:SymmetricProperty
  – Comes from the same vineyard as
• owl:FunctionalProperty
  – Has UPC code
• owl:InverseFunctionalProperty
  – Is UPC code of
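Each of these characteristics has a simple reading when a property is viewed as a set of (subject, value) pairs. The helpers below are a sketch under that assumption and their names are invented. They treat functional properties as checks, whereas an OWL reasoner would instead conclude that two values of a functional property denote the same individual.

```python
def transitive_closure(pairs):
    """Add (a, c) whenever (a, b) and (b, c) are present, until nothing changes."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def symmetric_closure(pairs):
    """Add (b, a) for every (a, b)."""
    return set(pairs) | {(b, a) for a, b in pairs}


def is_functional(pairs):
    """True when no subject is related to two different values."""
    seen = {}
    for subject, value in pairs:
        if seen.setdefault(subject, value) != value:
            return False
    return True


def is_inverse_functional(pairs):
    """True when no value is shared by two different subjects."""
    return is_functional({(value, subject) for subject, value in pairs})
```

For example, if wine a is more expensive than b and b is more expensive than c, the transitive closure adds that a is more expensive than c.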
Research Issues in Ontology Creation
• Content generation
• Analysis and evaluation
• Maintenance
• Ontology languages
• Tool development
Content: Top-Level Ontologies
• What does “top-level” mean?
  – Objects: tangible, intangible
  – Processes, events, actors, roles
  – Agents, organizations
  – Spaces, boundaries, location
  – Time
• IEEE Standard Upper Ontology effort
  – Goal: Design a single upper-level ontology
  – Process: Merge the upper levels of existing ontologies
Content: Knowledge Acquisition
• Knowledge acquisition is a bottleneck
• Sharing and reuse alleviate the problem
• But we need automated knowledge acquisition techniques
  – Linguistic techniques: ontology acquisition from text
  – Machine-learning: generate ontologies from structured documents (e.g., XML documents)
  – Exploiting the Web structure: generate ontologies by crawling structured Web sites
  – Knowledge-acquisition templates: experts specify only part of the knowledge required
Analysis
• Analysis: semantic consistency
  – Violation of property constraints
  – Cycles in the class hierarchy
  – Terms which are used but not defined
  – Interval restrictions that produce empty intervals (min > max)
• Analysis: style
  – Classes with a single subclass
  – Classes and properties with no definitions
  – Properties with no constraints (value type, cardinality)
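Several of the consistency checks listed above are easy to automate. The sketch below covers three of them, assuming the class hierarchy is a dictionary from class name to its direct superclasses and interval restrictions are (min, max) pairs; the data layout and function names are illustrative.

```python
def find_cycle(subclass_of):
    """Return one class that takes part in a subclass cycle, or None."""
    def reaches(start, current, seen):
        for parent in subclass_of.get(current, ()):
            if parent == start:
                return True
            if parent not in seen:
                seen.add(parent)
                if reaches(start, parent, seen):
                    return True
        return False

    for cls in subclass_of:
        if reaches(cls, cls, set()):
            return cls
    return None


def undefined_terms(subclass_of):
    """Return classes that are used as superclasses but never defined."""
    defined = set(subclass_of)
    used = {parent for parents in subclass_of.values() for parent in parents}
    return used - defined


def empty_intervals(restrictions):
    """Return the properties whose (min, max) restriction admits no value."""
    return [prop for prop, (low, high) in restrictions.items() if low > high]
```

A hierarchy in which A is a subclass of B and B is a subclass of A is reported by find_cycle, and a restriction such as (10, 5) is reported by empty_intervals.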
Evaluation
• One of the hardest problems in ontology design
• Ontology design is subjective
• What does it mean for an ontology to be correct (objectively)?
• The best test is the application for which the ontology was designed
Ontology Languages
• What is the “right” level of expressiveness?
• What is the “right” semantics?
• When does the language make “too many” assumptions?
CSC 9010 Spring, 2006. Paula Matuszek.
Problems with Hand-Crafting
• Requires both skill in building ontologies and knowledge of the domain
• Tremendously time consuming, even with tools like Protégé
• Easy to end up with errors
  – Inheritance
  – Defaults
  – Instances
• Hard to keep up to date
What Can Help? Some Technologies
• NLP tools
  – Term extraction
• Document summarization and topic tree creation (www.megaputer.com)
• Entity and relationship extraction to get instances
• Machine Learning
  – Association-rule-learning algorithms
  – Bayesian Classifiers
  – Rule induction
  – Clustering
Note: these are all classification tools of some kind
What Can Help? Semi-automated Ontology Building
• Extract from existing data on the web with Google-type search
  – KnowItAll: Etzioni et al.
  – Cyc: Matuszek et al.
• KCVC: knowledge collection from volunteer contributors
  – FACTory
  – ESP Game
Knowledge Management
• Ontologies represented in OWL/RDF/XML have many of the same issues as any software development project
  – Change management and tracking
  – Permissions and ownership
  – Components and reuse