70
1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems Ontologies and OWL Paula Matuszek Spring, 2011

1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

Embed Size (px)

Citation preview

Page 1: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

1CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

CS 9010: Knowledge-Based Systems

Ontologies and OWL

Paula Matuszek

Spring, 2011

Page 2: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

2CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

What Is An Ontology• An ontology is an explicit description of a

domain:– concepts

– properties and attributes of concepts

– constraints on properties and attributes

– Individuals (often, but not always)

• An ontology defines – a common vocabulary

– a shared understanding

Page 3: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

3CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Ontology Examples• Taxonomies on the Web

– Google Directory, Google Recipes

• Catalogs for on-line shopping– Amazon.com product catalog

• Domain-specific standard terminology– Unified Medical Language System (UMLS)

and MeSH

• Broad general ontologies– Cyc

Page 4: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

4CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Page 5: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

5CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Page 6: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

6CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Page 7: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

7CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Cyc Ontology

Page 8: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

8CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Why Develop an Ontology?• To share common understanding of the

structure of information – among people– among software agents

• To enable reuse of domain knowledge– to avoid “re-inventing the wheel”– to introduce standards to allow interoperability

Page 9: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

9CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

More Reasons• To make domain assumptions explicit

– easier to change domain assumptions (consider a genetics knowledge base)

– easier to understand and update legacy data

• To separate domain knowledge from the operational knowledge– re-use domain and operational knowledge

separately (e.g., configuration based on constraints)

Page 10: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

10

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

• Which wine should I serve with seafood today?

• What wines should I buy next Monday for my reception in Villanova, PA?

• Is there a market for the products of another small winery in this area?

• What online source is the best for wines for my party in Texas next fall?

Consider Questions Like:

Page 11: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

11

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

What Do We Need to Know?

Page 12: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

12

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

French winesand

wine regions

California wines and

wine regions

Which wine should

I serve with seafood today? A sharedA shared

ONTOLOGYONTOLOGYof of

wine and foodwine and food

A sharedA sharedONTOLOGYONTOLOGY

of of wine and foodwine and food

Page 13: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

13

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Wines and Wineries

Page 14: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

14

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

An Ontology Is Often Just the Beginning

Ontologies

Software agents Problem-

solving methods

Domain-independent applications

DatabasesDeclare

structure

KnowledgebasesProvide

domaindescription

Page 15: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

15

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

What does an Ontology Look Like?• Can be any knowledge representation

• Typically a description logic or associative network of some kind.

Internal Representation

English Representation

Facts

Reasoning Programs

Page 16: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

16

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Description Logics

• Modeling-based representations reflect the structure of the domain, and then reason based on the model.– Semantic Nets– Frames– Scripts

• Sometimes called associative networks

Page 17: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

17

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Basics of Description Logics

• All include – Concepts

– Various kinds of links between concepts• “has-part” or aggregation

• “is-a” or specialization

• More specialized depending on domain

• Typically also include– Inheritance

– Some kind of procedural attachment or restrictions or constraints

Page 18: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

18

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Ontology-Development Process

General approach:

determinescope

considerreuse

enumerateterms

defineclasses

defineproperties

defineconstraints

createinstances

Usually a highly iterative process. (Sound familiar?)

Page 19: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

19

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Preliminaries - Tools• All screenshots in this tutorial are from

Protégé, which:– is a graphical ontology-development tool– supports a rich knowledge model– is open-source and freely available

(http://protege.stanford.edu)

• Some other available tools:– Ontolingua and Chimaera– OntoEdit– OilEd– OpenCyc

Page 20: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

20

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Determine Domain and Scope

• What is the domain that the ontology will cover?

• For what we are going to use the ontology?• For what types of questions the information in

the ontology should provide answers (competency questions)?

determinescope

considerreuse

enumerateterms

defineclasses

defineproperties

defineconstraints

createinstances

Page 21: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

21

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Competency Questions• Which wine characteristics should I consider when

choosing a wine?• Is Bordeaux a red or white wine?• Does Cabernet Sauvignon go well with seafood?• What is the best choice of wine for grilled meat?• Which characteristics of a wine affect its appropriateness

for a dish?• Does a flavor or body of a specific wine change with

vintage year?• What were good vintages for Napa Zinfandel?

Page 22: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

22

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Consider Reuse

• Why reuse other ontologies?– to save the effort– to interact with the tools that use other

ontologies– to use ontologies that have been validated

through use in applications

determinescope

considerreuse

enumerateterms

defineclasses

defineproperties

defineconstraints

createinstances

Page 23: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

23

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

What to Reuse?• Ontology libraries

– Protégé ontology library (protege.cim3.net/cgi-bin/wiki.pl?ProtegeOntologiesLibrary)

– Biomedical ontologies (bioontology.org)– Many others: try Googling for your topic– http://www.dmoz.org/

• Upper ontologies– IEEE Standard Upper Ontology (suo.ieee.org)– Cyc (www.cyc.com)

Page 24: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

24

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Enumerate Important Terms

• What are the terms we need to talk about?

• What are the properties of these terms?

• What do we want to say about the terms?

considerreuse

determinescope

enumerateterms

defineclasses

defineproperties

defineconstraints

createinstances

Page 25: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

25

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Enumerating Terms - The Wine Ontology

wine, grape, winery, location,

wine color, wine body, wine flavor, sugar content

white wine, red wine, Bordeaux wine

food, seafood, fish, meat, vegetables, cheese

Page 26: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

26

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Define Classes and the Class Hierarchy

• A class is a concept in the domain– a class of wines– a class of wineries– a class of red wines

• A class is a collection of elements with similar properties

• Instances of classes– a glass of California wine you’ll have for lunch

considerreuse

determinescope

defineclasses

defineproperties

defineconstraints

createinstances

enumerateterms

Page 27: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

27

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Class Inheritance

• Classes usually constitute a taxonomic hierarchy (a subclass-superclass hierarchy)

• A class hierarchy is usually an IS-A hierarchy:

an instance of a subclass is an instance of a superclass

• If you think of a class as a set of elements, a subclass is a subset

Page 28: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

28

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Class Inheritance - Example• Apple is a subclass of Fruit

Every apple is a fruit

• Red wines is a subclass of WineEvery red wine is a wine

• Chianti wine is a subclass of Red wineEvery Chianti wine is a red wine

Page 29: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

29

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Levels in the Hierarchy

Middlelevel

Toplevel

Bottomlevel

Page 30: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

30

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Modes of Development• top-down – define the most general

concepts first and then specialize them• bottom-up – define the most specific

concepts and then organize them in more general classes

• combination – define the more salient concepts first and then generalize and specialize them

Page 31: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

31

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Documentation• Classes (and properties) usually have

documentation– Describing the class in natural language– Listing domain assumptions relevant to the

class definition– Listing synonyms

• Documenting classes and properties is as important as documenting computer code!

Page 32: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

32

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Define Properties of Classes – Properties

• Properties in a class definition describe attributes of instances of the class and relations to other instancesEach wine will have color, sugar content,

producer, etc.

considerreuse

determinescope

defineconstraints

createinstances

enumerateterms

defineclasses

defineproperties

Page 33: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

33

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Properties (Properties)• Types of properties

– “intrinsic” properties: flavor and color of wine– “extrinsic” properties: name and price of wine– parts: ingredients in a dish– relations to other objects: producer of wine (winery)

• Simple and complex properties– simple properties (attributes): contain primitive values

(strings, numbers)– complex properties: contain (or point to) other objects

(e.g., a winery instance)

Page 34: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

34

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Properties for the Class Wine

(in Protégé-2000)

Page 35: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

35

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Property and Class Inheritance

• A subclass inherits all the properties from the superclassIf a wine has a name and flavor, a red wine also has a name

and flavor

• If a class has multiple superclasses, it inherits properties from all of themPort is both a dessert wine and a red wine. It inherits “sugar

content: high” from the former and “color:red” from the latter

Page 36: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

36

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Property Constraints

• Property constraints (restrictions) describe or limit the set of possible values for a propertyThe name of a wine is a string

The wine producer is an instance of Winery

A winery has exactly one location

considerreuse

determinescope

createinstances

enumerateterms

defineclasses

defineconstraints

defineproperties

Page 37: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

37

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Restrictions for Properties at the Wine Class

Page 38: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

38

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Common Restrictions• Property cardinality – the number of values

a property has• Property value type – the type of values a

property has• Minimum and maximum value – a range of

values for a numeric property• Default value – the value a property has

unless explicitly specified otherwise

Page 39: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

39

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Common Restrictions: Property Cardinality

• Cardinality– Cardinality N means that the property must have N values

• Minimum cardinality– Minimum cardinality 1 means that the property must have a value

(required)

– Minimum cardinality 0 means that the property value is optional

• Maximum cardinality– Maximum cardinality 1 means that the property can have at most

one value (single-valued property)

– Maximum cardinality greater than 1 means that the property can have more than one value (multiple-valued property)

Page 40: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

40

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Common Restrictions: Value Type• String: a string of characters (“Château Lafite”)• Number: an integer or a float (15, 4.5)• Boolean: a true/false flag• Enumerated type: a list of allowed values (high,

medium, low)• Complex type: an instance of another class

– Specify the class to which the instances belongThe Wine class is the value type for the property

“produces” at the Winery class

Page 41: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

41

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Domain and Range of Property• Domain of a property – the class (or

classes) that have the property– More precisely: class (or classes) instances of

which can have the property

• Range of a property – the class (or classes) to which property values belong

Page 42: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

42

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Restrictions and Class Inheritance• A subclass inherits all the properties from the superclass

• A subclass can override the restrictions to “narrow” the list of allowed values– Make the cardinality range smaller

– Replace a class in the range with a subclass

Wine

Frenchwine

Winery

Frenchwinery

is-a is-a

producer

producer

Page 43: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

43

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Create Instances

• Create an instance of a class– The class becomes a direct type of the instance

– Any superclass of the direct type is a type of the instance

• Assign property values for the instance frame– Property values should conform to the facet constraints

– Knowledge-acquisition tools often check that

considerreuse

determinescope

createinstances

enumerateterms

defineclasses

defineproperties

defineconstraints

Page 44: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

44

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Creating an Instance: Example

Page 45: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

45

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Defining Classes and a Class Hierarchy

• The things to remember:– There is no single correct class hierarchy– But there are some guidelines

• The question to ask:“Is each instance of the subclass an instance of its

superclass?”

Page 46: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

46

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Disjoint Classes• Classes are disjoint if they cannot have common instances• Disjoint classes cannot have any common subclasses either

Red wine, White wine,Rosé wine are disjoint

Dessert wine and Redwine are not disjoint

Wine

Redwine

Roséwine

Whitewine

Dessertwine

Port

Page 47: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

47

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Classes and Their Names• Classes represent concepts in the domain, not their names

• The class name can change, but it will still refer to the same concept

• Synonym names for the same concept are not different classes– Many systems allow listing synonyms as part of the class

definition

Page 48: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

48

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Back to the Properties: Domain and Range

• When defining a domain or range for a property, find the most general class or classes

• Consider the flavor property– Domain: Red wine, White wine, Rosé wine– Domain: Wine

• Consider the produces property for a Winery:– Range: Red wine, White wine, Rosé wine– Range: Wine

Page 49: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

49

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Back to the Properties: Domain and Range

• When defining a domain or range for a property, find the most general class or classes

• Consider the flavor property– Domain: Red wine, White wine, Rosé wine

– Domain: Wine

• Consider the produces property for a Winery:– Range: Red wine, White wine, Rosé wine

– Range: Wine

propertyclass allowed values

DOMAIN RANGE

Page 50: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

50

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Inverse PropertiesMaker and

Producer

are inverse properties

Page 51: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

51

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Inverse Properties (II)• Inverse properties contain redundant information,

but– Allow acquisition of the information in either direction– Enable additional verification– Allow presentation of information in both directions

• The actual implementation differs from system to system– Are both values stored?– When are the inverse values filled in?– What happens if we change the link to an inverse

property?

Page 52: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

52

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Limiting the Scope• An ontology should not contain all the

possible information about the domain– No need to specialize or generalize more than

the application requires– No need to include all possible properties of a

class• Only the most salient properties

• Only the properties that the applications require

Page 53: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

53

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Limiting the Scope (II)• Ontology of wine, food, and their pairings

probably will not include– Bottle size– Label color– My favorite food and wine

• An ontology of biological experiments will contain– Biological organism– Experimenter

• Is the class Experimenter a subclass of Biological organism?

Page 54: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

54

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Ontology Languages• RDF and RDFS

• DAML+OIL

• OWL

• Cyc-L

• Others

• OWL has a special status within the semantic web community

Page 55: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

55

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

OWL Representation• An OWL file is a text file

– XML– Uses RDFS– Plus OWL-specific extensions

• If you are using a tool like Protege to create your ontology you don’t need to know the underlying representation

• If you’re going to do something with it, you do.

Page 56: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

56

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

OWL Example• This ontology

• Becomes this text file: wine.txt

56

Page 57: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

57

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

OWL: Classes And a Class Hierarchy

• Classes– Each class is an instance of owl:Class

• Class hierarchy– Defined by rdfs:subClassOf

• More ways to specify organization of classes– Disjointness (owl:disjointWith)– Equivalence (owl:EquivalentClass)

• Predefined: owl:Thing, owl:Nothing

Page 58: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

58

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

More Ways To Define a Class in OWL• Union of classes: owl:unionOf

A class Person is a union of classes Male and Female

• Intersection of classes: owl:intersectionOfA class Red Wine is an intersection of Wine and Red Thing

• Complement of a class: owl:complementOfCarnivores are all the animals that are not herbivores

• Enumeration of elements: owl:oneOfA class Wine Color contains the following instances: red, white, rosé

• Restriction on properties: owl:RestrictionA class Red Thing is a collection of things with color: Red

Page 59: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

59

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Properties in OWL• Two kinds of property

– owl:ObjectProperty. Object properties relate objects to other objects. Example: Goes-well-with.

– owl:datatypeProperty. Datatype properties relate objects to datatype values. Example: Has-cost.

• Properties can be equivalent– owl:equivalentProperty. Example: bottled-by

and produced-by

Page 60: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

60

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Some Special Properties in OWL• owl:TransitiveProperty

– Is more expensive than

• owl:SymmetricProperty– Comes from vinyard

• owl:FunctionalProperty– Has UPC code

• owl:InverseFunctionalProperty– Is UPC code of

Page 61: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

61

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Research Issues in Ontology Creation

• Content generation• Analysis and evaluation• Maintenance • Ontology languages• Tool development

Page 62: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

62

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Content: Top-Level Ontologies• What does “top-level” mean?

– Objects: tangible, intangible

– Processes, events, actors, roles

– Agents, organizations

– Spaces, boundaries, location

– Time

• IEEE Standard Upper Ontology effort– Goal: Design a single upper-level ontology

– Process: Merge upper-level of existing ontologies

Page 63: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

63

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Content: Knowledge Acquisition• Knowledge acquisition is a bottleneck

• Sharing and reuse alleviate the problem

• But we need automated knowledge acquisition techniques– Linguistic techniques: ontology acquisition from text

– Machine-learning: generate ontologies from structured documents (e.g., XML documents)

– Exploiting the Web structure: generate ontologies by crawling structured Web sites

– Knowledge-acquisition templates: experts specify only part of the knowledge required

Page 64: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

64

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Analysis• Analysis: semantic consistency

– Violation of property constraints– Cycles in the class hierarchy– Terms which are used but not defined– Interval restrictions that produce empty intervals

(min > max)

• Analysis: style– Classes with a single subclass– Classes and properties with no definitions– Properties with no constraints (value type, cardinality)

Page 65: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

65

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Evaluation • One of the hardest problems in ontology

design

• Ontology design is subjective

• What does it mean for an ontology to be correct (objectively)?

• The best test is the application for which the ontology was designed

Page 66: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

66

CSC 9010 Spring, 2011. Paula Matuszek

Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt

Ontology Languages• What is the “right” level of expressiveness?

• What is the “right” semantics?

• When does the language make “too many” assumptions?

Page 67: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

67

CSC 9010 Spring, 2006. Paula Matuszek. [email protected]

Problems with Hand-Crafting• Requires both skill in building ontologies

and knowledge of the domain

• Tremendously time consuming, even with tools like Protégé.

• Easy to end up with errors– Inheritance– Defaults– Instances

• Hard to keep up to date.

Page 68: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

68

CSC 9010 Spring, 2006. Paula Matuszek. [email protected]

What Can Help? Some Technologies

• NLP tools– Term extraction

• Document summarization and topic tree creation (www.megaputer.com)

• Entity and relationship extraction to get instances

• Machine Learning– Association-rule-learning algorithms

– Bayesian Classifiers

– Rule induction

– Clustering

Note: these are all classification tools of some kind

Page 69: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

69

CSC 9010 Spring, 2006. Paula Matuszek. [email protected]

What Can Help? Semi-automated Ontology Building

• Extract from existing data on the web with Google-type search– KnowItAll: Etzioni et al– Cyc: Matuszek et al

• KCVC: knowledge contribution from volunteer contributors– FACTory– ESPGame

Page 70: 1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

70

CSC 9010 Spring, 2006. Paula Matuszek. [email protected]

Knowledge Management• Ontologies represented in OWL/RDF/XML

have many of the same issues as any software development project– Change Management and tracking– Permissions and ownership– Components and reuse