© Paul Buitelaar: Lexical Semantics and Ontologies. Tutorial at ACL/HCSnet, July 2006, Melbourne, Australia


Lexical Semantics and Ontologies

Tutorial at the ACL/HCSnet 2006 Advanced Program in Natural Language Processing

Paul Buitelaar

Language Technology Lab &

Competence Center Semantic Web

DFKI GmbH

Saarbrücken, Germany


Overview

Day 1: Words and Meanings

Human language as a system

How do words relate to each other

Day 2: Words and Object Descriptions

Human language as a means of representation

How do words represent objects in the/a world


Day 1 - Introduction

Words and Meanings

Synsets and Senses

Lexical Semantics in WordNet

Related Senses

Generative Lexicon and CoreLex

Domains and Senses

Tuning WordNet to a Domain


Words and Meanings

Lexical Semantics in WordNet

Generative Lexicon and CoreLex

Tuning WordNet to a Domain


WordNet

Lexical Semantic Resource
  Semantic Lexicon: maps words to meanings (senses)
  Lexical Database: machine readable (has a formal structure)
  Freely available: http://wordnet.princeton.edu/


WordNet - Origins

In 1985 a group of psychologists and linguists at Princeton University undertook to develop a lexical database … The initial idea was to provide an aid to use in searching dictionaries conceptually, rather than merely alphabetically … WordNet … instantiates hypotheses based on results of psycholinguistic research … expose such hypotheses to the full range of the common vocabulary

In anomic aphasia, there is a specific inability to name objects. When confronted with an apple, say, patients may be unable to utter "apple," even though they will reject such suggestions as shoe or banana, and will recognize that apple is correct when it is provided. (Caramazza/Berndt 1978)

Miller, George A., Richard Beckwith, Christiane Fellbaum, Derek Gross and Katherine J. Miller. "Introduction to WordNet: an on-line lexical database." In: International Journal of Lexicography 3 (4), 1990, pp. 235-244.


Synsets

WordNet is organized around word meaning (not word forms as with traditional lexicons)
Word meaning is represented by “synsets”
A synset is a “set of synonyms”

Example
  {board, plank}: piece of lumber
  {board, committee}: group of people
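The two example synsets can be written down as a small lookup table. This is only a sketch with hand-built toy data, not the WordNet database or any WordNet API; the names `SYNSETS` and `senses_of` are made up for illustration.

```python
# Toy data copied from the slide; real WordNet holds far more synsets.
SYNSETS = {
    "piece of lumber": {"board", "plank"},
    "group of people": {"board", "committee"},
}


def senses_of(word):
    """Return the glosses of all toy synsets that contain the word."""
    return sorted(gloss for gloss, members in SYNSETS.items() if word in members)


print(senses_of("board"))  # ['group of people', 'piece of lumber']
print(senses_of("plank"))  # ['piece of lumber']
```

A word that occurs in two synsets, such as board, comes back with two senses.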


Synset Hierarchy

Synsets are organized in hierarchies, which define:
  generalization (hypernymy)
  specialization (hyponymy)

Example (hypernymy points upward, hyponymy downward)
  {entity}
    {whole, unit}
      {building material}
        {lumber, timber}
          {board, plank}
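A hierarchy like this can be walked with a simple parent map. The sketch below assumes that each level shown on the slide is the direct hypernym of the level below it, which simplifies the real WordNet hierarchy; `HYPERNYM`, `hypernym_path` and `is_hyponym_of` are illustrative names.

```python
# Each toy synset points to the synset listed directly above it on the slide.
HYPERNYM = {
    "board, plank": "lumber, timber",
    "lumber, timber": "building material",
    "building material": "whole, unit",
    "whole, unit": "entity",
}


def hypernym_path(synset):
    """Follow hypernym links upward until a synset without a parent is reached."""
    path = [synset]
    while path[-1] in HYPERNYM:
        path.append(HYPERNYM[path[-1]])
    return path


def is_hyponym_of(specific, general):
    """True if `general` lies somewhere above `specific` in the toy hierarchy."""
    return general in hypernym_path(specific)[1:]


print(" > ".join(hypernym_path("board, plank")))
print(is_hyponym_of("lumber, timber", "entity"))
```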


Hierarchies (WordNet 1.7)


Hierarchy Example (WordNet 2.1)


Synsets and Senses

Synsets represent word meaning

Words that occur in several synsets have a corresponding number of meanings (senses)

Example


WordNet 2.1


(Other) WordNet Relations

Synonymy: similar in meaning
Hypernymy/Hyponymy: generalization and specialization
Meronymy: part-of
  e.g. study, bathroom, ... are meronyms of house
Antonymy: opposite in meaning
  e.g. warm is an antonym of cold


Words and Meanings

Lexical Semantics in WordNet

Generative Lexicon and CoreLex

Tuning WordNet to a Domain


Systematic Polysemy

Homonymy: bank
  embankment: We walked along the bank of the Charles river.
  institution: Did he have an account at the HBU bank?

Systematic Polysemy: school
  group (of people): The school went for an outing.
  (learning) process: School starts at 8.30.
  organization: The school was founded in 1910.
  building: The school has a new roof.


Semantic or Pragmatic?

[Diagram, shown twice: the lexical item "school" (Lexical Items of the Language) linked to Obj1, Obj2, Obj3 and Obj4 (Objects in the World); the two versions are labelled Semantic Analysis and Pragmatic Analysis]


Underspecified Discourse Referents

Anaphora Resolution
  [A long book heavily weighted with military technicalities]NP:event-physical_object-content, in this edition it is neither so long (event) nor so technical (content) as it was originally.

Metonymy
  The Boston office called.
  office > person
  person part-of office

Bridging
  Peter bought a car. The engine runs well.
  engine part-of car
  The Boston office called. They asked for a new price.
  office > person


Generative Lexicon Theory

Type Coercion
  I began the book
  book > event
  event ‘has-relation-with’ book
  read is-a event

Multifaceted representation of lexical semantics, reflecting systematic / regular / logical polysemy


Generative Lexicon Theory

Qualia Structure (Pustejovsky 1995)
  Formal: inheritance (is-a / hyponymy)
    book formal: artifact, communication, …
  Constitutive: modification (part-of / meronymy)
    book constitutive: section, …
  Telic: purpose ("what is the object used for")
    book telic: read, …
  Agentive: causality ("how did the object come about")
    book agentive: write, …
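The four qualia roles for book can be held in a small record, and type coercion ("I began the book") can then be read off the telic and agentive roles. This is a minimal sketch: the class `Qualia` and the helper `events_for` are hypothetical names, and offering the agentive event (write) next to the telic one (read) is an assumption taken from the general theory, since the slides only show read.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Qualia:
    """Shallow qualia structure with the four roles named on the slide."""
    formal: List[str] = field(default_factory=list)
    constitutive: List[str] = field(default_factory=list)
    telic: List[str] = field(default_factory=list)
    agentive: List[str] = field(default_factory=list)


# Values copied from the slide (the slide's lists are open-ended).
BOOK = Qualia(
    formal=["artifact", "communication"],
    constitutive=["section"],
    telic=["read"],
    agentive=["write"],
)


def events_for(qualia):
    """Candidate events when a verb such as 'begin' coerces the noun to an event.

    Assumption: telic events are listed first, then agentive ones.
    """
    return qualia.telic + qualia.agentive


print(events_for(BOOK))  # ['read', 'write']
```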


CoreLex (Buitelaar 1998)

Automatic Qualia Structure Acquisition
  CoreLex is an attempt to automatically acquire underspecified lexical semantic representations that reflect systematic polysemy
  These representations can be viewed as shallow Qualia Structures

Sense Distribution in WordNet
  Systematic polysemy can be empirically studied in WordNet by observing sense distributions
  >> If more than two words share the same sense distribution (i.e. have the same set of senses), then this may indicate a pattern of systematic polysemy (adapted from Apresjan 1973)


Systematic Polysemous Classes

book
  1. {publication} => artifact
  2. {product, production} => artifact
  3. {fact} => communication
  4. {dramatic_composition, dramatic_work} => communication
  5. {record} => communication
  6. {section, subdivision} => communication
  7. {journal} => artifact

Systematic Polysemous Class “artifact communication”
  amulet annals armband arrow article ballad bauble beacon bible birdcall blank blinker boilerplate book bunk cachet canto catalog catalogue chart chevron clout compact compendium convertible copperplate copy cordon corker ... guillotine homophony horoscope indicator journal laurels lay ledger loophole marker memorial nonsense novel obbligato obelisk obligato overture pamphlet pastoral paternoster pedal pennant phrase platform portrait prescription print puzzle radiogram rasp recap riddle rondeau … statement stave stripe talisman taw text tocsin token transcription trophy trumpery wand well whistle wire wrapper yardstick
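The grouping step can be sketched as follows: reduce each noun to the set of basic types of its senses, then keep every type combination shared by more than two nouns. This is an illustrative sketch, not the CoreLex implementation. The entry for book follows the slide; the other entries in `BASIC_TYPES` are toy sense lists invented for the example, and `polysemous_classes` is a hypothetical helper.

```python
from collections import defaultdict

# Word -> basic type of each sense. Only "book" is taken from the slide;
# the remaining sense lists are toy assumptions.
BASIC_TYPES = {
    "book": ["artifact", "artifact", "communication", "communication",
             "communication", "communication", "artifact"],
    "journal": ["artifact", "communication"],
    "novel": ["communication", "artifact"],
    "almond": ["natural_object", "plant"],
    "anise": ["plant", "natural_object"],
    "plank": ["artifact"],
}


def polysemous_classes(basic_types, min_words=3):
    """Group words by their set of basic types.

    A class is kept only if it combines at least two basic types and is
    shared by at least `min_words` words ("more than two" on the slide).
    """
    groups = defaultdict(list)
    for word, types in basic_types.items():
        distinct = sorted(set(types))
        if len(distinct) > 1:
            groups[" ".join(distinct)].append(word)
    return {name: sorted(words) for name, words in groups.items()
            if len(words) >= min_words}


print(polysemous_classes(BASIC_TYPES))
# {'artifact communication': ['book', 'journal', 'novel']}
```

With this toy input the "natural_object plant" combination is dropped because only two words share it.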


From WordNet to CoreLex

[Diagram labels: Noun 1 … Noun n; Basic Type 1, Basic Type 1; Systematic Polysemous Class 1 … Systematic Polysemous Class n]


Other Examples

“animal natural_object”: alligator broadtail chamois ermine lapin leopard muskrat ...
“natural_object plant”: algarroba almond anise baneberry butternut candlenut cardamon ...
“action artifact group_social”: artillery assembly band church concourse dance gathering institution ...
“action attribute event psychological”: appearance concentration decision deviation difference impulse outrage …
“possession quantity_definite”: cent centime dividend gross penny real shilling


CoreLex vs. WordNet


Representation and Interpretation

"Dotted Types" (Pustejovsky)
  Lexical types are either simple (human, artifact, ...) or complex (information AND physical_object)
  Complex types can be represented with a "dotted type", e.g. information•physical_object

In (Cooper 2005) interpreted as a record type (a delicious lunch can take forever):


Related Work

Apresjan 1973: Regular Polysemy.
Nunberg & Zaenen 1992: Systematic polysemy in lexicology and lexicography.
Bill Dolan 1994: Word Sense Ambiguation: Clustering Related Senses.
Copestake & Briscoe 1996: Semi-productive polysemy and sense extension.
Peters, Peters & Vossen 1998: Automatic Sense Clustering in EuroWordNet.
Tomuro 1998: Semi-Automatic Induction of Systematic Polysemy from WordNet.


Words and Meanings

Lexical Semantics in WordNet

Generative Lexicon and CoreLex

Tuning WordNet to a Domain


Reducing Ambiguity

WordNet has too many senses …

Reduce Ambiguity

Cluster related senses (CoreLex)

Tune WordNet to an application domain


Domains and Senses

Domains determine Sense Selection, e.g.

English: cell

prison cell in the Politics/Law domain

living cell in the Biomedical domain

English: tissue

living tissue in the Biomedical domain

cloth in the Fashion domain

German: Probe

test in the Biomedical domain

rehearsal in the Theater domain

>> Compute Domain-Specific Sense


Approaches

Subject Codes
  Domain codes are in the dictionary

Topic Signatures
  Compute (domain-specific) context models from dictionary definitions, domain corpora, web resources

Tuning of WordNet to a domain
  Top Down: Cucchiarelli & Velardi, 1998
  Bottom Up: Buitelaar & Sacaleanu, 2001
  Related recent work: McCarthy et al., 2004; Chan & Ng, 2005; Mohammad & Hirst, 2006


Subject Codes

Subject Codes (as used in LDOCE) indicate a domain in which a word is used in a particular sense

Examples (2600 codes)
  high
    SN (sounds)
    DG (drugs)
    ML (meteorology)
  Sub-Field Codes
    MDZP (Medicine:Physiology)
  Code Combinations
    MLCO (Meteorology+Building), e.g. lightning conductor
    MLUF (Meteorology+Europe+France), e.g. Mistral


Adding Subject Codes to WordNet

Grouping Synsets together across POS
  MEDICINE
    Nouns: doctor#1, hospital#1
    Verbs: operate#7

Grouping Synsets together across Sub-Hierarchies
  SPORT
    life_form#1: athlete#1
    physical_object#1: game_equipment#1
    act#2: sport#1
    location#1: playing_field#1

Magnini B. & Cavaglià G. Integrating Subject Field Codes into WordNet. In: Proceedings LREC 2000.


WordNet DOMAINS

Sense | WordNet synset and gloss | Domains
1 | Depository, financial institution, bank, banking concern, banking company (a financial institution) | Economy
2 | Bank (sloping land) | Geography, Geology
3 | Bank (a supply or stock held in reserve) | Economy
4 | Bank, bank building (a building) | Architecture, Economy
5 | Bank (an arrangement of similar objects) | Factotum
6 | Savings bank, coin bank, money box, bank (a container) | Economy
7 | Bank (a long ridge or pile) | Geography, Geology
8 | Bank (the funds held by a gambling house) | Economy, Play
9 | Bank, cant, camber (a slope in the turn of a road) | Architecture
10 | Bank (a flight maneuver) | Transport

Bernardo Magnini, Carlo Strapparava, Giovanni Pezzuli, and Alfio Gliozzo. Using domain information for word sense disambiguation. In: Proceedings of the SENSEVAL2 workshop 2001.
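Once senses carry domain labels, restricting a word to the senses of one application domain is a simple filter. The sketch below hard-codes the domain column of the bank table; `BANK_DOMAINS` and `senses_in_domain` are illustrative names, not part of the WordNet DOMAINS resource.

```python
# Sense number -> domain labels, copied from the table for "bank".
BANK_DOMAINS = {
    1: {"Economy"},
    2: {"Geography", "Geology"},
    3: {"Economy"},
    4: {"Architecture", "Economy"},
    5: {"Factotum"},
    6: {"Economy"},
    7: {"Geography", "Geology"},
    8: {"Economy", "Play"},
    9: {"Architecture"},
    10: {"Transport"},
}


def senses_in_domain(sense_domains, domain):
    """Return the sense numbers labelled with the given domain, in order."""
    return [sense for sense, domains in sorted(sense_domains.items())
            if domain in domains]


print(senses_in_domain(BANK_DOMAINS, "Economy"))  # [1, 3, 4, 6, 8]
print(senses_in_domain(BANK_DOMAINS, "Geology"))  # [2, 7]
```

In an economic text, the ten senses of bank shrink to five candidates; in a geological text, to two.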


WSD with Subject Codes

Match between the set of words in the context of the ambiguous word and the set of words (“neighborhoods”) in the definitions + sample sentences of all senses that share a Subject Code

bank: Economics
  write safe sum account person put take money order keep pay supply paper draw cheque

bank: Medicine and Biology
  medicine product hold origin place human treatment blood hospital use store organ comb

Guthrie J. A. & Guthrie L. & Wilks Y. & Aidinejad H. Subject Dependent Co-Occurrence and Word Sense Disambiguation. In: Proceedings of ACL 1991.
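The matching step can be sketched as a set overlap between the context words and each neighborhood. This is a simplified illustration rather than the procedure of Guthrie et al.: it counts exact word matches only, without stemming or weighting, and `best_subject_code` is a hypothetical helper.

```python
# Neighborhoods for "bank", copied from the slide.
NEIGHBORHOODS = {
    "Economics": {"write", "safe", "sum", "account", "person", "put", "take",
                  "money", "order", "keep", "pay", "supply", "paper", "draw",
                  "cheque"},
    "Medicine and Biology": {"medicine", "product", "hold", "origin", "place",
                             "human", "treatment", "blood", "hospital", "use",
                             "store", "organ", "comb"},
}


def best_subject_code(context_words, neighborhoods):
    """Pick the subject code whose neighborhood shares most words with the context.

    Ties go to the code listed first (a simplification).
    """
    context = {word.lower() for word in context_words}
    overlaps = {code: len(context & words) for code, words in neighborhoods.items()}
    return max(overlaps, key=overlaps.get), overlaps


code, overlaps = best_subject_code(
    "he will pay the money into his bank account".split(), NEIGHBORHOODS)
print(code, overlaps)
```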


Topic Signatures from the Web

Construct Topic Signatures for WordNet synsets/senses

Retrieve document collections from the web, using queries constructed for each WordNet sense, e.g.

( boy AND ( altar boy OR ball boy OR … OR male person ) AND NOT ( man OR … OR broth of a boy OR son OR … OR mama’s boy OR black ) )

Agirre E. & Ansa O. & Hovy E. & Martinez D. Enriching very large ontologies using the WWW. In: Proc. of the Ontology Learning Workshop, ECAI 2000.
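A query of this shape can be assembled from two cue lists: words related to the target sense, and words related to the other senses of the same word. The sketch below is a hypothetical helper, not the query builder of Agirre et al.; it assumes both lists are non-empty, and the cue lists in the example are a subset of the terms shown on the slide.

```python
def sense_query(word, own_cues, other_cues):
    """Build a boolean query: the word, plus cues for this sense,
    minus cues that belong to its other senses.

    Assumption: both cue lists contain at least one entry.
    """
    positive = " OR ".join(own_cues)
    negative = " OR ".join(other_cues)
    return f"( {word} AND ( {positive} ) AND NOT ( {negative} ) )"


query = sense_query(
    "boy",
    own_cues=["altar boy", "ball boy", "male person"],
    other_cues=["man", "son", "black"],
)
print(query)
# ( boy AND ( altar boy OR ball boy OR male person ) AND NOT ( man OR son OR black ) )
```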


Top Down Tuning – Cucchiarelli & Velardi

Automatically find the best set of (WordNet) senses that:
  “… represent at best the semantics of the domain”
  “[has the] … ‘right’ level of abstraction, so as to mediate between over-ambiguity and generality”
  “… [is] balanced …, i.e. words should be evenly distributed among categories”

Alessandro Cucchiarelli, Paola Velardi. Finding a domain-appropriate sense inventory for semantically tagging a corpus. Natural Language Engineering 4/4, pp. 325-344, Dec. 1998.


Methods Used

Create alternative sets of balanced categories by use of an adapted version of the Hearst/Schütze algorithm

Apply a scoring function to find the best set, with parameters:
  Generality: the highest possible level of generalization with a small number of categories is preferred
  Discrimination Power: different senses lead to different categories
  (Domain) Coverage: words in the domain corpus that are represented by the selected categories
  Average Ambiguity: ambiguity reduction is measured by the inverse of the average ambiguity of all words


Balanced Categories - Hearst/Schütze

Reduce WordNet noun hierarchy to a set of 726 disjoint categories, each consisting of a relatively large number of synsets and of an average size, with as small a variance as possible

Group categories together into a set of 106 super-categories according to mutual co-occurrence in a training corpus

Measure the frequency of categories on domain corpora

Hearst M. & Schütze H. Customizing a Lexicon to Better Suit a Computational Task In: Proceedings ACL SIGLEX Workshop 1993

United States Constitution
  12.200 legal_system, ...
  11.782 government, ...
  7.859 politics, ...

Genesis
  26.459 religion, ...
  25.062 breads, ...
  24.356 mythology, ...


Generality

Generality of Category Set Ci: 1/DM(Ci)

DM(Ci) is the average distance between the categories of Ci and the topmost synsets:

$$DM(C_i) = \frac{1}{n} \sum_{j=1}^{n} dm(c_{ij})$$

Example (figure: general synsets Ci1 and Ci2 below the topmost synsets)
  Ci = {Ci1, Ci2}
  dm(Ci1) = (4 + 3) / 2 = 3.5
  dm(Ci2) = 3 / 1 = 3
  DM(Ci) = (3.5 + 3) / 2 = 3.25
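The worked example can be checked in a few lines. The sketch assumes, following the figure, that dm for one category is the mean of its path lengths to the topmost synsets above it; `dm_category` and `generality` are illustrative names.

```python
def dm_category(distances):
    """Mean distance from one category to the topmost synsets above it."""
    return sum(distances) / len(distances)


def generality(category_distances):
    """Return (generality, DM) for a category set.

    `category_distances` holds, per category, its path lengths to the
    topmost synsets. Generality is defined as 1 / DM.
    """
    dms = [dm_category(distances) for distances in category_distances]
    dm_set = sum(dms) / len(dms)
    return 1 / dm_set, dm_set


# Ci1 is 4 and 3 steps below two topmost synsets, Ci2 is 3 steps below one.
score, dm_set = generality([[4, 3], [3]])
print(dm_set)  # 3.25
print(round(score, 4))
```

Category sets that sit closer to the top of the hierarchy have a smaller DM and therefore a higher generality score.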


Discrimination Power

Discrimination Power of Category Set Ci:

(Nc(Ci) - Npc(Ci))/ Nc(Ci)

where Nc(Ci) is the number of words that reach at least one category of Ci and Npc(Ci) is the number of words that have at least two senses that reach the same category cij of Ci

[Figure: domain words w1, w2, w3, their senses, and the general synsets Ci1, Ci2, Ci3, Ci4, with Ci = {Ci1, Ci2, Ci3, Ci4}]
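The discrimination power formula can be sketched directly from its definition. The word-to-category assignments below are invented toy data, since the figure's links are not recoverable, and `discrimination_power` is a hypothetical helper that assumes at least one word reaches a category.

```python
def discrimination_power(word_sense_categories):
    """Compute (Nc - Npc) / Nc.

    `word_sense_categories` maps each word to a list with, per sense, the
    category that sense reaches (None if it reaches no category).
    """
    n_c = 0   # words that reach at least one category
    n_pc = 0  # words with at least two senses reaching the same category
    for categories in word_sense_categories.values():
        reached = [c for c in categories if c is not None]
        if not reached:
            continue
        n_c += 1
        if len(reached) != len(set(reached)):
            n_pc += 1
    return (n_c - n_pc) / n_c


# Toy assignment (assumption): w2 has two senses that collapse into Ci2.
toy = {
    "w1": ["Ci1", "Ci2"],
    "w2": ["Ci2", "Ci2"],
    "w3": ["Ci3", "Ci4"],
    "w4": [None],
}
print(discrimination_power(toy))
```

Here three words reach a category and one of them collapses two senses into the same category, giving 2/3.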


Coverage & Average Ambiguity

Coverage of Category Set Ci: Nc(Ci)/W

where Nc(Ci) is the number of words that reach at least one category in Ci

Inverse of Average Ambiguity of Category Set Ci: 1/A(Ci)

$$A(C_i) = \frac{1}{N_c(C_i)} \sum_{j=1}^{N_c(C_i)} Cw_j(C_i)$$

where Nc(Ci) is the number of words that reach at least one category in Ci, and for each word wj in this set, Cwj(Ci) is the number of categories in Ci reached by wj
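Both measures can be computed from a map of words to the categories they reach. The sketch treats W as the total number of words in the domain vocabulary, which is an assumption because the slide does not define W; the toy category assignments and the name `coverage_and_ambiguity` are also illustrative.

```python
def coverage_and_ambiguity(word_categories, total_words):
    """Return (coverage, inverse average ambiguity) for a category set.

    `word_categories` maps each word to the set of categories it reaches.
    `total_words` plays the role of W (assumed: size of the domain vocabulary).
    """
    reached = {w: cats for w, cats in word_categories.items() if cats}
    n_c = len(reached)
    coverage = n_c / total_words
    average_ambiguity = sum(len(cats) for cats in reached.values()) / n_c
    return coverage, 1 / average_ambiguity


# Toy data (assumption): three of four domain words reach a category.
toy = {
    "stock": {"C1", "C2", "C3", "C11"},
    "bank": {"C9"},
    "share": {"C11"},
    "gillyflower": set(),
}
print(coverage_and_ambiguity(toy, total_words=4))  # (0.75, 0.5)
```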


Best Category Set (WSJ)

Category Higher-level synset

C1 person, individual, someone, mortal, human, soul

C2 instrumentality, instrumentation

C3 written communication, written language

C4 message, content, subject matter, substance

C5 measure, quantity, amount, quantum

C6 action

C7 activity

C8 group action

C9 organization

C10 psychological feature

C11 possession

C12 state

C13 location

Top Down categories for the financial domain, based on the Wall Street Journal


Sense Selection with WSJ Set

Senses for stock - kept by domain tuning on the Wall Street Journal

Sense | Synset hierarchy for sense | Top synset for sense
1 | capital > asset | possession (C11)
2 | support > device | instrumentality (C2)
4 | document > writing | written communication (C3)
5 | accumulation > asset | possession (C11)
6 | ancestor > relative | person (C1)

Senses for stock - discarded by domain tuning on the Wall Street Journal

Sense | Synset hierarchy for sense
3 | stock, inventory > merchandise, wares > …
7 | broth, stock > soup > …
8 | stock, caudex > stalk, stem > …
9 | stock > plant part > …
10 | stock, gillyflower > flower > …
11 | malcolm stock, stock > flower …
12 | lineage, line of descent > … > genealogy > …
14 | lumber, timber > …
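The selection step amounts to keeping a sense only if its hypernym chain reaches one of the chosen categories. The sketch below uses shortened chains for a few senses of stock; chains for the discarded senses are truncated exactly where the slide truncates them, and `tune_senses` is a hypothetical helper, not the authors' code.

```python
# Subset of the WSJ category set, keyed by its first synset member.
CATEGORIES = {
    "person": "C1",
    "instrumentality": "C2",
    "written communication": "C3",
    "possession": "C11",
}

# Sense number -> hypernym chain as far as the slide shows it.
STOCK_CHAINS = {
    1: ["capital", "asset", "possession"],
    2: ["support", "device", "instrumentality"],
    3: ["stock, inventory", "merchandise, wares"],
    4: ["document", "writing", "written communication"],
    5: ["accumulation", "asset", "possession"],
    6: ["ancestor", "relative", "person"],
    7: ["broth, stock", "soup"],
    14: ["lumber, timber"],
}


def tune_senses(sense_chains, categories):
    """Split senses into kept (sense -> category) and discarded (list)."""
    kept, discarded = {}, []
    for sense, chain in sorted(sense_chains.items()):
        hit = next((synset for synset in chain if synset in categories), None)
        if hit is None:
            discarded.append(sense)
        else:
            kept[sense] = categories[hit]
    return kept, discarded


kept, discarded = tune_senses(STOCK_CHAINS, CATEGORIES)
print(kept)       # {1: 'C11', 2: 'C2', 4: 'C3', 5: 'C11', 6: 'C1'}
print(discarded)  # [3, 7, 14]
```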


Bottom Up Tuning – Buitelaar & Sacaleanu

Ranking of WordNet synsets according to a domain-specific corpus

Compute term relevance against reference corpus

Compute synset relevance according to term relevance (where term = synonym in synset)

Ranking can be used in WSD (similar to the use of the ‘most frequent sense’ heuristic)

Paul Buitelaar, Bogdan Sacaleanu Ranking and Selecting Synsets by Domain Relevance In: Proceedings of WordNet and Other Lexical Resources: Applications, Extensions and Customizations, NAACL 2001 Workshop, June 3/4 2001


TFIDF

$$tfidf(w) = tf(w) \cdot \log\left(\frac{N}{df(w)}\right)$$

tf(w): term frequency (number of word occurrences in a document)
df(w): document frequency (number of documents containing the word)
N: number of all documents
tfidf(w): relative importance of the word in the document

The word is more important if it appears several times in a target document

The word is more important if it appears in fewer documents
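The TFIDF score is a one-line computation. The sketch below uses the natural logarithm, which is an assumption since the slide does not fix the base; the counts in the example are invented.

```python
import math


def tfidf(tf, df, n_docs):
    """tf * log(N / df): term frequency weighted by inverse document frequency."""
    return tf * math.log(n_docs / df)


# A word occurring 3 times in the target document, in a 1000-document collection.
rare = tfidf(tf=3, df=10, n_docs=1000)     # appears in few documents
common = tfidf(tf=3, df=100, n_docs=1000)  # appears in many documents
everywhere = tfidf(tf=3, df=1000, n_docs=1000)
print(rare > common > everywhere)  # True
```

A word that occurs in every document gets a score of zero, whatever its term frequency.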


Term and Synset Relevance

Term Relevance: Relevance Score of Synset Members

$$rlv(t|d) = \log(tf_{t,d}) \log\left(\frac{N}{df_t}\right)$$

where t represents the term, d the domain, N is the total number of domains

Synset Relevance: Cumulated Relevance Score for a Synset

$$rlv(c|d) = \sum_{t \in c} rlv(t|d)$$
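Term and synset relevance can be sketched as follows. The counts are invented, the natural logarithm is assumed, and terms that do not occur in the domain corpus are skipped (an assumption, since log(0) is undefined); `term_relevance` and `synset_relevance` are illustrative names.

```python
import math


def term_relevance(tf_td, df_t, n_domains):
    """rlv(t|d) = log(tf_{t,d}) * log(N / df_t)."""
    return math.log(tf_td) * math.log(n_domains / df_t)


def synset_relevance(synset_terms, tf_in_domain, df, n_domains):
    """rlv(c|d): sum of term relevance over the synset's members.

    Assumption: members absent from the domain corpus contribute nothing.
    """
    total = 0.0
    for term in synset_terms:
        if tf_in_domain.get(term, 0) > 0:
            total += term_relevance(tf_in_domain[term], df[term], n_domains)
    return total


# Toy medical-domain counts: "Zelle" is frequent and occurs in 2 of 4 domains.
tf_medical = {"Zelle": 100}
df = {"Zelle": 2}
prison_cell = synset_relevance(["Gefängniszelle", "Zelle"], tf_medical, df, 4)
living_cell = synset_relevance(["Zelle"], tf_medical, df, 4)
print(prison_cell, living_cell)
```

With these toy counts the two synsets receive the same score, because the only term found in the corpus, Zelle, belongs to both.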


Extended Synset Relevance

Lexical Coverage: Take Length of the Synset Into Account
  [Gefängniszelle, Zelle] ("prison cell")
  [Zelle] ("living cell")

$$rlv(c|d) = \frac{T}{|c|} \sum_{t \in c} rlv(t|d)$$

Hyponyms: Take Hyponyms Into Account
  [Zelle, Gefängniszelle, Todeszelle]
  [Zelle, Körperzelle, Pflanzenzelle]

$$rlv(c|d) = \frac{T}{|c|} \sum_{t \in c} rlv(t|d)$$

(same formula, with the synset c extended by its hyponyms)


Experiment – Medical Domain


Related Recent Work

Diana McCarthy, Rob Koeling, Julie Weeds, and John Carroll. Finding predominant senses in untagged text. In Proc. of ACL 2004.

Chan, Yee Seng and Ng, Hwee Tou. Word Sense Disambiguation with Distribution Estimation. Proc. of IJCAI 2005.

Mohammad, Saif and Hirst, Graeme. Determining word sense dominance using a thesaurus. Proc. of EACL 2006.


Day 2 - Introduction

Words and Object Descriptions

Semantics on the Semantic Web

Semantic Web, Ontologies and Natural Language Processing

The Lexical Semantic Web

Knowledge Representation as Word Meaning

A Lexicon Model for Ontologies

Enriching Ontologies with Linguistic Information


Words and Object Descriptions

Semantics on the Semantic Web

The “Lexical Semantic Web”

A Lexicon Model for Ontologies


Web Consists of Non-Interpreted Data

Web: Text, DBs, Images, Tables

Web, Markup

Interpretation through Markup - Categories

“Web 2.0”, Markup

Interpretation through Markup – User Tags

Semantic Web

Formal Interpretation - Knowledge Markup

Knowledge Markup, Ontologies

Knowledge Markup, Ontologies

Turns the Web into a Knowledge Base

Knowledge Markup, Ontologies

Semantic Web Services

Enables Semantic Web Services …

Intelligent Man-Machine Interface

Knowledge Markup, Ontologies

Semantic Web Services

… and Intelligent Man-Machine Interface

Semantic Web Layer cake

Resource Description Framework (RDF)

node1 --name--> DFKI GmbH

node1 --location--> Kaiserslautern

node1 --www--> http://www.dfki.de

RDF : XML-based Representation

<?xml version='1.0' ?>
<rdf:RDF
    xmlns:rdf="… rdf-syntax-ns#"
    xmlns:rdfs="… rdf-schema#"
    xmlns="http://example.org">

  <rdf:Description rdf:nodeID="node1">
    <name>DFKI GmbH</name>
    <location>Kaiserslautern</location>
    <www rdf:resource="http://www.dfki.de" />
  </rdf:Description>
</rdf:RDF>
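The correspondence between the RDF graph and its XML serialization can be checked with a few lines of Python. The following is a minimal sketch using only the standard library, not a full RDF parser: it handles flat rdf:Description elements with literal or rdf:resource children and nothing else. The slide elides the rdf namespace URI; the sketch assumes the standard W3C one. The function names triples_from_rdf_xml and local_name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the standard W3C RDF syntax namespace (elided on the slide).
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """<?xml version='1.0' ?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns="http://example.org">
  <rdf:Description rdf:nodeID="node1">
    <name>DFKI GmbH</name>
    <location>Kaiserslautern</location>
    <www rdf:resource="http://www.dfki.de" />
  </rdf:Description>
</rdf:RDF>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def triples_from_rdf_xml(text):
    """Return (subject, property, object) triples for flat rdf:Description nodes."""
    root = ET.fromstring(text)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = (description.get(f"{{{RDF_NS}}}nodeID")
                   or description.get(f"{{{RDF_NS}}}about"))
        for prop in description:
            resource = prop.get(f"{{{RDF_NS}}}resource")
            # A property either points at a resource or carries a literal value.
            value = resource if resource is not None else (prop.text or "").strip()
            triples.append((subject, local_name(prop.tag), value))
    return triples


TRIPLES = triples_from_rdf_xml(RDF_XML)
```

Each element nested inside rdf:Description becomes one edge of the graph, which is why the three child elements yield exactly the three arcs leaving node1.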

RDF Schema (RDFS)

Representation of classes and properties

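[Diagram: class and property graph]

Classes: Person, Teacher, Student, Course

is-a: Teacher is-a Person, Student is-a Person

Properties: name (value rdf:Literal), teaches, enrolledIn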
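What a schema of classes and properties buys can be illustrated with a small inference sketch. The code below assumes, reading the diagram, that teaches has domain Teacher, enrolledIn has domain Student and name has domain Person; these domain assignments, the individual anna and all function names are illustrative assumptions, not part of the slides. It applies two standard RDFS entailments: a property's domain types its subject, and class membership propagates up the is-a hierarchy.

```python
# Class hierarchy from the slide: Teacher and Student are both Persons.
SUBCLASS_OF = {"Teacher": "Person", "Student": "Person"}

# Assumption: property domains as suggested by the diagram.
PROPERTY_DOMAIN = {"teaches": "Teacher", "enrolledIn": "Student", "name": "Person"}


def superclasses(cls):
    """All classes above cls in the is-a hierarchy, nearest first."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def inferred_types(subject, triples):
    """Types of subject that follow from rdf:type statements, domains and is-a."""
    types = set()
    for s, p, o in triples:
        if s != subject:
            continue
        if p == "rdf:type":
            direct = o
        elif p in PROPERTY_DOMAIN:
            direct = PROPERTY_DOMAIN[p]
        else:
            continue
        types.add(direct)
        types.update(superclasses(direct))
    return types


EXAMPLE_TRIPLES = [("anna", "teaches", "nlp101")]
ANNA_TYPES = inferred_types("anna", EXAMPLE_TRIPLES)
```

From the single statement that anna teaches something, the schema licenses the conclusions that anna is a Teacher and therefore a Person, although neither was stated.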

RDFS : XML-based Representation

Web Ontology Language (OWL)

OWL adds further modelling vocabulary on top of RDFS, e.g.
Class equivalence
Property types (data vs. object property)

Based on Description Logics, three versions
OWL Lite
OWL DL
OWL Full

OWL

Extended knowledge representation

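[Diagram: class and property graph with an added OWL axiom]

Classes: Person, Teacher, Student, Course

is-a: Teacher is-a Person, Student is-a Person

Properties: name (value rdf:Literal), teaches, enrolledIn

Axiom: disjoint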
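A disjointness axiom is the kind of statement RDFS cannot express and OWL can. The transcript does not say which two classes the diagram marks as disjoint; the sketch below assumes Person and Course purely for illustration. It flags individuals whose asserted types, once expanded along is-a, fall into both classes of a disjoint pair. All names are hypothetical.

```python
SUBCLASS_OF = {"Teacher": "Person", "Student": "Person"}

# Assumption: the disjoint pair is not recoverable from the slide text.
DISJOINT_PAIRS = [("Person", "Course")]


def with_superclasses(cls):
    """The class itself plus everything above it in the is-a hierarchy."""
    found = {cls}
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        found.add(cls)
    return found


def disjointness_violations(asserted_types):
    """List (individual, class_a, class_b) where an individual is in both classes."""
    violations = []
    for individual, classes in sorted(asserted_types.items()):
        expanded = set()
        for cls in classes:
            expanded |= with_superclasses(cls)
        for left, right in DISJOINT_PAIRS:
            if left in expanded and right in expanded:
                violations.append((individual, left, right))
    return violations


ASSERTED = {
    "anna": {"Teacher"},
    "nlp101": {"Course"},
    "odd": {"Student", "Course"},  # deliberately inconsistent
}
VIOLATIONS = disjointness_violations(ASSERTED)
```

The individual odd is caught even though it was never declared a Person: the conflict only appears after Student has been expanded to its superclass.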

OWL : XML-based Representation

XML – RDF – RDFS - OWL

[Diagram: layers ordered from Syntax to Semantics]

XML
XML Schema: Data Types
Namespaces: Interpretation Context

RDF
RDF Schema: Formalization: Class Definition, Properties
OWL: Formalization: extended Class Definition, Properties, Property Types

Ontologies – What they are

Ontology refers to an engineering artifact
a specific vocabulary used to describe a certain reality
a set of explicit assumptions regarding the intended meaning of the vocabulary

An Ontology is
an explicit specification of a conceptualization [Gruber 93]
a shared understanding of a domain of interest [Uschold/Gruninger 96]

Ontologies – Why you need them

Make domain assumptions explicit
Easier to exchange domain assumptions
Easier to understand and update legacy data

Separate domain knowledge from operational knowledge
Re-use domain and operational knowledge separately

A community reference for applications

Shared understanding of what particular information means

Applications of Ontologies

NLP
Information Extraction, e.g. Buitelaar et al. 06, Mädche, Staab & Neumann 00, Nedellec, Rebholz
Information Retrieval (Semantic Search), e.g. WebKB (Martin et al. 00), OntoSeek (Guarino et al. 99), Ontobroker (Decker et al. 99)
Question Answering, e.g. Harabagiu, Schlobach & de Rijke, Aqualog (Lopez and Motta 04)
Machine Translation, e.g. Nirenburg et al. 04, Beale et al. 95, Hovy, Knight

Other
Business Process Modeling, e.g. Uschold et al. 98
Digital Libraries, e.g. Amann & Fundulaki 99
Information Integration, e.g. Kashyap 99; Wiederhold 92
Knowledge Management (incl. Semantic Web), e.g. Fensel 01, Staab & Schnurr 00; Sure et al. 00, Abecker et al. 97
Software Agents, e.g. Gluschko et al. 99; Smith & Poulter 99
User Interfaces, e.g. Kesseler 96

Ontologies and Their Relatives

Catalogs

Glossaries & Terminologies

Thesauri

Semantic Networks

Formal isa

Formal Instance

General logical constraints

Axioms: Disjoint/Inverse…

Thesauri – Examples : EuroVoc

EuroVoc covers terminology in all of the official EU languages for all fields (27) that concern the EU institutions, e.g. politics, trade, law, science, energy, agriculture

MT   3606 natural and applied sciences
UF   gene pool
     genetic resource
     genetic stock
     genotype
     heredity
BT1  biology
BT2  life sciences
NT1  DNA
NT1  eugenics
RT   genetic engineering (6411)

Thesauri – Examples : MeSH

MeSH (Medical Subject Headings)
organized by terms (~ 250,000) that correspond to medical subjects
for each term syntactic, morphological or semantic variants are given

MeSH Heading: Databases, Genetic
Entry Term: Genetic Databases
Entry Term: Genetic Sequence Databases
Entry Term: OMIM
Entry Term: Online Mendelian Inheritance in Man
Entry Term: Genetic Data Banks
Entry Term: Genetic Data Bases
Entry Term: Genetic Databanks
Entry Term: Genetic Information Databases
See Also: Genetic Screening
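The practical use of entry terms is to map the variants found in text onto one heading. A minimal sketch of such a lookup, built only from the record shown on this slide, might look as follows. The dictionary layout and the names normalize, build_index and heading_for are assumptions for illustration; they do not reflect the actual MeSH distribution format.

```python
MESH_RECORD = {
    "heading": "Databases, Genetic",
    "entry_terms": [
        "Genetic Databases",
        "Genetic Sequence Databases",
        "OMIM",
        "Online Mendelian Inheritance in Man",
        "Genetic Data Banks",
        "Genetic Data Bases",
        "Genetic Databanks",
        "Genetic Information Databases",
    ],
    "see_also": ["Genetic Screening"],
}


def normalize(term):
    """Case-fold and collapse whitespace so surface variants compare equal."""
    return " ".join(term.lower().split())


def build_index(records):
    """Map every normalized heading and entry term to its heading."""
    index = {}
    for record in records:
        index[normalize(record["heading"])] = record["heading"]
        for entry_term in record["entry_terms"]:
            index[normalize(entry_term)] = record["heading"]
    return index


INDEX = build_index([MESH_RECORD])


def heading_for(term):
    """Return the heading for a term, or None if the term is unknown."""
    return INDEX.get(normalize(term))
```

With this index, heading_for("genetic data banks") and heading_for("OMIM") resolve to the same heading, which is what makes the thesaurus usable for indexing and query normalization.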

Semantic Networks - Examples : UMLS

Unified Medical Language System
integrates linguistic, terminological and semantic information
Semantic Network consists of 134 semantic types and 54 relations between types

Pharmacologic Substance affects Pathologic Function
Pharmacologic Substance causes Pathologic Function
Pharmacologic Substance complicates Pathologic Function
Pharmacologic Substance diagnoses Pathologic Function
Pharmacologic Substance prevents Pathologic Function
Pharmacologic Substance treats Pathologic Function
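A semantic network of this kind can be read as a set of typed triples that license relations between semantic types. The sketch below encodes only the six statements listed on the slide; the set-of-tuples representation and the function names are illustrative assumptions and not the UMLS file format.

```python
RELATIONS = ("affects", "causes", "complicates", "diagnoses", "prevents", "treats")

# The six type-level statements from the slide, as (subject, relation, object).
SEMANTIC_NETWORK = {
    ("Pharmacologic Substance", relation, "Pathologic Function")
    for relation in RELATIONS
}


def relations_between(subject_type, object_type):
    """All relations the network allows from subject_type to object_type."""
    return sorted(
        relation
        for s, relation, o in SEMANTIC_NETWORK
        if s == subject_type and o == object_type
    )


def is_licensed(subject_type, relation, object_type):
    """True if the network contains exactly this typed statement."""
    return (subject_type, relation, object_type) in SEMANTIC_NETWORK
```

A relation extraction component could use is_licensed as a filter: a candidate statement that a pathologic function treats a pharmacologic substance is rejected because the network contains no triple in that direction.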

Semantic Networks - Examples : GO

GO (Gene Ontology)
Aligns descriptions of gene products in different databases, including plant, animal and microbial genomes
Organizing principles are molecular function, biological process and cellular component

Accession: GO:0009292
Ontology: biological process
Synonyms: broad: genetic exchange
Definition: In the absence of a sexual life cycle, the processes involved in the introduction of genetic information to create a genetically different individual.

Term Lineage
all : all (164142)
  GO:0008150 : biological process (115947)
    GO:0007275 : development (11892)
      GO:0009292 : genetic transfer (69)
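The term lineage shown above is a path from a term to the root of the hierarchy. A minimal sketch of computing it follows, using only the four terms on the slide. It assumes a single parent per term for simplicity; the Gene Ontology is in fact a directed acyclic graph in which a term may have several parents, so this is an illustration of the idea rather than a GO tool. The names PARENT, NAME, lineage and format_lineage are hypothetical.

```python
# Child -> parent links, restricted to the lineage on the slide.
PARENT = {
    "GO:0009292": "GO:0007275",
    "GO:0007275": "GO:0008150",
    "GO:0008150": "all",
}

NAME = {
    "all": "all",
    "GO:0008150": "biological process",
    "GO:0007275": "development",
    "GO:0009292": "genetic transfer",
}


def lineage(accession):
    """Path from the root down to the given term."""
    path = [accession]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return list(reversed(path))


def format_lineage(accession):
    """One indented line per ancestor, mirroring the slide layout."""
    return [
        "  " * depth + f"{term} : {NAME[term]}"
        for depth, term in enumerate(lineage(accession))
    ]
```

Calling format_lineage("GO:0009292") reproduces the indented listing, without the annotation counts in parentheses, which are not modelled here.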

Ontologies – Example I

Ontologies – Example II

[Diagram: geographical ontology; the ontology notation is marked as similar to F-Logic. Design: Philipp Cimiano]

Classes (related by is-a): Geographical Entity (GE), Natural GE, Inhabited GE, country, river, mountain, city

Instances (instance_of): Germany, Berlin, Stuttgart, Neckar, Zugspitze

Relations: flow_through, located_in, capital_of

Attributes: length (km) 367, height (m) 2962

Ontologies for NLP

Information Retrieval
Query Expansion

Machine Translation
Interlingua

Information Extraction
Template Definition
Semantic Integration

Question Answering
Question Analysis
Answer Selection

Information Extraction

Class-based Template Definition
Allows for Reasoning over Extracted Templates with Respect to the Ontology (see e.g. [Nedellec and Nazarenko 2005] for discussion)

Semantic Integration
Extraction from Heterogeneous Sources (Text, Tables and other Semi-Structured Data, Image Captions) – SmartWeb [Buitelaar et al. 06]
Multi-Document Information Extraction – ArtEquAKT [Alani et al. 2003]

Question Answering

Question Analysis
Ontology/WordNet-based Semantic Question Interpretation (e.g. [Pasca and Harabagiu 01])

Answer Selection
Ontology/WordNet-based Reasoning for Answer Type-Checking
Ontology of Events [Sinha and Narayanan 05]
Geographical Ontology, WordNet [Schlobach & de Rijke 04]
WordNet [Pasca and Harabagiu 01]

Ontology-based Question Answering
Derive Answers from a Knowledge Base (e.g. Aqualog [Lopez & Motta 04])
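Answer type-checking can be sketched as a filter over candidate answers: a candidate survives only if the ontology places it under the type the question expects. The following is a simplified illustration and not the method of any of the cited systems. It reuses the class and instance names of the geographical example; where exactly each class sits in the hierarchy (for instance country under Inhabited GE) is an assumption, as are the function names.

```python
# Assumption: placement of classes in the hierarchy is illustrative.
IS_A = {
    "city": "Inhabited GE",
    "country": "Inhabited GE",
    "river": "Natural GE",
    "mountain": "Natural GE",
    "Inhabited GE": "GE",
    "Natural GE": "GE",
}

INSTANCE_OF = {
    "Berlin": "city",
    "Stuttgart": "city",
    "Germany": "country",
    "Neckar": "river",
    "Zugspitze": "mountain",
}


def has_type(entity, expected_type):
    """True if entity is an instance of expected_type or of one of its subclasses."""
    cls = INSTANCE_OF.get(entity)
    while cls is not None:
        if cls == expected_type:
            return True
        cls = IS_A.get(cls)
    return False


def type_check(candidates, expected_type):
    """Keep the candidates compatible with the expected answer type, in order."""
    return [c for c in candidates if has_type(c, expected_type)]
```

For a question such as "Which river flows through Stuttgart?" the expected type would be river, so type_check(["Berlin", "Neckar"], "river") keeps only Neckar. Unknown candidates are dropped, which is a design choice; a more cautious system might keep them with a lower score.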

Ontology Life Cycle

Create/Select: Development and/or Selection

Populate: Knowledge Base Generation

Validate: Consistency Checks

Evolve: Extension, Modification

Maintain: Usability Tests

Deploy: Knowledge Retrieval

NLP in the Ontology Life Cycle

Ontology Population: Information Extraction

Ontology Learning: Text Mining

KB Retrieval: Question Answering

Ontology Learning

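[Diagram: ontology learning layers, from bottom to top, each with an example. Design: Philipp Cimiano]

Terms: river, country, nation, city, capital, ...

(Multilingual) Synonyms: {country, nation, Land}

Concept Formation: $c := \mathrm{country} := \langle i(c), \|c\|, \mathrm{Ref}_C(c) \rangle$

Concept Hierarchy: $\mathrm{capital} \leq_C \mathrm{city}$, $\mathrm{city} \leq_C \mathrm{Inhabited\ GE}$

Relations: $\mathrm{flow\_through}(\mathrm{dom}: \mathrm{river}, \mathrm{range}: \mathrm{GE})$

Relation Hierarchy: $\mathrm{capital\_of} \leq_R \mathrm{located\_in}$

Axiom Schemata: $\mathrm{disjoint}(\mathrm{river}, \mathrm{mountain})$

General Axioms: $\forall x\, (\mathrm{country}(x) \rightarrow \exists y\, \mathrm{capital\_of}(y, x) \wedge \forall z\, (\mathrm{capital\_of}(z, x) \rightarrow y = z))$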
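Two of the upper layers can be made concrete with a small sketch. The relation hierarchy example says that capital_of is a subrelation of located_in, so every capital_of fact entails a located_in fact. The general axiom example says that each country has exactly one capital. The code below treats that axiom as an integrity check over the known facts, which is a closed-world reading and an assumption of this sketch; under open-world semantics a missing capital would simply be unknown. All function names and the example individuals are illustrative.

```python
# Relation hierarchy from the slide: capital_of is a subrelation of located_in.
SUBRELATION_OF = {"capital_of": "located_in"}


def close_under_relation_hierarchy(facts):
    """Add every fact entailed by the subrelation links."""
    closed = set(facts)
    frontier = list(facts)
    while frontier:
        relation, subject, obj = frontier.pop()
        parent = SUBRELATION_OF.get(relation)
        if parent is not None and (parent, subject, obj) not in closed:
            closed.add((parent, subject, obj))
            frontier.append((parent, subject, obj))
    return closed


def countries_without_unique_capital(countries, facts):
    """Countries for which the known facts give zero or several capitals."""
    offenders = []
    for country in sorted(countries):
        capitals = {s for r, s, o in facts if r == "capital_of" and o == country}
        if len(capitals) != 1:
            offenders.append(country)
    return offenders


FACTS = {("capital_of", "Berlin", "Germany")}
CLOSED_FACTS = close_under_relation_hierarchy(FACTS)
```

After closure, the fact that Berlin is located in Germany is available without having been stated, and a country with no recorded capital is reported by the check.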

Words and Object Descriptions

Semantics on the Semantic Web

The “Lexical Semantic Web”

A Lexicon Model for Ontologies

Dictionary: Words and Senses

Represent interpretations of words through senses, very much like classes that are assigned to a word, e.g.

article

1. An individual thing or element of a class…
2. A particular section or item of a series in a written document…
3. A non-fictional literary composition that forms an independent part of a publication…
4. The part of speech used to indicate nouns and to specify their application
5. A particular part or subject; a specific matter or point

(as provided by http://dictionary.reference.com/)

Ontology: Classes and Labels - I

Ontologies assign labels (i.e. words) to a given class

In the COMMA ontology on document management the class article corresponds to sense 2 (‘section of a written document’):

http://pauillac.inria.fr/cdrom/ftp/ocomma/comma.rdfs

Ontology: Classes and Labels - II

In the GOLD ontology on linguistics, the class label article corresponds to sense 4 (‘part of speech’):

http://emeld.org/gold

The Meaning of Director - I

The Semantic Web can be viewed as a large, distributed dictionary (or rather a semantic lexicon) in which we can look up the meaning of words, e.g. director

… as a ‘role’ (AgentCities ontology)

http://www-agentcities.doc.ic.ac.uk/ontology/shows.daml

The Meaning of Director - II

… as ‘head of a program’ (University Benchmark ontology)

http://www.lehigh.edu/~zhp2/2004/0401/univ-bench.owl
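The view of the Semantic Web as a distributed lexicon can be sketched with the four label readings mentioned so far: article in COMMA and GOLD, director in AgentCities and the University Benchmark ontology. The code below is an illustrative sketch; the short readings are paraphrases taken from the slides, and the data layout and function names are assumptions.

```python
from collections import defaultdict

# (label, ontology, reading) as described on the slides.
LABELLED_CLASSES = [
    ("article", "COMMA", "section of a written document"),
    ("article", "GOLD", "part of speech"),
    ("director", "AgentCities", "role"),
    ("director", "University Benchmark", "head of a program"),
]


def build_lexicon(entries):
    """Group the readings of each label across ontologies."""
    lexicon = defaultdict(list)
    for label, ontology, reading in entries:
        lexicon[label.lower()].append((ontology, reading))
    return dict(lexicon)


def ambiguous_labels(lexicon):
    """Labels that receive more than one reading across the collection."""
    return sorted(label for label, readings in lexicon.items() if len(readings) > 1)


LEXICON = build_lexicon(LABELLED_CLASSES)
```

Looking up LEXICON["director"] returns both readings with their source ontology, which is the lexicon-style lookup the slides describe.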

Exploring the Lexical Semantic Web

Collect ontologies
OntoSelect

Analyse the use of class/property labels

Treat class/property labels as lexical entries
Normalize
Organize by language
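The last step, treating labels as lexical entries, can be sketched as follows. The slides do not specify how normalization is done; the rules used here (splitting camelCase, replacing underscores and hyphens with spaces, lowercasing) are a plausible assumption and not a description of OntoSelect. The language codes and function names are likewise illustrative.

```python
import re
from collections import defaultdict


def normalize_label(label):
    """Turn an identifier-style label into a lowercased, space-separated entry."""
    spaced = re.sub(r"[_\-]+", " ", label)
    # Insert a space at each lowercase-to-uppercase boundary (camelCase).
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", spaced)
    return " ".join(spaced.lower().split())


def organize_by_language(labels):
    """Group normalized labels by language tag; labels are (text, language) pairs."""
    lexicon = defaultdict(set)
    for text, language in labels:
        lexicon[language].add(normalize_label(text))
    return {language: sorted(entries) for language, entries in lexicon.items()}


LABELS = [
    ("enrolledIn", "en"),
    ("capital_of", "en"),
    ("School", "en"),
    ("school", "en"),
    ("Fakultät", "de"),
]
BY_LANGUAGE = organize_by_language(LABELS)
```

Normalization merges School and school into one entry, and turns identifier-style property labels such as enrolledIn into phrases that can be matched against text.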

Ontology Collection

OntoSelect
Web Monitor on DAML, RDFS, OWL Files
Download, Analyze and Store Included Information and Metadata
Class and Property Labels
Multilingual Information
Included Ontologies

Ontology Ranking and Selection Functionalities

http://olp.dfki.de/OntoSelect

OntoSelect

Multilinguality on the Semantic Web

Multilingual Labels

“Lexical Semantic Ambiguity”

Words and Object Descriptions

Semantics on the Semantic Web

The “Lexical Semantic Web”

A Lexicon Model for Ontologies

Ontologies – Example III

Ontologies – Example III (continued)

[Diagram] Classes: Campus, University, “Fakultät”, Student, Staff

Relations: located_at, is_part_of, studies_at, works_at

Ontologies – Example III (continued)

[Diagram] Classes: Campus, University, “Fakultät”, Student, Staff

Relations: located_at, is_part_of, studies_at, works_at

Terms attached to the class “Fakultät”: has_German_term Fakultät, has_US_English_term School, has_Dutch_term Faculteit

Ontologies – Example III (continued)

[Diagram] Classes: University, “Fakultät” (is_part_of), Term (linked from “Fakultät” by has_term)

Instances of Term (instance_of), each with a language:

Fakultät: language DE
faculteit: language NL
school: language EN-US
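The step from one property per language (has_German_term, has_Dutch_term, ...) to a single has_term relation pointing at Term instances with a language attribute can be sketched with two small data classes. This is an illustration of the modelling idea only; the class and method names are assumptions and do not reproduce the LingInfo model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Term:
    """A lexical item with an explicit language, as in the Term instances above."""
    text: str
    language: str


@dataclass
class OntologyClass:
    """A class that is linked to any number of terms through one relation."""
    name: str
    terms: List[Term] = field(default_factory=list)

    def term_for(self, language: str) -> Optional[str]:
        """Return the term text for a language, or None if there is none."""
        for term in self.terms:
            if term.language == language:
                return term.text
        return None


FACULTY = OntologyClass(
    name="Fakultät",
    terms=[
        Term("Fakultät", "DE"),
        Term("faculteit", "NL"),
        Term("school", "EN-US"),
    ],
)
```

Adding a further language now means adding one more Term instance, for example FACULTY.terms.append(Term("faculté", "FR")) with a hypothetical French entry, rather than introducing a new property into the ontology.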

Semiotic Triangle

Ogden & Richards, 1923
based on Structural Linguistics studies (de Saussure, 1916)
adopted in Knowledge Representation (e.g. Sowa, 1984)

LingInfo Model – Simplified

Design: Michael Sintek

LingInfo Model

LingInfo Instances - Example

Fußballspielers (“of the football player”)

LingInfo Predicate-Arg Structure

Design: Anette Frank

Conclusions

WordNet: Appropriate Use may include
Introduction of underspecified senses (sense grouping)
Tuning to a domain

The “Lexical Semantic Web”
The Semantic Web (and Web 2.0) is a potentially rich resource for (formal) lexical semantics
Mining such resources for lexical semantics (i.e. compilation of a distributed semantic lexicon) has only just started
Ontologies are to be extended with linguistic/lexical information