Large Scale Integration of Senses for the Semantic Web

Jorge Gracia, Mathieu d’Aquin, Eduardo Mena

Computer Science and Systems Engineering Department (DIIS), University of Zaragoza, Spain
Knowledge Media Institute (KMi), Open University, United Kingdom

18th International World Wide Web Conference (WWW 2009)
Madrid, Spain, 20th-24th April 2009

Outline

Introduction
Method
Optimization study
Experiments
Conclusions

Introduction

Current Semantic Web
Favoured by the increasing amount of online ontologies already available on the Web
Hampered by the high heterogeneity that this growing semantic content introduces

The redundancy problem
Excess of different semantic descriptions, coming from different sources, to describe the same intended meaning

Our proposal
A method to cluster the ontology terms that one can find on the Semantic Web, according to the meaning that they intend to represent

Redundancy problem: many representations of the same meanings

[Figure: the keyword “apple” searched through Watson on the Semantic Web]

Proposed solution: pool of cross-ontology integrated senses

[Figure: a “clustered” Watson returns the keyword “apple” on the Semantic Web as three integrated senses: the fruit, the tree, and the company]

[Figure: Watson and the Semantic Web, surrounded by related systems: Multiontology Semantic Disambiguator, Ontology Evolution, Semantic Browsing, Scarlet (Ontology Matching), Folksonomy Enrichment, QueryGen (Semantic Query Generation), Question Answering]

Method

[Figure: flowchart of the method.
OFF-LINE: ontology terms are extracted from Watson into keyword maps; synonym expansion turns keyword maps into synonym maps; sense clustering (for each synonym map) computes similarity with CIDER, integrates terms when similarity > threshold, and repeats while there are more ontology terms, producing senses.
RUN-TIME: if the integration degree is modified, the senses are re-clustered: disintegration when the threshold rises, integration otherwise.]

Keyword maps: ontology terms with identical label

[Figure: the ontology terms labelled “apple” that Watson returns, grouped into one keyword map]

Synonym maps: ontology terms with synonym labels

[Figure: the keyword map for “apple” from Watson, extended with terms whose labels are synonyms: “apple tree”, “Apple Inc.”, “manzana”]
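The keyword-map and synonym-map steps can be sketched as follows. This is a minimal illustration, not the authors' implementation: the term URIs, the `SYNONYMS` table, and the function names `build_keyword_maps` and `build_synonym_map` are all hypothetical, and label matching is assumed here to be case-insensitive.

```python
from collections import defaultdict

# Hypothetical (URI, label) pairs standing in for terms retrieved from Watson.
TERMS = [
    ("http://example.org/ontA#apple", "apple"),
    ("http://example.org/ontB#Apple", "Apple"),
    ("http://example.org/ontC#manzana", "manzana"),
    ("http://example.org/ontD#apple_tree", "apple tree"),
    ("http://example.org/ontE#pear", "pear"),
]

# Hypothetical synonym table; the slides do not say where synonyms come from.
SYNONYMS = {"apple": {"manzana", "apple tree", "apple inc."}}


def build_keyword_maps(terms):
    """Group term URIs by normalised label: one keyword map per label."""
    keyword_maps = defaultdict(list)
    for uri, label in terms:
        keyword_maps[label.strip().lower()].append(uri)
    return dict(keyword_maps)


def build_synonym_map(keyword, keyword_maps, synonyms):
    """Merge the keyword map of `keyword` with those of its synonym labels."""
    labels = [keyword] + sorted(synonyms.get(keyword, ()))
    synonym_map = []
    for label in labels:
        synonym_map.extend(keyword_maps.get(label, []))
    return synonym_map


keyword_maps = build_keyword_maps(TERMS)
apple_synonym_map = build_synonym_map("apple", keyword_maps, SYNONYMS)
print(len(keyword_maps["apple"]), "terms labelled 'apple'")
print(len(apple_synonym_map), "terms in the synonym map for 'apple'")
```

The synonym map is only a candidate set: terms such as the fruit and the company still share it, and separating them is left to the clustering step.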

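Agglomerative clustering

[Figure: agglomerative clustering with CIDER over terms a, b, c, d: a and d are merged into a new sense a’, which is compared again with the remaining terms b and c and merged further into a’’, and so on]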
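A threshold-driven agglomerative clustering of this kind can be sketched as below. The sketch rests on stated assumptions: CIDER is replaced by a stand-in Jaccard overlap between hypothetical bags of descriptive words, a merged sense is assumed to take the union of its members' words, and the term identifiers and the name `cluster_senses` are invented for illustration. It is not the authors' algorithm.

```python
from __future__ import annotations

from itertools import combinations

# Hypothetical terms: identifier -> descriptive words (stand-in for the
# ontological context that a real similarity measure would inspect).
TERMS = {
    "ontA#apple": frozenset({"fruit", "food", "edible"}),
    "ontB#apple": frozenset({"fruit", "food", "sweet"}),
    "ontC#apple": frozenset({"company", "computer", "brand"}),
    "ontD#Apple_Inc": frozenset({"company", "computer", "iphone"}),
    "ontE#apple_tree": frozenset({"tree", "plant", "orchard"}),
}


def jaccard(a: frozenset, b: frozenset) -> float:
    """Stand-in similarity (NOT CIDER): shared words over all words."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_senses(terms: dict[str, frozenset], threshold: float) -> list[dict]:
    """Merge the most similar pair of senses until none exceeds the threshold."""
    senses = [{"members": [tid], "context": ctx} for tid, ctx in terms.items()]
    while len(senses) > 1:
        # Score every pair of current senses and pick the best one.
        best_score, (i, j) = max(
            (jaccard(senses[i]["context"], senses[j]["context"]), (i, j))
            for i, j in combinations(range(len(senses)), 2)
        )
        if best_score <= threshold:
            break  # no remaining pair is similar enough to integrate
        merged = {
            "members": senses[i]["members"] + senses[j]["members"],
            "context": senses[i]["context"] | senses[j]["context"],
        }
        # Replace the two merged senses with the new one, then compare again.
        senses = [s for k, s in enumerate(senses) if k not in (i, j)] + [merged]
    return senses


for sense in cluster_senses(TERMS, threshold=0.22):
    print(sorted(sense["members"]))
```

With this toy data, a threshold of 0.22 yields three senses (fruit, company, tree), while a higher threshold such as 0.6 leaves all five terms separate, which mirrors how raising the threshold reduces integration.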

Sense maps: semantically equivalent terms grouped

[Figure: CIDER groups the terms of the “apple” synonym map (apple, apple tree, Apple Inc., manzana) into three sense maps: the fruit, the tree, and the company]

[Figure: adjusting the integration degree around the optimal threshold: a falling threshold leads to integration, a rising threshold to disintegration]

Optimization study

Integration level varies with similarity threshold

Integration level = 1 - (# final senses / # initial ontology terms)

Which similarity threshold is the best one? Three ways of exploring it:

Experimenting with ontology matching benchmarks
Obtained a lower bound of 0.13 for the optimal threshold

Contrasting with human opinion
Range of good values between 0.2 and 0.3

Optimizing response time, because:
It reduces the response time of the overall system
It is compatible with the other two ways
It is not always feasible to have enough people to ask, or enough reference alignments

Response time varies with threshold
Optimal value around 0.22

Experiments

Scalability study
9156 keywords, 73169 different ontology terms to be clustered
Processing time is linear in the number of ontology terms

Scalability study
Processing time is independent of ontology size

Illustrative example
Keyword = turkey
Synonym map = turkey, Türkei, Türkiye
No. of ontology terms = 58
No. of integrated senses = 9 (threshold = 0.27)

More examples (threshold = 0.19)

Keyword        #initial terms   #final senses
appalachian     7                1
apple          39                7
free           51                2
mace            7                3
plant          52               18
poll            5                4
stein           5                1
turkey         58                8
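Applying the integration level formula (1 minus final senses over initial ontology terms) to these rows makes the differences between keywords easier to compare. The sketch below only does the arithmetic on the figures from the table; the names `integration_level` and `RESULTS` are illustrative.

```python
def integration_level(initial_terms: int, final_senses: int) -> float:
    """Integration level = 1 - (# final senses / # initial ontology terms)."""
    if initial_terms <= 0:
        raise ValueError("initial_terms must be positive")
    return 1 - final_senses / initial_terms


# (initial ontology terms, final senses) per keyword, threshold = 0.19.
RESULTS = {
    "appalachian": (7, 1),
    "apple": (39, 7),
    "free": (51, 2),
    "mace": (7, 3),
    "plant": (52, 18),
    "poll": (5, 4),
    "stein": (5, 1),
    "turkey": (58, 8),
}

for keyword, (initial, final) in RESULTS.items():
    level = integration_level(initial, final)
    print(f"{keyword:<12} {initial:>3} -> {final:>2}   integration level = {level:.2f}")
```

For instance, “poll” (5 terms, 4 senses) reaches an integration level of 0.20, whereas “stein” (5 terms, 1 sense) reaches 0.80.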

Positive facts
Terms from different versions of the same ontology are easily detected
Very different meanings are not wrongly integrated (e.g., “plant” as “living organism” with “plant” as “industrial buildings”)

Negative facts
Hard to obtain a total integration of the same meanings (caused by very different semantic descriptions)

Conclusions

Redundancy of semantic descriptions on the Web can be significantly reduced
Our integration technique scales when used on a large body of knowledge
The proposed method is flexible enough to configure and adapt the integration level to the needs of client applications

Future work
More advanced prototype
More extensive human-based evaluation
Study and evaluation of the impact on other systems

END of presentation

Thank you!