Semantic Wikipedia: The missing links

Denny Vrandečić
AIFB, Universität Karlsruhe (TH)
Semantics 2006, Vienna, November 28th, 2006

Universal Access to All Knowledge

Marrying Wikipedia and the Semantic Web

Overview

Background: Wikis, Wikipedia, and the Semantic Web
Semantic Wikipedia: Idea, What it looks like, Advantages
Challenges and Opportunities: Web ecosystem, Open issues, Next steps

Wikis

Long, long time ago…

Using Pattern Languages for Object-Oriented Programming, OOPSLA '87

Design Patterns

[Diagram: Design Patterns, Web Page, HTML, Simple Syntax]

Wikis

Everyone can edit
  Technology allows editing
  Syntax is easy to learn
History of edits
  Recent changes
  Community building / Attribution
Easy to revert
  Important for fighting vandalism
  No fear of breaking the system

Wikipedia

A new encyclopedia

Created March 2000
Free, web-based encyclopedia
Everyone can read
Expert authors and editors
Extensive formal peer review
Until January 2001: 22 articles
Side project: wiki-based Nupedia
January 2001: edit
Until January 2002: 20,342 articles in 17 languages, 17,307 in English

Wikipedia growth

Year      Articles    English     Languages
2002      20,342      17,307      17
2003      133,129     98,475      25
2004      420,562     189,124     52
2005      1,311,697   438,289     162
2006      3,100,360   893,237     197
11/2006   5,565,830   1,462,910   250
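The growth figures are easier to compare as ratios between consecutive rows. The following is a minimal sketch that only reuses the article counts from the table; the function name `growth_factors` is made up for illustration, and the last row (11/2006) does not cover a full year, so its factor is not directly comparable with the others.

```python
# Article counts (all languages) copied from the growth table.
counts = [
    ("2002", 20_342),
    ("2003", 133_129),
    ("2004", 420_562),
    ("2005", 1_311_697),
    ("2006", 3_100_360),
    ("11/2006", 5_565_830),  # partial interval, not a full year
]


def growth_factors(rows):
    """Return (label, factor) pairs comparing each row with the previous one."""
    return [
        (label, current / previous)
        for (_, previous), (label, current) in zip(rows, rows[1:])
    ]


factors = growth_factors(counts)
for label, factor in factors:
    print(f"{label}: x{factor:.2f}")
```

Every factor is above 1, with the largest jump between the first two rows.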

Wikipedia users

2.7 million registered users
About 70,000 contributors
2% (1,400) make 73.4% of all edits
Most content from wide user base
Clean up / “gardening” by small group

[English Wikipedia, study by Aaron Swartz]

Wikipedia’s problems

But it can’t work!

Everyone can edit
  Repairing is easier than breaking
No special status for experts
  Community building
Nature study on quality
  Error rate about 4 : 3 (Wikipedia : Encyclopædia Britannica)
  Controversial

Quality

Hard to discover factual errors
Focusing on quality
Repeated facts
Large number of lists
  But still not all “interesting” lists

Coverage by language

English: 1.5 million
German: 0.5 million
10 more languages: 100,000+
40 more languages: 10,000+
But what about other languages?

Semantic Web

[Graph: nodes Angola, Zambia, Africa, Country, Continent; two edges labelled "located in" and one edge labelled "borders"]

[Same graph, with every node and edge identified by a URI:]
http://wiki.ontoworld.org/index.php/_Angola
http://wiki.ontoworld.org/index.php/_Africa
http://wiki.ontoworld.org/index.php/_Zambia
http://wiki.ontoworld.org/index.php/_Relation-3ALocated_in
http://wiki.ontoworld.org/index.php/_Relation-3ABorders
http://wiki.ontoworld.org/index.php/_Category-3ACountry
http://wiki.ontoworld.org/index.php/_Category-3AContinent

[Same graph, with http://www.w3.org/2000/01/rdf-schema#label attaching the labels Angola, Africa, Zambia, Located in, Borders, Country, Continent to those URIs]
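The graph on these slides can be written down as subject, predicate, object triples. The sketch below is only an illustration: the URIs follow the pattern shown on the slides, but the edge directions (Angola and Zambia each located in Africa, Angola bordering Zambia) are an assumption read off the diagram, and the helper names `thing`, `relation` and `outgoing` are hypothetical.

```python
# URI patterns as they appear on the slides.
BASE = "http://wiki.ontoworld.org/index.php/_"


def thing(name):
    """URI of a thing such as Angola."""
    return BASE + name


def relation(name):
    """URI of a relation such as Located_in."""
    return BASE + "Relation-3A" + name


# Edge directions are an assumption read off the diagram.
triples = {
    (thing("Angola"), relation("Located_in"), thing("Africa")),
    (thing("Zambia"), relation("Located_in"), thing("Africa")),
    (thing("Angola"), relation("Borders"), thing("Zambia")),
}


def outgoing(subject):
    """Return the sorted (predicate, object) pairs leaving a subject node."""
    return sorted((p, o) for s, p, o in triples if s == subject)


for predicate, obj in outgoing(thing("Angola")):
    print(predicate, "->", obj)
```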

Semantic Wikipedia: Idea


Wikipedia today

Brač is a Croatian island in the Adriatic Sea. The island has a population of 13,000, living in numerous little towns, ranging from the 'main town' Supetar, with more than 2,500 inhabitants, to Novo Selo, where only a dozen people live.

Today, Brač lives mostly on tourism, but fishing and agriculture (especially wine and olives) are very important too, as is selling its precious, white stone (which was used in building Diocletian's Palace in Split, and is built into the White House in Washington, DC, too).

Category: Croatian Island

[Links in the article: Brač, Croatia, Adriatic Sea, Italy, tourism, Zagreb, Split, Montenegro]

How are they linked?

Brač belongs to Croatia
Brač located in Adriatic Sea
Brač has town Supetar
Brač has town Novo Selo
Brač lives on tourism
Brač lives on fishing
Brač lives on agriculture

Typed links

Brač is a [[Croatia]]n island with an area of 396 km².

Brač is a [[belongs to::Croatia]]n island with an area of [[area:=396 km²]].

Extend wiki with typed links
So the computer “understands” it

[Diagram: Brač, belongs to, Croatia; Brač, area, 396 km²]
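To make the typed-link syntax concrete, here is a minimal sketch of how the two annotation forms on this slide could be pulled out of wiki text. It is not the extension's actual parser; the regular expression and the function name `extract_statements` are assumptions for illustration, and only the `::` and `:=` forms shown on the slide are handled.

```python
import re

# [[relation::target]] is a typed link, [[attribute:=value]] an attribute.
# Plain links such as [[Croatia]] do not match.
TYPED = re.compile(r"\[\[([^\[\]:]+?)(::|:=)([^\[\]]+?)\]\]")


def extract_statements(subject, wikitext):
    """Return (subject, kind, name, value) tuples for every annotation found."""
    statements = []
    for name, separator, value in TYPED.findall(wikitext):
        kind = "relation" if separator == "::" else "attribute"
        statements.append((subject, kind, name.strip(), value.strip()))
    return statements


text = (
    "Brač is a [[belongs to::Croatia]]n island "
    "with an area of [[area:=396 km²]]."
)
statements = extract_statements("Brač", text)
for statement in statements:
    print(statement)
```

The untyped sentence with `[[Croatia]]` yields no statements, which is the point of the slide: only the typed version tells the computer how the two pages are related.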

What does it look like?

Advantages

Automatic tables and lists

Many pages answer questions
  list of female tennis players
  asteroids named after people
  countries sorted by area, population, …
They can be generated automatically
  Fewer maintenance tasks
  Higher consistency

Inline queries

<ask>
[[Category:Country]]
[[located in::Africa]]
[[population:=>1,000,000]]
[[population:=<10,000,000]]
[[population:=*]]
[[area:=*km²]]
[[borders::*]]
</ask>
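The selection part of such a query amounts to filtering annotated pages. The sketch below illustrates that idea and is not how the extension evaluates queries. The page data is entirely hypothetical placeholder data, the function name `ask` is made up, and treating the population bounds as strict comparisons is an assumption about the query semantics.

```python
# Hypothetical placeholder pages; names and numbers are invented.
pages = {
    "Country A": {"categories": {"Country"}, "located in": {"Africa"},
                  "population": 5_000_000},
    "Country B": {"categories": {"Country"}, "located in": {"Africa"},
                  "population": 50_000_000},
    "Country C": {"categories": {"Country"}, "located in": {"Europe"},
                  "population": 5_000_000},
    "City D": {"categories": {"City"}, "located in": {"Africa"},
               "population": 2_000_000},
}


def ask(pages, category, located_in, min_population, max_population):
    """Return sorted titles of pages that satisfy all conditions."""
    return sorted(
        title
        for title, page in pages.items()
        if category in page["categories"]
        and located_in in page["located in"]
        # Strict bounds are an assumption about the query semantics.
        and min_population < page["population"] < max_population
    )


result = ask(pages, "Country", "Africa", 1_000_000, 10_000_000)
print(result)
```

The remaining query parts with `*` only choose which properties to display for each hit, so they are left out of the filter.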

Inline query results

Ontoworld

Hand-crafted checks

Does every country have one capital?
Is there a person with more than one mother?
Is every person born before dying?
Does the population density fit population and size?
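Two of these checks are simple enough to sketch directly. The functions below are illustrative only: their names, the 5% tolerance, and the way the data is passed in are all assumptions. The sample values reuse the population (13,000) and area (396 km²) given for Brač elsewhere in the slides.

```python
def density_is_plausible(population, area_km2, stated_density, tolerance=0.05):
    """Check that a stated density matches population / area within a tolerance."""
    expected = population / area_km2
    return abs(expected - stated_density) <= tolerance * expected


def born_before_dying(birth_year, death_year):
    """A missing death year (None) is fine; otherwise birth must not come later."""
    return death_year is None or birth_year <= death_year


# 13,000 inhabitants on 396 km² gives roughly 32.8 people per km².
print(density_is_plausible(13_000, 396, 32.8))
print(density_is_plausible(13_000, 396, 100.0))
print(born_before_dying(1950, 1900))
```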

Multilinguality

Automatic check of consistency over language boundaries

Generating pages for smaller Wikipedias?

Browsing information in different languages

Web ecosystem

Wikipedia as a vocabulary

Semantic Wikipedia as a resource of URIs
Maintaining good URIs is hard
  Documented
  Multilingual labels
  Enables easier mapping
Reuse for common terms
  Helps in mapping the rest structurally

Scenario: looking for gifts

Till looks for a gift for Chrissie, for 20 €

[Diagram nodes: Till, Chrissie, Robert Jordan, Wheel of Time, 14,95 €; edge labels: friend, fan, author of, price, suggest]

Wikipedia as knowledge base

Tools already integrate Wikipedia articles

Bits of knowledge make more sense!

Amarok with knowledge import instead of full articles?

Use Semantic Web tools

Integration of data
Querying
  SPARQL endpoints
Browsing
  Faceted browsers
Visualization
  Timeline

SMW installations

Ontoworld sem’base WWW2006 & ESWC2006 Wiki Semantic Karlsruhe KM Bible wiki Esoteric knowledge wiki Wikicompany JurisPedia …

Open issues

Convergence of vocabulary

Consistency of vocabulary
  Author of, has written, creator…
Documentation of all types
  Visual feedback: red and blue links
Queries: consistent vocabulary needed
  Autocompletion? UI hints?
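One simple way to picture the vocabulary problem is an alias table that maps variant property names to a single canonical name before querying. This is only a sketch of the idea, not a feature described in the talk. The table `ALIASES` and the function `canonical` are hypothetical, and treating the three names from the slide as plain synonyms is an assumption that ignores, for example, the direction in which "creator" is used.

```python
# Hypothetical alias table; choosing "author of" as canonical is arbitrary.
ALIASES = {
    "has written": "author of",
    "creator": "author of",
}


def canonical(property_name):
    """Normalise case and whitespace, then map known aliases to one name."""
    key = " ".join(property_name.lower().split())
    return ALIASES.get(key, key)


for name in ["Author of", "has  written", "Creator", "borders"]:
    print(name, "->", canonical(name))
```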

NLP and Semantic Wikipedia

Suggestion / learning of relationships
  Based on patterns
  Statistical
  Background knowledge
Good playground for evaluations

Simpler User Interface

WYSIWYG Interface for MediaWiki is in development

How to integrate with Semantic Extension?


Lack of expressivity

How to say “Ronald Reagan was US president from 1981 to 1989”?
What about relations between relations, like inverses?
What about transitivity, symmetry?
What about constraints like class disjointness?

Next steps

Marry Wikipedia and Semantic Web
Need to run stress tests
  Wikipedia at 12,000 hits per second
  Scalability, scalability, scalability
Show cool apps
Multilinguality
Tons of details

Conclusions

Very flexible system for creating data
“Soft” introduction
  People are often scared of the Semantic Web
  You can still use it as a standard wiki
  Immediate benefit
“The simplest database that could work”
Data is there – play with it!
Kickstart the Semantic Web

Thank you!

ontoworld.org

Backup slides

Technicalities

MediaWiki

Runs Wikipedia
Active development
Scalable
Easy to use and powerful
PHP / MySQL
  Not many SemWeb tools here

Mapping of OWL to SMW

OWL                          SMW
owl:Individual               Article
owl:Class                    Category
owl:ObjectProperty           Relation, Link type
owl:DatatypeProperty         Attribute
Object property instance     Typed link [[property::object]]
Datatype property instance   [[property:=value]]
rdf:type class               Class instantiation: [[Category:class]] (on article page)
rdfs:subClassOf class        Subsumption: [[Category:class]] (on category page)
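Read in the OWL-to-wiki direction, the mapping says which markup to write for each kind of statement. The following sketch renders the four instance-level rows of the mapping; the function name `to_wikitext` and the `kind` labels are made up for illustration and are not part of any tool mentioned in the talk.

```python
def to_wikitext(kind, name, value=None):
    """Render one OWL-style statement as the wiki markup from the mapping."""
    if kind == "object_property":
        return f"[[{name}::{value}]]"
    if kind == "datatype_property":
        return f"[[{name}:={value}]]"
    # rdf:type goes on the article page, rdfs:subClassOf on the category
    # page; the markup itself is the same in both cases.
    if kind in ("class_instantiation", "subsumption"):
        return f"[[Category:{name}]]"
    raise ValueError(f"unsupported statement kind: {kind}")


print(to_wikitext("object_property", "belongs to", "Croatia"))
print(to_wikitext("datatype_property", "area", "396 km²"))
print(to_wikitext("class_instantiation", "Country"))
```

Class instantiation and subsumption produce the same markup, so where the line is placed (article page or category page) carries the difference.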

Reuse vocabulary

Existing vocabulary can be mapped
Wiki directly usable as data source
  No external mapping required
  Define vocabulary and mapping
But: no complex mappings

Reuse in a Webpage


SPARQL: RDF Query Language

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX thing: <http://wiki.ontoworld.org/index.php/_>
PREFIX relation: <http://wiki.ontoworld.org/index.php/_Relation-3A>

SELECT ?label
WHERE {
  thing:Angola relation:Located_in ?c .
  ?c rdfs:label ?label
}
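What this query does can be traced by hand: expand the prefixed names, find everything Angola is located in, then look up the label of each result. The sketch below imitates those two steps in plain Python over a tiny in-memory triple list. It is not a SPARQL engine; the three triples are an assumed excerpt of the wiki's data, and `expand` and `labels_of_locations` are hypothetical helper names.

```python
PREFIXES = {
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "thing": "http://wiki.ontoworld.org/index.php/_",
    "relation": "http://wiki.ontoworld.org/index.php/_Relation-3A",
}


def expand(qname):
    """Turn a prefixed name such as thing:Angola into a full URI."""
    prefix, local = qname.split(":", 1)
    return PREFIXES[prefix] + local


# Assumed excerpt of the exported data, for illustration only.
TRIPLES = [
    (expand("thing:Angola"), expand("relation:Located_in"), expand("thing:Africa")),
    (expand("thing:Africa"), expand("rdfs:label"), "Africa"),
    (expand("thing:Angola"), expand("rdfs:label"), "Angola"),
]


def labels_of_locations(subject_qname):
    """Mimic the two triple patterns: subject Located_in ?c, then ?c label ?label."""
    subject = expand(subject_qname)
    located_in = expand("relation:Located_in")
    label = expand("rdfs:label")
    containers = {o for s, p, o in TRIPLES if s == subject and p == located_in}
    return sorted(o for s, p, o in TRIPLES if s in containers and p == label)


print(labels_of_locations("thing:Angola"))
```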


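<html>
<body>
Angola is in
<?php
// Path to the RDF API library ("path/api/" is the slide's placeholder)
define("RDFAPI_INCLUDE_DIR", "path/api/");
include(RDFAPI_INCLUDE_DIR . "RDFAPI.php");

// Load the RDF export of the Angola page into a model
$model = ModelFactory::getDefaultModel();
$model->load("full URI/ExportRDF/Angola");

// 'SPARQL' stands for the text of the SPARQL query
$result = $model->sparqlQuery('SPARQL');

// Print the label bound to ?label in the first result row
$value = $result[0]['?label'];
echo $value->getLabel();
?>
</body>
</html>

Result: Angola is in Africa.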

TBox engineering

Full TBox engineering?

Not meant for it, but possible
Does not capture semantics
Does not propagate to categories

Reified ontologies

[Diagram: node SubclassOfAxiom1 with edges "subclass" and "superclass" to meta:City and meta:Populated place; "refers" edges to City and Populated place; City "subclass of" Populated place; meta:subclassOf. A second build of the slide adds owltools.]

Everything can be discussed

All pages have a discussion page
All individuals, classes, properties have a page
With reification, even axioms may have a page
Very fine grained discussion possible
Opinions can be formalized explicitly

Ontology import

Reuse existing ontologies
Upload mapped ontologies
Kickstart a wiki
  Circumvent empty sheet problem
Enrich an existing wiki
Only the simple parts

OWL                          SMW
owl:Individual               Article
owl:Class                    Category
owl:ObjectProperty           Relation, Link type
owl:DatatypeProperty         Attribute
Object property instance     Typed link [[property::object]]
Datatype property instance   [[property:=value]]
rdf:type class               Class instantiation: [[Category:class]] (on article page)
rdfs:subClassOf class        Subsumption: [[Category:class]] (on category page)

Reasoning wikis

Dynamics

[Sequence diagram with User, Browser, SemanticMediaWiki and KAON2: edit article, check for consistency, consistent?, warn if inconsistent]

Example:
- wife of has domain Woman
- Woman and Man are disjoint
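The example on this slide can be replayed with a few lines of code. The sketch below hard-codes the two axioms and checks a page's asserted categories against them. It only illustrates the kind of warning meant here and does not use KAON2 or any real reasoner; the names `DOMAIN`, `DISJOINT`, `inconsistencies`, and the pages "Person A" and "Person B" are hypothetical.

```python
# The two axioms from the slide.
DOMAIN = {"wife of": "Woman"}
DISJOINT = {frozenset({"Woman", "Man"})}


def inconsistencies(categories, statements):
    """Report statements whose implied class clashes with an asserted category.

    categories: dict mapping a page to its set of asserted categories
    statements: list of (subject, property, object) tuples
    """
    problems = []
    for subject, prop, _ in statements:
        required = DOMAIN.get(prop)
        if required is None:
            continue
        for asserted in sorted(categories.get(subject, set())):
            if frozenset({required, asserted}) in DISJOINT:
                problems.append(
                    f"{subject}: '{prop}' implies {required}, "
                    f"but the page is in category {asserted}"
                )
    return problems


# Hypothetical pages for illustration.
warnings = inconsistencies(
    {"Person A": {"Man"}},
    [("Person A", "wife of", "Person B")],
)
print(warnings)
```

A page in category Woman with the same statement produces no warning, which matches the "warn if inconsistent" step in the diagram.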

Mockup screen

Automatic classification

Infer categories from statements
  Based on background ontology and wiki knowledge
Automatic classification of articles
Can be reused in queries

Mockup screen

URI crisis resolved?

URIs for everything

Uniform Resource Identifiers
  Based on known protocols
Linked data: resolve URI for description
Maintaining URIs is hard
  Clutter namespace
  Setting up descriptions
  Persistence
  Reuse and mapping

URI crisis

Is “Romeo and Juliet” the book, or the article about the book?
One URI for each
Mind the gap!
Redirects in browser

[Diagram: two nodes labelled Romeo and Juliet, linked by "about"; author edges to Shakespeare and to JSmith 42; URIs http://en.wikipedia.org/wiki/Romeo_and_Juliet and http://en.wikipedia.org/wiki/_Romeo_and_Juliet]
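The two URIs on this slide differ only by an underscore in front of the page name. Below is a minimal sketch of that naming convention, assuming (as the `thing:` prefix in the SPARQL example suggests) that the underscore form names the subject itself and the plain form names the article about it. The function names are hypothetical, and titles are only handled to the extent of replacing spaces.

```python
WIKI_BASE = "http://en.wikipedia.org/wiki/"


def article_url(title):
    """URL of the wiki article about a subject."""
    return WIKI_BASE + title.replace(" ", "_")


def thing_uri(title):
    """URI for the subject itself: same path with a leading underscore (assumed)."""
    return WIKI_BASE + "_" + title.replace(" ", "_")


print(article_url("Romeo and Juliet"))
print(thing_uri("Romeo and Juliet"))
```

With two identifiers, "author: Shakespeare" can be stated about the book and "author: JSmith 42" about the article without contradiction.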

Bug status: still open


Slides index

Wikipedia
Semantic Web
Basic idea
Advantages
Open issues
Conclusions

Backup slides

Technicalities
Reuse in webpages
TBox engineering
Ontology import
Reasoning in wikis
URI crisis resolved