58
Copyright 2011 Digital Enterprise Research Institute. All rights reserved. Digital Enterprise Research Institute www.deri.i e Enabling Networked Knowledge From Linked Data to Networked Knowledge or Science is a Social Construct Stefan Decker [email protected] http://www.StefanDecker.org/

Stefan Decker Keynote at CSHALS

Embed Size (px)

DESCRIPTION

Keynote Slides Stefan Decker at CSHALS Conference

Citation preview

Page 1: Stefan Decker Keynote at CSHALS

Copyright 2011 Digital Enterprise Research Institute. All rights reserved.

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

From Linked Data to Networked Knowledge

orScience is a Social Construct

Stefan Decker

[email protected]://www.StefanDecker.org/

Page 2: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Cave Drawings 30000 BC

Page 3: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Writing: 3200 BC (Sumerian cuneiform)

Page 4: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Printing Press(Gutenberg 1450)

Page 5: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Photography (Daguerre 1839)

Page 6: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Phonograph (Edison 1877)

Page 7: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Movies (Lumiere 1895)

Page 8: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Bush’s camera on the head

Page 9: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge9

Memex

“A memex is a devicein which an individual stores all his books, records, and communications, and which is mechanized so that it may be consultedwith exceeding speed and flexibility”

Posited by Vannevar Bush in “As We May Think” The Atlantic Monthly, July 1945

Page 10: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge10

Sketch of memex

Page 11: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

oNLine System- NLS, 1968(Doug Engelbart, SRI)

“By ‘augmenting human intellect’ we mean increasing the capability of a man to approach a complex problem situation, to gain comprehension to suit his particular needs, and to derive solutions to problems.”

The Mouse; Word Processing; Data Sharing;Hypertext;

Page 12: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

ARPANET (1969) (John Postel, David Crocker, Vint Cerf)

Page 13: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Xanadu (Ted Nelson ~1960-???)

13

Page 14: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

World Wide Web (Tim Berners-Lee 1989)

WWW (Tim Berners-Lee)“There was a second part of the dream […] we could then use computers to help us analyse it, make sense of what we re doing, where we individually fit in, and how we can better work together.”

Page 15: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge15 of 46

Making Progress…

Memex (Vannevar Bush)A memex is “a device in which an individual stores all his books, records, and communications.”

Augmenting Human Intellect(Doug Engelbart)“By "augmenting human intellect" we mean increasing the capability of a man to approach a complex problem situation, to gain comprehension to suit his particular needs, and to derive solutions to problems.”

WWW (Tim Berners-Lee)“There was a second part of the dream […] we could then use computers to help us analyse it, make sense of what we re doing, where we individually fit in, and how we can better work together.”

Page 16: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

A Network of Data and Knowledge

Interconnected Universal All encompassing

assists humans, organisations and systems with problem solving

enabling innovation and increased productivity?

Page 17: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

?

Page 18: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

1. Scalability: No growth scalability problem (e.g., no back links from HTML pages)

2. No censorship: no lengthy permission or review process

3. Positive feedback loop: exploit Metcalf’s Law

What enabled the Web?

Page 19: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Metcalfe's law: The value of a network is proportional to the square of the number of connected members

Page 20: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Metcalfe’s Law 1: Links

Page 21: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Metcalfe’s Law 2

Page 22: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Page 23: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

1. Scalability. No centralized infrastructure (e.g., a central object repository) required.

2. No censorship. It must be possible to publish data without having to ask for prior permission.

3. Positive feedback loop. Capitalize on Metcalfe’s Law.

Requirements for a Data Web

Page 24: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Enabling Metcalfe’s Law

1. Global Object Identity. 2. Composability: The value of data can be increased if it can be

combined with other data. Composability has a number of consequences:

1. schema-less. ( Combined data originating from difference sources unlikely to conform to a schema)

2. self-describing

3. “object centric”. In order to integrate information about different entities data must be related to these entities.

4. graph-based. The composition of multiple object-centric data sources results in a graph in the general case.

Page 25: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Observations

• The relational model does not fulfill these requirements (not composable, no global object id)

• XML is not object centric and not composable.

• Graph based data formats are composable

• RDF fulfills these requirements.

• Claim: Any data format that fulfills the requirements is “more or less” isomorphic to RDF.

Page 26: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge26 of 46

The usual two Ingredients

1. RDF – Resource Description FrameworkGraph based Data – nodes and arcs Identifies objects (URIs) Interlink information (Relationships)

2. Vocabularies (Ontologies) provide shared understanding of a domain organise knowledge in a machine-comprehensible way give an exploitable meaning to the data

Page 27: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Linked Open Data cloud - domains

Over 200 open data sets with more than 25 billion facts,interlinked by 400 million typed links, doubling every 10 month!

http://lod-cloud.net/

Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch.

Media

Government

Geo

Publications

User-generated

Life sciences

Cross-domain

US governmentUK government

BBCNew York Times

LinkedGeoData

27

BestBuyOverstock.comFacebook

Page 28: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Do we really need Ontologies?

Provocative?

Page 29: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Differentiation in Classes and Instances is difficult: no single way to abstract the world (observe Upper Ontology wars…..aehm…discussions!)

Choices between Instances and Classes done at design cause usability issues (different treatment in applications) (animal-mammal-whale)

Ontologies cement power structures (prevent information sharing)

Sharing is only top-down

Issues with Ontologies

Page 30: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Predecessor: Frame Representation Systems “Prototypes: KRL, RLL, and JOSIE employ

prototype frames to represent information about a typical instance of a class as opposed to the class itself and as opposed to actual instances of the class.” [Karp, 1993]

AFAIK: Classes as subsets and instances as elements [Hayes, 1979].

Formalization of Frame Systems (Description Logic) picked up on [Hayes, 1979] and left out alternatives

How did classes/instances happen?

Page 31: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Stefan Decker, Dieter Fensel, Frank van Harmelen, Ian Horrocks, Sergey Melnik, Michel C. A. Klein, Jeen Broekstra: Knowledge Representation on the Web. Description Logics 2000: 89-97

OIL -> DAML+OIL -> OWL -> OWL 2.0

Note: How did DL & Ontologies/Classes happen in the Semantic Web?

Page 32: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge© A. Lienhard, O. Nierstrasz 3.32

Class- vs. Prototype-based Programming Languages

Classes: methods, common properties

Inheritance along class chain Instances defined by their

class Structure typically cannot be

changed at runtime

Class-based: Prototype-based:

> No classes, only objects> Objects define their own

properties and methods> Objects delegate to their

prototype(s)> Any object can be the

prototype of another object

Prototype-based languages unify objects and classes

From: A. Lienhard, O. Nierstrasz: Prototype based programming http://www.slidefinder.net/0/03prototypes/03prototypes/10603817

Page 33: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

> JavaScript, > Self,> NewtonScript,> Omega, Cecil,

Examples

From: A. Lienhard, O. Nierstrasz: Prototype based programming http://www.slidefinder.net/0/03prototypes/03prototypes/10603817

Page 34: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

How it could look like:(Horizontal Information Sharing)

Page 35: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

“Instantiation”

Page 36: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Knowledge Representation Constructs (Specialisation)

Logic based Formalisation of Prototypes Reasoning (e.g., with Rules) Complexity Large Scale Storage, Querying Collaboration facilities

Research Agenda

Page 37: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Linked Data Vocabulariesas Social Constructs

Page 38: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Neologism

http://vocab.deri.ie/ Neologism is a simple, Drupal-based RDF-S vocabulary editor and publishing system, that allows for:

• Collaborativelly creating and maintaining RDFS vocabularies

• Making the vocab available for humans (HTML, graph) and machines (RDF/XML, Turtle)

• Importing external vocabularies

• Working with external namespaces such as via PURL.org, etc.

• More at http://neologism.deri.ie/

Page 39: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Linked Open Data cloud - domains

Over 200 open data sets with more than 25 billion facts,interlinked by 400 million typed links, doubling every 10 month!

http://lod-cloud.net/

Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch.

Media

Government

Geo

Publications

User-generated

Life sciences

Cross-domain

US governmentUK government

BBCNew York Times

LinkedGeoData

39

BestBuyOverstock.comFacebook

Page 40: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Networked Data Management

Abstraction,Reasoning,Analytics

Visualisation,Collaboration,Exploitation

Information

Action

Page 41: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge6

Digital Enterprise Research Institute www.deri.ie

User Role Analysis

6

• Who are the initiators?

• Who tends to answer questions?

• What fraction of the network is non-social?

Who are the influencers?

• How stable are these roles?

Work by Vaclav Belak, Conor Hayes et al, DERI.

Page 42: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Page 43: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge43

User Role Analysis: Orthogonal Features

■Persistence■Mean/Std. Dev. posts per

thread

■Initialisation■% initiated threads

■Popularity■% in-degree

■% posts that receive reply

■Reciprocity■% bi-directional

neighbours

■% bi-directional threads

AB

CD

Post

Response

Page 44: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge44

Role Analysis

Reciprocity Persistence

Popularity

Initiation

Popular Initiator High High Very High

Popular Participant

High High Low

Supporter Medium Medium Low

Elitist Low neighborhoodHi thread reponse

L-M

Grunt Low to Medium

Low to Medium

Taciturn Very Low Low to Medium

Page 45: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge45

Example

Reciprocity

Persistence

Popularity Initiation

A 2/3 1/7 2/7 1

B 2/3 2/7 2/7 0

C 1/3 3/7 0 0

D 0 1/7 0 0

AB

CD

Popular initiator

Popular participant

Grunt

Taciturn

Post

Response

Page 46: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Analysis of Forums

Boards.ie data from 01/07/2006 to 31/12/2006 Personal Issues Christianity Weather Windows Development Humanities Politics

Page 47: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Personal Issues Forum

Mostly taciturns Not a lot of dialog

Page 48: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Christianity vs Weather

Some popular initiators

Some lengthy discussions

Popular initiators Large portion of grunts Not as much discussion

Page 49: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Windows, Development, Politics

Less social (technical) No popular initiators Lots of grunts

Page 50: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Humanities Forum

highly social elitists Some supporters

Page 51: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

The Evolution of Communities

Work by Vaclav Belak, Conor Hayes et al, DERI.

Page 52: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Motivation

Kuhn claimed the development of scientific knowledge proceeds in discrete steps:1.Pre-paradigm period2.Paradigm period (normal science)

paradigm articulation

3.Crisis4.Reaction to the crisis

paradigm shift

Page 53: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Cross-Community Effects

Co-citation networks of Semantic Web community

community shift

community specialization

Page 54: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Methodology Pipeline

Community shifts and specializations

Publications from major conferences selected from DBLP

Page 55: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Community & Topic Detection

Communities identified using: Infomap

Reasons: publicly available implementations weighted directed networks

Communities traced from one snapshot to the next according to the highest Jaccard coefficient

Ancestors and descendant obtained by a modification of Jaccard coefficient

Page 56: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Page 57: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

Topics of Louvain Community 26

15

Page 58: Stefan Decker Keynote at CSHALS

Digital Enterprise Research Institute www.deri.ie

Enabling Networked Knowledge

A Network of Knowledge

enabling innovation and increased productivity

Interconnected Universal All encompassing

assists humans, organisations and systems with problem solving

Linked Data

•Search•Collaboration•Text Mining

•Science•Commercialization`