This tutorial explains the Data Web vision, some preliminary standards and technologies, and some tools and technological building blocks developed by the AKSW research group at Universität Leipzig.
From Document Web to a Web of Linked Data
Dr. Sören Auer
AKSW, Institut für Informatik
Overview
1. The Linked Data Web Vision
2. Data Web Technologies
3. Publishing relational data on the Web
4. DBpedia – transforming Wikipedia into a knowledge base
5. OntoWiki – a Linked Data Wiki
6. Open Street Maps – linked open geo data
Linked Data Tutorial
From the Document Web to the Linked Open Data Web (and beyond)
Web (since 1992)
• HTTP
• HTML/CSS/JavaScript
Semantic Web (Vision 1998, starting ???)
• Reasoning
• Logic, Rules
• Trust
Social Web (since 2003)
• Folksonomies/Tagging
• Reputation, sharing
• Groups, relationships
Data Web (since 2006)
• URI dereferenceability
• CBD (Concise Bounded Description)
• RDF serializations
Conceptual Level: Data Access and Integration

Data Models
RDBMS
• Organize data in relations, rows, cells
• Oracle, DB2, MS-SQL
Entity-attribute-value (EAV)
• HELP medical record system, TrialDB
Column-oriented DBMS
• Collocates column values rather than row values
• Vertica, C-Store, MonetDB
Triple/Quad Stores
• RDF data model
• Virtuoso, Oracle, Sesame
Others
• XML, hierarchical, tree, graph-oriented DBMS

Data Access
Procedural APIs
• ODBC
• JDBC
Object-relational mappings (ORM)
• NeXT's EOF / WebObjects
• ADO.NET Entity Framework
• Hibernate
Query Languages
• Datalog, SQL
• SPARQL
• XPath/XQuery
Linked Data
• dereferenceable URIs
• RDF serialization formats

Data Integration
Enterprise Information Integration
• sets of heterogeneous data sources appear as a single, homogeneous data source
Data Warehousing
• Based on extract, transform, load (ETL)
• Global-As-View (GAV)
Research
• Mediators
• Ontology-based
• P2P
• Web service-based
Data Web
• URIs as entity identifiers
• HTTP as data access protocol
• Local-As-View (LAV)
Web 1.0: Many Web sites containing unstructured, textual content
Web 2.0: Few large Web sites specialized on specific content types (pictures + video + encyclopedic articles)
Web 3.0: Many Web sites containing and semantically syndicating arbitrarily structured content
The Long Tail of Information Domains
[Figure: long-tail curve, popularity over information domains. Head: currently supported structured content types (pictures, news, video, recipes, calendar). Tail: content types that are not or insufficiently supported today and that SemWeb-supported structured content can cover (gene sequences, itinerary of King George, talent management, requirements engineering, special-interest communities, …).]
The Long Tail by Chris Anderson (Wired, Oct. '04), adapted to information domains
Why Do We Need Another Web?
Try to search for these things on the current Web:
• Apartments near German-French bilingual childcare in Leipzig.
• ERP service providers with offices in Vienna and Berlin.
• Researchers working on DB-related topics in South-East Asia.
The information needed to answer such queries is available on the Web, but opaque to current Web search.
The (Semantic) Data Web complements the text on Web pages with structured data and makes it possible to intelligently combine and integrate such structured information from different sources:
[Figure: Leipzig.de (has everything about childcare in Leipzig) and Immobilienscout.de (knows all about real estate offers in Germany) each run a DB behind a Web server. The Web servers deliver HTML and RDF to search engines.]
RDF - Resource Description Framework
Distinguishes two fundamental base types:
Resources
• Complex abstract or concrete entities
• Uniquely identified by a URI:
  – http://DBpedia.org/resource/Vienna
Literals
• Concrete data values
• Optionally typed (e.g. xsd:string, xsd:dateTime etc.) or language-tagged (e.g. en, de):
  – "2008-05-31T09:30:00"^^xsd:dateTime
  – "Wien"@de
RDF Statement / Triple Paradigm
RDF/XML:
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/metadata/dublin_core#">
  <rdf:Description rdf:about="http://OntoWiki.net">
    <dc:Creator>Sören Auer</dc:Creator>
  </rdf:Description>
</rdf:RDF>
Subject (Resource): http://OntoWiki.net
Predicate (Resource): dc:Creator
Object (Resource/Literal): Sören Auer
RDF/N3:
<http://OntoWiki.net> <http://purl.org/metadata/dublin_core#Creator> "Sören Auer" .
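The triple paradigm can be made concrete with a few lines of code. The following is a minimal Python sketch, not part of the original slides: the `Resource` and `Literal` classes and the two helper functions are hypothetical names introduced here, and the output follows the N-Triples conventions (URIs in angle brackets, literals in double quotes, a terminating dot).

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Resource:
    """An RDF resource, identified by a URI."""
    uri: str


@dataclass(frozen=True)
class Literal:
    """An RDF literal with an optional datatype URI or language tag."""
    value: str
    datatype: Optional[str] = None
    lang: Optional[str] = None


def term_to_nt(term: Union[Resource, Literal]) -> str:
    """Render a single term in N-Triples syntax."""
    if isinstance(term, Resource):
        return f"<{term.uri}>"
    # Escape backslashes first, then double quotes.
    escaped = term.value.replace("\\", "\\\\").replace('"', '\\"')
    if term.lang:
        return f'"{escaped}"@{term.lang}'
    if term.datatype:
        return f'"{escaped}"^^<{term.datatype}>'
    return f'"{escaped}"'


def triple_to_nt(s: Resource, p: Resource, o: Union[Resource, Literal]) -> str:
    """Render subject, predicate and object as one statement line."""
    return f"{term_to_nt(s)} {term_to_nt(p)} {term_to_nt(o)} ."


line = triple_to_nt(
    Resource("http://OntoWiki.net"),
    Resource("http://purl.org/metadata/dublin_core#Creator"),
    Literal("Sören Auer"),
)
print(line)
```

Subject and predicate are always resources, while the object may be either a resource or a literal, which is why only the third argument accepts both types.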
RDF Document / Model / Graph
– Simple Knowledge Base
– Combines multiple RDF Statements
[Figure: RDF graph. http://OntoWiki.net has dc:Creator http://aksw.org/staff/Soeren, which in turn has foaf:Name "Sören Auer" and a foaf:Email value.]
RDF Serialization
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/metadata/dublin_core#">
  <rdf:Description rdf:about="http://OntoWiki.net">
    <dc:Creator>
      <rdf:Description rdf:about="http://aksw.org/staff/Soeren">
        <dc:Name>Sören Auer</dc:Name>
        <dc:Email>[email protected]</dc:Email>
      </rdf:Description>
    </dc:Creator>
  </rdf:Description>
</rdf:RDF>
The same graph as individual statements:
<http://OntoWiki.net> <http://purl.org/metadata/dublin_core#Creator> <http://aksw.org/staff/Soeren> .
<http://aksw.org/staff/Soeren> <http://purl.org/metadata/dublin_core#Name> "Sören Auer" .
<http://aksw.org/staff/Soeren> <http://purl.org/metadata/dublin_core#Email> [email protected] .
RDF Schema
• Restrict combinations of resources / literals
• Structuring of vocabularies
• Instantiation / classification
Provisioning of special resources:
• Classes (concepts, frames): http://www.w3.org/2000/01/rdf-schema#Class
• Attributes (properties, slots, roles): http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
• Instances (objects): http://www.w3.org/1999/02/22-rdf-syntax-ns#type
[Figure: http://OntoWiki.net dc:creator 16.11.2007 ?]
RDF-S Class & Property Hierarchies
Beer rdf:type rdfs:Class
BottomFermentedBeer rdfs:subClassOf Beer
Bock rdfs:subClassOf BottomFermentedBeer
Lager rdfs:subClassOf BottomFermentedBeer
Pilsner rdfs:subClassOf BottomFermentedBeer

hasContent rdf:type rdf:Property
hasAlcoholicContent rdfs:subPropertyOf hasContent
hasOriginalWortContent rdfs:subPropertyOf hasContent
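Because rdfs:subClassOf is transitive, a Pilsner is also a Beer. A minimal Python sketch of walking the hierarchy above, assuming for simplicity that every class has at most one direct superclass (RDF-S itself allows several); the names `SUBCLASS_OF` and `superclasses` are introduced here for illustration only.

```python
# Direct rdfs:subClassOf statements from the beer example.
# Simplifying assumption: one direct superclass per class.
SUBCLASS_OF = {
    "BottomFermentedBeer": "Beer",
    "Bock": "BottomFermentedBeer",
    "Lager": "BottomFermentedBeer",
    "Pilsner": "BottomFermentedBeer",
}


def superclasses(cls, subclass_of=SUBCLASS_OF):
    """Return all direct and indirect superclasses, nearest first."""
    result = []
    seen = {cls}
    while cls in subclass_of:
        cls = subclass_of[cls]
        if cls in seen:  # guard against cyclic hierarchies
            break
        seen.add(cls)
        result.append(cls)
    return result


print(superclasses("Pilsner"))
```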
RDF-S Properties
… are defined and used independently from classes
Domain: association with one or multiple classes
Range: defines the values the property can assume
– Instances of a certain class
– Literals typed with a certain XML Schema data type

hasAlcoholicContent rdf:type owl:DatatypeProperty
hasAlcoholicContent rdf:type owl:FunctionalProperty
hasAlcoholicContent rdfs:domain Beer
hasAlcoholicContent rdfs:range xsd:float
hasAlcoholicContent rdfs:subPropertyOf hasContent

brews rdf:type owl:ObjectProperty
brews rdfs:domain Brewery
brews rdfs:range Beer
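In RDF-S, domain and range statements let a reasoner infer types: the subject of a statement becomes an instance of the property's domain, and a resource-valued object becomes an instance of its range. A minimal Python sketch of these two inference rules follows; the instance name `JeverBrewery` and the function name are hypothetical, introduced only for illustration.

```python
# rdfs:domain and rdfs:range statements from the example above.
DOMAIN = {"hasAlcoholicContent": "Beer", "brews": "Brewery"}
# Only class-valued ranges are listed; xsd:float constrains literals, not resources.
RANGE = {"brews": "Beer"}


def infer_types(triples):
    """Derive (instance, class) pairs from domain and range declarations."""
    types = set()
    for s, p, o in triples:
        if p in DOMAIN:
            types.add((s, DOMAIN[p]))  # subject is an instance of the domain
        if p in RANGE:
            types.add((o, RANGE[p]))   # object is an instance of the range
    return types


# "JeverBrewery" is a made-up identifier for this sketch.
facts = [("JeverBrewery", "brews", "Jever")]
print(sorted(infer_types(facts)))
```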
RDF-S Instances
Are associated with one (or multiple) class(es):
Boddingtons rdf:type Ale
Grafentrunk rdf:type Bock
Hoegaarden rdf:type White
Jever rdf:type Pilsner
Semantic Web Layer Cake
Linked Data - Paradigm
• Use URIs as names for things
• Use HTTP URIs so that people can look up those names.
• When someone looks up a URI, provide useful information.
• Include links to other URIs, so that they can discover more things.
Linked Data – Publishing RDF
• Dereferenceable RDF URIs, e.g.: http://dbpedia.org/resource/Busan
• Different HTTP response depending on the HTTP Accept header
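Serving a different response depending on the Accept header is known as content negotiation. The following Python sketch shows the selection step only, under simplifying assumptions: the server offers exactly two representations, only exact media types and `*/*` are matched (partial wildcards such as `text/*` are ignored), and the function names are introduced here for illustration.

```python
def parse_accept(header):
    """Return (media_type, q) pairs, highest preference first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


# Representations the hypothetical server can produce for a resource.
AVAILABLE = ("text/html", "application/rdf+xml")


def negotiate(header, available=AVAILABLE):
    """Pick the representation to serve, or None if nothing is acceptable."""
    for media_type, q in parse_accept(header):
        if q <= 0:
            continue
        if media_type in available:
            return media_type
        if media_type == "*/*":
            return available[0]
    return None


print(negotiate("application/rdf+xml"))
print(negotiate("text/html,application/xhtml+xml;q=0.9,*/*;q=0.8"))
```

A Semantic Web client asking for `application/rdf+xml` receives RDF, while an ordinary browser requesting the same URI receives HTML.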
Benefits of using the RDF Data Model in the Linked Data Context
• Clients can look up every URI in an RDF graph over the Web to retrieve additional information.
• Information from different sources merges naturally.
• The data model enables you to set RDF links between data from different sources.
• The data model allows you to represent information that is expressed using different schemata in a single model.
• Combined with schema languages such as RDF-S or OWL, the data model allows you to use as much or as little structure as you need, meaning that you can represent tightly structured data as well as semi-structured data.
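That information from different sources "merges naturally" follows from the data model: a graph is a set of triples, so merging is set union, and shared URIs join the data automatically. A minimal Python sketch using the OntoWiki statements from earlier in the tutorial; the two "sources" and the helper names are assumptions made for illustration.

```python
def merge(*graphs):
    """Merge RDF graphs, each modelled as a set of (s, p, o) tuples."""
    merged = set()
    for graph in graphs:
        merged |= graph  # set union drops duplicate statements
    return merged


def describe(graph, subject):
    """All (predicate, object) pairs known about one subject."""
    return {(p, o) for s, p, o in graph if s == subject}


ONTOWIKI = "http://OntoWiki.net"
SOEREN = "http://aksw.org/staff/Soeren"
DC = "http://purl.org/metadata/dublin_core#"

# Two hypothetical sources that happen to use the same URIs.
source_a = {(ONTOWIKI, DC + "Creator", SOEREN)}
source_b = {
    (ONTOWIKI, DC + "Creator", SOEREN),   # same statement, published twice
    (SOEREN, DC + "Name", "Sören Auer"),
}

merged = merge(source_a, source_b)
print(len(merged))
print(describe(merged, SOEREN))
```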
Linking Open Data (LOD) Cloud
Data Web Moving Targets
Base technologies (RDF, SPARQL, HTTP etc.) are developed, standardized and ready to use
Big issues:
• Scalability
• User interfaces
• Search engines
• Business models
• (Reasoning)
Data Web Business Models
• Advertisement (page view) based businesses will probably not be first movers
• Large Web companies will probably not be first movers
• Data Web should focus on fragmented markets with many players which require widest distribution of information, e.g. realtors, online shops, transportation service providers, public information, geo data etc.
Triplify Motivation
• The growth of semantic representations is still outpaced by the traditional Web
• Overcome the chicken-and-egg dilemma of missing semantic representations and search facilities on the Web
• Triplify leverages relational representations behind existing Web applications:
  – often open-source, deployed hundreds of thousands of times
  – the structure and semantics encoded in relational database schemas (behind Web apps) are not accessible to Web search engines, mashups etc.
Monthly Web application downloads at Sourceforge
Triplify Big Picture
Triplify Approach: Simplicity
• Expose semantics as simply as possible
  – No (new) mapping languages
  – Few lines of code, easy to plug in
  – Simple, reusable configurations
• Available for the most popular Web app languages
  – PHP (ready), Ruby/Python under development
• Works with the most popular Web app DBs
  – MySQL (extensively tested); PHP-PDO DBs (SQLite, Oracle, DB2, MS SQL, PostgreSQL etc.) should work; not needed for Virtuoso
• Triplify exposes RDF/N-Triples, Linked Data and RDF/JSON
Triplify Solution: SQL SELECT queries map relational data to RDF
Triplify configuration:
• a number of SQL queries selecting the information that should be made publicly available.
A special SQL query result structure is required (in order to convert results into RDF):
• the first column must contain identifiers for generating instance URIs (i.e. the primary key of the DB table)
• column names are used to generate property URIs; renaming columns allows reusing properties from existing vocabularies such as Dublin Core, FOAF, SIOC
  – e.g. SELECT id, name AS 'foaf:name' FROM users
• individual cells contain data values or references to other instances (these eventually constitute the objects of the resulting triples)
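The result-structure rules above can be sketched in a few lines. The following Python example is an illustration of the idea, not Triplify's actual PHP implementation: it uses an in-memory SQLite database, a hypothetical base URI `http://example.com/triplify/`, and double-quoted column aliases (standard SQL identifier quoting) in place of the MySQL-style single quotes used on the slide.

```python
import sqlite3

# Prefix-to-namespace map; the base URI below is a made-up example.
NAMESPACES = {"foaf": "http://xmlns.com/foaf/0.1/"}
BASE = "http://example.com/triplify/"


def expand(qname):
    """Turn a prefixed column name such as 'foaf:name' into a property URI."""
    prefix, _, local = qname.partition(":")
    return NAMESPACES[prefix] + local


def rows_to_triples(conn, path, sql):
    """First column -> instance URI, column names -> properties, cells -> objects."""
    cursor = conn.execute(sql)
    columns = [desc[0] for desc in cursor.description]
    triples = []
    for row in cursor:
        subject = f"{BASE}{path}/{row[0]}"
        for column, value in zip(columns[1:], row[1:]):
            if value is not None:  # NULL cells produce no triple
                triples.append((subject, expand(column), str(value)))
    return triples


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'Sören Auer')")

triples = rows_to_triples(conn, "user", 'SELECT id, name AS "foaf:name" FROM users')
for triple in triples:
    print(triple)
```

The mapping logic lives entirely in the SQL aliases, which is why no separate mapping language is needed.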
Example: WordPress Blog Posts
Associate the URL path fragment 'post' with a number of SQL patterns:
http://blog.aksw.org/triplify/post/(xxx)

SELECT id,
       post_author AS 'sioc:has_creator->user',
       post_title AS 'dc:title',
       post_content AS 'sioc:content',
       post_date AS 'dcterms:created^^xsd:dateTime',
       post_modified AS 'dcterms:modified^^xsd:dateTime'
FROM posts
WHERE post_status='publish' (AND id=xxx)

SELECT post_id id, tag_label AS 'tag:taggedWithTag'
FROM post2tag INNER JOIN tag ON (post2tag.tag_id=tag.tag_id)
(WHERE id=xxx)

SELECT post_id id, category_id AS 'belongsToCategory->category'
FROM post2cat
(WHERE id=xxx)
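(Callouts in the figure: columns renamed with a '->' target, such as sioc:has_creator->user and belongsToCategory->category, yield object properties; the remaining columns yield datatype properties. The three queries are numbered 1 to 3.)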
RDF Conversion
Result of query 1:
id   post_author   post_title            post_content         post_date      post_modified
1    5             New DBpedia release   Today we released …  200810201635   200810201635

Result of query 2:
id   tag:taggedWithTag
1    DBpedia
1    Release
..

Result of query 3:
id   belongsToCategory
1    34
…

Resulting triples for http://blog.aksw.org/triplify/post/1:
http://blog.aksw.org/triplify/post/1 sioc:has_creator http://blog.aksw.org/triplify/user/5
http://blog.aksw.org/triplify/post/1 dc:title "New DBpedia release"
http://blog.aksw.org/triplify/post/1 sioc:content "Today we released …"
http://blog.aksw.org/triplify/post/1 dcterms:modified "20081020T1635"^^xsd:dateTime
http://blog.aksw.org/triplify/post/1 dcterms:created "20081020T1635"^^xsd:dateTime
http://blog.aksw.org/triplify/post/1 tag:taggedWithTag "DBpedia"
http://blog.aksw.org/triplify/post/1 tag:taggedWithTag "Release"
http://blog.aksw.org/triplify/post/1 belongsToCategory http://blog.aksw.org/triplify/category/34
Example Config
<?php
include('../wp-config.php');

// Prefixes that may be used in the column aliases below
$triplify['namespaces']=array(
  'vocabulary'=>'http://triplify.org/vocabulary/Wordpress/',
  'foaf'=>'http://xmlns.com/foaf/0.1/',
  // …
);

// URL path fragment => one or more SQL queries
$triplify['queries']=array(
  'post'=>array(
    "SELECT id,post_author 'sioc:has_creator->user',post_date 'dcterms:created',post_title 'dc:title', post_content 'sioc:content', post_modified 'dcterms:modified' FROM {$table_prefix}posts WHERE post_status='publish'",
    "SELECT post_id id,tag_id 'tag:taggedWithTag' FROM {$table_prefix}post2tag",
    "SELECT post_id id,category_id 'belongsToCategory' FROM {$table_prefix}post2cat",
  ),
  'tag'=>"SELECT tag_ID id,tag 'tag:tagName' FROM {$table_prefix}tags",
  'category'=>"SELECT cat_ID id,cat_name 'skos:prefLabel',category_parent 'skos:narrower' FROM {$table_prefix}categories",
  'user'=>array(
    "SELECT id,user_login 'foaf:accountName',SHA(CONCAT('mailto:',user_email)) 'foaf:mbox_sha1sum', user_url 'foaf:homepage',display_name 'foaf:name' FROM {$table_prefix}users",
    "SELECT user_id id,meta_value 'foaf:firstName' FROM {$table_prefix}usermeta WHERE meta_key='first_name'",
    "SELECT user_id id,meta_value 'foaf:family_name' FROM {$table_prefix}usermeta WHERE meta_key='last_name'",
  ),
  'comment'=>"SELECT comment_ID id,comment_post_id 'sioc:reply_of',comment_author AS 'foaf:name', SHA(CONCAT('mailto:',comment_author_email)) 'foaf:mbox_sha1sum', comment_author_url 'foaf:homepage', comment_date AS 'dcterms:created', comment_content 'sioc:content',comment_karma,comment_type FROM {$table_prefix}comments WHERE comment_approved='1'",
);

// Properties whose values reference other instances (property => target path fragment)
$triplify['objectProperties']=array(
  'sioc:has_creator'=>'user',
  'tag:taggedWithTag'=>'tag',
  'belongsToCategory'=>'category',
  'skos:narrower'=>'category',
  'sioc:reply_of'=>'post'
);

// Path fragment => RDF class of the generated instances
$triplify['classMap']=array(
  'user'=>'foaf:Person',
  'post'=>'sioc:Post',
  'tag'=>'tag:Tag',
  'category'=>'skos:Concept'
);

$triplify['TTL']=0; // Caching

$triplify['db']=new PDO('mysql:host='.DB_HOST.';dbname='.DB_NAME,DB_USER,DB_PASSWORD);
?>
Triplify Temporal Extension
Problem: How do next-generation search engines know that something changed on the Data Web?
Different solutions:
• Try to always crawl everything: currently deployed on the Web
• Ping a central update notification service: PingTheSemanticWeb.com – will probably not scale if the Data Web gets really deployed
• Each linked data endpoint publishes an update log: Triplify Update Logs
Triplify Temporal Extension
A special update path and vocabulary expose the update log:

http://example.com/Triplify/update
  http://example.com/Triplify/update/2007 rdf:type update:UpdateCollection .
  http://example.com/Triplify/update/2008 rdf:type update:UpdateCollection .

http://example.com/Triplify/update/2008
  http://example.com/Triplify/update/2008/Jan rdf:type update:UpdateCollection .
  http://example.com/Triplify/update/2008/Feb rdf:type update:UpdateCollection .

Nesting continues until we finally reach a URL which exposes all updates performed in a certain second in time …

http://example.com/Triplify/update/2008/Jan/01/17/58/06
  http://example.com/Triplify/update/2008/Jan/01/17/58/06/user123
      update:updatedResource http://example.com/Triplify/users/JohnDoe ;
      update:updatedAt "2008-01-01T17:58:06"^^xsd:dateTime ;
      update:updatedBy http://example.com/Triplify/users/JohnDoe .
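The nested update path can be derived mechanically from a timestamp. A minimal Python sketch, assuming the segment order year / month abbreviation / day / hour / minute / second seen in the example URLs; the function name is hypothetical and the month names are hard-coded to avoid locale-dependent formatting.

```python
from datetime import datetime

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def update_paths(base, when):
    """Return the chain of nested update-collection URLs, coarsest first."""
    segments = [
        f"{when.year:04d}",
        MONTHS[when.month - 1],
        f"{when.day:02d}",
        f"{when.hour:02d}",
        f"{when.minute:02d}",
        f"{when.second:02d}",
    ]
    paths = []
    url = base.rstrip("/")
    for segment in segments:
        url = f"{url}/{segment}"
        paths.append(url)
    return paths


for url in update_paths("http://example.com/Triplify/update",
                        datetime(2008, 1, 1, 17, 58, 6)):
    print(url)
```

A crawler that last visited at a known time only needs to descend into the collections that are newer than that time, instead of re-crawling the whole endpoint.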
Triplify Spatial Extension
How to publish geo data using Triplify?
OpenStreetMap: 160 GB of geo data, lots of POIs (hotels, gas stations, universities, …)

The path segments after /near/ are a coordinate pair, a radius and a tag (in this example, 48.213056 is the latitude and 16.359722 the longitude of a point in Vienna):
http://LinkedGeoData.org/near/48.213056,16.359722/1000/Hotel

yields points such as:
http://LinkedGeoData.org/point/212331
http://LinkedGeoData.org/point/944523
http://LinkedGeoData.org/point/234091
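What such a "near" request has to compute can be sketched as a distance filter. The Python example below is an assumption-laden illustration, not the LinkedGeoData implementation: it assumes the radius is given in metres, uses the haversine great-circle formula, and works on a made-up in-memory dictionary of points rather than on the OSM database.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def near(points, lat, lon, radius_m, tag):
    """Return the ids of all points with the given tag inside the radius."""
    return [
        point_id
        for point_id, (plat, plon, ptag) in points.items()
        if ptag == tag and distance_m(lat, lon, plat, plon) <= radius_m
    ]


# Hypothetical sample points: id -> (latitude, longitude, tag).
POINTS = {
    "a": (48.2140, 16.3600, "Hotel"),       # roughly 100 m away
    "b": (48.3130, 16.3597, "Hotel"),       # roughly 11 km away
    "c": (48.2131, 16.3598, "University"),  # close by, but wrong tag
}

print(near(POINTS, 48.213056, 16.359722, 1000, "Hotel"))
```

A production system would use a spatial index instead of scanning every point.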
RDB2RDF tool comparison
Tool                 Triplify                       D2RQ             Virtuoso RDF Views
Technology           Scripting languages (PHP)      Java             Whole middleware solution
SPARQL endpoint      -                              X                X
Mapping language     SQL                            RDF based        RDF based
Mapping generation   Manual                         Semi-automatic   Manual
Scalability          Medium-high (but no SPARQL)    Medium           High
More at: http://esw.w3.org/topic/Rdb2RdfXG/StateOfTheArt
Marrying DBs with RDF & Ontologies
[Table: Relational Databases vs. RDF & Ontologies. Data model: relational (tables, columns, rows) vs. triples (subject, predicate, object). Further comparison criteria: schema and data separation, implicit information, scalability, schema flexibility, Web data integration readiness.]
• Using DBs for storage and querying of RDF & ontologies
• Publishing DB content as RDF
Transforming Wikipedia into a Knowledge Base
☺ Wikipedia is the 8th most popular website (according to Alexa.com)
☺ Maybe the finest example of truly collaboratively created content (>8M articles in >200 languages written by >300,000 authors)
☺ Covers all possible topics and domains; articles are the result of a "community consensus"
Θ Many inconsistencies can be found across different pages and language versions
Θ Not very well integrated with other data sources
Θ Lacks structured representations of content which would facilitate querying and search
Simple questions, hard to answer:
• What do Art Nouveau and Berlin have in common?
• Who are the mayors of central European towns at an elevation of more than 1000 m?
• Which films are longer than 4 hours and had a budget of less than $1 million?
The information required to answer these is contained in Wikipedia!
How can we reveal the structure and semantics of Wikipedia content?
Structure in Wikipedia
• Title
• Abstract
• Infoboxes
• Geo-coordinates
• Categories
• Images
• Links
  – other language versions
  – other Wikipedia pages
  – to the Web
  – Redirects
  – Disambiguations
Infobox templates
Wikitext syntax:
{{Infobox Korean settlement
| title = Busan Metropolitan City
| img = Busan.jpg
| imgcaption = A view of the [[Geumjeong]] district in Busan
| hangul = 부산 광역시
...
| area_km2 = 763.46
| pop = 3635389
| popyear = 2006
| mayor = Hur Nam-sik
| divs = 15 wards (Gu), 1 county (Gun)
| region = [[Yeongnam]]
| dialect = [[Gyeongsang]]
}}

RDF representation (http://dbpedia.org/resource/Busan):
dbp:Busan dbpp:title "Busan Metropolitan City"
dbp:Busan dbpp:hangul "부산 광역시"@Hang
dbp:Busan dbpp:area_km2 "763.46"^^xsd:float
dbp:Busan dbpp:pop "3635389"^^xsd:int
dbp:Busan dbpp:region dbp:Yeongnam
dbp:Busan dbpp:dialect dbp:Gyeongsang
...
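The step from infobox wikitext to triples can be illustrated with a deliberately simplified Python sketch. This is not the DBpedia extraction algorithm: it assumes one `| key = value` pair per line, treats a value consisting of a single `[[link]]` as a resource, and guesses xsd:int or xsd:float from the digits; the function name is introduced here for illustration.

```python
import re

# A value that consists of exactly one wiki link, e.g. [[Yeongnam]].
LINK = re.compile(r"^\[\[([^\]|]+)\]\]$")


def infobox_to_triples(subject, wikitext):
    """Map '| key = value' lines of an infobox to (s, p, o) triples."""
    triples = []
    for line in wikitext.splitlines():
        line = line.strip()
        if not line.startswith("|") or "=" not in line:
            continue  # skips the '{{Infobox ...' and '}}' lines
        key, _, value = line[1:].partition("=")
        key, value = key.strip(), value.strip()
        if not value:
            continue
        link = LINK.match(value)
        if link:
            obj = "dbp:" + link.group(1).replace(" ", "_")
        elif re.fullmatch(r"-?\d+", value):
            obj = f'"{value}"^^xsd:int'
        elif re.fullmatch(r"-?\d+\.\d+", value):
            obj = f'"{value}"^^xsd:float'
        else:
            obj = f'"{value}"'
        triples.append((subject, "dbpp:" + key, obj))
    return triples


SAMPLE = """{{Infobox Korean settlement
| title = Busan Metropolitan City
| area_km2 = 763.46
| pop = 3635389
| region = [[Yeongnam]]
}}"""

for triple in infobox_to_triples("dbp:Busan", SAMPLE):
    print(" ".join(triple))
```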
Class Hierarchy
• 200k people (70k athletes, 65k artists, 18k office holders)
• 193k places (100k areas, 40k cities, 10k rivers)
• 187k works (71k music albums, 24k singles, 31k films, 15k books)
• 87k species
• 70k organisations (20k educational institutions, 18k companies, 12k radio stations)
• 22k buildings (8k airports, 5k stations, 2k stadiums, 1k bridges)
• 12k planets
And more … (events, diseases, proteins, drugs, aircraft, automobiles, ships, astronauts, architects, scientists)
Extraction results
Extraction algorithm applied to the English Wikipedia content (http://dumps.wikimedia.org/enwiki)
<1h needed to extract templates and convert them to RDF (>2M English Wikipedia articles, >10GB raw data)
roughly 30M facts extracted from infobox templates alone
Sample checks reveal: ~90% accuracy, 9% redundant information, 1% erroneous
multi-domain ontology covering a large body of domains
extraction results and source code of the extraction algorithm available at http://dbpedia.org
Dataset (en) Triples
Articles 7.6M
Abstracts 2.1M
External Links 3.2M
Categories 7.3M
Infoboxes 29.3M
Persons 560k
Yago Classes 2M
Wordnet Classes 338k
Geo-coordinates 450k
Mapping to Flickr, DBLP, Eurostat, CIA-Factbook, Musicbrainz, Project Gutenberg, US Census, … 100k
Mapping to OpenCyc 45k
DBpedia Components
[Figure: architecture. Wikipedia dumps (article texts, DB tables) go through extraction (infobox, articles, categories, …) into the DBpedia datasets. These are loaded into Virtuoso and MySQL and published via a SPARQL endpoint and as Linked Data. Consumers: Query Builder, SNORQL browser, traditional Web browsers, Web 2.0 mashups, Semantic Web browsers. The datasets are interlinked with other open data: OpenCyc, Wordnet, Freebase, Geonames, …]
User Interfaces
DBpedia SPARQL Endpoint (1)
• http://dbpedia.org/sparql
• hosted on an OpenLink Virtuoso server
• can answer SPARQL queries like
  – Give me all sitcoms that are set in NYC
  – All tennis players from Moscow
  – All films by Quentin Tarantino
  – All German musicians that were born in Berlin in the 19th century
  – All soccer players with shirt number 11, playing for a club having a stadium with over 40,000 seats, and born in a country with over 10 million inhabitants
DBpedia SPARQL Endpoint (2)
SELECT ?name ?birth ?description ?person WHERE {
?person dbp:birthPlace dbp:Berlin .
?person skos:subject dbp:Cat:German_musicians .
?person dbp:birth ?birth .
?person foaf:name ?name .
?person rdfs:comment ?description .
FILTER (LANG(?description) = 'en') .
} ORDER BY ?name
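A query like this is sent to the endpoint over HTTP; the SPARQL protocol passes the query text in a `query` parameter. The Python sketch below only builds the request URL and performs no network access. The prefixes (dbp:, skos:, foaf:, rdfs:) are used exactly as on the slide and would need PREFIX declarations unless the endpoint predefines them; the helper name is an assumption.

```python
from urllib.parse import parse_qs, urlencode, urlparse

ENDPOINT = "http://dbpedia.org/sparql"

QUERY = """SELECT ?name ?birth ?description ?person WHERE {
  ?person dbp:birthPlace dbp:Berlin .
  ?person skos:subject dbp:Cat:German_musicians .
  ?person dbp:birth ?birth .
  ?person foaf:name ?name .
  ?person rdfs:comment ?description .
  FILTER (LANG(?description) = 'en') .
} ORDER BY ?name"""


def sparql_get_url(endpoint, query):
    """Build the GET URL carrying the URL-encoded query text."""
    return endpoint + "?" + urlencode({"query": query})


url = sparql_get_url(ENDPOINT, QUERY)
print(url)

# Decoding the URL again yields the original query text.
decoded = parse_qs(urlparse(url).query)["query"][0]
print(decoded == QUERY)
```

Fetching the URL, for instance with `urllib.request.urlopen`, together with an Accept header for the desired result format, would then return the result set.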
OntoWiki
1. Semantic Wiki
2. Differences
3. Similarities
4. Architecture
5. Use Cases
Semantic Wiki
• Wiki with added semantics
• Goal: Wiki pages + background knowledge base
• Examples: Semantic MediaWiki, Rhizome, IkeWiki
Conceptual Differences: Views over Articles
Wiki articles Resource views
Conceptual Differences: Forms over Code
Wiki code Forms
Conceptual Similarities: Wikiwiki Concepts
• Everyone can edit anything
• Content is edited in the same way as structure is
• Activity can be watched and reviewed by everyone
(Ward Cunningham)
Versioning
• Everything can be undone
• Philosophy: make it easy to correct mistakes
OntoWiki Application Framework: Interfaces
• SPARQL Endpoint
• Linked Data Endpoint
• WebDAV
• REST API
• Command Line Interface
• LDAP
Extensibility
• Plugins
• Views/Templates
• Themes
• Localizations
Access Control
• Model-based
• Action-based
• (Statement-based)
Other Features
• Facet-based browsing
• Inline editing
• Auto-adaptive user interface
• Resource auto-suggestion
• SPARQL Query Editor
Architecture
Vision
• Generic data wiki for RDF models
  – no data model mismatch (structured vs. unstructured)
• Application framework for:
  – Knowledge-intensive applications
  – Agile processes
  – Distributed user groups
SoftWiki*
Problem: Requirements Engineering with large, spatially distributed stakeholder groups
Solution: comprehensive ontology for representing RE-relevant knowledge + adapted OntoWiki application
Application of text-mining methods for duplicate detection
* Work in a BMBF-funded project with UniDuE, T-Systems, QA-Systems, LeCoS, ProDV
Caucasian Spiders
• Faunistic database on spiders of the Caucasus
• Taxonomy
• Localities
• 240k triples
Professor Catalogue
• Professor catalogue with 800 entries and 60 schema elements
• OntoWiki used as backend for data entry
• Custom front-end
Semantic Wikis: Related Work
                 OntoWiki                             Semantic MediaWiki                 IkeWiki
Main developer   Uni Leipzig AKSW                     AIFB Karlsruhe                     Salzburg Research
Technology       PHP/MySQL                            PHP/MySQL (MediaWiki extension)    Java/Postgres
Base artifacts   Facts                                (annotated) texts                  (annotated) texts
Authoring        WYSIWYG facts / forms                Wiki syntax / semantic forms       WYSIWYG / forms
Other            Data Web development framework       Planned Wikipedia deployment       Visual KB browser
Vakantieland*
One of the largest tourist information sites in NL (>100,000 daily page views, >20,000 points of interest)
The traditional relational DB system was too inflexible to capture the increasingly heterogeneous content types
• Development of an OntoWiki-based Data Web application
• Geo-data integration from OpenStreetMap
• Semantic search
• Integration of DBpedia data
• Comprehensive performance tuning
* Work with Ceriel Jakobs and Michael Martin, partially funded by SenterNovem
Linked Open Geo Data
Spatial data is crucial for the Data Web in order to interlink geographically linked resources.
The OpenStreetMap project (OSM) collects, organizes and publishes geo data the wiki way:
• 80,000 OSM users have collected data about 22M km of ways (roads, highways etc.) on earth; 25,000 km are added daily
• OSM contains a vast amount of point-of-interest descriptions, e.g. shops, amenities, sports venues, businesses, touristic and historic sights.
Goal: publish OSM geo data, interlink it with other data sources and provide efficient means for browsing and authoring:
• OpenStreetMap data extraction works on the basis of OSM database dumps; a bi-directional live integration of OSM and our Linked Geo Data browser and editor is currently in the works.
• Triplify spatial data publishing: the Triplify script for publishing linked data from relational databases is extended for publishing geo data, in particular with regard to the retrieval of information about geographical areas.
• The LinkedGeoData browser and editor is a facet-based browser for geo content, which uses an OLAP-inspired hypercube for quickly retrieving aggregated information about any user-selected area on earth.
Faceted Linked-Geo-Data Browser
AKSW Linked Data Web Building Blocks
Layers: Foundations (marrying databases with RDF and ontologies), Tools, Applications (bringing the Data Web to end users)
• DBpedia: "Semantification" of Wikipedia
• Triplify: "Semantification" of (small) Web Applications
• OntoWiki: Collaborative creation of explicit knowledge via Semantic Wikis
• OWLDB: Extending DBs for ontology handling / revealing implicit information
• RDF Query Subsumption & View Maintenance: Scaling database-backed Triple Stores
• DL-Learner: Machine Learning for Ontologies
• xOperator: Combining Instant Messaging with the Data Web
• Vakantieland: Building Data Web applications
• SoftWiki: Distributed, stakeholder-driven Requirements Engineering
• OpenResearch.org: A semantic Wiki for the sciences
• …
Thanks!
Dr. Sören Auer
[email protected]
group Agile Knowledge Engineering & Semantic Web (AKSW): http://aksw.org
• http://Triplify.org
• http://DBpedia.org
• http://OntoWiki.net
• http://OpenResearch.org
• http://aksw.org/projects/xOperator
• DL-Learner.org
• Cofundos.org