Faculty of Mathematics and Physics, Charles University in Prague
Linked Data Tutorial
Tomáš Knap, Jindřich Mynarz, Martin Nečaský, Jakub Stárka
February 16, 2012 (Partially based on slides of Chris Bizer [9])
Slide 2
Motivation
Slide 3
Motivational Scenario
Diagram labels (kinds of data): basic data, employees, departments, public contracts, budget, expenses
Diagram labels (data sources): WWW page of the institution, Business Register, FIS, Buyer's Profile, ISVZUS, gov.cz
Data consumer:
- Show me the suppliers of public contracts for the Ministry of Finance (MF) in the Liberec region.
- Show me the data on Google Maps on an iPhone.
- For every public contract, I am also looking for an aggregation of all payments made by MF, and a link to the budget and the responsible person.
Questions:
- Where can I get the data about public contracts, responsible persons, expenses, and the budget of MF?
- How should I aggregate and link the data?
- How can I view the data on the map?
Slide 4
Current Common Practice
1. MF: public contracts
2. MF: public contracts + employees
3. - (expenses: the consumer did not discover them)
Information integration is very time-consuming, boring, and ineffective!
Diagram labels (kinds of data): basic data, employees, departments, public contracts, budget, expenses
Diagram labels (data sources): WWW page of the institution, Business Register, FIS, Buyer's Profile, ISVZUS, gov.cz
Slide 5
Linked Data - Basics
Slide 6
Linked Data
- A set of best practices for publishing structured data on the Web, in accordance with the general architecture of the Web, using Semantic Web technologies and standards
- The Semantic Web is the goal; Linked Data provides the means to reach the goal
Slide 7
Linked Data Principles
1. Use URIs as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful RDF information.
4. Include RDF statements that link to other URIs so that they can discover related things.
[Tim Berners-Lee, http://www.w3.org/DesignIssues/LinkedData.html, 2006]
Slide 8
Architecture of the Classic Web
- Single global information space
- Small set of simple standards:
  - HTTP URI: globally unique ID and retrieval mechanism
  - HTML as document format
  - Hyperlinks to connect everything
- Applications work on top of the complete information space
Slide 9
Web 2.0 APIs and Mashups
- No single global dataspace
- Shortcomings:
  - APIs have proprietary interfaces
  - No hyperlinks between data items within different APIs
  - Mashups are based on a fixed set of data sources
- Web APIs slice the Web into walled gardens!
Slide 10
Linked Data
- Extend the Web with a single global dataspace
  - by using RDF to publish structured data on the Web
  - by setting links between data items within different data sources
- Physically distributed, but behaves like a single dataspace
Slide 11
RDF Data Model
- Flexible graph-based data model [2]
- HTTP URIs take the role of global primary keys
  - pd:cygri = http://richard.cyganiak.de/foaf.rdf#cygri
  - dbpedia:Berlin = http://dbpedia.org/resource/Berlin
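The two prefixed names above are shorthand for full HTTP URIs. A minimal Python sketch of that expansion, and of one RDF statement built from it, could look as follows. The foaf prefix and the foaf:based_near predicate are assumptions added for illustration; only the pd: and dbpedia: mappings come from the slide.

```python
# Prefix table: pd and dbpedia are taken from the slide,
# foaf is an assumption added so the example triple has a predicate.
PREFIXES = {
    "pd": "http://richard.cyganiak.de/foaf.rdf#",
    "dbpedia": "http://dbpedia.org/resource/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand(curie: str) -> str:
    """Expand a prefixed name such as 'dbpedia:Berlin' into a full HTTP URI."""
    prefix, _, local = curie.partition(":")
    return PREFIXES[prefix] + local


# One RDF statement (triple): subject, predicate, object.
# The choice of foaf:based_near is illustrative only.
triple = tuple(expand(term) for term in ("pd:cygri", "foaf:based_near", "dbpedia:Berlin"))

for part in triple:
    print(part)
```

Because every part of the triple is a globally unique URI, triples from different sources can be put into one graph without renaming anything.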
Slide 12
Resolving URIs over the Web
- The HTTP protocol brings together identification and retrieval
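As a rough illustration of "identification and retrieval" with one URI, the following Python sketch builds (but does not send) an HTTP GET request for a resource URI and asks for RDF through the Accept header. The choice of the RDF/XML media type is an assumption; a given server may offer other serializations.

```python
from urllib.request import Request


def rdf_request(uri: str) -> Request:
    """Build an HTTP GET request that asks the server for an RDF description."""
    # application/rdf+xml is the media type of RDF/XML; assumed here as the
    # preferred format, other RDF serializations use other media types.
    return Request(uri, headers={"Accept": "application/rdf+xml"})


req = rdf_request("http://dbpedia.org/resource/Berlin")
print(req.get_method(), req.full_url)
print("Accept:", req.get_header("Accept"))

# Sending the request would be a single extra call (needs network access):
#   import urllib.request
#   body = urllib.request.urlopen(req).read()
```

The same URI both names the thing and tells the client where to ask for its description.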
Slide 13
Following Links Deeper into the Web
Slide 14
Pubby Linked Data Browser
http://dbpedia.org/page/Český_Krumlov
Slide 15
Properties of the Web of Linked Data
- Global, distributed data space built on a simple set of standards: RDF, URIs, HTTP
- Entities are connected by links
  - creating a global data graph that spans data sources
  - enabling the discovery of new data sources
- Data coexistence
  - Everyone can publish data to the Web of Linked Data
  - Everyone can express their personal view on things
Slide 16
Linked Data Deployment on the Web... Is It Real?
Slide 17
W3C Linking Open Data Project
Grassroots community effort to
- publish existing open-license datasets as Linked Data on the Web
- interlink things between different data sources
Slide 18
Linked Data Cloud 2007
Slide 19
Linked Data Cloud 2009
Slide 20
Linked Data Cloud 2011
http://richard.cyganiak.de/2007/10/lod/lod-datasets_2011-09-19_colored.pdf
http://thedatahub.org/
Slide 21
More Statistics
http://stats.lod2.eu/stats
Slide 22
Uptake in Governmental Domain
- The EU is publishing Linked Data
  - EuroStat: http://estatwrap.ontologycentral.com/
- National efforts
  - The Government is releasing public data: http://data.gov.uk/
  - Lots of initiatives in Great Britain
  - Budget in Germany: http://bund.offenerhaushalt.de/
  - Open Data in Catalonia: http://opendata.gencat.cat/en/dades-obertes.html
Slide 23
Data.gov.uk
http://data.gov.uk/organogram/cabinet-office
Slide 24
Linked Data Applications
- Linked Data Browsers
Slide 25
Search Engines - Sig.ma
http://sig.ma
Slide 26
Mashups: Public Contracts on the Map
http://gd.projekty.ms.mff.cuni.cz:2021/new/map.html
Slide 27
Mashups: Crime, Transport, Education
http://apps.seme4.com/see-uk/
Slide 28
Other Applications
- Browsers
  - Disco Hyperdata Browser: http://www4.wiwiss.fu-berlin.de/rdf_browser/
  - OpenLink RDF Browser: http://ode.openlinksw.com/
- Search engines
  - Falcons: http://ws.nju.edu.cn/falcons/
  - Watson: http://watson.kmi.open.ac.uk/WatsonWUI/
- Mashups
Slide 29
Linked Data Applications - Summary
- Linked Data Browsers
- Search Engines
- Linked Data Mashups
Slide 30
Publishing Linked Data
Slide 31
Publishing Tasks (Bizer 38)
1. Make data available as RDF via HTTP
   - Requires ways to serialize the RDF data model
2. Set RDF links pointing at other data sources
3. Make your data self-descriptive
Slide 32
RDF/XML
W3C Recommendation, 2004 [2]
Slide 33
Turtle Syntax
@prefix rdf:.
@prefix dataModel:.
@prefix myContact:.
myContact:me rdf:type dataModel:Person ; dataModel:fullName "Eric Miller".
dataModel:mailbox.
dataModel:personalTitle "Dr.".
W3C Team Submission, 2011 [4]
Slide 34
RDFa
- A way to directly add RDF to XHTML pages
- Provides new attributes to handle additional markup
- W3C Recommendation, 2008 [5]
- HTML is not extensible, but most RDFa parsers will recognize RDFa attributes in any version of HTML
Slide 35
RDFa
- Provides new attributes to handle additional markup and reuses existing ones
  - about, resource, href, src, ...
- Can be used with any supported element; preferred:
  - span, div (in the body)
  - a (linking element)
  - meta, link (in the header)
Slide 36
RDFa Example
XHTML page: http://example.com/alice/posts/42
Original XHTML code: All content on this site is licensed under a Creative Commons License.
XHTML + RDFa: All content on this site is licensed under a Creative Commons License.
RDF triples distilled from XHTML+RDFa: cc:license.
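The markup of this example did not survive, so the following Python sketch uses an assumed XHTML snippet (the rel value and the license URL are guesses, not the slide's original). It shows the basic idea of distilling a triple from an element carrying @rel and @href, with the page URI as the default subject. It is a deliberately simplified illustration, not a conforming RDFa processor.

```python
from html.parser import HTMLParser


class RelLinkExtractor(HTMLParser):
    """Collect (subject, predicate, object) triples from @rel + @href pairs.

    Simplification: the subject is always the page URI (the default subject);
    @about, @resource, chaining, and prefix resolution are ignored.
    """

    def __init__(self, page_uri):
        super().__init__()
        self.page_uri = page_uri
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "rel" in attributes and "href" in attributes:
            self.triples.append(
                (self.page_uri, attributes["rel"], attributes["href"])
            )


# Assumed markup: rel="cc:license" and the license URL are illustrative.
xhtml = (
    "<p>All content on this site is licensed under "
    '<a rel="cc:license" href="http://creativecommons.org/licenses/by/3.0/">'
    "a Creative Commons License</a>.</p>"
)

extractor = RelLinkExtractor("http://example.com/alice/posts/42")
extractor.feed(xhtml)
for triple in extractor.triples:
    print(triple)
```

The human-readable text is unchanged; only the two attributes on the link carry the machine-readable statement.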
Slide 37
RDF Store + Linked Data Interface
Virtuoso + Pubby
Slide 38
D2R Server
- A way to publish data in relational databases as Linked Data
- Requests from the Web are rewritten into SQL queries via the mapping
- On-the-fly translation eliminates the need to replicate the data into a dedicated RDF triple store
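The idea of on-the-fly translation can be sketched in a few lines of Python with an in-memory SQLite database. This is not how D2R Server is implemented or configured; the table, the example.org namespace, and the column-to-property mapping are all hypothetical. The sketch only shows that a request for one resource becomes one SQL query whose result is returned as triples, with no copy of the data kept in a triple store.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical namespace for the published data


def describe_contract(conn, contract_id):
    """Answer a request for one resource by querying the database on the fly."""
    row = conn.execute(
        "SELECT id, title, price FROM contract WHERE id = ?", (contract_id,)
    ).fetchone()
    if row is None:
        return []
    subject = f"{BASE}contract/{row[0]}"
    # The mapping: each column becomes one property of the resource.
    return [
        (subject, BASE + "vocab#title", row[1]),
        (subject, BASE + "vocab#price", row[2]),
    ]


# Hypothetical relational data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contract (id INTEGER PRIMARY KEY, title TEXT, price REAL)")
conn.execute("INSERT INTO contract VALUES (1, 'Road repair', 120000.0)")

for triple in describe_contract(conn, 1):
    print(triple)
```

Because the triples are produced at request time, any update in the relational database is visible immediately.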
Slide 39
Publishing Tasks
1. Make data available as RDF via HTTP
2. Set RDF links pointing at other data sources
3. Make your data self-descriptive
Slide 40
2. Set RDF Links
owl:sameAs.
There are tools to help you generate links:
- Silk [6]
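To give a feel for what link generation means, here is a toy Python sketch that proposes owl:sameAs links between two datasets when their labels match after simple normalisation. This is not Silk's method or configuration; real link discovery tools use richer, configurable similarity measures. The local URIs and labels are hypothetical.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def normalise(label: str) -> str:
    """Lower-case the label and collapse whitespace."""
    return " ".join(label.lower().split())


def generate_links(source, target):
    """Link resources from two datasets whose labels match after normalisation."""
    by_label = {normalise(label): uri for uri, label in target.items()}
    links = []
    for uri, label in source.items():
        match = by_label.get(normalise(label))
        if match is not None:
            links.append((uri, OWL_SAME_AS, match))
    return links


# Hypothetical local dataset (URI -> label) and a small slice of a remote one.
local = {
    "http://example.org/city/1": "Berlin",
    "http://example.org/city/2": "  liberec ",
    "http://example.org/city/3": "Atlantis",
}
remote = {
    "http://dbpedia.org/resource/Berlin": "Berlin",
    "http://dbpedia.org/resource/Liberec": "Liberec",
}

for link in generate_links(local, remote):
    print(link)
```

Exact label matching is the weakest possible heuristic; it is used here only to keep the sketch short.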
Slide 41
Publishing Tasks
1. Make data available as RDF via HTTP
2. Set RDF links pointing at other data sources
3. Make your data self-descriptive
Slide 42
3. Make Your Data Self-Descriptive
Increase the usefulness of your data and ease data integration
Aspects of self-descriptiveness:
1. Reuse terms from common vocabularies
2. Enable clients to retrieve the schema
3. Publish schema mappings for proprietary terms
4. Metadata
   - Provide provenance metadata
   - Provide licensing metadata
   - Provide data-set-level metadata using voiD
Slide 43
About Vocabularies
- We have to be able to define the meaning of the subjects and properties
- Vocabularies, e.g. the Public Contracts Ontology
Slide 44
Public Contracts Ontology
http://purl.org/procurement/public-contracts#
Slide 45
RDFS
- RDFS = RDF Schema
- W3C Recommendation: http://www.w3.org/TR/rdf-schema/
- Vocabulary for RDF
- Definition of classes
  is:Student rdf:type rdfs:Class
- Definition of properties
  is:name rdf:type rdf:Property
- Domains and ranges of properties
  is:name rdfs:domain is:Student
  is:name rdfs:range xsd:string
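A domain declaration is not a validation constraint; under RDFS semantics it lets a client infer a type. The following Python sketch applies that one rule (if P rdfs:domain C and s P o, then s rdf:type C) to triples written as prefixed names. The resource is:alice is a hypothetical example, and the sketch covers only this single rule, not full RDFS reasoning.

```python
RDF_TYPE = "rdf:type"

# Schema statements from the slide, written as (subject, predicate, object).
schema = [
    ("is:name", "rdfs:domain", "is:Student"),
    ("is:name", "rdfs:range", "xsd:string"),
]


def infer_types_from_domain(data, schema):
    """RDFS domain rule: if P rdfs:domain C and (s P o), then s rdf:type C."""
    domains = {prop: cls for prop, rel, cls in schema if rel == "rdfs:domain"}
    inferred = set()
    for subject, predicate, _obj in data:
        if predicate in domains:
            inferred.add((subject, RDF_TYPE, domains[predicate]))
    return inferred


# Hypothetical instance data: is:alice has a name but no explicit type.
data = [("is:alice", "is:name", "Alice")]

for triple in infer_types_from_domain(data, schema):
    print(triple)
```

The inferred triple states that is:alice is a student, although the data never said so explicitly.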
Slide 46
OWL
- OWL = Web Ontology Language
- W3C Recommendation: http://www.w3.org/TR/owl2-overview/
- Ontologies
- More complex constructs
  - Class or property equivalences
  - Cardinality restrictions
Slide 47
3. Make Your Data Self-Descriptive
Increase the usefulness of your data and ease data integration
Aspects of self-descriptiveness:
1. Reuse terms from common vocabularies
2. Enable clients to retrieve the schema
3. Publish schema mappings for proprietary terms
4. Metadata
   - Provide provenance metadata
   - Provide licensing metadata
   - Provide data-set-level metadata using voiD
Slide 48
3.1 Reuse Terms from Common Vocabularies
Common vocabularies:
- Friend-of-a-Friend (FOAF) for describing people and their social networks
- SIOC for describing forums and blogs
- SKOS for representing topic taxonomies
- Organization Ontology for describing the structure of organizations
- GoodRelations for describing products and business entities
- Music Ontology for describing artists, albums, and performances
- Review Vocabulary for representing reviews
Common sources of identifiers (URIs) for real-world objects:
- LinkedGeoData and Geonames: locations
- GeneID and UniProt: life science identifiers
- DBpedia: wide range of things
Slide 49
3.2 Enable Clients to Retrieve the Schema
- Clients can resolve the URIs that identify vocabulary terms in order to get their RDFS or OWL definitions.
- If we discover this URI in the data:
  http://purl.org/procurement/public-contracts#awardDate "2011-11-11"^^ ;
- we resolve the URI and get the definition: RDFS or OWL definition
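The term URI above is a hash URI. The fragment after "#" is not sent to the server in an HTTP request, so a client first strips it, fetches the vocabulary document, and then looks for the full term URI inside the returned definitions. A small Python sketch of the first step:

```python
from urllib.parse import urldefrag

term = "http://purl.org/procurement/public-contracts#awardDate"

# Split the term URI into the document to retrieve and the local fragment.
document_url, fragment = urldefrag(term)

print("Fetch:   ", document_url)
print("Look for:", fragment)
```

One retrieval of the vocabulary document therefore serves the definitions of all terms in that namespace.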
Slide 50
3.3 Publish Schema Mappings
pc:Tender a owl:Class ;
  rdfs:subClassOf gr:Offering .
pc:AwardCriterion a owl:Class ;
  owl:equivalentClass loted:AwardCriteria .
- Simple mappings: rdfs:subClassOf, rdfs:subPropertyOf, owl:equivalentClass, owl:equivalentProperty
- Complex mappings: R2R [7]
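How a client might use the simple mappings can be sketched in Python as follows. Given the class of a resource, the function returns the other classes the resource also belongs to according to the two mappings above. This is a one-step lookup under stated simplifications (no transitive closure, no property mappings), not an OWL reasoner.

```python
# The two mappings from the slide, as (subject, predicate, object).
mappings = [
    ("pc:Tender", "rdfs:subClassOf", "gr:Offering"),
    ("pc:AwardCriterion", "owl:equivalentClass", "loted:AwardCriteria"),
]


def additional_types(cls, mappings):
    """Return the classes an instance of `cls` also belongs to (one step only)."""
    result = set()
    for left, relation, right in mappings:
        if relation == "rdfs:subClassOf" and left == cls:
            # Subclass membership implies superclass membership (one direction).
            result.add(right)
        elif relation == "owl:equivalentClass":
            # Equivalence works in both directions.
            if left == cls:
                result.add(right)
            elif right == cls:
                result.add(left)
    return result


print(additional_types("pc:Tender", mappings))
print(additional_types("loted:AwardCriteria", mappings))
print(additional_types("gr:Offering", mappings))
```

A consumer that only understands GoodRelations can thus still treat every pc:Tender as a gr:Offering.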
Slide 51
3.4 Metadata
- Licenses
- Data provenance
- Dataset description
Slide 52
Consuming Linked Data
Slide 53
Overview
- URI -> description: Pubby
- Keyword -> description: Sig.ma
- SPARQL query language [8]: SQL for RDF databases
Slide 54
SPARQL Example
Contracts of the given supplier
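The query shown on this slide did not survive, so the following Python sketch is a reconstruction under assumptions, not the original. It builds a query of the kind the title describes and encodes it as an HTTP GET request, which is how the SPARQL Protocol transports a query (in the query parameter). The endpoint URL, the supplier URI, and the property names pc:awardedTender and pc:supplier are assumptions and should be checked against the Public Contracts Ontology.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ENDPOINT = "http://example.org/sparql"  # hypothetical SPARQL endpoint


def contracts_query(supplier_uri: str) -> str:
    """Build a SPARQL query for the contracts awarded to one supplier."""
    # pc:awardedTender and pc:supplier are assumed property names.
    return (
        "PREFIX pc: <http://purl.org/procurement/public-contracts#>\n"
        "SELECT ?contract WHERE {\n"
        "  ?contract pc:awardedTender ?tender .\n"
        f"  ?tender pc:supplier <{supplier_uri}> .\n"
        "}"
    )


def request_url(query: str) -> str:
    """SPARQL Protocol: a query can be sent via HTTP GET in the 'query' parameter."""
    return ENDPOINT + "?" + urlencode({"query": query})


query = contracts_query("http://example.org/supplier/42")
url = request_url(query)
print(url)

# Decoding the URL again gives back the query text unchanged.
print(parse_qs(urlsplit(url).query)["query"][0])
```

The endpoint would answer with a table of variable bindings, one row per matching contract.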
Slide 55
SPARQL Example - Result
Slide 56
Issues of the Simple Consuming Scenarios
- How to aggregate the data if the links are missing or the data models (ontologies) differ?
- How to deal with data quality? Anybody can say anything!
- Solution: we are developing an infrastructure for cleaning, linking, and aggregating Linked Data
  - Reusing existing technologies, such as Silk
Slide 57
ODCleanStore
- Cleaning the data
  - Custom cleaners
- Linking the data
  - Silk
- Graphical user interface
- Smart data consumption
  - Data aggregation (based on links and ontology mappings)
  - Conflict resolution
  - Data provenance
Slide 58
Motivational Scenario - Recall
Diagram labels (kinds of data): basic data, employees, departments, public contracts, budget, expenses
Diagram labels (data sources): WWW page of the institution, Business Register, FIS, Buyer's Profile, ISVZUS, gov.cz
Data consumer:
- Show me the suppliers of public contracts for the Ministry of Finance (MF) in the Liberec region.
- Show the data on Google Maps on an iPhone.
- For every public contract, I am looking for an aggregation of all payments made by MF, and a link to the budget and the responsible person.
Questions:
- Where can I get the data about public contracts, responsible persons, expenses, and the budget of MF?
- How should I aggregate and link the data?
- How can I view the data on the map?
Slide 59
Goal
ODCleanStore
Diagram labels (kinds of data): basic data, employees, departments, public contracts, budget, expenses
Diagram labels (data sources): WWW page of the institution, Business Register, FIS, Buyer's Profile, ISVZUS, gov.cz
Slide 60
Conclusions
Slide 61
Linked Data vs. Open Data
- Open data: 3 stars!
  - 4th star: a single and flexible model (RDF) is missing
  - 5th star: links
- Open data are raw data that are freely available on the Web:
  - to everyone
  - at any time
  - for whatever purpose
Slide 62
Conclusions and Take-Away Message
The power of Linked Data (5-star data):
- Web-scale data publishing with web-based discovery mechanisms
- Distributed annotation: make comments about observations, data series, points on the map
- Easy to reuse
  - Huge potential when connecting to the cloud and linking the data; the benefits grow as the amount of data published as Linked Data increases
- Integration on the data level
  - Easy to extend (new data properties as required, no need to plan them up front)
  - Easy to merge: no name clashes!
Slide 63
Future Steps
- If you have managed to get interesting data, try to publish them as Linked Data!
- We can help you with the whole lifecycle: creating, publishing, and maintaining the data
  - Just create RDF data, and we will publish it for you
    - Just let us know (send us the data), and we can publish it
  - Publish data in the same way, but use global identifiers according to the LD principles
- When the infrastructure (ODCleanStore) is ready, you can just send us the RDF data using a web service, and we will do all the rest: clean, link, and provide aggregated views.
Slide 64
Thank You!
Slide 65
References
Textbook: Tom Heath, Christian Bizer: Linked Data: Evolving the Web into a Global Data Space. http://linkeddatabook.com/
[2] http://www.w3.org/TR/rdf-primer/
[3] http://www.w3.org/TR/REC-rdf-syntax/
[4] http://www.w3.org/TeamSubmission/turtle/
[5] http://www.w3.org/TR/rdfa-syntax/
[6] http://www4.wiwiss.fu-berlin.de/bizer/silk/
[7] http://www.w3.org/TR/rdf-sparql-query/
Slide 67
Thank You!
Slide 68
Motivational Scenario (to recall)
Diagram labels (kinds of data): basic data, employees, departments, public contracts, budget, expenses
Diagram labels (data sources): WWW page of the institution, Business Register, FIS, Buyer's Profile, ISVZUS, gov.cz
User:
- Suppliers of MF public contracts from the Liberec region, shown on Google Maps in an iPhone application.
- For every contract, an aggregation or a listing of payments, a link to the budget, and the responsible person.
Questions:
- Where do I get the data about contracts, responsible persons, expenses, and the budget of MF?
- How should I merge and link the data?
- How do I display the data on a map on the iPhone?
Slide 69
Searching on the Current Web
- Searching for information about the American city of London in Ohio
  - Input: keyword "London"
  - Result: London (in Britain)
- Search is implemented based on which documents contain the given keywords
- It is up to the user to read and interpret the results
Slide 70
Information Integration on the Web
- Looking for Indian food for a Saturday night dinner with a girlfriend
  - Step 1: Search for restaurants specializing in Indian cuisine and get their addresses
  - Step 2: Open a new tab and go to your favourite map utility to get driving directions to the restaurants
  - Step 3: Select the closest one
- This is an integration process: you get some information (the address), which you use to get more information (the directions)
Slide 71
Data Mining on the Current Web
- An application for collecting stock information from a given set of Web sites, reporting every 10 minutes
- A specialized piece of software
  - Re-code if something important changes, e.g. the layout of a single page
  - Re-code if more pages should be added
Slide 72
Problems of the Current Web?
- The Web is aimed at human readers and is display oriented; web servers and browsers do not understand the content of a page, they just know how to display it
- What is necessary:
  - There has to be a way (a model) to represent knowledge on the Web, shared among pages
  - There has to be a way to create these statements on each Web site
  - The statements should involve common terms and relations, at least for the given domain, and there should be a way to define them
  - Agents should be able to conduct reasoning
  - Agents should be able to query the statements
Slide 73
Semantic Web
- An extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation
- The Semantic Web is a collection of technologies and standards that allow machines to understand the meaning (semantics) of information on the Web
Slide 74
TODO: How Can Linked Data Help Us in the Motivational Scenarios?
- Searching
  - Specify that I am looking for a city
  - Then, when I start typing "London", a combo box appears:
    - http://dbpedia.org/resource/London
    - http://dbpedia.org/resource/London_Ontario
    - http://dbpedia.org/resource/London_Ohio
- Information integration
  - Faceted browsing: filter all restaurants based on the type of food, distance from my place, opening hours, ...
- Data mining
  - The application can understand the meaning of the Web pages
  - No re-coding for new pages, no re-coding if the layout changes; a generic piece of software
Slide 75
1. Publishing Patterns? (Bizer 39)
- You supply a URI and get a representation
- Layers: there is no need to create new data
- The representation can be adapted to the client's needs -> RDF/XML, XHTML+RDFa
- Shows HOW to make data available as RDF via HTTP
Slide 76
RDFa
- A way to directly add RDF to XHTML pages
- Provides new attributes to handle additional markup
- W3C Recommendation (2008)
- HTML is not extensible, but most RDFa parsers will recognize RDFa attributes in any version of HTML
Slide 77
RDFa
- Provides new attributes to handle additional markup and reuses existing ones
  - about, resource, href, src, ...
- Can be used with any supported element; preferred:
  - span, div (in the body)
  - a (linking element)
  - meta, link (in the header)
Slide 78
Example - Default Subject
XHTML page: http://example.com/alice/posts/42
All content on this site is licensed under a Creative Commons License.
RDF triple: cc:license.
Slide 79
Example 2 - Attribute property
XHTML page: http://example.com/alice/posts/42
The trouble with Bob Alice...
The trouble with Bob Alice...
Slide 80
Example 3 - Adding Attribute about
XHTML page: http://example.com/alice/posts/42
The trouble with Bob Alice......
Slide 81
Example 4 - Redefining about
XHTML page: http://example.com/alice/posts/42
The trouble with Bob
Bob takes much better photos than I do: Beautiful Sunset by Bob.
Slide 82
Example 5 - Blank Node
XHTML page: http://example.com/alice/posts/42
Alice Birpemswick
Email: [email protected]
Phone: +1 617.555.7332
Slide 83
Example 6 - Chaining
Albert Einstein
1879-03-14
Federal Republic of Germany
Slide 84
Establishing Context, Completing
- @about, @src, @typeof
  - @about and @src explicitly create a new context for statements; @typeof does so implicitly
  - If none of them is present, the context is inherited
  - The subject can be the whole document
- @about, @rel, @resource; @rel, @resource -> the subject of the second triple is deduced
- @about, @rel, @property, @content -> the object and subject of the triples are deduced as a blank node
Slide 85
Objects
- A literal can be set by using @property to express a predicate and then using either @content or the inline text of the element that @property is on.
- A URI resource can be set by using @rel or @rev to express a predicate, and then either using @href, @resource, or @src to provide an object resource explicitly, or using the chaining techniques to obtain an object from a nested subject or from a bnode.
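The rule for literal objects can be illustrated with a small Python sketch: for an element carrying @property, the literal is the value of @content if present, otherwise the inline text. The markup, the dc: property names, and the date are assumptions for illustration, and the sketch is heavily simplified (no subject tracking, no nested elements, no datatypes), so it is not a conforming RDFa processor.

```python
from html.parser import HTMLParser


class LiteralExtractor(HTMLParser):
    """Collect (predicate, literal) pairs from elements carrying @property."""

    def __init__(self):
        super().__init__()
        self.literals = []
        self._pending = None  # predicate still waiting for its inline text

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        predicate = attributes.get("property")
        if predicate is None:
            return
        if "content" in attributes:
            # @content overrides the visible text of the element.
            self.literals.append((predicate, attributes["content"]))
        else:
            self._pending = predicate

    def handle_data(self, data):
        if self._pending is not None:
            # No @content: the inline text of the element is the literal.
            self.literals.append((self._pending, data.strip()))
            self._pending = None


# Assumed markup; dc:title, dc:created, and the date are illustrative.
xhtml = (
    '<h2 property="dc:title">The trouble with Bob</h2>'
    '<span property="dc:created" content="2011-09-10">10 September 2011</span>'
)

extractor = LiteralExtractor()
extractor.feed(xhtml)
for pair in extractor.literals:
    print(pair)
```

The second element shows why @content is useful: the page displays a human-friendly date while the triple carries a machine-friendly one.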
Slide 86
Other Ways
Microformats
- Fixed vocabulary
- Uses the class attribute
- Example: Jeremy Keith, Clearleft
Slide 87
GRDDL
- W3C Recommendation (2007)
- To obtain RDF from an RDFa XHTML page
- To obtain RDF from microformats
  - Requires different templates for different microformats
Slide 88
Assignment 1
- Look at the public contract described at http://zakazky.praha.eu/detailZakazky.jsp?zakazkaId=136343
- Download the page
- Add RDF annotations (in your favorite editor) to specify:
  - Publisher of the document (Jan Novak)
  - Contract title (Název) in Czech and English
  - Publication date (Datum vyvěšení)
  - Awarded tender together with the winner of the tender (Dodavatel) and his price (Nabídková cena)
    - The winner should be accompanied by a name and ICO
    - The price should be accompanied by the currency information
- Distill RDF from the XHTML file: http://www.w3.org/2007/08/pyRdfa/
- Notes: you will need to look at:
  - The Opendata.cz ontology for public contracts
  - Any other ontology referenced
Slide 89
Basic Rules
- Object is a literal
  - Use the about, property, and content attributes
- Object is a resource (identified by a URI)
  - Use the about, rel, and href attributes
Slide 90
Checklist
- Header
  - version="XHTML+RDFa 1.0"
  - Namespaces
  - Meta publisher
- Detail -