133
Knowledge Technologies Manolis Koubarakis & Eleni Tsalapati 1 Introduction to Knowledge Technologies Knowledge Graphs, the Semantic Web and Linked Data

Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

  • Upload
    others

  • View
    7

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Knowledge Technologies

Manolis Koubarakis & Eleni Tsalapati

1

Introduction to Knowledge Technologies

Knowledge Graphs, the Semantic Web and Linked

Data

Page 2: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Presentation Outline

• Knowledge Technologies & Data Science

• The vision of the Semantic Web

• Linked Data

– RDF, RDFS and SPARQL

• Ontologies

– Very large ontologies, building ontologies, Description logics, OWL

• Linked geospatial and temporal data

• Applications

Knowledge Technologies

Manolis Koubarakis

2

Page 3: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Presentation Outline

• Knowledge Technologies & Data Science

• The vision of the Semantic Web

• Linked Data

– RDF, RDFS and SPARQL

• Ontologies

– Very large ontologies, building ontologies, Description logics, OWL

• Linked geospatial and temporal data

• Applications

Knowledge Technologies

Manolis Koubarakis

3

Page 4: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Knowledge Technologies &

Data Science

When knowledge is:

-from multiple resources

-heterogeneous

Then we need to standardize the datasets into a unified framework that will:

-easier analysis (e.g. consolidating data with different unit systems)

-minimal loss of information

-machine & human readable

-can be explored efficiently

4

Page 5: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Deep learning technologies: need to semantically enrich the data to understand

their meaning

Semantic technologies: need to manipulate data more efficiently

5

Knowledge Technologies &

Data Science

Page 6: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Presentation Outline

• Knowledge Technologies & Data Science

• The vision of the Semantic Web

• Linked Data

– RDF, RDFS and SPARQL

• Ontologies

– Very large ontologies, building ontologies, Description logics, OWL

• Linked geospatial and temporal data

• Applications

Knowledge Technologies

Manolis Koubarakis

6

Page 7: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Knowledge Technologies

Manolis Koubarakis

7

The Web Today

• Universal resource identifiers (URIs) to identify documents.

• The Hypertext Transfer Protocol (HTTP) to exchange documents

between a client and a server.

• HTML for marking up information to be presented to human readers

through a browser.

• Search Engines (Google, duckduckgo) to discover information.

Page 8: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Knowledge Technologies

Manolis Koubarakis

8

Using Search Engines – Example I

• Assume that you want to find information about the actor Sean

Connery.

• What do you do?

– Google “Sean Connery”!

Page 9: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Result

Knowledge Technologies

Manolis Koubarakis

9

Page 10: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Discussion

• Gone are the days when Google would simply return a list of links to

Web pages containing the words in the query ordered by importance

(using PageRank).

• Now Google is able to understand that a user question is about an

entity and returns structured information about this entity (the

infobox) together with relevant web links.

Knowledge Technologies

Manolis Koubarakis

10

Page 11: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Using Search Engines – Example II

• Assume you would like to find information about Prof. Manolis

Koubarakis.

• Google “Prof. Manolis Koubarakis”.

Knowledge Technologies

Manolis Koubarakis

11

Page 12: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Result

Knowledge Technologies

Manolis Koubarakis

12

Page 13: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Discussion

• Good work but no structured information about this entity.

Knowledge Technologies

Manolis Koubarakis

13

Page 14: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Using Search Engines – Example III

• Let us now google: “Who is married to Angelina Jolie?”

Knowledge Technologies

Manolis Koubarakis

14

Page 15: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Using Search Engines – Example III

Knowledge Technologies

Manolis Koubarakis

15

Page 16: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Discussion

• Not only Google knows about entities, it also knows about

relationships e.g., “is married to”.

• Google can understand the semantics (meaning) of a user query.

Knowledge Technologies

Manolis Koubarakis

16

Page 17: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

• The Google Knowledge Graph stores information about real-world

entities and relationships in a structured way.

• Similar knowledge graphs are employed in all big Web companies

today like Microsoft, IBM, eBay, Facebook etc. See the article

https://dl.acm.org/citation.cfm?doid=3351434.3331166

Knowledge Technologies

Manolis Koubarakis

17

The Google Knowledge Graph

Page 18: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Discussion

• Constructing and using knowledge graphs (or knowledge bases) has

been, for a long time, the subject of research in Knowledge

Representation and Reasoning (a subarea of Artificial

Intelligence).

• The problem of knowledge representation and reasoning on the Web

has been studied by the areas of Semantic Web, Linked Data which

are the topic of this course.

Knowledge Technologies

Manolis Koubarakis

18

Page 19: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Knowledge Technologies

Manolis Koubarakis

19

Towards a Better Web!

• The Semantic Web vision was articulated in a Scientific American

article by Tim Berners-Lee, James Hendler and Ora Lassila (May

2001).

– “The Semantic Web will bring structure to the meaningful content

of Web pages, creating an environment where agents roaming from

page to page readily carry out sophisticated tasks for users.”

• You can find the article on various Web sites e.g.,

http://www.dcc.uchile.cl/~cgutierr/cursos/IC/semantic-web.pdf

• Notice the words meaning (semantics) and agents (a role for all of

Artificial Intelligence).

Page 20: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

20

Towards a Better Web!

Page 21: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Knowledge Technologies

Manolis Koubarakis

21

How Can we Achieve the Semantic

Web? – The Original Vision

• Instead of publishing information to be consumed by humans, publish

machine-processable data and metadata using terms/languages that

can be understood by machines.

• Build machines (agents) that will explore, integrate etc. this data.

• Make sure all agents understand your terms/languages.

Page 22: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Knowledge Technologies

Manolis Koubarakis

22

The Semantic Web and Linked Data

Vision Today

• From the Introduction section of the Semantic Web activity of the W3C http://www.w3.org/2001/sw/

“The Semantic Web is a web of data. There is lots of data we all use every day, and it is not part of the web. I can see my bank statements on the web, and my photographs, and I can see my appointments in a calendar. But can I see my photos in a calendar to see what I was doing when I took them? Can I see bank statement lines in a calendar?

Why not?

• common formats for integration and combination of data drawn from diverse sources

• language for recording how the data relates to real world objects.

Page 23: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Knowledge Technologies

Manolis Koubarakis

23

The Semantic Web Vision Today

(cont’d)

• Why linked data and Web data integration?

– Lots of important Web applications demand this e.g., e-science

and e-government.

• Why Web standards/languages for expressing shared meaning?

– This is important if we want agents that are not handcrafted only for

particular tasks to be developed (“roaming from page to page”)

• See the paper “The Semantic Web Revisited” by Nigel Shadbolt,

Wendy Hall and Tim Berners-Lee at

http://eprints.ecs.soton.ac.uk/12614/1/Semantic_Web_Revisted.pdf .

Page 24: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Knowledge Technologies

Manolis Koubarakis

24

The Semantic Web Layer Cake

Page 25: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Knowledge Technologies

Manolis Koubarakis

25

Emphasis of this Course

• The Semantic Web topics that we will cover in this course are:

– Linked data

• RDF, RDFS, SPARQL and RDF stores (Sesame).

– Ontologies

• Description logics, OWL 2, tools for developing ontologies

(Protégé) and reasoners (Pellet, Hermit).

– Rules

• SWRL and others

– Ontology engineering

– Linked geospatial data (the work of Prof Koubarakis’ group)

• stSPARQL, GeoSPARQL and Strabon

Page 26: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Presentation Outline

• Knowledge Technologies & Data Science

• The vision of the Semantic Web

• Linked Data

– RDF, RDFS and SPARQL

• Ontologies

– Very large ontologies, building ontologies, Description logics, OWL

• Linked geospatial and temporal data

• Applications

Knowledge Technologies

Manolis Koubarakis

26

Page 27: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Knowledge Technologies

Manolis Koubarakis

27

Open Data

• The open data movement is about making data public so they can be

used freely by everyone.

• Lots of such data becoming available today is government data.

Page 28: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

https://data.europa.eu/euodp/en/linked-data

Knowledge Technologies

Manolis Koubarakis

28

Page 29: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

http://data.gov.gr

Knowledge Technologies

Manolis Koubarakis

29

Page 30: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Open Data Directive

• Εntered into force on 16 July 2019 (replacing the PSI Directive of 2013)

• The goal is to make public sector and publicly funded data re-usable

and transparent.

• Focuses on the economic aspects of the re-use of information.

• Strengthens the transparency requirements for public–private

agreements

• Encourages the Member States to make as much information available

for re-use as possible

• High value datasets: beneficial for the society and economy:

– Geospatial, earth observation and environment, meteorological, statistics, companies

and company ownership, mobility

30

Page 31: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Linked Open Data

• The goal of this W3C community effort is “to extend the Web with a

data commons by publishing various open data sets as RDF on the

Web and by setting RDF links between data items from different data

sources.”

• The LOD community is developing a set of best practices for achieving

this.

Knowledge Technologies

Manolis Koubarakis

31

Page 32: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Knowledge Technologies

Manolis Koubarakis

32

The Linked Open Data Cloud

The dataset currently contains 1260 datasets with 16187 links (as of May 2020)

Page 33: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Linked Geo-data

Knowledge Technologies

Manolis Koubarakis

33

Page 34: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Linked Geo-Data (cont’d)

Knowledge Technologies

Manolis Koubarakis

34

Page 35: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

http://www.linkedopendata.gr

Knowledge Technologies

Manolis Koubarakis

35

Page 36: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

http://www.linkedopendata.gr

Knowledge Technologies

Manolis Koubarakis

36

Page 37: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

37

The Greek Administrative Geography

(GAG)

• According to the Kallikratis administrative reform of 2010, Greece consists

of:

• 7 decentralized administrations (e.g., Macedonia and

Thrace)

• 13 regions (e.g., West Macedonia)

• Each region consists of regional units (e.g., Heraklion)

• Each regional unit consists of municipalities (e.g., Dimos

Chersonisou)

• …

• You can find the dataset at http://www.linkedopendata.gr/dataset/greek-

administrative-geography

Page 38: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Querying GAG

Knowledge Technologies

Manolis Koubarakis

38

Page 39: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Tim Berners-Lee outlined four principles of linked data in his "Linked Data" note

of 2006:

1. Use URIs to name (identify) things.

2. Use HTTP URIs so that these things can be looked up (interpreted,

"dereferenced").

3. When someone looks up a URI, provide useful information, using

the standards (RDF, SPARQL, etc)

4. Include links to other URIs so that they can discover more things.

39

Linked data principles

Page 40: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Key Linked Data Technologies

• URIs: a generic means to identify entities or concepts in the world

• HTTP: a simple, yet universal, mechanism for retrieving resources, or

descriptions of resources

• RDF: a data model for structuring and linking data that describes

things in the world

• RDFS: is a general-purpose language for representing simple RDF

vocabularies on the Web.

• SPARQL: a query language for querying linked RDF data

Knowledge Technologies

Manolis Koubarakis

40

Page 41: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Knowledge Technologies

Manolis Koubarakis

41

Uniform Resource Identifiers

• The Web provides a general form of identifier, called the Uniform Resource Identifier (URI), for identifying (naming) resources on the Web.

• Unlike URLs, URIs are not limited to identifying things that have network locations, or use other computer access mechanisms.

• A number of different URI schemes (URI forms) have been already been developed, and are being used, for various purposes:

• Examples:– http: (Hypertext Transfer Protocol, for Web pages) – mailto: (email addresses), e.g., mailto:[email protected]– ftp: (File Transfer Protocol) – urn: (Uniform Resource Names, intended to be persistent location-independent resource

identifiers), e.g., urn:isbn:0-520-02356-0 (for a book)

• No one person or organization controls who makes URIs or how they can be used. While some URI schemes, such as URL's http:, depend on centralized systems such as DNS, other schemes, such as freenet:, are completely decentralized.

Page 42: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Example

• http://en.wikipedia.org/wiki/Angelina_Jolie

• http://dbpedia.org/page/Angelina_Jolie

• http://www.imdb.com/name/nm0001401/

• http://myfavouriteactors.gr/AngelinaJolie.html (does not exist right now)

• Note the difference between the person Angelina Jolie and the various Web pages for her. It is common to use “hash URIs” to refer to the real person:– http://en.wikipedia.org/wiki/Angelina_Jolie#this

Page 43: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Example (cont’d)

http://dbpedia.org/page/Changeling_(film)

• dbpedia-owl:starring: http://dbpedia.org/page/Angelina_Jolie

• dbpedia-owl:director : http://dbpedia.org/page/Clint_Eastwood

Page 44: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Problem

• Data sets will typically talk about the same object using different

URIs. For example:

Knowledge Technologies

Manolis Koubarakis

44

http://dbpedia.org/page/Angelina_Jolie

http://www.freebase.com/view/en/angelina_jolie

Page 45: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Solution: Linking

• Use the special kind of link owl:sameAs that identifies the resources:

Knowledge Technologies

Manolis Koubarakis

45

http://dbpedia.org/page/Angelina_Jolie

http://www.freebase.com/view/en/angelina_jolie

Page 46: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

46

Other Links Beyond owl:sameAs

• Natura 2000

• Example of link to GAG: This protected area is contained

in the perfecture of Macedonia.

• Public transport

• Examples of links:

• This bus stop is in the municipality of Athens.

• This bus stop is near a public building.

• You can find these datasets as well at

http://www.linkedopendata.gr

Page 47: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

47

5-star deployment scheme for

Linked Open Data

Page 48: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Knowledge Technologies

Manolis Koubarakis

48

RDF Basics

• The world consists of resources.

• Resources are identified by URIs and described by simple propertiesand property values.

• Example: “Angelina Jolie stars in the film Changeling directed by Clint Eastwood.”

Page 49: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

RDF Statements

• RDF statements are triples of the formsubject predicate object .

• Example:

<http://dbpedia.org/page/Changeling_(film)>

<http://dbpedia.org/ontology/starring>

<http://dbpedia.org/page/Angelina_Jolie> .

<http://dbpedia.org/page/Changeling_(film)>

< http://dbpedia.org/ontology/director>

<http://dbpedia.org/page/Clint_Eastwood> .

Knowledge Technologies

Manolis Koubarakis

49

Page 50: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Example (cont’d)

• To avoid writing long URIs we can use namespace prefixes:

@prefix dbpedia: http://dbpedia.org/page/ .

@prefix dbpedia-owl: http://dbpedia.org/ontology/ .

dbpedia:Changeling_(film) dpedia-owl:starring dbpedia:Angelina_Jolie .

dbpedia:Changeling_(film) dpedia-owl:director dbpedia:Clint_Eastwood .

Knowledge Technologies

Manolis Koubarakis

50

Page 51: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Knowledge Technologies

Manolis Koubarakis

51

RDF Graphs

dbpedia:Changeling_(Film)

dbpedia:Angelina_Joliedbpedia:Clint_Eastwood

dbpedia-owl:starringdbpedia-owl:director

Page 52: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Knowledge Technologies

Manolis Koubarakis

52

RDF Schema

• The RDF Vocabulary Description Language 1.0 (or RDF Schema or

RDFS):

– is a simple schema language for RDF.

– allows to declare classes/properties using the predefined language

level class rdfs:Class/ property rdfs:Property

– is an ontology definition language (a simple one; we can only

define taxonomies and do some basic inference about them).

– like a schema definition language in the relational or object-

oriented data models (hence the alternative name RDF Schema –

we will use this name and its shorthand RDFS mostly!).

Page 53: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Knowledge Technologies

Manolis Koubarakis

53

Example – Class Hierarchy

owl:Thing

dbpedia-owl:Actor dbpedia-owl:Film_Director

rdfs:subClassOf

dbpedia-owl:Person

rdfs:subClassOfrdfs:subClassOf

dbpedia:Angelina_Joliedbpedia:Clint_Eastwood

rdf:type/a rdf:type/a rdf:type/a

Page 54: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Knowledge Technologies

Manolis Koubarakis

54

Example – Property Hierarchy

dbpedia-owl:husbandOf dbpedia-owl:wifeOf

dbpedia-owl:spouseOf

rdfs:subPropertyOf rdfs:subPropertyOf

Page 55: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Standard Vocabularies

• Linked data are written using standard vocabularies i.e., vocabularies

that have been agreed by certain communities for describing

certain kinds of resources.

• Examples:

– FOAF

– Dublin Core

– schema.org

Knowledge Technologies

Manolis Koubarakis

55

Page 56: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Knowledge Technologies

Manolis Koubarakis

56

FOAF

• The Friend of a Friend (FOAF) project has created a standard

vocabulary for describing people, the links between them and the

things they create and do.

– e.g. foaf:name, foaf:knows, foaf:homepage

• See http://xmlns.com/foaf/spec/ for more information.

Page 57: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Example

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

@prefix foaf: <http://xmlns.com/foaf/0.1/> .

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

<http://en.wikipedia.org/wiki/Angelina_Jolie#this>

rdf:type foaf:Person ;

foaf:name “Angelina Jolie" ;

foaf:mbox <mailto:[email protected]> ;

foaf:homepage <http://www.AngelinaJolie.com/> ;

foaf:interest < http://en.wikipedia.org/wiki/UNHCR> ;

foaf:knows [ rdf:type foaf:Person ;

foaf:name “Michael Douglas" ] .

Knowledge Technologies

Manolis Koubarakis

57

Page 58: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Knowledge Technologies

Manolis Koubarakis

58

Dublin Core

• Dublin Core is a community effort that has defined a set of metadata

attributes (e.g., creator, creation date, title etc.) for describing a wide

range of digital resources.

• See http://dublincore.org for more information.

Page 59: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Knowledge Technologies

Manolis Koubarakis

59

Example

Page 60: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

http://schema.org

• This was an initiative by Bing, Google, Yahoo! and Yandex to

provide a collection of schemas that webmasters can use to markup

their pages in ways recognized by major search providers.

• Standard ontology of classes and relations.

• Since April 2015, the W3C Schema.org community group manages this

activity.

Knowledge Technologies

Manolis Koubarakis

60

Page 61: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

http://schema.org/Place

Knowledge Technologies

Manolis Koubarakis

61

Page 62: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

http://schema.org/LocalBusiness

Knowledge Technologies

Manolis Koubarakis

62

Page 63: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Microdata

• Microdata provide a simple way to annotate HTML elements with

machine readable tags.

• Microdata can use standardized vocabularies to capture the semantics

of HTML items.

• See http://www.w3.org/TR/microdata/ for more details.

• Similar (earlier) approaches are RDFa (RDF in attributes-W3C

Recommendation that adds a set of attribute-level extensions to

HTML), microformats and JSON-LD.

Knowledge Technologies

Manolis Koubarakis

63

Page 64: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Example – Original HTML

• The following is some HTML code for describing a local

business called “Beachwalk Beachwear and Giftware”.

<h1>Beachwalk Beachwear & Giftware</h1>

A superb collection of fine gifts and clothing to

accent your stay in Mexico Beach.

3102 Highway 98

Mexico Beach, FL

Phone: 850-648-4200

Knowledge Technologies

Manolis Koubarakis

64

Page 65: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Example (cont’d)– HTML With

Microdata

<div itemscope itemtype="http://schema.org/LocalBusiness">

<h1><span itemprop="name">Beachwalk Beachwear & Giftware</span></h1>

<span itemprop="description"> A superb collection of fine gifts and

clothing to accent your stay in Mexico Beach.</span>

<div itemprop="address" itemscope

itemtype="http://schema.org/PostalAddress">

<span itemprop="streetAddress">3102 Highway 98</span>

<span itemprop="addressLocality">Mexico Beach</span>,

<span itemprop="addressRegion">FL</span>

</div>

Phone: <span itemprop="telephone">850-648-4200</span>

</div>

Knowledge Technologies

Manolis Koubarakis

65

Page 66: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

SPARQL

• SPARQL stands for “SPARQL Protocol and RDF Query Language”.

• It is the standard query language for RDF dataproposed by the W3C.

• It is based on matching graph patterns against RDF graphs.

• The simplest kind of graph pattern is a triple pattern.

– A triple pattern is like an RDF triple, but with the option of a variable in the subject, predicate or object

positions.

Page 67: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Example Dataset

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

@prefix foaf: <http://xmlns.com/foaf/0.1/> .

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

<http://en.wikipedia.org/wiki/Angelina_Jolie#this>

rdf:type foaf:Person ;

foaf:name “Angelina Jolie" ;

foaf:mbox <mailto:[email protected]> ;

foaf:homepage <http://www.AngelinaJolie.com/> ;

foaf:interest <http://en.wikipedia.org/wiki/UNHCR> ;

foaf:knows [ rdf:type foaf:Person ;

foaf:name “Michael Douglas" ] .

Knowledge Technologies

Manolis Koubarakis

67

Page 68: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Example SPARQL Query

SELECT ?name

WHERE { ?x foaf:name ?name .

?x rdf:type foaf:Person .

?x foaf:knows ?y .

?y foaf:name “Michael Douglas” .

}

• Result:?name

“Angelina Jolie"

Page 69: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Presentation Outline

• Knowledge Technologies & Data Science

• The vision of the Semantic Web

• Linked Data

– RDF, RDFS and SPARQL

• Ontologies

– Very large ontologies, building ontologies, Description logics, OWL

• Linked geospatial and temporal data

• Applications

Knowledge Technologies

Manolis Koubarakis

69

Page 70: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Knowledge Technologies

Manolis Koubarakis

70

Ontologies

• An ontology is a formal, explicit, shared specification of a conceptualization of a domain (Gruber, 1993).

• Conceptualization: the objects, concepts, and other entities that are assumed to exist in some area of interest and the relationships that hold among them. A conceptualization is an abstract, simplified view of the world that we wish to represent for some purpose.

• The term ontology is borrowed from Philosophy, where ontology is a systematic account of existence (what things exist, how they can be differentiated from each other etc.).

• Sometimes the word ontology is a synonym for a shared knowledge base. Some other times it is used to refer only to intentional (schema-level) knowledge.

Page 71: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Formal Languages for Ontologies

• Ontologies are typically expressed in some formal logic-based language (e.g., first-order logic).

• The literature also offers special formalisms for defining ontologies that

contain mainly taxonomic knowledge:

– Semantic networks

– Frames

– Description logics

– RDF, RDFS and OWL in the Semantic Web.

Knowledge Technologies

Manolis Koubarakis

71

Page 72: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Formal Languages for Ontologies

(cont’d)

• You can think about these formalisms as being object-oriented logics:

– They have special constructs for representing knowledge about

individuals (or objects), categories (or classes) and relationships (or

roles).

– Categories are organized into taxonomies.

– They have special reasoning methods to deal with these constructs.

• Description logics, RDF, RDFS and OWL 2 are a core part of this course.

Knowledge Technologies

Manolis Koubarakis

72

Page 73: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Taxonomies

• Taxonomies have been used profitably for centuries in various

technical fields (biology, medicine, library science etc.).

• Taxonomic information plays a central role in other Computer

Science areas e.g., programming languages, databases and

software engineering.

• Taxonomies are very important in modern applications: Web

information retrieval and integration, knowledge management, e-

commerce, e-science, e-government etc.

Knowledge Technologies

Manolis Koubarakis

73

Page 74: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Knowledge Technologies

Manolis Koubarakis

74

Example: an RDFS ontology of vehicles

Page 75: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Example: the GAG RDFS ontology

75

Page 76: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Very Large Ontologies and Knowledge

Graphs

• Various areas of human knowledge and deploying this knowledge in

applications such as search engines or question answering.

• Example: Watson, IBM’s question answering system that beat humans

in the quiz show Jeopardy (2011)

(https://www.youtube.com/watch?v=P18EdAKuC1U).

Knowledge Technologies

Manolis Koubarakis

76

Page 77: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Examples of Very Large Ontologies and

Knowledge Graphs

• Cyc

• Wordnet

• DBpedia

• Freebase

• Wikidata

• YAGO

• CaLiGraph

• Google Knowledge Graph

• Facebook Graph Search

• TextRunner

• NELL

Knowledge Technologies

Manolis Koubarakis

77

Page 78: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Cyc (1984)

• From the Cyc Web site (http://www.cyc.com/):

– The Cyc KB is a formalized representation of a vast quantity of

human common sense knowledge: facts, rules of thumb, and

heuristics for reasoning about the objects and events of everyday

life.

– At the present time, the Cyc KB contains nearly 500,000 terms,

including about 15,000 types of relations, and about 5,000,000

facts (assertions) relating these terms.

– New assertions were added to the KB through a combination of

automated and manual means.

– Latest release: 2017

Knowledge Technologies

Manolis Koubarakis

78

Page 79: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

What does Cyc know?

Knowledge Technologies

Manolis Koubarakis

79

Page 80: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Cyc (cont’d)

• Cyc is written in CycL: an extension of FOL with equality, default

reasoning, skolemization, and some second-order features.

• The reasoning engine of Cyc uses around 1000 specialized

reasoners.

• There is an open source version called OpenCyc.

• Applications: Glaxo, Cleveland Clinic Foundation, Terrorist KB (2005)

Knowledge Technologies

Manolis Koubarakis

80

Page 81: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Wordnet

• From the WordNet Web site (http://wordnet.princeton.edu/):

– WordNet is a large lexical database of English (around

200,000 of words).

– Nouns, verbs, adjectives and adverbs are grouped into sets of

cognitive synonyms (synsets), each expressing a distinct

concept.

– Synsets are interlinked by means of conceptual-semantic

(hyponym, hypernym, meronym etc.) and lexical relations.

– WordNet is not really an ontology but parts of it can be used into

an ontology containing taxonomic information.

Knowledge Technologies

Manolis Koubarakis

81

Page 82: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Example

Knowledge Technologies

Manolis Koubarakis

82

Page 83: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Example (cont’d)

Knowledge Technologies

Manolis Koubarakis

83

Page 84: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

• DBpedia is a community effort to extract structured information from

Wikipedia.

• This structured information resembles an open knowledge graph

• Part of the Linked Open Data effort.

• The English version:

– 4.58 million things (1,445,000 persons, 735,000 places,etc)

• Advantages:

– multi-domain

– represents real community agreement

– automatically evolves as Wikipedia changes

– multilingual (125 languages-but with different content for each one)

• Emphasis on individuals, not general knowledge about various

domains (compared to Cyc and Wordnet).Knowledge Technologies

Manolis Koubarakis

84

Page 85: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

http://dbpedia.org/page/Angelina_Jolie

Knowledge Technologies

Manolis Koubarakis

85

Page 86: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

http://dbpedia.org/page/Angelina_Jolie

Knowledge Technologies

Manolis Koubarakis

86

Page 87: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Knowledge Technologies

Manolis Koubarakis

87

Page 88: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Freebase• Large collaborative knowledge base consisting of data harvested from

many sources, including individual, user-submitted wiki contributions

• Freebase was an open, Creative Commons licensed repository of

structured data of almost 20 million entities.

• Bought by Google and its Knowledge Graph technology build on it.

• In December of 2014, Google closed it down.

• In September 2018 Google has published at github.com sources of graphd

server,which is a Freebase backend. https://github.com/google/graphd

• The contents of Freebase were exploited by Wikidata:

https://static.googleusercontent.com/media/research.google.com/en//pubs/

archive/44818.pdf

Knowledge Technologies

Manolis Koubarakis

88

Page 89: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Wikidata (https://www.wikidata.org/)

Knowledge Technologies

Manolis Koubarakis

89

Page 90: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Wikidata

• Provides high quality structured data in real-time (500 editors/minute)

• Largest general-purpose knowledge base on the Semantic Web

maintained by open communities

• Provides content to Wikipedia

• Multilingual: same information for each language

Knowledge Technologies

Manolis Koubarakis

90

Page 91: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

YAGO

• From the YAGO Web page (https://yago-knowledge.org/):

– Max Planck Institute – YAGO contains both entities (such as movies, people, cities, countries, etc.) and

relations between these entities (who played in which movie, which city is located

in which country, etc.).

– Contains more than 50 million entities and 2 billion facts.

– YAGO4 was released on March 2020. It is open source.

– Combines two resources:

• Wikidata: difficult taxonomy and no human-readable entity identifiers.

• schema.org: ontology of relations and classes but without entities

• Logical constraints: ensure consistency (e.g. no entity can be at the same

time a person and a place) =>

reasoning techniques can be applied.

Knowledge Technologies

Manolis Koubarakis

91

Page 92: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

92

Page 93: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

CaLiGraph

• Under development (started in 2019)

• DBpedia and YAGO 2 create an entity for each page in Wikipedia.

• Wikipedia's policies permit pages about subjects only if they have a

certain popularity

• Lack information about less well-known entities

• Extraction of entities from Wikipedia's categories and listpages

• Classification of entities with ML techniques.

93

Page 94: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Google’s Knowledge Graph

Knowledge Technologies

Manolis Koubarakis

94

• Information gathered from a variety of sources.

• The information is presented to users in a knowledge panel next to the search results.

• It covers 570 million entities and 18 billion facts

Page 95: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Facebook’s Graph Search

Knowledge Technologies

Manolis Koubarakis

95

• On June 7th 2019, Facebook shut down its Graph Search options

• “The vast majority of people on Facebook search using keywords, a

factor which led us to pause some aspects of graph search and

focus more on improving keyword search. We are working closely

with researchers to make sure they have the tools they need to use

our platform.“ Buzzfeed

• The majority of urls for graph search queries is no longer working

Page 96: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

TextRunner

• TextRunner is a knowledge base built by parsing 500,000 Web pages

and extracting relationships among entities from them.

• TextRunner is part of the research project KnowItAll at the University

of Washington. This project concentrates on two fundamental

questions:

– How can a computer accumulate a massive body of knowledge?

– What will Web search engines look like in ten years?-wait, that it is

now

Knowledge Technologies

Manolis Koubarakis

96

Page 97: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

TextRunner (cont’d)

Knowledge Technologies

Manolis Koubarakis

97

Page 98: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

NELL (Never-Ending Language

Learning)

Knowledge Technologies

Manolis Koubarakis

98

Page 99: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Interacting with NELL

Knowledge Technologies

Manolis Koubarakis

99

Page 100: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Building Ontologies

• The example ontologies and knowledge graphs we saw have been

created using different means:

– Cyc, schema.org: trained ontology engineers

– WordNet: trained experts (lexicographers).

– DBpedia, FreeBase, YAGO4: automatically importing structured

facts from various Web sources possibly with some inference.

– CaLiGraph, TextRunner, NELL: parsing textual data and extracting

information from them

Knowledge Technologies

Manolis Koubarakis

100

Page 101: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Building Ontologies (cont’d)

• Ontology engineering offers us general methodologies of how to

create, maintain and evolve well-structured ontologies.

• See the OWL tutorial available at http://www.co-

ode.org/resources/tutorials/intro/.

Knowledge Technologies

Manolis Koubarakis

101

Page 102: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Description Logics

• The origins of DLs lie in research on semantic networks and frames.

DLs are languages for describing the nature and structure of

objects.

• The DL approach to KR was developed in the 80's and 90's in

parallel with pure FOL approaches and other languages for

structured objects like Telos and F-logic.

• DLs have been used to provide the foundations for ontology

languages for the Web e.g., OWL.

• Many DLs are decidable fragments of first-order logic (FOL)

• Some DLs have features that are not covered in FOL

Knowledge Technologies

Manolis Koubarakis

102

Page 103: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

The Origins: KL-ONE

Knowledge Technologies

Manolis Koubarakis

103

Page 104: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Examples

• The set of female doctors.

Female ∏ Doctor

• The set of individuals that have at least 3 children that are male.

• The set of individuals such that all their children have graduated from at

least one Greek University.

• The set of individuals that have at least three children such that all their

degrees are from Greek Universities.

Knowledge Technologies

Manolis Koubarakis

104

Page 105: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Examples (cont’d)

• Ann is a female doctor.

(Female ∏ Doctor)(ANN)

• John is a child of Ann.

hasChild(ANN,JOHN)

• Ann has at least 3 children that are male.

Knowledge Technologies

Manolis Koubarakis

105

Page 106: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Examples (cont’d)

• A woman is a female person.

Woman ≡ Person ∏ Female

• A parent is a person who has at least one child.

Parent ≡ Person ∏

Knowledge Technologies

Manolis Koubarakis

106

Page 107: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Knowledge Technologies

Manolis Koubarakis

107

OWL

• OWL 2 is the current version of the

Web Ontology Language and a

W3C recommendation as of

October 2009.

• The previous version of OWL (OWL 1) became a W3C recommendation in 2004.

• All W3C documents about OWL 2 can be found at http://www.w3.org/TR/2009/REC-owl2-overview-20091027/ .

Page 108: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Knowledge Technologies

Manolis Koubarakis

108

The Structure of OWL 2

Page 109: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Example

SubClassOf(:Student :Person)

EquivalentClasses(:Man

ObjectIntersectionOf(:Person :Male))

EquivalentClasses(:Woman

ObjectIntersectionOf(:Person :Female))

EquivalentClasses(:Parent

ObjectSomeValuesFrom(:hasChild :Person))

Knowledge Technologies

Manolis Koubarakis

109

Page 110: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Example (cont’d)

EquivalentClasses(:Teenager

ObjectIntersectionOf(Person

DataSomeValuesFrom(:hasAge

DatatypeRestriction(xsd:integer

xsd:minExclusive "12"^^xsd:integer

xsd:maxInclusive "19"^^xsd:integer ))))

Knowledge Technologies

Manolis Koubarakis

110

Page 111: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Example (cont’d)

EquivalentClasses(:ChildlessPerson

ObjectIntersectionOf(:Person

ObjectComplementOf(ObjectSomeValuesFrom(

ObjectInverseOf(:hasParent) owl:Thing))))

Knowledge Technologies

Manolis Koubarakis

111

Page 112: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Example (cont’d)

ClassAssertion(:Person :John)

ClassAssertion(:Male :John)

ClassAssertion(:Person :Mary)

ClassAssertion(:Female :Mary)

ObjectPropertyAssertion(:hasChild :John :Mary)

NegativeObjectPropertyAssertion(:hasChild :John :Tina)

DataPropertyAssertion(:hasAge :Maria "18"^^xsd:integer)

Knowledge Technologies

Manolis Koubarakis

112

Page 113: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Example (cont’d)

SameIndividual(:John :Jack)

SameIndividual(:John otherOnt:JohnBrown)

SameIndividual(:Mary otherOnt:MaryBrown)

DifferentIndividuals(:John :Bill)

Knowledge Technologies

Manolis Koubarakis

113

Page 114: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Presentation Outline

• Knowledge Technologies & Data Science

• The vision of the Semantic Web

• Linked Data

– RDF, RDFS and SPARQL

• Ontologies

– Very large ontologies, building ontologies, Description logics, OWL

• Linked geospatial and temporal data

• Applications

Knowledge Technologies

Manolis Koubarakis

114

Page 115: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Geospatial Data on the Web

Knowledge Technologies

Manolis Koubarakis

115

Page 116: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Earth Observation Data

Knowledge Technologies

Manolis Koubarakis

116

ExtremeEarth

Page 117: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Linked Geospatial and Temporal Data

• This is a research area studied by Prof Koubarakis’ group since 2010.

• The basic research question is how to manage geospatial and

temporal data on the Web using linked data technologies.

• See our team web site for more: http://ai.di.uoa.gr/

Knowledge Technologies

Manolis Koubarakis

117

Page 118: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Research Contributions

• The model stRDF and stSPARQL

• The systems GeoTriples and Silk

• The spatiotemporal RDF store Strabon

• The ontology-based data access system Ontop-spatial.

• The visualization tool Sextant.

• Many applications, especially with Earth Observation

data.

• You will use most of our tools in the course project.

• See our survey paper:

http://cgi.di.uoa.gr/~koubarak/publications/2017/ic4ld-

manolis.pdf

Knowledge Technologies

Manolis Koubarakis

118

Page 119: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Example of stRDF (Geospatial

Dimension)gag:Olympia

rdf:type gag:MunicipalCommunity;

gag:name "Ancient Olympia";

gag:population "184"^^xsd:int;

strdf:hasGeometry "POLYGON((21.5 18.5,23.5 18.5, 23.5

21,21.5 21,21.5 18.5));

http://www.opengis.net/def/crs/OGC/1.3/CRS84"^^strdf:WKT.

Knowledge Technologies

Manolis Koubarakis

119

Ancient Olympia

Page 120: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Example of stSPARQL

(Geospatial Dimension)Query: Compute the parts of burnt areas that lie in coniferous forests

SELECT ?burntArea (strdf:intersection(?baGeom,

strdf:union(?fGeom)) AS ?burntForest)

WHERE { ?burntArea rdf:type noa:BurntArea;

strdf:hasGeometry ?baGeom.

?forest rdf:type clc:Region;

clc:hasLandCover clc:ConiferousForest;

strdf:hasGeometry ?fGeom.

FILTER (strdf:intersects(?baGeom,?fGeom)) }

GROUP BY ?burntArea ?baGeom

Knowledge Technologies

Manolis Koubarakis

120

Page 121: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Geospatial Question Answering

• Try the following question on Google:

“What rivers cross London?”

Knowledge Technologies

Manolis Koubarakis

121

Page 122: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Geospatial Question Answering

(cont’d)• Although a Web page about crossings of River Thames

comes up first in the results, it is now clear that we do

not get a definite answer like we got for Sean Connery

and Angelina Jolie earlier.

• The reason is that the Google knowledge graph

currently does not support geospatial question

answering (similarly, temporal or with quantities etc.)

• See our paper on this topic:

http://cgi.di.uoa.gr/~koubarak/publications/2018/template

-based-GeoQA.pdf

Knowledge Technologies

Manolis Koubarakis

122

Page 123: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Presentation Outline

• The Semantic Web vision

• Linked data

• Linked geospatial and temporal data

• Ontologies

• Applications

Knowledge Technologies

Manolis Koubarakis

123

Page 124: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Applications

• Industry (e.g., see project Optique http://www.optique-

project.eu/)

• E-government (e.g., open data)

• E-commerce

• Tourism

• Medicine

• Biology

• Earth Observation (see the work of my group in projects

TELEIOS http://www.earthobservatory.eu/Earth

Observation (see the work of my group in projects

TELEIOS http://www.earthobservatory.eu/ and LEO

http://www.linkedeodata.eu/ ).

• …Knowledge Technologies

Manolis Koubarakis

124

Page 125: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

KGs in Enterprise (neo4j)

125Knowledge Graphs: The Path to Enterprise — Michael Moore and AI Omar Azhar, EY

Page 126: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Diffbot

126

https://www.youtube.com/watch?v

=d58BwcyTwEo&t=16s

Page 127: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Knowledge Technologies

Manolis Koubarakis

127

Readings

• The papers mentioned in the presentation.

• Gomez-Perez J.M., Pan J.Z., Vetere G., Wu H. (2017) Enterprise Knowledge Graph: An Introduction. In: Pan J., Vetere G., Gomez-Perez J., Wu H. (eds) Exploiting Linked Data and Knowledge Graphs in Large Organisations. Springer, Cham. https://doi.org/10.1007/978-3-319-45654-6_1

• Chapter 1 of the Semantic Web Primer by Grigoris Antoniou and Frank van Harmelen available from http://www.csd.uoc.gr/~hy566/

• Chapter 1 of the book “Foundations of Semantic Web Technologies” by Pascal Hitzler, Markus Krötzsch and Sebastian Rudolph. See http://semantic-web-book.org.

• The book “Linked Data: Evolving the Web into a Global Data Space”by Tom Heath and Christian Bizer available from http://linkeddatabook.com/editions/1.0/ .

Page 128: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Readings (cont’d)

• The W3C Semantic Web Activity web site (http://www.w3.org/2001/sw/). Browse! There are many interesting introductory tutorials.

• The ESWC invited talk of Jim Hendler available at http://videolectures.net/eswc2011_hendler_work/ which looks at what the Semantic Web has achieved so far and deals with various criticisms of the area.

Knowledge Technologies

Manolis Koubarakis

128

Page 129: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

What to do after this Course

• Join my research group ☺

• Innovate using Semantic Web Technologies and Linked

Data! Two nice examples:

– Nomothesi@ (http://legislation.di.uoa.gr/, Ilias

Chalkidis, Babis Nikolaou and Panagiotis Soursos)

– Diavgeia Redefined (https://eellak.github.io/gsoc17-

diavgeia/index.html , Themis Beris)

Knowledge Technologies

Manolis Koubarakis

129

Page 130: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Nomothesi@

Knowledge Technologies

Manolis Koubarakis

130

Page 131: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Diavgeia Redefined

Knowledge Technologies

Manolis Koubarakis

131

Page 132: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

geoQA

-Question Answering over Large Geographic KBs

-Yago2geo

-Scalable KG update

-QA via KG embeddings

-QA via templates

132

Page 133: Introduction to Knowledge Technologies Knowledge Graphs ...cgi.di.uoa.gr/~pms509/lectures/Introduction to...• Knowledge Technologies & Data Science • The vision of the Semantic

Knowledge Technologies

Manolis Koubarakis

133

Outline of the Rest of the

Course• The Resource Description Framework (RDF and

RDFS)

• SPARQL: A Query language for RDF and RDFS

• Description logics

• The Web Ontology Language (OWL 2)

• Rule languages for the Semantic Web (SWRL)

• Ontology Engineering

• Linked geospatial and temporal data. Geospatial and temporal extensions of RDF and SPARQL