ONTOLOGIES, SEMANTICS & ANALYTICS

Professor Richard McClatchey ([email protected])
University of the West of England (UWE)
Coldharbour Lane, Bristol, BS16 1QY, UK
(Slides prepared by Dr Jetendr SHAMDASANI, UWE)



OUTLINE

• Ontologies : What are they?
• What are Ontologies currently used for?
• Ontology Languages
• The Semantic Web
• Linked Data : Ontologies and the Semantic Web Come of Age
• Comments : Ontologies, Big Data & Analytics

REPRESENTING DATA IN COMPUTERS

• Data are the basic units in computation. We can structure data in databases.
• Information is ‘Data in context’. We can add properties to data and its ‘usage’ and describe these in models.
• Knowledge is ‘Information plus semantics’. We can attribute meaning to information and enrich our models, often as so-called knowledge models or Ontologies.
• Some call these levels of abstraction meta-data (or meta-meta-data).

ONTOLOGIES : POPULAR ACADEMIC DEFINITION

“Formal, explicit specification of a shared conceptualization” [Gruber93]

• Formal: machine-readability with computational semantics
• Explicit: unambiguous terminology definitions
• Shared: commonly accepted understanding
• Conceptualization: conceptual model of a domain (ontological theory)

ONTOLOGY IN COMPUTER SCIENCE

• An ontology is a (software) engineering artifact:
  • It is constituted by a specific vocabulary used to describe a certain reality, plus a set of explicit assumptions regarding the intended meaning of the vocabulary.
• Thus, an ontology describes a formal specification of a certain domain:
  • Shared understanding of a domain of interest
  • Formal and machine manipulable model of a domain of interest

ONTOLOGY USES

• Many domains are described: Medical, Law, Construction, Music, Physics, etc.
• The most popular domain is Medical.
  • Many ontologies already exist in this domain: Gene Ontology, MeSH, UMLS, FMA, etc.
• The domain of medicine has had the most successful uptake so far, since medical terminology can be ambiguous and heterogeneous.

ONTOLOGY USES

• Ontologies are used to define “things”
  • These are real world entities
• Ontologies are formalised as Concepts (things) and Relationships between Concepts
• They can precisely define concepts in the real world and their relationships to other concepts

[Figure: example concept graph. Mitral valve part_of Heart; Aortic valve part_of Heart; Heart part_of Cardiovascular System; Heart is_a Cavitated organ.]
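The heart example can be sketched as a small set of labelled edges between concepts. This is a minimal illustration only: the tuple layout and the helper name `parts_of` are assumptions made for this sketch, and the edge directions follow the usual anatomical reading of the diagram (the valves are part of the heart, and the heart is part of the cardiovascular system and is a cavitated organ).

```python
# Concepts are plain strings; each relationship is a (source, label, target) edge.
EDGES = [
    ("Mitral valve", "part_of", "Heart"),
    ("Aortic valve", "part_of", "Heart"),
    ("Heart", "part_of", "Cardiovascular System"),
    ("Heart", "is_a", "Cavitated organ"),
]


def parts_of(whole, edges=EDGES):
    """Return every concept that is directly or indirectly part_of `whole`."""
    found = set()
    frontier = [whole]
    while frontier:
        current = frontier.pop()
        for source, label, target in edges:
            # Follow part_of edges backwards, from the whole to its parts.
            if label == "part_of" and target == current and source not in found:
                found.add(source)
                frontier.append(source)
    return found


print(sorted(parts_of("Cardiovascular System")))
# ['Aortic valve', 'Heart', 'Mitral valve']
```

Because the relationships are explicit, a program can derive that the mitral valve belongs to the cardiovascular system even though no single edge says so.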

ONTOLOGY LANGUAGES

• Many Ontologies are based on Graphs or Graph Models (Nodes & Edges)
• Some Languages:
  • OBOL (Open Biomedical Ontology Language)
  • DAML + OIL (DARPA Agent Markup Language + Ontology Inference Layer)
  • RDF (Resource Description Framework)
  • OWL (Web Ontology Language)

RDF

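• Directed Labelled Graphs
• Contain:
  • Resource
  • Property
  • Value
  • Statement
• These create a triple

[Figure: a Statement drawn as a Resource linked by a Property to a Value (or another Resource). Example: a Resource with the Property Author and the Value “Paul”.]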
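A statement of this kind can be sketched in a few lines of Python. The class name `Statement`, the helper `to_ntriples` and the `example.org` URIs are all assumptions made for illustration; the output follows the general shape of the N-Triples syntax but omits details such as escaping of special characters in literals.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    """One RDF-style statement: a resource, a property and a value."""
    resource: str   # URI of the thing being described
    prop: str       # URI of the property
    value: str      # either another URI or a literal
    literal: bool   # True when `value` is a literal such as "Paul"


def to_ntriples(s: Statement) -> str:
    """Render a statement as one line in N-Triples style (no escaping)."""
    obj = f'"{s.value}"' if s.literal else f"<{s.value}>"
    return f"<{s.resource}> <{s.prop}> {obj} ."


# Hypothetical URIs: example.org is a placeholder namespace.
stmt = Statement("http://example.org/doc1", "http://example.org/author", "Paul", True)
print(to_ntriples(stmt))
# <http://example.org/doc1> <http://example.org/author> "Paul" .
```

When the value is itself a resource, setting `literal=False` writes it as a URI, which is how statements link together into a graph.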

OWL

• Based on Description Logics
• Is a set of Axioms
• Based on RDF
• Can be represented as a Directed Graph

THE SEMANTIC WEB

• The Web was “invented” by Tim Berners-Lee (amongst others), a physicist working at CERN
• His vision of the Web was much more ambitious than the reality of the existing (syntactic) Web:
  • “… a plan for achieving a set of connected applications for data on the Web in such a way as to form a consistent logical web of data …”
  • “… an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation …”
• This vision of the Web has become known as the Semantic Web

LINKED DATA

• A giant graph of data published on the Web
• Ontologies are used to define the data
• Links are made by annotating data with existing Ontologies
  • Concept / Property mapping is conducted
  • Mostly manual, but Ontology Matching techniques can be used
• This creates a graph of data on the web
• This graph is queryable using query languages, e.g.
  • SPARQL (RDF)
  • Description Logics Reasoning (OWL)
• Most Linked Data is published as RDF since some developers may find OWL too complex
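Querying a graph of triples amounts to matching patterns against it. The sketch below is not SPARQL and uses no RDF library; it only illustrates the idea behind a single triple pattern such as `?doc ex:author "Paul"`, with `None` standing in for a variable. The data, the `ex:` prefix and the function name `match` are hypothetical.

```python
# A tiny in-memory graph of (subject, predicate, object) triples.
TRIPLES = [
    ("ex:doc1", "ex:author", "Paul"),
    ("ex:doc2", "ex:author", "Paul"),
    ("ex:doc2", "ex:topic", "ex:Ontologies"),
]


def match(pattern, triples=TRIPLES):
    """Return the triples that fit a pattern; None matches anything."""
    return [
        triple
        for triple in triples
        if all(want is None or want == got for want, got in zip(pattern, triple))
    ]


# Which resources have "Paul" as their author?
for subject, _, _ in match((None, "ex:author", "Paul")):
    print(subject)
# ex:doc1
# ex:doc2
```

A real query language combines several such patterns and joins them on shared variables; this sketch handles one pattern at a time.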

SO HOW DO ONTOLOGIES HELP US ?

• They enable us to declare semantics in a way that can be represented in a computer.
• This enables relationships and definitions to be ‘discovered’ and ‘interpreted’ by services.
• They bring structure and understanding to complex data-information-knowledge.
• They thereby lay the foundation for business and scientific analytics and enable description of ‘Big Data’.
• This should help us build more interoperable, flexible and maintainable systems. But...
• Ontologies are complex to build and test, and can suffer from semantic heterogeneity and performance limits.

End.

CRISTAL

• A long-running research project (1997-present) between UWE, CERN and CNRS (France).
• It has developed data models and software using state-of-the-art technologies.
• It addresses the data management, workflow and process control needs of a distributed community of experimental physicists.
• Their requirements were initially vague, long-term, evolving and demanding (cost/size/response).
• It has yielded academic output and software that is being commercially exploited.


CRISTAL & BUSINESS PROCESS MANAGEMENT

• Business Process Management enables the modelling and management of independent (business) processes, e.g. for production lines and billing systems, i.e. workflow systems.
  • Normally based on static models which cannot easily cope with system evolution.
• CRISTAL provides a dynamically alterable data model which copes with design-to-production change.
  • A product and data management system that organises processes, allowing evolution of processes and domain models.
  • It captures items and their descriptions + metadata.
• CRISTAL enables traceability of individual items, workflows, descriptions, agents etc.

CRISTAL FUNCTIONALITIES

• Traceability of products and workflows over extended timescales, even as requirements evolve.
• Design to maintenance continuity leads to reduced product ‘times-to-market’ (full lifecycle coverage).
• Handles complexity of data management at the Terabyte level.
• Distributed control over multiple data gathering centres.
• Allows sharing of enterprise data across multiple user domains.


CRISTAL PHILOSOPHY

• Object-oriented design
• User involvement at every stage
• A ‘description-driven’ approach:
  • Separation of description from instantiation
  • Providing reusability of design patterns
  • Scalability and flexibility of data model
  • Able to evolve as requirements change!
• Led to an open design readily adaptable to different domains and attractive to industry.
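The idea of separating description from instantiation can be illustrated with a short Python sketch. This is not CRISTAL's API or data model: the class names `ItemDescription` and `Item`, the function `instantiate` and the example data are all hypothetical, chosen only to show how items can keep pointing at the description version they were created from while the description evolves.

```python
from dataclasses import dataclass, field


@dataclass
class ItemDescription:
    """Describes a kind of item: its name, a version and the fields it carries."""
    name: str
    version: int
    fields: tuple


@dataclass
class Item:
    """An instance that keeps a reference to the description it was created from."""
    description: ItemDescription
    values: dict = field(default_factory=dict)


def instantiate(description, **values):
    """Create an item, rejecting values the description does not declare."""
    unknown = set(values) - set(description.fields)
    if unknown:
        raise ValueError(f"fields not in description: {sorted(unknown)}")
    return Item(description, dict(values))


# Hypothetical example: the description evolves without touching existing items.
v1 = ItemDescription("Detector part", 1, ("serial",))
v2 = ItemDescription("Detector part", 2, ("serial", "batch"))

old_item = instantiate(v1, serial="A-001")
new_item = instantiate(v2, serial="A-002", batch="B7")
print(old_item.description.version, new_item.description.version)
# 1 2
```

Since each item records which description it came from, older items stay traceable after the model changes, which is the property the description-driven approach is meant to provide.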


CRISTAL APPLICATION DOMAINS

• Building Management (e.g. at Lyon Satolas)
• Production process management (e.g. CERN)
• Logistics control
• Business Process Management
• Bioinformatics data and process management
• Finances
• Public procurement
• Accounting