V.2.2
Heiner Oberkampf, PhD
November 7-8th 2017
EMMC Workshop on Interoperability in Materials Modelling, Cambridge
From Big Data to Big Analysis
The convergence of Formal Semantics & Data Science in Life Sciences




Slide 2

Understanding the 4V’s of Big Data

Normally the focus of Big Data Solutions

Performance is Critical to Success

Data Complexity is Increasing

Handling Uncertainty Requires Statistics

Majority of Big Data analytics approaches treat these two V’s

Semantic technologies provide clear advantages

Mathematical Clustering Techniques provide clear advantages


Slide 3

AT OSTHUS LAB DATA SCIENCE IS

BIG ANALYSIS

STATISTICAL
SEMANTICS
MACHINE LEARNING
REASONING


Slide 4

Laboratory Analytical Process

sample → analytical process → data


Slide 5

Typical Laboratory Data


Slide 6

Allotrope Structure 2017

• Astrix Technology Group
• BSSN Software
• Elemental Machines
• Erasmus MC
• Fraunhofer IPA
• The HDF Group
• LabAnswer
• LabWare
• Mettler Toledo
• NIST
• SciBite
• Stanford University
• University of Illinois at Chicago
• University of Southampton

More information: https://www.allotrope.org/


Slide 7

Allotrope Data Format (ADF)

HDF5: Platform Independent File Format
• Specifically designed to store and organize large amounts of scientific data.

Data Description: Semantic Graph Model
• Descriptive metadata about method, instrument, sample, process, result, etc.
• Provenance, audit trail
• Data Cube, Data Package

Data Cubes: Universal Data Container
• Analytical data represented by one- or multidimensional arrays of homogeneous data structures.

Data Package: Virtual File System
• Analytical data represented by arbitrary formats, incl. native instrument formats, images, PDF, video, etc.

APIs (Java & .NET class libraries)

Chromatogram 2D HDF
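
The idea of a data cube travelling together with its descriptive metadata can be pictured with a small sketch. This is not the ADF API (the slide names Java and .NET class libraries); the `DataCube` class, its field names, and the sample values are hypothetical and only illustrate a homogeneous two-dimensional array with named axes and attached metadata.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class DataCube:
    """Hypothetical stand-in for a 2D data cube: axis scales, measures, metadata."""
    dimensions: Dict[str, List[float]]   # axis name -> scale values, in axis order
    measures: List[List[float]]          # measures[i][j] for axis 0 index i, axis 1 index j
    description: Dict[str, str] = field(default_factory=dict)

    def shape(self) -> Tuple[int, ...]:
        # One entry per axis, taken from the length of each scale.
        return tuple(len(scale) for scale in self.dimensions.values())

    def is_consistent(self) -> bool:
        # This sketch only handles the two-dimensional case.
        if len(self.dimensions) != 2:
            return False
        rows, cols = self.shape()
        return len(self.measures) == rows and all(len(r) == cols for r in self.measures)


# Made-up 2D chromatogram: absorbance over retention time and wavelength.
chromatogram = DataCube(
    dimensions={
        "retention_time_s": [0.0, 0.5, 1.0],
        "wavelength_nm": [254.0, 280.0],
    },
    measures=[[0.01, 0.02], [0.35, 0.20], [0.03, 0.02]],
    description={"method": "HPLC-UV", "sample": "sample-001"},
)

# Retention time of the highest signal in the first wavelength channel.
times = chromatogram.dimensions["retention_time_s"]
apex_index = max(range(len(times)), key=lambda i: chromatogram.measures[i][0])
apex_time = times[apex_index]
```

Keeping the axis scales and the description next to the measures is the point of the sketch: the array alone would not say what its rows and columns mean.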


Slide 8

Ontology for HPLC Example

Figure labels: result, device, material, process


Slide 9

Allotrope Example: Semantics Provides Common Meaning

• Allotrope Data Format (ADF): instance data
• Allotrope Data Models (ADM): constraints
• Allotrope Foundation Ontologies (AFO): classes and properties (aligned with Basic Formal Ontology)

Relations shown: ADF instance data is structured by the ADM and is classified by the AFO; the AFO provide standardized vocabulary.


Slide 10

Semantic Spectrum of Knowledge Organization Systems

Sources

• Deborah L. McGuinness. "Ontologies Come of Age". In Dieter Fensel, Jim Hendler, Henry Lieberman, and Wolfgang Wahlster, editors, Spinning the Semantic Web: Bringing the World Wide Web to Its Full Potential. MIT Press, 2003.
• Michael Uschold and Michael Gruninger. "Ontologies and semantics for seamless connectivity". SIGMOD Rec. 33, 4 (December 2004), 58-64. DOI: http://dx.doi.org/10.1145/1041410.1041420
• Leo Obrst. "The Ontology Spectrum". Book section in Roberto Poli, Michael Healy, Achilles Kameas, Theory and Applications of Ontology: Computer Applications. Springer Netherlands, 17 Sep 2010.
• Leo Obrst and Mills Davis. "Semantic Wave 2008 Report: Industry Roadmap to Web 3.0 & Multibillion Dollar Market Opportunities". 2008.


Slide 11

Application and Reference Ontologies

Application Ontology
• Includes: Information/Data Model, Schema, Domain Ontology
• Role: Defines the important entities and their relationships for a specific application scenario.
• Realization: ontology, ER Model, UML etc.

Reference Ontology
• Also called: Canonical Reference Ontology, Reference Terminology, Domain Ontology, Foundational Ontology
• Role: Standard (structured) vocabulary to be used for placeholder classes of the data model
• Realization: list, thesaurus, taxonomy or ontology
• Domain models reusable in many different application scenarios
• Modules: Public ref. ontologies plus extension
• Mappings between ref. ontologies

Terminology Binding
• Interface between data model and ref. terminologies

Figure labels: Materials Models, Units, Qualities, Upper-Level Ontology; simulation, observation, subject, value, unit, physical property
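
Terminology binding, as described on this slide, can be sketched in a few lines. The example below is only an illustration under stated assumptions: the `Observation` data model, the `REFERENCE_TERMS` vocabularies, and identifiers such as `unit:Pascal` are all hypothetical and do not come from any published ontology.

```python
from dataclasses import dataclass
from typing import Dict, List, Set

# Hypothetical reference terminologies (identifiers are illustrative only).
REFERENCE_TERMS: Dict[str, Set[str]] = {
    "units": {"unit:Pascal", "unit:Kelvin"},
    "qualities": {"quality:YoungsModulus", "quality:Temperature"},
}

# Terminology binding: which reference terminology fills which
# placeholder slot of the data model.
BINDING: Dict[str, str] = {
    "quality": "qualities",
    "unit": "units",
}


@dataclass(frozen=True)
class Observation:
    """Hypothetical application data model with placeholder slots."""
    subject: str
    quality: str
    value: float
    unit: str


def unbound_slots(obs: Observation) -> List[str]:
    """Return the slots whose value is not a term of the bound terminology."""
    problems = []
    for slot, terminology in BINDING.items():
        if getattr(obs, slot) not in REFERENCE_TERMS[terminology]:
            problems.append(slot)
    return problems


ok = Observation("sample-001", "quality:YoungsModulus", 210e9, "unit:Pascal")
free_text = Observation("sample-002", "quality:Temperature", 293.15, "deg K")
```

Here `unbound_slots(ok)` is empty, while `unbound_slots(free_text)` reports the `unit` slot, because the free-text label is not a term from the bound reference terminology.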


Slide 12

Linked Materials Modelling Data

Lightweight Semantic Integration Layer
Make data Findable, Accessible, Interoperable, Reusable
(APIs, semantic indexing, data annotation, catalogs, metadata and linking)

Connected sources:
• Linked Open Data & Open APIs
• Semantic Graph DB (Knowledge Graph)
• Simulation and Material Models Repository
• Unstructured Documents
• …

Analytics: simulations, learning, reasoning

Visualization: dashboards, exploration, search, …

The FAIR Guiding Principles for scientific data management and stewardship: https://www.nature.com/articles/sdata201618
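
One way to picture the integration layer is a small set of subject-predicate-object statements that link resources from different sources through shared annotations. The sketch below is only an illustration in plain Python: the `GRAPH` contents, the `ex:`, `repo:` and `docs:` identifiers, and the helper names are hypothetical, and a real deployment would presumably use a semantic graph database rather than a list.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical annotations over resources held in different systems.
GRAPH: List[Triple] = [
    ("repo:sim-42", "rdf:type", "ex:SimulationResult"),
    ("repo:sim-42", "ex:about", "ex:Graphene"),
    ("docs:report-7", "rdf:type", "ex:Document"),
    ("docs:report-7", "ex:about", "ex:Graphene"),
    ("repo:sim-43", "rdf:type", "ex:SimulationResult"),
    ("repo:sim-43", "ex:about", "ex:Silicon"),
]


def match(graph: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def resources_about(graph: List[Triple], topic: str) -> List[str]:
    """Find every resource annotated with the topic, whatever its source."""
    return sorted(t[0] for t in match(graph, p="ex:about", o=topic))


graphene_resources = resources_about(GRAPH, "ex:Graphene")
```

The query returns both the simulation result and the document, which is the linking effect the slide aims at: a shared annotation makes resources findable across otherwise separate stores.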


Slide 13

Towards Big Analysis

1. Think from the end and put use-cases first.

2. Reduce the pain of data sharing and integration by using semantics and FAIR principles.

3. Combine logical and statistical approaches.
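
Recommendation 3 can be illustrated with a toy example: a logical step (subclass reasoning over a small class hierarchy) selects the relevant measurements, and a statistical step summarizes them. Everything in the sketch is hypothetical, including the `ex:` class names, the `SUBCLASS_OF` hierarchy, and the measurement values.

```python
from statistics import mean
from typing import Dict, List, Optional, Tuple

# Hypothetical class hierarchy: child class -> parent class.
SUBCLASS_OF: Dict[str, str] = {
    "ex:HPLC": "ex:Chromatography",
    "ex:GC": "ex:Chromatography",
    "ex:Chromatography": "ex:AnalyticalTechnique",
    "ex:NMR": "ex:AnalyticalTechnique",
}

# Made-up measurements: (technique class, purity in percent).
MEASUREMENTS: List[Tuple[str, float]] = [
    ("ex:HPLC", 98.5),
    ("ex:GC", 97.5),
    ("ex:NMR", 99.0),
]


def is_a(cls: str, ancestor: str) -> bool:
    """Logical step: follow subclass links upwards (transitive closure)."""
    current: Optional[str] = cls
    while current is not None:
        if current == ancestor:
            return True
        current = SUBCLASS_OF.get(current)
    return False


def mean_purity(ancestor: str) -> float:
    """Statistical step: average over everything the reasoner classified."""
    values = [v for technique, v in MEASUREMENTS if is_a(technique, ancestor)]
    return mean(values)


chromatography_mean = mean_purity("ex:Chromatography")
```

No measurement is tagged `ex:Chromatography` directly; the HPLC and GC values are found only because the hierarchy says they are kinds of chromatography, after which ordinary statistics apply.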


Slide 14

Heiner Oberkampf
Consultant at OSTHUS GmbH
+49 (0) [email protected]

CONNECTING DATA, PEOPLE AND ORGANIZATIONS