Page 1: What’s New in Oracle Database 12c Graph Database

Copyright © 2014 Oracle and/or its affiliates. All rights reserved.

What’s New in Oracle Database 12c Graph Database

Xavier Lopez, Ph.D., Senior Director
Zhe Wu, Ph.D., Architect

Page 2: What’s New in Oracle Database 12c Graph Database

Agenda

• Graph Database Strategy
• Customer Use Cases
• Oracle Spatial and Graph RDF Graph Features
• Future Plans

Page 3: What’s New in Oracle Database 12c Graph Database

Graph Database Strategy

Support graph data types on all enterprise platforms:
• Oracle Database
• Oracle NoSQL Database
• Oracle Big Data Appliance
• Oracle Cloud

Page 4: What’s New in Oracle Database 12c Graph Database

What Sets Us Apart?

• Scalability: Trillions of triples
• Transactional: Concurrent loading and updates with ACID properties
• Security: Oracle Label Security (OLS) labels at the “triple” level
• Standards based: W3C
• Manageable: Use existing DB tools, utilities and expertise
• Multi-type support: graph, relational, search, geospatial …
• Multi-platform: Relational database, NoSQL, Hadoop

Page 5: What’s New in Oracle Database 12c Graph Database

RDF Graph v. Property Graph

RDF Semantic Graphs
• Use case: Linked data, semantic metadata layer
• Analytics: Pattern matching, inferencing
• Analytics execution: In-database

Property Graphs
• Use case: Social network analysis
• Analytics: Clustering, centrality, page rank, path finding
• Analytics execution: In-memory, in-database

Page 6: What’s New in Oracle Database 12c Graph Database

RDF Semantic Graph feature of Oracle Spatial and Graph
For Oracle Database 12c

Page 7: What’s New in Oracle Database 12c Graph Database

Two Application Use Cases: Linked Data and Entity Analytics

• Unified metadata model for distributed data sources
• Flexible model for sparse and evolving data
• Validate semantic and structural consistency
• Find related content & relations by navigating connected entities
• “Reason” across entities
• SPARQL pattern matching: Detecting related entities across large, sparse, disparate collections of data
• Inferencing: Applying rules on asserted data

Page 8: What’s New in Oracle Database 12c Graph Database

Graph-based Metadata Layer

Linked Data in Support of Distributed Data
– W3C standard, flexible model for sparse and evolving data
– Common vocabulary enables data integration & app development
– Relational data stays in place, apps don’t need to change

[Diagram: Applications 1, 2 and 3 on a mid-tier server; a database server hosting the HR, Inventory and Sales databases and schemas; RDF, Inventory and Sales graphs with shared ontologies; access paths labeled SQL and SPARQL]

Page 9: What’s New in Oracle Database 12c Graph Database

Linked Data in Enterprise

[Diagram: a semantic graph model and index sit between the data sources / types (transaction systems, machine generated data, human sourced information, social media, subscription services) held on data servers (data warehouse, Hadoop appliance, event server) and the access & presentation layer (content mgmt, BI server)]

Page 11: What’s New in Oracle Database 12c Graph Database

Novartis Institutes for BioMedical Research (NIBR)

Business Challenge
• Link database information on genes, proteins, metabolic pathways, compounds, ligands, etc. to original sources.
• Increase productivity for accessing, sharing, searching, navigating, cross-linking, analyzing internal/external data

Solution
• Semantic integration layer on RDF graph
• Rich domain-specific terminology (biology, chemistry and medicine): 1.6 M terms
• Terminology Hub: 8 GB of referential data that cross-references between data repositories.

Page 12: What’s New in Oracle Database 12c Graph Database

RDF Semantic Graph-based Applications: Linked Data and Entity Analytics

• Unified metadata model for distributed data sources
• Flexible model for sparse and evolving data
• Validate semantic and structural consistency
• Find related content & relations by navigating connected entities
• “Reason” across entities
• SPARQL pattern matching: Detecting related entities across large, sparse, disparate collections of data
• Inferencing: Applying rules on asserted data

Page 13: What’s New in Oracle Database 12c Graph Database

Knowledge Management in Intelligence Domain

National Intelligence Scenario

• Data sources: contents repository, databases, web resources (blogs, mails, news, RSS feeds), financial data, telephone records, Internet traffic, enterprise data, spatial, documents, images
• Information extraction: feature extraction, term extraction
• Extracted entities & relationships
• RDF intelligence ontologies, accessed through SQL/SPARQL
• Search, presentation, report, visualization, query

[Example entity graph. Node labels: Person: Abduwali Abdukhadir Muse; Nationality: Somalian; Country: UK; Group: Al Shabab; Ideology: Islamist; Person: ?; Nationality: Pakistani; Country: Pakistan; Group: ?; Person: Chehab Abdouljamid Bouyaly; Country: Morocco; Group: al Qaeda. Edge labels: Currently resides, Member of, Supports, Has, Link ?]

Page 14: What’s New in Oracle Database 12c Graph Database

Oracle Spatial and Graph RDF Semantic Graph Features

Page 15: What’s New in Oracle Database 12c Graph Database

Oracle Database 12c RDF Semantic Graph Database

• Exadata ready
• Compression & partitioning
• Parallel load, inference, query
• High availability
• Label security: triple-level
• W3C standards compliance
• Semantic indexing of text
• Enterprise Manager

Load / Storage
• Native RDF graph data store
• Manages billions of triples
• Optimized storage architecture

Query
• SPARQL: Jena/Joseki, Sesame
• SQL/graph query, B-tree indexing
• Ontology-assisted SQL query

Reasoning
• RDFS, OWL 2 RL, EL, SKOS
• User-defined rules
• Incremental, parallel reasoning
• User-defined inferencing
• Plug-in architecture

Analytics
• Semantic indexing framework
• Integration with OBIEE, Oracle R Enterprise, Oracle Data Mining

Page 16: What’s New in Oracle Database 12c Graph Database

Support for Apache Jena and OpenRDF Sesame

Leverage existing investments in open source frameworks

Provides application developers with:
• Easy-to-use Java APIs to access Oracle databases and RDF files
• A standard-compliant SPARQL web service endpoint (Joseki, Fuseki)
• Data loading (RDF/XML, N-TRIPLES, N-QUADS, TriG, Turtle)
• JSON output
• Oracle-specific extensions for query execution control and management

Page 17: What’s New in Oracle Database 12c Graph Database

RDB to RDF Mapping

Relational to RDF Mapping

• RDF views on relational tables
• Enables SPARQL query on distributed resources
• Views: Automatic and custom
• Aligns with W3C RDB2RDF standard
• No duplication of data and storage
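As a rough illustration of what an automatic RDF view expresses, the sketch below turns relational rows into triples in the spirit of the W3C Direct Mapping: each row becomes a subject IRI and each non-NULL column becomes a predicate. This is plain Python, not Oracle's implementation; the base IRI, the `EMP` table, and the helper `rows_to_triples` are hypothetical names chosen for the example.

```python
from typing import Iterator

BASE = "http://example.com/"  # hypothetical base IRI, not from the slides


def rows_to_triples(table: str, pk: str, rows: list[dict]) -> Iterator[tuple[str, str, str]]:
    """Yield (subject, predicate, object) triples for each relational row."""
    for row in rows:
        # One subject IRI per row, built from the table name and primary key.
        subject = f"<{BASE}{table}/{pk}={row[pk]}>"
        yield (subject, "rdf:type", f"<{BASE}{table}>")
        for column, value in row.items():
            if value is None:
                continue  # a NULL column produces no triple
            yield (subject, f"<{BASE}{table}#{column}>", f'"{value}"')


# Hypothetical sample row; DEPTNO is NULL and is therefore skipped.
emp_rows = [{"EMPNO": 1, "ENAME": "Ann", "DEPTNO": None}]
triples = list(rows_to_triples("EMP", "EMPNO", emp_rows))
for t in triples:
    print(*t)
```

The sketch materializes the triples only to make them visible. An RDF view, as the slide notes, leaves the relational data in place without duplicating it.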

Page 18: What’s New in Oracle Database 12c Graph Database

Oracle Label Security Data Classification

• Fine-grained security through integration with Oracle Label Security
• Model-level security through GRANT/REVOKE privileges
• Oracle Label Security: mandatory access control
  • Labels assigned to both users and data
  • Data labels determine the sensitivity of the rows or the rights a person must possess in order to read or write the data.
  • User labels indicate their access rights to the data records.
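The idea of comparing user labels with data labels at the triple level can be sketched as follows. This is a deliberately simplified model, assuming a label is just a numeric sensitivity level plus a set of compartments (Oracle Label Security labels also carry groups, omitted here); the `Label` class, `can_read` function, and sample triples are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Label:
    level: int  # higher number = more sensitive
    compartments: frozenset = field(default_factory=frozenset)


def can_read(user: Label, data: Label) -> bool:
    """A user may read data when the user label dominates the data label."""
    return user.level >= data.level and data.compartments <= user.compartments


analyst = Label(level=2, compartments=frozenset({"FIN"}))

# Each triple carries its own data label (hypothetical sample data).
triple_labels = {
    ("<:acme>", "<:revenue>", '"10M"'): Label(2, frozenset({"FIN"})),
    ("<:acme>", "<:auditNote>", '"internal"'): Label(3, frozenset({"FIN"})),
    ("<:acme>", "<:locatedIn>", '"CA"'): Label(1),
}

# Only triples whose label the analyst dominates are visible to a query.
visible = [t for t, label in triple_labels.items() if can_read(analyst, label)]
```

Here the analyst sees the revenue and location triples but not the audit note, whose level exceeds the analyst's.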

Page 19: What’s New in Oracle Database 12c Graph Database

Core Inferencing Features

• Forward-chaining based inference engine in the database
• Native rulebases: RDFS, OWL 2 RL, OWL 2 EL, SKOS
• Validation of inferred data
• Proof generation
• User-defined inferencing
  - Temporal reasoning, spatial reasoning
• Ladder-based inference
  - Fine-grained security for inference graph
• Integration with external OWL 2 reasoners (TrOWL, Pellet)
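Forward chaining means applying rules to the asserted triples repeatedly until no new triples appear. The toy sketch below does this for two RDFS rules only (rdfs9 and rdfs11, covering `rdfs:subClassOf`); it is an in-memory illustration, not the database engine, and the class names and the `forward_chain` helper are hypothetical.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

Triple = tuple[str, str, str]


def forward_chain(asserted: set[Triple]) -> set[Triple]:
    """Return only the newly inferred triples (the entailment)."""
    known = set(asserted)
    while True:
        new = set()
        for (s, p, o) in known:
            for (s2, p2, o2) in known:
                if o != s2 or p2 != SUBCLASS:
                    continue
                if p == SUBCLASS:    # rdfs11: subClassOf is transitive
                    new.add((s, SUBCLASS, o2))
                elif p == RDF_TYPE:  # rdfs9: instances belong to superclasses
                    new.add((s, RDF_TYPE, o2))
        new -= known
        if not new:                  # fixpoint reached: nothing left to infer
            return known - asserted
        known |= new


asserted = {
    (":GradStudent", SUBCLASS, ":Student"),
    (":Student", SUBCLASS, ":Person"),
    (":alice", RDF_TYPE, ":GradStudent"),
}
inferred = forward_chain(asserted)
```

Keeping the inferred triples separate from the asserted ones mirrors the slide's distinction between asserted data and the inference graph.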

Page 20: What’s New in Oracle Database 12c Graph Database

RDF Semantic Graph: Graph Visualization & Modeling Support

• Graph visualization: Cytoscape
• Semantic modeling: Protégé

Page 21: What’s New in Oracle Database 12c Graph Database

Analyzing RDF with Oracle BI and Oracle Advanced Analytics

Oracle BI Oracle Advanced Analytics

Page 22: What’s New in Oracle Database 12c Graph Database

Oracle Partner Tools: IO Informatics

Page 23: What’s New in Oracle Database 12c Graph Database

Oracle Partner Tools: Tom Sawyer Social Network Analysis

Page 24: What’s New in Oracle Database 12c Graph Database

Manageability of RDF Semantic Graph
Built-in support from Oracle Database utilities and tools

Ingest / Replicate / Recover
• Bulk load:
  • Apache Jena bulk loader
  • Oracle external tables & SQL*Loader (Direct Path) w/ PL/SQL Bulk Load API
• Replicate & recover:
  • Data Guard: physical standby
  • Data Pump: staging tables
  • Recovery Manager: RMAN

Tune / Analyze
• Tune load / query / inference:
  • Parallelism
  • B-tree indexing triple/quad
  • Typed literals indexing
  • SPARQL query hints
  • Statistics gathering
  • Dynamic sampling
• Analyze performance:
  • Enterprise Manager: view optimizer plans, monitor execution / resource usage

Manage
• Control query execution:
  • In database & Jena client
• Create & monitor graph w/ SQL Developer:
  • Semantic network
  • Models, virtual models
  • B-tree indexes
  • Rule bases
  • Entailments
  • Security data labels
  • Semantic index policies

Page 25: What’s New in Oracle Database 12c Graph Database

Open Geospatial Consortium: GeoSPARQL Support

• Defines a vocabulary for spatial query patterns
  – Classes
    • Spatial Object, Feature, Geometry
  – Properties
    • Topological relations
    • Links between features and geometries
  – Datatypes for geometry literals
    • ogc:wktLiteral, ogc:gmlLiteral
• Query functions
  – Topological relations, distance, buffer, intersection, …

Page 26: What’s New in Oracle Database 12c Graph Database

RDF Graph for Oracle NoSQL

Graph Support on Oracle NoSQL DB
Brings horizontal scalability to RDF graph applications

• RDF Graph support in Oracle NoSQL Database Enterprise Edition
• High-performance key-value store
• SPARQL 1.1 access to graph data
• Jena & Joseki SPARQL Web Services
• Massive horizontal scalability
• Support for World Wide Web Consortium (W3C) Semantic Web standards

Page 27: What’s New in Oracle Database 12c Graph Database

RDF Graph for Oracle NoSQL

When to Consider a NoSQL Database for Graphs
Horizontal scalability, low query latency/cost, ease of install & management

• High volume, simple queries (low latency)
• Queries aggregating over most of the graph (e.g. what are the hobbies of the 100 most popular people in the network)
• Frequent, large-scale updates
• Large data centers

Page 28: What’s New in Oracle Database 12c Graph Database

Quick Steps to Get Started

Page 29: What’s New in Oracle Database 12c Graph Database

Quick Steps to Get Started

• Install Oracle Database 12c, or use a prebuilt VM from OTN
• Initialize
  - Create a tablespace 'ts'
  - Run as SYS in SQL*Plus: exec sem_apis.create_sem_network('ts')
  - Run as SYS in SQL*Plus (for 12.1.0.2 only): exec mdsys.enableGeoRaster;
• Configure the Joseki/Fuseki web service endpoint
  - SPARQL Query, SPARQL Update, REST APIs
• Using Java APIs
  - Load/Query/Inference through GraphOracleSem, DatasetGraphOracleSem, OracleBulkUpdateHandler, …
• Using SQL/PLSQL APIs
  - exec create_sem_model
  - insert/delete triples, bulk load, run SEM_MATCH, create_entailment, …

Page 32: What’s New in Oracle Database 12c Graph Database

Performance

Page 33: What’s New in Oracle Database 12c Graph Database

Oracle Spatial and Graph - LUBM 200K on 3-Node RAC X2-4
Load, Inference and Query Performance

48+ Billion edges graph

The LUBM 200K graph has 48+ billion triples (edges)
– Original graph has 26.6 billion unique triples (quads)
– Inference produced another 21.4 billion triples

Data Loading Performance
– Triples Loaded and Indexed Per Second (TLIPS): 273K

Inference Performance
– Triples Inferred and Indexed Per Second (TIIPS): 327K

SPARQL Query Performance
– Query Results Per Second (QRPS): 459K

Setup:
• Hardware: Sun Server X2-4, 3-node RAC
  - Each node configured with 1TB RAM, 4 CPU 2.4GHz 10-Core (Intel E7-4870)
  - Storage: Dual Node 7420, both heads configured as: Sun ZFS Storage 7420, 4 CPU 2.00GHz 8-Core (Intel E7-4820), 256G memory, 4x SSD SATA2 512G (READZ), 2x SATA 500G 10K. Four disk trays with 20 x 900GB disks @10Krpm, 4x SSD 73GB (WRITEZ)
• Software: Oracle Database 11.2.0.3.0, SGA_TARGET=750G and PGA_AGGREGATE_TARGET=200G
• Note: Only one node in this RAC was used for the performance test. Test performed in April 2013.

Page 34: What’s New in Oracle Database 12c Graph Database

Oracle Spatial and Graph – LUBM 4400K on Exadata X4-2
Load, Inference and Query Performance

1.08 Trillion edges graph

• Degrees of parallelism: 256*
• Data set: LUBM 4400K
• Load (B triples/hr): 605.4B / 115.2 hrs
• OWL inference (B triples/hr): 475.6+ B / 86 hrs 30 m
• Query (B answers/hr): 92.5B / 22.5 hrs

Data Loading Performance
– Triples Loaded and Indexed Per Second (TLIPS): 1.420M

Inference Performance
– Triples Inferred and Indexed Per Second (TIIPS): 1.527M

SPARQL Query Performance
– Query Results Per Second (QRPS): 1.130M

Setup:
• Exadata X4-2 high capacity full rack
• ZS3-2 with 2 controllers, 8 trays of disk
• Eight compute nodes of Exadata
• Oracle 12.1.0.1 DB, standard install of Exadata
• * A mix of DOP used: 296, 256, 192
• Open cursors = 1000
• Processes = 1000
• SGA = 132GB, PGA = 100GB
• 32K blocksize was given to all graph tablespaces
• TEMP group was created with 3 bigfile tablespaces
• Test performed in Aug/Sept 2014.
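The headline figures can be related to the table with simple arithmetic. The sketch below adds the loaded and inferred triple counts to get the total edge count and converts the inference volume and duration into a per-second rate. The helper `per_second` is a hypothetical name, and the inputs are the rounded values from the slide, so rates recomputed this way for the other rows may differ somewhat from the per-second values shown.

```python
def per_second(billions: float, hours: float) -> float:
    """Convert a volume in billions over a duration in hours to a per-second rate."""
    return billions * 1e9 / (hours * 3600)


loaded_b = 605.4    # billions of triples loaded
inferred_b = 475.6  # billions of triples inferred ("475.6+" on the slide)

total_b = loaded_b + inferred_b                # total edges, in billions
inference_rate = per_second(inferred_b, 86.5)  # 86 hrs 30 m

print(f"total edges: {total_b / 1000:.2f} trillion")
print(f"inference: {inference_rate / 1e6:.3f}M triples/s")
```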

Page 35: What’s New in Oracle Database 12c Graph Database

Best Practices in Solving Performance Issues

When there is an underperforming SQL statement in RDF data loading, inference, or query operations, check:
• Have you gathered statistics?
  • APIs: export_model_stats, export_entailment_stats, export_network_stats, import_model_stats, import_entailment_stats, import_network_stats
• Have you tried parallel execution?
  • Balanced hardware is key.
• Have you tried dynamic sampling? (Level 6, 8, 11)
• Is there a lack of indexes (including text index)?
  • DO NOT just add indexes without careful & thorough testing

Page 36: What’s New in Oracle Database 12c Graph Database

Best Practices in Solving Performance Issues (2)

When there is an underperforming SQL statement in RDF data loading, inference, or query operations, check:
• Have you looked at the plan?
• Is it possible to write the same query in a different way?
• Is it possible to simplify?
  • Simpler queries give a better chance of finding more efficient ways to execute
• Tweak the plan through hints
• Send a small, reproducible test case with the execution plan to Oracle Support or post it on the Forum

Page 37: What’s New in Oracle Database 12c Graph Database

Best Practices in Solving Performance Issues (3)

• Find the top thread(s) in the Java VM
• Are there excessive GC activities?
  • Try -XX:+UseParallelGC, -XX:+UseConcMarkSweepGC, …
• Has the heap size been set properly?
  • Try a larger heap size; analyze the heap by performing a heap dump
• Send a small, reproducible test case with the thread dump to Oracle Support or post it on the Forum

Page 38: What’s New in Oracle Database 12c Graph Database

Cool Ongoing Activities:

• Enable Oracle Cloud Services: Oracle Social Network
• Integration with Oracle business applications and middleware
• Ongoing support for RDF Graph on all major platforms
  • Relational Database
  • NoSQL Database
  • Big Data (Hadoop)
  • Cloud

Page 40: What’s New in Oracle Database 12c Graph Database

Appendix

Page 41: What’s New in Oracle Database 12c Graph Database

W3C Semantic Technology Stack

http://www.w3.org/2007/03/layerCake.svg

• Core Technologies
  • URI: Uniform Resource Identifier
  • RDF: Resource Description Framework
  • RDFS: RDF Schema
  • OWL: Web Ontology Language

Page 42: What’s New in Oracle Database 12c Graph Database

What is RDF

• A graph data model for web resources and their relationships
• The graph can be serialized into RDF/XML, N3, N-TRIPLE, …
• Construction unit: Triple (or assertion, or fact), made up of Subject, Predicate, Object

<http://foobar> <:produces> <:mp3>

• Quads (named graphs) add context, provenance, identification, etc. to assertions

<http://foobar> <:produces> <:mp3> <:ProductGraph>

[Example graph. Nodes: http://www.foobar.com, “CA”, http://www.foobar.com/products/mp3, http://www.oracle.com, http://www.oracle.com/products/RDF. Edges: http://…/locatedIn, http://…/produce, http://…/customerOf, http://…/uses]
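A minimal sketch of the triple and quad model, assuming a plain in-memory list rather than any real RDF store: a quad is a triple plus an optional graph name, and a pattern match treats unspecified positions as wildcards. The `Quad` type, the `match` helper, and the `<:locatedIn>` predicate are hypothetical, loosely adapted from the slide's example.

```python
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    graph: Optional[str] = None  # None = plain triple in the default graph


def match(quads, s=None, p=None, o=None, g=None):
    """Return the quads matching a pattern; None acts as a wildcard."""
    return [q for q in quads
            if (s is None or q.subject == s)
            and (p is None or q.predicate == p)
            and (o is None or q.obj == o)
            and (g is None or q.graph == g)]


store = [
    Quad("<http://foobar>", "<:produces>", "<:mp3>"),
    Quad("<http://foobar>", "<:produces>", "<:mp3>", "<:ProductGraph>"),
    Quad("<http://foobar>", "<:locatedIn>", '"CA"'),
]

# Everything <http://foobar> produces, in any graph.
products = match(store, s="<http://foobar>", p="<:produces>")

# Only the assertions recorded in the named graph.
in_product_graph = match(store, g="<:ProductGraph>")
```

The same assertion appears twice in `products` because the named-graph copy carries extra context, which is the point of quads.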