
Oracle Database Semantic Technologies: An Overview of Core and Enterprise Functionality Feb 2012 Souri Das, Ph.D. Architect, Oracle


Outline

• Fundamentals
  – Semantic Technologies in a nutshell
  – Source for RDF data
  – OWL Inferencing Primer
• Overview
  – Architecture
  – Core Functionality: Load, Infer, Query
  – Enterprise Functionality
  – Tools: OBIEE, Visualization
• Detailed look
  – Storage
  – Installation and configuration
  – Loading
  – Querying in SQL & SPARQL
  – Inferencing
  – Semantic Indexing of Unstructured Content
  – Security


Semantic Technologies in a nutshell

• Data (expressed in RDF)
  – <entity, attr, value> triples (similar to unpivoted tables)
  – An entity is typically associated with a type (rdf:type <employee>)
  – May be extended to quads: a grouping of triples
  – BENEFIT: uniform structure => allows syntactic integration
• Schema / Ontology (expressed in OWL)
  – Extended with class and property hierarchies
  – Many other such "rules"
  – BENEFIT => allows discovery of implicit knowledge
  – BENEFIT => allows semantic integration
• Query (expressed in SPARQL)
  – BENEFIT => suits the triple- or quad-based structure of RDF
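
The "unpivoted table" analogy can be sketched in a few lines of plain Java. This is only an illustration: the class name Unpivot, the ":emp1" identifier and the column names are hypothetical and do not come from the slides. Each triple is held as a three-element list (subject, predicate, object).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of unpivoting one relational row into triples.
class Unpivot {
    // Emits one rdf:type triple for the entity plus one triple per non-null column.
    static List<List<String>> unpivot(String entity, String type, Map<String, String> row) {
        List<List<String>> triples = new ArrayList<>();
        triples.add(List.of(entity, "rdf:type", type));
        for (Map.Entry<String, String> col : row.entrySet()) {
            if (col.getValue() != null) { // a NULL column simply yields no triple
                triples.add(List.of(entity, ":" + col.getKey(), col.getValue()));
            }
        }
        return triples;
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("name", "John");
        row.put("dept", "Sales");
        row.put("manager", null);
        unpivot(":emp1", ":employee", row).forEach(System.out::println);
    }
}
```

Because every row, whatever its table, becomes the same three-part shape, data from differently structured sources can be stored side by side, which is the "syntactic integration" benefit named above.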


Source for RDF data

• Native RDF
  – Example: Social network
• Converted to RDF
  – Example: Tables (via unpivoting)
  – Example: XML
  – Example: Text (via NLP extraction)
  – Example: Spatial (ogc:WKTLiteral)
  – Example: Multimedia (via extraction)
• Viewed as RDF
  – Example: Tables


OWL Inferencing: A short Primer

rdfs:subClassOf

rdfs:subPropertyOf

rdfs:domain

rdfs:range

owl:FunctionalProperty

owl:InverseFunctionalProperty

owl:SymmetricProperty

owl:TransitiveProperty

owl:inverseOf

owl:someValuesFrom

owl:allValuesFrom

owl:hasValue

owl:sameAs

owl:differentFrom

owl:equivalentClass

owl:equivalentProperty

owl:disjointWith

owl:complementOf


Inference: Examples

:hasMother rdf:type owl:FunctionalProperty

:John :hasMother :Mary
:John :hasMother :Maria
=>
:Mary owl:sameAs :Maria
:Maria owl:sameAs :Mary


Inference: Examples

:hasSSN rdf:type owl:InverseFunctionalProperty

:John :hasSSN 123-45-6789
:Johny :hasSSN 123-45-6789
=>
:John owl:sameAs :Johny
:Johny owl:sameAs :John


Inference: Examples

:hasSibling rdf:type owl:SymmetricProperty

:John :hasSibling :Mary
=>
:Mary :hasSibling :John


Inference: Examples

:hasAncestor rdf:type owl:TransitiveProperty

:John :hasAncestor :Mary
:Mary :hasAncestor :Tom
=>
:John :hasAncestor :Tom


Inference: Examples

:hasParent owl:inverseOf :hasChild

:John :hasParent :Mary
=>
:Mary :hasChild :John
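
The symmetric, transitive and inverseOf examples can be reproduced with a naive fixpoint loop. The sketch below is plain Java written for illustration only; it is not how Oracle's engine is implemented, the class and parameter names are hypothetical, and it handles a single inverse pair for brevity.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only: a naive fixpoint loop over three property rules.
class PropertyRules {
    // Applies the symmetric, transitive and inverseOf rules until no new triple appears.
    // Each triple is a 3-element list: subject, predicate, object.
    static Set<List<String>> closure(Set<List<String>> asserted,
                                     Set<String> symmetric,
                                     Set<String> transitive,
                                     String prop, String inverse) {
        Set<List<String>> all = new HashSet<>(asserted);
        boolean changed = true;
        while (changed) {
            Set<List<String>> derived = new HashSet<>();
            for (List<String> t : all) {
                String s = t.get(0), p = t.get(1), o = t.get(2);
                if (symmetric.contains(p)) derived.add(List.of(o, p, s));
                if (p.equals(prop)) derived.add(List.of(o, inverse, s));
                if (p.equals(inverse)) derived.add(List.of(o, prop, s));
                if (transitive.contains(p)) {
                    for (List<String> u : all) {
                        if (u.get(1).equals(p) && u.get(0).equals(o)) {
                            derived.add(List.of(s, p, u.get(2)));
                        }
                    }
                }
            }
            changed = all.addAll(derived); // true only if something new was added
        }
        return all;
    }

    public static void main(String[] args) {
        Set<List<String>> asserted = Set.of(
            List.of(":John", ":hasSibling", ":Mary"),
            List.of(":John", ":hasAncestor", ":Mary"),
            List.of(":Mary", ":hasAncestor", ":Tom"),
            List.of(":John", ":hasParent", ":Mary"));
        Set<List<String>> all = closure(asserted,
            Set.of(":hasSibling"), Set.of(":hasAncestor"), ":hasParent", ":hasChild");
        all.removeAll(asserted); // keep only the inferred triples
        all.forEach(System.out::println);
    }
}
```

With the four asserted triples in main, the loop adds exactly the three conclusions shown on the slides: :Mary :hasSibling :John, :John :hasAncestor :Tom and :Mary :hasChild :John.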


Inference: Examples

:Male owl:disjointWith :Female

:John rdf:type :Male
:Mary rdf:type :Female
=>
:John owl:differentFrom :Mary
:Mary owl:differentFrom :John


Inference: Examples

:NonHuman owl:complementOf :Human

:Fish rdfs:subClassOf :NonHuman
=>
:Fish owl:disjointWith :Human
:Human owl:disjointWith :Fish


Inference: Examples

:Player owl:equivalentClass _:c1
_:c1 owl:onProperty :participateIn
_:c1 owl:someValuesFrom :Sports
_:c1 rdf:type owl:Restriction

:Soccer rdf:type :Sports
:John :participateIn :Soccer
=>
:John rdf:type :Player


Inference: Examples

:Vegetarian rdfs:subClassOf _:c1
_:c1 owl:onProperty :eat
_:c1 owl:allValuesFrom :Vegetable
_:c1 rdf:type owl:Restriction

:John rdf:type :Vegetarian
:John :eat :bean
=>
:bean rdf:type :Vegetable


Inference: Examples

:EligibleToBePresident owl:equivalentClass _:c1
_:c1 owl:onProperty :countryOfBirth
_:c1 owl:hasValue :USA
_:c1 rdf:type owl:Restriction

:John :countryOfBirth :USA
=>
:John rdf:type :EligibleToBePresident

:Tom rdf:type :EligibleToBePresident
=>
:Tom :countryOfBirth :USA


THE FOLLOWING IS INTENDED TO OUTLINE OUR GENERAL PRODUCT DIRECTION. IT IS INTENDED FOR INFORMATION PURPOSES ONLY, AND MAY NOT BE INCORPORATED INTO ANY CONTRACT. IT IS NOT A COMMITMENT TO DELIVER ANY MATERIAL, CODE, OR FUNCTIONALITY, AND SHOULD NOT BE RELIED UPON IN MAKING PURCHASING DECISION. THE DEVELOPMENT, RELEASE, AND TIMING OF ANY FEATURES OR FUNCTIONALITY DESCRIBED FOR ORACLE'S PRODUCTS REMAINS AT THE SOLE DISCRETION OF ORACLE.


Importance of W3C & OGC Semantic Standards

• Key W3C Web Semantic Activities:
  – W3C RDF Working Group
  – W3C SPARQL Working Group
  – W3C RDB2RDF Working Group (Editors of R2RML)
  – W3C OWL Working Group
  – W3C Semantic Web Education & Outreach (SWEO)
  – W3C Health Care & Life Sciences Interest Group (HCLS)
  – W3C Multimedia Semantics Incubator Group
  – W3C Semantic Web Rules Language (SWRL)
• OGC GeoSPARQL Standard Working Group (Tech. Editor)


Architectural Overview

[Figure: architecture diagram. Recoverable labels:]
• Data in Oracle DB: Enterprise (relational) data; RDF/OWL data and ontologies; Rulebases: OWL, RDF/S, user-defined; Inferred RDF/OWL data; Semantic indexes
• Core functionality: LOAD (bulk load, incr. DML); INFER (RDF/S, user-def., OWL subsets); QUERY (SPARQL in SQL)
• Query paths: Query RDF/OWL data and ontologies; Ontology-assisted query of enterprise data
• Security: Oracle Label Security
• Programming interface: SQL interface (SQL*Plus, PL/SQL, SQLdev.); Java API support (SPARQL: Jena / Sesame; JDBC; Java programs); SPARQL endpoints (Joseki / Sesame)
• 3rd-party callouts: Reasoners (Pellet); NLP extractors
• Tools: OBIEE via SPARQL Gateway; Cytoscape-based visualizer


Semantic Store: Core Entities

[Figure: Oracle Database containing the Semantic Network (MDSYS) with Models (M1, M2, …, Mn), Values, Triples, Rulebases & Vocabularies (OWL subset, RDF / RDFS, Rulebase m, …) and Entailments (X1, X2, …, Xp); Application Tables (A1, A2, …, An) live in user schemas such as scott and herman]

• Sem. Network: dictionary and data tables. New entities: models, rulebases, and rules indexes (entailments). OWL and RDFS rulebases are preloaded.
• Model: a model holds an RDF graph (set of S-P-O triples).
• Application Table: contains a column of object type sdo_rdf_triple_s, associated with an RDF model, to allow DML on the RDF model.
• Rulebase: a built-in or user-defined rulebase is a set of rules used for inferencing.
• Entailments: an entailment stores triples derived via inferencing.


Mapping Core Entities to DB objects

• Each core entity is mapped to a view object in the database:

    Sem. Store entity type        Database view name
    Model m                       mdsys.RDFM_m
    Rulebase rb                   mdsys.RDFR_rb
    Rules Index (entailment) x    mdsys.RDFI_x
    Virtual Model vm              mdsys.SEMV_vm (allows duplicates)
                                  mdsys.SEMU_vm (unique)

• SELECT privilege for a core entity is directly related to SELECT privilege for the corresponding view object.


Core Functionality: Load / Query / Inference

[Figure: the semantic store diagram again: Semantic Network (MDSYS) with Models, Values, Triples, Rulebases & Vocabularies and Entailments, plus Application Tables in user schemas]

• Load
  – Bulk load
  – Incremental load
• Query and DML
  – SPARQL (from Java/endpoint)
• Inference
  – Native support for OWL 2 RL, SNOMED (OWL 2 EL subset), OWLPrime, SKOSCORE, etc.
  – Named Graph (Local/Global) Inference
  – User-defined rules


Enterprise Functionality: SQL / Sem. Indexing / Security

• SPARQL query (embedding) in SQL
  – Allows joining SPARQL results with relational data
  – Allows use of rich SQL operators (such as aggregates)
• Semantic indexing
  – Index consists of RDF triples extracted from documents stored (directly or indirectly) in a table column
  – Extraction done by one or more 3rd-party information extractors
• Security: Fine-Grained Access Control (for each triple)
  – Uses Oracle Label Security (OLS)
  – Each RDF triple has an associated sensitivity label
• Querying Text and Spatial data using SPARQL


Tools: OBIEE for RDF data (using SPARQL Gateway)

Easy integration of RDF data with Business Intelligence (OBIEE) through SPARQL Gateway


Tools: Visualization using Cytoscape and SEM_ANALYSIS

• Detail Graph
  – Whole graph or a subgraph
• Summary Graph
  – Static Summaries
    • Representative Instance
    • Particular Instance
  – Dynamic Summaries
    • Summary for SPARQL-pattern based dynamic subgraph
• (Summary-Detail) Hybrid Graph


Example 1: Detail Graph


Example 2: Detail SubGraphs


Example 3: Representative & Particular Instance Summary Graphs

Note: The right graph was manually re-arranged to simplify comparison.


Example 4: Dynamic Subset-based Summary


Example 5: Particular Instance, also an example of a hybrid (detail-summary) graph

[Figure legend: Detail Edge; Aggregate Edge (can be expanded using the expand property); Absent Edge]


Installation and Configuration of

Oracle Database Semantic Technologies


Installation and Configuration (1)

• Load the PL/SQL packages and jar file
  – cd $ORACLE_HOME/md/admin
  – Login as sysdba
  – SQL> @catsem
• Create a tablespace for the semantic network (customize as needed)

    create bigfile tablespace semts
      datafile '?/dbs/semts01.dat' size 512M reuse
      autoextend on next 512M maxsize unlimited
      extent management local
      segment space management auto;


Installation and Configuration (2)

• Create a temporary tablespace

    create bigfile temporary tablespace semtmpts
      tempfile '?/dbs/semtmpts.dat' size 512M reuse
      autoextend on next 512M maxsize unlimited
      EXTENT MANAGEMENT LOCAL;
    ALTER DATABASE DEFAULT TEMPORARY TABLESPACE semtmpts;

• Create an undo tablespace

    CREATE bigfile UNDO TABLESPACE semundots
      DATAFILE '?/dbs/semundots.dat' SIZE 512M REUSE
      AUTOEXTEND ON next 512M maxsize unlimited
      EXTENT MANAGEMENT LOCAL;
    ALTER SYSTEM SET UNDO_TABLESPACE=semundots;


Installation and Configuration (3)

• Create a semantic network
  – As sysdba
  – SQL> exec sem_apis.create_sem_network('semts');
• Create a semantic model
  – As scott (or other)
  – SQL> create table test_tpl (triple sdo_rdf_triple_s) compress;
  – SQL> exec sem_apis.create_sem_model('test','test_tpl','triple');


Loading RDF triples


Loading Semantic Data

• Incremental DMLs (small number of changes): recommended loading method for a very small number of triples
  – SQL: Insert
  – SQL: Delete
  – Java API (Jena): GraphOracleSem.add, delete
  – Java API (Sesame): OracleSailConnection.addStatement, removeStatements
• Bulk load (adding many triples): recommended loading method for a very large number of triples
  – PL/SQL: sem_apis.bulk_load_from_staging_table(…)
    • Staging table may be populated using SQL*Loader or External Table
  – Java API (Jena)
    • OracleBulkUpdateHandler.addInBulk, prepareBulk, completeBulk
  – Java API (Sesame)
    • OracleBulkUpdateHandler.addInBulk, prepareBulk, completeBulk


Bulk load: Load Data into Staging Table

• Create a staging table

    CREATE TABLE STAGE_TABLE (
      RDF$STC_sub   varchar2(4000) not null,
      RDF$STC_pred  varchar2(4000) not null,
      RDF$STC_obj   varchar2(4000) not null,
      RDF$STC_graph varchar2(4000)
    ) compress;
    -- RDF$STC_graph column is required if loading N-Quads

• Grant appropriate privileges to MDSYS

    GRANT SELECT, INSERT on STAGE_TABLE to MDSYS;
    -- INSERT privilege is required if using External Table (see below)

• Two ways for loading from file(s)
  – Using External Table (for N-Triple or N-Quad format)
  – Using SQL*Loader (for N-Triple format only)


Load Data into Staging Table: using External Table

• Create an External Table and associate it with the data files

    BEGIN
      sem_apis.create_source_external_table(
         source_table  => 'stage_table_source'
        ,def_directory => 'DATA_DIR'
        ,bad_file      => 'CLOBrows.bad'
      );
    END;
    /
    grant SELECT on "stage_table_source" to MDSYS;
    alter table "stage_table_source" location ('demo_datafile.nt');

• Load content of External Table into the Staging Table

    BEGIN
      sem_apis.load_into_staging_table(
         staging_table => 'STAGE_TABLE'
        ,source_table  => 'stage_table_source'
        ,input_format  => 'N-QUAD');
    END;
    /

• For large loads, consider parallel loading
  – Distribute the input data into multiple files (associate with a single External Table)
  – Use parallel=><n> when invoking sem_apis.load_into_staging_table


Load Data into Staging Table: using SQL*Loader

• Create a control file for SQL*Loader

    UNRECOVERABLE
    LOAD DATA
    APPEND
    into table stage_table
    when (1) <> '#'
    (
      RDF$STC_sub   CHAR(4000) terminated by whitespace,
      RDF$STC_pred  CHAR(4000) terminated by whitespace,
      RDF$STC_obj   CHAR(5000)
        "rtrim(:RDF$STC_obj,'. '||CHR(9)||CHR(10)||CHR(13))"
    )

• Invoke SQL*Loader
  – sqlldr userid=<DBuser>/<passwd> control=<control_file_name> data=<data_file_name> direct=true
• For large loads, consider parallel loading
  – Distribute the input data into multiple files and invoke sqlldr from multiple sessions
  – sqlldr userid=<DBuser>/<passwd> control=<control_file_name> data=<data_file_name> direct=true parallel=true &


Loading RDF model from Staging Table

• Once the Staging Table has been loaded, issue the following call

    BEGIN
      sem_apis.bulk_load_from_staging_table(
         model_name  => 'my_rdf_model'
        ,table_owner => 'SCOTT'
        ,table_name  => 'STAGE_TABLE'
        ,flags       => 'PARSE');
    END;
    /

• For parallel loading of large data, consider additional attributes in the flags parameter
  – PARALLEL=<n>
  – MBV_METHOD=SHADOW


Load Data into Staging Table using prepareBulk

• When you have many RDF/XML, N3, TriX or TriG files

    OracleSailConnection osc = oracleSailStore.getConnection();
    store.disableAllAppTabIndexes();
    for (int idx = 0; idx < szAllFiles.length; idx++) {
      …
      osc.getBulkUpdateHandler().prepareBulk(
          fis,
          "http://abc",        // baseURI
          RDFFormat.NTRIPLES,  // dataFormat
          "SEMTS",             // tablespaceName
          null,                // flags
          null,                // register a StatusListener
          "STAGE_TABLE",       // table name
          (Resource[]) null    // Resource... for contexts
      );
      osc.commit();
      fis.close();
    }

  Can start multiple threads and load files in parallel.

• The latest Jena Adapter has prepareBulk and completeBulk APIs


More Data Loading Choices (1)

• Use External Table to load data into Staging Table

    CREATE TABLE stable_ext(
      RDF$STC_sub  varchar2(4000),
      RDF$STC_pred varchar2(4000),
      RDF$STC_obj  varchar2(4000))
    ORGANIZATION EXTERNAL (
      TYPE ORACLE_LOADER
      DEFAULT DIRECTORY tmp_dir
      ACCESS PARAMETERS(
        RECORDS DELIMITED by NEWLINE
        PREPROCESSOR bin_dir:'uncompress.sh'
        FIELDS TERMINATED BY ' ' )
      LOCATION ('data1.nt.gz','data2.nt.gz',…,'data_4.nt.gz')
    )
    REJECT LIMIT UNLIMITED;

  Using multiple files is critical to performance.


More Data Loading Choices (2)

• Load directly using Jena Adapter

    Oracle oracle = new Oracle(szJdbcURL, szUser, szPasswd);
    Model model = ModelOracleSem.createOracleSemModel(oracle, szModelName);
    InputStream in = FileManager.get().open("./univ.owl");
    model.read(in, null);

• More loading examples using Jena Adapter
  – Examples 7-2, 7-3, and 7-12 (SPARUL) [1]
• Loading RDFa
  – graphOracleSem.getBulkUpdateHandler().prepareBulk( rdfaUrl, … )

[1]: Oracle® Database Semantic Technologies Developer's Guide http://download.oracle.com/docs/cd/E11882_01/appdev.112/e11828/toc.htm


More Data Loading Choices (3)

• Load directly using Sesame Adapter

    OraclePool op = new OraclePool(
        OraclePool.getOracleDataSource(jdbcUrl, user, password));
    OracleSailStore store = new OracleSailStore(op, model);
    SailRepository sr = new SailRepository(store);
    RepositoryConnection repConn = sr.getConnection();
    repConn.setAutoCommit(false);
    repConn.add(new File(trigFile), "http://my.com/", RDFFormat.TRIG);
    repConn.commit();

• More loading examples using Sesame Adapter
  – Examples 8-5, 8-7, 8-8, 8-9, and 8-10 [1]

[1]: Oracle® Database Semantic Technologies Developer's Guide http://download.oracle.com/docs/cd/E11882_01/appdev.112/e11828/toc.htm


Utility APIs

• SEM_APIS.remove_duplicates
  – e.g. exec sem_apis.remove_duplicates('graph_model');
• SEM_APIS.merge_models
  – Can be used to clone a model as well.
  – e.g. exec sem_apis.merge_models('model1','model2');
• SEM_APIS.swap_names
  – e.g. exec sem_apis.swap_names('production_model','prototype_model');
• SEM_APIS.alter_model (entailment)
  – e.g. sem_apis.alter_model('m1', 'MOVE', 'TBS_SLOWER');
• SEM_APIS.rename_model/rename_entailment


Inference


Core Inference Features

• Inference done using forward chaining
  – Triples inferred and stored ahead of query time
  – Removes on-the-fly reasoning and results in fast query times
• Various native rulebases provided
  – E.g., RDFS, OWL 2 RL, SNOMED (EL+), SKOS
• Validation of inferred data
• User-defined rules
• Proof generation
  – Shows one deduction path


OWL Subsets Supported

• OWL subsets for different applications
  – RDFS++
    • RDFS plus owl:sameAs and owl:InverseFunctionalProperty
  – OWLSIF (OWL with IF semantics)
    • Based on Dr. Horst's pD* vocabulary¹
  – OWLPrime
    • Includes RDFS++, OWLSIF with additional rules
    • Jointly determined with domain experts, customers and partners
  – OWL 2 RL
    • W3C Standard
    • Adds rules about keys, property chains, unions and intersections to OWLPrime
  – SNOMED
• Choice of rulebases
  – If the ontology is in EL, choose the SNOMED component
  – If OWL 2 features (chains, keys) are not used, choose OWLPrime
  – Choose OWL2RL otherwise

¹ Completeness, decidability and complexity of entailment for RDF Schema and a semantic extension involving the OWL vocabulary


11g Release 2 Inference Features

• Richer semantics support
  – OWL 2 RL, SKOS, SNOMED (subset of OWL 2 EL)
• Performance enhancements
  – Large scale owl:sameAs handling
    • Compact materialization of owl:sameAs closure
  – Parallel inference
    • Leverage native Oracle parallel query and parallel DML
  – Incremental inference
    • Efficient updates of inferred graph through additions
  – Compact Data Structures


Semantics Characterized by Entailment Rules

• RDFS has 14 entailment rules defined in the spec.
  – E.g. rule:

        p rdfs:domain x .
        u p y .
        =>
        u rdf:type x .

• OWL 2 RL has 70+ entailment rules.
  – E.g. rules:

        p rdf:type owl:FunctionalProperty .
        x p y1 .
        x p y2 .
        =>
        y1 owl:sameAs y2 .

        x owl:disjointWith y .
        a rdf:type x .
        b rdf:type y .
        =>
        a owl:differentFrom b .

• These rules have efficient implementations in RDBMS
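
One way to see why such rules map well onto a relational engine is to write each one as a self-join over a single set of triples. The following plain-Java sketch does that for the rdfs:domain and owl:FunctionalProperty rules; the class and method names are hypothetical and the nested loops merely stand in for the joins a database would perform.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch: each entailment rule expressed as a self-join over one triple set.
class EntailmentRules {
    // p rdfs:domain x . u p y .  =>  u rdf:type x .
    static Set<List<String>> domainRule(Set<List<String>> triples) {
        Set<List<String>> out = new HashSet<>();
        for (List<String> schema : triples) {
            if (!schema.get(1).equals("rdfs:domain")) continue;
            for (List<String> data : triples) {
                if (data.get(1).equals(schema.get(0))) { // join: data predicate = schema subject
                    out.add(List.of(data.get(0), "rdf:type", schema.get(2)));
                }
            }
        }
        return out;
    }

    // p rdf:type owl:FunctionalProperty . x p y1 . x p y2 .  =>  y1 owl:sameAs y2 .
    // Trivial pairs with y1 equal to y2 are skipped in this sketch.
    static Set<List<String>> functionalRule(Set<List<String>> triples) {
        Set<List<String>> out = new HashSet<>();
        for (List<String> decl : triples) {
            if (!decl.get(1).equals("rdf:type")
                    || !decl.get(2).equals("owl:FunctionalProperty")) continue;
            String p = decl.get(0);
            for (List<String> a : triples) {
                for (List<String> b : triples) {
                    if (a.get(1).equals(p) && b.get(1).equals(p)
                            && a.get(0).equals(b.get(0)) && !a.get(2).equals(b.get(2))) {
                        out.add(List.of(a.get(2), "owl:sameAs", b.get(2)));
                    }
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Set<List<String>> triples = Set.of(
            List.of(":hasMother", "rdfs:domain", ":Person"),
            List.of(":hasMother", "rdf:type", "owl:FunctionalProperty"),
            List.of(":John", ":hasMother", ":Mary"),
            List.of(":John", ":hasMother", ":Maria"));
        System.out.println(domainRule(triples));
        System.out.println(functionalRule(triples));
    }
}
```

The :Person class and the rdfs:domain statement in main are assumptions added for the demonstration; the :hasMother triples reuse the earlier functional-property example.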


Inference APIs

• SEM_APIS.CREATE_ENTAILMENT (recommended API for inference)

    SEM_APIS.CREATE_ENTAILMENT(
      index_name,
      sem_models('GraphTBox', 'GraphABox', …),
      sem_rulebases('OWL2RL'),
      passes,
      inf_components,
      options)

  – Use "PROOF=T" to generate inference proof
  – Typical usage:
    • First load RDF/OWL data
    • Call create_entailment to generate inferred graph
    • Query both original graph and inferred data
  – Inferred graph contains only new triples! Saves time & resources

• SEM_APIS.VALIDATE_ENTAILMENT

    SEM_APIS.VALIDATE_ENTAILMENT(
      sem_models('GraphTBox', 'GraphABox', …),
      sem_rulebases('OWLPrime'),
      criteria,
      max_conflicts,
      options)

  – Typical usage:
    • First load RDF/OWL data
    • Call create_entailment to generate inferred graph
    • Call validate_entailment to find inconsistencies

• Jena Adapter API: GraphOracleSem.performInference()


Extending Semantics Supported by 11.2 OWL Inference

• Option 1: add user-defined rules
  – Both 10g and 11g RDF/OWL support user-defined rules in this form:

        Antecedents:  ?x :parentOf ?y . ?z :brotherOf ?x .
        Consequents:  ?z :uncleOf ?y

  – Filter expressions are allowed:

        ?x :hasAge ?age . ?age > 18  =>  ?x :type :Adult .
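
To make the antecedent/consequent form concrete, here is a small plain-Java sketch of the two rules above. It is an illustration under stated assumptions, not the Oracle rule syntax: the class name UserRules is hypothetical, and ages are assumed to be stored as plain integer strings.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch of two user-defined rules evaluated over a set of triples.
class UserRules {
    // ?x :parentOf ?y . ?z :brotherOf ?x .  =>  ?z :uncleOf ?y
    static Set<List<String>> uncleRule(Set<List<String>> triples) {
        Set<List<String>> out = new HashSet<>();
        for (List<String> parent : triples) {
            if (!parent.get(1).equals(":parentOf")) continue;
            for (List<String> brother : triples) {
                // join on ?x: the brother's object must be the parent's subject
                if (brother.get(1).equals(":brotherOf")
                        && brother.get(2).equals(parent.get(0))) {
                    out.add(List.of(brother.get(0), ":uncleOf", parent.get(2)));
                }
            }
        }
        return out;
    }

    // ?x :hasAge ?age . filter ?age > 18  =>  ?x :type :Adult
    static Set<List<String>> adultRule(Set<List<String>> triples) {
        Set<List<String>> out = new HashSet<>();
        for (List<String> t : triples) {
            if (t.get(1).equals(":hasAge") && Integer.parseInt(t.get(2)) > 18) {
                out.add(List.of(t.get(0), ":type", ":Adult"));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Set<List<String>> triples = Set.of(
            List.of(":Mary", ":parentOf", ":John"),
            List.of(":Tom", ":brotherOf", ":Mary"),
            List.of(":Tom", ":hasAge", "40"),
            List.of(":John", ":hasAge", "12"));
        System.out.println(uncleRule(triples));
        System.out.println(adultRule(triples));
    }
}
```

The antecedent patterns become the join conditions, the filter becomes an extra predicate on a bound variable, and the consequent is the triple that gets emitted.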


Extending Semantics Supported by 11.2 OWL Inference

• Option 2: Separation in TBox and ABox reasoning through PelletDb (using Oracle Jena Adapter)
  – TBox (schema related) tends to be small in size
    • Generate a class subsumption tree using a complete DL reasoner like Pellet
  – ABox (instance related) can be arbitrarily large
    • Use the native inference engine in Oracle to infer new knowledge based on the class subsumption tree from the TBox

[Figure: TBox -> DL reasoner -> TBox & complete class tree; this and the ABox -> inference engine in Oracle]


Enabling Advanced Inference Capabilities

• Parallel inference option

    EXECUTE sem_apis.create_entailment('M_IDX', sem_models('M'),
      sem_rulebases('OWLPRIME'), null, null, 'DOP=x');

  – Where 'x' is the degree of parallelism (DOP)

• Incremental inference option

    EXECUTE sem_apis.create_entailment('M_IDX', sem_models('M'),
      sem_rulebases('OWLPRIME'), null, null, 'INC=T');

• Enabling owl:sameAs option to limit duplicates

    EXECUTE sem_apis.create_entailment('M_IDX', sem_models('M'),
      sem_rulebases('OWLPRIME'), null, null, 'OPT_SAMEAS=T');

• Compact data structures

    EXECUTE sem_apis.create_entailment('M_IDX', sem_models('M'),
      sem_rulebases('OWLPRIME'), null, null, 'RAW8=T');

• OWL2RL/SKOS inference

    EXECUTE sem_apis.create_entailment('M_IDX', sem_models('M'),
      sem_rulebases(x), null, null…);

  – x in ('OWL2RL', 'SKOSCORE')


Named Graph Based Global and Local Inference

• Named Graph Based Global Inference (NGGI)
  – Perform inference on just a subset of the triples
  – Some usage examples
    • Run NGGI on just the TBox
    • Run NGGI on just a single named graph
    • Run NGGI on just a single named graph and a TBox
• Named Graph Based Local Inference (NGLI)
  – Perform local inference for each named graph (optionally with a common TBox)
  – Triples from different named graphs will not be mixed together.
• NGGI and NGLI together can achieve efficient named graph based inference maintenance


Querying Semantic Data


Semantic Operators Expand Terms for SQL SELECT

• Scalable, efficient SQL operators to perform ontology-assisted query against enterprise relational data

[Figure: medical ontology in which the classes Upper_Extremity_Fracture, Arm_Fracture, Hand_Fracture, Forearm_Fracture, Elbow_Fracture and Finger_Fracture are connected by rdfs:subClassOf edges]

Patients diagnosis table:

    ID  DIAGNOSIS
    1   Hand_Fracture
    2   Rheumatoid_Arthritis

Query: "Find all entries in diagnosis column that are related to 'Upper_Extremity_Fracture'"

Traditional syntactic query against relational data (will not work: zero matches!):

    SELECT p_id, diagnosis FROM Patients
    WHERE diagnosis = 'Upper_Extremity_Fracture';

New semantic query against relational data (while consulting ontology):

    SELECT p_id, diagnosis FROM Patients
    WHERE SEM_RELATED(diagnosis, 'rdfs:subClassOf',
            'Upper_Extremity_Fracture', 'Medical_ontology') = 1;

    SELECT p_id, diagnosis FROM Patients
    WHERE SEM_RELATED(diagnosis, 'rdfs:subClassOf',
            'Upper_Extremity_Fracture', 'Medical_ontology') = 1
      AND SEM_DISTANCE() <= 2;
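
The idea behind the semantic query can be sketched in plain Java: walk the rdfs:subClassOf chain from each diagnosis up to the target class and keep the rows that reach it within a given number of hops. Everything here is an assumption made for illustration: the exact subclass edges (the slide's figure did not survive extraction), the single-parent hierarchy, the choice to exclude the target class itself, and the names SemRelatedSketch, distance and related. It is not the implementation of SEM_RELATED or SEM_DISTANCE.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch of an ontology-assisted filter over a relational column.
class SemRelatedSketch {
    // Number of rdfs:subClassOf hops from cls up to ancestor, or -1 if unrelated.
    // superOf maps each class to its single (assumed) direct superclass.
    static int distance(Map<String, String> superOf, String cls, String ancestor) {
        String c = cls;
        for (int hops = 0; c != null && hops <= superOf.size(); hops++) {
            if (c.equals(ancestor)) return hops;
            c = superOf.get(c);
        }
        return -1;
    }

    // Returns ids of rows whose diagnosis is a proper subclass of target
    // within maxDistance hops.
    static List<Integer> related(Map<Integer, String> patients,
                                 Map<String, String> superOf,
                                 String target, int maxDistance) {
        List<Integer> ids = new ArrayList<>();
        for (Map.Entry<Integer, String> row : patients.entrySet()) {
            int d = distance(superOf, row.getValue(), target);
            if (d >= 1 && d <= maxDistance) ids.add(row.getKey());
        }
        return ids;
    }

    public static void main(String[] args) {
        Map<String, String> superOf = new HashMap<>(); // assumed edges
        superOf.put("Arm_Fracture", "Upper_Extremity_Fracture");
        superOf.put("Hand_Fracture", "Upper_Extremity_Fracture");
        superOf.put("Forearm_Fracture", "Arm_Fracture");
        superOf.put("Elbow_Fracture", "Arm_Fracture");
        superOf.put("Finger_Fracture", "Hand_Fracture");

        Map<Integer, String> patients = new TreeMap<>();
        patients.put(1, "Hand_Fracture");
        patients.put(2, "Rheumatoid_Arthritis");
        System.out.println(related(patients, superOf, "Upper_Extremity_Fracture", 2));
    }
}
```

A plain string comparison finds neither row, whereas the ontology walk matches patient 1 because Hand_Fracture is (under the assumed edges) one hop below Upper_Extremity_Fracture.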


SPARQL Query Architecture

[Figure: three entry points share the SPARQL-to-SQL core logic:]
• SQL: SEM_MATCH
• Java: Jena API (Jena Adapter); Sesame API (Sesame Adapter)
• HTTP: Standard SPARQL Endpoint, enhanced with query management control


SEM_MATCH: Adding SPARQL to SQL

• Extends SQL with SPARQL constructs
  – Graph Patterns, OPTIONAL, UNION
  – Dataset Constructs
  – FILTER (including SPARQL built-ins)
  – Prologue
  – Solution Modifiers
• Benefits:
  – Allows SQL constructs/functions
  – JOINs with other object-relational data
  – DDL Statements: create tables/views


SEM_MATCH: Adding SPARQL to SQL

    SELECT n1, n2
    FROM TABLE(
      SEM_MATCH(
        'PREFIX foaf: <http://...>
         SELECT ?n1 ?n2
         FROM <http://g1>
         WHERE { ?p foaf:name ?n1
                 OPTIONAL { ?p foaf:knows ?f .
                            ?f foaf:name ?n2 }
                 FILTER (REGEX(?n1, "^A")) }
         ORDER BY ?n1 ?n2',
        SEM_MODELS('M1'), …));


SEM_MATCH is a SQL table function; the query above returns rows such as:

n1 | n2
Alex | Jerry
Alex | Tom
Alice | Bill
Alice | Jill
Alice | John


The SQL table function is rewritable. The SEM_MATCH call is replaced by an equivalent SQL subquery:

( SELECT v1.value AS n1, v2.value AS n2
  FROM VALUES v1, VALUES v2, TRIPLES t1, TRIPLES t2, …
  WHERE t1.obj_id = v1.value_id
    AND t1.pred_id = 1234
    AND … )

• Get 1 declarative SQL query
  – Query optimizer sees 1 query
  – Get all the performance of Oracle SQL Engine: compression, indexes, parallelism, etc.


SEM_MATCH Table Function Arguments

SEM_MATCH( query, models, rulebases, options );

• query: e.g. 'SELECT ?a WHERE { ?a foaf:name ?b }'
• models: container(s) for asserted quads; basic unit of access control
• rulebases: built-in (e.g. OWL2RL) and user-defined rulebases
• models + rulebases: entailed triples
• options: e.g. 'ALLOW_DUP=T STRICT_TERM_COMP=F'


GovTrack RDF Data

RDF/OWL data about activities of US Congress (http://www.govtrack.us/developers/rdf.xpd)

• Political Party Membership
• Voting Records
• Bill Sponsorship
• Committee Membership
• Offices and Terms

GovTrack in Oracle

[Diagram: semantic models, rulebase, entailment and virtual models]

• Semantic Models: GOV_TBOX, GOV_PEOPLE, GOV_BILLS_110, GOV_BILLS_111, GOV_VOTES_07, GOV_VOTES_08, GOV_VOTES_09, GOV_DISTRICTS (US Census)
• Rulebases: OWL2RL
• Entailments: GOV_TRACK_OWL (INFERENCE)
• Virtual Models:
  – GOV_ASSERT_VM: asserted data only (2.8M triples)
  – GOV_ALL_VM: asserted + inferred (3.1M triples)


Virtual Models

• A virtual model is a logical RDF graph that can be used in a SEM_MATCH query.
  – Result of UNION or UNION ALL of one or more models and optionally the corresponding entailment

• create_virtual_model (vm_name, models, rulebases)
• drop_virtual_model (vm_name)
• SEM_MATCH query accepts a single virtual model
  – No other models or rulebases need to be specified

• DMLs on virtual models are not supported
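The difference between the UNION and UNION ALL flavours of a virtual model can be illustrated with a few lines of plain Java. This is a conceptual sketch only: the triples are hypothetical strings, and nothing here reflects how the database materialises or evaluates a virtual model.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy view of a virtual model as the union of several models' triples. */
final class VirtualModelSketch {

    // Hypothetical models; each triple is written as one "subject predicate object" string.
    static final List<String> MODEL_A = List.of(
            "<John> rdf:type <Person>",
            "<John> <knows> <Mary>");
    static final List<String> MODEL_B = List.of(
            "<John> rdf:type <Person>");

    /** UNION ALL: concatenate the models and keep duplicates. */
    static List<String> unionAll(List<List<String>> models) {
        List<String> out = new ArrayList<>();
        for (List<String> model : models) {
            out.addAll(model);
        }
        return out;
    }

    /** UNION: same triples with duplicates removed (first-seen order preserved). */
    static Set<String> union(List<List<String>> models) {
        return new LinkedHashSet<>(unionAll(models));
    }

    public static void main(String[] args) {
        List<List<String>> models = List.of(MODEL_A, MODEL_B);
        System.out.println("UNION ALL: " + unionAll(models));
        System.out.println("UNION:     " + union(models));
    }
}
```

UNION ALL avoids the cost of duplicate elimination but can return the same triple more than once when it is asserted in several models.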


Virtual Model Example

Creation:

begin
  sem_apis.create_virtual_model('gov_assert_vm',
    sem_models('gov_tbox', 'gov_people', 'gov_votes_07', 'gov_votes_08', 'gov_votes_09',
               'gov_bills_110', 'gov_bills_111', 'gov_districts'));

  sem_apis.create_virtual_model('gov_all_vm',
    sem_models('gov_tbox', 'gov_people', 'gov_votes_07', 'gov_votes_08', 'gov_votes_09',
               'gov_bills_110', 'gov_bills_111', 'gov_districts'),
    sem_rulebases('OWL2RL'));
end;
/

Access Control:

grant select on mdsys.semv_gov_assert_vm to scott;
grant select on mdsys.semv_gov_all_vm to scott;


Query Example 1: Basic Query

select fn, bday, g, t, hp, r
from table(sem_match(
  'SELECT ?fn ?bday ?g ?t ?hp ?r
   WHERE { ?s vcard:N ?n .
           ?n vcard:Family "Kennedy" .
           ?s foaf:name ?fn .
           ?s vcard:BDAY ?bday .
           ?s foaf:gender ?g .
           ?s foaf:title ?t .
           ?s foaf:homepage ?hp .
           ?s foaf:religion ?r }',
  sem_models('gov_all_vm'), null, null, null, null, ' ALLOW_DUP=T '));

Find information about all Kennedys


Query Example 2: OPTIONAL Query

select fn, bday, g, t, hp, r
from table(sem_match(
  'SELECT ?fn ?bday ?g ?t ?hp ?r
   WHERE { ?s vcard:N ?n .
           ?n vcard:Family "Kennedy" .
           ?s foaf:name ?fn .
           ?s vcard:BDAY ?bday .
           ?s foaf:gender ?g .
           OPTIONAL { ?s foaf:title ?t .
                      ?s foaf:homepage ?hp .
                      ?s foaf:religion ?r } }',
  sem_models('gov_all_vm'), null, null, null, null, ' ALLOW_DUP=T '));

Find information about all Kennedys, with title, homepage and religion optional


Query Example 3: Simple FILTER

select fname, lname
from table(sem_match(
  'SELECT ?fname ?lname
   WHERE { ?s rdf:type foaf:Person .
           ?s vcard:N ?vcard .
           ?vcard vcard:Given ?fname .
           ?vcard vcard:Family ?lname
           FILTER (STR(?lname) < "B") }',
  sem_models('gov_all_vm'), null, null, null, null, ' ALLOW_DUP=T '));

Find all people with a last name that starts with “A”


Query Example 4: Negation as Failure

select fn, bday, hp
from table(sem_match(
  'SELECT ?fn ?bday ?hp
   WHERE { ?s vcard:N ?n .
           ?n vcard:Family "Lincoln" .
           ?s vcard:BDAY ?bday .
           ?s foaf:name ?fn .
           FILTER (!BOUND(?hp))
           OPTIONAL { ?s foaf:homepage ?hp } }',
  sem_models('gov_all_vm'), null, null, null, null, ' ALLOW_DUP=T '));

Find all Lincolns without a homepage
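The OPTIONAL plus !BOUND idiom behaves like a left outer join followed by a filter on the rows where the optional side found nothing. A minimal Java sketch of that logic, using hypothetical people and homepage data rather than anything from GovTrack:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Toy illustration of negation as failure: OPTIONAL followed by FILTER(!BOUND(...)). */
final class NegationAsFailureSketch {

    // Hypothetical data: all matching people, and the homepages known for some of them.
    static final List<String> PEOPLE = List.of("Person A", "Person B", "Person C");
    static final Map<String, String> HOMEPAGES = Map.of(
            "Person B", "http://example.org/b");

    /** Returns the people for whom no homepage is recorded. */
    static List<String> withoutHomepage(List<String> people, Map<String, String> homepages) {
        List<String> result = new ArrayList<>();
        for (String person : people) {
            // OPTIONAL { ?s foaf:homepage ?hp }: the lookup may leave ?hp unbound (null here).
            String hp = homepages.get(person);
            // FILTER (!BOUND(?hp)): keep only rows where the optional part matched nothing.
            if (hp == null) {
                result.add(person);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(withoutHomepage(PEOPLE, HOMEPAGES));
    }
}
```

Because a SPARQL FILTER applies to the whole group in which it appears, its position before or after the OPTIONAL inside the braces does not change the result.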


Query Example 5: UNION

select title, dt
from table(sem_match(
  'SELECT ?title ?dt
   WHERE { ?s foaf:name "Barack Obama" .
           { ?b bill:sponsor ?s } UNION { ?b bill:cosponsor ?s }
           ?b dc:title ?title .
           ?b rdf:type bill:LegislativeDocument .
           ?b bill:introduced ?dt
           FILTER ("2007-02-01"^^xsd:date <= ?dt && ?dt < "2007-03-01"^^xsd:date) }',
  sem_models('gov_all_vm'), null, null, null, null, ' ALLOW_DUP=T '));

Find all Legislative Documents introduced in February 2007 that were sponsored or cosponsored by Barack Obama


Query Example 6: Inference

[Figure: GovTrack Bill Types]

select title, dt, btype
from table(sem_match(
  'SELECT ?title ?dt ?btype
   WHERE { ?s foaf:name "Barack Obama" .
           ?b bill:sponsor ?s .
           ?b dc:title ?title .
           ?b rdf:type ?btype .
           ?b bill:introduced ?dt
           FILTER ("2007-03-28"^^xsd:date <= ?dt && ?dt < "2007-04-01"^^xsd:date) }',
  sem_models('gov_assert_vm'), null, null, null, null, ' ALLOW_DUP=T '));

Find all Legislative Documents (and their types) sponsored by Barack Obama


Query Example 7: SQL Constructs (temporal interval computations)

select * from (
  select fn, bday, tsfrom,
         (to_date(tsfrom,'YYYY-MM-DD') - to_date(bday,'YYYY-MM-DD')) YEAR(3) TO MONTH
  from table(sem_match(
    'SELECT ?fn ?bday ?tsfrom
     WHERE { ?s vcard:BDAY ?bday .
             ?s foaf:name ?fn .
             ?s pol:hasRole ?role .
             ?role pol:forOffice ?office .
             ?role time:from ?tfrom .
             ?tfrom time:at ?tsfrom
             FILTER (?tsfrom >= "2000-01-01"^^xsd:date) }',
    sem_models('gov_all_vm'), null, null, null))
  order by (to_date(tsfrom,'YYYY-MM-DD') - to_date(bday,'YYYY-MM-DD')) asc)
where rownum <= 1;

Find the youngest person to take office since 2000


Oracle Extensions for Text and Spatial


Full Text Indexing with Oracle Text

• Filters graph patterns based on text search string
• Indexes all RDF Terms
  – URIs, Literals, Language Tags, etc.

• Provide SPARQL extension function
  – orardf:textContains(?var, "Oracle text search string")
  – Search String
    • Group Operators: AND, OR, NOT, NEAR, …
    • Term Operators: stem($), soundex(!), wildcard(%)

SQL> exec sem_apis.add_datatype_index('http://xmlns.oracle.com/rdf/text');


Text Query Example

select s, title, dt
from table(sem_match(
  'SELECT ?s ?title ?dt
   WHERE { ?b bill:sponsor ?s .
           ?s foaf:name ?n .
           ?b dc:title ?title .
           ?b bill:introduced ?dt
           FILTER (orardf:textContains(?title, "$children AND $taxes")) }',
  sem_models('gov_all_vm'), null, null, null, null, ' ALLOW_DUP=T '));

Find all bills about Children and Taxes


Spatial Support with Oracle Spatial

• Support geometries encoded as orageo:WKTLiterals

:semTech2011 orageo:hasPointGeometry "POINT(-122.4192 37.7793)"^^orageo:WKTLiteral .

• Provide library of spatial query functions

SELECT ?s
WHERE { ?s orageo:hasPointGeometry ?geom
        FILTER (orageo:withinDistance(?geom, "POINT(-122.4192 37.7793)"^^orageo:WKTLiteral, "distance=10 unit=KM")) }


orageo:WKTLiteral Datatype

• Optional leading Spatial Reference System URI followed by OGC WKT geometry string: <http://xmlns.oracle.com/rdf/geo/srid/{srid}>

• WGS 84 Longitude, Latitude is the default SRS (assumed if SRS URI is absent)

SRS: WGS84 Longitude, Latitude
"POINT(-122.4192 37.7793)"^^orageo:WKTLiteral

SRS: NAD27 Longitude, Latitude
"<http://xmlns.oracle.com/rdf/geo/srid/8260> POINT(-122.4181 37.7793)"^^orageo:WKTLiteral

• Prepare for spatial querying by creating a spatial index for the orageo:WKTLiteral datatype

SQL> exec sem_apis.add_datatype_index('http://xmlns.oracle.com/rdf/geo/WKTLiteral', options=>'TOLERANCE=1.0 SRID=8307 DIMENSIONS=((LONGITUDE,-180,180)(LATITUDE,-90,90))');
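The lexical form described above (an optional SRS URI, then the WKT string) can be split with a small regular expression. The Java sketch below is purely illustrative and is not part of any Oracle API; treating 8307 as the default SRID is an assumption taken from the SRID=8307 setting in the index options on this slide.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy parser for the lexical form of an orageo:WKTLiteral. */
final class WktLiteralSketch {

    // Assumption: 8307 stands for WGS 84 longitude/latitude, as in the index options above.
    static final int DEFAULT_SRID = 8307;

    // Optional "<http://xmlns.oracle.com/rdf/geo/srid/{srid}>" prefix, then the WKT text.
    private static final Pattern SRS_PREFIX = Pattern.compile(
            "^\\s*<http://xmlns\\.oracle\\.com/rdf/geo/srid/(\\d+)>\\s*(.*)$", Pattern.DOTALL);

    /** SRID named by the leading SRS URI, or the default when the URI is absent. */
    static int srid(String lexical) {
        Matcher m = SRS_PREFIX.matcher(lexical);
        return m.matches() ? Integer.parseInt(m.group(1)) : DEFAULT_SRID;
    }

    /** WKT geometry string with any leading SRS URI removed. */
    static String wkt(String lexical) {
        Matcher m = SRS_PREFIX.matcher(lexical);
        return m.matches() ? m.group(2).trim() : lexical.trim();
    }

    public static void main(String[] args) {
        String nad27 = "<http://xmlns.oracle.com/rdf/geo/srid/8260> POINT(-122.4181 37.7793)";
        System.out.println(srid(nad27) + " / " + wkt(nad27));
        System.out.println(srid("POINT(-122.4192 37.7793)") + " / " + wkt("POINT(-122.4192 37.7793)"));
    }
}
```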


What Types of Spatial Data are Supported?

• Spatial Reference Systems
  – Built-in support for thousands of SRSs
  – Plus you can define your own
  – Coordinate system transformations applied transparently during indexing and query

• Geometry Types
  – Support OGC Simple Features geometry types
    • Point, Line, Polygon
    • Multi-Point, Multi-Line, Multi-Polygon
    • Geometry Collection
  – Up to 500,000 vertices per Geometry


Spatial Function Library

• Topological Relations
  – orageo:relate

• Distance-based Operations
  – orageo:distance, orageo:withinDistance, orageo:buffer, orageo:nearestNeighbor

• Geometry Operations
  – orageo:area, orageo:length
  – orageo:centroid, orageo:mbr, orageo:convexHull

• Geometry-Geometry Operations
  – orageo:intersection, orageo:union, orageo:difference, orageo:xor


GovTrack Spatial Demo

• Congressional District Polygons (435)
  – Complex Geometries
  – Average over 1000 vertices per geometry

[Diagram: loading workflow]

1. Load .shp file from US Census into Oracle Spatial
2. Generate triples using sdo_util.to_wktgeometry()
3. Load into Oracle semantic model


Spatial Query 1

select name, cdist
from table(sem_match(
  'SELECT ?name ?cdist
   WHERE { ?person usgovt:name ?name .
           ?person pol:hasRole ?role .
           ?role pol:forOffice ?office .
           ?office pol:represents ?cdist .
           ?cdist orageo:hasWKTGeometry ?cgeom
           FILTER (orageo:relate(?cgeom, "POINT(-71.46444 42.7575)"^^orageo:WKTLiteral, "mask=contains")) }',
  sem_models('gov_all_vm'), null, null, null, null, ' ALLOW_DUP=T '));

Which congressional district contains Nashua, NH?


Spatial Query 2

select name, cdist
from table(sem_match(
  'SELECT ?name ?cdist
   WHERE { ?person usgovt:name ?name .
           ?person pol:hasRole ?role .
           ?role pol:forOffice ?office .
           ?office pol:represents ?cdist .
           ?cdist orageo:hasWKTGeometry ?cgeom
           FILTER (orageo:nearestNeighbor(?cgeom, "POINT(-71.46444 42.7575)"^^orageo:WKTLiteral, "sdo_num_res=10")) }
   ORDER BY ASC(orageo:distance(orageo:centroid(?cgeom), "POINT(-71.46444 42.7575)"^^orageo:WKTLiteral, "unit=KM"))',
  sem_models('gov_all_vm'), null, null, null, null, ' ALLOW_DUP=T '));

Who are my nearest 10 representatives ordered by centerpoint


SPARQL Querying: Jena Adapter for Oracle Database 11g Release 2

Jena Adapter for Oracle Database 11g Release 2

• Implements Jena Semantic Web Framework APIs
  • Popular Java APIs for semantic web based applications
  • Adapter adds Oracle-specific extensions

• Jena Adapter provides three core features:
  • Java API for Oracle RDF Store
  • SPARQL Endpoint for Oracle with SPARQL 1.1 support
  • Oracle-specific extensions for query execution control and management


Jena Adapter as a Java API for Oracle RDF

• “Proxy” like design
  • Data not cached in memory for scalability
  • SPARQL query converted into SQL and executed inside DB
  • Various optimizations to minimize the number of Oracle queries generated given a SPARQL 1.1 query

• Various data loading methods
  • Bulk/Batch/Incremental load RDF or OWL (in N3, RDF/XML, N-TRIPLE etc.) with strict syntax verification and long literal support

• Allows integration of Oracle Database 11g RDF/OWL with various tools
  • TopBraid Composer
  • External OWL DL reasoners (e.g., Pellet)

http://www.oracle.com/technology/tech/semantic_technologies/documentation/jenaadapter2_readme.pdf


Programming Semantic Applications in Java

• Create a connection object
  – oracle = new Oracle(oracleConnection);

• Create a GraphOracleSem object (no need to create the model manually!)
  – graph = new GraphOracleSem(oracle, model_name, attachment);

• Load data
  – graph.add(Triple.create(…)); // for incremental triple additions

• Collect statistics (important for performance!)
  – graph.analyze();

• Run inference
  – graph.performInference();

• Collect statistics (important for performance!)
  – graph.analyzeInferredGraph();

• Query
  – query = QueryFactory.create(…);
  – queryExec = QueryExecutionFactory.create(query, model);
  – resultSet = queryExec.execSelect();


Jena Adapter Feature: SPARQL Endpoint

• SPARQL service endpoint supporting full SPARQL Protocol
  – Integrated with Jena/Joseki 3.4.0 (deployed in WLS 10.3 or Tomcat 6)
  – Uses J2EE data source for DB connection specification
  – SPARQL 1.1 and Update (SPARUL) supported

• Oracle-specific declarative configuration options in Joseki
  – Each URI endpoint is mapped to a Joseki service:

<#service> rdf:type joseki:Service ;
    rdfs:label "SPARQL with Oracle Semantic Data Management" ;
    joseki:serviceRef "GOV_ALL_VM" ;    # web.xml must route this name to Joseki
    joseki:dataset <#oracle> ;          # dataset part
    joseki:processor joseki:ProcessorSPARQL_FixedDS ;


SPARQL Endpoint: Example

• Example Joseki Dataset configuration:

<#oracle> rdf:type oracle:Dataset;
    joseki:poolSize 4;    # Number of concurrent connections allowed to this dataset.
    oracle:connection [ a oracle:OracleConnection ; ];

    oracle:defaultModel [ oracle:firstModel "GOV_PEOPLE";
                          oracle:modelName "GOV_TBOX";
                          oracle:modelName "GOV_VOTES_07";
                          oracle:rulebaseName "OWLPRIME";
                          oracle:useVM "TRUE" ] ;

    oracle:namedModel [ oracle:namedModelURI <http://oracle.com/govtrack#GOV_VOTES_07>;
                        oracle:firstModel "GOV_VOTES_07" ].


Property Path Queries

• Part of SPARQL 1.1
  – Regular expressions for properties: ? + * ^ / |

• Translated to hierarchical SQL queries
  – Using Oracle CONNECT BY clause

• Examples:
  – Find all reachable friends of John
      SELECT * WHERE { :John foaf:friendOf+ ?friend . }
  – Find reachable friends through two different paths
      SELECT * WHERE { :John (foaf:friendOf|urn:friend)+ ?friend . }
  – Get names of people John knows one step away
      SELECT * WHERE { :John foaf:knows/foaf:name ?person }
  – Find the names of all people that can be reached from John by foaf:knows
      SELECT * WHERE { ?x foaf:mbox <mailto:john@example> . ?x foaf:knows+/foaf:name ?name . }
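What a one-or-more path such as foaf:knows+ computes is reachability over the graph's edges, which is also what a hierarchical query walks. A minimal Java sketch of that traversal follows; the KNOWS data is hypothetical, and this is not the SQL the adapter generates.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy evaluation of a one-or-more property path (p+) as graph reachability. */
final class PropertyPathSketch {

    // Hypothetical foaf:knows edges; note the cycle Ann -> John.
    static final Map<String, List<String>> KNOWS = Map.of(
            "John", List.of("Mary", "Bob"),
            "Mary", List.of("Ann"),
            "Ann", List.of("John"));

    /** All nodes reachable from start in one or more steps. */
    static Set<String> reachable(Map<String, List<String>> edges, String start) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>(edges.getOrDefault(start, List.of()));
        while (!queue.isEmpty()) {
            String node = queue.remove();
            // add() returns false for nodes already visited, which stops cycles from looping.
            if (seen.add(node)) {
                queue.addAll(edges.getOrDefault(node, List.of()));
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(reachable(KNOWS, "John"));
    }
}
```

The start node appears in the result only when a cycle leads back to it, which matches the one-or-more reading of the + operator.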


Query Extensions in Jena Adapter

• Query management and execution control
  – Timeout
  – Query abort framework
    • Including monitoring threads and a management servlet
    • Designed for a J2EE cluster environment
  – Hints allowed in SPARQL query syntax
  – Parallel execution

• Support ARQ functions for projected variables
  – fn:lower-case, upper-case, substring, …

• Native, system provided functions can be used in SPARQL
  – oext:lower-literal, oext:upper-literal, oext:build-uri-for-id, …


Query Extensions in Jena Adapter

• Extensible user-defined functions in SPARQL
  – Example

    PREFIX ouext: <http://oracle.com/semtech/jena-adapter/ext/user-def-function#>

    SELECT ?subject ?object (ouext:my_strlen(?object) as ?obj1)
    WHERE { ?subject dc:title ?object }

  – User can implement the my_strlen function in Oracle Database

• Connection Pooling through OraclePool

    java.util.Properties prop = new java.util.Properties();
    prop.setProperty("InitialLimit", "2");           // create 2 connections
    prop.setProperty("InactivityTimeout", "1800");   // seconds
    ….
    OraclePool op = new OraclePool(szJdbcURL, szUser, szPasswd, prop, "OracleSemConnPool");
    Oracle oracle = op.getOracle();


Semantic Indexing for Unstructured Content

Overview: Creating and Using a Semantic Index

Newsfeed table (rows identified by rowids r1, r2, …):

docId | Article | p_date
1 | Indiana authorities filed felony charges and a court issued an arrest warrant for a financial manager who apparently tried to fake his death by crashing his airplane in a Florida swamp. Marcus Schrenker, 38 … | 02/01/11
2 | Major dealers and investors … | 11/30/10
.. | .. |

Content type of the indexed column:
• Text
• File (path)
• URL

SemContext index on Article column (auto maintained like a B-tree index):

CREATE INDEX ArticleIndex
ON Newsfeed (Article)
INDEXTYPE IS SemContext
PARAMETERS ('my_policy')
LOCAL(1) PARALLEL 4

(1) LOCAL index support for semantic indexing is restricted to range-partitioned base tables only.

[Diagram labels: extractor; Batch; Bulk]

Triples table with rowid references:

Subject | Property | Object | Graph
p:Marcus | rdf:type | rc:Person | <…/r1>
p:Marcus | :hasName | “Marcus”^^… | <…/r1>
p:Marcus | :hasAge | “38”^^xsd:.. | <…/r1>
… | … | … | …

Analytical Queries On Graph Data:

SELECT Sem_Contains_Select(1)
FROM Newsfeed
WHERE Sem_Contains (Article,
        '{?x rdf:type rc:Person . ?x :hasAge ?age .
          FILTER(?age >= 35)}', 1) = 1
  AND p_date > to_date('01-Jan-11')


Combining Ontologies with extracted triples

• The triples extracted from each document can be augmented with
  – Entailment created by combining the triples with schema ontologies and rulebase(s)
  – Knowledge bases (stored as RDF models) and their entailments

• Augmentation achieved using dependent policies

begin
  sem_rdfctx.create_policy (
    policy_name      => 'my_policy_plus_geo',
    base_policy      => 'my_policy',
    user_models      => SEM_MODELS('USGeography'),
    user_entailments => SEM_MODELS('Doc_inf', 'USGeography_inf'));
end;

• Query using the dependent policy:

SELECT docId
FROM Newsfeed
WHERE SEM_CONTAINS (Article,
        '{ ?comp rdf:type c:Company .
           ?comp p:categoryName c:BusinessFinance .
           ?comp p:location ?city .
           ?city geo:state "NY"^^xsd:string }',
        'my_policy_plus_geo') = 1

Will result in a multi-model query involving: the RDF model for my_policy index, the RDF model USGeography, and the entailments.

[Diagram: Extracted triples; (local) Entailments from extr. triples; Knowledge bases (KBs); Entailments from KBs]


Inference: document-centric

Base table:

id | document
1 | John is a parent. He grew up in NYC.
2 | John is a man.

Semantic Index: extracted triples

Subject | Property | Object | Graph
<John> | rdf:type | <Parent> | <…/r1>
<John> | <grewUpIn> | <NYC> | <…/r1>
<John> | rdf:type | <Man> | <…/r2>
… | … | … | …

Ontology: schema triples (for extracted data)

[Diagram: rdfs:subClassOf hierarchy over <Man>, <Adult>, <Parent>, <Male>; rdfs:subPropertyOf hierarchy over <grewUpIn>, <familiarWith>]

Entailment: set of inferred triples

Subject | Property | Object | Graph
<John> | rdf:type | <Adult> | <…/r1>
<John> | <familiarWith> | <NYC> | <…/r1>
<John> | rdf:type | <Adult> | <…/r2>
<John> | rdf:type | <Male> | <…/r2>
… | … | … | …
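The document-centric entailment on this slide can be reproduced with a short Java sketch that applies subclass and subproperty rules quad by quad, keeping each inferred triple in the graph of the document it came from. The schema edges below are an assumption read off the slide's inferred triples (Parent and Man under Adult, Man under Male, grewUpIn under familiarWith), the graph names r1 and r2 are shorthand, and the code applies a single inference step rather than a full closure.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy per-document inference: inferred quads stay in the graph of their source document. */
final class DocumentCentricInferenceSketch {

    // Assumed schema triples (rdfs:subClassOf and rdfs:subPropertyOf), inferred from the slide.
    static final Map<String, List<String>> SUPER_CLASSES = Map.of(
            "<Parent>", List.of("<Adult>"),
            "<Man>", List.of("<Adult>", "<Male>"));
    static final Map<String, List<String>> SUPER_PROPERTIES = Map.of(
            "<grewUpIn>", List.of("<familiarWith>"));

    // Extracted quads from the slide: [subject, property, object, graph].
    static final List<List<String>> EXTRACTED = List.of(
            List.of("<John>", "rdf:type", "<Parent>", "r1"),
            List.of("<John>", "<grewUpIn>", "<NYC>", "r1"),
            List.of("<John>", "rdf:type", "<Man>", "r2"));

    /** One inference step over each quad; results keep the quad's graph. */
    static Set<List<String>> entail(List<List<String>> quads) {
        Set<List<String>> inferred = new LinkedHashSet<>();
        for (List<String> q : quads) {
            String s = q.get(0), p = q.get(1), o = q.get(2), g = q.get(3);
            if (p.equals("rdf:type")) {
                // Subclass rule: an instance of a class is an instance of its superclasses.
                for (String sup : SUPER_CLASSES.getOrDefault(o, List.of())) {
                    inferred.add(List.of(s, "rdf:type", sup, g));
                }
            } else {
                // Subproperty rule: a statement also holds for the superproperty.
                for (String sup : SUPER_PROPERTIES.getOrDefault(p, List.of())) {
                    inferred.add(List.of(s, sup, o, g));
                }
            }
        }
        return inferred;
    }

    public static void main(String[] args) {
        entail(EXTRACTED).forEach(System.out::println);
    }
}
```

Because the graph component is part of each quad, the inferred Adult triple appears once for r1 and once for r2, as in the entailment table on the slide.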


Abstract Extractor Type

• An abstract extractor type, rdfctx_extractor (in PL/SQL), defines the common interfaces for all extractor implementations.

• Specific implementations for the abstract type interact with individual third-party extractors and produce RDF/XML documents for the input document.

create or replace type rdfctx_extractor authid current_user as object (
  …
  member function extractRdf (document CLOB,
                              docId VARCHAR2,
                              params VARCHAR2,
                              options VARCHAR2 default NULL) return CLOB,
  member function batchExtractRdf (docCursor SYS_REFCURSOR,
                                   extracted_info_table VARCHAR2,
                                   params VARCHAR2,
                                   partition_name VARCHAR2 default NULL,
                                   docId VARCHAR2 default NULL,
                                   preferences SYS.XMLType default NULL,
                                   options VARCHAR2 default NULL) return CLOB,
  …
) not instantiable not final
/


A sample extractor type -- interface

create or replace type rdfctxu.info_extractor under rdfctx_extractor (
  overriding member function getDescription return VARCHAR2,
  overriding member function rdfReturnType return VARCHAR2,
  overriding member function getContext(attribute VARCHAR2) return VARCHAR2,
  overriding member function extractRDF(document CLOB,
                                        docId VARCHAR2,
                                        params VARCHAR2 …) return CLOB,
  overriding member function batchExtractRdf(docCursor SYS_REFCURSOR,
                                             extracted_info_table VARCHAR2,
                                             params VARCHAR2,
                                             partition_name VARCHAR2 default NULL …) return CLOB
)


Enterprise Security for Semantic Data


• Model-level access control
  • Each semantic model accessible through a view (RDFM_modelName)
  • Grant/revoke privileges on the view

• Discretionary access control on application table for model

• Finer granularity possible through Oracle Label Security
  • Triple level security
  • Mandatory Access Control


Oracle Label Security

• Oracle Label Security: Mandatory Access Control
  • Data records and users tagged with security labels
  • Labels determine the sensitivity of the data or the rights a person must possess in order to read or write the data.

• User labels indicate their access rights to the data records.
  • For reads/deletes/updates: user’s label must dominate row’s label
  • For inserts: user’s label applied to inserted row

• A Security Administrator assigns labels to users

ContractID | Organization | ContractValue | Label
ProjectHLS | N. America | 1000000 | SE:HLS:US


OLS Data Classification

Label Components:

• Levels: Determine the vertical sensitivity of data and the highest classification level a user can access.

• Compartments: Facilitate compartmentalization of data. Users need exclusive membership to a compartment to access its data.

• Groups: Allow hierarchical categorization of data. A user with authorization to a parent group can access data in any of its child groups.

Example labels (level : compartments : groups):

CONF : NAVY,MILITARY : NY,DC
HIGHCONF : MILITARY,NAVY,SPCLOPS : US,UK

Row Label matches User Access Label


RDF Triple-level Security with OLS

Triples table:

Subject | Predicate | Object | RowLabel
projectHLS | Organization | N.America | SE:HLS:US
projectHLS | ContractValue | 1000000 | SE:HLS:FIN,US

• Sensitivity labels associated with individual triples control read access to the triples.

• Triples describing a single resource may employ different sensitivity labels for greater control.

[Diagram: graph view of the same two triples, with security label SE:HLS:US on the Organization edge and SE:HLS:FIN,US on the ContractValue edge]


Securing RDF Data using OLS: Example (1)

• Create an OLS policy
  – Policy is the container for all the labels and user authorizations
  – Can create multiple policies containing different labels

• Create label components
  – Levels: UN (unclassified) < SE (secret) < TS (top secret)
  – Compartments: HLS (Homeland Security), CIA, FBI
  – Groups (child → parent):
      NY, DC → EASTUS → US
      SD, SF → WESTUS → US

• Create labels
  – “EASTSE” = SE:CIA,HLS:EASTUS
  – “USUN” = UN:FBI,HLS:US


Securing RDF Data using OLS: Example (2)

• Assign labels to users
  – John: “EASTSE” (SE:CIA,HLS:EASTUS)
    • John can read SE and UN triples
    • John can read triples for CIA and HLS
    • John can read triples for NY, DC, and EASTUS
    • When inserting a row, the default write label is “EASTSE”
  – Mary: “USUN” (UN:FBI,HLS:US)
    • Mary can only read UN triples
    • Mary can read triples for FBI and HLS
    • Mary can read all group triples (e.g. SF, NY, WESTUS, etc.)
    • When inserting a row, the default write label is “USUN”


Securing RDF Data using OLS: Example (3)

• Apply the OLS policy to RDF store
  – Triple inserts, deletes, updates, and reads will use the policy

• John inserts triple: <http://John> <rdf:type> <http://Person>

• Mary inserts triple: <http://Mary> <rdf:type> <http://Person>

• Both these triples inserted in model but tagged with different label values (“EASTSE”, “USUN”)

• Users can have multiple labels
  – Only one label active at any time (user can switch labels)
  – Only active label applied to operations (e.g. queries, deletes, inferred triples)


Securing RDF Data using OLS: Example (4)

• Example labels and read access
  – John: “EASTSE” (SE:CIA,HLS:EASTUS)
  – Mary: “USUN” (UN:FBI,HLS:US)

John Read | Triple Label | Mary Read
No | TS:HLS:DC | No
No | SE:HLS,FBI:DC | No
Yes | UN:HLS:DC | Yes
Yes | UN:HLS,CIA:NY | No
No | SE:CIA:SF | No
No | UN:HLS,FBI:NY | Yes
No | UN:HLS:SF | Yes
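The read decisions in this example follow from three checks: the user's level is at least the row's level, the user holds every compartment on the row, and at least one row group lies at or below one of the user's groups. The Java sketch below encodes those checks for labels of the form LEVEL:COMPARTMENTS:GROUPS. It is a simplified reading of the rules as presented in these slides, not Oracle Label Security's implementation; it ignores write access, privileges, and labels with empty components.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy read-access check for labels written as LEVEL:COMP1,COMP2:GROUP1,GROUP2. */
final class LabelReadAccessSketch {

    // Levels in ascending sensitivity, as defined in Example (1).
    static final List<String> LEVELS = List.of("UN", "SE", "TS");

    // Group hierarchy from Example (1): child -> parent.
    static final Map<String, String> PARENT_GROUP = Map.of(
            "NY", "EASTUS", "DC", "EASTUS",
            "SD", "WESTUS", "SF", "WESTUS",
            "EASTUS", "US", "WESTUS", "US");

    /** True if group equals ancestor or sits somewhere below it in the hierarchy. */
    static boolean isWithin(String group, String ancestor) {
        for (String g = group; g != null; g = PARENT_GROUP.get(g)) {
            if (g.equals(ancestor)) {
                return true;
            }
        }
        return false;
    }

    /** True if a user holding userLabel may read a row tagged with rowLabel. */
    static boolean canRead(String userLabel, String rowLabel) {
        String[] user = userLabel.split(":");
        String[] row = rowLabel.split(":");
        // 1. Level: the user's level must be at least the row's level.
        if (LEVELS.indexOf(user[0]) < LEVELS.indexOf(row[0])) {
            return false;
        }
        // 2. Compartments: the user must hold every compartment on the row.
        Set<String> userCompartments = new HashSet<>(Arrays.asList(user[1].split(",")));
        for (String compartment : row[1].split(",")) {
            if (!userCompartments.contains(compartment)) {
                return false;
            }
        }
        // 3. Groups: at least one row group must fall under one of the user's groups.
        for (String rowGroup : row[2].split(",")) {
            for (String userGroup : user[2].split(",")) {
                if (isWithin(rowGroup, userGroup)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String john = "SE:CIA,HLS:EASTUS";
        String mary = "UN:FBI,HLS:US";
        for (String row : List.of("TS:HLS:DC", "SE:HLS,FBI:DC", "UN:HLS:DC", "UN:HLS,CIA:NY",
                                  "SE:CIA:SF", "UN:HLS,FBI:NY", "UN:HLS:SF")) {
            System.out.println(row + "  John=" + canRead(john, row) + "  Mary=" + canRead(mary, row));
        }
    }
}
```

Running the seven triple labels from the table through canRead reproduces the Yes/No columns for John and Mary.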


Securing RDF Data using OLS: Example (5)

• Same triple may exist with different labels:
    <http://John> <rdf:type> <http://Person> 'UN:HLS:DC'
    <http://John> <rdf:type> <http://Person> 'SE:HLS:DC'

• When Mary queries, only 1 triple returned (UN triple)
• When John queries, both UN and SE triples are returned
  – No way to distinguish since we don’t return label information!
  – Solution: use FILTER_LABEL option in SEM_MATCH
  – This query will filter out triples that are dominated by SE:

    SELECT s, p, y
    FROM table(sem_match('{?s ?p ?y}', sem_models('TEST'),
               null, null, null, null, 'FILTER_LABEL=SE POLICY_NAME=DEFENSE'));

  – MIN_LABEL can be used to filter out untrustworthy data


For More Information

• See documentation at: http://docs.oracle.com/cd/E11882_01/appdev.112/e11828/toc.htm
• or visit: oracle.com
• or search the web for: Oracle RDF
