Federated Data Stores using
Semantic Web Technology
Steve Ray
Distinguished Research Fellow
Carnegie Mellon University
Interoperability is all about DATA
Three Technology Trends
that could help*
1. Semantic Web technologies
2. Cloud
3. Natural Language Processing
I will focus on semantic web technologies
*Inspired by “Top Three Technologies to Tame the Big Data Beast,” Huffington Post, 11/22/2011
Steve Ray, Carnegie Mellon University
Representation Trends
[Figure: chart of representation technologies, grouped as Legacy, Current Practice, and Exploratory. Labels: IBM Card Format, EDI, XML, XML Schema, Metadata, Metamodels, Meta-meta-models, RDF/OWL, FOL, Info Modeling, Protocols, Web Services, SOA, BPML/BPEL, CBA, Semantic Mediation]
(Slide adapted from Donald Hall, Logistics Enterprise Services Office, DLA)
Why Consider RDF & OWL
Semantic Web Technology?
RDF = Resource Description Framework
OWL = Web Ontology Language
1. Simple representation
– Everything is a triple: <subject – predicate – object>
2. Self-describing models
– Schemas and data coexist in data stores
3. Easy to interrogate
– SPARQL queries (over schema and data)
4. Easy to validate
– Supports automated reasoning
5. Easy to interoperate
– Natively supports distributed data stores
Simple Representation
Everything is stored as triples:
<subject predicate object>
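A minimal sketch of the triple idea in plain Python, not tied to any RDF library. The `Triple` type and the `store` set are illustrative names, and the facts are the ones used later in this deck.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One fact: <subject predicate object>."""
    subject: str
    predicate: str
    object: str


# A toy in-memory "store" is just a set of triples.
store = {
    Triple("george", "marriedTo", "lisa"),
    Triple("george", "is-a", "Human"),
}

# Every fact has the same three-part shape, so one loop handles all of them.
for t in sorted(store):
    print(f"<{t.subject} {t.predicate} {t.object}>")
```

A real RDF store identifies subjects and predicates with URIs rather than bare strings; plain strings are used here only to keep the sketch short.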
Self-Describing Models
• The schema (model) and the data are stored in
the same place
• Schema:
– Mammal subClassOf Animal
– Human subClassOf Mammal
• Data:
– george is-a Human
– george marriedTo lisa
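One way to see why this matters: because the schema triples sit next to the data triples, the same code can walk both. The sketch below is plain Python with hypothetical helper names (`superclasses`, `classes_of`), not an RDF library API.

```python
# Schema and data triples live side by side in one set.
store = {
    ("Mammal", "subClassOf", "Animal"),   # schema
    ("Human", "subClassOf", "Mammal"),    # schema
    ("george", "is-a", "Human"),          # data
    ("george", "marriedTo", "lisa"),      # data
}


def superclasses(cls, triples):
    """Return every class reachable from cls by following subClassOf links."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == "subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def classes_of(individual, triples):
    """Return the declared classes of an individual plus all their superclasses."""
    direct = {o for s, p, o in triples if s == individual and p == "is-a"}
    inherited = set()
    for cls in direct:
        inherited |= superclasses(cls, triples)
    return direct | inherited


george_classes = classes_of("george", store)
```

Here `george_classes` contains Human, Mammal, and Animal, even though only "george is-a Human" was stated as data.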
Easy to Interrogate
SPARQL†: a language to query an RDF database
(Just matches against patterns of triples)
SELECT ?x
WHERE {
george marriedTo ?x .
}
Returns a table:
x
lisa
SELECT ?y
WHERE {
?y subClassOf Animal .
}
Returns a table:
y
Mammal
† SPARQL = SPARQL Protocol and RDF Query Language
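The "matches against patterns of triples" idea can be sketched in a few lines of plain Python. This is an illustration of single-pattern matching only, not how a SPARQL engine is implemented; the `match` function is a hypothetical name.

```python
store = {
    ("Mammal", "subClassOf", "Animal"),
    ("Human", "subClassOf", "Mammal"),
    ("george", "is-a", "Human"),
    ("george", "marriedTo", "lisa"),
}


def match(pattern, triples):
    """Return one {variable: value} binding per triple that fits the pattern.

    Pattern terms starting with '?' are variables; all others must match exactly.
    """
    results = []
    for triple in triples:
        binding = {}
        for want, have in zip(pattern, triple):
            if want.startswith("?"):
                # A variable used twice must bind to the same value both times.
                if binding.setdefault(want, have) != have:
                    break
            elif want != have:
                break
        else:
            results.append(binding)
    return results


spouses = match(("george", "marriedTo", "?x"), store)
subclasses = match(("?y", "subClassOf", "Animal"), store)
```

As in the slide, the second pattern finds only Mammal: plain pattern matching sees the stated triple "Mammal subClassOf Animal" and does not infer that Human is also a kind of Animal.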
Easy to Validate
SPARQL can be used for reasoning, not just interrogating
If
George sonOf Fred and
Fred siblingOf Mary
Then
George nephewOf Mary
In SPARQL:
CONSTRUCT
{ ?a nephewOf ?c .}
WHERE
{
?a sonOf ?b .
?b siblingOf ?c .
}
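The rule above amounts to a join on the shared variable ?b followed by emitting a new triple. A minimal plain-Python sketch of that behavior, with `construct_nephews` as a hypothetical name, might look like this:

```python
store = {
    ("George", "sonOf", "Fred"),
    ("Fred", "siblingOf", "Mary"),
}


def construct_nephews(triples):
    """Apply the rule: ?a sonOf ?b . ?b siblingOf ?c  =>  ?a nephewOf ?c."""
    derived = set()
    for a, p1, b in triples:
        if p1 != "sonOf":
            continue
        # Join step: find siblingOf triples whose subject is the same ?b.
        for b2, p2, c in triples:
            if p2 == "siblingOf" and b2 == b:
                derived.add((a, "nephewOf", c))
    return derived


new_triples = construct_nephews(store)
```

Like CONSTRUCT, the function returns the derived triples rather than changing the store; adding them back is a separate decision.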
Easy to Interoperate
• A single query can interact with more than one
RDF database
– Linked Movie Database contains movies, actors
– DBpedia contains people and birthdates
• Find the birthdates of all Star Trek actors
– Answer does not exist in one source
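The Star Trek question can be sketched as a join across two separate stores. In SPARQL 1.1 this kind of query is written with the SERVICE keyword; the plain-Python version below only illustrates the idea. Both stores, every identifier, and every date are made-up placeholders, and the sketch assumes the two sources already share identifiers for the same person.

```python
# Hypothetical stand-ins for two separate RDF stores; all names and dates are made up.
movie_store = {
    ("film_1", "title", "Star Trek"),
    ("film_1", "actor", "actor_a"),
    ("film_1", "actor", "actor_b"),
    ("film_2", "title", "Other Film"),
    ("film_2", "actor", "actor_c"),
}
people_store = {
    ("actor_a", "birthDate", "1901-01-01"),
    ("actor_b", "birthDate", "1902-02-02"),
    ("actor_c", "birthDate", "1903-03-03"),
}


def birthdates_of_cast(title, movies, people):
    """Join two stores: cast list from one, birthdates from the other."""
    films = {s for s, p, o in movies if p == "title" and o == title}
    actors = {o for s, p, o in movies if p == "actor" and s in films}
    return {s: o for s, p, o in people if p == "birthDate" and s in actors}


cast_birthdates = birthdates_of_cast("Star Trek", movie_store, people_store)
```

Neither store can answer the question alone: the movie store has no dates and the people store has no films.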
DBpedia is just one of many RDF data stores on the Web
We are not alone
Implications
• OWL/RDF provides a representation that can
natively support transformations from other
modeling languages and native formats for
product and process models
• The API is SPARQL
• Storage can be local or web-based
Take-away
• Poor interoperability is expensive
• Interoperability solutions can be expensive
• Semantic technology can make interoperability
solutions easier and cheaper to implement