Federated Data Stores using
Semantic Web Technology
Steve Ray
Distinguished Research Fellow
Carnegie Mellon University
Interoperability is all about DATA
Three Technology Trends
that could help*
1. Semantic Web technologies
2. Cloud
3. Natural Language Processing
I will focus on semantic web technologies
*Inspired by “Top Three Technologies to Tame the Big Data Beast,” Huffington Post, 11/22/2011
Steve Ray, Carnegie Mellon University
Representation Trends
[Figure: chart of representation technologies, grouped as Legacy, Current Practice, and Exploratory. Labels shown: IBM Card Format, EDI, XML, XML Schema, Metadata, Metamodels, Meta-metamodels, RDF/OWL, FOL, Info Modeling, BPML/BPEL, CBA, Semantic Mediation, Web Services, Protocols, SOA]
(Slide adapted from Donald Hall, Logistics Enterprise Services Office, DLA)
Why Consider RDF & OWL
Semantic Web Technology?
RDF = Resource Description Framework
OWL = Web Ontology Language
1. Simple representation
– Everything is a triple: <subject – predicate – object>
2. Self-describing models
– Schemas and data coexist in data stores
3. Easy to interrogate
– SPARQL queries (over schema and data)
4. Easy to validate
– Supports automated reasoning
5. Easy to interoperate
– Natively supports distributed data stores
Simple Representation
Everything is stored as triples:
<subject predicate object>
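As a rough illustration of the triple idea (a Python sketch, not an actual RDF library), a store can be modeled as a plain set of three-field records. The `Triple` type and the `store` variable are names assumed here for illustration; the example values are borrowed from the deck's own examples.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    """One statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str

# A toy "store" is just a set of triples (illustrative, not a real RDF store).
store = {
    Triple("george", "marriedTo", "lisa"),
    Triple("Human", "subClassOf", "Mammal"),
}

# Print each statement in the <subject predicate object> form used on the slide.
for t in sorted(store):
    print(f"<{t.subject} {t.predicate} {t.obj}>")
```

Because every statement has the same shape, no per-entity table layout is needed; adding a new kind of fact is just adding another triple.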
Self-Describing Models
• The schema (model) and the data are stored in
the same place
• Schema:
– Mammal subClassOf Animal
– Human subClassOf Mammal
• Data:
– george is-a Human
– george marriedTo lisa
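One consequence of keeping schema and data together can be sketched in Python: the same store that says george is a Human also says what a Human is. This is a minimal sketch under the assumption that triples are plain tuples; `classes_of` is a hypothetical helper, not part of any RDF toolkit.

```python
# Schema and data triples live side by side in one set.
triples = {
    ("Mammal", "subClassOf", "Animal"),   # schema
    ("Human", "subClassOf", "Mammal"),    # schema
    ("george", "is-a", "Human"),          # data
    ("george", "marriedTo", "lisa"),      # data
}

def classes_of(individual, store):
    """Return every class the individual belongs to, following subClassOf upward."""
    found = set()
    # Start from the classes asserted directly with is-a.
    frontier = [o for s, p, o in store if s == individual and p == "is-a"]
    while frontier:
        cls = frontier.pop()
        if cls in found:
            continue
        found.add(cls)
        # Climb one level in the class hierarchy stored in the same set.
        frontier.extend(o for s, p, o in store if s == cls and p == "subClassOf")
    return found

print(sorted(classes_of("george", triples)))
```

Here the answer that george is also a Mammal and an Animal comes from the schema triples, without any model defined outside the store.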
Easy to Interrogate
SPARQL†: a language to query an RDF database
(It just matches against patterns of triples)
SELECT ?x
WHERE {
george marriedTo ?x .
}
Returns a table:
x
lisa
SELECT ?y
WHERE {
?y subClassOf Animal .
}
Returns a table:
y
Mammal
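The pattern matching behind the two queries above can be approximated in a few lines of Python. This is a toy sketch, assuming triples are plain tuples and that any term starting with `?` is a variable; `match` is a hypothetical helper and covers only a single triple pattern, far less than real SPARQL.

```python
store = {
    ("Mammal", "subClassOf", "Animal"),
    ("Human", "subClassOf", "Mammal"),
    ("george", "is-a", "Human"),
    ("george", "marriedTo", "lisa"),
}

def match(pattern, triples):
    """Return one {variable: value} binding per triple that fits the pattern.

    Terms beginning with '?' are variables; every other term must match exactly.
    """
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(binding)
    return results

print(match(("george", "marriedTo", "?x"), store))
print(match(("?y", "subClassOf", "Animal"), store))
```

Note that the second pattern returns only Mammal: Human is a subclass of Animal only indirectly, and a plain pattern match does not follow that chain.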
† SPARQL = SPARQL Protocol and RDF Query Language
Easy to Validate
SPARQL can be used
for reasoning,
not just interrogating
If
George sonOf Fred and
Fred siblingOf Mary
Then
George nephewOf Mary
In SPARQL:
CONSTRUCT
{ ?a nephewOf ?c .}
WHERE
{
?a sonOf ?b .
?b siblingOf ?c .
}
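The same rule can be sketched in Python as a join over two triple patterns. This is an illustration only, assuming triples are plain tuples; `construct_nephews` is a hypothetical name, and a real store would evaluate the CONSTRUCT query itself.

```python
facts = {
    ("George", "sonOf", "Fred"),
    ("Fred", "siblingOf", "Mary"),
}

def construct_nephews(store):
    """Derive (a, nephewOf, c) wherever a sonOf b and b siblingOf c."""
    return {
        (a, "nephewOf", c)
        for a, p1, b in store if p1 == "sonOf"
        for b2, p2, c in store if p2 == "siblingOf" and b2 == b
    }

# Like CONSTRUCT, this returns new triples and leaves the original store unchanged.
inferred = construct_nephews(facts)
print(inferred)
```

The shared variable `?b` in the query corresponds to the `b2 == b` check here: the two patterns only combine when they agree on the middle person.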
Easy to Interoperate
• A single query can interact with more than one
RDF database
– Linked Movie Database contains movies, actors
– DBpedia contains people and birthdates
• Find the birthdates of all Star Trek actors
– Answer does not exist in one source
DBpedia is just one of many RDF data stores on the Web.
We are not alone.
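The Star Trek example can be sketched in Python as a join across two separate stores on a shared identifier. Everything here is a stand-in: `movie_store` and `people_store` are small in-memory sets with placeholder values, not the real Linked Movie Database or DBpedia, and `actor_birthdates` is a hypothetical helper. In SPARQL 1.1 itself, a federated query reaches a remote endpoint with the `SERVICE` keyword.

```python
# Hypothetical stand-ins for two independent RDF stores; all values are placeholders.
movie_store = {
    ("movie_1", "title", "Star Trek"),
    ("movie_1", "actor", "actor_1"),
    ("movie_1", "actor", "actor_2"),
    ("movie_2", "title", "Other Film"),
    ("movie_2", "actor", "actor_3"),
}
people_store = {
    ("actor_1", "birthDate", "date_1"),
    ("actor_2", "birthDate", "date_2"),
    ("actor_3", "birthDate", "date_3"),
}

def actor_birthdates(title, movies, people):
    """Join the two stores on the shared actor identifier."""
    # Step 1: find the movies with the requested title (first store only).
    movie_ids = {s for s, p, o in movies if p == "title" and o == title}
    # Step 2: collect their actors (first store only).
    actors = {o for s, p, o in movies if p == "actor" and s in movie_ids}
    # Step 3: look up each actor's birthdate (second store only).
    return {s: o for s, p, o in people if p == "birthDate" and s in actors}

print(actor_birthdates("Star Trek", movie_store, people_store))
```

The join only works because both stores use the same identifier for each actor, which is what shared identifiers across RDF stores are meant to provide.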
Implications
• OWL/RDF provides a representation that can
natively support transformations from other
modeling languages and native formats for
product and process models
• The API is SPARQL
• Storage can be local or web-based
Take-away
• Poor interoperability is expensive
• Interoperability solutions can be expensive
• Semantic technology can make interoperability
solutions easier and cheaper to implement