Relational database integration with RDF/OWL
Bob DuCharme
December 7, 2006
XML 2006
2
About me
• Senior Consultant, Innodata Isogen
• weblog: http://www.snee.com/bobdc.blog
• other writing: See http://www.snee.com/bob
3
What is an RDF/OWL ontology?
• Ontology: “Computational formalization of a subject matter” (Bijan Parsia et al)
• Describe metadata about resource classes and their relationships
• Web Ontology Language (OWL): a W3C update of DAML+OIL
• Good fit with Knowledge Representation and other AI work
• Ontologies vs. traditional schemas
4
“Ontologies for the sake of ontologies”
• If metadata is data about data, what data is your metadata about?
• Field of Dreams attitude of many ontology developers
5
RDF in one slide
• A data model, not a syntax.
• Three-part statement called a triple: (Subject, Predicate, Object)
• For example: (urn:isbn:0553213113, http://purl.org/dc/elements/1.1/creator, "Herman Melville")
• Great for loosely structured data, but…
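A minimal Python sketch of the triple model, using the example statement above. The `Triple` class and the `objects` helper are hypothetical names chosen for illustration; they are not part of any RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: (subject, predicate, object)."""
    subject: str
    predicate: str
    object: str


DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"

# The example statement from the slide, held as plain data.
triples = [
    Triple("urn:isbn:0553213113", DC_CREATOR, "Herman Melville"),
]


def objects(triples, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [t.object for t in triples
            if t.subject == subject and t.predicate == predicate]


print(objects(triples, "urn:isbn:0553213113", DC_CREATOR))
```

Because every statement has the same three-part shape, new predicates can be added without changing any schema, which is why the model suits loosely structured data.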
6
RDBMS integration with RDF/OWL
• This presentation: background + demo
• Paper accompanying presentation:
7
Use Cases
• Two address book databases that use different names (e.g. workState, businessState)
• Find useful queries across the two that are easier in SPARQL than in SQL, thanks to RDF/OWL:
  • Who works in NY state?
  • List any phone numbers (home, mobile, business, etc.) that I have for Alfred Adams.
  • Find all info for Bobby Fischer at 2304 Eighth Lane, even if the other database lists him as Robert L. Fischer of 2304 8th Ln.
8
Basic Steps
• Generate data
• Load into MySQL
• Let D2RQ (RDBMS/RDF interface server) know about those databases
• Get a dump of representative RDF data
• Create ontology for that data
• Issue ontology-aware SPARQL queries against that data
9
Generate Data
• Fill out every field in a Eudora address book entry, export to CSV, see what's there
• Repeat for Outlook
• Write a Python script to generate data, e.g.
"Miguel","[email protected]","Miguel Porter","Miguel","Porter","1462 Oak St.","Kitchener","TN","US","67117-2620","(364) 769-1070","(431) 985-7923","(850) 998-7790","http://www.radioshack.com/Miguel","RadioShack","","2109 Green Ave.","Boston","MP","US","48379-6760","(824) 959-5268","(354) 384-8517","(992) 963-9772","http://www.radioshack.com", "[email protected]","(748) 965-6871","","Here is a sample note.\n\nThat was two carriage returns."
10
Load into MySQL
CREATE DATABASE eudora;
USE eudora;
CREATE TABLE entries (
  nickname  VARCHAR(20),
  email1    VARCHAR(50),
  fullName  VARCHAR(30),
  firstName VARCHAR(15),
  lastName  VARCHAR(20),
  address   VARCHAR(60),
  # etc.
  PRIMARY KEY (lastName, firstName)
);
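The generate-and-load steps above could be sketched in Python roughly as follows. This is not the presentation's actual script: the name lists and the helpers `fake_entries` and `load_entries` are hypothetical, only a few of the columns are included, and the standard-library `sqlite3` module stands in for MySQL so the sketch runs without a server.

```python
import random
import sqlite3

# Illustrative values only; the real script's word lists are not shown in the slides.
FIRST_NAMES = ["Miguel", "Alfred", "Jill"]
LAST_NAMES = ["Porter", "Adams", "Jones"]
STREETS = ["Oak St.", "Green Ave.", "Maple Lane"]


def fake_entries(seed=0):
    """Yield (nickname, fullName, firstName, lastName, address) rows.

    Iterating over every first/last name pair keeps the
    (lastName, firstName) primary key unique.
    """
    rng = random.Random(seed)
    for first in FIRST_NAMES:
        for last in LAST_NAMES:
            address = f"{rng.randint(1, 9999)} {rng.choice(STREETS)}"
            yield (first, f"{first} {last}", first, last, address)


def load_entries(connection, rows):
    """Create a cut-down entries table and insert the generated rows."""
    connection.execute(
        """CREATE TABLE entries (
               nickname  VARCHAR(20),
               fullName  VARCHAR(30),
               firstName VARCHAR(15),
               lastName  VARCHAR(20),
               address   VARCHAR(60),
               PRIMARY KEY (lastName, firstName)
           )"""
    )
    connection.executemany("INSERT INTO entries VALUES (?, ?, ?, ?, ?)", rows)
    connection.commit()


connection = sqlite3.connect(":memory:")  # assumption: SQLite instead of MySQL
load_entries(connection, fake_entries())
print(connection.execute("SELECT COUNT(*) FROM entries").fetchone()[0])
```

Seeding the random generator makes the generated data repeatable, which helps when comparing query results across runs.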
11
Tell D2RQ about databases
• Generate mapping files (command lines split):
  generate-mapping -o eudoraMapping.ttl -u root -p mypw
    jdbc:mysql://localhost/eudora
  generate-mapping -o outlookMapping.ttl -u root -p if27
    jdbc:mysql://localhost/outlook
• Combine two mapping files
• Start server with combined mapping file:
  d2r-server comboMapping.ttl
12
Get some data to use for ontology creation
• SPARQL Query:
CONSTRUCT { ?s ?p ?o }
WHERE { ?s ?p ?o }
• URL version:
http://localhost:2020/sparql?query=CONSTRUCT+%7B+%3Fs+%3Fp+%3Fo+%7D+WHERE+%7B+%3Fs+%3Fp+%3Fo+%7D
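The URL version is the same query with its text form-encoded into the `query` parameter. A short Python sketch of that encoding, assuming the endpoint address shown above (the `sparql_url` helper is a hypothetical name):

```python
from urllib.parse import quote_plus

ENDPOINT = "http://localhost:2020/sparql"
QUERY = "CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }"


def sparql_url(endpoint, query):
    """Return a GET URL that carries the query in the 'query' parameter."""
    # quote_plus turns spaces into '+' and percent-encodes '{', '}' and '?'.
    return endpoint + "?query=" + quote_plus(query)


print(sparql_url(ENDPOINT, QUERY))
```

Building the URL this way avoids hand-encoding mistakes when the query grows beyond a single triple pattern.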
13
rdfcat.xsl
• XSLT 1.0 stylesheet to create a single RDF file from a source file like this:
<rdfcat xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include href="myfile1.rdf"/>
<xi:include href="myfile2.rdf"/>
<xi:include href="myfile3.rdf"/>
</rdfcat>
14
List of files to concatenate together (rdfcat.rdf)
<rdfcat xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include href="http://localhost:2020/sparql?query=CONSTRUCT+%7B+%3Fs+%3Fp+%3Fo+%7D+WHERE+%7B+%3Fs+%3Fp+%3Fo+%7D"/>
<!--xi:include href="properties.owl"/-->
</rdfcat>
• Short XSLT stylesheet reads listed resources, concatenates them together. Now we have RDF of sample data.
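The stylesheet itself is not reproduced in the slides. As a rough Python stand-in for the same idea (not the author's XSLT), the sketch below merges the top-level children of several RDF/XML documents under one `rdf:RDF` root. The `concatenate_rdf` helper and the two sample documents are hypothetical, and the sketch ignores details such as `xml:base` and clashing blank-node IDs.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ET.register_namespace("rdf", RDF_NS)


def concatenate_rdf(documents):
    """Copy the children of each document's root into one rdf:RDF element."""
    combined = ET.Element(f"{{{RDF_NS}}}RDF")
    for text in documents:
        root = ET.fromstring(text)
        combined.extend(list(root))
    return combined


# Two tiny illustrative documents; real input would be the D2RQ dump
# and, later, properties.owl.
doc_a = (f'<rdf:RDF xmlns:rdf="{RDF_NS}">'
         '<rdf:Description rdf:about="urn:a"/></rdf:RDF>')
doc_b = (f'<rdf:RDF xmlns:rdf="{RDF_NS}">'
         '<rdf:Description rdf:about="urn:b"/></rdf:RDF>')

combined = concatenate_rdf([doc_a, doc_b])
print(ET.tostring(combined, encoding="unicode"))
```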
15
Generate ontology
• Tell SWOOP to load an ontology… then just load a regular RDF file!
• Save it right away, see what you have.
• Add That Value:
• Define more relationships between properties with Swoop
• Save it
• Look at the resulting ontology
16
New ontology rules
• Define equivalent fields in the two databases
• Declare "phone" property, name its subproperties (home, mobile, cell, work, business, fax…)
• Declare email as an inverse functional property
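To illustrate what the first two rules buy you, here is a small Python sketch of the inferences a reasoner draws from them. It is a single-pass toy, not a substitute for an OWL reasoner; the `infer` helper, the bare `phone` property name, and the subject identifiers are assumptions made for the example, while the property names come from the two databases' generated mappings.

```python
# Each specific phone property is declared a subproperty of "phone".
SUBPROPERTY_OF = {
    "eudora:entries_workPhone": "phone",
    "eudora:entries_mobile": "phone",
    "outlook:entries_businessPhone": "phone",
    "outlook:entries_mobilePhone": "phone",
}

# Equivalent fields in the two databases, listed in both directions.
EQUIVALENT = {
    "eudora:entries_workState": "outlook:entries_businessState",
    "outlook:entries_businessState": "eudora:entries_workState",
}


def infer(triples):
    """Return the input triples plus those implied by the rules above."""
    inferred = set(triples)
    for subject, predicate, value in triples:
        if predicate in SUBPROPERTY_OF:
            inferred.add((subject, SUBPROPERTY_OF[predicate], value))
        if predicate in EQUIVALENT:
            inferred.add((subject, EQUIVALENT[predicate], value))
    return inferred


facts = {
    ("adams", "eudora:entries_workPhone", "(768) 629-3639"),
    ("jones", "outlook:entries_businessState", "NY"),
}
for triple in sorted(infer(facts)):
    print(triple)
```

With these inferred triples in place, a query on the general `phone` property, or on either state property, finds matches from both databases.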
17
Separate new rules into separate file
<rdfcat xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include href="http://localhost:2020/sparql?query=CONSTRUCT+%7B+%3Fs+%3Fp+%3Fo+%7D+WHERE+%7B+%3Fs+%3Fp+%3Fo+%7D"/>
<xi:include href="properties.owl"/>
</rdfcat>
18
Issue Queries
• Who works in NY state?
• List any phone numbers (home, mobile, business, etc.) that I have for Alfred Adams.
• Find all info for Bobby Fischer at 2304 Eighth Lane, even if the other database lists him as Robert L. Fischer of 2304 8th Ln.
• Sample run of a Pellet query (split onto two lines):
pellet -if file:///dat/xml/rdf/databaseint/sampleout.rdf
  -ifmt RDF/XML -qf atest1.spq
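One way the third query can succeed despite the differing names and addresses is the inverse functional email rule: two records that share an email value are taken to describe the same person. The Python sketch below imitates that merge. It is only an illustration of the idea, not how Pellet works internally, and the `merge_by_email` helper, the record layout, and the example.org address are all hypothetical.

```python
def merge_by_email(records):
    """Group fields from every source under the shared email address."""
    merged = {}
    for record in records:
        person = merged.setdefault(record["email"], {})
        source = record["source"]
        for field, value in record["fields"].items():
            # Keep the source prefix so both spellings survive the merge.
            person[f"{source}:{field}"] = value
    return merged


records = [
    {"source": "eudora", "email": "bobby@example.org",
     "fields": {"fullName": "Bobby Fischer", "address": "2304 Eighth Lane"}},
    {"source": "outlook", "email": "bobby@example.org",
     "fields": {"fullName": "Robert L. Fischer", "address": "2304 8th Ln."}},
]

for email, fields in merge_by_email(records).items():
    print(email, fields)
```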
19
Who works in NY state?
PREFIX e: <http://localhost:2020/resource/eudora/>
PREFIX o: <http://localhost:2020/resource/outlook/>
SELECT * WHERE { ?s e:entries_workState "NY" }
--------------------------------------------------------------
Query Results (9 answers):
s
================
jill:Jones
sarah:Richardson
victor:Hernandez
elaine:Sanchez
annie:Butler
rodney:Jones
jesus:Wells
curtis:Barnes
crystal:Martin
20
Alfred Adams’ phone numbers
PREFIX e: <http://localhost:2020/resource/entries/>
PREFIX eud: <http://localhost:2020/resource/eudora/>
SELECT ?phoneType ?phone WHERE {
  ?s ?phoneType ?phone.
  ?s e:phone ?phone.
  ?s eud:entries_lastName "Adams".
  ?s eud:entries_firstName "Alfred".
}
-------------------------------------------------------
Query Results (13 answers):
phoneType | phone
================================================
outlook:entries_businessPhone | "(768) 629-3639"
eudora:entries_workPhone | "(768) 629-3639"
eudora:entries_workFax | "(865) 937-1192"
eudora:entries_workMobile | "(262) 851-6276"
eudora:entries_otherPhone | "(840) 290-6143"
eudora:entries_mobile | "(257) 372-7719"
et cetera…
outlook:entries_mobilePhone | "(257) 372-7719"
21
Bobby Fischer info
SELECT * WHERE { <http://localhost:2020/resource/entries/Bobby/Fisher> ?p ?o }
--------------------------------------------------------------------------------
Query Results (41 answers):
p | o
===============================================================================
eudora:entries_mobile | "(989) 402-5141"
eudora:entries_workWebAddress | "http://www.atmosenergy.com"
outlook:entries_lastName | "Fisher"
eudora:entries_firstName | "Bobby"
eudora:entries_state | "NE"
eudora:entries_zip | "29565-9670"
outlook:entries_businessPhone | "(167) 559-3177"
eudora:entries_lastName | "Fisher"
eudora:entries_workCity | "El Paso"
eudora:phone | "(974) 270-6457"
# et cetera...
eudora:entries_country | "US"
eudora:entries_otherPhone | "(974) 270-6457"
outlook:entries_mobilePhone | "(974) 270-6457"
outlook:entries_homePhone | "(254) 133-8460"
eudora:entries_workMobile | "(602) 997-9361"
eudora:entries_workAddress | "3839 Maple Lane"
eudora:entries_workOrganization | "Atmos Energy"
eudora:entries_email1 | "[email protected]"
eudora:entries_fullName | "Bobby Fisher"
eudora:entries_workTitle | ""
outlook:entries_businessState | "NE"
outlook:entries_firstName | "Bobby"
eudora:entries_city | "New York"