Ontologies & linked open data

  • Published on
    10-May-2015

DESCRIPTION

A brief presentation I made as an invited lecture.

Transcript

  • 1. Ontologies & Linked Open Data: A brief overview and some real-world applications. João Rocha da Silva, joaorosilva@gmail.com, December 2013

  • 2. Contents
      • Ontologies: the importance of semantics in the data storage and querying layer
      • Popular ontologies: DCTerms, FOAF
      • The Semantic Web in practice: Linked Open Data in the Facebook API and in DBpedia
      • Relational vs. graph: differences
      • The SPARQL language: examples
      • A non-relational database: OpenLink Virtuoso

  • 3. The importance of semantics

  • 4. The importance of semantics
      • How does someone understand the meaning of the columns in a relational database?
      • Reading a lot of documentation
      • Hard to provide information to external systems
      • Tailor-made web services required!

  • 5. SAP (one of 78,826 tables and counting)
    Source: http://scn.sap.com/thread/1743542

  • 6. MediaWiki
    Source: http://upload.wikimedia.org/wikipedia/commons/thumb/4/42/MediaWiki_1.20_%2844edaa2%29_database_schema.svg/2500px-MediaWiki_1.20_%2844edaa2%29_database_schema.svg.png

  • 7. Now imagine we want to have images of different kinds, with different attributes
    MediaWiki source: http://upload.wikimedia.org/wikipedia/commons/thumb/4/42/MediaWiki_1.20_%2844edaa2%29_database_schema.svg/2500px-MediaWiki_1.20_%2844edaa2%29_database_schema.svg.png

  • 8. The importance of semantics
      • Building a query over such a system is complex
      • Requires knowledge of its intricate and subtle aspects
      • Some columns even contain flags for business logic processing (o_O)
      • Bad design decisions = spaghetti code

  • 9. Relational vs. Ontology

  • 10. SELECT employee.id AS employee_id,
             engineer.id AS engineer_id,
             manager.id AS manager_id,
             employee.name AS employee_name,
             employee.type AS employee_type,
             engineer.engineer_info AS engineer_engineer_info,
             manager.manager_data AS manager_manager_data
      FROM employee
      LEFT OUTER JOIN engineer ON employee.id = engineer.id
      LEFT OUTER JOIN manager ON employee.id = manager.id
      [...]

  • 11. Building the U.Porto Ontology

  • 12. up: a hypothetical ontology for U.Porto (diagram)
      • Classes shown: foaf:Person, org:Organization, up:Student, up:Faculty, up:PhDStudent, up:Thesis, rdfs:Literal
      • Properties shown: rdfs:subClassOf, org:memberOf, up:thesis, dc:title
    Reference: http://www.w3.org/TR/vocab-org/

  • 13. Representing a person

  • 14. Diagram of the resource http://www.fe.up.pt/~pro11004
      • rdf:type up:PhDStudent
      • foaf:name "João Rocha"
      • org:memberOf http://www.fe.up.pt/
    References: http://www.w3.org/TR/rdf-schema/ and http://www.foaf-project.org/

  • 15. Getting all the students
      SELECT ?uri ?attribute ?value
      FROM
      WHERE {
        ?uri rdf:type up:Student.
        ?uri ?attribute ?value
      }
      • Will fetch all the students, regardless of their type
      • Will also return their attributes (the equivalent of database columns)
      • Different types of students will have different attributes

  • 16. How does the system know that a manager is also an employee? Inference
    The inference engine recognizes certain properties and builds virtual triples in the background.
    http://docs.openlinksw.com/virtuoso/rdfsparqlrule.html

  • 17. Inference is good
      • Transitive properties (subclass of subclass)
      • Subclasses
      • Multiple inheritance handling (Student + Researcher + ScholarshipHolder)
      • Saves coding time spent writing complex queries

  • 18. Nothing comes for free
      • NO referential integrity or foreign keys!
      • Aggregation operators are slow
      • Transactions are not supported in standard SPARQL (the SPARQL 1.1 Update specification says requests should be treated atomically, but does not require it)
      • Graph DBMS solutions are in early stages (many bugs, many betas, many mailing lists)

  • 19. However
      • Graph databases allow for flexible, intuitive representations of the data
      • They handle billions of triples
      • Restriction-based querying makes queries more high-level

  • 20. Query examples

  • 21. DBpedia: find Facebook's CEO and the university where he studied
      PREFIX prop:
      PREFIX dbprop:
      select distinct ?s ?almaMater
      where {
        ?s dbpedia-owl:almaMater ?almaMater.
        ?s dbprop:knownFor ?knownFor.
        FILTER regex(?occupation, "Facebook", "i")
        ?s dbprop:occupation ?occupation.
        FILTER regex(?occupation, "CEO", "i")
      }
      LIMIT 100
    Try it at http://dbpedia.org/sparql

  • 22. DBpedia: find all fun (aka rear-wheel-drive) cars from the eighties, made by Japanese manufacturers
      select distinct (?car) ?manufacturer
      where {
        ?car rdf:type dbpedia-owl:Automobile.
        ?car dbpedia-owl:layout .
        ?car dbpedia-owl:productionStartYear ?startYear.
        FILTER ( ?startYear < "1990-01-01 00:00:00"^^xsd:date )
        FILTER ( ?startYear > "1980-01-01 00:00:00"^^xsd:date )
        ?car ?manufacturer.
        {
          SELECT distinct(?manufacturer)
          WHERE {
            ?car dbpedia-owl:manufacturer ?manufacturer.
            ?manufacturer ?location.
            FILTER regex(?location, "Japan", "i")
          }
        }
      }
      LIMIT 100
    Try it at http://dbpedia.org/sparql

  • 23. Custom query: what do you want to know?

  • 24. Virtuoso, a graph database

  • 25. Conclusions
      • Relational databases: mature, robust, support transactions; hard to model entities with dynamic attributes; complex querying
      • Graph databases: recent technology; handle billions of triples; higher-level, more abstract querying

  • 26. João Rocha da Silva
    Research Data Management and Semantic Web Researcher, Web & iPhone Developer
    João Rocha da Silva is an Informatics Engineering PhD student at the Faculty of Engineering of the University of Porto. He specializes in research data management, applying the latest Semantic Web technologies to the adequate preservation and discovery of research data assets.
    He is experienced in many programming languages (Javascript-Node, PHP with MVC frameworks, Ruby on Rails, J2EE, etc.) running on the major operating systems (everyday Mac user). Regardless of language, he is a quick learner who can adapt to any new technology quickly and effectively.
    He is also an experienced freelance iOS developer with several apps published on the App Store, and a self-taught DIY mechanic with a special interest in classic cars, particularly his 1987 Toyota Corolla GT Twin Cam, also known as Hachi-Roku or AE86.
    joaorosilva@gmail.com
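
The subclass inference described on slides 16 and 17 can be sketched in plain Python, without any triple store. This is only an illustration under stated assumptions: the class hierarchy (up:PhDStudent under up:Student under foaf:Person) is a guess at the hypothetical U.Porto ontology, the individuals ex:alice and ex:bob are made up, and the inferred triples are materialised eagerly here, whereas an engine such as Virtuoso may derive them differently.

```python
# Toy triple store: each fact is a (subject, predicate, object) tuple.
# ASSUMPTION: the class hierarchy and the ex: individuals are hypothetical,
# chosen only to mirror the "up" ontology idea from the slides.
TRIPLES = {
    ("up:Student", "rdfs:subClassOf", "foaf:Person"),
    ("up:PhDStudent", "rdfs:subClassOf", "up:Student"),
    ("ex:alice", "rdf:type", "up:PhDStudent"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:bob", "rdf:type", "up:Student"),
    ("ex:bob", "foaf:name", "Bob"),
}


def superclasses(cls, triples):
    """Return cls plus every class reachable through rdfs:subClassOf."""
    seen = {cls}
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


def infer_types(triples):
    """Add the rdf:type triples implied by the class hierarchy."""
    inferred = set(triples)
    for s, p, o in triples:
        if p == "rdf:type":
            for cls in superclasses(o, triples):
                inferred.add((s, "rdf:type", cls))
    return inferred


def instances_of(cls, triples):
    """Rough analogue of: SELECT ?uri WHERE { ?uri rdf:type <cls> }."""
    return sorted(s for s, p, o in triples if p == "rdf:type" and o == cls)


if __name__ == "__main__":
    # Without inference, only resources typed directly as up:Student match.
    print(instances_of("up:Student", TRIPLES))
    # With inference, PhD students are returned as students too.
    print(instances_of("up:Student", infer_types(TRIPLES)))
```

The query function never mentions up:PhDStudent, which is the point made on slide 17: once the hierarchy is declared, the query stays short and new student subtypes need no query changes.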