UnifiedViews is a joint project currently maintained by Semantic Web Company (SWC) and Semantica.cz. It was mainly developed at Charles University in Prague as a student project called ODCleanStore (version 2). It builds on the experience SWC gained with the LOD Management Suite (LODMS), used in WP7, and on ODCleanStore (version 1), developed by Charles University in Prague for the WP9a use case of the LOD2 FP7 project. In the next release of the LOD2 stack, UnifiedViews will replace LODMS as the ETL tool, and the tool has already been adopted in other projects. In the webinar we will give a brief overview of the UnifiedViews project (Helmut Nagy). The main part will be a presentation of the tool and its capabilities (Tomas Knap).
LOD2 Webinar . 29.11.2011 . Page 1 http://lod2.eu
Creating Knowledge out of Interlinked Data
LOD2 is a large-scale integrating project co-funded by the European Commission within the FP7 Information and Communication Technologies Work Programme. This 4-year project comprises leading Linked Open Data technology researchers, companies, and service providers. The partners come from 12 countries and are coordinated by the Agile Knowledge Engineering and Semantic Web Research Group at the University of Leipzig, Germany.
LOD2 will integrate and syndicate Linked Data with existing large-scale applications. The project shows the benefits in the scenarios of Media and Publishing, Corporate Data intranets and eGovernment.
Once per month, the LOD2 webinar series offers a free webinar about tools and services along the Linked Open Data Life Cycle.
Stay with us and learn more about acquisition, editing, composing, connected applications – and finally publishing Linked Open Data.
UnifiedViews
Tomáš Knap, Semantica.cz
Helmut Nagy, Semantic Web Company
Agenda
• What is UnifiedViews
• Short History: From ODCleanStore & LOD Manager to UnifiedViews
• Presentation of UnifiedViews
• Outlook, Impact, the UnifiedViews project
Motivation for UnifiedViews
• Suppose a Linked Data consumer is defining a data processing task: building a data mart that integrates information from various RDF and non-RDF sources.
– There are tools available for RDF data extraction, enrichment, linking, transformation, ...
– Any23, Virtuoso, Silk, …
• Still, the consumer has to (among other activities):
– Write a custom script that executes the tools in the required order and with the required configurations
– Schedule the script
– Add notification capabilities, such as sending an email in case of problems
• Maintenance of such a task is challenging
– In case of problems, the consumer has to manually launch the problematic tool with the proper input data and the problematic configuration, load the output data into an RDF store, and browse/query the data
– The consumer can quickly get lost as the number of configurations and tools in use grows; as a result, duplicated configurations may appear
– The consumer cannot share already prepared configurations and cannot use configurations prepared by others
Problem and Our Solution
• General problem: consumers have to write most of the logic to define, execute, monitor, schedule, and share data processing tasks
• We propose UnifiedViews, an Extract-Transform-Load (ETL) framework
– The data processing task is a central concept
– Another central concept is native support for the RDF data format and ontologies
Short History: From ODCleanStore & LOD Manager to UnifiedViews
Two tools targeting the same purpose with different strengths
One tool aligning the ideas of both tools and going beyond that
Presentation of UnifiedViews
• Basic Concepts
• Key Features
• Demo
Basic Concepts in UnifiedViews – A Pipeline
• Every data processing task is modelled as a pipeline.
Basic Concepts in UnifiedViews – Data Processing Unit (DPU)
• A DPU is a component (plugin, module) on the pipeline
• Every DPU has certain inputs, outputs, business logic, and configuration. Based on the input and the configuration, the outputs are created.
– E.g., a DPU may apply a certain set of SPARQL Update queries to the input RDF data and produce output RDF data.
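The DPU idea (inputs + configuration → outputs) can be sketched in a few lines. This is only an illustration in Python, not the UnifiedViews plugin API (real DPUs are Java/OSGi bundles); the names `RenamePredicateConfig`, `RenamePredicateDpu`, and the tuple-based triple representation are assumptions made for the example, and a simple predicate rename stands in for a SPARQL Update query.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

# Assumption: a triple is modelled as three plain strings (subject, predicate, object).
Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class RenamePredicateConfig:
    """Hypothetical configuration object: which predicate to rewrite, and to what."""
    old_predicate: str
    new_predicate: str


@dataclass
class RenamePredicateDpu:
    """Hypothetical DPU: one input, one output, behavior driven by its config."""
    config: RenamePredicateConfig

    def execute(self, input_triples: FrozenSet[Triple]) -> FrozenSet[Triple]:
        # Business logic: copy every triple, swapping the configured predicate.
        return frozenset(
            (s, self.config.new_predicate if p == self.config.old_predicate else p, o)
            for s, p, o in input_triples
        )


dpu = RenamePredicateDpu(RenamePredicateConfig("ex:name", "foaf:name"))
output = dpu.execute(frozenset({
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:age", "30"),
}))
```

Keeping the configuration in its own object, separate from the business logic, mirrors the split the slides describe: the same DPU code can be reused on a pipeline with a different configuration.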
Key Features
• Web administration interface:
– Define and manage pipelines
– Validate, execute, monitor and debug pipelines
– Possibility to schedule tasks, set up notifications about the pipeline executions
– Define and manage DPUs
– Possibility to debug inputs to/outputs from DPUs
– Possibility to share pipelines and DPUs
– Possibility to get notifications about the result of the pipeline execution
– Multi-user environment
• Robust engine running the tasks
– Ensures that DPUs on the pipeline are executed in the proper order
– It may send notifications about the result of the pipeline execution
• Core DPUs to work with RDF data
• Easy way to extend UnifiedViews with your own DPUs
– Every DPU is an OSGi bundle; as a result, two DPUs that require two different versions of the same library may coexist in the framework
– Possibility to reload DPUs on the fly
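One way an engine can guarantee that DPUs run "in the proper order" is to treat the pipeline as a dependency graph and execute it in topological order. The sketch below assumes that approach; it is not taken from the UnifiedViews engine, and `run_pipeline`, the `notify` callback, and the three-step example pipeline are hypothetical names used only for illustration. It relies on `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_pipeline(dependencies, dpus, notify):
    """Run each DPU after all DPUs it depends on, then report the result.

    dependencies: maps a DPU name to the set of DPU names it reads from.
    dpus: maps a DPU name to a callable taking a list of upstream outputs.
    notify: callable receiving a one-line status message.
    """
    outputs = {}
    try:
        # static_order() yields every node only after all of its predecessors.
        for name in TopologicalSorter(dependencies).static_order():
            upstream = [outputs[d] for d in sorted(dependencies.get(name, ()))]
            outputs[name] = dpus[name](upstream)
    except Exception as exc:
        # Covers DPU failures as well as cycles in the dependency graph.
        notify(f"pipeline failed: {exc}")
        raise
    notify("pipeline finished")
    return outputs


# Hypothetical three-step pipeline: extract -> transform -> load.
dependencies = {"extract": set(), "transform": {"extract"}, "load": {"transform"}}
dpus = {
    "extract": lambda upstream: ["a", "b"],
    "transform": lambda upstream: [x.upper() for x in upstream[0]],
    "load": lambda upstream: len(upstream[0]),
}
messages = []
results = run_pipeline(dependencies, dpus, messages.append)
```

Passing `notify` as a callback keeps the ordering logic independent of how notifications are delivered; an email sender could be plugged in where the example uses `messages.append`.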
Demo
• Part A – instance http://odcs.xrg.cz:8080/unifiedviews
– Introduction to the Web user interface (2 mins)
– Simple pipeline and basic operations with the pipeline (5 mins)
– DPU templates, how they can be managed (1-2 mins)
• Part B – instance http://odcs.xrg.cz:8080/odcleanstore
– More complex pipelines (1-5 mins)
Related Work
• Non-RDF ETL Frameworks
– Plenty of ETL frameworks, some of them open source
– No support for the RDF data format and ontologies in the framework itself
    • E.g., DPUs are not prepared to suggest ontological terms in DPU configurations
– No native support for exchanging RDF data between DPUs
– No RDF data processing units available out of the box
• Linked Data Integration Framework (LDIF)
• DERI Pipes
– When adding new DPUs, Core must be rebuilt
– It is not possible to reload DPUs on the fly
– Does not provide a solution for library version clashes
– No possibility to debug inputs/outputs of DPUs
Impact
• Integrated into the LOD2 stack
– Replacing the existing LOD Manager integration
• Used in LOD2
– WP9a, to process public contracts data
– WP7, to enrich documents with links to DBpedia and WKD Thesauri
• Used by other projects
– OpenData.cz initiative
– INITLIB
– COMSODE FP7 project (2013-2015)
– OpenFridge project
• Used for commercial purposes by the companies Semantica.cz (Czech Republic) and Semantic Web Company (Austria) to help their customers prepare and process RDF data
How to try UnifiedViews?
• UnifiedViews is available under an open source license
– GPLv3 + LGPLv3
• Hosted on GitHub
– Repository: https://github.com/UnifiedViews/Core
• Current latest version: UnifiedViews 1.0 Candidate
– Branch in the repository
• User Documentation:
– https://grips.semantic-web.at/display/UDDOC/UnifiedViews+User+Documentation
How to develop new DPUs?
• Guide for Plugin (DPU) developers:
– https://grips.semantic-web.at/display/UDDOC/Creation+of+Plugins
• In short, every DPU typically consists of 4 main files
– Core DPU file
    • Implement execute() method
    • Define inputs, outputs
– pom.xml file
– DPU dialog
– DPU config object
How to contribute?
• Guidelines for contributors:
– https://grips.semantic-web.at/display/UDDOC/Guidelines+for+Contributors
Conclusions
• We presented UnifiedViews, an ETL framework with native support for processing RDF data, which addresses the problem of sustainable RDF data processing
– Users may define, execute, monitor, debug, schedule, and share data processing tasks (pipelines)
– Users may create their own plugins (data processing units)
• UnifiedViews has an active community and is already used in many projects
– It is maintained by Semantic Web Company and Semantica.cz
Credits
Jingle: R.E.M., Martin Kaltenböck, Florian Kondert
Coordination: Thomas Thurner, Martin Kaltenböck
Moderation: Martin Kaltenböck
Presented by: Tomas Knap, Helmut Nagy
Hope you enjoyed staying with us – if you need more detailed information, visit us at www.lod2.eu and let us know how we can improve to meet your expectations!
Don’t forget to register for our next webinar
20.12.2011 - Virtuoso (OpenLink Software)
24.01.2012 - OntoWiki (University of Leipzig, Germany)
Have a great day and don’t forget ...
Creating Knowledge out of Interlinked Data