Page 1: BDE SC6-hang out - technology part-SWC - Martin

BIG DATA EUROPE

PILOT SC6: CITIZEN BUDGET ON MUNICIPAL LEVEL

EUROPE IN A CHANGING WORLD - INCLUSIVE, INNOVATIVE AND REFLECTIVE SOCIETIES

HANGOUT, 28 SEPTEMBER 2016

MARTIN KALTENBOECK (CFO, SEMANTIC WEB COMPANY)

Integrating Big Data, Software & Communities for Addressing Europe’s Societal Challenges

BDE SC6 Hangout

Page 2: BDE SC6-hang out - technology part-SWC - Martin

Big Data Europe (CSA: 2015-17)

Show societal value of Big Data: 7 Domains

Lower barrier for using big data technologies
o Required effort and resources
o Limited data science skills

Help establish cross-lingual/organizational/domain Data Value Chains

3 May 2023

Page 3: BDE SC6-hang out - technology part-SWC - Martin

Big Data Europe

CSA Measures

COORDINATION: Stakeholder Engagement (Requirements Elicitation)

SUPPORT: Design, Realise, Evaluate Big Data Aggregator Platform

Results

Create and Manage Societal Big Data Interest Groups

Cloud-deployment ready Big Data Aggregator Platform

Page 4: BDE SC6-hang out - technology part-SWC - Martin

THE BDE PLATFORM ARCHITECTURE & COMPONENTS


Page 5: BDE SC6-hang out - technology part-SWC - Martin

The three Big Data "Vs": Variety is often neglected

Page 6: BDE SC6-hang out - technology part-SWC - Martin

Current State of Platform Architecture

Page 7: BDE SC6-hang out - technology part-SWC - Martin

Adding a Semantic Layer to Data Lakes

Manufacturing, Marketing, Sales, Support, Accounting

Semantic Data Lake
• central place for model, schema and data historization
• combination of scale-out (cost reduction) and semantics (increased control & flexibility)
• grows incrementally (pay-as-you-go)

Inbound Data Sources

Outbound and Consumption

Inbound Raw Data Store

Data Lake (order of magnitude cheaper scalable data store)

Knowledge Graph for Relationship Definition and Meta Data

Frontend to Access Relationship and KPI Definition / Documentation

Frontend to Access (ad hoc) Reports

Outbound Data Delivery to Target Systems

JSON-LD, CSVW, R2RML, XML2RDF
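
The formats named on this slide (JSON-LD, CSVW, R2RML, XML2RDF) all serve to lift tabular or tree-shaped source data into the knowledge graph. A minimal sketch of that idea for CSV rows, using only the Python standard library. The vocabulary URL, the column names and the sample figures are illustrative assumptions and are not taken from the BDE platform:

```python
import csv
import io
import json

# Hypothetical vocabulary and sample data, for illustration only.
CONTEXT = {
    "@vocab": "http://example.org/datalake#",
    "amount": {"@type": "http://www.w3.org/2001/XMLSchema#decimal"},
}

SAMPLE_CSV = "id,department,amount\nrow-1,Sales,1200.50\n"


def row_to_jsonld(row, base="http://example.org/record/"):
    """Turn one CSV row (a dict of strings) into a JSON-LD node object."""
    return {
        "@context": CONTEXT,
        "@id": base + row["id"],
        "@type": "Record",
        "department": row["department"],
        "amount": row["amount"],
    }


def csv_to_jsonld(text):
    """Convert every row of a CSV document into a JSON-LD node."""
    return [row_to_jsonld(row) for row in csv.DictReader(io.StringIO(text))]


if __name__ == "__main__":
    print(json.dumps(csv_to_jsonld(SAMPLE_CSV), indent=2))
```

In practice a declarative mapping (CSVW metadata or an R2RML document) would replace the hand-written `row_to_jsonld` function; the sketch only shows what such a mapping produces.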

Page 8: BDE SC6-hang out - technology part-SWC - Martin

Why use BDE Technology?

| Feature | Hortonworks | Cloudera | MapR | Bigtop | BDE |
|---|---|---|---|---|---|
| File system | HDFS | HDFS | NFS | HDFS | HDFS |
| Installation | Native | Native | Native | Native | Lightweight virtualization |
| Plug & play components (no rigid schema) | no | no | no | no | yes |
| High availability | Single failure recovery (YARN) | Single failure recovery (YARN) | Self healing, multiple failure recovery | Single failure recovery (YARN) | Multiple failure recovery |
| Cost | Commercial | Commercial | Commercial | Free | Free |
| Scaling | Freemium | Freemium | Freemium | Free | Free |
| Addition of custom components | Not easy | No | No | No | Yes |
| Integration testing | yes | yes | yes | yes | -- |
| Operating systems | Linux | Linux | Linux | Linux | All |
| Management tool | Ambari | Cloudera Manager | MapR Control System | - | Docker Swarm UI + custom |

Page 9: BDE SC6-hang out - technology part-SWC - Martin

SC6 PILOT: CITIZENS BUDGET ON MUNICIPAL LEVEL

ARCHITECTURE & COMPONENTS


Page 10: BDE SC6-hang out - technology part-SWC - Martin

SC6: Social Sciences

www.big-data-europe.eu

Pilot Architecture & Components

Page 11: BDE SC6-hang out - technology part-SWC - Martin

SC6: Social Sciences


Pilot focus area: Citizens budget spending on municipal level

Big Data focus area: Statistical and research data linking & integration

Selected key data assets: Detailed budget execution data at city level, statistical data from public data portals and statistical offices, federated social sciences data catalogs

Page 12: BDE SC6-hang out - technology part-SWC - Martin

4 Vs of Big Data in SC6 Pilot

Variety: requirement based on the harvesting of budget data and budget execution data from several sources, available in different structures and formats.

Volume: requirement regarding the growing amount of open budget data and budget execution data available.

Velocity: requirement regarding budget execution data that is provided on a continuous basis by the publisher (daily, weekly, monthly).

Veracity: refers to the biases, noise and abnormality in data. Even within the same country, published data differ, because they often come from different systems or because public accounting standards are not enforced uniformly (e.g. across different municipal departments).
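
The variety and veracity requirements above can be illustrated with a small normalisation step. This is a minimal sketch under stated assumptions: the two source layouts, their field names (`city`, `Gemeinde`, and so on) and the plausibility rules are hypothetical and do not describe the pilot's actual data sources.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class BudgetRecord:
    """Common target structure for budget execution records."""
    municipality: str
    booked_on: date
    amount_eur: float


def from_source_a(row):
    """Hypothetical source A: ISO dates, amounts with a decimal point."""
    return BudgetRecord(
        municipality=row["city"].strip(),
        booked_on=date.fromisoformat(row["date"]),
        amount_eur=float(row["amount"]),
    )


def from_source_b(row):
    """Hypothetical source B: DD.MM.YYYY dates, amounts with a decimal comma."""
    return BudgetRecord(
        municipality=row["Gemeinde"].strip(),
        booked_on=datetime.strptime(row["Datum"], "%d.%m.%Y").date(),
        # "1.500,00" -> "1500,00" -> "1500.00"
        amount_eur=float(row["Betrag"].replace(".", "").replace(",", ".")),
    )


def veracity_issues(record):
    """Return a list of simple plausibility problems found in one record."""
    issues = []
    if not record.municipality:
        issues.append("missing municipality")
    if record.amount_eur < 0:
        issues.append("negative amount")
    if record.booked_on > date.today():
        issues.append("booking date in the future")
    return issues
```

Each additional source only needs its own small adapter function; the checks in `veracity_issues` then run on the common structure regardless of where a record came from.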


Page 13: BDE SC6-hang out - technology part-SWC - Martin

SC6 Pilot - Architecture


Page 14: BDE SC6-hang out - technology part-SWC - Martin

SC6 Pilot: Technical Components

• Apache Flume, https://flume.apache.org/ (data ingestion)
• Apache Kafka, http://kafka.apache.org (messaging service)
• Apache Spark, http://spark.apache.org (distributed analysis, transformation)
• Apache HDFS, http://hadoop.apache.org (raw data storage)
• SWC’s PoolParty Semantic Suite, http://poolparty.biz (data consolidation, curation, mapping)
• OpenLink’s Virtuoso, http://virtuoso.openlinksw.com (triple store – data storage)
• Apache HTTP Server, http://httpd.apache.org (linked data serving)
• Apache Avro, http://avro.apache.org/docs/current/ (intermediate data schema)
• D3.js library, https://d3js.org/ (visualisation of RDF data using SPARQL queries)
• SWC’s PoolParty GraphSearch (SPARQL-based interface component for filter & faceted search)
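
The visualisation and search components in this list read from the triple store through SPARQL queries. A minimal sketch of how such a query could be built and turned into a SPARQL Protocol request URL, using only the Python standard library. The endpoint address, the `b:` vocabulary and the property names are assumptions made for illustration; the slides do not specify the pilot's data model.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and vocabulary; neither is given in the slides.
ENDPOINT = "http://localhost:8890/sparql"
VOCAB = "http://example.org/budget#"


def spending_by_municipality_query(year):
    """Build a SPARQL query that sums budget amounts per municipality."""
    year = int(year)  # raises ValueError for anything that is not an integer
    return f"""
PREFIX b: <{VOCAB}>
SELECT ?municipality (SUM(?amount) AS ?total)
WHERE {{
  ?line a b:BudgetLine ;
        b:municipality ?municipality ;
        b:year {year} ;
        b:amount ?amount .
}}
GROUP BY ?municipality
ORDER BY DESC(?total)
""".strip()


def request_url(query, endpoint=ENDPOINT):
    """Encode the query as a SPARQL Protocol GET request URL.

    The result format would be negotiated with an Accept header,
    e.g. application/sparql-results+json.
    """
    return endpoint + "?" + urlencode({"query": query})
```

A front end such as a D3.js chart would fetch this URL and bind the returned rows (`municipality`, `total`) to its visual elements.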


Page 15: BDE SC6-hang out - technology part-SWC - Martin

SC6 Pilot: 1st MockUp / WireFrame


Page 16: BDE SC6-hang out - technology part-SWC - Martin

SC6 Pilot: Pilot Evaluation

Evaluation approach SC6 Pilot:
• Invite municipalities to evaluate and use the system
• Invite community (open data, data community, BDE community, W3C)
• Evaluate within the 2 participating projects (BDE, YourDataStories)
• BDE SC6 workshop in Cologne, 5.12.2016 + overall BDE Tech WS (ApacheCon)

Additional evaluation: tests over time with
• a growing amount of data
• a growing number of different sources & formats docked onto the system
• additional analytics in place


Page 17: BDE SC6-hang out - technology part-SWC - Martin

How to benefit best from BDE


| Domain | Date | Location | Format |
|---|---|---|---|
| Health | 19 October | Brussels | Standalone Workshop |
| Food & Agri | 30 September 2016 | Brussels | Collocated with DG AGRI WP2018-20 stakeholder consultation |
| Energy | 20 September 2016 | Brussels | Collocated with H2020 Energy InfoDay (19th) |
| Transport | 16 September 2016 | Brussels | Collocated with TM 2.0 Steering Body meeting |
| Climate | February 2017 | Brussels | Collocated with EC JRC ISPRA Workshop |
| Societies | 5 December 2016 | Cologne | Collocated with EDDI16 - 8th Annual European DDI User Conference: http://bde-sc6-2016.eventbrite.com (40 seats) |
| Security | 18 October 2016 | Brussels | Standalone Workshop |

• BDE Workshops & Webinars
• Use & expand the BDE Platform
• Visit Website: news, events, community, …
• Big Data Europe W3C Community Group
• 7+1x Mailing Lists

Page 18: BDE SC6-hang out - technology part-SWC - Martin

Contacts:

CESSDA, http://cessda.net/
Ivana Ilijasic Versic, [email protected]
Abroshan, [email protected]

NCSR-D, http://www.demokritos.gr/?lang=en
Michalis Vafopoulos, [email protected]

Semantic Web Company (SWC), http://www.semantic-web.at
Martin Kaltenböck, [email protected]
Jürgen Jakobitsch, [email protected]


Page 19: BDE SC6-hang out - technology part-SWC - Martin

Questions & Contacts

www.big-data-europe.eu

#BigDataEurope

Martin Kaltenböck
CFO, Semantic Web Company
[email protected]

http://www.linkedin.com/in/martinkaltenboeck
https://twitter.com/kalte2707
http://de.slideshare.net/MartinKaltenboeck
http://blog.semantic-web.at