© Prof. Dr.-Ing. Wolfgang Lehner | In the Age of Open Information: Do-It-Yourself Analytical Mashups on Schema-optional Data | Katrin Braunschweig, Julian Eberius, Maik Thiele, Wolfgang Lehner | OUTPUT 2011

In the Age of Open Information - Do-It-Yourself Analytical Mashups on Schema-optional Data


DESCRIPTION

The increasing amount and variety of open and crowdsourced data available on the web leads to new challenges in end-user-focused data analysis. This data is characterized by great structural diversity, which causes serious problems for its integration. On the other hand, there is a lack of end-user-friendly tools for making productive use of the data available on the web. We want to address the first problem by developing a schema-optional, graph-based data model that enables incremental schema augmentation and evolution. The second problem should be addressed by a multi-layered domain-specific language for constructing data mashups on schema-optional data.



> The Roots of Open Data

The open society is a concept originally developed by philosopher Karl Popper

In open societies, government is responsive and tolerant, and political mechanisms are transparent and flexible

The state keeps no secrets from itself in the public sense

It is a non-authoritarian society in which all are trusted with the knowledge of all


1984 - Freedom of Information Campaign starts up


> Why Data Should be Open

Many scientific data can be deemed to belong to the commons (“the human race”), e.g. the human genome, medical science, environmental data

They have an infrastructural role essential for scientific endeavour (e.g. in Geographic Information Systems and maps)

Data published in scientific articles are factual and therefore not copyrightable

Public money was used to fund the work and so it should be universally available

It was created by or at a government institution


> Open Data – Examples


> data.gov


> data.gov.uk


> data.worldbank.org


> unData


> OpenStreetMap


> Civic Applications based on Open Data


> Explore How U.S. Budget Proposal


> Mapnificent


> Schooloscope


> Fluglärmkarte (aircraft noise map, taz.de)

Database Journalism


> Open Data – Challenges an

Challenges

Lots of contributors / maintainers
Small information pieces: distributed, decentralised and very loosely coupled
Different degree of schema information and meta data
Innovation / unexpected reuse
No standardized development process

Contributions

Schema-optional data store, collaborative schema augmentation (basic operators)
Measure degree of schema information
Non-destructive schema changes
Capture data provenance
Visualizations and interaction patterns
Iterative and guided development
Data and visualization recommendation

> The Big Picture


> Do-It-Yourself Schema Augmentation

Automated Schema Extraction

Schema Augmentation

[Figure: graph data model before and after schema augmentation. Labels: ReferenceNode, EntityTypes (ET1 to ET4), AttributeTypes (AT1 to AT4), NoType, attribute values (AT1 : value, AT2 : value, AT3 : value, AT4 : value). Sources: CSV File, Relational Table, Application]
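The idea of storing data first and adding types later can be illustrated with a small sketch. It is not the authors' data model: the class and method names (`Entity`, `SchemaOptionalStore`, `augment_entity_type`, `schema_degree`) are hypothetical, only the `NoType` label comes from the slide, and the "degree of schema information" is assumed here to be the share of entities and attributes that carry a type.

```python
from dataclasses import dataclass, field

NO_TYPE = "NoType"  # label taken from the slide; all other names are hypothetical


@dataclass
class Entity:
    """A data node whose entity type is optional and can be assigned later."""
    values: dict                                          # attribute name -> value
    entity_type: str = NO_TYPE
    attribute_types: dict = field(default_factory=dict)   # attribute name -> type


class SchemaOptionalStore:
    def __init__(self):
        self.entities = []

    def load_rows(self, rows):
        """Import rows (e.g. from a CSV file) without requiring any schema."""
        for row in rows:
            self.entities.append(Entity(values=dict(row)))

    def augment_entity_type(self, entity_type, predicate):
        """Assign an entity type to every untyped entity matching the predicate."""
        for e in self.entities:
            if e.entity_type == NO_TYPE and predicate(e):
                e.entity_type = entity_type

    def augment_attribute_type(self, attribute, attribute_type):
        """Assign an attribute type wherever the attribute occurs; values stay untouched."""
        for e in self.entities:
            if attribute in e.values:
                e.attribute_types[attribute] = attribute_type

    def schema_degree(self):
        """Assumed measure: fraction of entities and attributes that carry a type."""
        total = typed = 0
        for e in self.entities:
            total += 1 + len(e.values)
            typed += (e.entity_type != NO_TYPE) + len(e.attribute_types)
        return typed / total if total else 0.0


store = SchemaOptionalStore()
store.load_rows([{"name": "Cafe A", "district": "Neustadt"}, {"name": "Cafe B"}])
store.augment_entity_type("Cafe", lambda e: "name" in e.values)
store.augment_attribute_type("district", "District")
```

Because augmentation only adds type labels and never rewrites `values`, each step is non-destructive and can be applied incrementally by different contributors.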

> Do-It-Yourself Analytical Mashups

"number of cafes vs. age distribution per district of Dresden"

Look up fitting data sets
Process Query
Compute suitable visualization
Compute interaction / exploration features

> Do-It-Yourself Analytical Mashups (2)

"number of cafes vs. age distribution per district of Dresden"

Look up fitting data sets
Process Query
Compute suitable visualization
Compute interaction / exploration features

Parts annotated in the query: natural geographic entity, value dimensions, relations/operations

NLP techniques + Lookup services (e.g. GeoNames)
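As a rough illustration of the query-processing step, the example query can be split into value dimensions, operations and a geographic entity. This sketch uses plain regular expressions instead of the NLP techniques and lookup services named on the slide; the function `parse_query` and the fixed operator words `vs.` and `per` are assumptions made for this example only.

```python
import re

# Hypothetical operator vocabulary; a real system would rely on NLP techniques
# and lookup services such as GeoNames rather than fixed keywords.
COMPARE = r"\s+vs\.?\s+"
GROUP_BY = r"\s+per\s+"


def parse_query(query):
    """Split '<dimension> vs. <dimension> per <geographic entity>' into its parts."""
    query = query.strip()
    parts = re.split(GROUP_BY, query, maxsplit=1)
    geo = parts[1].strip() if len(parts) == 2 else None
    dimensions = [d.strip() for d in re.split(COMPARE, parts[0])]
    operations = []
    if len(dimensions) > 1:
        operations.append("vs.")
    if geo is not None:
        operations.append("per")
    return {
        "value_dimensions": dimensions,
        "operations": operations,
        "geographic_entity": geo,
    }


parsed = parse_query("number of cafes vs. age distribution per district of Dresden")
```

In a fuller implementation the geographic entity would then be resolved against a gazetteer so that "district of Dresden" maps to concrete district boundaries.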

> Do-It-Yourself Analytical Mashups (3)

"number of cafes vs. age distribution per district of Dresden"

Look up fitting data sets
Process Query
Compute suitable visualization
Compute interaction / exploration features

Identified Dimension | Candidate Datasets
number of cafes | OpenStreetMap OR recommendation service, e.g., Yelp
age distribution | Municipal Statistics Agency Dresden
district of Dresden | OpenStreetMap

Ambiguity: user feedback
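The lookup step, including the fallback to user feedback when several data sets fit, might be sketched as follows. The `CATALOG` dictionary simply mirrors the table above, and `select_datasets` together with its `ask_user` callback are hypothetical names, not part of the presented system.

```python
# Hypothetical catalog mirroring the table above: dimension -> candidate data sets.
CATALOG = {
    "number of cafes": ["OpenStreetMap", "Yelp"],
    "age distribution": ["Municipal Statistics Agency Dresden"],
    "district of Dresden": ["OpenStreetMap"],
}


def select_datasets(dimensions, ask_user, catalog=CATALOG):
    """Pick one data set per dimension; ask the user when several candidates fit."""
    selection = {}
    for dim in dimensions:
        candidates = catalog.get(dim, [])
        if not candidates:
            raise LookupError(f"no data set found for {dim!r}")
        if len(candidates) == 1:
            selection[dim] = candidates[0]
        else:
            # Ambiguity: delegate the decision to the user.
            choice = ask_user(dim, candidates)
            if choice not in candidates:
                raise ValueError(f"{choice!r} is not a candidate for {dim!r}")
            selection[dim] = choice
    return selection


# Stand-in for real user feedback: always take the first candidate.
chosen = select_datasets(list(CATALOG), ask_user=lambda dim, cands: cands[0])
```

Passing the feedback mechanism in as a callback keeps the lookup logic independent of how the question is actually presented to the user.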

> Do-It-Yourself Analytical Mashups (4)

"number of cafes vs. age distribution per district of Dresden"

Look up fitting data sets
Process Query
Compute suitable visualization
Compute interaction / exploration features

Identified Dimension | Properties | Visualization Candidates
number of cafes | Number for each district | Bars per district OR color of each district
age distribution | Distribution for each district | Multiple histograms
district of Dresden | Polygon for each district | Map
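The mapping from data properties to visualization candidates in the table above can be written down as a simple rule lookup. This is only a sketch: the property keywords (`number`, `distribution`, `polygon`) and the function `visualization_candidates` are assumptions chosen to mirror the table, not the recommendation logic of the actual system.

```python
# Hypothetical rules mirroring the table above: property kind -> visualizations.
VISUALIZATION_RULES = {
    "number": ["bars per district", "color of each district"],
    "distribution": ["multiple histograms"],
    "polygon": ["map"],
}


def visualization_candidates(properties, rules=VISUALIZATION_RULES):
    """Return the candidate visualizations for each dimension's property kind."""
    return {dim: list(rules.get(kind, [])) for dim, kind in properties.items()}


candidates = visualization_candidates({
    "number of cafes": "number",
    "age distribution": "distribution",
    "district of Dresden": "polygon",
})
```

Dimensions with more than one candidate, such as the number of cafes, are the points where a recommendation or a user choice is still needed.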

> Do-It-Yourself Analytical Mashups (5)

"number of cafes vs. age distribution per district of Dresden"

Look up fitting data sets
Process Query
Compute suitable visualization
Compute interaction / exploration features

[Figure panels: number of cafes, age distribution]

Too much information for one visualization: enable exploration, e.g., clicking a district in the map opens a histogram

> Demo

> Open Civic Platform for Dresden

[Figure: architecture diagram. Components: Mobile application, Map-centric web application, 3rd-party applications, REST Interface, Persistence Layer]

Mobile Application

Add new requests by guiding the user through a wizard-style input form
Show (own) reports with their current rating and processing state
Visualize all reports on a map
Subscribe to a set of urban districts and notify the user about news

Web Application

Filter the requests by their category and their creation time (last 24 hours, last week, last month, all)
Change the state of a request (open, closed) for authorized users
Zoom in/out and adapt the type of visualization if the issue density gets very sparse
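The request filter of the web application could look roughly like the sketch below. The `Request` record, the `filter_requests` function and the choice of 30 days for "last month" are assumptions for illustration; the slides do not describe the platform's actual data model or REST interface.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Time windows named on the slide; "last month" is assumed to mean 30 days.
TIME_WINDOWS = {
    "last 24 hours": timedelta(hours=24),
    "last week": timedelta(weeks=1),
    "last month": timedelta(days=30),
    "all": None,
}


@dataclass
class Request:
    """Hypothetical citizen request as shown on the map."""
    category: str
    created: datetime
    state: str = "open"


def filter_requests(requests, category=None, window="all", now=None):
    """Keep requests matching the category and created within the time window."""
    now = now or datetime.now()
    span = TIME_WINDOWS[window]
    return [
        r for r in requests
        if (category is None or r.category == category)
        and (span is None or now - r.created <= span)
    ]


now = datetime(2011, 6, 1, 12, 0)
reports = [
    Request("pothole", now - timedelta(hours=3)),
    Request("pothole", now - timedelta(days=10)),
    Request("graffiti", now - timedelta(days=2)),
]
recent_potholes = filter_requests(reports, category="pothole", window="last week", now=now)
```

Passing `now` explicitly keeps the filter deterministic, which makes it easy to test independently of the wall clock.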


> Open Civic Platform for Dresden (2)


> Open Civic Platform for Dresden (3)


> New York – Example


> New York – Example (2)
