The increasing amount and variety of open and crowdsourced data available on the web leads to new challenges in end-user-focused data analysis. This data is characterized by great structural diversity, which causes serious problems for its integration. On the other hand, there is a lack of end-user-friendly tools for making productive use of the data available on the web. We want to address the first problem by developing a schema-optional, graph-based data model that enables incremental schema augmentation and evolution. The second problem should be addressed by a multi-layered domain-specific language for data mashup construction on schema-optional data.
© Prof. Dr.-Ing. Wolfgang Lehner
In the Age of Open Information
Do-It-Yourself Analytical Mashups on Schema-optional Data
Katrin Braunschweig, Julian Eberius, Maik Thiele, Wolfgang Lehner
OUTPUT 2011
> The Roots of Open Data
The open society is a concept originally developed by philosopher Karl Popper
In open societies, government is responsive and tolerant, and political mechanisms are transparent and flexible
The state keeps no secrets from itself in the public sense
It is a non-authoritarian society in which all are trusted with the knowledge of all
1984 - Freedom of Information Campaign starts up
> Why Data Should be Open
Many scientific data can be deemed to belong to the commons (“the human race”), e.g. the human genome, medical science, environmental data
They have an infrastructural role essential for scientific endeavour (e.g. in Geographic Information Systems and maps)
Data published in scientific articles are factual and therefore not copyrightable
Public money was used to fund the work and so it should be universally available
It was created by or at a government institution
> Open Data – Examples
> data.gov
> data.gov.uk
> data.worldbank.org
> unData
> OpenStreetMap
> Civic Applications based on Open Data
> Explore How U.S. Budget Proposal
> Mapnificent
> Schooloscope
> Fluglärmkarte (aircraft noise map, taz.de)
Database Journalism
> Open Data – Challenges and Contributions
Challenges:
Lots of contributors / maintainers
Small information pieces: distributed, decentralised and very loosely coupled
Different degrees of schema information and metadata
Innovation / unexpected reuse
No standardized development process

Contributions:
Schema-optional data store, collaborative schema augmentation (basic operators)
Measure degree of schema information
Non-destructive schema changes
Capture data provenance
Visualizations and interaction patterns
Iterative and guided development
Data and visualization recommendation
> The Big Picture
> Do-It-Yourself Schema Augmentation
Automated Schema Extraction
Schema Augmentation
[Figure labels: ReferenceNode; EntityTypes (ET1 to ET4); AttributeTypes (AT1 to AT4); NoType; attribute values (AT1 : value to AT4 : value); CSV File; Relational Table; Application]
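The idea of loading data without a schema and attaching types later can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and method names (`SchemaOptionalStore`, `load_rows`, `augment_entities`, `schema_coverage`) are hypothetical and not taken from the talk; only the `NoType` label comes from the slide.

```python
from dataclasses import dataclass, field

NO_TYPE = "NoType"  # label from the slide; all other names are illustrative


@dataclass
class Entity:
    """A node holding attribute values; its entity type is optional."""
    values: dict
    entity_type: str = NO_TYPE


@dataclass
class SchemaOptionalStore:
    entities: list = field(default_factory=list)
    # attribute name -> attribute type; attributes not listed here are untyped
    attribute_types: dict = field(default_factory=dict)

    def load_rows(self, header, rows):
        """Extraction: every CSV/table row becomes an untyped entity."""
        for row in rows:
            self.entities.append(Entity(dict(zip(header, row))))

    def augment_attribute(self, name, attribute_type):
        """Augmentation: attach a type without touching stored values."""
        self.attribute_types[name] = attribute_type

    def augment_entities(self, entity_type, required):
        """Label each untyped entity that carries all required attributes."""
        changed = 0
        for entity in self.entities:
            if entity.entity_type == NO_TYPE and set(required).issubset(entity.values):
                entity.entity_type = entity_type
                changed += 1
        return changed

    def schema_coverage(self):
        """Share of typed entities: one crude measure of schema information."""
        if not self.entities:
            return 0.0
        typed = sum(e.entity_type != NO_TYPE for e in self.entities)
        return typed / len(self.entities)
```

Because types are stored next to the values rather than enforced on load, adding or revising a type never rewrites the data, which is one way to read "non-destructive schema changes".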
> Do-It-Yourself Analytical Mashups
"number of cafes vs. age distribution per district of Dresden"
Process Query
Look up fitting data sets
Compute suitable visualization
Compute interaction / exploration features
> Do-It-Yourself Analytical Mashups (2)
"number of cafes vs. age distribution per district of Dresden"
Process Query
Look up fitting data sets
Compute suitable visualization
Compute interaction / exploration features
Query parts identified: natural geographic entity, value dimensions, relations/operations
NLP techniques + lookup services (e.g. GeoNames)
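The query-processing step can be illustrated with a deliberately naive sketch. It replaces real NLP with a regular-expression split and replaces a lookup service such as GeoNames with a hard-coded set, so `GAZETTEER`, `RELATION_WORDS`, and `annotate_query` are all hypothetical names. Treating "vs." and "per" as the relations/operations is an assumption about how the slide labels map onto the query.

```python
import re

# Hard-coded stand-in for a geographic lookup service (assumption).
GAZETTEER = {"dresden"}
# Words treated as relations/operations in this sketch (assumption).
RELATION_WORDS = ("vs.", "per")


def annotate_query(query):
    """Split a query into value dimensions, relations and geographic entities."""
    # The capturing group keeps the relation words in the split result.
    parts = re.split(r"\s+(vs\.|per)\s+", query.strip())
    relations = [p for p in parts if p in RELATION_WORDS]
    phrases = [p for p in parts if p not in RELATION_WORDS]

    # A phrase is geographic if any of its words is known to the gazetteer.
    geographic = [
        p for p in phrases
        if any(word.lower() in GAZETTEER for word in p.split())
    ]
    dimensions = [p for p in phrases if p not in geographic]
    return {
        "value_dimensions": dimensions,
        "relations": relations,
        "geographic_entities": geographic,
    }
```

For the example query, this yields the two value dimensions "number of cafes" and "age distribution", the relations "vs." and "per", and the geographic entity "district of Dresden".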
> Do-It-Yourself Analytical Mashups (3)
"number of cafes vs. age distribution per district of Dresden"
Process Query
Look up fitting data sets
Compute suitable visualization
Compute interaction / exploration features

Identified Dimension | Candidate Datasets
number of cafes | OpenStreetMap OR recommendation service, e.g., Yelp
age distribution | Municipal Statistics Agency Dresden
district of Dresden | OpenStreetMap

Ambiguity: user feedback
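A minimal sketch of the dataset lookup, including the ambiguity case, might look like this. The catalog simply mirrors the dimension-to-dataset pairs listed on the slide; the function name `lookup_datasets` and the `choose` callback (standing in for user feedback) are assumptions made for illustration.

```python
# Candidate datasets per identified dimension, as listed on the slide.
CATALOG = {
    "number of cafes": ["OpenStreetMap", "Yelp"],
    "age distribution": ["Municipal Statistics Agency Dresden"],
    "district of Dresden": ["OpenStreetMap"],
}


def lookup_datasets(dimensions, catalog=CATALOG, choose=None):
    """Return (resolved, ambiguous) dataset assignments.

    `choose(dimension, candidates)` models user feedback; without it,
    dimensions with several candidates are reported as ambiguous.
    """
    resolved, ambiguous = {}, {}
    for dimension in dimensions:
        candidates = catalog.get(dimension, [])
        if not candidates:
            raise LookupError(f"no dataset found for {dimension!r}")
        if len(candidates) == 1:
            resolved[dimension] = candidates[0]
        elif choose is not None:
            resolved[dimension] = choose(dimension, candidates)
        else:
            ambiguous[dimension] = list(candidates)
    return resolved, ambiguous
```

Called without a callback, the function reports "number of cafes" as ambiguous; passing for instance `choose=lambda dim, options: options[0]` resolves it to the first candidate.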
> Do-It-Yourself Analytical Mashups (4)
"number of cafes vs. age distribution per district of Dresden"
Process Query
Look up fitting data sets
Compute suitable visualization
Compute interaction / exploration features

Identified Dimension | Properties | Visualization Candidates
number of cafes | Number for each district | Bars per district OR color of each district
age distribution | Distribution for each district | Multiple histograms
district of Dresden | Polygon for each district | Map
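The mapping from data properties to visualization candidates can be written down as a small lookup. This is a sketch of the table on the slide, not the system's actual recommendation logic; the `Shape` enum and the function name `visualization_candidates` are hypothetical.

```python
from enum import Enum


class Shape(Enum):
    """Per-district data property, following the slide's 'Properties' column."""
    NUMBER = "number for each district"
    DISTRIBUTION = "distribution for each district"
    POLYGON = "polygon for each district"


# Visualization candidates per property, as listed on the slide.
VISUALIZATIONS = {
    Shape.NUMBER: ["bars per district", "color of each district"],
    Shape.DISTRIBUTION: ["multiple histograms"],
    Shape.POLYGON: ["map"],
}


def visualization_candidates(dimension_shapes):
    """Map each identified dimension to its candidate visualizations."""
    return {
        dimension: list(VISUALIZATIONS[shape])
        for dimension, shape in dimension_shapes.items()
    }
```

A dimension with more than one candidate, such as the cafe count, still needs a choice, for example by a recommendation step or by asking the user.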
> Do-It-Yourself Analytical Mashups (5)
"number of cafes vs. age distribution per district of Dresden"
Process Query
Look up fitting data sets
Compute suitable visualization
Compute interaction / exploration features
[Figure panels: number of cafes; age distribution]
Too much information for one visualization: enable exploration, e.g., clicking a district in the map opens a histogram
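The click-to-explore interaction can be modelled without any UI toolkit as a map object that computes a histogram on demand. The sketch below is an assumption about how such linking could work; `DistrictMap`, `age_histogram`, and the 10-year bin width are illustrative choices, and any ages passed in are toy values supplied by the caller.

```python
from collections import Counter


def age_histogram(ages, bin_width=10):
    """Count ages per bin; keys are the lower bin edges, in ascending order."""
    counts = Counter((age // bin_width) * bin_width for age in ages)
    return dict(sorted(counts.items()))


class DistrictMap:
    """Map view that opens one district's age histogram when it is clicked."""

    def __init__(self, ages_by_district):
        self._ages = ages_by_district
        self.open_histogram = None  # nothing is shown until a district is clicked

    def click(self, district):
        """Simulate a click: compute and remember the district's histogram."""
        self.open_histogram = (district, age_histogram(self._ages[district]))
        return self.open_histogram
```

Only the clicked district's distribution is computed and displayed, so the map stays readable while the detail remains one interaction away.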
> Demo
> Open Civic Platform for Dresden
[Architecture figure labels: Mobile application; Map-centric web application; 3rd-party applications; REST Interface; Persistence Layer]

Mobile Application:
Add new requests by guiding the user through a wizard-style input form
Show (own) reports with their current rating and processing state
Visualize all reports on a map
Subscribe to a set of urban districts and notify the user about news

Web Application:
Filter the requests by their category and their creation time (last 24 hours, last week, last month, all)
Change the request state (open, closed) for authorized users
Zoom in/out and adapt the type of visualization if the issue density gets very sparse
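The web application's filter and state-change features can be sketched as plain functions over a list of requests. All names here (`Request`, `filter_requests`, `change_state`, `WINDOWS`) are hypothetical, the platform's real REST interface is not shown in the slides, and treating "last month" as 30 days is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Creation-time windows from the slide; None means no time restriction.
WINDOWS = {
    "last 24 hours": timedelta(hours=24),
    "last week": timedelta(weeks=1),
    "last month": timedelta(days=30),  # assumption: a month counts as 30 days
    "all": None,
}

STATES = ("open", "closed")


@dataclass
class Request:
    category: str
    created: datetime
    state: str = "open"


def filter_requests(requests, category=None, window="all", now=None):
    """Keep requests matching the category and created within the window."""
    now = now or datetime.now()
    span = WINDOWS[window]
    return [
        r for r in requests
        if (category is None or r.category == category)
        and (span is None or now - r.created <= span)
    ]


def change_state(request, new_state, user_is_authorized):
    """Change a request's state; only authorized users may do so."""
    if not user_is_authorized:
        raise PermissionError("only authorized users may change a request state")
    if new_state not in STATES:
        raise ValueError(f"unknown state: {new_state!r}")
    request.state = new_state
```

Passing `now` explicitly keeps the time filter deterministic, which is convenient when the same logic sits behind both the mobile and the web client.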
> Open Civic Platform for Dresden (2)
> Open Civic Platform for Dresden (3)
> New York – Example
> New York – Example (2)