Emerging Trends in Data Exchange and Data Hubbing Jacob Assa, UN Statistics Division. Regional Workshop on Data Dissemination and Communication. Manila, the Philippines June 20-22, 2012. United Nations Statistics Division 2012. Outline of the Presentation. Data Dissemination in Context - PowerPoint PPT Presentation
Emerging Trends in Data Exchange and Data Hubbing
Jacob Assa, UN Statistics Division
Regional Workshop on Data Dissemination and Communication
Manila, the Philippines, June 20-22, 2012
United Nations Statistics Division, 2012
Outline of the Presentation
1. Data Dissemination in Context
2. Dissemination History at UNSD
3. Dissemination versus Communication
4. Data Exchange and SDMX
5. Data Hubbing Nationally and Globally
Data Dissemination in Context
Virtual Value Chain (Svend and Hollensen, 2001):
  Define information problem -> Organize, select and compile information -> Synthesize information -> Distribute information -> Value
Dissemination: the last but not least step
  Often done as an afterthought
  Can be made more efficient and effective:
    From Data Publishing to Data Exchange
    From Data Silos to Data Hubbing
Dissemination History in UNSD
League of Nations
  1919-1948: print publications
United Nations
  1948-1995: print publications (yearbooks, manuals)
  1995-2000: CD-ROM, static web pages
  2000-2008: online databases, dynamic web queries (UN Comtrade, UN Common Database)
  2008: launch of UNdata, the UN System data portal
  2010: World Statistics Pocketbook app for iPhones and iPads
  2012: launch of CountryData, the UN national data portal
Dissemination versus Communication
One-way vs. two-way communication
Considerable evolution of statistical communication over recent years
Traditionally, statistical organizations focused on:
  Dissemination through printed publications
  One-way communication through a few media channels
    Newspapers
    Radio
    Television
Since the 1990s, acknowledged need to do more than just disseminate data:
  Employing communication professionals
  Widespread use of the Internet
  New methods of communication and dissemination
Dissemination versus Communication
New methods of communication:
  Web 2.0 technologies
    Blogs
    Wikis
    Social networks
  Interactive websites
    Allow users to upload data and create graphs
    Sharing and discussion with other users
Data Exchange - Unstructured
Paper questionnaires
Excel sheets
CSV files
Email
Semi-structured XML files
  However, XML in itself is simply a mark-up language and does not standardize data structure between exchanging parties
XML - Example
Philippines, GDP in constant 2000 US$ (World Bank)
1960 17,990,832,237
1961 19,001,301,599
1962 19,908,256,877
1963 21,313,876,851
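The point that XML alone does not standardize data structure can be illustrated with a minimal sketch. The two layouts below are hypothetical (every element and attribute name is invented for illustration, and neither is the markup from the original slide); both carry the first two GDP figures listed above, yet each needs its own parser.

```python
import xml.etree.ElementTree as ET

# Two hypothetical XML layouts for the same observations.
# All element and attribute names are assumptions made for this sketch.
FEED_A = """
<data country="Philippines" indicator="GDP (constant 2000 US$)">
  <obs year="1960" value="17990832237"/>
  <obs year="1961" value="19001301599"/>
</data>
"""

FEED_B = """
<records>
  <record><area>Philippines</area><period>1960</period><amount>17990832237</amount></record>
  <record><area>Philippines</area><period>1961</period><amount>19001301599</amount></record>
</records>
"""


def parse_feed_a(xml_text):
    """Read the attribute-based layout into {year: value}."""
    root = ET.fromstring(xml_text.strip())
    return {int(o.get("year")): int(o.get("value")) for o in root.findall("obs")}


def parse_feed_b(xml_text):
    """Read the element-based layout into {year: value}."""
    root = ET.fromstring(xml_text.strip())
    return {
        int(r.findtext("period")): int(r.findtext("amount"))
        for r in root.findall("record")
    }


series_a = parse_feed_a(FEED_A)
series_b = parse_feed_b(FEED_B)

# Same data, but a parser written for one layout finds nothing in the other.
print(series_a == series_b)
print(parse_feed_a(FEED_B))
```

Both documents are well-formed XML, but without an agreed structure every exchange partner has to write and maintain a separate parser per counterpart.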
Data Exchange - Structured
Statistical Data and Metadata Exchange (SDMX)
What is it?
  An initiative to foster standards for the electronic exchange of statistical information
  Goal: explore e-standards that could bring efficiency gains and avoid duplication
  Sponsored by BIS, ECB, EUROSTAT, IMF, OECD, UN, WB
What it is not
  Not a technology… but implemented using technology (XML, EDIFACT syntax and GESMES/TS message)
How does it work?
  Exchange partners agree on Data Structure Definitions
  Data and metadata are exported and imported accordingly
Benefits of SDMX
Protection of existing technology investments
  Many different types:
    Data warehouses
    OLAP cubes
    GESMES/TS
    Publication systems
SDMX standardizes formats and protocols at the point where data and metadata go between counterparties
SDMX Registry/Repository
[Diagram: SDMX Registry Interfaces]
  Structural Metadata repository (Submit, Query): describes data and metadata structures
  Provisioning Metadata repository (Submit, Query): describes data and metadata sources and reporting processes
  Registry of Data Sets/Metadata Sets (Register, Query): indexes data and metadata
  Subscription/Notification
Impact of the SDMX Registry
The SDMX Registry allows for one of the major efficiency gains possible with SDMX:
Shifting from “push”-based reporting to “pull”-based reporting
This can save lots of time and duplication of effort
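The shift from "push" to "pull" can be sketched in a few lines. This is a toy model under stated assumptions, not the SDMX registry API: the `Registry` class, the `fetch` helper, the dataflow name, the example.org URLs and the numbers are all hypothetical placeholders, and provider sites are simulated with an in-memory dictionary instead of real network calls.

```python
from dataclasses import dataclass, field


@dataclass
class Registry:
    """Toy stand-in for a registry: it indexes where data lives, not the data itself."""

    sources: dict = field(default_factory=dict)

    def register(self, dataflow, provider, url):
        # A provider announces once where its data for a dataflow can be found.
        self.sources.setdefault(dataflow, []).append((provider, url))

    def query(self, dataflow):
        # A collector asks which providers and URLs exist for a dataflow.
        return list(self.sources.get(dataflow, []))


# Hypothetical provider sites, simulated in memory (placeholder values).
PROVIDER_SITES = {
    "https://example.org/agency-a/debt": [("2011", 10.0)],
    "https://example.org/agency-b/debt": [("2011", 12.5)],
}


def fetch(url):
    """Stand-in for an HTTP request to a provider's site."""
    return PROVIDER_SITES[url]


def pull(registry, dataflow):
    """Pull-based collection: discover sources in the registry, then fetch each."""
    return {provider: fetch(url) for provider, url in registry.query(dataflow)}


registry = Registry()
registry.register("EXTERNAL_DEBT", "agency_a", "https://example.org/agency-a/debt")
registry.register("EXTERNAL_DEBT", "agency_b", "https://example.org/agency-b/debt")

pulled = pull(registry, "EXTERNAL_DEBT")
print(pulled)
```

In a push model each provider would have to send a file to every collector on every update; here providers register once and any number of collectors retrieve the data when they need it.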
What is a Data Structure Definition?
Specifies a set of concepts which describe and identify a set of data
Tells which concepts are the dimensions (identification and description) and which are attributes (just description)
Tells which code lists provide the possible values for the dimensions and attributes
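The three roles of a Data Structure Definition listed here can be sketched as a small data class. This is a simplified illustration, not the SDMX information model: the class, the concept names (REF_AREA, INDICATOR, TIME_PERIOD, UNIT, OBS_VALUE) and the codes are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataStructureDefinition:
    """Simplified DSD: which concepts identify, which only describe, and their codes."""

    dimensions: tuple          # concepts that identify and describe an observation
    attributes: tuple          # concepts that only describe it
    code_lists: dict           # concept -> set of allowed codes
    measure: str = "OBS_VALUE"  # the observed value itself (name is an assumption)

    def validate(self, observation):
        """Return a list of problems; an empty list means the observation conforms."""
        errors = []
        for dim in self.dimensions:
            if dim not in observation:
                errors.append(f"missing dimension: {dim}")
        for concept, value in observation.items():
            if concept == self.measure:
                continue
            if concept not in self.dimensions and concept not in self.attributes:
                errors.append(f"unknown concept: {concept}")
            elif concept in self.code_lists and value not in self.code_lists[concept]:
                errors.append(f"invalid code for {concept}: {value}")
        return errors

    def key(self, observation):
        """Build an identifying key from the dimension values, in DSD order."""
        return ".".join(str(observation[d]) for d in self.dimensions)


# Hypothetical structure agreed between two exchange partners.
dsd = DataStructureDefinition(
    dimensions=("REF_AREA", "INDICATOR", "TIME_PERIOD"),
    attributes=("UNIT",),
    code_lists={
        "REF_AREA": {"PH", "KH"},
        "INDICATOR": {"GDP_CONST_2000"},
        "UNIT": {"USD"},
    },
)

obs = {
    "REF_AREA": "PH",
    "INDICATOR": "GDP_CONST_2000",
    "TIME_PERIOD": "1960",
    "UNIT": "USD",
    "OBS_VALUE": 17990832237,
}

print(dsd.validate(obs))
print(dsd.key(obs))
print(dsd.validate(dict(obs, REF_AREA="XX")))
```

Because both parties validate against the same definition, a malformed record is rejected at the point of exchange rather than discovered later in processing.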
What is Data Hubbing?
In general, a hub is the central part of a wheel where the spokes come together. The term is familiar to frequent fliers who travel through airport "hubs" to make connecting flights from one point to another
In data communications, a hub is a place of convergence where data arrives from one or more directions and is forwarded out in one or more other directions
(Source: http://searchnetworking.techtarget.com)
Data Hubbing at the National Level
Cambodia – DFID Project Objectives
Improve coordination in the National Statistical System
Collate development data in one place/hub
Make access to national data easier
Reduce data request burden
Use of latest IT software and practices
Project Dissemination Model
[Diagram: data flow across three tiers: Line Ministries, National Statistical Office, United Nations]
  Components: Line Ministry Database, National Indicator Registry, National Repository DB, DevInfo, Mapping tool
  Flow labels: Upload, XLS, Scripts, Register files, Post notification, Publish, SDMX-ML, Download
Data Hubbing at the International Level (1)
The Joint External Debt Hub (JEDH)
Jointly developed by:
  Bank for International Settlements (BIS)
  International Monetary Fund (IMF)
  Organization for Economic Cooperation and Development (OECD)
  World Bank (WB)
JEDH Site before SDMX
[Diagram: BIS, IMF, OECD and World Bank each supply data in various formats to the JEDH website (3-month production cycle)]
JEDH with SDMX
[Diagram: JEDH data flow with SDMX]
  Sources: BIS, IMF, OECD and World Bank, each providing SDMX-ML (including a debtor database)
  Information about the data is registered in the SDMX Registry
  An SDMX "Agent" discovers data and URLs in the registry and retrieves data from the sites
  SDMX-ML is loaded into the JEDH DB
  Data provided in real time to the JEDH site
Data Hubbing at the International Level (2)
UNdata Portal
Before, a researcher interested in analyzing the effects of population, health and education on per capita income growth would need to visit:
  UNSD website for population figures
  WHO website for health indicators
  UNESCO website for education indicators
  UNSD/World Bank/IMF website for income data
Now all these indicators are available in one place through a single user interface
[Diagram: UNdata Portal architecture]
  Source databases: Comtrade, World Bank, IMF, ILO, FAO, Population, UNESCO
  Abstraction layer
  Data hub containing cached copies of source databases
  UNdata Portal and search engine, accessed over the Internet
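The architecture in this diagram (source databases, an abstraction layer, cached copies and a single search interface) can be sketched as follows. This is a minimal illustration under stated assumptions, not how UNdata is implemented: the `DataHub` class, the adapter functions, the source names and all values are hypothetical placeholders rather than real statistics.

```python
class DataHub:
    """Toy hub: adapters normalize each source, a cache holds the copies."""

    def __init__(self):
        self._adapters = {}  # source name -> function returning normalized rows
        self._cache = {}     # source name -> cached copy of those rows

    def add_source(self, name, fetch_rows):
        self._adapters[name] = fetch_rows

    def refresh(self):
        """Re-read every source and store a cached copy, tagged with its origin."""
        for name, fetch_rows in self._adapters.items():
            self._cache[name] = [dict(row, source=name) for row in fetch_rows()]

    def search(self, country, indicator=None):
        """Single query interface; answers come from the cache, not the sources."""
        return [
            row
            for rows in self._cache.values()
            for row in rows
            if row["country"] == country
            and (indicator is None or row["indicator"] == indicator)
        ]


def population_adapter():
    # Hypothetical source that stores tuples: (country, year, value).
    raw = [("Philippines", 2010, 1.0)]  # placeholder value, not a real figure
    return [
        {"country": c, "indicator": "population", "year": y, "value": v}
        for c, y, v in raw
    ]


def education_adapter():
    # Hypothetical source that stores dictionaries with different field names.
    raw = [{"area": "Philippines", "period": 2010, "series": "enrolment", "obs": 2.0}]
    return [
        {"country": r["area"], "indicator": r["series"], "year": r["period"], "value": r["obs"]}
        for r in raw
    ]


hub = DataHub()
hub.add_source("population_db", population_adapter)
hub.add_source("education_db", education_adapter)
hub.refresh()

results = hub.search("Philippines")
for row in results:
    print(row)
```

The adapters play the role of the abstraction layer: each source keeps its own layout, and only the adapter knows how to map it to the hub's common row format.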
http://data.un.org/
Data Hubbing at the International Level (3)
European Central Bank (ECB)
Push vs. pull plus a hybrid approach
Central Hub to which all member banks submit their SDMX data
The ECB then pulls the entire dataset from the Central Hub
SDMX-based visualizations
Resources
UNSD - Handbook of Statistical Organization (3rd ed.)
  http://unstats.un.org/unsd/dnss/hb/default.aspx
UNECE - Making Data Meaningful (2 parts)
  http://www.unece.org/stats/documents/writing/
SDMX - http://sdmx.org/
Contacts
United Nations Statistics Hotline - [email protected]
Jacob Assa, UNSD - [email protected]