MEETING TODAY’S DISSEMINATION CHALLENGES: Implementing international standards in .Stat
Prepared by Jonathan Challener, OECD
For MSIS, April 2014 - Dublin, Ireland
Don’t non-standard power supplies make things difficult?
What happens when standards are not applied well?
Picture: ‘The day Sweden changed from left-hand drive to right’
Confusion ensues
This all adds up…
…high costs…
and inefficiencies!
“A little like the grade 8 student who doesn’t pay attention in class all year.”
WHAT IS .STAT?
What is .Stat?
.Stat is the central repository ("warehouse") of validated statistics and related metadata
.Stat is the central hub connecting data production, sharing & dissemination processes
It is the corporate source of data for data sharing and dissemination purposes
“.Stat is now being used and shared with 10 organisations including the OECD, as part of the Statistical Information System Collaboration Community (SIS-CC)”.
.Stat Positioning in Statistical Information System
[Diagram labels: DATA PRODUCTION, .STAT, DATA DELIVERY, INTERNAL DATA SHARING, DATA DISSEMINATION]
“The diagram illustrates the .Stat contribution to the SIS processes. .Stat’s core value-added lies in “Data Delivery”, a set of functions that enable dissemination and data sharing, and “Data Upload”, a set of functions interfacing data production processes into a single upload mechanism to feed dissemination channels”.
.Stat Functional Representation
[Diagram labels, grouped by area]
• Inputs: DATA PRODUCTION, DATA PRODUCTION TOOLS
• .STAT DATA UPLOAD ENGINE: FILE UPLOAD, BATCH UPLOAD, SDMX IMPORT, SDMX INPUT, DATA MAPPING
• .STAT DATA DELIVERY ENGINE: DATA EXTRACTION SERVICES, TABLE & CHART EXTRACTION SERVICES, RELEASE MGT SERVICES, SDMX OUTPUT
• Outputs: DATA SHARING, DATA DISSEMINATION, WEBSITES, APPS, PUBLICATIONS, .STAT BROWSER, .STAT BROWSER CONFIGURATION, SEARCH ENGINES, DATA ANALYSIS TOOLS
• Other: SDMX GLOBAL REGISTRY, PUBLISHING BACK OFFICE, Other SDMX hubs
• Legend: .Stat Component; Process; Human user (Data Producer, Data Editor, Data Consumer); API or Webservice
“The grey shaded boxes in the figure below show a visual representation of how .Stat fits within a broader Data Dissemination Information System of organisations; the boxes with dotted lines represent other components of the Data Dissemination Information System that are not supported by .Stat but are enabled by it”.
.Stat Functional Representation
In particular, .Stat provides the following 3 key functional areas:
• .Stat Data Upload Engine
• .Stat Data Delivery Engine
• .Stat Data Browser
.Stat Positioning in GSBPM Reference Model
[Diagram legend: “.Stat contributes to”; “Planned additions”]
Archive incorporated into the over-arching process of data and metadata management
“.Stat can be mapped today to the Generic Statistical Business Process Model (GSBPM) under “Disseminate” and “Build”. In the future it will also incorporate archive functions as part of the over-arching process for data and metadata management”.
Multipurpose SDMX within .Stat…
For dissemination and data eXchange
SDMXWS and RESTful API
• SDMX 2.0 compliant
• SOAP + REST
• Pull
• SDMX-ML
• SDMX structural metadata created on the fly
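To make the "pull" style of access concrete, here is a minimal sketch of how a client might assemble a RESTful SDMX data query. It uses the path and parameter conventions of the SDMX 2.1 RESTful guidelines (`/data/{flow}/{key}`, `startPeriod`, `endPeriod`), which may differ from the SDMX 2.0 based .Stat endpoint described on this slide. The base URL, dataflow name and codes are hypothetical.

```python
from urllib.parse import urlencode


def build_sdmx_key(selections):
    """Build an SDMX REST series key.

    `selections` holds one list of codes per dimension, in DSD order.
    Codes within a dimension are joined with "+", dimensions with ".";
    an empty list leaves the dimension wildcarded.
    """
    return ".".join("+".join(codes) for codes in selections)


def build_data_url(base_url, dataflow, selections, start=None, end=None):
    """Assemble a data query URL (SDMX 2.1 REST style, an assumption here)."""
    url = f"{base_url.rstrip('/')}/data/{dataflow}/{build_sdmx_key(selections)}"
    params = {}
    if start:
        params["startPeriod"] = start
    if end:
        params["endPeriod"] = end
    return url + ("?" + urlencode(params) if params else "")


if __name__ == "__main__":
    # Hypothetical endpoint, dataflow and codes, for illustration only.
    print(build_data_url(
        "https://example.org/sdmx",
        "EXAMPLE_FLOW",
        [["AUS", "NZL"], [], ["Q"]],
        start="2010",
        end="2012",
    ))
```

The key is positional, so the client needs the dimension order from the data structure definition before it can build a query.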
For ‘Open Data’ dissemination
SDMX-JSON (beta)
• In mid-2013 the SDMX-TWG agreed on a proposal covering data and their structural metadata (including flat and sliced layouts), with referential metadata (dataset, series, observation) carried as annotations
• Further enhancements to come: complete data structures and referential metadata
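As an illustration of why a JSON layout suits open data consumers, the sketch below flattens a small series-level message into plain records. The sample message is invented and simplified; its field names (`structure`, `dimensions`, `dataSets`, `series`, `observations`) follow the general shape of SDMX-JSON data messages and are an assumption with respect to the 2014 beta served by .Stat.

```python
import json

# Invented, simplified sample message (not real OECD data).
SAMPLE = """
{
  "structure": {
    "dimensions": {
      "series": [
        {"id": "LOCATION", "values": [{"id": "AUS"}, {"id": "NZL"}]},
        {"id": "SUBJECT", "values": [{"id": "GDP"}]}
      ],
      "observation": [
        {"id": "TIME_PERIOD", "values": [{"id": "2012"}, {"id": "2013"}]}
      ]
    }
  },
  "dataSets": [
    {"series": {
      "0:0": {"observations": {"0": [1.5], "1": [1.7]}},
      "1:0": {"observations": {"1": [0.2]}}
    }}
  ]
}
"""


def flatten(message):
    """Resolve index-based series and observation keys into coded records."""
    dims = message["structure"]["dimensions"]
    series_dims = dims["series"]
    obs_dim = dims["observation"][0]
    rows = []
    for series_key, series in message["dataSets"][0]["series"].items():
        # A series key such as "1:0" holds one value index per series dimension.
        indexes = [int(i) for i in series_key.split(":")]
        coords = {d["id"]: d["values"][i]["id"] for d, i in zip(series_dims, indexes)}
        for obs_key, obs in series["observations"].items():
            row = dict(coords)
            row[obs_dim["id"]] = obs_dim["values"][int(obs_key)]["id"]
            row["value"] = obs[0]  # first array element is the observation value
            rows.append(row)
    return rows


if __name__ == "__main__":
    for record in flatten(json.loads(SAMPLE)):
        print(record)
```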
For data reporting
SDMX Reference Infrastructure (SDMX-RI)*
• SDMX 2.0 and 2.1 compliant
• SOAP + REST
• SDMX Common APIs (SdmxSource.NET)
• Pull + Push
• SDMX-ML, GESMES, CSV
• Structural metadata stored in mini registry
• One web service, several mapped database instances
[Diagram labels: Mapping Store DB, Mapping Assistant, SDMX-RI Web Service, Dissemination, .Stat Data warehouse, SDMX-RI]
* The integration of SDMX-RI in .Stat is based on collaboration with Eurostat, provider of the SDMX-RI component with ISTAT taking the lead on behalf of the OECD’s Statistical Information System Collaboration Community.
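The idea of "mapped database instances" can be sketched as a lookup that relates local warehouse columns and codes to SDMX concepts. This is only an illustration of the concept and not the SDMX-RI API or its Mapping Store schema; the column names, codes and the `MAPPING` structure are hypothetical.

```python
# Hypothetical mapping: SDMX concept -> (local column, optional code translation).
MAPPING = {
    "REF_AREA": ("country_cd", {"36": "AUS", "554": "NZL"}),
    "TIME_PERIOD": ("yr", None),
    "OBS_VALUE": ("val", None),
}


def to_sdmx_record(row, mapping):
    """Translate one local database row into SDMX concept/value pairs."""
    record = {}
    for concept, (column, translation) in mapping.items():
        local_value = row[column]
        # Apply the code translation when one is defined, else pass through.
        record[concept] = translation[local_value] if translation else local_value
    return record


if __name__ == "__main__":
    local_row = {"country_cd": "554", "yr": "2013", "val": 0.2}
    print(to_sdmx_record(local_row, MAPPING))
```

Keeping the mapping as data rather than code is what would let a single web service front several differently structured databases.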
For internal data sharing
DirectAccess
• RESTful SDMX query
• Flat data, flags, units
• Referential metadata
Excel add-in
• DirectAccess (REST SDMX)
• Native Excel pivot table
• Wizard to select data
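To show how flat records relate to a pivot-table view, the following sketch cross-tabulates a list of flat rows. It is an illustration only: the field names and sample values are hypothetical, and it does not reproduce how DirectAccess or the Excel add-in work internally.

```python
from collections import defaultdict


def pivot(flat_rows, row_field, col_field, value_field="value"):
    """Cross-tabulate flat records into {row label: {column label: value}}."""
    table = defaultdict(dict)
    for record in flat_rows:
        table[record[row_field]][record[col_field]] = record[value_field]
    return {label: dict(cells) for label, cells in table.items()}


if __name__ == "__main__":
    # Hypothetical flat extract: one record per observation, with a flag.
    flat = [
        {"LOCATION": "AUS", "TIME": "2012", "value": 1.5, "flag": ""},
        {"LOCATION": "AUS", "TIME": "2013", "value": 1.7, "flag": "E"},
        {"LOCATION": "NZL", "TIME": "2013", "value": 0.2, "flag": ""},
    ]
    print(pivot(flat, "LOCATION", "TIME"))
```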
For a decentralised publishing environment
DataHub*
• One interface to the publishing tools
• Centralised reporting and auditing
• SDMX-based structural metadata and referential metadata management
• Flexible load tool that promotes ‘self publish’ for data custodians
• In-built checks and safeguards to minimise errors
• Manages security and access rights
• Can be extended to manage other outputs and not limited to .Stat
* DataHub has been developed and integrated with .Stat by Statistics NZ, with an additional connection to the Fusion Registry for managing structural metadata through the definition of DSDs.
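One way to picture "in-built checks" driven by structural metadata is a validation pass that compares each uploaded observation with the code lists of a DSD. The sketch below is an assumption about what such a check could look like, not DataHub's implementation; the DSD contents and function name are hypothetical.

```python
def validate(observations, dsd):
    """Check observations against a DSD given as {dimension: allowed codes}.

    Returns a list of (row number, dimension, problem) tuples; an empty
    list means every observation passed.
    """
    errors = []
    for number, obs in enumerate(observations, start=1):
        for dimension, allowed in dsd.items():
            if dimension not in obs:
                errors.append((number, dimension, "missing"))
            elif obs[dimension] not in allowed:
                errors.append((number, dimension, f"unknown code {obs[dimension]!r}"))
    return errors


if __name__ == "__main__":
    # Hypothetical DSD and upload, for illustration only.
    dsd = {"REF_AREA": {"AUS", "NZL"}, "FREQ": {"A", "Q"}}
    upload = [
        {"REF_AREA": "NZL", "FREQ": "A", "OBS_VALUE": 0.2},
        {"REF_AREA": "XXX", "OBS_VALUE": 1.0},
    ]
    for problem in validate(upload, dsd):
        print(problem)
```

Rejecting a load before publication, with a row-level error list, is the kind of safeguard that makes self-publishing by data custodians practical.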
Future outlook…
Further SDMX artifact support
SDMX ingest (Import)
SDMX global registry API
SDMX-RDF data cube vocabulary pilot
“Explore further semantic web/linked data opportunities (SDMX-RDF data cube vocabulary). To be taken forward by ISTAT and ABS under the SIS-CC umbrella”.
We all know the…
• Lower technology adoption costs
• Increased development consistency, simplicity and predictability
• Improved code reuse
• Reduced cost, time and effort to transition between different solutions
• Reduced focus on infrastructure
• Ability to create composite interfaces that are tailored to the needs of a specific task
• Improved application portability
• Faster time to market, because it is easier to use off-the-shelf components and applications that can integrate and provide features for the solution
References
1. Operationalising .Stat in a decentralised publishing environment (DataHub), by Tony Breen, Statistics NZ: https://community.oecd.org/docs/DOC-68362
2. Building a scalable architecture (.Stat), by Jens Dosse, OECD: https://community.oecd.org/docs/DOC-68363
3. SDMX-RI and .Stat integration, by Francesco Rizzo, ISTAT: https://community.oecd.org/docs/DOC-68696
4. SDMX-JSON API: http://stats.oecd.org/opendataapi/Index.htm
Jonathan Challener, OECD
[email protected]
@Challener
MSIS - Dublin, 14-16 April 2014
Meeting today’s dissemination challenges: Implementing international standards in .Stat
Thank you