Eurostat Unit B5 – Statistical Information Technologies
SDMX Basics Training – October 2011
The European Census Hub
Changing the Business Model
Adam Wroński, Marco Pellegrino, Nadezhda Vlahova, Bengt-Åke Lindblad
Eurostat, “Quality, Methodology and Information Systems” Directorate
The European Census Hub: key issues
Dissemination of the data from the 2011 population and housing censuses in the European Union
Massive amount of data produced and controlled by Member States
Comparable data, structured according to agreed “hypercubes”: harmonised concepts and definitions (Census Regulation)
Easy access to detailed census data (advanced functionalities)
Easy to use: tabulation of data from different sources
High accessibility of data and metadata
EU Census: Implementing measures
Regulation (EC) 763/2008 on population and housing censuses authorises the European Commission to adopt implementing measures on:
– technical specifications of the topics and their breakdown (Regulation (EC) 1201/2009)
– programme of the statistical data and metadata to be transmitted to Eurostat (Regulation (EU) 519/2010)
– quality reporting and technical format of data transmission (Regulation (EU) 1151/2010)
Technical format for data transmission
Article 6 of Regulation (EU) 1151/2010
– Member States shall transmit the required data conforming to the data structure definitions and related technical specifications provided by the Commission (Eurostat)
– The technical format to be used for the transmission of data and metadata for the reference year 2011 shall be the Statistical Data and Metadata eXchange (SDMX) format
– Member States shall store until 1 January 2025 the required data and metadata for any later transmission requested by the Commission (Eurostat)
A system of 60 Data Structure Definitions
60 DSDs built on SDMX 2.0 (standard concepts and codes) and in use for 31 countries of the European Statistical System
SDMX data message (cross-sectional format)
SDMX query message
Tests ongoing: started with dummy data and a sample hypercube, now continuing with real hypercubes
Infrastructure to be maintained (and data stored) at least until 2025
Which data for which geographical area?
Topics required for all geographical levels down to Local Administrative Units (LAU = municipalities)
Population topics
– Sex (SEX)
– Age (AGE)
– Legal marital status (LMS)
– Country/place of birth (POB)
– Country of citizenship (COC)
– Place of usual residence one year prior to the census (ROY)
– (Size of the) Locality (LOC)
– Household status (HST)
– Type of private household (TPH)
– Size of private household (SPH)
– Family status (FST)
– Type of family nucleus (TFN)
– Size of family nucleus (SFN)
Housing topics
– Occupancy status of conventional dwellings (OCS)
– Number of occupants (NOC)
– Useful floor space and/or Number of rooms (UFS/NOR)
– Density standard (DFS/DRM)
– Dwellings by type of building (TOB)
– Dwellings by period of construction (POC)
– Type of living quarters (TLQ)
Which data for which geographical area?
Topics required for aggregated geographical levels NUTS 2, NUTS 1 and nation
Population topics
– Year of arrival in the country (YAE / YAT)
– Educational attainment (EDU)
– Location of place of work (LPW)
– Current activity status (CAS)
– Occupation (OCC)
– Industry (IND)
– Status in employment (SIE)
– Tenure status of households (TSH)
Housing topics
– Housing arrangements (HAR)
– Type of ownership (of dwellings) (OWS)
– Water supply system (WSS)
– Toilet facilities (TOI)
– Bathing facilities (BAT)
– Type of heating (TOH)
Example: DSD for Table 6 (Marital Status)

Dimensions
ID      CONCEPT                  CODELIST
TIME    Time period or range     CL_TIME
GEO     Geographical area        CL_GEO
SEX     Sex                      CL_SEX
FST     Family status            CL_FST
LMS     Legal marital status     CL_LMS
CAS     Current activity status  CL_CAS
POB     Country/place of birth   CL_POB
COC     Country of citizenship   CL_COC
AGE     Age                      CL_AGE
FREQ    Frequency                CL_FREQ

Attributes
ID          ATTACHMENT LEVEL  CODELIST
OBS_STATUS  Observation       CL_OBS_STATUS
OBS_LEVEL   Observation       CL_OBS_LEVEL
OBS_NOTE    Observation
HC_NOTE     Series

Measures
ID         NAME
OBS_VALUE  Observation value
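
The dimension list of the Table 6 DSD can be captured as plain data and used to check that an observation key names every dimension. The following is a minimal Python sketch, not part of the Census Hub software: the `Dimension` class, the `validate_key` helper and the code values in the usage note are hypothetical, and the real code values are those of the codelists in the implementation kit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimension:
    id: str
    concept: str
    codelist: str


# Dimensions of the Table 6 DSD, in the order shown on the slide.
TABLE6_DIMENSIONS = [
    Dimension("TIME", "Time period or range", "CL_TIME"),
    Dimension("GEO", "Geographical area", "CL_GEO"),
    Dimension("SEX", "Sex", "CL_SEX"),
    Dimension("FST", "Family status", "CL_FST"),
    Dimension("LMS", "Legal marital status", "CL_LMS"),
    Dimension("CAS", "Current activity status", "CL_CAS"),
    Dimension("POB", "Country/place of birth", "CL_POB"),
    Dimension("COC", "Country of citizenship", "CL_COC"),
    Dimension("AGE", "Age", "CL_AGE"),
    Dimension("FREQ", "Frequency", "CL_FREQ"),
]


def validate_key(key, dimensions, codelists):
    """Return a list of problems found in an observation key (empty if none).

    `key` maps dimension IDs to codes. `codelists` maps a codelist ID to the
    set of allowed codes; dimensions whose codelist is not supplied are only
    checked for presence (hypothetical helper, for illustration only).
    """
    problems = []
    known_ids = {d.id for d in dimensions}
    for d in dimensions:
        if d.id not in key:
            problems.append(f"missing dimension: {d.id}")
            continue
        allowed = codelists.get(d.codelist)
        if allowed is not None and key[d.id] not in allowed:
            problems.append(f"{d.id}: code {key[d.id]!r} not in {d.codelist}")
    for extra in sorted(set(key) - known_ids):
        problems.append(f"unknown dimension: {extra}")
    return problems
```

With a made-up excerpt such as `{"CL_SEX": {"F", "M"}}`, a key that omits `AGE` yields `["missing dimension: AGE"]`, and a fully specified key with an allowed `SEX` code yields an empty list.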
The Hub concept
The Hub is based on the concept of data sharing, where a group of partners agree on providing access to their data according to standard processes, formats and technologies.
The SDMX Hub approach offers several advantages:
– decoupling of NSIs' systems from the central hub via standard formats and techniques for the exchange;
– NSIs are free to provide more information than what is contained in the agreed hypercubes without additional effort;
– limited investment and re-usability (with the advantage of using recognised international standards);
– the SDMX Reference Infrastructure (SDMX-RI) is furnished by Eurostat.
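
The data-sharing idea can be illustrated as a fan-out: the hub sends the same query to every partner and merges the answers, while the data stay with the partners. The sketch below is a simplified illustration in Python and does not reproduce the SDMX-RI or Census Hub code. The function names, the two-letter partner codes and the in-memory stand-ins for the partners' web services are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def query_hub(endpoints, query):
    """Send the same query to every partner endpoint and merge the results.

    `endpoints` maps a partner code to a callable that takes the query and
    returns a list of observation dicts. In a real hub each callable would
    wrap a web-service call; here they are in-memory stand-ins.
    """
    merged = []
    with ThreadPoolExecutor(max_workers=max(1, len(endpoints))) as pool:
        # Submit all requests first so that the partners are queried in parallel.
        futures = {geo: pool.submit(fetch, query) for geo, fetch in endpoints.items()}
        for geo in sorted(futures):
            for obs in futures[geo].result():
                merged.append({"GEO": geo, **obs})
    return merged


def make_stub(rows):
    """Build a stand-in endpoint that filters its own rows by the query."""
    def fetch(query):
        return [r for r in rows if all(r.get(k) == v for k, v in query.items())]
    return fetch


# Hypothetical partners "AA" and "BB" with made-up values.
ENDPOINTS = {
    "AA": make_stub([{"SEX": "F", "OBS_VALUE": 10}, {"SEX": "M", "OBS_VALUE": 12}]),
    "BB": make_stub([{"SEX": "F", "OBS_VALUE": 7}]),
}

result = query_hub(ENDPOINTS, {"SEX": "F"})
```

Because every partner answers in the same agreed structure, the hub only has to tag and concatenate the results; it needs no knowledge of how each partner stores its data.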
IT Architectures for Data Exchange
Data Repository (Warehousing) Architecture
[Figure: data repository architecture. Labels: NSI; Eurostat pull requestor; eDAMIS; data input; SDMX Registry (register, query); intermediate storage; verification / conversion to SDMX; received data in SDMX-ML; loader; warehouse storage; Eurobase; dissemination; XSL for SDMX-ML; PULL and PUSH modes]
The SDMX Hub architecture
[Figure labels: pull mode and data sharing; Hub; SDMX-RI infrastructure]
Through the SDMX Hub, a data user can…
Browse the Hub to define a dataset of interest, navigating via structural metadata:
– Select hypercube or search by topic (filters)
– Select data (level of detail, breakdowns)
– Select layout (axes)
View a table
Save a query
Export a file (CSV, Excel, SDMX-ML)
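
The "select layout" and "export" steps above amount to pivoting a list of observations onto two chosen axes and writing the table out. The sketch below shows this in Python with the standard library only. It is an illustration, not the Hub's implementation: the function names and sample values are made up, and it assumes that all dimensions other than the two axes are already fixed by the query.

```python
import csv
import io


def tabulate(observations, row_dim, col_dim):
    """Pivot observations into a table with `row_dim` down and `col_dim` across."""
    rows = sorted({o[row_dim] for o in observations})
    cols = sorted({o[col_dim] for o in observations})
    cells = {(o[row_dim], o[col_dim]): o["OBS_VALUE"] for o in observations}
    table = [[f"{row_dim}/{col_dim}"] + cols]
    for r in rows:
        # Combinations with no observation are left as empty cells.
        table.append([r] + [cells.get((r, c), "") for c in cols])
    return table


def to_csv(table):
    """Serialise the table as CSV text."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(table)
    return buf.getvalue()


# Made-up observations for two hypothetical areas "AA" and "BB".
observations = [
    {"GEO": "AA", "SEX": "F", "OBS_VALUE": 10},
    {"GEO": "AA", "SEX": "M", "OBS_VALUE": 12},
    {"GEO": "BB", "SEX": "F", "OBS_VALUE": 7},
]

table = tabulate(observations, "GEO", "SEX")
csv_text = to_csv(table)
```

Swapping the two axis arguments transposes the layout without touching the underlying observations, which is why the layout can be chosen after the data have been selected.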
How the Hub works
[Figure labels: Eurostat Census Hub; National Statistical Institute (shown twice)]
http://193.109.72.201:8080/CensusHub2/
Where are we - ESS?
• Countries integrated within the Hub (19) (end 2011)
Czech Republic, Germany, Ireland, Italy, Luxembourg, Poland, Portugal, Spain, Norway, Sweden, Slovenia, Bulgaria, Belgium, Latvia, Malta, Cyprus, Hungary (11/2012), Finland (11/2012), France*
• Expressing their interest or working on it, but not yet in the Hub (6)
Estonia, Greece, Lithuania, Netherlands, Switzerland, United Kingdom
• Asking for more information (6)
Austria, Denmark, Iceland, Liechtenstein, Romania, Slovakia
• 1-2 June 2010: 1st Census IT technical WG
• 14-15 February 2011: 2nd Census IT technical WG
• 7-8 December 2011: 3rd Census IT technical WG
• SDMX training, technical meetings
How to know more
CIRCA -> Eurostat -> X-DIS Census Hub -> Census Hub public documents
Read the experiences from the Member States that have already implemented the infrastructure
http://circa.europa.eu/Public/irc/dsis/x-dis-xensus-hub/home
In case of need, contact Eurostat unit B5 for technical advice
– SDMX training available
– Bilateral technical meetings can be arranged upon request
How to know more (2): The Implementation Kit
SDMX Structure files
– Concepts
– Codelists (excluding GEO)
– Keyfamilies (one for each hypercube)
– Partial Geo Codelist
SDMX data message example (cross-sectional), SDMX Query message example and Dummy hypercube in CSV format
MIG XML schema (XSD file)
Explanatory notes
Census Hub WS Implementation Guideline
Census Hub WS Security Guideline
Census 2011 Regulation
Census 2011 Explanatory notes
Census Hub environment
[Architecture diagram. Recoverable labels:]
– Census Hub Application: offline download; query management; cache; multi-threading; multilanguage support; result aggregation; query management and result management for NSI1, NSI2, [...] and the local WS
– Administration Application: property management; web service configuration management; user management; public user management; manage database connection
– LAU Management: LAU user management; view LAU code history; search LAU codes; import/export LAU codes; freeze/unfreeze LAU codes; publish LAU codes
– SMD Management: import codelist, schema, DSD; manage hypercube categorization; manage principal marginal
– Components: LAU Manager / NUTS Manager; LAU Validator; SMD Manager
– Databases: CensusHub DB; CensusHub LocalWS DB; SMD/LAU DB; an nsiDB for each NSI (NSI1, NSI2, other NSIs [...])
– Infrastructure: Weblogic 10.3 application server; LDAP/ECAS; SMTP; Internet connection
– Actors and flows: public users; user authentication; polls data; loads data; send
Lessons learnt and benefits of participating
Statistical needs first. Then, technological aspects.
Capacity-building is a must
– Participating organisations are gaining good in-house experience in SDMX and its implementation
A system of distributed databases is harmonised through the use of SDMX standards and content guidelines
SDMX-RI can be reused for sharing data in other domains
– Limited cost for installations, development costs can be reduced
– Step forward towards generic solutions for statistical domains
For more information:
Unit B5, Section «Standardisation and Advanced IT for Statistics»
Census Hub on CIRCA:
http://circa.europa.eu/Public/irc/dsis/x-dis-xensus-hub/library