WP-K Methodology and Quality: Big Data Typification
Magdalena Six, Sonia Quaresma, Piet Daas, Alexander Kowarik, Tiziana Tuoto, Jacek Maslankowski
Implementation Track Kick-off Meeting, Vienna, 9-10 December 2019
ESSnet Big Data II, WPK: Methodology and Quality
Typification – based on the ideas we discussed during the kick-off
What is the purpose of the Typification Matrix?
• To consolidate the knowledge gained in this ESSnet (and, to a limited extent, outside it: in academia and the non-ESS official statistics community)
• To assess a big data source's level of maturity for statistical exploitation
• To help NSIs establish their strategy for statistical production using big data
To establish a big data source's overall level of maturity for statistical exploitation, we will consider the elements that characterize the source:
- Source & Access - the type of access, legal and ethical issues, and other limitations such as technical or monetary matters
- Metadata - what is already known about the data source
- Data Type - formats, volumes and structures
To delineate a strategy for statistical production using big data, every element has to be considered in terms of:
- Description - of the data source, which is crucial for the next steps
- Challenges - that the data source poses
- Procedures - to tackle the challenges, where these have already been identified
- Investment - required to put the foreseen treatments in place
- Roadmap - for implementing the identified procedures, whenever it is possible to devise one
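The structure described above (three elements, each considered under five headings) can be sketched as a small data structure. This is only an illustrative sketch, not the format of the actual survey: the class name `TypificationMatrix`, the term "aspects" for the five headings, and the example entry are assumptions made for this example.

```python
from dataclasses import dataclass, field

# Rows and columns as named on the slides.
ELEMENTS = ("Source & Access", "Metadata", "Data Type")
# "Aspects" is a label chosen for this sketch, not a term from the slides.
ASPECTS = ("Description", "Challenges", "Procedures", "Investment", "Roadmap")


@dataclass
class TypificationMatrix:
    """One matrix per big data source: elements (rows) x aspects (columns)."""
    source_name: str
    cells: dict = field(default_factory=dict)

    def fill(self, element: str, aspect: str, text: str) -> None:
        """Store free text in one cell, rejecting unknown rows or columns."""
        if element not in ELEMENTS:
            raise ValueError(f"unknown element: {element!r}")
        if aspect not in ASPECTS:
            raise ValueError(f"unknown aspect: {aspect!r}")
        self.cells[(element, aspect)] = text

    def missing(self) -> list:
        """Return the (element, aspect) cells that are still empty."""
        return [(e, a) for e in ELEMENTS for a in ASPECTS
                if not self.cells.get((e, a))]


# Hypothetical usage: the entry text is a placeholder, not real survey content.
matrix = TypificationMatrix("MNO data")
matrix.fill("Source & Access", "Description", "placeholder description")
print(len(matrix.missing()), "cells still to be filled")
```

Listing the empty cells makes it easy to see where, for a given source, a roadmap or investment estimate is not yet available.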
• The matrix was developed on Confluence, but there were difficulties when filling it in or printing it
We therefore developed an online survey: https://ec.europa.eu/eusurvey/runner/BGWPK_TypificationMatrix
We provided a prefilled example: https://webgate.ec.europa.eu/fpfis/wikis/pages/viewpage.action?spaceKey=EstatBigData&title=Big+Data+Typification+Example+-+MNOs
We also explained what we expect to hear from you in our instructional video: https://cloud.ine.pt/index.php/s/Us2tJKLCNFBDFfm
We are aware that in most cases it is too early for roadmap and investment forecasts, especially for the pilots track
In some cases the treatments are still under research, and further challenges may still come up
But we have already identified some bottlenecks in statistical production with big data, and will present some of the first results
Summary of results for Source & Access
The second wave is more ambitious in terms of sources!
What is the purpose of the Typification Matrix?
Now in a more practical form!
• Identify the basic building blocks required by each stage, and whether they can be reused across different data sources at similar stages
• Establish how to ascertain data quality in the particular examples, generalizing whenever possible
• Collect a pool of methodologies for dealing with specific problems, thereby gathering knowledge
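The first point, finding building blocks that recur across data sources at the same stage, can be sketched as a simple grouping. This is a minimal sketch under stated assumptions: the function name `reusable_blocks`, the record layout, and all source, stage and block names are hypothetical and do not come from the survey results.

```python
from collections import defaultdict


def reusable_blocks(usage):
    """Map each (stage, block) pair to the sources that use it,
    keeping only pairs shared by at least two sources."""
    sources_by_block = defaultdict(set)
    for source, stage, block in usage:
        sources_by_block[(stage, block)].add(source)
    return {key: sorted(sources)
            for key, sources in sources_by_block.items()
            if len(sources) >= 2}


# Hypothetical records: (data source, processing stage, building block).
usage = [
    ("source A", "pre-processing", "deduplication"),
    ("source B", "pre-processing", "deduplication"),
    ("source A", "estimation", "weighting"),
    ("source B", "linking", "record linkage"),
]

shared = reusable_blocks(usage)
print(shared)
```

Keying on the pair (stage, block) rather than on the block alone reflects the requirement that reuse happens at similar stages.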
What is the purpose of the Typification Matrix?
To find common building blocks, but also to help you describe your work package in a way that makes it:
• Easier for others to replicate and evaluate (would this source be useful to us? Could we reuse this processing block, or part of it?)
• More generic to the type of problem and less focused on intricate details, generalizing whenever possible
• More modular and thus more reusable, thereby gathering knowledge
Phrasing in a constructive and informative way
The second track focuses more on combining data
Again more ambitious!
Identifying the building blocks in the matrix (WPs/data classes) that bridge the quality issues and the methodologies followed
Timeline & Deliverables w.r.t. Typification
• M9: First draft of the quality guidelines
• M13: Updated literature review; revised version of quality guidelines; quality report template draft
• M17: First draft of methodological report; revised quality report template; updated and extended literature review
• M18: Typification Matrix for big data projects
• M24: Evolution roadmap between the areas of the typification matrix
• M25: Report describing quality aspects of the different pilots; revised literature overview; report describing the methodological steps of using big data in official statistics, with a section on the most important questions for the future, including guidelines
Thank you for your attention! Any questions or comments?
Contact: Magdalena Six, Statistics Austria. Email: [email protected]