STRATEGY, PROGRAMMING AND PLANNING FOR BIG DATA IN OFFICIAL STATISTICS Facilitator: Donatella Fazio, Istat
Parallel sessions 3A and 3B
Session 3
Related Scheveningen challenges
[SCH2] A Big Data strategy for official statistics (3A-morning)
[SCH6] Strategic partnership with other branches of government (3A-morning)
[SCH1] New official statistics opportunities for addressing unmet user needs (3B-afternoon)
[SCH6] Strategic partnerships with private data providers (3B-afternoon)
Session 3: Organisation
Introduction of some key issues for discussing the above Scheveningen challenges
Open discussion
Expected outputs of Session 3 for a roadmap on the use of Big Data in Official Statistics
• List of experiences
• List of issues and possible approaches to address them
• List of suggested actions and priorities to deal with the identified issues
Parallel session 3A
Session 3A Related Scheveningen challenges
[SCH2] A Big Data strategy for official statistics
[SCH6] Strategic partnership with other branches of government
Session 3A Issues for discussion
Combining Big data with official statistics?
Replacing official statistics with Big data?
Filling new data gaps, i.e. developing new 'Big data - based' measurements to address emerging phenomena (not known in advance or for which traditional approaches are not feasible)?
Session 3A Issues for discussion
International collaboration vs national experiences
• Which framework for international collaboration:
• ESS?
• HLG?
• global working group as suggested by the United Nations Statistical Commission?
Session 3A Issues for discussion
Besides the methodological and technological issues:
• Legislative: access to and use of data
• Privacy: managing public trust and acceptance of data re-use and its linking to other sources
• Financial: costs of sourcing data vs. benefits; costs of using big data for statistical production
• Management: policies and directives about the management and protection of the data
• Training: new skills needed
Session 3A Issues for discussion
Pilot projects/isolated initiatives vs standard production:
• mapping the experiences carried out by NSIs and international organisations
• distilling the best practices
Big Data sources with potential application to Official Statistics: an inventory of them
• Sources from private sector and from public sector
• Additional/supplementary sources for official statistics: which domains?
• Alternative/replacing sources for official statistics: which domains?
Session 3A Issues for discussion
Getting Big Data from other Govt branches
• Medical records
• Data from sensors:
• Fixed sensors: home automation, weather/pollution sensors, traffic sensors/webcam, scientific sensors, security/surveillance videos/images
• Mobile sensors (tracking): mobile phone location, cars, satellite images
• Data from computer systems: logs, web logs
Session 3A Issues for discussion (cont.)
NSIs working alone and seizing the privileges of data access.
Need for exploration of Government Big Data sources before identification of potential outputs.
Session 3A Issues for discussion
European Statistical Programme 2013-2017:
• which place for Big Data?
National Statistical Programmes
• which allocation of resources in annual and multi-annual work programmes?
Distribution of exploration of Big Data sets by countries?
Session 3A Tasks for participants
1. Complete the list of suggested issues
2. Detail issues in sub-issues
3. Characterize each issue
4. Experiences to tell
5. If possible, sketch possible solutions to the issue
6. Discuss dependencies among issues
7. Propose a priority-based rating of issues
Session 3A: Initial list of issues
1. International collaboration: how to?
2. Legislative-Privacy-Financial-Management-Training: which issues?
3. Pilot projects vs standard production: how to proceed?
4. Inventory of BD sources: how to get from private and public?
5. Gov BD: explore potential outputs. NSIs working alone or jointly?
6. BD in European Statistical Programme: which place?
7. BD in National Statistical Programme: which place?
8. Exploration of BD sources by countries?
Highlight also your experiences in dealing with the issues above
Parallel session 3B
Session 3B Related Scheveningen challenges
[SCH1] New official statistics opportunities for addressing unmet user needs
[SCH6] Strategic partnerships with private data providers
Session 3B Issues for discussion
Context: different needs for different users.
Investigate which users' needs can be solved with Big Data:
• Data that do not yet exist
• More timely data
• More detailed data
• Others?
Big Data for addressing unmet user needs:
• Real-time data
• Spatial coverage
• Spatial comparability
• Reduction of response burden
Session 3B Issues for discussion
Getting Big Data from private providers (structured / unstructured):
• Telecom companies
• Mobile phone location
• Mobile data content: text messages
• Utilities (electricity, water)
• Social media and aggregator companies (Google, Facebook, etc.)
• Blogs and comments
• Personal documents
• Pictures: Instagram, Flickr, Picasa etc.
• YouTube etc.
• Internet searches
• User-generated maps
Session 3B Issues for discussion
Getting Big Data from private providers: (cont.)
• Commercial transactions
• Banking/stock records
• E-commerce
• Credit cards
• Data from sensors
• Data from computer systems
Need for exploration of sources before identification of potential outputs.
Session 3B Issues for discussion
Legal aspects of getting big data from private providers:
• Access vs. privacy. Which modifications are needed in statistical laws?
• International vs national access
Financial aspects:
• Buying Big Data?
• Fiscal incentives?
• Access for free?
Definition of ESS guidelines for standard agreements with private providers?
Session 3B Issues for discussion
Getting Big Data from citizens:
• Community data
• Crowd-sourced data
• Citizen reporting
How to ensure privacy and good use?
Session 3B Tasks for participants
1. Complete the list of suggested issues
2. Detail issues in sub-issues
3. Characterize each issue
4. Experiences to tell
5. If possible, sketch possible solutions to the issue
6. Discuss dependencies among issues
7. Propose a priority-based rating of issues
Session 3B: Initial list of issues
1. Users’ needs: which can be improved by BD?
2. Unmet users’ needs: which can be met by BD?
3. Getting BD from private providers: legal and financial aspects
4. Getting BD from private providers: international-national agreements?
5. Getting BD from citizens: how to ensure privacy and good use?
Highlight also your experiences in dealing with the issues above