Eurostat Assessing the quality of integrated data (ESSnet on quality of multisource statistics) Sorina Vâju Eurostat, European Commission


  • Eurostat

    Assessing the quality of integrated data

    (ESSnet on quality of multisource statistics)

    Sorina Vâju Eurostat, European Commission

  • Eurostat

    Summary

    1. Problem statement – quality in a multisource environment

    2. ESS.VIP ADMIN

    3. ESSnet on quality of multisource statistics

  • Eurostat

    How to measure quality in a new environment?

    Multisource statistics

    • Surveys

    • Admin data

    • Big data

    Newly available sources

    New demands

    Cost and burden

    Are we still measuring quality in a proper way?

  • Eurostat

    Quality facets

    Input

    Quality of raw data

    Whether and how a given data source can be used on a regular basis to produce statistics

    Process

    Whether final data is “real”

    Magnitude of errors introduced in the processing stage

    Analysis of the statistical process

    Output

    Easy-to-understand information for users on the quality of the final data

  • Eurostat

    Output quality assessment by input and process

    Process step | Risk | Impacted quality dimension | Error measurement

    Linkage and determination of the target population | Missed link, wrong link: under/over coverage | Accuracy, comparability | Bias, confidence range of the target population

    Concept/definition | Aggregation of different concepts/definitions | Relevance, accuracy, comparability | Bias, variance error, qualitative assessment

    Imputation/estimation | Estimation error | Accuracy | Bias, variance error

    Classification | Wrong classification | Relevance, accuracy, comparability below a certain level of aggregation | Bias, variance error
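
The linkage step listed above names missed links and wrong links as the risk. As a rough illustration only, and assuming a verified set of true links exists for an audit sample (the slides do not say how such a set would be obtained), the two error rates could be computed along these lines. The function name and the record identifiers are hypothetical.

```python
def linkage_error_rates(true_links, made_links):
    """Compare links produced by a linkage step against a verified set.

    Both arguments are collections of (id_in_source_a, id_in_source_b) pairs.
    Returns (missed_rate, wrong_rate):
      missed_rate = share of true links the process failed to make
      wrong_rate  = share of made links that are not true matches
    """
    true_links = set(true_links)
    made_links = set(made_links)
    missed = true_links - made_links   # true pairs that were not linked
    wrong = made_links - true_links    # linked pairs that are not true matches
    missed_rate = len(missed) / len(true_links) if true_links else 0.0
    wrong_rate = len(wrong) / len(made_links) if made_links else 0.0
    return missed_rate, wrong_rate


# Hypothetical audit sample: four verified links, four links made by the process.
verified = {("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")}
produced = {("a1", "b1"), ("a2", "b2"), ("a3", "b9"), ("a5", "b5")}
missed_rate, wrong_rate = linkage_error_rates(verified, produced)
print(f"missed-link rate: {missed_rate:.2f}, wrong-link rate: {wrong_rate:.2f}")
```

Missed links point towards undercoverage of the target population and wrong links towards overcoverage or misassigned values, so the two rates are kept separate rather than merged into a single score.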


  • Eurostat

    Is it feasible to assess output quality through input/process quality?

    Multiple sources

    • Surveys

    • Admin data

    • Big data

    Multiple uses

    • Direct use

    • Sampling frame

    • Auxiliary information

    • Calibration

    Complex processes

    International level

    • ESS aggregation

    • International aggregation

    Multisource output

  • Eurostat

    Is input and process assessment enough?

    Useful for

    • Improving the process

    • Designing the process

    • Deciding on which sources to use

    Less useful for

    • Will the user understand?

    • Which final data is better?

    • Is the output good enough?

  • Eurostat

    Alternative: assessment based on the output itself

    Exclusively based on output

    • Time series/cross-sectional data

    • Breaks in series

    • Revisions

    • Outliers

    Based on reference source

    • Comparison with other statistics/sources

    • Quality surveys

    Bootstrapping based on admin data

    • Primary/complementary data

    • Support sampling

    • Auxiliary information
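
Two of the output-only checks named above, revisions and outliers, lend themselves to a small sketch. The indicator definitions below are common conventions chosen for illustration, not measures prescribed by the slides: mean absolute revision between a first and a later release, and an outlier flag based on the median absolute deviation (modified z-score with the customary 0.6745 constant and 3.5 cut-off). All names and figures are hypothetical.

```python
from statistics import median


def mean_absolute_revision(first_release, latest_release):
    """Average absolute difference between first and latest published values."""
    pairs = list(zip(first_release, latest_release))
    return sum(abs(latest - first) for first, latest in pairs) / len(pairs)


def flag_outliers(series, threshold=3.5):
    """Flag values whose modified z-score (MAD-based) exceeds the threshold."""
    med = median(series)
    mad = median(abs(x - med) for x in series)
    if mad == 0:
        # No spread around the median: nothing can be flagged on this basis.
        return [False] * len(series)
    return [abs(0.6745 * (x - med) / mad) > threshold for x in series]


# Hypothetical published figures for four reference periods.
first = [100.0, 102.0, 101.0, 105.0]
latest = [101.0, 102.0, 103.0, 104.0]
revision = mean_absolute_revision(first, latest)

# Hypothetical series with one suspicious value at the end.
flags = flag_outliers([100, 101, 99, 102, 100, 150])
print(revision, flags)
```

Both functions use only the published output, which is the point of this branch of the slide: no knowledge of the sources or of the processing steps is needed.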

  • Eurostat

    Why the ESSnet on quality of multisource statistics?

    Problem

    • Complex environment

    • Input + process assessment is cumbersome and not sufficient

    Suggestions

    • Direct assessment on the basis of the output

    • Use of reference source

    • Bootstrapping

    Needs

    • Develop quality indicators based on output

    • Develop step-by-step algorithms for implementation

    • Cost-benefit analysis

    • Update templates for quality reports
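
The slides list bootstrapping as a suggestion without describing a procedure. As a minimal sketch, assuming the generic nonparametric bootstrap (resampling units with replacement) rather than any method specific to the ESSnet, a standard error for an output estimate could be approximated as follows. The function name, the default number of replicates, the seed and the data are all hypothetical.

```python
import random
from statistics import mean, stdev


def bootstrap_standard_error(values, estimator=mean, replicates=1000, seed=12345):
    """Approximate the standard error of `estimator` by resampling `values`.

    Each replicate draws len(values) units with replacement and recomputes
    the estimate; the spread of the replicate estimates is the result.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    estimates = []
    for _ in range(replicates):
        resample = [values[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    return stdev(estimates)


# Hypothetical unit-level values taken from an integrated data set.
incomes = [21.0, 25.5, 19.8, 30.2, 27.4, 22.1, 24.9, 26.3]
se = bootstrap_standard_error(incomes)
print(f"bootstrap standard error of the mean: {se:.3f}")
```

A plain resampling of units ignores linkage, imputation and classification errors; reflecting those would require resampling or re-running the corresponding process steps, which is part of what the step-by-step algorithms listed under "Needs" would have to spell out.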

  • Eurostat

    ESS.VIP ADMIN areas of work

    Access to data

    Quality measurement

    Methodology for integrating sources

    Frames for social statistics

    Use of Commission administrative data

    ESSnet on quality of multisource statistics

  • Eurostat

    ESSnet on quality of multisource statistics

    Input quality

    Quality of frames for social statistics

    Output quality

  • Eurostat

    Work area 1: quality of input

    Tasks:

    • Critical review and testing of existing methodology

    • Consolidated version of input checklist

    • Gap analysis

    Delivery: June 2016


  • Eurostat

    Work area 2: quality of frames for social statistics

    Tasks:

    • Review of literature and comparative analysis of frame types

    • Gap analysis

    • Development and test of some quality measures

    • Proposal for further work

    Delivery: November 2016, April 2017

  • Eurostat

    Work area 3: quality evaluation of statistical output based on multiple sources

    Tasks:

    • Critical review of existing quality measures and approaches

    • Tests of the suitability of existing measures and approaches in several domains

    • Action plan for developing a theoretical framework for measuring output quality

    Delivery: November 2016, June 2017

  • Eurostat

    Additional work

    Communication and dissemination:

    • CROS portal will be used for dissemination

    • Workshop on quality of multisource statistics, 21-22 April, Budapest

    Further work up to 2019

    • Framework and indicators for assessing the quality of frames for social statistics

    • Framework and indicators for assessing the quality of output

    • Recommendations for the ESS Handbook for Quality Reports

  • Eurostat

    Contact

    ESS.VIP ADMIN

    • Sorina Vâju (Eurostat): [email protected]

    ESSnet on quality of multisource statistics

    • Niels Ploug (Statistics Denmark): [email protected]

    • Sorina Vâju (Eurostat): [email protected]
