Rev Quality Concept for Official Statistics

Data availability and quality

STATISTICS SWEDEN

Material 12

International Consulting Office

Quality Concept for Official Statistics

Overview of the quality concept

1 CONTENTS
  1.1 Statistical target characteristics
    1.1.1 Units and population
    1.1.2 Variables
    1.1.3 Statistical measures
    1.1.4 Study domains
    1.1.5 Reference times
  1.2 Comprehensiveness

2 ACCURACY
  2.1 Overall accuracy
  2.2 Sources of inaccuracy
    2.2.1 Sampling
    2.2.2 Frame coverage
    2.2.3 Measurement
    2.2.4 Non-response
    2.2.5 Data processing
    2.2.6 Model assumptions
  2.3 Presentation of accuracy measures

3 TIMELINESS
  3.1 Frequency
  3.2 Production time
  3.3 Punctuality

4 COMPARABILITY and COHERENCE
  4.1 Comparability over time
  4.2 Comparability between domains
  4.3 Coherence with other statistics

5 AVAILABILITY and CLARITY
  5.1 Dissemination forms
  5.2 Presentation
  5.3 Documentation
  5.4 Access to micro data
  5.5 Information services

Elaboration on the quality components

This chapter specifies and discusses the components of the quality concept, with emphasis on their descriptive meaning. For some of them comments are made on users' views. The discussion follows the structure of the quality concept, in particular for numbering components and sub-components.

1 CONTENTS of the statistics concerns the statistical target characteristics

1.1 Statistical target characteristics

A target characteristic (with target parameter as a common synonym) is defined as follows. A specific statistical measure is used to summarise the individual variable values for the objects in a group. The totality of objects under consideration is referred to as the population.

1.1.1 - 1.1.3 Objects and population, variables, and statistical measures

These quality components concern definitions of objects, variables and statistical measures which are involved in the target characteristics.

1.1.4 Study domains

Most surveys present statistics not only for the entire population, but also for study domains (subpopulations and (sub)groups are common synonyms). One speaks of "breaking down" statistics to subgroups. A premise for breaking down is, of course, that the variables which specify subgroups are known (by observation or otherwise). There are restraining factors, however, for how far statistics may be broken down. One is simply that the more subgroups one takes into account, the more work and publication space is required. Another is that integrity may be jeopardised. Moreover, when statistics from sample surveys are derived for successively smaller domains, their accuracy deteriorates and ultimately becomes so poor that the statistics are no longer meaningful. Breaking down has to be terminated at an appropriate level.
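How accuracy deteriorates for successively smaller domains can be illustrated by a minimal sketch. It assumes simple random sampling, a normal approximation and a hypothetical standard deviation for the study variable; the function name and all figures are illustrative assumptions, not taken from any SCB survey.

```python
import math

def margin_of_error(std_dev: float, n: int, z: float = 1.96) -> float:
    """Approximate 95 % margin of error for a mean under simple random
    sampling (finite population correction ignored)."""
    return z * std_dev / math.sqrt(n)

# Assumed standard deviation of the study variable (e.g. income in SEK).
STD_DEV = 100_000.0

# The same variable, estimated for domains with ever fewer sampled objects.
for n in (10_000, 1_000, 100, 10):
    print(f"domain sample size {n:>6}: margin +/- {margin_of_error(STD_DEV, n):,.0f} SEK")
```

Under these assumptions the margin grows with the inverse square root of the domain sample size, so a domain with one hundredth of the sample has a margin ten times as wide.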

The foremost aspects of this quality component are:

- Specification of what kind of variables are used to form study domains.

- Specification of how far-reaching the division into study domains is.

Objects and variables have reference times, as is further discussed in point 1.1.5.

Example

Each year SCB conducts an Income survey (HINK), which reports on the income distribution for Sweden's population. The reference period is the calendar year. A central object in the survey is the "family", consisting of either two adults (≥ 18 years) living together "as if married" and the children (< 18 years) who might live with them, or a single adult and the children who might live with him/her.

The population consists of all families who lived in the country during at least half the calendar year.

Examples of variables for families: Wages during the year, disposable income during the year, socio-economic group, number of children.

Examples of study domains: Two-adult families with three or more children, families of a single adult with children, farmer families.

The dominant types of statistical measure are means and counts ("number of"), but others, e.g. medians, also occur.

Examples of target characteristics: Average wage during the year for blue-collar worker families, average disposable income during the year for two-adult families without children, median income for senior salaried employee families, number of farmer families.
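The way a target characteristic combines objects, a study domain, a variable and a statistical measure can be sketched as follows. The family records, field layout and function name are hypothetical and serve only as an illustration; they do not reproduce HINK data or definitions.

```python
from statistics import mean, median

# Hypothetical family records:
# (family type, number of children, disposable income during the year in SEK)
families = [
    ("two-adult", 0, 310_000),
    ("two-adult", 3, 420_000),
    ("two-adult", 0, 290_000),
    ("single-adult", 1, 180_000),
    ("single-adult", 2, 205_000),
]

def target_characteristic(records, in_domain, measure):
    """Apply a statistical measure to the variable values of all objects
    that belong to the study domain."""
    values = [income for (ftype, children, income) in records
              if in_domain(ftype, children)]
    return measure(values)

# Average disposable income for two-adult families without children.
avg_no_children = target_characteristic(
    families, lambda t, c: t == "two-adult" and c == 0, mean)

# Number of single-adult families with children.
n_single_with_children = target_characteristic(
    families, lambda t, c: t == "single-adult" and c > 0, len)

# Median disposable income for the entire population.
median_all = target_characteristic(families, lambda t, c: True, median)

print(avg_no_children, n_single_with_children, median_all)
```

Changing only the domain condition or only the measure yields a different target characteristic from the same objects and variable.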

Comments on statistical target characteristics

A user's wishes about statistical information depend on his/her subject matter problem. Of course he/she wants the target characteristics to relate closely to the concrete problem. The better objects, variables, statistical measures, and reference times agree with his/her interests, the better the quality is as regards target characteristics. A user who may participate in the planning of a new survey or in the re-designing of an existing one has the opportunity to influence the choices of objects, variables, statistical measures, and reference times. However, since surveys normally have many users, conflicting wishes often turn up, and compromises have to be made. Even in situations with consensus about target characteristics, cost and/or timeliness considerations may lead to target characteristics that differ from the ideal ones.

Comment on study domains

If a user's interest focuses on certain specific groups, he/she of course is keen to have statistics reported for these groups. If the user's interests are more general, he/she may attach most importance to a detailed division into study domains.

1.1.5 Reference time

Objects and variables have temporal dimensions. The time may be narrowly delimited, e.g. be a specific date, which is called the reference time point. A longer time span, e.g. a calendar year, is referred to as the reference period. Usually reference times agree for variables and objects in a target characteristic, but they need not. Several reference times may be involved in a target characteristic, notably in those which measure changes.

For statistics from recurrent surveys (e.g. about average yearly wages) it is natural to view population and variables as being the same in each survey round (although "updated" between rounds), while the reference time changes. For statistics from one-time surveys it is often more natural to view reference time as part of the definitions of objects and variables.

(Example: Objects may be defined as "women aged 16-50 years who lived in Stockholm July 1st, 1998".) Whether it is most convenient to regard reference times as "independent" or as "part of the object/variable" depends on the specific situation.

Example

Object and variable with the same reference time: Average work income during 1998 for persons who lived in the country during 1998.

Object and variables with different reference times: Salary during 1998 for persons who graduated as civil engineers in 1988.

Variables with different reference times: That the consumer price index for July 1998 was 257 means (somewhat sweepingly) that a specific "commodity basket" which cost 100 SEK in the base year 1980 cost 257 SEK in July 1998. Here July 1998 as well as 1980 appear as reference times for prices.

1.2 Comprehensiveness

This quality component concerns how well the output from a "statistics system" describes a subject matter field in its "vital aspects". (For instance, how well the labour market statistics inform about vital aspects of the labour market.) However, no statistics system will live up to what all users regard as all the vital aspects of the subject matter field, but it can do so to a greater or lesser degree.

2 ACCURACY of the statistics concerns agreement between statistics and target characteristics

Usually there are so many disturbances in a statistical survey that the statistics do not give exact values of the target characteristics, only estimates. Here, the guiding principle is to obtain unbiased ("correct on average") estimates. Normally there is a discrepancy/deviation ("error") between a statistic and its target value. The smaller the discrepancy is, the more accurate the statistic is. Of course users want discrepancies to be small, preferably negligible.

Often, however, deviations are not negligible, and this is particularly the case for statistics from sample surveys. Then, at least statistically knowledgeable users want numerical bounds for the deviations, referred to as accuracy measures or uncertainty measures. The derivation of such measures is a somewhat intricate matter, though, since the discrepancies depend on the target values, which are unknown. (If they were known, a survey would be superfluous.) Statements about accuracy of statistics inevitably fall in the category "statements concerning prevailing uncertainty", a conceptually intricate topic. The common way to present accuracy information is by uncertainty intervals, as indicated below. It is likely that the value of the target characteristic lies in:

uncertainty interval = statistic ± margin of uncertainty.

Often such an interval is a confidence interval with a specified confidence level. If this is 95 %, intervals constructed in this way encompass the target value in about 95 % of cases. Other accuracy measures, in essence equivalent to a confidence interval, are: standard deviation (of the estimate), coefficient of variation and relative margin of error. Sometimes uncertainty intervals are based on special evaluations of the survey, subject matter knowledge, experience, etc. Then one talks about judgement intervals.
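The relation between these accuracy measures can be sketched for the simplest case, an estimated mean. The sketch assumes simple random sampling from a large population and a normal approximation (z = 1.96 for roughly 95 %); the sample values and the function name are hypothetical.

```python
import math
from statistics import mean, stdev

def accuracy_measures(sample, z=1.96):
    """Estimate a mean and derive common accuracy measures for it.

    Assumes simple random sampling from a large population and a normal
    approximation for the confidence interval.
    """
    n = len(sample)
    estimate = mean(sample)
    std_error = stdev(sample) / math.sqrt(n)  # standard deviation of the estimate
    margin = z * std_error                    # margin of uncertainty
    return {
        "estimate": estimate,
        "standard_error": std_error,
        "coefficient_of_variation": std_error / estimate,
        "relative_margin": margin / estimate,
        "interval": (estimate - margin, estimate + margin),
    }

# Hypothetical yearly wages in thousand SEK.
wages = [212, 250, 198, 305, 240, 276, 221, 263]
result = accuracy_measures(wages)
low, high = result["interval"]
print(f"{result['estimate']:.1f} (interval {low:.1f} to {high:.1f})")
```

Given the estimate, each of the measures determines the others under these assumptions, which is the sense in which they are "in essence equivalent".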

A classification of "errors" which is helpful in some contexts is as random or systematic. Random errors "fluctuate around 0", while systematic ones tend to "pull in the same direction" (have the same sign). Systematic errors may lead to biased statistics.

2.1 Overall accuracy

A user is chiefly interested in how well he/she can rely on the statistic from an overall point of view, i.e. to have a limit for the total deviation between the statistic and the target value. Sometimes, but as an exception rather than a rule, the producer can exhibit quantitative overall uncertainty intervals. Mostly, however, precise information can only be given concerning the extent and consequences of certain narrower sources of inaccuracy.

A comment on accuracy

Production of statistics can often be speeded up if standards of accuracy are lowered (by accepting large non-response, incomplete editing, etc.). Production time and accuracy are in conflict. In cases where fast statistics are judged to be important the dilemma is often handled by presenting "preliminary" statistics in a first round, and later "definite" statistics with better accuracy.

2.2 Sources of inaccuracy

At least as a first approximation, with pedagogical merits, the total deviation may be seen as a sum of partial deviations emanating from different error sources. It should be noted, however, that

- partial errors may cancel one another (by having different signs),

- assigning a particular survey weakness to a specific error source is sometimes discretionary.
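The additive view of the total deviation, and the way partial errors may cancel, can be shown with a small numerical sketch. All figures are hypothetical; in practice the partial deviations are unknown and can at best be bounded.

```python
# Hypothetical partial deviations (statistic minus target value), by error source.
partial_deviations = {
    "sampling": +1.8,
    "coverage": -0.6,
    "measurement": +0.4,
    "non-response": -1.1,
    "data processing": +0.1,
    "model assumptions": -0.2,
}

# First approximation: the total deviation is the sum of the partial ones.
total_deviation = sum(partial_deviations.values())

# Sum of magnitudes: what the total would be if no cancellation occurred.
sum_of_magnitudes = sum(abs(d) for d in partial_deviations.values())

print(f"total deviation:   {total_deviation:+.1f}")
print(f"sum of magnitudes: {sum_of_magnitudes:.1f}")
```

Here the total deviation is much smaller than the sum of magnitudes because errors of opposite sign offset each other, so a small total does not imply that each error source is small.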

2.2.1 Sampling

This quality component concerns the part of the total deviation which is due to the fact that only a sample of objects is observed. Uncertainty intervals can often be given in the form of confidence intervals.

2.2.2 Coverage

This quality component concerns the part of the total deviation which is due to the fact that the population determined by the survey frame differs from the target population.
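The difference between the frame population and the target population can be pictured with sets. The identifiers below are hypothetical, and the terms "undercoverage" and "overcoverage" are common survey terminology used here as an assumption; they do not appear in the text above.

```python
# Hypothetical object identifiers.
target_population = {"A", "B", "C", "D", "E"}  # objects the statistics should describe
frame_population = {"B", "C", "D", "E", "F"}   # objects listed in the survey frame

# Objects in the target population that the frame misses.
undercoverage = target_population - frame_population

# Objects in the frame that do not belong to the target population.
overcoverage = frame_population - target_population

print("undercoverage:", sorted(undercoverage))
print("overcoverage: ", sorted(overcoverage))
```

Both kinds of mismatch can contribute to the deviation, since statistics are computed for the frame population while the target characteristic refers to the target population.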

2.2.3 Measurement

This quality component concerns the part of the total deviation which is due to measurement difficulties.

2.2.4 Non-response

Non-response occurs when values for one or more variables are not obtained. The term should be interpreted in the wide sense "missing value". (It covers e.g. the possibility that data are missing because they, for one reason or another, could not be transferred from an external register.) If no usable data are obtained for an observation object, one speaks of object non-response. If usable values are collected for some, but not all, variables, one speaks of partial non-response. The quality component concerns the part of the total deviation which is due to non-response.
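The distinction between object non-response and partial non-response can be sketched as a simple classification of records. The variable names and records are hypothetical, and representing a missing value by None is an assumption made for the illustration.

```python
def classify_response(record, variables):
    """Classify one observation object by how many of its variables are missing.

    A value of None (or an absent key) stands for "missing value" in the
    wide sense, whatever the reason for its absence.
    """
    missing = [v for v in variables if record.get(v) is None]
    if len(missing) == len(variables):
        return "object non-response"
    if missing:
        return "partial non-response"
    return "complete response"

VARIABLES = ["wages", "disposable_income", "children"]  # hypothetical variable names

records = [
    {"wages": 250, "disposable_income": 190, "children": 2},
    {"wages": None, "disposable_income": 175, "children": 0},
    {"wages": None, "disposable_income": None, "children": None},
    {},  # nothing could be obtained, e.g. from an external register
]

statuses = [classify_response(r, VARIABLES) for r in records]
object_nonresponse_rate = statuses.count("object non-response") / len(records)
print(statuses, object_nonresponse_rate)
```

Such rates describe the extent of non-response only; the part of the total deviation that it causes depends on how non-respondents differ from respondents.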

2.2.5 Data processing

This quality component concerns the part of the total deviation which is due to imperfection in the data processing, which comprises editing (data entry, coding, scrutiny/correction) and estimation/aggregation. For a sample survey the estimation procedure is closely related to the sampling procedure. It is therefore often natural to merge the two into the combined aspect "sampling and estimation".

2.2.6 Model assumptions

This quality component concerns the part of the total deviation which is due to the fact that statistics depend on model assumptions which are not perfectly fulfilled. Choices of sampling and estimation procedures are often based on assumptions, which matter for the accuracy of the resulting statistics but not for their unbiasedness. Moreover, adjustments to parry non-response and deficiencies in coverage and measurement are usually based on model assumptions. The mentioned types of model assumptions are preferably referred to the specific error source, while the present quality component concerns model assumptions in addition to those relating to sampling, frame coverage, measurement and non-response.

Examples are: extrapolation models, models for dividing totals into sub-totals and models for estimating contributions from non-observed parts of the population.

2.3 Presentation of accuracy measures

This quality component concerns whether or not disseminated statistics are accompanied by uncertainty measures.

Comments on accuracy measures

In planning and decision contexts statistics are often, at least in a first round, used so that reasoning is carried out under the premise that the statistics provide correct values. However, many users are aware that they may come to more or less misleading conclusions by disregarding the uncertainty. Those who want to judge the consequences of uncertainty are in markedly different situations if uncertainty measures are presented or not. With accuracy measures available, users who are knowledgeable in statistics can make well-founded judgements of the possible consequences of the uncertainty, and thereby make appropriate adjustments to their preliminary conclusions. Accuracy measures are of relevance not least for estimates of change.

3 TIMELINESS of the statistics concerns the relation of statistics to the current state of affairs

Users normally want statistics that describe the "now" situation, timely statistics. The time between "now" and the reference time plays the central role. It depends on production time, publication frequency and punctuality.

Comment on timeliness

Another important aspect when judging timeliness of statistics is how fast reality changes. When judging whether statistics are sufficiently up-to-date, the user combines the "age" of the statistics with his/her own views on the pace of change in the real world phenomenon under consideration.

3.1 Frequency

Statistics from recurrent surveys are usually produced according to a regular timetable (monthly, quarterly, annually, etc.). It is then natural to speak of frequency (or periodicity).

Survey frequency is the periodicity of reference times.

Data collection frequency is the periodicity of the producer's data collection.

Dissemination frequency is the periodicity with which statistics are published.

Usually the three frequencies agree, but they may differ. Users normally care most about reference time and dissemination frequencies.

Example

The SCB Investment survey illustrates that the frequencies mentioned above may differ. This survey collects primary data and publishes statistics three times a year, while quarters are reference periods.

3.2 Production time

Production time concerns the time lag between reference time (reference time point or end of reference period) and publication time.

3.3 Punctuality

Punctuality concerns the agreement between promised and actual dissemination times.

Comments on frequency, production time and punctuality

Statistics users are commonly anxious that statistics should be available at the "right time" (e.g. by a prescribed date for decision-taking). For a one-time survey the crucial component is the production time. For recurrent surveys publication frequency and punctuality are also important. For economic statistics which affect financial markets, requirements on punctuality and simultaneity can be as strict as fractions of a second. Survey frequency matters not only for timeliness; a user with "historical" interest cares as well. The higher this frequency is, the better the details of a development can be studied.
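Production time (3.2) and punctuality (3.3) are both differences between dates, which a short sketch can make concrete. The dates and function names are hypothetical; measuring the lag in days is an assumption made for the illustration.

```python
from datetime import date

def production_time(reference_period_end: date, published: date) -> int:
    """Days from the end of the reference period to publication."""
    return (published - reference_period_end).days

def punctuality_delay(promised: date, published: date) -> int:
    """Days between the promised and the actual dissemination date.
    Positive values mean late publication, zero means punctual."""
    return (published - promised).days

# Hypothetical survey round with calendar year 1998 as reference period.
reference_period_end = date(1998, 12, 31)
promised = date(1999, 3, 15)
published = date(1999, 3, 18)

print("production time:", production_time(reference_period_end, published), "days")
print("delay:", punctuality_delay(promised, published), "days")
```

The two measures are independent of each other: statistics can be punctual with a long production time, or fast but later than promised.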

4 COMPARABILITY and COHERENCE concerns how well different statistics can be used together

A common and important usage of statistics is for making comparisons: comparisons over time (time series) and comparisons between groups. The latter often concern geographical groups, e.g. comparison of conditions in different countries. However, they may also concern groups of another nature (than geographical), e.g. comparison of disposable income for two-adult families with 0, 1, 2, 3 and 4+ children respectively.

Comparisons over time and between groups should ideally be made using statistics with the "same" target characteristic, but for different reference times and for different groups. How "similar" target characteristics are, depends primarily on their definition. Another factor may also be important. Generally formulated this concerns the fact that what one "really" measures depends on how one measures. The employed survey methodology is often crucial for the comparability of statistics. It is advantageous if the same kind of observation objects, questionnaire, data collection method, estimation procedure, etc. have been used.

What is said above is accentuated by the following. Even if biased statistics are an abomination, the negative effects of bias are mitigated if the statistics to be compared are "similarly biased". When using a common survey method, the aim is of course that the method will lead to unbiased statistics. If there is bias, however, it will be of the same kind, which facilitates comparisons.

This quality component, including also the subcomponent "coherence with other statistics", concerns similarity of the definitions of target characteristics and of survey methodology.

Comments on comparability

How meaningful the conclusions drawn from comparing statistics are depends not only on how "similar" the target values are. Of course, it also depends on the accuracy of the statistics used for comparison. If they are very inaccurate, differences may stem from casual "noise" rather than "real" differences.
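Whether an observed difference exceeds what "noise" alone could produce can be judged roughly as sketched below. The sketch assumes two independent estimates with known standard errors and a normal approximation; the figures and the function name are hypothetical.

```python
import math

def difference_with_margin(est_a, se_a, est_b, se_b, z=1.96):
    """Difference between two independent estimates and its approximate
    95 % margin of uncertainty (standard errors combined in quadrature)."""
    diff = est_a - est_b
    margin = z * math.sqrt(se_a ** 2 + se_b ** 2)
    return diff, margin

# Hypothetical average wages (thousand SEK) for two groups, with standard errors.
diff, margin = difference_with_margin(252.0, 4.0, 245.0, 3.0)

# The difference is distinguishable from noise only if it exceeds its margin.
distinguishable = abs(diff) > margin
print(f"difference {diff:.1f}, margin +/- {margin:.1f}, distinguishable: {distinguishable}")
```

In this example the observed difference lies within its margin, so it cannot be separated from sampling noise under the stated assumptions.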

4.1 Comparability over time

This quality aspect concerns to what extent the target characteristics for a recurrent survey have been stable or changed over time.

Even for statistics with the same contents and method, interesting comparisons may be troublesome to make because of seasonal and/or calendar variations. (For instance, Easter may fall in different quarters in different years, and the number of working days may vary between months.) Procedures for seasonal adjustments facilitate comparisons, in particular for short-term statistics.
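The effect of calendar variation can be illustrated with a deliberately simple working-day normalisation. This is only a sketch with hypothetical figures; actual seasonal and calendar adjustment procedures are considerably more elaborate and are not described in this text.

```python
def per_working_day(monthly_total: float, working_days: int) -> float:
    """Express a monthly total as an average per working day."""
    return monthly_total / working_days

# Hypothetical monthly production totals and numbers of working days.
march_total, march_days = 2100.0, 21
april_total, april_days = 2000.0, 20

raw_change = april_total - march_total
adjusted_change = (per_working_day(april_total, april_days)
                   - per_working_day(march_total, march_days))

print(f"raw change: {raw_change:+.1f}, change per working day: {adjusted_change:+.1f}")
```

The raw totals suggest a decline, while the per-working-day figures are identical, so the apparent change stems entirely from the calendar.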

Comments on comparability over time

For this quality aspect there are often conflicting interests. A user who is chiefly interested in the long-term development of a specific phenomenon will want the statistics with different reference times to tell about "the same thing", i.e. that target characteristics and survey method have been unchanged over time. Users whose main interest is the "present and future state of affairs" will want changes in reality to be followed by appropriate changes in the statistics, aiming at a good description of present conditions. Such changes usually impair comparability over time. Adequate comparisons over time are facilitated if the statistics are accompanied by information about earlier changes in circumstances which affect the statistics (e.g. changes in taxation rules), in survey methodology, etc.

4.2 Comparability between domains

This quality component concerns how similar definitions of target characteristics and survey methods are. When statistics to be used in comparisons come from the same survey, "full comparability" usually is obtained. When they come from different producers (different surveys, different statistical offices, etc.) there may be divergences.

Example

Statistics are often used in international co-operation. An example is the EU context, where financial dues are based on the member countries' economic situation. In such a case it is important that the "same" statistics from different countries tell about "the same thing". As a consequence, there are a multitude of EU regulations aiming at harmonised statistics.

4.3 Coherence with other statistics

Often subject matter fields are investigated by employing statistics (with different target characteristics) from different surveys. In that case the definitions of objects, variables, population and reference times which enter into the different targets should preferably be the same. The quality aspect concerns to what extent this holds.

Example

If one wants to assess the consequences of certain potential changes in the rules concerning income taxation and social benefits, it might be of interest to combine statistics from SCB surveys such as the Income survey (HINK), the Expenditure survey (HUT) and the Rent survey (BHU). Of course, it is then highly desirable that e.g. "household" and "disposable income" are defined in the same way in the three surveys.

Comment on coherence

Sometimes statistics from one survey are adjusted so as to be consistent with statistics from another survey, thereby facilitating joint use of the statistics. Consistency may be created by requiring that the adjusted statistics should satisfy relations which are known to hold for the target characteristics.

Example: Sweden's economic development is followed through "fast" (monthly) statistics as well as more careful and deep-going yearly statistics, which are produced by different surveys. In this context a natural consistency requirement is that the twelve (adjusted) monthly values should add up to the (possibly adjusted) value for the year.

5 AVAILABILITY and CLARITY concerns physical availability and intellectual clarity of statistics

5.1 Forms of dissemination

This quality component concerns distribution media and distribution channels for the dissemination of the statistics.

Comments on dissemination forms

Statistics often are just one of many data materials in investigations, research projects, etc. For a user to be able to easily carry out compilations from the statistics and/or integrate them with other material, they should preferably be available in "the most suitable" form.

Example: Statistics on diskette or from Internet may allow for simple and fast own compilation of tables, own derivations, etc.

5.2 Presentation

This quality component concerns how the statistical information is presented as regards text, charts, pictures, references, etc. It also concerns how particularly interesting aspects of new statistics are highlighted.

5.3 Documentation

This quality component concerns users' possibilities to find descriptions of the statistics themselves, and how they were produced. A technical description of the process that generated the statistics is given in production documentation, while the statistics are described in a quality declaration.

Comment on documentation

For users who intend to make their own compilations from primary data (see below) an informative production documentation is usually a necessary instrument.

5.4 Access to micro data

Here micro data stands for the primary data as they appear in a survey's final observation register. Certain users are interested in statistics which are not available, but could be derived from the micro data. There are then two chief alternatives.

- The producer makes special derivations, in accordance with the user's requests.

- The user gets access to micro data for his/her own analysis.

Comments on access to micro data

Users with well-specified problems usually prefer special derivations made by the producer. Important aspects are how fast, and at what cost, such derivations can be made. Researchers and investigators are often anxious to get access to micro data for their own analyses. Thereby they get the opportunity to analyse the material faster (often also more cheaply) and/or in a more

5.5 Information services

This quality component concerns users' possibilities

- to get answers to questions concerning specific statistics,

- to get assistance in finding their way in the "store" of official (and other) statistics.

Depending on the nature of the wish for assistance, it may be met by reference to the survey concerned, the central information desk, an appropriate research institute or to some other relevant office.

A guide to the General Data Dissemination System can be found on the Internet at: http://dsbb.imf.org/Applications/web/gdds/gddsguidelangs/

From Meddelanden i samordningsfrågor för Sveriges officiella statistik (MIS 2001:1), Reports on Statistical Coordination for the Official Statistics of Sweden, Statistics Sweden 2001
