40309461

  • 8/13/2019 40309461

    1/17

    Page 1 of 17

    Andreas Lorenz, Statistics Department, Deutsche Bundesbank

    Revisions analysis and the role of metadata*

    Contribution to the OECD/Eurostat Task Force on

    Performing Revisions Analysis for Sub-Annual Economic Statistics

    Abstract

    Revisions analysis has become a widely used tool for describing many dimensions of the quality of economic indicators such as the reliability of first estimates and the size and volatility of later revisions. An often neglected aspect in this exercise is the role that metadata plays in the calculation of and use of the results of revisions analyses. The purpose of this paper is to highlight the importance metadata has when carrying out and using revisions analysis. This is done in the form of a case study based on real-time data of the German Index of Industrial Production for the period 1999 to 2007, comprising unadjusted and seasonally adjusted vintages. Changes in the revision pattern can be traced back to changes in the collection and compilation methods and to some extraordinary single events in the period under investigation. Furthermore it can be seen that, without metadata, past revisions analysis may be misleading when making inferences about future revisions.

    Keywords: revisions analyses, real-time data, metadata, industrial production.

    JEL classification: C19, C80, C82.

    Contact: [email protected]

    * This paper presents the author's personal opinions and does not necessarily reflect the view of the Deutsche Bundesbank or its staff.

    The joint OECD/Eurostat Task Force was established to develop a set of guidelines and best practices for performing and using the results of revision analysis.

    1 Introduction

    Revisions analysis has become a widely used tool for characterising many dimensions of the quality of economic indicators such as the reliability of first estimates and the size and volatility of later revisions. The increasing availability of real-time data bases1 and of user-friendly tools2 for performing revisions analyses is fostering their widespread use. Often, the results of such analyses are used to assess the quality of official statistics and for making comparisons across countries. Even more, they are now commonly used for building expectations about future revisions and this influences current analysis and forecasts of economic developments.3

    An often neglected aspect in this exercise is the role that metadata plays in performing revisions analyses and in the use of their results. Perhaps the reason is that metadata are sometimes referred to as data that merely act as identifiers and descriptors of the data which are needed to identify, use and process data matrixes and cubes.4 Of course, such a view is much too narrow. According to internationally agreed standards, the metadata contains information about methods used in the collection and generation of data.5 In the field of economic statistics, this would include information about the methodology underlying the collection and compilation of economic indicators, information about the quality of first and consecutive releases (eg the amount of missing values in a provisional release) as well as information regarding the revisions policy.

    The purpose of this paper is to highlight the importance of metadata for performing and using revisions analyses. This is done in the form of a case study based on real-time data of the German Index of Industrial Production for the period 1999 to 2007, comprising unadjusted and seasonally adjusted vintages. For this indicator, a revisions analysis will be performed under the assumption that the user does not have any information about the data other than that contained in the figures themselves. The results will then be contrasted with the ideal case in which the user has knowledge of the most important metadata. It will be shown that the conclusion derived from the revisions analysis depends heavily on the metadata provided. The outline of the paper is as follows. Section 2 describes the origin of the real-time data set that is used for the case study. Section 3 includes a revision analysis for both of the aforementioned assumptions about the information set available to the user. Section 4 concludes.

    1 See McKenzie (2006) and website of the OECD/Eurostat Task Force (http://www.oecd.org/document/10/0,3343,en_2649_34257_39129226_1_1_1_1,00.html) for a listing of various publicly accessible real time data bases.

    2 See the tool developed by the OECD/Eurostat task force available via the following internet site: http://www.oecd.org/document/27/0,3343,en_2649_34257_40010971_1_1_1_1,00.html.

    3 An example is given in the article "Odd numbers" from The Economist from January 31st 2008, which uses the results of revisions analysis in this way as illustrated by the following excerpt: "Between 1994 and 2004 - years for which figures are no longer likely to be much updated - the average (annualised) revision to the growth rate between the advance and the latest figure was 1.3 percentage points. Recently revisions have tended to be downwards. In the past five years 60% of initial estimates were later restated at a lower rate."

    4 For example, the International Organization for Standardisation (ISO) definition of metadata is as general as "data that defines and describes other data" (ISO/IEC 11179-1, 1999(E): Information technology - Specification and standardisation of data elements - Part 1: Framework for the specification and standardisation of data elements, First edition 1999-12-01).

    5 See Data and Metadata Reporting and Presentation Handbook, OECD (2007), p. 75.

    2 Data issues

    For the purpose of the case study, it would be helpful to be able to base the analysis on both unadjusted and seasonally adjusted data. Since some major methodological changes in the compilation of production statistics took place at the beginning of 1999, it would also be helpful if the vintages start at least at that date and go up to the present. Ideally, the data should be extracted from a comprehensive real-time data base.6 While the Bundesbank has developed a real time data base covering a broad selection of economic indicators, including production statistics, it only began storing the vintages from November 2005 onwards.7 Therefore, other sources had to be used for the vintages published before that date.

    Seasonally adjusted vintages8 from January 1999 to October 2005 were taken from the Statistical Supplement 4 to the Monthly Report, a periodical which is published by the Bundesbank on a monthly basis and includes, among others, seasonally adjusted time series of production statistics. The unadjusted data used in the analysis were taken from a print publication of the Federal Statistical Office (FSO).9 The series analysed is production in the manufacturing sector. The manufacturing sector differs from the industry sector as it is defined in the Statistical Supplement of the Bundesbank, where it comprises the manufacturing sector ex energy as well as the mining and quarrying sector ex energy producing materials. Unfortunately, unadjusted vintages for industrial production according to this definition are not available for the period before November 2005. Therefore, the closest equivalent unadjusted aggregate series from the FSO publication was used.10 Although the two series differ in their composition regarding the inclusion or exclusion of energy, due to the relatively low weight of the energy sector in Germany the discrepancies are small. This difference notwithstanding, the focus of the exercise is not to decompose revisions stemming from revisions of unadjusted data and revisions of seasonal and calendar factors, a case where the difference could indeed be significant.

    Some further caveats are necessary. The fact that the bulk of the data had to be entered manually from print publications (particularly seasonally adjusted data and unadjusted data for vintages up to October 2005) makes the resulting real-time data set prone to typing errors. Although the data were double-checked, the data collection process for the present case study was not subject to the same scrutiny as the figures published in the Statistical Supplement regularly are.

    Finally, the resulting real-time data set does not have a purely symmetric triangular shape. Such a symmetric form would result if a) the publication schedule of the producer of the data has the same frequency as the periodicity of the indicator and b) the figures are released regularly for each reporting period. Neither was the case in the period of investigation. Data for January 1999 were not

    6 For some recommendations on data and metadata requirements for building a real-time database to perform revisions analysis of real-time data see McKenzie and Gamba (2008a).

    7 The real-time database of the Deutsche Bundesbank will be made publicly available via the internet (http://www.bundesbank.de/index.en.php) in the course of the year 2008.

    8 The seasonal adjustment includes a calendar adjustment procedure.

    9 Statistisches Bundesamt, Fachserie 4 Reihe 2.1, various issues.

    10 Actually, in December 2003 the FSO began publishing an equivalent aggregate series. However, in order to avoid a break in the series, the closest equivalent series is used for the whole period of investigation.

    released in March (as was the case in later years), but together with the figures for February in May of the same year. The result is that the shape of the real-time data set deviates from a symmetric triangle. Furthermore, until the end of 2005, the production statistics were released twice a month. As the Statistical Supplement and the print publication from the FSO that was used to collect the unadjusted data have a monthly frequency, this second vintage in a month is missing in the assembled real-time data set.11 However, the second release was usually only a correction of the first provisional release on the basis of the data from late respondents. Therefore, its information content is included in the vintage of the following month, which includes the revision to the provisional release of the previous reporting month, the new provisional figure of the current reporting month and, possibly, revisions of figures from other reference periods (more details regarding the typical revisions cycle are given in section 3.2). This procedure was changed in 2006, when the FSO began to publish the revision to the first estimate together with the estimate for the following month, so that there is now only one release per month by the FSO. This means that the preliminary release for the reporting month and the first revision of the preliminary release for the previous month are published at the same time in one vintage.

    3 Revisions analysis and the role of metadata

    While the applications of revisions analysis are manifold, their usefulness can be seen from two basic perspectives. From the point of view of the producer, they are primarily an instrument for quality monitoring. From the user perspective, they are helpful for building expectations about future revisions of provisional figures in order to come to a better understanding of the current economic momentum, flash estimates or forecasts. The following case study aims to investigate the role the knowledge of metadata can have when interpreting the results of revisions analysis from both perspectives.

    Some words are in order regarding what kind of metadata is relevant for the case study. In recent years, a number of different initiatives have been involved in the development of standards as to what is to be considered an element of the metadata dimension of time series. One of these initiatives, the Statistical Data and Metadata Exchange (SDMX) initiative, specifies the term in more detail and makes a distinction between structural and reference metadata. Structural metadata are metadata that act as identifiers and descriptors of the data which are needed in order to identify, use and process data matrixes and cubes. Reference metadata include a) conceptual metadata, describing the concepts used and their practical implementation, allowing users to understand what the statistics are measuring and, thus, their fitness for use; b) methodological metadata, describing methods used for the generation of the data (eg sampling, collection methods, editing processes); c) quality metadata, describing the different quality dimensions of the resulting statistics (eg timeliness, accuracy).12

    For the case study, the focus is on methodological metadata. But before summarising the most

    important metadata, the exercise will focus on the (not so unrealistic!) case in which the user only

    11 In the real-time database mentioned in footnote 7, beginning with November 2005 each vintage is stored the day it is released by the FSO.

    12 See SDMX (2006).

    has the information contained in the vintages themselves. The purpose is to show the limits of the interpretation of revisions analysis in the absence of metadata. An alternative analysis and presentation of the metadata will be given in sub-section 3.2.

    3.1 Revisions analysis without metadata

    The revisions analysis will focus on month-on-previous-month (mom) growth rates that have been

    calculated on the basis of seasonally adjusted vintages.

    The aim is to answer two questions:

    1. What does revisions analysis tell us about the extent of revisions at different time periods (beginning with the preliminary releases and their consecutive revisions)?

    2. Can this information be used to build expectations about future revisions of preliminary estimates?

    The answer to the first question gives insight into the effect of revisions of an indicator for a particular reporting period and would be of particular value for producers of official statistics in the process of quality control. For most consumers of economic indicators, the usefulness of revisions analysis would presumably depend on its suitability to answer the second question, ie to gain information for building expectations about future revisions. Of particular interest is how well the information contained in the data themselves supports the quality of these expectations.

    With both questions in mind, the next step is to select an adequate revision measure from the standard possibilities mentioned in McKenzie and Gamba (2008b). A useful measure of the size of the revisions is the Mean Absolute Revision (MAR) which avoids offsetting effects on the indicator from positive and negative revisions:

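    $$\mathrm{MAR} = \frac{1}{n}\sum_{t=1}^{n} \left| L_t - P_t \right| = \frac{1}{n}\sum_{t=1}^{n} \left| R_t \right| \tag{1}$$

    $L_t$ denotes the later estimate, $P_t$ is the preliminary (or earlier) estimate, $R_t = L_t - P_t$ is the revision and $n$ is the number of observations.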
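As a minimal sketch of equation (1), the following Python snippet computes the MAR for two short series of mom growth rates. The figures and the function name `mean_absolute_revision` are hypothetical and serve only to illustrate the arithmetic; they are not taken from the case study.

```python
from typing import Sequence


def mean_absolute_revision(later: Sequence[float], preliminary: Sequence[float]) -> float:
    """Equation (1): average of |L_t - P_t| over the n reporting periods."""
    if len(later) != len(preliminary) or len(later) == 0:
        raise ValueError("need two non-empty series of equal length")
    revisions = [l - p for l, p in zip(later, preliminary)]  # R_t = L_t - P_t
    return sum(abs(r) for r in revisions) / len(revisions)


# Hypothetical mom growth rates in percent (illustrative, not from the paper).
preliminary = [0.5, -0.2, 1.0, 0.3]
later = [0.7, -0.5, 0.9, 0.3]

mar = mean_absolute_revision(later, preliminary)
print(f"MAR = {mar:.2f} percentage points")
```

Because absolute values are taken before averaging, the upward revision in the first period and the downward revisions in the second and third periods all add to the measure rather than offsetting each other.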

    As the absolute revision can vary in proportion to the level or the mom growth rate of the indicator, it may be helpful to scale the MAR in terms of the size of the earlier estimates when doing comparisons over time. (This would also be useful for complementing international comparisons because it adjusts for the differing average size of estimates across countries.) An MAR adjusted in such a way, the relative mean absolute revision (RMAR), can also be interpreted as a measure of robustness for interpreting revisions of first published estimates as it gives the expected percentage of the first published estimate that will be revised over the revision interval being considered:

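    $$\mathrm{RMAR} = \frac{\sum_{t=1}^{n} \left| L_t - P_t \right|}{\sum_{t=1}^{n} \left| L_t \right|} = \frac{\sum_{t=1}^{n} \left| R_t \right|}{\sum_{t=1}^{n} \left| L_t \right|} \tag{2}$$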
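The scaling in equation (2) can be sketched in a few lines of Python. The denominator follows the equation as printed, ie the sum of the absolute later estimates; the data and the function name `relative_mean_absolute_revision` are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def relative_mean_absolute_revision(later: Sequence[float], preliminary: Sequence[float]) -> float:
    """Equation (2): sum of |L_t - P_t| divided by sum of |L_t|."""
    if len(later) != len(preliminary) or len(later) == 0:
        raise ValueError("need two non-empty series of equal length")
    total_abs_revision = sum(abs(l - p) for l, p in zip(later, preliminary))
    total_abs_later = sum(abs(l) for l in later)
    if total_abs_later == 0:
        raise ZeroDivisionError("RMAR is undefined when all later estimates are zero")
    return total_abs_revision / total_abs_later


# Hypothetical mom growth rates in percent (illustrative, not from the paper).
preliminary = [0.5, -0.2, 1.0, 0.3]
later = [0.7, -0.5, 0.9, 0.3]

rmar = relative_mean_absolute_revision(later, preliminary)
print(f"RMAR = {rmar:.2f}")
```

With these illustrative numbers the absolute revisions sum to 0.6 and the absolute later estimates to 2.4, so roughly a quarter of the size of the estimates is revised over the interval.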

    Figure 1 shows a yearly breakdown of the RMAR between estimates of the IIP mom rates at various revision intervals. One result is that the size of the revisions typically increases with the length of the interval being analysed. This is seen most clearly when the averages are calculated for the years 1999 to 2006. With the exception of the year 2005, the RMAR after one month is usually similar to the RMAR values calculated after 6, 12 or 24 months. A key message of figure 1 is that, on average, more than 45% of the initial mom growth rate is revised after 12 months or more.

    Figure 1: Relative mean absolute revision (RMAR) between estimates of the IIP at various revision intervals for mom growth rates. Seasonally adjusted, percentage points.

    [Chart: values by year, 1999 to 2006, and for the average 1999-2006; vertical axis from 0.00 to 0.80. Series: RMAR between first estimate and 1 month later, 6 months later, 12 months later and 24 months later.]

    *) MAR between first estimate and 24 months later not yet available for 2006 at the time of writing.

    Source: Own calculations based on seasonally adjusted vintages.

    While the RMAR gives a good impression of the overall size of the revisions regardless of their

    sign, it does not indicate whether revisions to first releases of different reporting periods have a

    tendency to cancel out over the revision interval under consideration. One measure that gives such

    an indication is the mean revision. In the case of mean revision, positive and negative deviations of

    the same amount cancel out, giving an indication of the net effect of subsequent revisions on a time

    series. The Mean Revision (MR) is calculated according to the following formula (McKenzie and

    Gamba, 2008b):

    $$\bar{R} = \frac{1}{n}\sum_{t=1}^{n} \left( L_t - P_t \right) = \frac{1}{n}\sum_{t=1}^{n} R_t \tag{3}$$

    This indicator answers the question "Is the average level of revision close to zero, or is there an indication that revisions are more in one direction than another, suggesting possible bias in the initial estimate?"
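The difference between the mean revision and the mean absolute revision can be made concrete with a small Python sketch. The series and the function name `mean_revision` are hypothetical; the point is only that signed revisions partly cancel in the mean revision, whereas they accumulate in the MAR.

```python
from typing import Sequence


def mean_revision(later: Sequence[float], preliminary: Sequence[float]) -> float:
    """Equation (3): average of the signed revisions R_t = L_t - P_t."""
    if len(later) != len(preliminary) or len(later) == 0:
        raise ValueError("need two non-empty series of equal length")
    return sum(l - p for l, p in zip(later, preliminary)) / len(later)


# Hypothetical mom growth rates in percent (illustrative, not from the paper).
preliminary = [0.5, -0.2, 1.0, 0.3]
later = [0.7, -0.5, 0.9, 0.3]

mr = mean_revision(later, preliminary)
# Mean absolute revision on the same data, for comparison.
mar = sum(abs(l - p) for l, p in zip(later, preliminary)) / len(later)
print(f"MR = {mr:+.2f} pp, MAR = {mar:.2f} pp")
```

A mean revision close to zero combined with a sizeable MAR would indicate large but offsetting revisions; a mean revision persistently on one side of zero points to possible bias in the first estimate.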

    A breakdown of the mean revisions of the IIP by year is given in figure 2. Again, the measure is

    calculated for the mom growth rate of the first preliminary release of the IIP and its value 6, 12 and

    24 months later.

    Figure 2: Mean revision between estimates of the IIP at various revision intervals for mom growth rates. Seasonally adjusted, percentage points.

    [Chart: values by year, 1999 to 2006, and for the average 1999-2006; vertical axis from -0.10 to 0.40. Series: mean revision between first release and 1 month later, 6 months later, 12 months later and 24 months later.]

    *) Mean revision between first estimate and 24 months later not yet available for 2006 at the time of writing.

    Source: Own calculations based on seasonally adjusted vintages.

    The following main aspects can be observed:

    1. All in all, the mean revisions are below 1/4 percentage point (pp), apart from 1999.

    2. The first mean revision to the mom rate, which takes place between the first estimate and the figure released one month later, is positive with the exception of the year 2000.

    3. The extent of the revision taking place after 6 months is much higher than that after 1 month. Sometimes most of the revisions are seen 12 or even 24 months later.

    4. When averaging over the whole period under observation, the positive mean revision of the mom rate after one month is 0.1 pp after rounding; the mean revision after 6, 12 and 24 months is well above 0.1 pp.13

    In summary, on the basis of the average revision of the preliminary release and the figure released

    one month later in the years 1999 to 2006, the user may expect an upward revision of the mom rate

    of 0.1 pp after rounding. (Recall that the exercise assumes that the user does not have any other

    information than that contained in the vintages). The source of the revisions can be revisions to the

    unadjusted data as well as revisions to seasonal factors.

    3.2 Revisions analysis with metadata

    What difference would knowledge of the metadata make to the user for understanding past revisions and making expectations about future revisions? To answer this question, the following section summarises the most important methodological issues and idiosyncratic factors in the period under investigation. Afterwards, changes in the metadata (such as changes in the compilation process of the production statistics) will be compared with historical changes in the revisions regime. Finally, the implications of this exercise will be looked at, especially the inclusion of these metadata in the information set of the user for forming expectations about future revisions.

    3.2.1 Methodology for collecting and compiling the production statistics

    The following section gives a summary of some important methodological issues regarding the

    compilation of production statistics.

    In the year 1999, a new survey method was introduced for the compilation of the production statistics. With the aim of lowering the statistical burden on the enterprises, the full reporting sample of the production survey was split into mutually exclusive quarterly and monthly reporting sub-samples. In the case of the monthly reporting sub-sample, in each of the Länder (the German federal states) the largest production units of the economic sectors, covering at least 75% of the sector output produced by firms with 20 or more employees, were obliged to submit a monthly production report. This ensured a national coverage in excess of 80%.14

    13 When applying a modified t-Test, the mean revision after 1 month is marginally significant at the 15% level; the mean revision after 6, 12 and 24 months is significant at the 5% level.

    14 See Bald-Herbel (2000), Herbel and Weisbrod (1999) and Jung (2003).

    While the largest firms had to report monthly, the smaller firms only had to report at quarterly intervals. In order to give a closer representation of the monthly production of all enterprises, the monthly figures were benchmarked with the results from the full quarterly sample comprising small and large firms. This was done by comparing the results from the quarterly survey with the quarterly aggregate computed on the basis of the monthly figures and calculating an alignment factor. This factor can only be calculated ex post, ie about 2 months after the end of the past quarter. For pre-adjusting the most recent monthly figures, the alignment factor of the past quarter was used as an estimate of the current alignment factor (and in the first quarter of year t the factor of the first quarter in year t-1 was used). After the information of the most recent quarterly survey becomes available, the alignment factor of the past quarter is replaced with the alignment factor calculated for the most recent data and the pre-aligned figures of the respective quarter are revised accordingly. Since this second revision of monthly figures is based on information from the quarterly survey, it is called quarterly revision.15 It affects the three months of the respective quarter.
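The mechanics of the quarterly revision can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the alignment factor is the ratio of the quarterly survey result to the sum of the three monthly figures and that both are expressed in the same units; the exact formula used in the compilation is not stated here. All numbers and the function names `alignment_factor` and `apply_factor` are hypothetical.

```python
from typing import List, Sequence


def alignment_factor(quarterly_survey_total: float, monthly_figures: Sequence[float]) -> float:
    """Assumed form: quarterly survey result relative to the aggregated monthly figures."""
    return quarterly_survey_total / sum(monthly_figures)


def apply_factor(monthly_figures: Sequence[float], factor: float) -> List[float]:
    """Scale each monthly figure of a quarter by the same alignment factor."""
    return [m * factor for m in monthly_figures]


# Hypothetical data: raw monthly results and quarterly survey totals.
q1_monthly, q1_survey = [100.0, 102.0, 98.0], 306.0
q2_monthly, q2_survey = [101.0, 99.0, 100.0], 303.0

# Step 1: the Q1 factor is known only ex post and serves as the estimate for Q2.
factor_q1 = alignment_factor(q1_survey, q1_monthly)
provisional_q2 = apply_factor(q2_monthly, factor_q1)

# Step 2: once the Q2 survey is available, the Q2 months are re-aligned.
factor_q2 = alignment_factor(q2_survey, q2_monthly)
revised_q2 = apply_factor(q2_monthly, factor_q2)

# The "quarterly revision" hits all three months of the quarter at once.
quarterly_revision = [r - p for r, p in zip(revised_q2, provisional_q2)]
print(quarterly_revision)
```

Under these assumptions the revision has the same sign in all three months of the quarter, because it stems from replacing a single factor rather than from new monthly reports.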

    After the conclusion of the quarterly report for the final quarter of the year, a third revision, the so-called annual revision of all monthly figures of the respective year, is performed. After this third revision, the IIP is considered as final. (After that, further changes can be made in the course of rebasing or back-calculation according to a new classification system16).

    The provisional monthly figures are published according to a schedule fixed in advance for the entire year. Publication was scheduled about 37 days after the end of the reporting period (t+37). Due to the fact that not all monthly reporting enterprises submitted their report in time for the provisional release (about 10% were usually missing), the first revision to the provisional figure took place after further monthly reports had been received in the same publication month (about t+57 to 62 days). This revised figure is the result of incorporation of the data of late respondents. The yearly revision took place about 2 months after the last quarter of the year. The three later releases for the same month (the corrected figure and the quarterly and yearly revisions) do not follow an exact pre-defined schedule.17

    Further insight into the revisions process is gained by taking a closer look at the imputation practice for missing values in the provisional release. With respect to the origins of the provisional figure, local reporting units of the enterprises which report monthly have to submit their output figures after the end of the reporting month to the statistical office of the respective federal state. In case the production data is not available on time, the local reporting units should provide a provisional estimate. Only in cases where the local unit does not submit any figure at all on time, does the statistical office of the respective federal state impute the missing data by using the figure reported by the same unit in the previous month.18
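The effect of this imputation rule can be illustrated with a small Python sketch. It assumes, for illustration only, that each unit's output is proportional to the number of working days in the month; the units, daily outputs, working-day counts and the function name `impute_carry_forward` are all hypothetical.

```python
from typing import List, Optional, Sequence


def impute_carry_forward(current: Sequence[Optional[float]], previous: Sequence[float]) -> List[float]:
    """Replace each missing current report (None) by the same unit's previous-month figure."""
    return [prev if cur is None else cur for cur, prev in zip(current, previous)]


daily_output = [10.0, 5.0, 2.0]          # three hypothetical reporting units
days_previous, days_current = 22, 18     # eg fewer working days due to public holidays

previous_reports = [d * days_previous for d in daily_output]
true_current = [d * days_current for d in daily_output]

# The third unit reports late, so its figure is missing in the provisional release.
received_on_time = [true_current[0], true_current[1], None]

provisional_total = sum(impute_carry_forward(received_on_time, previous_reports))
final_total = sum(true_current)
print(provisional_total, final_total, final_total - provisional_total)
```

In this constructed case the provisional total is too high and the first revision is downward; with the calendar constellation reversed, the same rule would produce an upward revision.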

    The development of monthly figures may diverge from that of quarterly figures for a number of reasons. As the reporting units for the monthly survey are selected only once per year and not updated for the following eleven months, a "dying out" sample is to be expected. The reason is that any closure of businesses, eg due to bankruptcy, immediately diminishes the size of the sample, whereas

    15 From November 1999 to October 2006, the estimated provisional adjustment of monthly industrial production has been stated in a footnote by the Bundesbank in its Statistical Supplement No. 4 to the Monthly Report, Seasonally Adjusted Business Cycle statistics.

    16 The treatment of such benchmark revisions in revisions analysis has been investigated by Knetsch and Reimers (2006).

    17 In the Bundesbank publication Statistical Supplement No. 4 to the Monthly Report, revisions of original data are flagged with an r.

    18 This imputation procedure implies that if the previous month was a month with many working hours, whereas the reporting month is a month with few working hours (eg due to public holidays), then the use of the figure of the previous month would overestimate the figure of the current month in a way that is systematically dependent on the calendar constellation.

    new enterprises will only enter into the sample during the annual update.19 However, quarterly figures are based on a survey updated regularly, which partly explains why, apart from the lower cut-off rate, they may diverge from the aggregated monthly figures.

    While the aforementioned aspects point to recurrent characteristics of the compilation process,

    there are also period-specific exceptional revisions in some months, quarters and even years.

    3.2.2 Revisions analysis of unadjusted data

    In order to uncover the revisions to unadjusted data in their pure form, the following revision analysis is carried out on the unadjusted data. This allows the revisions to be seen in a more direct way where they are not influenced by the update of seasonal factors. Furthermore, it will be performed directly on the levels, not on a transformation like the mom growth rate used in the previous section.

    It would be useful to see the absolute size of the revisions that take place a) over the whole interval between the provisional release and the revised figure released in the yearly correction, as well as b) over different revision intervals in between these two points in time (ie provisional to first correction, first correction to quarterly correction, quarterly correction to yearly correction).

    In order to disentangle the absolute size of revisions of the whole interval and those across the

    relevant revision sub-periods, a measure of Cumulative Absolute Revisions (CAR) is computed

    according to the following formula:

    $$\mathrm{CAR}_i = \sum_{j=1}^{3} \left| L_{ij} - P_{ij} \right| \tag{4}$$

    where $i$ denotes the month and $j$ the revision (first, quarterly and yearly).20

    These incremental absolute revisions in a particular month can be visualised with a stacked column chart, whose segments are the incremental absolute revisions and whose total height is their accumulation, which is $\mathrm{CAR}_i$. Such a stacked column chart gives an impression of the absolute size of revisions over the whole revisions interval from the first provisional release to the last revision and also unravels the distribution of the overall absolute revisions across the respective sub-periods.
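A minimal Python sketch of equation (4) for a single reporting month is given below. The four index levels and the function name `cumulative_absolute_revision` are hypothetical; each increment is the absolute difference between one release and the release preceding it, as described for the quarterly revision in footnote 20.

```python
from typing import List, Sequence, Tuple


def cumulative_absolute_revision(releases: Sequence[float]) -> Tuple[List[float], float]:
    """Return the incremental absolute revisions and their sum (CAR_i).

    `releases` holds the successive figures for one reporting month:
    provisional release, first revision, quarterly revision, yearly revision.
    """
    increments = [abs(later - earlier) for earlier, later in zip(releases, releases[1:])]
    return increments, sum(increments)


# Hypothetical index levels for one reporting month (illustrative only).
releases = [100.0, 100.5, 99.5, 100.0]

increments, car = cumulative_absolute_revision(releases)
net_revision = releases[-1] - releases[0]
print(increments, car, net_revision)
```

In this constructed example the final figure equals the provisional one, so the net revision is zero, while the CAR of 2.0 index points still records that the figure moved at every stage. The three increments correspond to the three segments of the stacked column for that month.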

    19 Actually, even then, new enterprises would only enter into the sample with a time lag of at least 2 to 3 years, because the sampling universe is the enterprises register, which is, in turn, updated with a time lag of 1 to 2 years using secondary sources such as administrative data from the tax authorities.

    20 For example, in this formula the incremental absolute revision of the quarterly revision (j=2) is defined as the absolute value of the figure from the quarterly correction minus the figure from the first revision (ie the figure published 20 to 25 days after the provisional release). The $\mathrm{CAR}_i$ of a particular month i is just the sum of the incremental absolute first, second and third revisions of the first provisional release of the respective reporting month.

    Figure 4: Cumulative relative absolute revisions by reporting period, 2001-2002. Unadjusted data, percent of preliminary figure.

    [Chart: monthly values, January 2001 to December 2002; vertical axis from 0.0 to 3.0. Series: preliminary release to first revision, first to quarterly revision, quarterly to yearly revision.]

    Source: Own calculations based on unadjusted vintages published by the Federal Statistical Office.

    Figure 5: Cumulative relative absolute revisions by reporting period, 2003-2004. Unadjusted data, percent of preliminary figure.

    [Chart: monthly values, January 2003 to December 2004; vertical axis from 0.0 to 2.5. Series: preliminary release to first revision, first to quarterly revision, quarterly to yearly revision.]

    Source: Own calculations based on unadjusted vintages published by the Federal Statistical Office.


    Figure 6: Cumulative relative absolute revisions by reporting period, 2005-2006

    Unadjusted data, percent of preliminary figure. Horizontal axis: reporting months January 2005 to December 2006; vertical axis: 0.0 to 2.0. Series: preliminary release to first revision; first to quarterly revision; quarterly to yearly revision.

    Source: Own calculations based on unadjusted vintages published by the Federal Statistical Office.

    Up to the end of 2004, missing values for the provisional figures were imputed using previously reported figures. This can lead to systematic over- or underestimation depending on the calendar constellation, as in the case of December 2001. With the reporting month January 2005, the method for imputing missing values was changed. The estimate is now based on the assumption that the mom rate of the non-reporting units equals the mom rate of the data received within the deadline. This contrasts with the previous method, which was based on figures from the previous month and was thus dependent on the seasonal and calendar constellation. Simulations show that the new estimation method does not induce an over- or underestimation that varies systematically with the calendar constellation. Hence, the average revisions derived from the past do not yield a valid measure of the expected revision at the current end. In other words, the assumption that the past revision regime is still valid is wrong. Therefore, for the time being, no expected correction should be applied at the current end. In particular, it should not be expected that the revision pattern within a year depends systematically on the calendar constellation.
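    The difference between the two imputation approaches can be made concrete with a small Python sketch. This is a deliberately simplified illustration and not the Federal Statistical Office's actual procedure: it ignores weighting and index construction, reads "previously reported figures" as the previous month's figure of the same unit, and uses hypothetical function names, unit labels and values.

```python
from __future__ import annotations


def impute_carry_forward(prev: dict[str, float], current: dict[str, float]) -> float:
    """Old-style sketch: non-reporting units keep their previous-month figure."""
    return sum(current.get(unit, prev[unit]) for unit in prev)


def impute_reporter_growth(prev: dict[str, float], current: dict[str, float]) -> float:
    """New-style sketch: non-reporting units are assumed to grow at the
    month-on-month rate observed for the units that reported in time."""
    reporters = [unit for unit in prev if unit in current]
    growth = sum(current[u] for u in reporters) / sum(prev[u] for u in reporters)
    return sum(current[u] if u in current else prev[u] * growth for u in prev)


# Hypothetical production figures; unit "c" has not reported in time.
previous_month = {"a": 100.0, "b": 60.0, "c": 40.0}
current_month = {"a": 125.0, "b": 75.0}

old_total = impute_carry_forward(previous_month, current_month)
new_total = impute_reporter_growth(previous_month, current_month)
```

    In this example the reporting units grow by 25 percent. Carrying the previous figure forward implicitly assumes zero growth for unit "c", so the provisional total is too low whenever the month is seasonally or calendar-wise stronger than the previous one. This is the kind of systematic dependence on the calendar constellation that the text attributes to the old method.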

    In 2006 there was no major change in the methodology. In 2007, however, the compilation of the production index changed again. The Länder now conduct a survey of local manufacturing units with 50 employees or more. There is no cut-off line at the level of the Länder. A quarterly production index is still calculated by aggregating the data reported for three months by the local manufacturing units with 50 employees or more and the production data of the other local units of enterprises with generally 20 employees or more in industry, which are obliged to report quarterly. Furthermore, the enterprise sample for the monthly survey is now updated on a monthly basis. As the whole revisions cycle for the figures of 2007 was not yet finished at the time of writing, figure 7 shows only the measures that can be calculated on the basis of the available data. The results suggest that, in particular, the revision from the first revised figure to the quarterly correction is indeed smaller in absolute size than the average of the previous years.

    Figure 7: Cumulative relative absolute revisions by reporting period, 2007

    Unadjusted data, percent. Horizontal axis: reporting months January 2007 to November 2007; vertical axis: 0.0 to 1.2. Series: preliminary release to first revision; first to quarterly revision; quarterly to yearly revision.

    Note: *) Quarterly correction for Q4 as well as preliminary to first revision for the month of December and the yearly correction of 2007 not available at the time of writing.

    Source: Own calculations based on unadjusted vintages published by the Federal Statistical Office.

    3.2.3 Revisions analysis of seasonally adjusted mom growth rates

    Along with information about the methodology underlying the collection and compilation of the production statistics, the metadata also comprises information about the timing of revisions. This allows the calculation of the mean revision for different revision intervals for the mom rates of the IIP. Recall that in section 3 the mean revision was calculated after 1, 6, 12 and 24 months. Since the revision intervals do not exactly match the time lag in months, the procedure in section 3 meant that the mean revision between the provisional release and the release 6 months later would in some cases include the yearly correction and in some cases not. Knowledge of the exact dates of these revisions makes it possible to calculate the mean revisions for exact revision intervals.
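    A minimal Python sketch of this idea follows. It assumes that the metadata allows each release of a reporting month to be labelled by its revision stage rather than by its lag in months. The stage labels, the function name and the growth rates are hypothetical and serve only as an illustration.

```python
from __future__ import annotations

from statistics import mean
from typing import Optional


def mean_revision(
    releases: dict[str, dict[str, float]],
    later_stage: str,
    first_stage: str = "preliminary",
) -> Optional[float]:
    """Mean of (later estimate minus first estimate) over all reporting
    months for which both stages are available; None if there are none."""
    diffs = [
        stages[later_stage] - stages[first_stage]
        for stages in releases.values()
        if later_stage in stages and first_stage in stages
    ]
    return mean(diffs) if diffs else None


# Hypothetical seasonally adjusted mom growth rates in percent,
# keyed by reporting month and revision stage.
releases = {
    "2007-01": {"preliminary": 0.5, "first_revision": 0.75, "quarterly": 1.0},
    "2007-02": {"preliminary": -0.25, "first_revision": 0.0, "quarterly": 0.0},
    "2007-12": {"preliminary": 0.5},  # later stages not yet published
}

mr_first = mean_revision(releases, "first_revision")
mr_quarterly = mean_revision(releases, "quarterly")
mr_yearly = mean_revision(releases, "yearly")
```

    Keying the releases by revision stage means that a comparison such as "first estimate to quarterly revision" never mixes months that already include the yearly correction with months that do not. Stages that are not yet available are skipped rather than treated as zero revisions.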

    The mean revisions between actual revision intervals are depicted in figure 8. Negative mean revisions across all intervals can be seen in the year 2000 and, for the revision from the first estimate to the quarterly revision, in the year 2003. Usually, the bulk of the revisions takes place after the first month; in the years 1999, 2002 and 2006, however, it takes place after the quarterly revision. When averaging over the years 1999 to 2006, the mean revision between the first revised figure and the quarterly correction in particular is positive and higher than that of the year 2007, when the monthly update of the reporting sample was introduced. On average, the yearly revision does not add much information.

    Figure 8: Mean revision between estimates of the IIP at various revision intervals for mom growth rates

    Seasonally adjusted, percentage points. Horizontal axis: years 1999 to 2007 and the average 1999-2006; vertical axis: -0.10 to 0.35. Series: mean revision between first release and 1 month later; mean revision between first estimate and quarterly revision; mean revision between first estimate and yearly revision.

    Note: *) Quarterly correction for Q4 as well as preliminary to first revision for the month of December not available. Yearly correction for the year 2007 not available at the time of writing.

    Source: Own calculations based on data from the Federal Statistical Office.

    3.3 Summary

    Awareness of the causes of revisions across different intervals over the revisions cycle is helpful for

    improving compilation methods. As the case study for the German index of industrial production

    shows, the official statistical institutes have clearly drawn lessons from the revisions history and

    improved methods for imputing missing values.

    The revision measures calculated on the basis of unadjusted levels for each release period and for

    different revisions intervals reveal that revisions are not constant from year to year or between

    months within a given year. Changes in the revision pattern can be traced back to changes in the

    collection and compilation methods and to single events like the interruption of the update of the

    survey of enterprises in the year 2002.

    While the exercise is helpful for interpreting the results of historical revisions analyses, it also gives

    useful information for forming expectations about future revisions. For example, up to the end of the

    year 2004 the missing values for the provisional figures were imputed by using previously reported

    figures. However, this can lead to systematic over/underestimation depending on the calendar con-


    visions. Expectations on the basis of the implicit assumption of stability of the metadata (ie the methodology underlying the collection and compilation of the statistic) may lead to false conclusions about future developments when compilation methods change.

    All in all, the results show that metadata, an often neglected dimension of performing and using

    revisions analyses, is in fact a key element for interpreting the results of revisions analyses. Ideally,

    the methodology underlying the economic indicators should be an easily accessible dimension of

    any concise real-time data set.

    References

    Bald-Herbel, C. (2000). Erste Erfahrungen mit dem neuen Konzept des Produktionsindex für das Produzierende Gewerbe. Wirtschaft und Statistik 6/2000.

    Herbel, N. and Weisbrod, J. (1999). Auswirkungen des neuen Konzepts der Produktionserhebungen auf die Berechnung der Produktionsindizes ab 1999. Wirtschaft und Statistik 4/1999.

    Jung, S. (2003). Revisionsanalyse des deutschen Produktionsindex. Wirtschaft und Statistik 9/2003.

    Knetsch, T. A. and Reimers, H.-E. (2006). How to treat benchmark revisions? The case of German production and orders statistics. Discussion Paper Series 1: Economic Studies, No 38/2006. http://www.bundesbank.de/download/volkswirtschaft/dkp/2006/200638dkp.pdf

    McKenzie, R. (2006). Performing Revisions and Real-time Analysis. Introducing the Main Economic Indicators Original Release Data and Revisions Data Base, OECD Statistics Briefs no. 12, 2006. http://www.oecd.org/dataoecd/46/48/37669085.pdf

    McKenzie, R. and Gamba, M. (2008a). Data and metadata requirements for building a real-time database to perform revisions analysis. Contribution to the OECD / Eurostat taskforce on Performing Revisions Analysis for Sub-Annual Economic Statistics. http://www.oecd.org/dataoecd/47/15/40315408.pdf

    McKenzie, R. and Gamba, M. (2008b). Interpreting the results of Revisions Analyses: Recommended Summary Statistics. Contribution to the OECD / Eurostat taskforce on Performing Revisions Analysis for Sub-Annual Economic Statistics. http://www.oecd.org/dataoecd/47/18/40315546.pdf

    OECD (2007). Data and Metadata Reporting and Presentation Handbook. http://www.oecd.org/dataoecd/46/17/37671574.pdf

    SDMX (2006). Metadata Common Vocabulary. BIS, ECB, Eurostat, IBRD, IMF, OECD, UNSD, presented on the SDMX website, available at www.sdmx.org

    The Economist (2008). Odd numbers. January 31st 2008. http://www.economist.com/finance/PrinterFriendly.cfm?story_id=10609385