

Meteorologische Zeitschrift, Vol. 17, No. 5, 663-672 (October 2008). Article © by Gebrüder Borntraeger 2008

An automated procedure to detect discontinuities; performance assessment and application to a large European climate data set

MICHAEL BEGERT 1*, EVELYN ZENKLUSEN 1, CHRISTIAN HÄBERLI 1, CHRISTOF APPENZELLER 1 and LISETTE KLOK 2

1 Federal Office of Meteorology and Climatology (MeteoSwiss), Zürich, Switzerland
2 Royal Netherlands Meteorological Institute (KNMI), De Bilt, The Netherlands

(Manuscript received November 11, 2007; in revised form February 28, 2008; accepted February 29, 2008)

Abstract

Within the framework of the EU-project ENSEMBLES, an automated procedure to detect shift inhomogeneities in climatological time series was developed and used for the homogeneity assessment of a European data set including values for temperature (min, mean, max), precipitation and air pressure covering at least 45 years. The automated process combines VERAQC (Vienna Enhanced Resolution Analysis Quality Control) output with Alexandersson's Standard Normal Homogeneity Test. VERAQC is designed to find outliers, systematic errors and meteorological biases in a time series by comparing the data with concurrent observations of neighbouring stations. The resulting deviations can be analysed as time series for a single station and used as an input for relative homogeneity testing. Shift inhomogeneities are detected in accordance with statistical significance. An iterative procedure ensures that multiple break points are detected. The performance of the automated homogenization method was tested by comparing its findings to results of a manually homogenized Swiss data set. The latter is based on a relative homogenization procedure in combination with an in-depth analysis of station history information. It is shown that the new automated procedure is able to detect shift inhomogeneities in temperature, precipitation and air pressure series, but several limitations exist. Mainly, the number of false alarms as well as missed break points is comparatively high, at least in the Swiss Alpine test region with its complex topography. The homogeneity assessment of the ENSEMBLES data set for the period 1960-2004 revealed that 12 % to 59 % of the series were homogeneous, depending on the variable. By far the highest number of shifts was detected in air pressure series, followed by temperature and precipitation.

Zusammenfassung

Within the EU project ENSEMBLES, an automated procedure for finding shift inhomogeneities in climate data series was developed, tested and applied to the ENSEMBLES data set. The data set contains measurement series of temperature (mean, minimum, maximum), precipitation and air pressure of at least 45 years in length. The homogenization method developed consists of a combination of VERAQC (Vienna Enhanced Resolution Analysis Quality Control) with the SNHT (Standard Normal Homogeneity Test) after Alexandersson. VERAQC, developed for the spatial quality control of data, compares for each time step the value of a test station with a value interpolated from surrounding neighbouring stations. The time series of the resulting deviations is tested for significant inhomogeneities with the SNHT. The procedure is applied iteratively, which ensures that several breaks per station series can be found. To determine the quality of the automated method, its results were compared with those of a manually homogenized Swiss data set. This showed that the automated homogeneity test is able to detect shift inhomogeneities in climatological time series correctly. At least in complex topography such as the Swiss Alps, however, a relatively high number of false alarms and undetected inhomogeneities occurs. The results of the homogeneity test of the ENSEMBLES data set for the period 1960-2004 show that, depending on the variable, 12 % to 59 % of the climate data series are homogeneous. The largest number of breaks was found in the air pressure series, followed by those for temperature and precipitation.

1 Introduction

Climate model verification and climate monitoring require quality controlled, high resolution data sets, ideally covering some previous decades. Such a European scale gridded daily data set is under development within the ENSEMBLES project (www.ensembles-eu.org; HEWITT and GRIGGS, 2004). In this paper the homogeneity of the European station series used as input data is assessed by applying a new automated procedure. Numerous methods are in use to evaluate the homogeneity of climatic time series. A comprehensive overview is given in PETERSON et al. (1998), and WMO guidelines are provided in AGUILAR et al. (2003). In general, homogeneity tests are applied relatively, i.e. data are tested with respect to homogeneous neighbouring stations. A combination of statistical methods and analysis of station history is considered to be the most effective approach.

*Corresponding author: Michael Begert, MeteoSchweiz, Postfach 514, CH-8044 Zürich, Switzerland, e-mail: michael.begert@meteoswiss.ch

0941-2948/2008/0314 $ 4.50. DOI 10.1127/0941-2948/2008/0314. © Gebrüder Borntraeger, Berlin, Stuttgart 2008

The homogeneity assessment of the data set developed within ENSEMBLES (referred to as ENSEMBLES data set) aimed to provide the necessary uncertainty estimates of the data that are the basis for the subsequent gridding process. The large number of series to be tested required a fully automated procedure based on statistical tests only, since metadata information was generally not available or only existed in native languages. HÄBERLI (2005) has demonstrated that a procedure combining the quality control method VERAQC (STEINACKER et al., 2000) with a conventional homogeneity test can meet the requirements. Following this idea, we combined VERAQC with the standard normal homogeneity test developed by ALEXANDERSSON (1986) and automated the procedure. The test method is based on the analysis of monthly time series and focuses on the detection of shifts in mean, since these are known to be the most common problems in climatological time series. Knowledge of existing shifts is useful for data gridding and supports the decision whether to omit a certain series completely due to inhomogeneities or to use only part of it.

This paper describes the developed automated detection procedure, its performance compared to a high quality homogeneous Swiss data set and its application to temperature, precipitation and air pressure series of the European data set collated in ENSEMBLES. After a short introduction of the data set, chapters 3 and 4 explain the details and performance of the automated homogeneity test procedure. Chapter 5 shows the results of the method applied to the ENSEMBLES European data set, and finally chapter 6 concludes and summarizes the experience gained during this study.

2 European daily data set

The daily data set of KLOK and KLEIN TANK (submitted), which updates the ECA data set of KLEIN TANK et al. (2002), will be used for the development of the ENSEMBLES gridded data sets. It contains series of about 2000 stations and 9 variables: minimum, mean and maximum temperature, precipitation, sea level air pressure, snow depth, sunshine duration, relative humidity and cloud cover. Only the first six variables will be used to create the ENSEMBLES gridded data sets. The homogeneity of these series is assessed in this paper, except for snow depth (see Chapter 3). Most of the stations in the data set contain precipitation series (90 %) or temperature series (60–67 %), whereas a smaller number holds air pressure (13 %) and snow depth data (8 %). Figure 1 shows the temporal data coverage for each climate variable. The best coverage is achieved after 1960. The strong decline in the precipitation series over the last 15 years is mainly caused by the precipitation series from the former Soviet Union, which cease in the early nineties. The spatial distribution of the stations is depicted in the plots showing the number of break points for each series (Figures 5 to 9). Data sparse areas are mainly found in Eastern Europe and Northern Africa.

All data in the data set have been automatically quality checked and flagged accordingly (KLOK et al., 2006). No corrections or adjustments were made to the time series. There are three types of data quality flags: (0) useful, (1) suspicious, i.e. the data value does not pass the quality control test, and (9) missing. The quality control tests are absolute, implying that the data are not compared with neighbouring station series. In total, the data set contains 84 % of useful data, 1 % of suspicious data and 15 % of missing data. For the homogeneity assessment only useful data values were selected.
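The selection of useful values can be illustrated with a minimal Python sketch. The flag values (0, 1, 9) are those listed above and the 292-day threshold is the one quoted in the caption of Figure 1; the function names and the sample year are hypothetical and are not taken from the ENSEMBLES processing chain.

```python
USEFUL, SUSPICIOUS, MISSING = 0, 1, 9  # quality flags as listed in the text


def useful_values(values, flags):
    """Keep only the values whose quality flag marks them as useful."""
    return [v for v, f in zip(values, flags) if f == USEFUL]


def year_is_counted(flags, min_useful_days=292):
    """A year counts when at least min_useful_days days carry useful data."""
    return sum(1 for f in flags if f == USEFUL) >= min_useful_days


# Hypothetical year: 300 useful, 5 suspicious and 60 missing days.
flags = [USEFUL] * 300 + [SUSPICIOUS] * 5 + [MISSING] * 60
print(year_is_counted(flags))  # True, since 300 >= 292
```

Suspicious values are dropped together with missing ones in this sketch, which mirrors the statement that only useful data values were selected.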

3 Method

A climatological series is relatively homogeneous with respect to a synchronous series if the differences (or ratios) of pairs of homologous averages constitute a series of random numbers that satisfies the law of errors (CONRAD and POLLAK, 1950). Following this idea, relative homogeneity testing is usually based on the comparison of the candidate series with a reference series. Inhomogeneities in the candidate series are detected in the difference or ratio series (henceforth q-series) using suitable statistical methods. Most of the variables used for the ENSEMBLES gridded data set comply with the requirements for relative homogeneity testing. However, due to poor spatial and temporal correlation, the concept does not work for snow depth. This variable was therefore excluded from the homogeneity assessment.

Figure 1: Number of stations for each year with series for precipitation (black), mean temperature (green), maximum temperature (red), minimum temperature (blue), air pressure (orange) and snow depth (grey). A year was counted when at least 292 days contain useful data (flag = 0, see Section 2).

Relative homogeneity test procedures can in general be divided into two main steps: 1) the creation of the reference (q-) series, 2) the application of the statistical detection tests. As a fully automated method to build a q-series, the Vienna Enhanced Resolution Analysis Quality Control (VERAQC) procedure was used. VERAQC was originally designed for quality control of meteorological measurements and runs on an operational basis at MeteoSwiss. VERAQC is based on an objective spatial interpolation algorithm. For each candidate station, five surrounding stations are selected in such a way that their locations form an ideal pentagon around the central candidate station. A surface with minimized curvature is fitted through the values of the five neighbouring stations using a thin plate spline algorithm. Finally, the difference between the measured value of the central station and the corresponding value on the fitted surface is calculated. An analysis of these differences in time (for a single station) and space (for one time step) is used for quality control purposes. A detailed description of the method is given in STEINACKER et al. (2000). The results of its application to MAP (Mesoscale Alpine Programme) data are given in HÄBERLI et al. (2004) and an extension to test the relative homogeneity of radiosonde time series for the Alpine region in HÄBERLI (2005).
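The interpolation step can be sketched in Python as follows. This is a plain thin plate spline through five neighbours, written with the standard library only; it is not the VERAQC implementation, which is documented in STEINACKER et al. (2000). The station coordinates, the field values and all function names are hypothetical.

```python
import math


def _solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def _phi(r):
    """Thin plate spline kernel r^2 log r, with phi(0) = 0."""
    return 0.0 if r == 0.0 else r * r * math.log(r)


def tps_value(points, values, target):
    """Evaluate the thin plate spline through (points, values) at target."""
    n = len(points)
    a = [[0.0] * (n + 3) for _ in range(n + 3)]
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            a[i][j] = _phi(math.hypot(xi - xj, yi - yj))
        a[i][n:] = [1.0, xi, yi]                      # affine part P
        a[n][i], a[n + 1][i], a[n + 2][i] = 1.0, xi, yi  # side conditions P^T
    coef = _solve(a, list(values) + [0.0, 0.0, 0.0])
    w, (c0, cx, cy) = coef[:n], coef[n:]
    tx, ty = target
    bend = sum(wi * _phi(math.hypot(tx - xi, ty - yi))
               for wi, (xi, yi) in zip(w, points))
    return c0 + cx * tx + cy * ty + bend


def deviation(candidate_value, points, values, target=(0.0, 0.0)):
    """Measured value at the candidate minus the value on the fitted surface."""
    return candidate_value - tps_value(points, values, target)


# Hypothetical setup: neighbours on a regular pentagon around the candidate
# at the origin, sampling the planar field 2 + 3x - y.
neighbours = [(math.cos(2 * math.pi * k / 5), math.sin(2 * math.pi * k / 5))
              for k in range(5)]
field = [2.0 + 3.0 * x - y for x, y in neighbours]
print(round(deviation(2.5, neighbours, field), 6))  # approximately 0.5
```

Because a thin plate spline reproduces a planar field without bending, the fitted value at the centre is 2.0 here, and a candidate reading of 2.5 gives a deviation of about 0.5.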

In the context of homogenization, the time series of the differences calculated by VERAQC (henceforth deviation series) can be regarded as a q-series as long as the inter-station distance is less than the decorrelation distance of the variable in question (SCHEIFINGER et al., 2003; AUER et al., 2005).

The ENSEMBLES data set does fulfil this requirement within the examined period 1960–2004. Variations due to non-climatic factors appear as shifts or trends in the deviation series and can be detected by statistical test methods. In order to enhance the power of the statistical tests, the variability of the deviation series has been reduced by running VERAQC on monthly values. In addition, anomalies calculated from a common period were used as input values to avoid influences of a varying station network. Otherwise the appearance or disappearance of a neighbouring series can lead to abrupt changes in the deviation series similar to real breaks. Note that inhomogeneities in reference series may result in false alarms (HANSSEN-BAUER and FØRLAND, 1994; MENNE and WILLIAMS, 2005). Although the setup of VERAQC minimizes the influence of inhomogeneities in single series, this drawback has to be accounted for when analyzing the homogeneity results of the method.
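One way to form anomalies relative to a common period is sketched below in Python. The text does not state which reference period or which anomaly definition was used, so the calendar-month means, the reference years and the function name are assumptions made for illustration only.

```python
def monthly_anomalies(series, ref_start, ref_end):
    """Subtract calendar-month means of the reference years from each value.

    series maps (year, month) -> monthly value; the reference period
    ref_start..ref_end (inclusive) is an assumption of this sketch.
    """
    sums, counts = {}, {}
    for (year, month), value in series.items():
        if ref_start <= year <= ref_end:
            sums[month] = sums.get(month, 0.0) + value
            counts[month] = counts.get(month, 0) + 1
    means = {m: sums[m] / counts[m] for m in sums}
    return {key: value - means[key[1]]
            for key, value in series.items() if key[1] in means}


# Hypothetical January means of one station.
january = {(1961, 1): 1.0, (1962, 1): 3.0, (1963, 1): 5.0}
print(monthly_anomalies(january, 1961, 1962))
```

With the reference mean taken over 1961 and 1962 (2.0), the three values become -1.0, 1.0 and 3.0. Stations with different absolute levels then enter the interpolation on a comparable scale, so that a neighbour appearing or disappearing changes the fitted surface less abruptly.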

Alexandersson's standard normal homogeneity test (abbr. SNHT; ALEXANDERSSON, 1986) has been used in the second step of the homogeneity test procedure. The SNHT method is designed to find shifts in mean and returns the date of one possible break point. The test is applied iteratively to each input series in order to search for multiple break points. Significant test results at the 95 % confidence level serve to divide the deviation series into segments. Each segment is then investigated separately, and the iteration is repeated until each segment is determined as homogeneous or falls below a minimum length. Figure 2 gives an example of a deviation series and the corresponding homogeneity test results for maximum temperature at the station Basel.
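The iterative use of the test can be sketched in Python. The statistic is the single-shift SNHT, T(k) = k z̄1² + (n − k) z̄2², evaluated on the standardized series; the way the minimum length is enforced (segments shorter than `min_len` are simply not tested) and the use of one caller-supplied critical value are assumptions of this sketch. Tabulated critical values depend on the series length, and the threshold of 9.0 in the example is a placeholder, not a value taken from the paper.

```python
import statistics


def snht(q):
    """Return (T0, k): the maximum SNHT statistic and the number of values
    before the most likely shift. Requires a series with non-zero variance."""
    n = len(q)
    mean, sd = statistics.fmean(q), statistics.stdev(q)
    z = [(x - mean) / sd for x in q]
    total, head = sum(z), 0.0
    best_t, best_k = 0.0, 0
    for k in range(1, n):
        head += z[k - 1]
        z1, z2 = head / k, (total - head) / (n - k)
        t = k * z1 * z1 + (n - k) * z2 * z2
        if t > best_t:
            best_t, best_k = t, k
    return best_t, best_k


def find_breaks(q, critical, min_len):
    """Split q at significant shifts until every segment is homogeneous
    or shorter than min_len; return the sorted break indices."""
    breaks, stack = [], [(0, len(q))]
    while stack:
        lo, hi = stack.pop()
        segment = q[lo:hi]
        if len(segment) < max(min_len, 2) or statistics.stdev(segment) == 0.0:
            continue
        t0, k = snht(segment)
        if t0 > critical:
            breaks.append(lo + k)
            stack += [(lo, lo + k), (lo + k, hi)]
    return sorted(breaks)


# Hypothetical monthly deviation series: 60 values of small noise,
# followed by 60 values shifted by +2.
noise = [0.1, -0.1, 0.05, -0.05, 0.0] * 12
q = noise + [x + 2.0 for x in noise]
print(find_breaks(q, critical=9.0, min_len=12))  # [60]
```

The first pass finds the shift after the 60th value; both resulting segments then stay well below the threshold and are kept as homogeneous.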

For temperature and air pressure, the deviation series can also be used to examine the dimension of a detected shift in mean by a simple comparison of the deviations in the homogeneous segments before and after the break point. The mean difference indicates the order of magnitude of the shift. Note that the method does not account for possible non-zero slopes in the segments, which can lead to over- or underestimation of the magnitude (PIELKE et al., 2007). However, magnitudes are only used to separate small from large shifts in the performance assessment of VERHOM (see chapter 4). The significance of the difference is tested using Student's t-test. In general, significant results of the homogeneity tests will also turn out to be significant using the t-test. However, the mathematics of the homogeneity tests is in principle strictly applicable only to a single break in the input series. Therefore the test results may be disturbed by multiple break points and have to be considered as indicators only.
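A minimal Python sketch of the shift estimate and of the t statistic follows. It uses the classical pooled-variance form of Student's two-sample test; whether the authors used exactly this variant is not stated, and the sample deviations and the function name are hypothetical. The statistic still has to be compared with a tabulated critical value for the given degrees of freedom.

```python
import math
import statistics


def shift_and_t(before, after):
    """Return (shift, t, degrees of freedom) for two homogeneous segments.

    shift is the mean after the break minus the mean before it; t is the
    pooled-variance two-sample statistic.
    """
    n1, n2 = len(before), len(after)
    shift = statistics.fmean(after) - statistics.fmean(before)
    pooled = ((n1 - 1) * statistics.variance(before)
              + (n2 - 1) * statistics.variance(after)) / (n1 + n2 - 2)
    t = shift / math.sqrt(pooled * (1.0 / n1 + 1.0 / n2))
    return shift, t, n1 + n2 - 2


# Hypothetical deviations (in °C) on either side of a detected break point.
before = [0.0, 0.2, -0.2, 0.1, -0.1]
after = [0.6, 0.8, 0.4, 0.7, 0.5]
shift, t, dof = shift_and_t(before, after)
print(round(shift, 3), round(t, 3), dof)  # 0.6 6.0 8
```

In the performance assessment a shift of this size would count as large, since it reaches the 0.5 °C boundary used in Table 1.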

For precipitation, ratio series instead of difference series are commonly used in the homogenization process. Tests showed that the performance of VERHOM decreased when using ratios. As a consequence, the difference series was also used for precipitation. This test procedure, developed in the framework of ENSEMBLES, is referred to as VERHOM (VERAQC Relative Homogenization Method) henceforth.

4 Performance assessment and limitations of the relative homogeneity test procedure (VERHOM)

The performance assessment was based on a comparison with high quality homogeneous Swiss series for the period 1961 to 1997. The latter have been homogenized using THOMAS, a tool specially developed for homogenization of climate series (BEGERT et al., 2003, 2005). Because THOMAS needs manual input, it was not suitable for the homogeneity testing of a large data set like the one in ENSEMBLES. Note that the THOMAS procedure includes the full station histories and that the network density for THOMAS was 4 to 20 times higher compared to the data set of ENSEMBLES. A denser network leads to smaller variations in the q-series, and homogeneity tests are able to detect smaller (and therefore usually more) inhomogeneities. Similarly, a higher network density prevents poorly correlated neighbouring stations from serving as reference series. The inclusion of such series increases the false alarm rate as they might belong to another climate region. In short, VERHOM cannot be expected to reach the performance of THOMAS, especially in a mountainous area like Switzerland. However, the comparison gives an idea of the performance of VERHOM in the context of the homogeneity assessment of the large ENSEMBLES data set.

Figure 2: Deviation series (top) and homogeneity test results (bottom) for the maximum temperature series of Basel, Switzerland, between 1960 and 2004. Small circles indicate the test values and the dashed line represents the 95 % significance level. Red lines mark the significant inhomogeneities.
[Figure 2 panels: "deviation series station BASEL BINNINGEN" (monthly deviation) and "Alexandersson's standard normal homogeneity test 2 (conf. level = 0.95)" (Alex. test value); time axis 1960 to 2004.]

Figure 3: Test results SNHT for the maximum temperature series of Luzern, Switzerland, between 1961 and 2000 using a minimum segment length of 1 year (middle) and 5 years (bottom). Findings of THOMAS are indicated in the deviation series (top) with arrows. Small circles show the test values and the dashed line represents the 95 % significance level. Red lines stress significant inhomogeneities.
[Figure 3 panels: "deviation series station LUZERN" (monthly deviation), "Alexandersson SNHT window size 1 year" and "Alexandersson SNHT window size 5 years" (Alex. test value); time axis 1960 to 1999.]

Figure 4: Sum of THOMAS inhomogeneities in the Swiss station series detected by VERHOM applied on monthly values for the period 1960–1997.

Table 1: Left: number of large (≥ 0.5 °C) and small (< 0.5 °C) shifts detected by VERHOM and THOMAS in Swiss temperature series (mean, minimum, maximum). Right: number of large (≥ 0.5 °C) and small (< 0.5 °C) false alarms and missed break points of VERHOM in Swiss temperature series (mean, minimum, maximum). In total, VERHOM has detected 171 and THOMAS 259 break points for the period 1961–1997.

Hits (rows: break points THOMAS; columns: VERHOM)

| THOMAS | VERHOM large | VERHOM small |
|--------|--------------|--------------|
| large  | 28           | 16           |
| small  | 9            | 19           |

VERHOM

|       | false alarms | missed |
|-------|--------------|--------|
| large | 19           | 49     |
| small | 80           | 138    |
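The totals quoted in the caption of Table 1 can be checked against its cells with a few lines of Python. The numbers are copied from Table 1; the assignment of rows to THOMAS and columns to VERHOM in the hit matrix follows the reading of the flattened table and does not affect the totals. The variable names are hypothetical.

```python
# Table 1, Swiss temperature series, 1961-1997.
hits = {("large", "large"): 28, ("large", "small"): 16,
        ("small", "large"): 9, ("small", "small"): 19}   # (THOMAS, VERHOM)
false_alarms = {"large": 19, "small": 80}
missed = {"large": 49, "small": 138}

n_hits = sum(hits.values())                        # 72
verhom_total = n_hits + sum(false_alarms.values())  # 171
thomas_total = n_hits + sum(missed.values())        # 259

print(verhom_total, thomas_total)
print(round(100 * sum(false_alarms.values()) / verhom_total))  # 58
print(round(100 * n_hits / thomas_total))                      # 28
```

Hits plus false alarms give the 171 break points attributed to VERHOM, and hits plus missed break points give the 259 attributed to THOMAS. For the temperature series, about 58 % of the VERHOM detections are therefore false alarms, and about 28 % of the THOMAS break points are found.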

In a first step, the THOMAS results were used to find the best setup for VERHOM, with the THOMAS results regarded as the truth. Hit rates and false alarm rates of both methods were compared in order to determine the most suitable minimum segment length and the adequate significance level for VERHOM. The choice of the minimum segment length turned out to be a compromise between a preferably high hit rate and an acceptable false alarm rate. The false alarm rate in particular was sensitive to the segment length. The optimal length was found to be 5 years, independent of the variable. This result is in accordance with the recommendation for the SNHT (ALEXANDERSSON, 1986). As a consequence, inhomogeneities closer than 5 years apart cannot be detected. Figure 3 shows an example of the SNHT test results using 1 and 5 year minimum segment lengths.

In a second step, the results calculated automatically with the optimal setup of VERHOM were compared to the findings of THOMAS. 33 mean temperature, 33 minimum temperature, 32 maximum temperature, 32 precipitation and 18 air pressure series were finally used for the comparison. Figure 4 shows the number of break points per series detected by VERHOM compared to the break points detected by THOMAS. In addition, the number of false alarms (significant break points of VERHOM, no THOMAS indication) and missed shift inhomogeneities (significant break points of THOMAS, no VERHOM indication) are shown. Break points detected within one year by both methods are referred to as hits. Overall, the performance of VERHOM in detecting the THOMAS breaks is rather poor. About half of the shifts detected by VERHOM must be considered as false alarms (a result only weakly depending on the variable examined). The rate of missed shifts is also high, especially for air pressure. As mentioned above, it must be kept in mind that this comparison is very strict and has to be interpreted with care. Missed break points can be expected due to the implemented minimum segment length and the substantially sparser network density. The surprisingly high number of missed breaks in air pressure series, for instance, can be explained by the fact that Swiss air pressure series contain a high number of inhomogeneities in the 1980s because of technical problems. VERHOM is not able to detect breaks that follow each other within less than 5 years, while there is no such restriction in THOMAS because inhomogeneities are detected according to test results and station history analysis. The high number of missed breaks in temperature series, on the other hand, is due to small breaks that could not be detected by VERHOM. The sparser network density compared to THOMAS causes small shifts to disappear into statistical noise. Concerning the false alarms, there is more than one reason for the rather high rate. The sparser network density, causing series from different climatological regions to be included in the reference series, as well as inhomogeneities in series of neighbouring stations might be responsible. In addition, SNHT tends to break up possible short trends in the deviation series into many small steps (REEVES et al., 2007). Although we introduced a minimum segment length to be tested, this tendency might still contribute to the high false alarm rate.
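The classification into hits, false alarms and missed break points can be sketched in Python. The one-year tolerance is the criterion given above; the greedy one-to-one matching and the sample dates are assumptions of this sketch, since the text does not say how competing matches were resolved.

```python
def compare(verhom, thomas, tolerance=1.0):
    """Match break dates (decimal years) one-to-one within tolerance.

    Returns (hits, false_alarms, missed): matched pairs, VERHOM dates without
    a THOMAS counterpart, and THOMAS dates without a VERHOM counterpart.
    """
    unmatched = sorted(thomas)
    hits, false_alarms = [], []
    for date in sorted(verhom):
        near = [t for t in unmatched if abs(t - date) <= tolerance]
        if near:
            best = min(near, key=lambda t: abs(t - date))
            unmatched.remove(best)
            hits.append((date, best))
        else:
            false_alarms.append(date)
    return hits, false_alarms, unmatched


# Hypothetical break dates for one series.
verhom_dates = [1965.5, 1980.0, 1990.25]
thomas_dates = [1966.0, 1972.0, 1991.5]
print(compare(verhom_dates, thomas_dates))
```

Here only the 1965.5 detection lies within one year of a THOMAS break, so the series contributes one hit, two false alarms (1980.0 and 1990.25) and two missed break points (1972.0 and 1991.5).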

Besides a comparison between "shift dates", a comparison of "shift dimensions" gives an idea of the performance of the current setup of VERHOM. Table 1 compares the number of small (< 0.5 °C) and large (≥ 0.5 °C) shifts detected by VERHOM and THOMAS in Swiss temperature series (mean, minimum and maximum series). In general, most of the large shifts are characterized as large by VERHOM, whereas small shifts are only partly identified as small ones. False alarms and missed breaks in particular are generally small, although a substantial number of large shifts is included.

Finally, note that the comparison of the two methods was carried out using series from stations located in the mountainous region of the Alps. As spatial correlation is one of the most important factors for successful relative homogeneity testing, a better performance can be expected for less mountainous regions, i.e. most other parts of Europe. However, a relatively high number of false alarms and missed break points due to poorly correlated neighbouring stations or inhomogeneities in reference series must still be expected. A comparison of the VERHOM results with findings for series in other countries would be a valuable complement to the current performance assessment.

Overall, the new automated method VERHOM is able to detect shift inhomogeneities and estimate their dimensions. However, some limitations exist:

i. Inhomogeneities closer than 5 years apart cannot be detected, which leads to missed break points.

ii. Stations located at the boundary of the investigated region cannot be tested, as the algorithm of VERAQC fails if the number of surrounding stations is insufficient. 10 % to 20 % of the series in the ENSEMBLES data base are affected, depending on the variable.

iii. As a consequence of limitation ii, variations in the station network can cause incomplete deviation series close to the boundary if a neighbouring station appears or disappears in time. This can cause missed break points.

Table 2: Number of series with zero, one, two, three and four or more break points and number of stations with no result from VERHOM.

|                        | mean temp. | minimum temp. | maximum temp. | precipitation | air pressure |
|------------------------|------------|---------------|---------------|---------------|--------------|
| homogeneous            | 245 (20 %) | 168 (12 %)    | 231 (17 %)    | 1071 (59 %)   | 36 (12 %)    |
| 1 break point          | 316 (26 %) | 323 (23 %)    | 300 (22 %)    | 406 (23 %)    | 45 (15 %)    |
| 2 break points         | 238 (19 %) | 269 (20 %)    | 264 (19 %)    | 94 (5 %)      | 65 (21 %)    |
| 3 break points         | 104 (8 %)  | 147 (11 %)    | 128 (9 %)     | 21 (1 %)      | 47 (15 %)    |
| 4 or more break points | 31 (3 %)   | 47 (3 %)      | 30 (2 %)      | 0 (0 %)       | 30 (10 %)    |
| undefined              | 299 (24 %) | 423 (31 %)    | 420 (31 %)    | 209 (12 %)    | 83 (27 %)    |

Figure 5: Precipitation 1960–2004. Number of break points detected.
[Figure 5 legend: 0, 1, 2, 3, >4, undefined.]

Table 3: Length of mean homogeneous sub-period per parameter.

| parameter           | mean homogeneous sub-period [y] |
|---------------------|---------------------------------|
| mean temperature    | 17.7                            |
| minimum temperature | 16.0                            |
| maximum temperature | 17.1                            |
| precipitation       | 29.0                            |
| air pressure        | 15.5                            |

Figure 6: As Figure 5 but for air pressure.

5 Homogeneity assessment of the ENSEMBLES data set with VERHOM for the period 1960–2004

The homogeneity of the ENSEMBLES data set was tested using the procedure VERHOM described above. Precipitation, air pressure, mean temperature, maximum temperature and minimum temperature series were considered.

The results of the homogeneity assessment based on VERHOM are shown in Figures 5 to 9. Different colours indicate the number of break points per series. Stations excluded from the analysis due to incompleteness or due to the edge problem are plotted in black. The exact numbers of detected break points and the mean lengths of a homogeneous sub-period per parameter are given in Tables 2 and 3.
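The percentages in Table 2 can be reproduced from the counts with a short Python check. The counts are copied from Table 2, in the row order homogeneous, 1, 2, 3, 4 or more break points, undefined; the dictionary layout and variable names are hypothetical.

```python
# Counts per variable from Table 2:
# [homogeneous, 1, 2, 3, 4 or more break points, undefined]
table2 = {
    "mean temp": [245, 316, 238, 104, 31, 299],
    "minimum temp": [168, 323, 269, 147, 47, 423],
    "maximum temp": [231, 300, 264, 128, 30, 420],
    "precipitation": [1071, 406, 94, 21, 0, 209],
    "air pressure": [36, 45, 65, 47, 30, 83],
}

# Share of homogeneous series relative to all series of the variable.
shares = {name: round(100 * counts[0] / sum(counts))
          for name, counts in table2.items()}
print(shares)
```

The shares come out as 20, 12, 17, 59 and 12 %, which matches the first row of Table 2 and the range of 12 % to 59 % quoted in the abstract. The percentages therefore refer to all series of a variable, including those for which VERHOM returned no result.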

Overall, it is found that air pressure series contain the highest number of break points (Figure 6, Table 2). This result agrees with findings from other studies (e.g. BEGERT et al., 2003) and has several reasons. First, air pressure is highly correlated in space, resulting in a good performance of relative homogeneity test methods. Smaller shifts can be detected compared to other parameters. Second, small changes in the measuring conditions, such as a changing station height, can lead to substantial inhomogeneities. Long homogeneous air pressure series are generally rare.

Figure 7: As Figure 5 but for mean temperature.

Figure 8: As Figure 5 but for maximum temperature.

For temperature series, mean homogeneous sub-periods of around 20 years are present (Table 3). Minimum temperature series contain the highest number of shifts, followed by mean and maximum temperature (Table 2). In addition, the number of break points in the different temperature series of a station often differs (Figures 7–9). As seen in Swiss series, the higher number of inhomogeneities in minimum temperature series might be a result of the fact that minimum temperature measurements are more sensitive to disturbances in the measuring conditions (e.g. relocations) than the other temperature variables.

Figure 9: As Figure 5 but for minimum temperature.

Precipitation series contain fewer inhomogeneities than the other parameters (Figures 5 to 9, Table 2). Precipitation also shows the longest mean homogeneous sub-periods (Table 3). Again, this result reflects a combination of two facts: precipitation series are less frequently subject to changes in the measuring conditions, and relative homogeneity tests have a limited efficiency due to the higher spatial and temporal variability of the variable.

Figures 10 and 11 show the frequency distribution of the shift dimensions in air pressure and temperature series of the ENSEMBLES data set. Most of the shifts in air pressure series (around 80 %) lie between –1 and 1 hPa. Single shifts of up to –10 hPa and 5 hPa, respectively, occurred. For temperature, the shifts vary between –3 and 3 °C. Large shifts rarely occurred; most of the temperature shifts lie between –1 and 1 °C. A broader distribution can be observed for minimum temperature than for maximum and mean temperature. The shift dimensions as well as the differences found between the distributions agree with findings for the Swiss temperature series in the period 1961 to 1997 (BEGERT et al., 2003).

6 Summary and conclusions

With VERHOM, an automated relative homogenization procedure has been developed, tested and used for the homogeneity assessment of a large European data set collated in the framework of the EU project ENSEMBLES. VERHOM combines VERAQC (Vienna Enhanced Resolution Analysis Quality Control) output with Alexandersson's standard normal homogeneity test. VERAQC uses a modified thin-plate spline interpolation to calculate a reference value for a station to be tested. The resulting differences between measurements and calculated values can be analysed as time series for a single station and used as input for relative homogeneity tests. VERHOM is designed to detect shifts in mean in climatological time series.

Figure 10: Left: frequency distribution of shift dimensions for air pressure in classes of 1 hPa. Right: frequency distribution of shift dimensions for mean temperature in classes of 0.5 °C.

Figure 11: Left: frequency distribution of shift dimensions for maximum temperature in classes of 0.5 °C. Right: frequency distribution of shift dimensions for minimum temperature in classes of 0.5 °C.

To assess the performance of the automated procedure, the findings were compared to results of a carefully homogenized data set in the Swiss Alpine region. The latter is based on the relative homogenization procedure THOMAS, combining statistical methods with an in-depth analysis of the station history information. Overall, the proposed new method VERHOM is able to detect shift inhomogeneities in temperature, precipitation and air pressure series, although the number of false alarms as well as missed break points is comparatively high. Depending on the variable, 45 to 60 % of the indicated shifts were false alarms and only 40 to 55 % of the inhomogeneities found by THOMAS were detected. Since VERHOM was tested in a region with complex topography, a better performance can be expected for most parts of Europe outside the Alpine region. The high false alarm rate for small shifts might also be due to inhomogeneities in the reference series built from surrounding stations. Addressing this problem would require additional strategies (e.g. GONZALEZ-ROUCO et al., 2001 or MENNE and WILLIAMS, 2005). About 10–20 % of all series could not be tested because VERHOM is not able to process stations along the edge of the region covered by the observations.

The assessment of the data homogeneity of the ENSEMBLES data set for the period 1960–2004 revealed that 59 % of the precipitation, 20 % of the mean temperature, 17 % of the maximum temperature and 12 % of the minimum temperature and air pressure series were homogeneous. By far the highest number of shift inhomogeneities was detected in air pressure series, followed by temperature and precipitation. The mean homogeneous sub-periods vary between 15.5 (air pressure) and 29 (precipitation) years.

List of acronyms

ENSEMBLES: Research project supported by the European Commission under the 6th Framework Programme 2002–2006, Priority Global Change and Ecosystems. The focus is on ensemble-based predictions of climate changes and their impacts. www.ensembles-eu.org

SNHT: Standard Normal Homogeneity Test

THOMAS: Tool for Homogenization of Monthly Data Series

VERAQC: Vienna Enhanced Resolution Analysis Quality Control

VERHOM: Homogenization procedure which combines VERAQC with a relative homogeneity test

Acknowledgments

This study was supported by the ENSEMBLES project (EU FP 6 contract GOCE-CT-2003-505539).

References

AGUILAR, E., I. AUER, T.C. PETERSON, J. WIERINGA, 2003: Guidelines on climate metadata and homogenization. – WCDMP-No. 53, WMO-TD No. 1186, World Meteorological Organization, Geneva.

ALEXANDERSSON, H., 1986: A homogeneity test applied to precipitation data. – J. Climatol. 6, 661–675.

AUER, I., R. BOEHM, A. JURKOVIC, A. ORLIK, R. POTZMANN, W. SCHÖNER, M. UNGERSBÖCK, M. BRUNETTI, T. NANNI, M. MAUGERI, K. BRIFFA, P. JONES, D. EFTHYMIADIS, O. MESTRE, J.M. MOISSELIN, M. BEGERT, R. BRAZDIL, O. BOCHNICEK, T. CEGNAR, M. GAJIĆ-ČAPKA, K. ZANINOVIĆ, S. MAJSTOROVIĆ, S. SZALAI, T. SZENTIMREY, L. MERCALLI, 2005: A new instrumental precipitation dataset for the greater alpine region for the period 1800–2002. – Int. J. Climatol. 25, 139–166.

BEGERT, M., G. SEIZ, T. SCHLEGEL, M. MUSA, G. BAUDRAZ, M. MOESCH, 2003: Homogenisierung von Klimareihen der Schweiz und Bestimmung der Normwerte 1961–1990. – Schlussbericht des Projekts NORM90, Veröffentlichung der MeteoSchweiz 67, MeteoSchweiz, Zürich.

BEGERT, M., T. SCHLEGEL, W. KIRCHHOFER, 2005: Homogeneous temperature and precipitation series of Switzerland from 1864 to 2000. – Int. J. Climatol. 25, 65–80.

CONRAD, V., W. POLLAK, 1950: Methods in climatology. Second edition. – Harvard University Press, Cambridge, MA.

GONZALEZ-ROUCO, J.F., J.L. JIMENEZ, V. QUESADA, F. VALERO, 2001: Quality control and homogeneity of precipitation data in the southwest of Europe. – J. Climate 14, 964–978.

HÄBERLI, C., 2005: The Comprehensive Alpine Radiosonde Data set (CALRAS): Contribution to the diagnostic based on upper-air soundings 1957–1999. – Wien: Facultas, 2006. – VIII, 297 S. Wiener Meteorologische Schriften 4. Zugl.: Wien, Univ., Dissertation.

HÄBERLI, C., I. GROEHN, R. STEINACKER, W. PÖTTSCHACHER, M. DORNINGER, 2004: Performance of the surface observation network during MAP. – Meteorol. Z. 13, 109–121.

HANSSEN-BAUER, I., E.J. FØRLAND, 1994: Homogenizing long Norwegian precipitation series. – J. Climate 7, 1001–1013.

HEWITT, C.D., D.J. GRIGGS, 2004: Ensembles-Based Predictions of Climate Changes and their Impacts. – Eos Trans. AGU 85(52), 566, doi:10.1029/2004EO520005.

KLEIN TANK, A.M.G., J.B. WIJNGAARD, G.P. KÖNNEN, R. BÖHM, G. DEMAREE, A. GOCHEVA, M. MILETA, S. PASHIARDIS, L. HEJKRLIK, C. KERN-HANSEN, R. HEINO, P. BESSEMOULIN, G. MÜLLER-WESTERMEIER, M. TZANAKOU, S. SZALAI, T. PALSDOTTIR, D. FITZGERALD, S. RUBIN, M. CAPALDO, M. MAUGERI, A. LEITASS, A. BUKANTIS, R. ABERFELD, A.F.V. VAN ENGELEN, E. FØRLAND, M. MIETUS, F. COELHO, C. MARES, V. RAZUVAEV, E. NIEPLOVA, T. CEGNAR, J. ANTONIO LOPEZ, B. DAHLSTRÖM, A. MOBERG, W. KIRCHHOFER, A. CEYLAN, O. PACHALIUK, L.V. ALEXANDER, P. PETROVIC, 2002: Daily dataset of 20th-century surface air temperature and precipitation series for the European Climate Assessment. – Int. J. Climatol. 22, 1441–1453.

KLOK, E.J., A.M.G. KLEIN TANK, submitted: Updated and extended European dataset of daily climate observations. – Int. J. Climatol.

KLOK, E.J., A.M.G. KLEIN TANK, M. BEGERT, E. ZENKLUSEN, 2006: Assessment of the available station density for the gridding and daily data quality/homogeneity. – ENSEMBLES project report D 5.8.

MENNE, M.J., C.N. WILLIAMS, 2005: Detection of undocumented changepoints using multiple test statistics and composite reference series. – J. Climate 18, 4271–4286.

PETERSON, T.C., D.R. EASTERLING, T.R. KARL, P. GROISMAN, I. AUER, R. BÖHM, N. PLUMMER, N. NICHOLLS, S. TOROK, L. VINCENT, H. TUOMENVIRTA, J. SALINGER, E.J. FØRLAND, I. HANSSEN-BAUER, H. ALEXANDERSSON, P. JONES, D. PARKER, 1998: Homogeneity adjustments of in situ climate data: a review. – Int. J. Climatol. 18, 1493–1517.

PIELKE, R.A., C. DAVEY, J. ANGEL, O. BLISS, N. DOESKEN, M. CAI, S. FALL, D. NIYOGI, K. GALLO, R. HALE, K. HUBBARD, X. LIN, H. LI, J. NIELSEN-GAMMON, S. RAMAN, 2007: Documentation of bias associated with surface temperature measurement sites for climate change assessment. – Bull. Amer. Meteor. Soc. 88, 913–928.

REEVES, J., J. CHEN, X.L. WANG, R. LUND, Q.Q. LU, 2007: A review and comparison of changepoint detection techniques for climate data. – J. Appl. Meteor. Climatol. 46, 900–914.

SCHEIFINGER, H., R. BÖHM, I. AUER, 2003: Räumliche Dekorrelation von Klimazeitreihen unterschiedlicher zeitlicher Auflösung und ihre Bedeutung für ihre Homogenisierbarkeit und die Repräsentativität von Ergebnissen. – 6. Deutsche Klimatagung, Klimavariabilität 2003, 22–25 September 2003, Potsdam, Terra Nostra, Schriftenreihe der Alfred-Wegener-Stiftung 2003/6, 375–379.

STEINACKER, R., C. HÄBERLI, W. PÖTTSCHACHER, 2000: A transparent method for the analysis and quality evaluation of irregularly distributed and noisy observational data. – Mon. Wea. Rev. 128, 2303–2316.


spect to homogeneous neighbouring stations. A combination of statistical methods and analysis of station history is considered to be the most effective approach.

The homogeneity assessment of the data set developed within ENSEMBLES (referred to as the ENSEMBLES data set) aimed to provide the necessary uncertainty estimates of the data that form the basis for the subsequent gridding process. The large number of series to be tested required a fully automated procedure based on statistical tests only, since metadata information was generally not available or only existed in native languages. HÄBERLI (2005) has demonstrated that a procedure combining the quality control method VERAQC (STEINACKER et al., 2000) with a conventional homogeneity test can meet the requirements. Following this idea, we combined VERAQC with the standard normal homogeneity test developed by ALEXANDERSSON (1986) and automated the procedure. The test method is based on the analysis of monthly time series and focuses on the detection of shifts in mean, since these are known to be the most common problems in climatological time series. Knowledge of existing shifts is useful for data gridding and supports the decision whether to omit a certain series completely due to inhomogeneities or to use only part of it.

This paper describes the developed automated detection procedure, its performance compared to a high quality homogeneous Swiss data set and its application to temperature, precipitation and air pressure series of the European data set collated in ENSEMBLES. After a short introduction of the data set, chapters 3 and 4 explain the details and performance of the automated homogeneity test procedure. Chapter 5 shows the results of the method applied to the ENSEMBLES European data set and, finally, chapter 6 concludes and summarizes the experiences made during this study.

2 European daily data set

The daily data set of KLOK and KLEIN TANK (submitted), which updates the ECA data set of KLEIN TANK et al. (2002), will be used for the development of the ENSEMBLES gridded data sets. It contains series from about 2000 stations and 9 variables: minimum, mean and maximum temperature, precipitation, sea level air pressure, snow depth, sunshine duration, relative humidity and cloud cover. Only the first six variables will be used to create the ENSEMBLES gridded data sets. The homogeneity of these series is assessed in this paper, except for snow depth (see Chapter 3). Most of the stations in the data set contain precipitation series (90 %) or temperature series (60–67 %), whereas a smaller number holds air pressure (13 %) and snow depth data (8 %). Figure 1 shows the temporal data coverage for each climate variable. The best coverage is achieved after 1960. The strong decline in the precipitation series over the last 15 years is mainly caused by the precipitation series from the former Soviet Union, which cease in the early nineties. The spatial distribution of the stations is depicted in the plots showing the number of break points for each series (Figures 5 to 9). Data-sparse areas are mainly found in Eastern Europe and Northern Africa.

All data in the data set have been automatically quality checked and flagged accordingly (KLOK et al., 2006). No corrections or adjustments were made to the time series. There are three types of data quality flags: (0) useful, (1) suspicious, i.e. the data value does not pass the quality control test, and (9) missing. The quality control tests are absolute, implying that the data are not compared with neighbouring station series. In total, the data set contains 84 % useful data, 1 % suspicious data and 15 % missing data. For the homogeneity assessment only useful data values were selected.
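The selection of useful values, together with the completeness rule quoted in the caption of Figure 1 (a year counts when at least 292 days contain useful data), can be sketched as follows. The record layout (tuples of date, value and flag) and the function names are assumptions made for illustration; only the flag codes 0, 1 and 9 and the 292-day threshold are taken from the text.

```python
from collections import Counter
from datetime import date

# Flag codes as listed in the text: 0 useful, 1 suspicious, 9 missing.
USEFUL, SUSPICIOUS, MISSING = 0, 1, 9


def useful_values(records):
    """Keep only the (day, value) pairs whose quality flag is 0 (useful).

    `records` is an assumed layout: an iterable of (date, value, flag).
    """
    return [(day, value) for day, value, flag in records if flag == USEFUL]


def complete_years(records, min_days=292):
    """Return the sorted years that hold at least `min_days` useful values."""
    counts = Counter(day.year for day, _ in useful_values(records))
    return sorted(year for year, n in counts.items() if n >= min_days)
```

A series would then be passed through `useful_values` before any deviation series is built, so that suspicious and missing values never enter the homogeneity test.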

3 Method

A climatological series is relatively homogeneous with respect to a synchronous series if the differences (or ratios) of pairs of homologous averages constitute a series of random numbers that satisfies the law of errors (CONRAD and POLLAK, 1950). Following this idea, relative homogeneity testing is usually based on the comparison of the candidate series with a reference series. Inhomogeneities in the candidate series are detected in the difference or ratio series (henceforth q-series) using suitable statistical methods. Most of the variables used for the ENSEMBLES gridded data set comply with the requirements for relative homogeneity testing. However, due to poor spatial and temporal correlation, the concept does not work for snow depth. This variable was therefore excluded from the homogeneity assessment.
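As a minimal illustration of the q-series concept, the sketch below forms the difference or ratio series from a candidate and a reference series given as equally long lists of values for the same time steps. The function name and the list-based layout are assumptions for this example only.

```python
def q_series(candidate, reference, mode="difference"):
    """Return the q-series of a candidate against a reference series.

    mode="difference" gives candidate - reference, mode="ratio" gives
    candidate / reference (reference values must then be non-zero).
    """
    if len(candidate) != len(reference):
        raise ValueError("series must cover the same time steps")
    if mode == "difference":
        return [c - r for c, r in zip(candidate, reference)]
    if mode == "ratio":
        return [c / r for c, r in zip(candidate, reference)]
    raise ValueError("mode must be 'difference' or 'ratio'")
```

A shift in the candidate series that is absent from the reference shows up as a step in the returned list, while the climate signal common to both series cancels out.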

Figure 1: Number of stations for each year with series for precipitation (black), mean temperature (green), maximum temperature (red), minimum temperature (blue), air pressure (orange) and snow depth (grey). A year was counted when at least 292 days contain useful data (flag = 0, see Section 2).

Relative homogeneity test procedures can in general be divided into two main steps: 1) the creation of the reference (q-) series and 2) the application of the statistical detection tests. As a fully automated method to build a q-series, the Vienna Enhanced Resolution Analysis Quality Control (VERAQC) procedure was used. VERAQC was originally designed for quality control of meteorological measurements and runs on an operational basis at MeteoSwiss. VERAQC is based on an objective spatial interpolation algorithm. For each candidate station, five surrounding stations are selected in such a way that their locations form an ideal pentagon around the central candidate station. A surface with minimized curvature is fitted through the values of the five neighbouring stations using a thin plate spline algorithm. Finally, the difference between the measured value of the central station and the corresponding value on the fitted surface is calculated. An analysis of these differences in time (for a single station) and space (for one time step) is used for quality control purposes. A detailed description of the method is given in STEINACKER et al. (2000). The results of its application to MAP (Mesoscale Alpine Programme) data are given in HÄBERLI et al. (2004) and an extension to test the relative homogeneity of radiosonde time series for the Alpine region in HÄBERLI (2005).
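The interpolation step can be illustrated with a plain two-dimensional thin plate spline fitted through the values of five neighbours. This is only a sketch under stated assumptions: it uses the standard thin plate spline with kernel r² log r, not the modified scheme implemented in VERAQC, and the function names, the coordinate layout and the small linear solver are hypothetical helpers introduced for this example.

```python
import math


def _solve(a, b):
    """Solve the linear system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def _kernel(r):
    """Thin plate spline radial basis function U(r) = r^2 log r, with U(0) = 0."""
    return 0.0 if r == 0.0 else r * r * math.log(r)


def tps_value(points, values, target):
    """Value at `target` of the thin plate spline through (points, values)."""
    n = len(points)
    size = n + 3
    a = [[0.0] * size for _ in range(size)]
    b = [0.0] * size
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            a[i][j] = _kernel(math.hypot(xi - xj, yi - yj))
        a[i][n], a[i][n + 1], a[i][n + 2] = 1.0, xi, yi   # affine part
        a[n][i], a[n + 1][i], a[n + 2][i] = 1.0, xi, yi   # side conditions
        b[i] = values[i]
    sol = _solve(a, b)
    w, (a0, a1, a2) = sol[:n], sol[n:]
    tx, ty = target
    radial = sum(wi * _kernel(math.hypot(tx - x, ty - y))
                 for wi, (x, y) in zip(w, points))
    return a0 + a1 * tx + a2 * ty + radial


def deviation(observed, points, values, target):
    """Measured value at the candidate station minus the fitted surface value."""
    return observed - tps_value(points, values, target)
```

Evaluating `deviation` for every time step of a candidate station yields a deviation series of the kind described above. The neighbours must not be collinear, otherwise the linear system is singular.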

In the context of homogenization, the time series of the differences calculated by VERAQC (henceforth deviation series) can be regarded as a q-series as long as the inter-station distance is less than the decorrelation distance of the variable in question (SCHEIFINGER et al., 2003; AUER et al., 2005). The ENSEMBLES data set does fulfil this requirement within the examined period 1960–2004. Variations due to non-climatic factors appear as shifts or trends in the deviation series and can be detected by statistical test methods. In order to enhance the power of the statistical tests, the variability of the deviation series has been reduced by running VERAQC on monthly values. In addition, anomalies calculated from a common period were used as input values to avoid influences of a varying station network. Otherwise, the appearance or disappearance of a neighbouring series can lead to abrupt changes in the deviation series similar to real breaks. Note that inhomogeneities in reference series may result in false alarms (HANSSEN-BAUER and FØRLAND, 1994; MENNE and WILLIAMS, 2005). Although the setup of VERAQC minimizes the influence of inhomogeneities in single series, this drawback has to be accounted for when analyzing the homogeneity results of the method.
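The use of anomalies relative to a common period can be sketched as below. The dictionary layout keyed by (year, month), the function name and the choice of base period are assumptions for this example; the text does not state which common period was used.

```python
from collections import defaultdict


def monthly_anomalies(series, base_start, base_end):
    """Subtract the calendar-month mean of a common base period.

    `series` maps (year, month) to a monthly value. The climatology of each
    calendar month is the mean over the years base_start..base_end
    (inclusive). Months without any base-period value are dropped.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for (year, month), value in series.items():
        if base_start <= year <= base_end:
            sums[month] += value
            counts[month] += 1
    clim = {month: sums[month] / counts[month] for month in sums}
    return {key: value - clim[key[1]]
            for key, value in series.items() if key[1] in clim}
```

Because every station is expressed relative to its own mean over the same period, a neighbour that enters or leaves the network no longer changes the level of the interpolated reference value.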

Alexandersson's standard normal homogeneity test (abbr. SNHT; ALEXANDERSSON, 1986) has been used in the second step of the homogeneity test procedure. The SNHT method is designed to find shifts in mean and returns the date of one possible break point. The test is applied iteratively to each input series in order to search for multiple break points. Test results that are significant at the 95 % confidence level serve to divide the deviation series into segments. Each segment is then investigated separately and the iteration is repeated until each segment is determined as homogeneous or falls below a minimum length. Figure 2 gives an example of a deviation series and the corresponding homogeneity test results for maximum temperature at the station Basel.
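The iterative scheme can be sketched as a recursive segmentation around the single-shift SNHT statistic T(k) = k·z̄₁² + (n−k)·z̄₂², where z̄₁ and z̄₂ are the means of the standardized series before and after position k. The sketch rests on several assumptions: the critical value `crit` has to be supplied by the caller (for instance from published SNHT tables for the series length at the 95 % level), the function names are hypothetical, and the minimum segment length is enforced by only allowing splits that leave at least `min_len` values on each side, which is one possible reading of the stopping rule described above.

```python
from statistics import mean, stdev


def snht(q, min_len=1):
    """Return (T0, k): the maximum SNHT statistic and its split index.

    k is the index of the first value after the most likely break. Only
    splits with at least `min_len` values on each side are considered.
    Returns (0.0, None) when the series is too short or constant.
    """
    n = len(q)
    if min_len < 1 or n < 2 * min_len:
        return 0.0, None
    m, s = mean(q), stdev(q)
    if s == 0.0:
        return 0.0, None
    z = [(x - m) / s for x in q]          # standardized series
    total = sum(z)
    best_t, best_k, csum = 0.0, None, 0.0
    for k in range(1, n):
        csum += z[k - 1]
        if k < min_len or n - k < min_len:
            continue
        z1 = csum / k                      # mean before the candidate break
        z2 = (total - csum) / (n - k)      # mean after the candidate break
        t = k * z1 * z1 + (n - k) * z2 * z2
        if t > best_t:
            best_t, best_k = t, k
    return best_t, best_k


def find_breaks(q, crit, min_len, offset=0):
    """Recursively split q at significant SNHT maxima; return break indices."""
    t, k = snht(q, min_len)
    if k is None or t <= crit:
        return []                          # homogeneous or too short
    left = find_breaks(q[:k], crit, min_len, offset)
    right = find_breaks(q[k:], crit, min_len, offset + k)
    return left + [offset + k] + right
```

For monthly deviation series, a minimum segment length of five years would correspond to `min_len=60`. Note that the critical value depends on the length of the segment being tested, so a fixed `crit` is a simplification of this sketch.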

For temperature and air pressure, the deviation series can also be used to examine the dimension of a detected shift in mean by a simple comparison of the deviations in the homogeneous segments before and after the break point. The mean difference indicates the order of magnitude of the shift. Note that the method does not account for possible non-zero slopes in the segments, which can lead to over- or underestimation of the magnitude (PIELKE et al., 2007). However, magnitudes are only used to separate small from large shifts in the performance assessment of VERHOM (see Chapter 4). The significance of the difference is tested using Student's t-test. In general, significant results of the homogeneity tests will also turn out to be significant using the t-test. However, the mathematics of the homogeneity tests is in principle strictly applicable only to a single break in the input series. Therefore, the test results may be disturbed by multiple break points and have to be considered as indicators only.
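A sketch of the shift estimate is given below. It takes the deviations of the homogeneous segments before and after a break point and returns the difference of the segment means together with the pooled-variance two-sample t statistic. The function name and the sign convention (after minus before) are assumptions; converting the statistic into a significance level additionally requires the Student t distribution, which is left out here.

```python
from math import sqrt
from statistics import mean, variance


def shift_estimate(before, after):
    """Return (shift, t, dof) for two homogeneous segments of deviations.

    shift is mean(after) - mean(before); t is the pooled-variance
    two-sample t statistic with dof = n1 + n2 - 2 degrees of freedom.
    Both segments need at least two values and non-zero pooled variance.
    """
    n1, n2 = len(before), len(after)
    shift = mean(after) - mean(before)
    dof = n1 + n2 - 2
    pooled = ((n1 - 1) * variance(before) + (n2 - 1) * variance(after)) / dof
    t = shift / sqrt(pooled * (1.0 / n1 + 1.0 / n2))
    return shift, t, dof
```

As the text notes, a trend within either segment biases the two segment means, so the returned shift should be read as an order of magnitude rather than an exact adjustment.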

For precipitation, ratio series instead of difference series are commonly used in the homogenization process. Tests showed that the performance of VERHOM decreased when using ratios. As a consequence, the difference series was also used for precipitation. This test procedure, developed in the framework of ENSEMBLES, is referred to as VERHOM (VERAQC Relative Homogenization Method) henceforth.

4 Performance assessment and limitations of the relative homogeneity test procedure (VERHOM)

The performance assessment was based on a comparison with high quality homogeneous Swiss series for the period 1961 to 1997. The latter have been homogenized using THOMAS, a tool specially developed for homogenization of climate series (BEGERT et al., 2003, 2005). Because THOMAS needs manual input, it was not suitable for the homogeneity testing of a large data set like the one in ENSEMBLES. Note that the THOMAS procedure includes the full station histories and that the network density for THOMAS was 4 to 20 times higher compared to the data set of ENSEMBLES. A denser network leads to smaller variations in the q-series, and homogeneity tests are able to detect smaller (and therefore usually more) inhomogeneities. Similarly, a higher network density prevents poorly correlated neighbouring stations from serving as reference series. The inclusion of such series increases the false alarm rate, as they might belong to another climate region. In short, VERHOM cannot be expected to reach the performance of THOMAS, especially in a mountainous area like Switzerland. However, the comparison gives an idea of the performance of VERHOM in the context of the homogeneity assessment of the large ENSEMBLES data set.

Figure 2: Deviation series (top) and homogeneity test results (bottom) for the maximum temperature series of Basel, Switzerland, between 1960 and 2004. Small circles indicate the test values and the dashed line represents the 95 % significance level. Red lines mark the significant inhomogeneities. [Panels: deviation series, station BASEL BINNINGEN (monthly deviation); Alexandersson's standard normal homogeneity test, conf. level = 0.95 (test value).]

Figure 3: Test results of the SNHT for the maximum temperature series of Luzern, Switzerland, between 1961 and 2000 using a minimum segment length of 1 year (middle) and 5 years (bottom). Findings of THOMAS are indicated in the deviation series (top) with arrows. Small circles show the test values and the dashed line represents the 95 % significance level. Red lines stress significant inhomogeneities. [Panels: deviation series, station LUZERN (monthly deviation); Alexandersson SNHT, window size 1 year (test value); Alexandersson SNHT, window size 5 years (test value).]

Table 1: Left: number of large (≥ 0.5 °C) and small (< 0.5 °C) shifts detected by VERHOM and THOMAS in Swiss temperature series (mean, minimum, maximum). Right: number of large (≥ 0.5 °C) and small (< 0.5 °C) false alarms and missed break points of VERHOM in Swiss temperature series (mean, minimum, maximum). In total, VERHOM has detected 171 and THOMAS 259 break points for the period 1961–1997.

hits              VERHOM large    VERHOM small
THOMAS large           28              16
THOMAS small            9              19

VERHOM            false alarms    missed
large                  19              49
small                  80             138

Figure 4: Sum of THOMAS inhomogeneities in the Swiss station series detected by VERHOM applied on monthly values for the period 1960–1997.

In a first step, the THOMAS results were used to find the best setup for VERHOM, with the THOMAS results regarded as the truth. Hit rates and false alarm rates of both methods were compared in order to determine the most suitable minimum segment length and the adequate significance level for VERHOM. The choice of the minimum segment length turned out to be a compromise between a preferably high hit rate and an acceptable false alarm rate. The false alarm rate in particular was sensitive to the segment length. The optimal length was found to be 5 years, independent of the variable. This result is in accordance with the recommendation for the SNHT (ALEXANDERSSON, 1986). As a consequence, inhomogeneities less than 5 years apart cannot be detected. Figure 3 shows an example of the SNHT test results using minimum segment lengths of 1 and 5 years.

In a second step, the results calculated automatically with the optimal setup of VERHOM were compared to the findings of THOMAS. In the end, 33 mean temperature, 33 minimum temperature, 32 maximum temperature, 32 precipitation and 18 air pressure series were used for the comparison. Figure 4 shows the number of break points per series detected by VERHOM compared to the break points detected by THOMAS. In addition, the number of false alarms (significant break points of VERHOM, no THOMAS indication) and missed shift inhomogeneities (significant break points of THOMAS, no VERHOM indication) are shown. Break points detected within one year by both methods are referred to as hits. Overall, the performance of VERHOM in detecting the THOMAS breaks is rather poor. About half of the shifts detected by VERHOM must be considered as false alarms (a result only weakly depending on the variable examined). The rate of missed shifts is also high, especially for air pressure. As mentioned above, it must be kept in mind that this comparison is very strict and has to be interpreted with care. Missed break points can be expected due to the implemented minimum segment length and the substantially sparser network density. The surprisingly high number of missed breaks in air pressure series, for instance, can be explained by the fact that Swiss air pressure series contain a high number of inhomogeneities in the 1980s because of technical problems. VERHOM is not able to detect breaks that follow each other within less than 5 years, while there is no such restriction in THOMAS because inhomogeneities are detected according to test results and station history analysis. The high number of missed breaks in temperature series, on the other hand, is due to small breaks that could not be detected by VERHOM. The sparser network density compared to THOMAS causes small shifts to disappear into statistical noise. Concerning the false alarms, there is more than one reason for the rather high rate. The sparser network density, which causes series from different climatological regions to be included in the reference series, as well as inhomogeneities in series of neighbouring stations might be responsible. In addition, SNHT tends to break up possible short trends in the deviation series into many small steps (REEVES et al., 2007). Although we introduced a minimum segment length to be tested, this tendency might still contribute to the high false alarm rate.
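The classification into hits, false alarms and missed break points can be sketched as a matching of break dates within a tolerance of one year. Dates are expressed here as month indices, so the tolerance is 12. The greedy nearest-neighbour matching, in which each reference break can be claimed by only one detected break, and the function name are assumptions of this sketch. The text does not describe how multiple candidates within one year were resolved.

```python
def compare_breaks(detected, truth, tolerance=12):
    """Match detected break dates to reference break dates.

    Dates are month indices. Returns (hits, false_alarms, missed), where
    hits is a list of (detected, reference) pairs that lie within
    `tolerance` months of each other, false_alarms are detected breaks
    without a reference partner and missed are unmatched reference breaks.
    """
    unmatched = sorted(truth)
    hits, false_alarms = [], []
    for d in sorted(detected):
        close = [t for t in unmatched if abs(t - d) <= tolerance]
        if close:
            best = min(close, key=lambda t: abs(t - d))
            unmatched.remove(best)      # a reference break is used only once
            hits.append((d, best))
        else:
            false_alarms.append(d)
    return hits, false_alarms, unmatched
```

With the VERHOM break points as `detected` and the THOMAS break points as `truth`, the lengths of the three returned lists give the counts from which hit and false alarm rates follow.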

Besides a comparison of "shift dates", a comparison of "shift dimensions" gives an idea of the performance of the current setup of VERHOM. Table 1 compares the number of small (< 0.5 °C) and large (≥ 0.5 °C) shifts detected by VERHOM and THOMAS in Swiss temperature series (mean, minimum and maximum series). In general, most of the large shifts are characterized as large by VERHOM, whereas small shifts are only partly identified as small ones. False alarms and missed breaks in particular are generally small, although a substantial number of large shifts is included.

Finally, note that the comparison of the two methods was carried out using series from stations located in the mountainous region of the Alps. As spatial correlation is one of the most important factors for successful relative homogeneity testing, a better performance can be expected for less mountainous regions, i.e. most other parts of Europe. However, a relatively high number of false alarms and missed break points due to poorly correlated neighbouring stations or inhomogeneities in reference series must still be expected. A comparison of the VERHOM results with findings for series in other countries would be a valuable addition to the current performance assessment.

Overall, the new automated method VERHOM is able to detect shift inhomogeneities and estimate their dimensions. However, some limitations exist:

i. Inhomogeneities closer than 5 years apart cannot be detected, which leads to missed break points.

ii. Stations located at the boundary of the investigated region cannot be tested, as the algorithm of VERAQC fails if the number of surrounding stations is insufficient. 10 to 20 % of the series in the ENSEMBLES data base are affected, depending on the variable.

iii. As a consequence of limitation ii, variations in the station network can cause incomplete deviation series close to the boundary if a neighbouring station appears or disappears in time. This can cause missed break points.

Table 2: Number of series with zero, one, two, three and four or more break points and number of stations with no result from VERHOM.

                          mean temp.    minimum temp.   maximum temp.   precipitation   air pressure
homogeneous               245 (20 %)    168 (12 %)      231 (17 %)      1071 (59 %)     36 (12 %)
1 break point             316 (26 %)    323 (23 %)      300 (22 %)      406 (23 %)      45 (15 %)
2 break points            238 (19 %)    269 (20 %)      264 (19 %)      94 (5 %)        65 (21 %)
3 break points            104 (8 %)     147 (11 %)      128 (9 %)       21 (1 %)        47 (15 %)
4 or more break points    31 (3 %)      47 (3 %)        30 (2 %)        0 (0 %)         30 (10 %)
undefined                 299 (24 %)    423 (31 %)      420 (31 %)      209 (12 %)      83 (27 %)

Figure 5: Precipitation 1960–2004. Number of break points detected. [Legend classes: 0, 1, 2, 3, >4, undefined.]

Table 3: Length of mean homogeneous sub-period per parameter.

parameter              mean homogeneous sub-period [y]
mean temperature       17.7
minimum temperature    16.0
maximum temperature    17.1
precipitation          29.0
air pressure           15.5

Figure 6 As Figure 5 but for air pressure

5 Homogeneity assessment of theENSEMBLES data set withVERHOM for the period 1960ndash2004

The homogeneity of the ENSEMBLES data set wastested using the procedure VERHOM described abovePrecipitation air pressure mean temperature maximumtemperature and minimum temperature series were con-sidered

The results of the homogeneity assessment based onVERHOM are shown in Figures 5 to 9 Different coloursindicate the number of break points per series Stationsexcluded from the analysis due to incompleteness or dueto the edge problem are plotted in black The exact num-bers of detected break points and the mean lengths of ahomogeneous sub-period per parameter are given in Ta-bles 2 and 3

Overall it is found that air pressure series containthe highest number of break points (Figure 6 Table 2)This result agrees with findings from other studies (egBEGERT et al 2003) and has several reasons Firstair pressure is highly correlated in space resulting in a

Meteorol Z 17 2008 M Begert et al An automated procedure to detect discontinuities 669

Figure 7 As Figure 5 but for mean temperature

Figure 8 As Figure 5 but for maximum temperature

good performance of relative homogeneity test methodsSmaller shifts can be detected compared to other para-meters Second small changes in the measuring condi-tions such as a changing station height can lead to sub-stantial inhomogeneities Long homogeneous air pres-sure series are generally rare

For temperature series mean homogeneous sub-periods of around 20 years are present (Table 3) Min-imum temperature series contain the highest number ofshifts followed by mean and maximum temperature (Ta-ble 2) In addition the number of break points in the dif-ferent temperature series of a station often differs (Fig-ures 7ndash9) As seen in Swiss series the higher number ofinhomogeneities in minimum temperature series mightbe a result of the fact that minimum temperature mea-

Figure 9 As Figure 5 but for minimum temperature

surements are more sensitive to disturbances in the mea-suring conditions (eg relocations) than the other tem-perature variables

Precipitation series contain less inhomogeneities thanthe other parameters (Figure 5 to 9 Table 2) Precipi-tation also shows the longest mean homogeneous sub-periods (Table 3) Again this result is a combinationof the fact that precipitation series are less frequentlysubject to changes in the measuring conditions and thelimited efficiency of relative homogeneity tests due toa higher spatial and temporal variability of the variableFigures 10 and 11 show the frequency distribution of theshift dimensions in air pressure and temperature seriesof the ENSEMBLES data set Most of the shifts in airpressure series (around 80 ) lie between ndash1 and 1 hPaSingle shifts of up to ndash10 hPa and 5 hPa respectively oc-curred For temperature the shifts vary between ndash3 and3C Large shifts rarely occurred but most of the tem-perature shifts lie between ndash1 and 1C A broader distri-bution can be observed for minimum temperature thanfor maximum and mean temperature Shift dimensionsas well as found differences in distributions agree withfindings for the Swiss temperature series in the period1961 to 1997 (BEGERT et al 2003)

6 Summary and conclusions

With VERHOM, an automated relative homogenization procedure has been developed, tested and used for the homogeneity assessment of a large European data set collated in the framework of the EU-project ENSEMBLES. VERHOM combines VERAQC (Vienna Enhanced Resolution Analysis Quality Control) output with Alexandersson's standard normal homogeneity test. VERAQC uses a modified thin-plate spline interpolation to calculate a reference value for a station to be tested. The resulting differences between measurements and calculated values can be analysed as time series for a single station and used as input for relative homogeneity tests. VERHOM is designed to detect shifts in mean in climatological time series.

Figure 10: Left: frequency distribution of shift dimensions for air pressure in classes of 1 hPa. Right: frequency distribution of shift dimensions for mean temperature in classes of 0.5 °C.

Figure 11: Left: frequency distribution of shift dimensions for maximum temperature in classes of 0.5 °C. Right: frequency distribution of shift dimensions for minimum temperature in classes of 0.5 °C.

To assess the performance of the automated procedure, the findings were compared to results of a carefully homogenized data set in the Swiss Alpine region. The latter is based on the relative homogenization procedure THOMAS, combining statistical methods with an in-depth analysis of the station history information. Overall, the proposed new method VERHOM is able to detect shift inhomogeneities in temperature, precipitation and air pressure series, although the number of false alarms as well as missed break points is comparatively high. Depending on the variable, 45 to 60 % of the indicated shifts were false alarms and only 40 to 55 % of the inhomogeneities found by THOMAS were detected. Since VERHOM was tested in a region with complex topography, a better performance can be expected for most parts of Europe outside the Alpine region. The high false alarm rate for small shifts might also be due to inhomogeneities in the reference series built from surrounding stations. Addressing this problem would require additional strategies (e.g. GONZALEZ-ROUCO et al., 2001, or MENNE and WILLIAMS, 2005).

About 10–20 % of all series could not be tested because VERHOM is not able to process stations along the edge of the region covered by the observations. The assessment of the data homogeneity of the ENSEMBLES data set for the period 1960–2004 revealed that 59 % of the precipitation, 20 % of the mean temperature, 17 % of the maximum temperature, and 12 % of both the minimum temperature and air pressure series were homogeneous. By far the highest number of shift inhomogeneities was detected in air pressure series, followed by temperature and precipitation. The mean homogeneous sub-periods vary between 15.5 (air pressure) and 29 (precipitation) years.

List of acronyms

ENSEMBLES: Research project supported by the European Commission under the 6th Framework Programme 2002–2006, Priority Global Change and Ecosystems. The focus is on ensemble-based predictions of climate changes and their impacts. www.ensembles-eu.org

SNHT: Standard Normal Homogeneity Test

THOMAS: Tool for Homogenization of Monthly Data Series

VERAQC: Vienna Enhanced Resolution Analysis Quality Control

VERHOM: Homogenization procedure which combines VERAQC with a relative homogeneity test

Acknowledgments

This study was supported by the ENSEMBLES project (EU FP 6, contract GOCE-CT-2003-505539).

References

AGUILAR, E., I. AUER, T.C. PETERSON, J. WIERINGA, 2003: Guidelines on climate metadata and homogenization. – WCDMP-No. 53, WMO-TD No. 1186, World Meteorological Organization, Geneva.

ALEXANDERSSON, H., 1986: A homogeneity test applied to precipitation data. – J. Climatol. 6, 661–675.

AUER, I., R. BOEHM, A. JURKOVIC, A. ORLIK, R. POTZMANN, W. SCHONER, M. UNGERSBOCK, M. BRUNETTI, T. NANNI, M. MAUGERI, K. BRIFFA, P. JONES, D. EFTHYMIADIS, O. MESTRE, J.M. MOISSELIN, M. BEGERT, R. BRAZDIL, O. BOCHNICEK, T. CEGNAR, M. GAJIC-CAPKA, K. ZANINOVIC, S. MAJSTOROVIC, S. SZALAI, T. SZENTIMREY, L. PERCALLI, 2005: A new instrumental precipitation dataset for the greater alpine region for the period 1800–2002. – Int. J. Climatol. 25, 139–166.

BEGERT, M., G. SEIZ, T. SCHLEGEL, M. MUSA, G. BAUDRAZ, M. MOESCH, 2003: Homogenisierung von Klimareihen der Schweiz und Bestimmung der Normwerte 1961–1990. – Schlussbericht des Projekts NORM90, Veröffentlichung der MeteoSchweiz 67, MeteoSchweiz, Zürich.

BEGERT, M., T. SCHLEGEL, W. KIRCHHOFER, 2005: Homogeneous temperature and precipitation series of Switzerland from 1864 to 2000. – Int. J. Climatol. 25, 65–80.

CONRAD, V., W. POLLAK, 1950: Methods in climatology. Second edition. – Harvard University Press, Cambridge, MA.

GONZALEZ-ROUCO, J.F., J.L. JIMENEZ, V. QUESADA, F. VALERO, 2001: Quality control and homogeneity of precipitation data in the southwest of Europe. – J. Climate 14, 964–978.

HABERLI, C., 2005: The Comprehensive Alpine Radiosonde Data set (CALRAS): Contribution to the diagnostic based on upper-air soundings 1957–1999. – Wien: Facultas, 2006. – VIII, 297 S. Wiener Meteorologische Schriften 4. Zugl.: Wien, Univ., Dissertation.

HABERLI, C., I. GROEHN, R. STEINACKER, W. POTTSCHACHER, M. DORNINGER, 2004: Performance of the surface observation network during MAP. – Meteorol. Z. 13, 109–121.

HANSSEN-BAUER, I., E.J. FØRLAND, 1994: Homogenizing long Norwegian precipitation series. – J. Climate 7, 1001–1013.

HEWITT, C.D., D.J. GRIGGS, 2004: Ensembles-Based Predictions of Climate Changes and their Impacts. – Eos Trans. AGU 85(52), 566, doi:10.1029/2004EO520005.

KLEIN TANK, A.M.G., J.B. WIJNGAARD, G.P. KONNEN, R. BOHM, G. DEMAREE, A. GOCHEVA, M. MILETA, S. PASHIARDIS, L. HEJKRLIK, C. KERN-HANSEN, R. HEINO, P. BESSEMOULIN, G. MULLER-WESTERMEIER, M. TZANAKOU, S. SZALAI, T. PALSDOTTIR, D. FITZGERALD, S. RUBIN, M. CAPALDO, M. MAUGERI, A. LEITASS, A. BUKANTIS, R. ABERFELD, A.F.V. VAN ENGELEN, E. FØRLAND, M. MIETUS, F. COELHO, C. MARES, V. RAZUVAEV, E. NIEPLOVA, T. CEGNAR, J. ANTONIO LOPEZ, B. DAHLSTROM, A. MOBERG, W. KIRCHHOFER, A. CEYLAN, O. PACHALIUK, L.V. ALEXANDER, P. PETROVIC, 2002: Daily dataset of 20th-century surface air temperature and precipitation series for the European Climate Assessment. – Int. J. Climatol. 22, 1441–1453.

KLOK, E.J., A.M.G. KLEIN TANK, submitted: Updated and extended European dataset of daily climate observations. – Int. J. Climatol.

KLOK, E.J., A.M.G. KLEIN TANK, M. BEGERT, E. ZENKLUSEN, 2006: Assessment of the available station density for the gridding and daily data quality/homogeneity. – ENSEMBLES project report D 5.8.

MENNE, M.J., C.N. WILLIAMS, 2005: Detection of undocumented changepoints using multiple test statistics and composite reference series. – J. Climate 18, 4271–4286.

PETERSON, T.C., D.R. EASTERLING, T.R. KARL, P. GROISMAN, I. AUER, R. BOHM, N. PLUMMER, N. NICHOLIS, S. TOROK, L. VINCENT, H. TUOMENVIRTA, J. SALINGER, E.J. FØRLAND, I. HANSSEN-BAUER, H. ALEXANDERSSON, P. JONES, D. PARKER, 1998: Homogeneity adjustments of in situ climate data: a review. – Int. J. Climatol. 18, 1493–1517.

PIELKE, R.A., C. DAVEY, J. ANGEL, O. BLISS, N. DOESKEN, M. CAI, S. FALL, D. NIYOGI, K. GALLO, R. HALE, K. HUBBARD, X. LIN, H. LI, J. NIELSEN-GAMMON, S. RAMAN, 2007: Documentation of bias associated with surface temperature measurement sites for climate change assessment. – Bull. Amer. Meteor. Soc. 88, 913–928.

REEVES, J., J. CHEN, X.L. WANG, R. LUND, Q.Q. LU, 2007: A review and comparison of changepoint detection techniques for climate data. – J. Appl. Meteor. Climatol. 46, 900–914.

SCHEIFINGER, H., R. BOHM, I. AUER, 2003: Räumliche Dekorrelation von Klimazeitreihen unterschiedlicher zeitlicher Auflösung und ihre Bedeutung für ihre Homogenisierbarkeit und die Repräsentativität von Ergebnissen. – 6. Deutsche Klimatagung, Klimavariabilität 2003, 22–25 September 2003, Potsdam. Terra Nostra, Schriftenreihe der Alfred-Wegener-Stiftung 2003/6, 375–379.

STEINACKER, R., C. HABERLI, W. POTTSCHACHER, 2000: A transparent method for the analysis and quality evaluation of irregularly distributed and noisy observational data. – Mon. Wea. Rev. 128, 2303–2316.


Figure 1: Number of stations for each year with series for precipitation (black), mean temperature (green), maximum temperature (red), minimum temperature (blue), air pressure (orange) and snow depth (grey). A year was counted when at least 292 days contain useful data (flag=0, see Section 2).

extension to test the relative homogeneity of radiosonde time series for the Alpine region in HABERLI (2005).

In the context of homogenization, the time series of the differences calculated by VERAQC (henceforth deviation series) can be regarded as a q-series as long as the inter-station distance is less than the decorrelation distance of the variable in question (SCHEIFINGER et al., 2003; AUER et al., 2005).

The ENSEMBLES data set does fulfil this requirement within the examined period 1960–2004. Variations due to non-climatic factors appear as shifts or trends in the deviation series and can be detected by statistical test methods. In order to enhance the power of the statistical tests, the variability of the deviation series has been reduced by running VERAQC on monthly values. In addition, anomalies calculated from a common period were used as input values to avoid influences of a varying station network. Otherwise, the appearance or disappearance of a neighbouring series can lead to abrupt changes in the deviation series similar to real breaks. Note that inhomogeneities in reference series may result in false alarms (HANSSEN-BAUER and FØRLAND, 1994; MENNE and WILLIAMS, 2005). Although the setup of VERAQC minimizes the influence of inhomogeneities in single series, this drawback has to be accounted for when analyzing the homogeneity results of the method.
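The use of anomalies relative to a common period can be illustrated with a minimal sketch. The function name `monthly_anomalies`, the dictionary layout keyed by (year, month) and the base period in the toy example are assumptions made for illustration; the text does not state which base period or implementation was used.

```python
from statistics import mean

def monthly_anomalies(series, base_start, base_end):
    """Subtract each calendar month's base-period mean from a monthly series.

    series: dict mapping (year, month) -> value (hypothetical layout).
    Months without any base-period data are skipped in the result.
    """
    # Collect the base-period values separately for each calendar month.
    by_month = {}
    for (year, month), value in series.items():
        if base_start <= year <= base_end:
            by_month.setdefault(month, []).append(value)
    climatology = {m: mean(values) for m, values in by_month.items()}
    return {(y, m): v - climatology[m]
            for (y, m), v in series.items() if m in climatology}

# Toy data: two Januaries and two Julys define the base period 1961-1962.
data = {(1961, 1): -2.0, (1962, 1): 0.0,
        (1961, 7): 18.0, (1962, 7): 20.0,
        (1963, 1): 1.0}
anom = monthly_anomalies(data, 1961, 1962)
```

Because every station is referenced to its own mean over the same years, a neighbour entering or leaving the network changes the set of anomalies but not their level.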

Alexandersson's standard normal homogeneity test (abbr. SNHT; ALEXANDERSSON, 1986) has been used in the second step of the homogeneity test procedure. The SNHT method is designed to find shifts in mean and returns the date of one possible break point. The test is applied iteratively to each input series in order to search for multiple break points. Significant test results on the 95 % confidence level serve to divide the deviation series into segments. Each segment is then investigated separately and the iteration is repeated until each segment is determined as homogeneous or falls below a minimum length. Figure 2 gives an example of a deviation series and the corresponding homogeneity test results for maximum temperature at the station Basel.
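The iterative segmentation described above can be sketched as follows. The statistic is the commonly quoted single-shift form of the SNHT, T(k) = k·z̄₁² + (n−k)·z̄₂², computed on the standardized series; the function names, the use of the sample standard deviation, and the critical value 9.0 in the example are assumptions for illustration only. In practice the critical value depends on the series length and has to come from published tables or simulation.

```python
from statistics import mean, stdev

def snht(x):
    """Return (T0, k): maximum SNHT statistic and number of values before the break."""
    n = len(x)
    if n < 2:
        return 0.0, 0
    m, s = mean(x), stdev(x)
    if s == 0:
        return 0.0, 0
    z = [(v - m) / s for v in x]
    total = sum(z)
    best_t, best_k, csum = 0.0, 0, 0.0
    for k in range(1, n):
        csum += z[k - 1]
        # k * mean(z[:k])**2 + (n - k) * mean(z[k:])**2, written with sums
        t = csum ** 2 / k + (total - csum) ** 2 / (n - k)
        if t > best_t:
            best_t, best_k = t, k
    return best_t, best_k

def detect_breaks(x, crit, min_len, offset=0):
    """Recursively split x at significant breaks; segments shorter than min_len are not tested."""
    if len(x) < min_len:
        return []
    t0, k = snht(x)
    if t0 <= crit:
        return []
    left = detect_breaks(x[:k], crit, min_len, offset)
    right = detect_breaks(x[k:], crit, min_len, offset + k)
    return left + [offset + k] + right

# Synthetic deviation series: 60 values around 0 followed by 60 values around 2.
series = [-0.1, 0.1] * 30 + [1.9, 2.1] * 30
breaks = detect_breaks(series, crit=9.0, min_len=12)  # crit is a placeholder value
```

With monthly values, a minimum segment length of 5 years would correspond to `min_len=60`.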

For temperature and air pressure, the deviation series can also be used to examine the dimension of a detected shift in mean by a simple comparison of the deviations in the homogeneous segments before and after the break point. The mean difference indicates the order of magnitude of the shift. Note that the method does not account for possible non-zero slopes in the segments, which can lead to over- or underestimation of the magnitude (PIELKE et al., 2007). However, magnitudes are only used to separate small from large shifts in the performance assessment of VERHOM (see Section 4). The significance of the difference is tested using Student's t-test. In general, significant results of the homogeneity tests will also turn out to be significant using the t-test. However, the mathematics of the homogeneity tests is, in principle, strictly applicable only to a single break in the input series. Therefore, the test results may be disturbed by multiple break points and have to be considered as indicators only.
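As a minimal sketch of this comparison, the shift can be estimated as the difference of the segment means and accompanied by a pooled two-sample t statistic. The function name `shift_and_t` and the toy values are hypothetical; the sketch assumes equal variances in both segments and a non-zero pooled variance, and it leaves the comparison with a t quantile (n1 + n2 − 2 degrees of freedom) to the caller.

```python
from math import sqrt
from statistics import mean, variance

def shift_and_t(before, after):
    """Return (shift, t): mean(after) - mean(before) and the pooled t statistic."""
    n1, n2 = len(before), len(after)
    shift = mean(after) - mean(before)
    # Pooled sample variance of the two homogeneous segments.
    pooled = ((n1 - 1) * variance(before) + (n2 - 1) * variance(after)) / (n1 + n2 - 2)
    t = shift / sqrt(pooled * (1.0 / n1 + 1.0 / n2))
    return shift, t

# Toy deviations: the second segment is the first one raised by 1.0.
before = [0.0, 0.2, -0.2, 0.1, -0.1]
after = [v + 1.0 for v in before]
shift, t = shift_and_t(before, after)
is_large = abs(shift) >= 0.5  # threshold used for temperature in Table 1
```

Like the procedure in the text, this estimate ignores any trend within the segments.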

For precipitation, ratio series instead of difference series are commonly used in the homogenization process. Tests showed that the performance of VERHOM decreased when using ratios. As a consequence, the difference series was also used for precipitation. This test procedure, developed in the framework of ENSEMBLES, is referred to as VERHOM (VERAQC Relative Homogenization Method) henceforth.

4 Performance assessment and limitations of the relative homogeneity test procedure (VERHOM)

The performance assessment was based on a comparison with high quality homogeneous Swiss series for the period 1961 to 1997. The latter have been homogenized using THOMAS, a tool specially developed for homogenization of climate series (BEGERT et al., 2003, 2005). Because THOMAS needs manual input, it was not suitable for the homogeneity testing of a large data set like the one in ENSEMBLES. Note that the THOMAS procedure includes the full station histories and that the network density for THOMAS was 4 to 20 times higher compared to the data set of ENSEMBLES. A denser network leads to smaller variations in the q-series, and homogeneity tests are able to detect smaller (and therefore usually more) inhomogeneities. Similarly, a higher network density prevents poorly correlated neighbouring stations from serving as reference series. The inclusion of such series increases the false alarm rate, as they might belong to another climate region. In short, VERHOM cannot be expected to reach the performance of THOMAS, especially in a mountainous area like Switzerland. However, the comparison gives an idea of the performance of VERHOM in the context of the homogeneity assessment of the large ENSEMBLES data set.

Figure 2: Deviation series (top) and homogeneity test results (bottom) for the maximum temperature series of Basel, Switzerland, between 1960 and 2004. Small circles indicate the test values and the dashed line represents the 95 % significance level. Red lines mark the significant inhomogeneities.

Figure 3: Test results of SNHT for the maximum temperature series of Luzern, Switzerland, between 1961 and 2000 using a minimum segment length of 1 year (middle) and 5 years (bottom). Findings of THOMAS are indicated in the deviation series (top) with arrows. Small circles show the test values and the dashed line represents the 95 % significance level. Red lines stress significant inhomogeneities.

Table 1: Left: number of large (≥ 0.5 °C) and small (< 0.5 °C) shifts detected by VERHOM and THOMAS in Swiss temperature series (mean, minimum, maximum). Right: number of large (≥ 0.5 °C) and small (< 0.5 °C) false alarms and missed break points of VERHOM in Swiss temperature series (mean, minimum, maximum). In total, VERHOM has detected 171 and THOMAS 259 break points for the period 1961–1997.

hits                    VERHOM large    VERHOM small
THOMAS large            28              16
THOMAS small            9               19

VERHOM                  false alarms    missed
large                   19              49
small                   80              138

Figure 4: Sum of THOMAS inhomogeneities in the Swiss station series detected by VERHOM applied on monthly values for the period 1960–1997.
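The counts in Table 1 can be cross-checked against the totals quoted in its caption. The short calculation below is a reader's consistency check, not part of the original analysis; it assumes that every VERHOM break point is either a hit or a false alarm and every THOMAS break point is either a hit or a miss.

```python
# Counts transcribed from Table 1 (Swiss temperature series, 1961-1997).
# Keys of `hits` are (THOMAS class, VERHOM class).
hits = {("large", "large"): 28, ("large", "small"): 16,
        ("small", "large"): 9, ("small", "small"): 19}
false_alarms = {"large": 19, "small": 80}
missed = {"large": 49, "small": 138}

n_hits = sum(hits.values())
verhom_total = n_hits + sum(false_alarms.values())  # hits + false alarms
thomas_total = n_hits + sum(missed.values())        # hits + missed break points
false_alarm_share = sum(false_alarms.values()) / verhom_total
```

The two totals reproduce the 171 VERHOM and 259 THOMAS break points stated in the caption.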

In a first step, the THOMAS results were used to find the best setup for VERHOM, with the THOMAS results regarded as the truth. Hit rates and false alarm rates of both methods were compared in order to determine the most suitable minimum segment length and an adequate significance level for VERHOM. The choice of the minimum segment length turned out to be a compromise between a preferably high hit rate and an acceptable false alarm rate. The false alarm rate in particular was sensitive to the segment length. The optimal length was found to be 5 years, independent of the variable. This result is in accordance with the recommendation for the SNHT (ALEXANDERSSON, 1986). As a consequence, inhomogeneities closer than 5 years apart cannot be detected. Figure 3 shows, as an example, the test results of SNHT using 1 and 5 year minimum segment lengths.

In a second step, the results calculated automatically with the optimal setup of VERHOM were compared to the findings of THOMAS. In total, 33 mean temperature, 33 minimum temperature, 32 maximum temperature, 32 precipitation and 18 air pressure series were finally used for the comparison. Figure 4 shows the number of break points per series detected by VERHOM compared to the break points detected by THOMAS. In addition, the number of false alarms (significant break points of VERHOM, no THOMAS indication) and missed shift inhomogeneities (significant break points of THOMAS, no VERHOM indication) are shown. Break points detected within one year by both methods are referred to as hits. Overall, the performance of VERHOM in detecting the THOMAS breaks is rather poor. About half of the shifts detected by VERHOM must be considered as false alarms (a result only weakly depending on the variable examined). The rate of missed shifts is also high, especially for air pressure. As mentioned above, it must be kept in mind that this comparison is very strict and has to be interpreted with care. Missed break points can be expected due to the implemented minimum segment length and the substantially sparser network density. The surprisingly high number of missed breaks in air pressure series, for instance, can be explained by the fact that Swiss air pressure series contain a high number of inhomogeneities in the 1980s because of technical problems. VERHOM is not able to detect breaks that follow each other within less than 5 years, while there is no such restriction in THOMAS because inhomogeneities are detected according to test results and station history analysis. The high number of missed breaks in temperature series, on the other hand, is due to small breaks that could not be detected by VERHOM. The sparser network density compared to THOMAS causes small shifts to disappear into statistical noise.

Concerning the false alarms, there is more than one reason for the rather high rate. The sparser network density, causing series from different climatological regions to be included in the reference series, as well as inhomogeneities in series of neighbouring stations might be responsible. In addition, SNHT tends to break up possible short trends in the deviation series into many small steps (REEVES et al., 2007). Although we introduced a minimum segment length to be tested, this tendency might still contribute to the high false alarm rate.
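The bookkeeping of hits, false alarms and missed break points described above can be sketched as a simple matching of two lists of break dates. The function name `match_breaks`, the greedy nearest-neighbour matching and the example dates are assumptions for illustration; the text only specifies the one-year tolerance.

```python
def match_breaks(detected, reference, tol=1.0):
    """Count hits, false alarms and misses between two lists of break dates (in years)."""
    unmatched = sorted(reference)
    hits = 0
    for d in sorted(detected):
        # Reference breaks not yet matched that lie within the tolerance.
        candidates = [r for r in unmatched if abs(r - d) <= tol]
        if candidates:
            # Pair the detected break with the closest remaining reference break.
            unmatched.remove(min(candidates, key=lambda r: abs(r - d)))
            hits += 1
    return {"hits": hits,
            "false_alarms": len(detected) - hits,
            "missed": len(unmatched)}

# Hypothetical break dates for one station (automated method vs. reference).
result = match_breaks([1965.0, 1972.5, 1988.0], [1965.5, 1980.0, 1988.9])
```

Each reference break can be matched at most once, so two detected breaks close to the same reference date yield one hit and one false alarm.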

Besides a comparison of "shift dates", a comparison of "shift dimensions" gives an idea of the performance of the current setup of VERHOM. Table 1 compares the number of small (< 0.5 °C) and large (≥ 0.5 °C) shifts detected by VERHOM and THOMAS in Swiss temperature series (mean, minimum and maximum series). In general, most of the large shifts are characterized as large by VERHOM, whereas small shifts are only partly identified as small ones. False alarms and missed breaks in particular are generally small, although a substantial number of large shifts is included.

Finally, note that the comparison of the two methods was carried out using series from stations located in the mountainous region of the Alps. As spatial correlation is one of the most important factors for successful relative homogeneity testing, a better performance can be expected for less mountainous regions, i.e. most other parts of Europe. However, a relatively high number of false alarms and missed break points due to poorly correlated neighbouring stations or inhomogeneities in reference series must still be expected. A comparison of the VERHOM results with findings for series in other countries would be a valuable complement to the current performance assessment.

Overall, the new automated method VERHOM is able to detect shift inhomogeneities and estimate their dimensions. However, some limitations exist:

i. Inhomogeneities closer than 5 years apart cannot be detected, which leads to missed break points.

ii. Stations located at the boundary of the investigated region cannot be tested, as the algorithm of VERAQC fails if the number of surrounding stations is insufficient. 10 to 20 % of the series in the ENSEMBLES data base are affected, depending on the variable.

iii. As a consequence of limitation ii, variations in the station network can cause incomplete deviation series close to the boundary if a neighbouring station appears or disappears in time. This can cause missed break points.

Table 2: Number of series with zero, one, two, three, and four or more break points, and number of stations with no result from VERHOM.

                          mean temp.    minimum temp.   maximum temp.   precipitation   air pressure
homogeneous               245 (20 %)    168 (12 %)      231 (17 %)      1071 (59 %)     36 (12 %)
1 break point             316 (26 %)    323 (23 %)      300 (22 %)      406 (23 %)      45 (15 %)
2 break points            238 (19 %)    269 (20 %)      264 (19 %)      94 (5 %)        65 (21 %)
3 break points            104 (8 %)     147 (11 %)      128 (9 %)       21 (1 %)        47 (15 %)
4 or more break points    31 (3 %)      47 (3 %)        30 (2 %)        0 (0 %)         30 (10 %)
undefined                 299 (24 %)    423 (31 %)      420 (31 %)      209 (12 %)      83 (27 %)

Figure 5: Precipitation 1960–2004. Number of break points detected (legend classes: 0, 1, 2, 3, >4, undefined).

Table 3: Length of mean homogeneous sub-period per parameter.

parameter                 mean homogeneous sub-period [y]
mean temperature          17.7
minimum temperature       16.0
maximum temperature       17.1
precipitation             29.0
air pressure              15.5
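The percentages in Table 2 follow directly from the counts in each column. The snippet below recomputes them as a reader's check; the dictionary layout and the function name `percentages` are illustrative, and the only assumption is that each percentage refers to the column total including the undefined series.

```python
# Counts per variable from Table 2, in the row order:
# homogeneous, 1, 2, 3, 4 or more break points, undefined.
counts = {
    "mean temperature":    [245, 316, 238, 104, 31, 299],
    "minimum temperature": [168, 323, 269, 147, 47, 423],
    "maximum temperature": [231, 300, 264, 128, 30, 420],
    "precipitation":       [1071, 406, 94, 21, 0, 209],
    "air pressure":        [36, 45, 65, 47, 30, 83],
}

def percentages(column):
    """Share of each class in per cent of the column total, rounded to integers."""
    total = sum(column)
    return [round(100 * c / total) for c in column]

shares = {name: percentages(column) for name, column in counts.items()}
totals = {name: sum(column) for name, column in counts.items()}
```

The "undefined" row gives the share of series that could not be tested, which is where the 10 to 20 % quoted in limitation ii can be compared with the table.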

Figure 6: As Figure 5, but for air pressure.

5 Homogeneity assessment of the ENSEMBLES data set with VERHOM for the period 1960–2004

The homogeneity of the ENSEMBLES data set was tested using the procedure VERHOM described above. Precipitation, air pressure, mean temperature, maximum temperature and minimum temperature series were considered.

The results of the homogeneity assessment based on VERHOM are shown in Figures 5 to 9. Different colours indicate the number of break points per series. Stations excluded from the analysis due to incompleteness or due to the edge problem are plotted in black. The exact numbers of detected break points and the mean lengths of a homogeneous sub-period per parameter are given in Tables 2 and 3.

Overall, it is found that air pressure series contain the highest number of break points (Figure 6, Table 2). This result agrees with findings from other studies (e.g. BEGERT et al., 2003) and has several reasons. First, air pressure is highly correlated in space, resulting in a good performance of relative homogeneity test methods. Smaller shifts can be detected compared to other parameters. Second, small changes in the measuring conditions, such as a changing station height, can lead to substantial inhomogeneities. Long homogeneous air pressure series are generally rare.

Figure 7: As Figure 5, but for mean temperature.

Figure 8: As Figure 5, but for maximum temperature.

For temperature series mean homogeneous sub-periods of around 20 years are present (Table 3) Min-imum temperature series contain the highest number ofshifts followed by mean and maximum temperature (Ta-ble 2) In addition the number of break points in the dif-ferent temperature series of a station often differs (Fig-ures 7ndash9) As seen in Swiss series the higher number ofinhomogeneities in minimum temperature series mightbe a result of the fact that minimum temperature mea-

Figure 9 As Figure 5 but for minimum temperature

surements are more sensitive to disturbances in the mea-suring conditions (eg relocations) than the other tem-perature variables

Precipitation series contain less inhomogeneities thanthe other parameters (Figure 5 to 9 Table 2) Precipi-tation also shows the longest mean homogeneous sub-periods (Table 3) Again this result is a combinationof the fact that precipitation series are less frequentlysubject to changes in the measuring conditions and thelimited efficiency of relative homogeneity tests due toa higher spatial and temporal variability of the variableFigures 10 and 11 show the frequency distribution of theshift dimensions in air pressure and temperature seriesof the ENSEMBLES data set Most of the shifts in airpressure series (around 80 ) lie between ndash1 and 1 hPaSingle shifts of up to ndash10 hPa and 5 hPa respectively oc-curred For temperature the shifts vary between ndash3 and3C Large shifts rarely occurred but most of the tem-perature shifts lie between ndash1 and 1C A broader distri-bution can be observed for minimum temperature thanfor maximum and mean temperature Shift dimensionsas well as found differences in distributions agree withfindings for the Swiss temperature series in the period1961 to 1997 (BEGERT et al 2003)

6 Summary and conclusions

With VERHOM an automated relative homogeniza-tion procedure has been developed tested and usedfor the homogeneity assessment of a large Europeandata set collated in the framework of the EU-projectENSEMBLES VERHOM combines VERAQC (ViennaEnhanced Resolution Analysis Quality Control) out-put with Alexanderssonrsquos standard normal homogeneitytest VERAQC uses a modified thin-plate spline interpo-lation to calculate a reference value for a station to be

670 M Begert et al An automated procedure to detect discontinuities Meteorol Z 17 2008

air pressure

[]

010

20

30

40

50

10 5 0 5 10

minimum temperature

[]

010

20

30

40

50

7 35 0 35 7

Figure 10 Left frequency distribution of shift dimensions for air pressure in classes of 1 hPa Right frequency distribution of shift

dimensions for mean temperature in classes of 05C

maximum temperature

[]

010

20

30

40

50

7 35 0 35 7

minimum temperature

[]

010

20

30

40

50

7 35 0 35 7

Figure 11 Left frequency distribution of shift dimensions for maximum temperature in classes of 05C Right frequency distribution of

shift dimensions for minimum temperature in classes of 05C

tested The resulting differences between measurementsand calculated values can be analysed as time series fora single station and used as input for relative homogene-ity tests VERHOM is designed to detect shifts in meanin climatological time series

To assess the performance of the automated proce-dure the findings were compared to results of a care-fully homogenized data set in the Swiss Alpine regionThe latter is based on the relative homogenization pro-cedure THOMAS combining statistical methods withan in-depth analysis of the station history informationOverall the proposed new method VERHOM is able to

detect shift inhomogeneities in temperature precipita-tion and air pressure series although the number of falsealarms as well as missed break points is comparativelyhigh Depending on the variable 45 to 60 of the in-dicated shifts were false alarms and only 40 to 55 ofthe inhomogeneities found by THOMAS were detectedSince VERHOM was tested in a region with complextopography a better performance can be expected formost parts of Europe outside the Alpine region The highfalse alarm rate for small shifts might also be due toinhomogeneities in the reference series built from sur-rounding stations Addressing this problem would re-

Meteorol Z 17 2008 M Begert et al An automated procedure to detect discontinuities 671

quire additional strategies (eg GONZALES-ROUCO etal 2001 or MENNE and WILLIAMS 2005) About 10ndash20 of all series could not be tested due to the factthat VERHOM is not able to process stations along theedge of the region covered by the observations The as-sessment of the data homogeneity of the ENSEMBLESdata set for the period 1960-2004 revealed that 59 of the precipitation 20 of the mean temperature 17 of the maximum temperature 12 of the minimumtemperature and air pressure series were homogeneousBy far the highest number of shift inhomogeneities wasdetected in air pressure series followed by temperatureand precipitation The mean homogeneous sub-periodsvary between 155 (air pressure) and 29 (precipitation)years

List of acronyms

ENSEMBLES Research project supported by the Eu-ropean Comission under the 6th Framework Pro-gramme 2002ndash2006 Priority Global Change andEcosystems The focus is on Ensemble based predic-tions of climate changes and their impactswwwensembles-euorg

SNHT Standard Normal Homogeneity TestTHOMAS Tool for Homogenization of Monthly Data

SeriesVERAQC Vienna Enhanced Resolution Analysis Qual-

ity ControlVERHOM Homogenization procedure which combines

VERAQC with a relative homogeneity test

Acknowledgments

This study was supported by the ENSEMBLES project(EU FP 6 contract GOCE-CT-2003-505539)

References

AGUILAR E I AUER TC PETERSON J WIERINGA2003 Guidelines on climate metadata and homogenizationndash WCDMPndashNo 53 WMOndashTD No 1186 World Meteoro-logical Organization Geneva

ALEXANDERSSON H 1986 A homogeneity test applied toprecipitation data ndash J Climatol 6 661ndash675

AUER I R BOEHM A JURKOVIC A ORLIK R POTZ-MANN W SCHONER M UNGERSBOCK M BRUNETTIT NANNI M MAUGERI K BRIFFA P JONES DEFTHYMIADIS O MESTRE JM MOISSELIN MBEGERT R BRAZDIL O BOCHNICEK T CEGNAR MGAJI-APKA K ZANINOVI S MAJSTOROVI S SZA-LAI T SZENTIMREY L PERCALLI 2005 A new instru-mental precipitation dataset for the greater alpine region forthe period 1800ndash2002 ndash Int J Climatol 25 139ndash166



Figure 2: Deviation series (top) and homogeneity test results (bottom) for the maximum temperature series of Basel, Switzerland, between 1960 and 2004. Small circles indicate the test values and the dashed line represents the 95 % significance level. Red lines mark the significant inhomogeneities. (Panels: "deviation series, station BASEL BINNINGEN", monthly deviation against year; "Alexandersson's standard normal homogeneity test 2 (conf. level = 0.95)", test value against year.)

Figure 3: SNHT test results for the maximum temperature series of Luzern, Switzerland, between 1961 and 2000 using a minimum segment length of 1 year (middle) and 5 years (bottom). Findings of THOMAS are indicated in the deviation series (top) with arrows. Small circles show the test values and the dashed line represents the 95 % significance level. Red lines stress significant inhomogeneities. (Panels: "deviation series, station LUZERN", monthly deviation against year; "Alexandersson SNHT, window size 1 year" and "Alexandersson SNHT, window size 5 years", test value against year.)

Table 1: Left: number of large (≥ 0.5 °C) and small (< 0.5 °C) shifts detected by VERHOM and THOMAS in Swiss temperature series (mean, minimum, maximum). Right: number of large (≥ 0.5 °C) and small (< 0.5 °C) false alarms and missed break points of VERHOM in Swiss temperature series (mean, minimum, maximum). In total, VERHOM detected 171 and THOMAS 259 break points for the period 1961–1997.

hits                 VERHOM large    VERHOM small
THOMAS large              28              16
THOMAS small               9              19

VERHOM               false alarms    missed
large                     19              49
small                     80             138


Figure 4: Sum of THOMAS inhomogeneities in the Swiss station series detected by VERHOM applied to monthly values for the period 1960–1997.

like Switzerland. However, the comparison gives an idea of the performance of VERHOM in the context of the homogeneity assessment of the large ENSEMBLES data set.

In a first step, the THOMAS results were used to find the best setup for VERHOM, with the THOMAS results regarded as the truth. Hit rates and false alarm rates of both methods were compared in order to determine the most suitable minimum segment length and an adequate significance level for VERHOM. The choice of the minimum segment length turned out to be a compromise between a high hit rate and an acceptable false alarm rate. The false alarm rate in particular was sensitive to the segment length. The optimal length was found to be 5 years, independent of the variable. This result is in accordance with the recommendation for the SNHT (ALEXANDERSSON, 1986). As a consequence, inhomogeneities closer than 5 years apart cannot be detected. Figure 3 shows example SNHT test results using 1-year and 5-year minimum segment lengths.
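The effect of a minimum segment length on a single-shift SNHT can be illustrated with a minimal sketch. It uses the standard form of Alexandersson's statistic, T(k) = k·z̄₁² + (n − k)·z̄₂², on the standardized deviation series and only evaluates candidate break positions that leave at least `min_segment` values on each side. The function name, the default of 60 monthly values (5 years) and the omission of the critical values needed for the 95 % significance decision are assumptions made for illustration; this is not the VERHOM implementation.

```python
import numpy as np

def snht_statistic(deviations, min_segment=60):
    """Single-shift SNHT on a deviation series (illustrative sketch).

    Returns (k, t_max): the number of values before the most likely
    break and the test value there. Only positions with at least
    `min_segment` values on both sides are considered.
    Assumes a complete series with non-zero variance.
    """
    x = np.asarray(deviations, dtype=float)
    n = x.size
    if n < 2 * min_segment:
        raise ValueError("series too short for the requested minimum segment length")

    # Standardize the series (zero mean, unit sample standard deviation).
    z = (x - x.mean()) / x.std(ddof=1)

    # Candidate break positions: k values before the break, n - k after it.
    k = np.arange(min_segment, n - min_segment + 1)
    csum = np.cumsum(z)
    mean_before = csum[k - 1] / k
    mean_after = (csum[-1] - csum[k - 1]) / (n - k)

    # SNHT test value T(k) for every admissible break position.
    t = k * mean_before**2 + (n - k) * mean_after**2
    best = int(np.argmax(t))
    return int(k[best]), float(t[best])
```

A break would then be flagged only if the returned test value exceeds the tabulated critical value for the series length, which this sketch leaves to the caller. Because positions closer than `min_segment` to either end are never evaluated, two shifts less than 5 years apart cannot both be found, which is the limitation noted above.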

In a second step, the results calculated automatically with the optimal setup of VERHOM were compared to the findings of THOMAS. In total, 33 mean temperature, 33 minimum temperature, 32 maximum temperature, 32 precipitation and 18 air pressure series were used for the comparison. Figure 4 shows the number of break points per series detected by VERHOM compared to the break points detected by THOMAS. In addition, the number of false alarms (significant break points of VERHOM without a THOMAS indication) and missed shift inhomogeneities (significant break points of THOMAS without a VERHOM indication) are shown. Break points detected within one year of each other by both methods are referred to as hits. Overall, the performance of VERHOM in detecting the THOMAS breaks is rather poor. About half of the shifts detected by VERHOM must be considered false alarms (a result that depends only weakly on the variable examined). The rate of missed shifts is also high, especially for air pressure. As mentioned above, it must be kept in mind that this comparison is very strict and has to be interpreted with care. Missed break points can be expected due to the implemented minimum segment length and the substantially sparser network density. The surprisingly high number of missed breaks in air pressure series, for instance, can be explained by the fact that Swiss air pressure series contain a high number of inhomogeneities in the 1980s because of technical problems. VERHOM is not able to detect breaks that follow each other within less than 5 years, while there is no such restriction in THOMAS, where inhomogeneities are detected according to test results and station history analysis. The high number of missed breaks in temperature series, on the other hand, is due to small breaks that could not be detected by VERHOM. The sparser network density compared to THOMAS causes small shifts to disappear into statistical noise. Concerning the false alarms, there is more than one reason for the rather high rate. The sparser network density, which causes series from different climatological regions to be included in the reference series, as well as inhomogeneities in series of neighbouring stations might be responsible. In addition, SNHT tends to break up possible short trends in the deviation series into many small steps (REEVES et al., 2007). Although we introduced a minimum segment length to be tested, this tendency might still contribute to the high false alarm rate.
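The classification into hits, false alarms and missed break points described above can be sketched as follows. The one-year tolerance comes from the text; the greedy one-to-one pairing (each reference break can be matched only once, nearest first) and all names are assumptions made for illustration, since the paper does not spell out the matching rule.

```python
def match_breaks(candidate_years, reference_years, tolerance=1.0):
    """Compare automatically detected breaks with reference breaks.

    Returns (hits, false_alarms, missed), where hits is a list of
    (candidate, reference) pairs no more than `tolerance` years apart.
    """
    unmatched_ref = sorted(reference_years)
    hits, false_alarms = [], []
    for year in sorted(candidate_years):
        # Reference breaks still available within the tolerance window.
        close = [r for r in unmatched_ref if abs(r - year) <= tolerance]
        if close:
            nearest = min(close, key=lambda r: abs(r - year))
            unmatched_ref.remove(nearest)  # one-to-one matching
            hits.append((year, nearest))
        else:
            false_alarms.append(year)
    # Reference breaks never matched are the missed break points.
    return hits, false_alarms, unmatched_ref
```

With hypothetical dates, `match_breaks([1965.5, 1980.0, 1992.0], [1965.0, 1983.0, 1986.0, 1992.5])` yields two hits, one false alarm (1980.0) and two missed breaks (1983.0 and 1986.0).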

Besides a comparison of "shift dates", a comparison of "shift dimensions" gives an idea of the performance of the current setup of VERHOM. Table 1 compares the number of small (< 0.5 °C) and large (≥ 0.5 °C) shifts detected by VERHOM and THOMAS in Swiss temperature series (mean, minimum and maximum series). In general, most of the large shifts are characterized as large by VERHOM, whereas small shifts are only partly identified as small ones. False alarms and missed breaks in particular are generally small, although a substantial number of large shifts is included.
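The counts in Table 1 can be cross-checked against the totals quoted in its caption with a few lines of arithmetic. The sketch below assumes that the rows of the left block are the THOMAS size classes and the columns the VERHOM size classes, as the flattened layout suggests; the variable names are illustrative only.

```python
# Hits from Table 1, keyed by (THOMAS size class, VERHOM size class).
hits = {
    ("large", "large"): 28,
    ("large", "small"): 16,
    ("small", "large"): 9,
    ("small", "small"): 19,
}
false_alarms = {"large": 19, "small": 80}
missed = {"large": 49, "small": 138}

n_hits = sum(hits.values())                         # 72
verhom_total = n_hits + sum(false_alarms.values())  # 171 breaks found by VERHOM
thomas_total = n_hits + sum(missed.values())        # 259 breaks found by THOMAS

# Share of VERHOM detections that are false alarms.
false_alarm_share = sum(false_alarms.values()) / verhom_total

# Share of hits where both methods agree on the size class.
same_class_share = (hits[("large", "large")] + hits[("small", "small")]) / n_hits

print(verhom_total, thomas_total)
print(round(false_alarm_share, 3), round(same_class_share, 3))
```

The two totals reproduce the 171 and 259 break points stated in the caption, and the false alarm share of roughly 58 % for the temperature series is in line with the statement that about half of the shifts detected by VERHOM are false alarms.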

Finally, note that the comparison of the two methods was carried out using series from stations located in the mountainous region of the Alps. As spatial correlation is one of the most important factors for successful relative homogeneity testing, a better performance can be expected for less mountainous regions, i.e. most other parts of Europe. However, a relatively high number of false alarms and missed break points due to poorly correlated neighbouring stations or inhomogeneities in reference series must still be expected. A comparison of the VERHOM results with findings for series in other countries would be a valuable addition to the current performance assessment.

Overall, the new automated method VERHOM is able to detect shift inhomogeneities and estimate their dimensions. However, some limitations exist:

i. Inhomogeneities closer than 5 years apart cannot be detected, which leads to missed break points.

ii. Stations located at the boundary of the investigated region cannot be tested, as the algorithm of VERAQC fails if the number of surrounding stations is insufficient. 10 to 20 % of the series in the ENSEMBLES data base are affected, depending on the variable.

iii. As a consequence of limitation ii, variations in the station network can cause incomplete deviation series close to the boundary if a neighbouring station appears or disappears in time. This can cause missed break points.

Table 2: Number of series with zero, one, two, three and four or more break points, and number of stations with no result from VERHOM.

                          mean temp.    minimum temp.   maximum temp.   precipitation   air pressure
homogeneous               245 (20 %)    168 (12 %)      231 (17 %)      1071 (59 %)     36 (12 %)
1 break point             316 (26 %)    323 (23 %)      300 (22 %)      406 (23 %)      45 (15 %)
2 break points            238 (19 %)    269 (20 %)      264 (19 %)      94 (5 %)        65 (21 %)
3 break points            104 (8 %)     147 (11 %)      128 (9 %)       21 (1 %)        47 (15 %)
4 or more break points    31 (3 %)      47 (3 %)        30 (2 %)        0 (0 %)         30 (10 %)
undefined                 299 (24 %)    423 (31 %)      420 (31 %)      209 (12 %)      83 (27 %)

Figure 5: Precipitation, 1960–2004. Number of break points detected.

Table 3: Mean length of the homogeneous sub-periods per parameter.

parameter               mean homogeneous sub-period [y]
mean temperature        17.7
minimum temperature     16.0
maximum temperature     17.1
precipitation           29.0
air pressure            15.5

Figure 6: As Figure 5, but for air pressure.

5 Homogeneity assessment of the ENSEMBLES data set with VERHOM for the period 1960–2004

The homogeneity of the ENSEMBLES data set was tested using the procedure VERHOM described above. Precipitation, air pressure, mean temperature, maximum temperature and minimum temperature series were considered.

The results of the homogeneity assessment based on VERHOM are shown in Figures 5 to 9. Different colours indicate the number of break points per series. Stations excluded from the analysis due to incompleteness or due to the edge problem are plotted in black. The exact numbers of detected break points and the mean lengths of a homogeneous sub-period per parameter are given in Tables 2 and 3.
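The percentages printed in Table 2 can be reproduced from the counts, which also shows that they refer to all series of a variable, including the "undefined" ones that VERHOM could not test. The sketch below copies the counts and percentages from Table 2; the dictionary and function names are illustrative.

```python
# Counts per variable from Table 2, in the order:
# homogeneous, 1, 2, 3, 4 or more break points, undefined.
TABLE2_COUNTS = {
    "mean temperature":    [245, 316, 238, 104, 31, 299],
    "minimum temperature": [168, 323, 269, 147, 47, 423],
    "maximum temperature": [231, 300, 264, 128, 30, 420],
    "precipitation":       [1071, 406, 94, 21, 0, 209],
    "air pressure":        [36, 45, 65, 47, 30, 83],
}

# Percentages as printed in Table 2, in the same order.
TABLE2_PERCENT = {
    "mean temperature":    [20, 26, 19, 8, 3, 24],
    "minimum temperature": [12, 23, 20, 11, 3, 31],
    "maximum temperature": [17, 22, 19, 9, 2, 31],
    "precipitation":       [59, 23, 5, 1, 0, 12],
    "air pressure":        [12, 15, 21, 15, 10, 27],
}

def shares(counts):
    """Rounded percentage of each category relative to all series."""
    total = sum(counts)
    return [round(100 * c / total) for c in counts]

for variable, counts in TABLE2_COUNTS.items():
    print(variable, sum(counts), shares(counts))
```

For example, 1071 of the 1801 precipitation series are homogeneous, which gives the 59 % quoted for precipitation.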

Overall, it is found that air pressure series contain the highest number of break points (Figure 6, Table 2). This result agrees with findings from other studies (e.g. BEGERT et al., 2003) and has several reasons. First, air pressure is highly correlated in space, resulting in a good performance of relative homogeneity test methods. Smaller shifts can be detected compared to other parameters. Second, small changes in the measuring conditions, such as a changing station height, can lead to substantial inhomogeneities. Long homogeneous air pressure series are generally rare.

Figure 7: As Figure 5, but for mean temperature.

Figure 8: As Figure 5, but for maximum temperature.

For temperature series, mean homogeneous sub-periods of roughly 16 to 18 years are present (Table 3). Minimum temperature series contain the highest number of shifts, followed by mean and maximum temperature (Table 2). In addition, the number of break points in the different temperature series of a station often differs (Figures 7–9). As seen in Swiss series, the higher number of inhomogeneities in minimum temperature series might be a result of the fact that minimum temperature measurements are more sensitive to disturbances in the measuring conditions (e.g. relocations) than the other temperature variables.

Figure 9: As Figure 5, but for minimum temperature.

Precipitation series contain fewer inhomogeneities than the other parameters (Figures 5 to 9, Table 2). Precipitation also shows the longest mean homogeneous sub-periods (Table 3). Again, this result is a combination of the fact that precipitation series are less frequently subject to changes in the measuring conditions and the limited efficiency of relative homogeneity tests due to a higher spatial and temporal variability of the variable.

Figures 10 and 11 show the frequency distribution of the shift dimensions in air pressure and temperature series of the ENSEMBLES data set. Most of the shifts in air pressure series (around 80 %) lie between –1 and 1 hPa. Individual shifts as large as –10 hPa and 5 hPa, respectively, occurred. For temperature, the shifts vary between –3 and 3 °C. Large shifts are rare; most of the temperature shifts lie between –1 and 1 °C. A broader distribution can be observed for minimum temperature than for maximum and mean temperature. The shift dimensions as well as the differences found between the distributions agree with findings for the Swiss temperature series in the period 1961 to 1997 (BEGERT et al., 2003).

6 Summary and conclusions

With VERHOM, an automated relative homogenization procedure has been developed, tested and used for the homogeneity assessment of a large European data set collated in the framework of the EU project ENSEMBLES. VERHOM combines VERAQC (Vienna Enhanced Resolution Analysis Quality Control) output with Alexandersson's standard normal homogeneity test. VERAQC uses a modified thin-plate spline interpolation to calculate a reference value for a station to be tested. The resulting differences between measurements and calculated values can be analysed as time series for a single station and used as input for relative homogeneity tests. VERHOM is designed to detect shifts in the mean of climatological time series.

Figure 10: Left: frequency distribution of shift dimensions for air pressure in classes of 1 hPa. Right: frequency distribution of shift dimensions for mean temperature in classes of 0.5 °C.

Figure 11: Left: frequency distribution of shift dimensions for maximum temperature in classes of 0.5 °C. Right: frequency distribution of shift dimensions for minimum temperature in classes of 0.5 °C.
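The idea of a deviation series, i.e. the measurement at a test station minus a reference value interpolated from its neighbours, can be illustrated with a deliberately simplified sketch. VERAQC uses a modified thin-plate spline interpolation; the inverse-distance weighting below is a much simpler stand-in chosen only to keep the example short, and the function name, the `power` parameter and the assumption of complete series are all hypothetical.

```python
import numpy as np

def deviation_series(target, neighbours, distances_km, power=2.0):
    """Deviation of a test station from an interpolated reference.

    target:       values of the test station, shape (n_times,)
    neighbours:   values of neighbouring stations, shape (n_stations, n_times)
    distances_km: distance of each neighbour to the test station

    NOTE: inverse-distance weighting replaces VERAQC's spline here
    for illustration only; missing values are not handled.
    """
    target = np.asarray(target, dtype=float)
    neighbours = np.asarray(neighbours, dtype=float)

    # Closer neighbours get larger weights; weights sum to one.
    weights = 1.0 / np.asarray(distances_km, dtype=float) ** power
    weights = weights / weights.sum()

    # Reference value per time step: weighted mean of the neighbours.
    reference = weights @ neighbours
    return target - reference
```

A shift in the test station that is not shared by its neighbours shows up as a step in the returned series, which is what a relative homogeneity test such as the SNHT then looks for.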

To assess the performance of the automated procedure, the findings were compared to results of a carefully homogenized data set in the Swiss Alpine region. The latter is based on the relative homogenization procedure THOMAS, which combines statistical methods with an in-depth analysis of the station history information. Overall, the proposed new method VERHOM is able to detect shift inhomogeneities in temperature, precipitation and air pressure series, although the number of false alarms as well as missed break points is comparatively high. Depending on the variable, 45 to 60 % of the indicated shifts were false alarms and only 40 to 55 % of the inhomogeneities found by THOMAS were detected. Since VERHOM was tested in a region with complex topography, a better performance can be expected for most parts of Europe outside the Alpine region. The high false alarm rate for small shifts might also be due to inhomogeneities in the reference series built from surrounding stations. Addressing this problem would require additional strategies (e.g. GONZALEZ-ROUCO et al., 2001 or MENNE and WILLIAMS, 2005). About 10–20 % of all series could not be tested because VERHOM is not able to process stations along the edge of the region covered by the observations. The assessment of the data homogeneity of the ENSEMBLES data set for the period 1960–2004 revealed that 59 % of the precipitation, 20 % of the mean temperature, 17 % of the maximum temperature, and 12 % each of the minimum temperature and air pressure series were homogeneous. By far the highest number of shift inhomogeneities was detected in air pressure series, followed by temperature and precipitation. The mean homogeneous sub-periods vary between 15.5 (air pressure) and 29 (precipitation) years.

List of acronyms

ENSEMBLES: Research project supported by the European Commission under the 6th Framework Programme 2002–2006, Priority Global Change and Ecosystems. The focus is on ensemble-based predictions of climate changes and their impacts. www.ensembles-eu.org

SNHT: Standard Normal Homogeneity Test

THOMAS: Tool for Homogenization of Monthly Data Series

VERAQC: Vienna Enhanced Resolution Analysis Quality Control

VERHOM: Homogenization procedure which combines VERAQC with a relative homogeneity test

Acknowledgments

This study was supported by the ENSEMBLES project (EU FP 6, contract GOCE-CT-2003-505539).

References

AGUILAR, E., I. AUER, T.C. PETERSON, J. WIERINGA, 2003: Guidelines on climate metadata and homogenization. – WCDMP-No. 53, WMO-TD No. 1186, World Meteorological Organization, Geneva.

ALEXANDERSSON, H., 1986: A homogeneity test applied to precipitation data. – J. Climatol. 6, 661–675.

AUER, I., R. BOEHM, A. JURKOVIC, A. ORLIK, R. POTZMANN, W. SCHONER, M. UNGERSBOCK, M. BRUNETTI, T. NANNI, M. MAUGERI, K. BRIFFA, P. JONES, D. EFTHYMIADIS, O. MESTRE, J.M. MOISSELIN, M. BEGERT, R. BRAZDIL, O. BOCHNICEK, T. CEGNAR, M. GAJIĆ-ČAPKA, K. ZANINOVIĆ, S. MAJSTOROVIĆ, S. SZALAI, T. SZENTIMREY, L. PERCALLI, 2005: A new instrumental precipitation dataset for the greater alpine region for the period 1800–2002. – Int. J. Climatol. 25, 139–166.

BEGERT, M., G. SEIZ, T. SCHLEGEL, M. MUSA, G. BAUDRAZ, M. MOESCH, 2003: Homogenisierung von Klimareihen der Schweiz und Bestimmung der Normwerte 1961–1990. – Schlussbericht des Projekts NORM90. Veröffentlichung der MeteoSchweiz 67, MeteoSchweiz, Zürich.

BEGERT, M., T. SCHLEGEL, W. KIRCHHOFER, 2005: Homogeneous temperature and precipitation series of Switzerland from 1864 to 2000. – Int. J. Climatol. 25, 65–80.

CONRAD, V., W. POLLAK, 1950: Methods in climatology. Second edition. – Harvard University Press, Cambridge, MA.

GONZALEZ-ROUCO, J.F., J.L. JIMENEZ, V. QUESADA, F. VALERO, 2001: Quality control and homogeneity of precipitation data in the southwest of Europe. – J. Climate 14, 964–978.

HÄBERLI, C., 2005: The Comprehensive Alpine Radiosonde Data set (CALRAS): Contribution to the diagnostic based on upper-air soundings 1957–1999. – Facultas, Wien, 2006, VIII, 297 pp. Wiener Meteorologische Schriften 4. Also: Univ. Wien, dissertation.

HÄBERLI, C., I. GROEHN, R. STEINACKER, W. POTTSCHACHER, M. DORNINGER, 2004: Performance of the surface observation network during MAP. – Meteorol. Z. 13, 109–121.

HANSSEN-BAUER, I., E.J. FØRLAND, 1994: Homogenizing long Norwegian precipitation series. – J. Climate 7, 1001–1013.

HEWITT, C.D., D.J. GRIGGS, 2004: Ensembles-Based Predictions of Climate Changes and their Impacts. – Eos Trans. AGU 85(52), 566, doi:10.1029/2004EO520005.

KLEIN TANK, A.M.G., J.B. WIJNGAARD, G.P. KONNEN, R. BOHM, G. DEMAREE, A. GOCHEVA, M. MILETA, S. PASHIARDIS, L. HEJKRLIK, C. KERN-HANSEN, R. HEINO, P. BESSEMOULIN, G. MULLER-WESTERMEIER, M. TZANAKOU, S. SZALAI, T. PALSDOTTIR, D. FITZGERALD, S. RUBIN, M. CAPALDO, M. MAUGERI, A. LEITASS, A. BUKANTIS, R. ABERFELD, A.F.V. VAN ENGELEN, E. FØRLAND, M. MIETUS, F. COELHO, C. MARES, V. RAZUVAEV, E. NIEPLOVA, T. CEGNAR, J. ANTONIO LOPEZ, B. DAHLSTROM, A. MOBERG, W. KIRCHHOFER, A. CEYLAN, O. PACHALIUK, L.V. ALEXANDER, P. PETROVIC, 2002: Daily dataset of 20th-century surface air temperature and precipitation series for the European Climate Assessment. – Int. J. Climatol. 22, 1441–1453.

KLOK, E.J., A.M.G. KLEIN TANK, submitted: Updated and extended European dataset of daily climate observations. – Int. J. Climatol.

KLOK, E.J., A.M.G. KLEIN TANK, M. BEGERT, E. ZENKLUSEN, 2006: Assessment of the available station density for the gridding and daily data quality/homogeneity. – ENSEMBLES project report D 5.8.

MENNE, M.J., C.N. WILLIAMS, 2005: Detection of undocumented changepoints using multiple test statistics and composite reference series. – J. Climate 18, 4271–4286.

PETERSON, T.C., D.R. EASTERLING, T.R. KARL, P. GROISMAN, I. AUER, R. BOHM, N. PLUMMER, N. NICHOLIS, S. TOROK, L. VINCENT, H. TUOMENVIRTA, J. SALINGER, E.J. FØRLAND, I. HANSSEN-BAUER, H. ALEXANDERSSON, P. JONES, D. PARKER, 1998: Homogeneity adjustments of in situ climate data: a review. – Int. J. Climatol. 18, 1493–1517.

PIELKE, R.A., C. DAVEY, J. ANGEL, O. BLISS, N. DOESKEN, M. CAI, S. FALL, D. NIYOGI, K. GALLO, R. HALE, K. HUBBARD, X. LIN, H. LI, J. NIELSEN-GAMMON, S. RAMAN, 2007: Documentation of bias associated with surface temperature measurement sites for climate change assessment. – Bull. Amer. Meteor. Soc. 88, 913–928.

REEVES, J., J. CHEN, X.L. WANG, R. LUND, Q.Q. LU, 2007: A review and comparison of changepoint detection techniques for climate data. – J. Appl. Meteor. Climatol. 46, 900–914.

SCHEIFINGER, H., R. BOHM, I. AUER, 2003: Räumliche Dekorrelation von Klimazeitreihen unterschiedlicher zeitlicher Auflösung und ihre Bedeutung für ihre Homogenisierbarkeit und die Repräsentativität von Ergebnissen. – 6. Deutsche Klimatagung, Klimavariabilität 2003, 22–25 September 2003, Potsdam. Terra Nostra, Schriftenreihe der Alfred-Wegener-Stiftung 2003/6, 375–379.

STEINACKER, R., C. HÄBERLI, W. POTTSCHACHER, 2000: A transparent method for the analysis and quality evaluation of irregularly distributed and noisy observational data. – Mon. Wea. Rev. 128, 2303–2316.

Page 5: An automated procedure to detect discontinuities; performance ... · An automated procedure to detect discontinuities; performance assessment and application to a large European

Meteorol Z 17 2008 M Begert et al An automated procedure to detect discontinuities 667

Figure 4 Sum of THOMAS inhomogeneities in the Swiss station

series detected by VERHOM applied on monthly values for the

period 1960ndash1997

like Switzerland However the comparison gives an ideaof the performance of VERHOM in the context of thehomogeneity assessment of the large ENSEMBLES dataset

In a first step the THOMAS results were used to findthe best setup for VERHOM whereas the THOMAS re-sults were regarded as the truth Hit rates and false alarmrates of both methods were compared in order to deter-mine the most suitable minimum segment length and theadequate significance level for VERHOM The choice ofthe minimum segment length turned out to be a compro-mise between a preferably high hit rate and an accept-able false alarm rate The false alarm rate in particularwas sensitive to the segment length The optimal lengthwas found to be 5 years independent of the variableThis result is in accordance with the recommendationfor the SNHT (ALEXANDERSSON 1986) As a conse-quence inhomogeneities closer than 5 years can not bedetected Figure 3 exemplarily shows the test results ofSNHT using 1 and 5 year minimum segment lengths

In a second step the results calculated automaticallywith the optimal setup of VERHOM were comparedto the findings of THOMAS 33 mean temperature 33minimum temperature 32 maximum temperature 32precipitation and 18 air pressure series were finally usedfor the comparison Figure 4 reveals the number of breakpoints per series detected by VERHOM compared tothe break points detected by THOMAS In addition thenumber of false alarms (significant break points of VER-HOM no THOMAS indication) and missed shift in-homogeneities (significant break points of THOMASno VERHOM indication) are shown Break points de-tected within one year by both methods are referred toas hits Overall the performance of VERHOM in detect-ing the THOMAS breaks is rather poor About half ofthe shifts detected by VERHOM must be considered asfalse alarms (a result only weakly depending on the vari-able examined) The rate of missed shifts is also highespecially for air pressure As mentioned above it mustbe kept in mind that this comparison is very strict andhas to be interpreted with care Missed break points can

be expected due to the implemented minimum segmentlength and the substantially sparser network density Thesurprisingly high number of missed breaks in air pres-sure series for instance can be explained by the fact thatSwiss air pressure series contain a high number of inho-mogeneities in the 1980s because of technical problemsVERHOM is not able to detect breaks that follow eachother within less than 5 years while there is no restric-tion in THOMAS because inhomogeneities are detectedaccording to test results and station history analysis Thehigh number of missed breaks in temperature series onthe other hand is due to small breaks that could notbe detected by VERHOM The sparser network densitycompared to THOMAS causes small shifts to disappearinto statistical noise Concerning the false alarms thereis more than one reason for the rather high rate Thesparser network density causing series from different cli-matological regions to be included in the reference seriesas well as inhomogeneities in series of neighbouring sta-tions might be responsible In addition SNHT tends tobreak up possible short trends in the deviation series intomany small steps (REEVES et al 2007) Although weintroduced a minimum segment length to be tested thistendency might still contribute to the high false alarmrate

Beside a comparison between ldquoshift datesrdquo a com-parison of ldquoshift dimensionsrdquo gives an idea of the per-formance of the current setup of VERHOM Table 1compares the number of small (lt 05

C) and large(ge 05

C) shifts detected by VERHOM and THOMASin Swiss temperature series (mean minimum and maxi-mum series) In general most of the large shifts are char-acterized as large by VERHOM whereas small shifts areonly partly identified as small ones False alarms andmissed breaks in particular are generally small althougha substantial number of large shifts is included

Finally note that the comparison of the two methodswas carried out using series from stations located in themountainous region of the Alps As spatial correlationis one of the most important factors for successful rel-ative homogeneity testing a better performance can beexpected for less mountainous regions ie most otherparts of Europe However a relatively high number offalse alarms and missed break points due to poorly cor-related neighbouring stations or inhomogeneities in ref-erence series must still be expected A comparison ofthe VERHOM results with findings for series in othercountries would be a valuable completion of the currentperformance assessment

Overall the new automated method VERHOM is ableto detect shift inhomogeneities and estimate their dimen-sions However some limitations exist

i Inhomogeneities closer than 5 years apart can not bedetected which leads to missed break points

ii Stations located at the boundary of the investigatedregion cannot be tested as the algorithm of VER-AQC fails if the number of surrounding stations is

668 M Begert et al An automated procedure to detect discontinuities Meteorol Z 17 2008

Table 2 Number of series with zero one two three and four or more break points and number of stations with no result from VERHOM

mean temp minimum temp maximum temp precipitation air pressure

homogeneous 245 (20 ) 168 (12 ) 231 (17 ) 1071 (59 ) 36 (12 )

1 break point 316 (26 ) 323 (23 ) 300 (22 ) 406 (23 ) 45 (15 )

2 break points 238 (19 ) 269 (20 ) 264 (19 ) 94 (5 ) 65 (21 )

3 break points 104 (8 ) 147 (11 ) 128 (9 ) 21 (1 ) 47 (15 )

4 or more break points 31 (3 ) 47 (3 ) 30 (2 ) 0 (0 ) 30 (10 )

undefined 299 (24 ) 423 (31 ) 420 (31 ) 209 (12 ) 83 (27 )

0(l) 1(l) 2(l) 3(l) gt4(l) undefined (l)

Figure 5 Precipitation 1960ndash2004 Number of breakpoints de-

tected

Table 3 Length of mean homogeneous sub-period per parameter

parameter mean homogenous sub period [y]

mean temperature 177

minimum temperature 160

maximum temperature 171

precipitation 290

air pressure 155

insufficient 10 to 20 of the series in the EN-SEMBLES data base are affected depending on thevariable

iii As consequence of limitation ii variations in thestation network can cause incomplete deviation se-ries close to the boundary if a neighbouring stationappears or disappears in time This can cause missedbreak points

Figure 6 As Figure 5 but for air pressure

5 Homogeneity assessment of theENSEMBLES data set withVERHOM for the period 1960ndash2004

The homogeneity of the ENSEMBLES data set wastested using the procedure VERHOM described abovePrecipitation air pressure mean temperature maximumtemperature and minimum temperature series were con-sidered

The results of the homogeneity assessment based onVERHOM are shown in Figures 5 to 9 Different coloursindicate the number of break points per series Stationsexcluded from the analysis due to incompleteness or dueto the edge problem are plotted in black The exact num-bers of detected break points and the mean lengths of ahomogeneous sub-period per parameter are given in Ta-bles 2 and 3

Overall it is found that air pressure series containthe highest number of break points (Figure 6 Table 2)This result agrees with findings from other studies (egBEGERT et al 2003) and has several reasons Firstair pressure is highly correlated in space resulting in a

Meteorol Z 17 2008 M Begert et al An automated procedure to detect discontinuities 669

Figure 7 As Figure 5 but for mean temperature

Figure 8 As Figure 5 but for maximum temperature

good performance of relative homogeneity test methodsSmaller shifts can be detected compared to other para-meters Second small changes in the measuring condi-tions such as a changing station height can lead to sub-stantial inhomogeneities Long homogeneous air pres-sure series are generally rare

For temperature series mean homogeneous sub-periods of around 20 years are present (Table 3) Min-imum temperature series contain the highest number ofshifts followed by mean and maximum temperature (Ta-ble 2) In addition the number of break points in the dif-ferent temperature series of a station often differs (Fig-ures 7ndash9) As seen in Swiss series the higher number ofinhomogeneities in minimum temperature series mightbe a result of the fact that minimum temperature mea-

Figure 9 As Figure 5 but for minimum temperature

surements are more sensitive to disturbances in the mea-suring conditions (eg relocations) than the other tem-perature variables

Precipitation series contain less inhomogeneities thanthe other parameters (Figure 5 to 9 Table 2) Precipi-tation also shows the longest mean homogeneous sub-periods (Table 3) Again this result is a combinationof the fact that precipitation series are less frequentlysubject to changes in the measuring conditions and thelimited efficiency of relative homogeneity tests due toa higher spatial and temporal variability of the variableFigures 10 and 11 show the frequency distribution of theshift dimensions in air pressure and temperature seriesof the ENSEMBLES data set Most of the shifts in airpressure series (around 80 ) lie between ndash1 and 1 hPaSingle shifts of up to ndash10 hPa and 5 hPa respectively oc-curred For temperature the shifts vary between ndash3 and3C Large shifts rarely occurred but most of the tem-perature shifts lie between ndash1 and 1C A broader distri-bution can be observed for minimum temperature thanfor maximum and mean temperature Shift dimensionsas well as found differences in distributions agree withfindings for the Swiss temperature series in the period1961 to 1997 (BEGERT et al 2003)

6 Summary and conclusions

With VERHOM, an automated relative homogenization procedure has been developed, tested and used for the homogeneity assessment of a large European data set collated in the framework of the EU-project ENSEMBLES. VERHOM combines VERAQC (Vienna Enhanced Resolution Analysis Quality Control) output with Alexandersson's standard normal homogeneity test. VERAQC uses a modified thin-plate spline interpolation to calculate a reference value for a station to be tested. The resulting differences between measurements and calculated values can be analysed as time series for a single station and used as input for relative homogeneity tests. VERHOM is designed to detect shifts in mean in climatological time series.

Figure 10: Left: frequency distribution of shift dimensions for air pressure in classes of 1 hPa. Right: frequency distribution of shift dimensions for mean temperature in classes of 0.5 °C.

Figure 11: Left: frequency distribution of shift dimensions for maximum temperature in classes of 0.5 °C. Right: frequency distribution of shift dimensions for minimum temperature in classes of 0.5 °C.
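The single-shift form of the standard normal homogeneity test can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the VERHOM implementation: the function name `snht_statistic`, the use of the sample standard deviation for standardization and the synthetic deviation series are all introduced here. For every candidate break position k, the statistic T(k) = k * mean(z[:k])^2 + (n - k) * mean(z[k:])^2 is evaluated on the standardized series z. The largest value would then be compared with a critical value that depends on series length and significance level, which has to be taken from published tables and is deliberately not hard-coded.

```python
from statistics import mean, stdev


def snht_statistic(series):
    """Return (t_max, k) of the single-shift SNHT for `series`.

    k is the number of values before the most likely break point.
    """
    n = len(series)
    if n < 3:
        raise ValueError("series is too short")
    mu, sigma = mean(series), stdev(series)
    if sigma == 0:
        raise ValueError("series is constant")
    # Standardize the series (assumption: sample standard deviation).
    z = [(x - mu) / sigma for x in series]
    best_t, best_k = 0.0, 0
    for k in range(1, n):
        t = k * mean(z[:k]) ** 2 + (n - k) * mean(z[k:]) ** 2
        if t > best_t:
            best_t, best_k = t, k
    return best_t, best_k


# Synthetic deviation series (invented): a shift of 1 unit after 20 values.
deviations = [0.1, -0.1] * 10 + [1.1, 0.9] * 10
t_max, k = snht_statistic(deviations)
```

For this synthetic series the largest statistic falls at k = 20, the position of the built-in step. One possible way to look for multiple break points is to split the series at a significant break and test both parts again; whether VERHOM iterates in exactly this way is not specified in this section.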

To assess the performance of the automated procedure, the findings were compared to results of a carefully homogenized data set in the Swiss Alpine region. The latter is based on the relative homogenization procedure THOMAS, combining statistical methods with an in-depth analysis of the station history information. Overall, the proposed new method VERHOM is able to detect shift inhomogeneities in temperature, precipitation and air pressure series, although the number of false alarms as well as missed break points is comparatively high. Depending on the variable, 45 % to 60 % of the indicated shifts were false alarms and only 40 % to 55 % of the inhomogeneities found by THOMAS were detected. Since VERHOM was tested in a region with complex topography, a better performance can be expected for most parts of Europe outside the Alpine region. The high false alarm rate for small shifts might also be due to inhomogeneities in the reference series built from surrounding stations. Addressing this problem would require additional strategies (e.g. GONZALEZ-ROUCO et al., 2001 or MENNE and WILLIAMS, 2005). About 10–20 % of all series could not be tested because VERHOM is not able to process stations along the edge of the region covered by the observations. The assessment of the data homogeneity of the ENSEMBLES data set for the period 1960–2004 revealed that 59 % of the precipitation, 20 % of the mean temperature, 17 % of the maximum temperature and 12 % of both the minimum temperature and the air pressure series were homogeneous. By far the highest number of shift inhomogeneities was detected in air pressure series, followed by temperature and precipitation. The mean homogeneous sub-periods vary between 15.5 (air pressure) and 29 (precipitation) years.
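The two performance figures quoted above, the share of indicated shifts that were false alarms and the share of reference inhomogeneities that were detected, can be computed by matching detected break dates against reference break dates. The Python sketch below shows one way to do this. The function name `score_detection`, the matching tolerance of one year and the example years are assumptions made for illustration; the matching rule actually used in the comparison with THOMAS is not given in this section.

```python
def score_detection(detected, reference, tolerance=1):
    """Return (false_alarm_share, detection_share) for two lists of break years.

    A detected break counts as a hit if an unmatched reference break lies
    within `tolerance` years; each reference break is matched at most once.
    """
    unmatched = sorted(reference)
    hits = 0
    for year in sorted(detected):
        match = next((r for r in unmatched if abs(r - year) <= tolerance), None)
        if match is not None:
            unmatched.remove(match)
            hits += 1
    false_alarms = len(detected) - hits
    false_alarm_share = false_alarms / len(detected) if detected else 0.0
    detection_share = hits / len(reference) if reference else 0.0
    return false_alarm_share, detection_share


# Invented break years for a single series, for illustration only.
detected_breaks = [1965, 1978, 1990, 1999]
reference_breaks = [1966, 1984, 1990, 1995]
fa_share, det_share = score_detection(detected_breaks, reference_breaks)
```

In this invented example two of the four indicated breaks match a reference break, so both shares are 0.5. A wider tolerance would raise the detection share at the cost of accepting less precise break dates.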

List of acronyms

ENSEMBLES: Research project supported by the European Commission under the 6th Framework Programme 2002–2006, Priority Global Change and Ecosystems. The focus is on ensemble-based predictions of climate changes and their impacts. www.ensembles-eu.org

SNHT: Standard Normal Homogeneity Test

THOMAS: Tool for Homogenization of Monthly Data Series

VERAQC: Vienna Enhanced Resolution Analysis Quality Control

VERHOM: Homogenization procedure which combines VERAQC with a relative homogeneity test

Acknowledgments

This study was supported by the ENSEMBLES project (EU FP 6, contract GOCE-CT-2003-505539).

References

AGUILAR, E., I. AUER, T.C. PETERSON, J. WIERINGA, 2003: Guidelines on climate metadata and homogenization. – WCDMP–No. 53, WMO–TD No. 1186, World Meteorological Organization, Geneva.

ALEXANDERSSON, H., 1986: A homogeneity test applied to precipitation data. – J. Climatol. 6, 661–675.

AUER, I., R. BOEHM, A. JURKOVIC, A. ORLIK, R. POTZMANN, W. SCHONER, M. UNGERSBOCK, M. BRUNETTI, T. NANNI, M. MAUGERI, K. BRIFFA, P. JONES, D. EFTHYMIADIS, O. MESTRE, J.M. MOISSELIN, M. BEGERT, R. BRAZDIL, O. BOCHNICEK, T. CEGNAR, M. GAJIĆ-ČAPKA, K. ZANINOVIĆ, S. MAJSTOROVIĆ, S. SZALAI, T. SZENTIMREY, L. PERCALLI, 2005: A new instrumental precipitation dataset for the greater alpine region for the period 1800–2002. – Int. J. Climatol. 25, 139–166.

BEGERT, M., G. SEIZ, T. SCHLEGEL, M. MUSA, G. BAUDRAZ, M. MOESCH, 2003: Homogenisierung von Klimareihen der Schweiz und Bestimmung der Normwerte 1961–1990. – Schlussbericht des Projekts NORM90, Veröffentlichung der MeteoSchweiz 67, MeteoSchweiz, Zürich.

BEGERT, M., T. SCHLEGEL, W. KIRCHHOFER, 2005: Homogeneous temperature and precipitation series of Switzerland from 1864 to 2000. – Int. J. Climatol. 25, 65–80.

CONRAD, V., W. POLLAK, 1950: Methods in climatology. Second edition. – Harvard University Press, Cambridge, MA.

GONZALEZ-ROUCO, J.F., J.L. JIMENEZ, V. QUESADA, F. VALERO, 2001: Quality control and homogeneity of precipitation data in the southwest of Europe. – J. Climate 14, 964–978.

HÄBERLI, C., 2005: The Comprehensive Alpine Radiosonde Data set (CALRAS): Contribution to the diagnostic based on upper-air soundings 1957–1999. – Wien: Facultas, 2006. – VIII, 297 S. Wiener Meteorologische Schriften 4. Zugl.: Wien, Univ., Dissertation.

HÄBERLI, C., I. GROEHN, R. STEINACKER, W. POTTSCHACHER, M. DORNINGER, 2004: Performance of the surface observation network during MAP. – Meteorol. Z. 13, 109–121.

HANSSEN-BAUER, I., E.J. FORLAND, 1994: Homogenizing long Norwegian precipitation series. – J. Climate 7, 1001–1013.

HEWITT, C.D., D.J. GRIGGS, 2004: Ensembles-Based Predictions of Climate Changes and their Impacts. – Eos Trans. AGU 85(52), 566, 10.1029/2004EO520005.

KLEIN TANK, A.M.G., J.B. WIJNGAARD, G.P. KONNEN, R. BOHM, G. DEMAREE, A. GOCHEVA, M. MILETA, S. PASHIARDIS, L. HEJKRLIK, C. KERN-HANSEN, R. HEINO, P. BESSEMOULIN, G. MULLER-WESTERMEIER, M. TZANAKOU, S. SZALAI, T. PALSDOTTIR, D. FITZGERALD, S. RUBIN, M. CAPALDO, M. MAUGERI, A. LEITASS, A. BUKANTIS, R. ABERFELD, A.F.V. VAN ENGELEN, E. FORLAND, M. MIETUS, F. COELHO, C. MARES, V. RAZUVAEV, E. NIEPLOVA, T. CEGNAR, J. ANTONIO LOPEZ, B. DAHLSTROM, A. MOBERG, W. KIRCHHOFER, A. CEYLAN, O. PACHALIUK, L.V. ALEXANDER, P. PETROVIC, 2002: Daily dataset of 20th-century surface air temperature and precipitation series for the European Climate Assessment. – Int. J. Climatol. 22, 1441–1453.

KLOK, E.J., A.M.G. KLEIN TANK, submitted: Updated and extended European dataset of daily climate observations. – Int. J. Climatol.

KLOK, E.J., A.M.G. KLEIN TANK, M. BEGERT, E. ZENKLUSEN, 2006: Assessment of the available station density for the gridding and daily data quality/homogeneity. – ENSEMBLES project report D 5.8.

MENNE, M.J., C.N. WILLIAMS, 2005: Detection of undocumented changepoints using multiple test statistics and composite reference series. – J. Climate 18, 4271–4286.

PETERSON, T.C., D.R. EASTERLING, T.R. KARL, P. GROISMAN, I. AUER, R. BOHM, N. PLUMMER, N. NICHOLIS, S. TOROK, L. VINCENT, H. TUOMENVIRTA, J. SALINGER, E.J. FØRLAND, I. HANSSEN-BAUER, H. ALEXANDERSSON, P. JONES, D. PARKER, 1998: Homogeneity adjustments of in situ climate data: a review. – Int. J. Climatol. 18, 1493–1517.

PIELKE, R.A., C. DAVEY, J. ANGEL, O. BLISS, N. DOESKEN, M. CAI, S. FALL, D. NIYOGI, K. GALLO, R. HALE, K. HUBBARD, X. LIN, H. LI, J. NIELSEN-GAMMON, S. RAMAN, 2007: Documentation of bias associated with surface temperature measurement sites for climate change assessment. – Bull. Amer. Meteor. Soc. 88, 913–928.

REEVES, J., J. CHEN, X.L. WANG, R. LUND, Q.Q. LU, 2007: A review and comparison of changepoint detection techniques for climate data. – J. Appl. Meteor. Climatol. 46, 900–914.

SCHEIFINGER, H., R. BOHM, I. AUER, 2003: Räumliche Dekorrelation von Klimazeitreihen unterschiedlicher zeitlicher Auflösung und ihre Bedeutung für ihre Homogenisierbarkeit und die Repräsentativität von Ergebnissen. – 6. Deutsche Klimatagung, Klimavariabilität 2003, 22–25 September 2003, Potsdam. Terra Nostra, Schriftenreihe der Alfred-Wegener-Stiftung 2003/6, 375–379.

STEINACKER, R., C. HÄBERLI, W. POTTSCHACHER, 2000: A transparent method for the analysis and quality evaluation of irregularly distributed and noisy observational data. – Mon. Wea. Rev. 128, 2303–2316.

Table 2: Number of series with zero, one, two, three, and four or more break points, and number of stations with no result from VERHOM.

                         mean temp.   minimum temp.   maximum temp.   precipitation   air pressure
homogeneous              245 (20 %)   168 (12 %)      231 (17 %)      1071 (59 %)     36 (12 %)
1 break point            316 (26 %)   323 (23 %)      300 (22 %)      406 (23 %)      45 (15 %)
2 break points           238 (19 %)   269 (20 %)      264 (19 %)      94 (5 %)        65 (21 %)
3 break points           104 (8 %)    147 (11 %)      128 (9 %)       21 (1 %)        47 (15 %)
4 or more break points   31 (3 %)     47 (3 %)        30 (2 %)        0 (0 %)         30 (10 %)
undefined                299 (24 %)   423 (31 %)      420 (31 %)      209 (12 %)      83 (27 %)
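As a quick consistency check on Table 2, the percentages can be recomputed from the counts. The snippet below is a small sketch. The dictionary layout and the helper name `shares` are chosen here for illustration, and each column is normalized by its own total including the undefined series. That normalization is an assumption about how the table was built, although the rounded results agree with the printed values.

```python
# Counts per column of Table 2, in row order:
# homogeneous, 1, 2, 3, 4 or more break points, undefined.
counts = {
    "mean temp": [245, 316, 238, 104, 31, 299],
    "minimum temp": [168, 323, 269, 147, 47, 423],
    "maximum temp": [231, 300, 264, 128, 30, 420],
    "precipitation": [1071, 406, 94, 21, 0, 209],
    "air pressure": [36, 45, 65, 47, 30, 83],
}


def shares(column):
    """Rounded percentage of each entry relative to the column total."""
    total = sum(column)
    return [round(100 * value / total) for value in column]


percentages = {name: shares(column) for name, column in counts.items()}
totals = {name: sum(column) for name, column in counts.items()}
```

The column totals obtained this way (for example 1801 precipitation series and 306 air pressure series) also show how much the number of available series differs between variables.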

Figure 5: Precipitation 1960–2004. Number of break points detected (legend classes: 0, 1, 2, 3, >4, undefined).

Table 3: Length of mean homogeneous sub-period per parameter.

parameter             mean homogeneous sub-period [y]
mean temperature      17.7
minimum temperature   16.0
maximum temperature   17.1
precipitation         29.0
air pressure          15.5

insufficient. 10 % to 20 % of the series in the ENSEMBLES data base are affected, depending on the variable.

iii. As a consequence of limitation ii, variations in the station network can cause incomplete deviation series close to the boundary if a neighbouring station appears or disappears in time. This can cause missed break points.

Figure 6: As Figure 5, but for air pressure.

5 Homogeneity assessment of the ENSEMBLES data set with VERHOM for the period 1960–2004

The homogeneity of the ENSEMBLES data set was tested using the procedure VERHOM described above. Precipitation, air pressure, mean temperature, maximum temperature and minimum temperature series were considered.

The results of the homogeneity assessment based on VERHOM are shown in Figures 5 to 9. Different colours indicate the number of break points per series. Stations excluded from the analysis due to incompleteness or due to the edge problem are plotted in black. The exact numbers of detected break points and the mean lengths of a homogeneous sub-period per parameter are given in Tables 2 and 3.

Overall, it is found that air pressure series contain the highest number of break points (Figure 6, Table 2). This result agrees with findings from other studies (e.g. BEGERT et al., 2003) and has several reasons. First, air pressure is highly correlated in space, resulting in a good performance of relative homogeneity test methods. Smaller shifts can be detected compared to other parameters. Second, small changes in the measuring conditions, such as a changing station height, can lead to substantial inhomogeneities. Long homogeneous air pressure series are generally rare.

Figure 7: As Figure 5, but for mean temperature.

Figure 8: As Figure 5, but for maximum temperature.

Page 8: An automated procedure to detect discontinuities; performance ... · An automated procedure to detect discontinuities; performance assessment and application to a large European

670 M Begert et al An automated procedure to detect discontinuities Meteorol Z 17 2008

Figure 10: Left: frequency distribution of shift dimensions for air pressure in classes of 1 hPa. Right: frequency distribution of shift dimensions for mean temperature in classes of 0.5 °C.

Figure 11: Left: frequency distribution of shift dimensions for maximum temperature in classes of 0.5 °C. Right: frequency distribution of shift dimensions for minimum temperature in classes of 0.5 °C.

tested. The resulting differences between measurements and calculated values can be analysed as time series for a single station and used as input for relative homogeneity tests. VERHOM is designed to detect shifts in the mean of climatological time series.
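As an illustration of the relative homogeneity test applied to such a deviation series, the following is a minimal sketch of the single-shift form of Alexandersson's Standard Normal Homogeneity Test. The function name `snht_statistic`, the use of the sample standard deviation, and the synthetic input are assumptions made for this example; it is not the VERHOM implementation, and the critical values needed to judge significance are not included.

```python
import numpy as np

def snht_statistic(deviations):
    """Return (T0, k) for a single-shift SNHT on a deviation series.

    T(k) = k * mean(z[:k])**2 + (n - k) * mean(z[k:])**2, where z is the
    standardized series. T0 is the maximum of T(k) and k is the number of
    values before the most likely break.
    """
    q = np.asarray(deviations, dtype=float)
    n = q.size
    # Standardize; the sample standard deviation (ddof=1) is an assumption.
    z = (q - q.mean()) / q.std(ddof=1)
    k = np.arange(1, n)                 # candidate break positions 1..n-1
    csum = np.cumsum(z)[:-1]            # sum of the first k standardized values
    mean_before = csum / k
    mean_after = (z.sum() - csum) / (n - k)
    t = k * mean_before**2 + (n - k) * mean_after**2
    best = int(np.argmax(t))
    return float(t[best]), int(k[best])

# Synthetic deviation series: a shift of 1 unit after 20 values plus small noise.
series = np.concatenate([np.zeros(20), np.ones(20)]) + 0.1 * (-1.0) ** np.arange(40)
t0, break_index = snht_statistic(series)
print(t0, break_index)
```

To find multiple break points, a test of this kind would be applied again to the sub-series on either side of a significant break, which is one way to realise the iterative procedure described in the abstract.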

To assess the performance of the automated procedure, the findings were compared to results of a carefully homogenized data set in the Swiss Alpine region. The latter is based on the relative homogenization procedure THOMAS, combining statistical methods with an in-depth analysis of the station history information. Overall, the proposed new method VERHOM is able to detect shift inhomogeneities in temperature, precipitation and air pressure series, although the number of false alarms as well as missed break points is comparatively high. Depending on the variable, 45 % to 60 % of the indicated shifts were false alarms and only 40 % to 55 % of the inhomogeneities found by THOMAS were detected. Since VERHOM was tested in a region with complex topography, a better performance can be expected for most parts of Europe outside the Alpine region. The high false alarm rate for small shifts might also be due to inhomogeneities in the reference series built from surrounding stations. Addressing this problem would require additional strategies (e.g. GONZALEZ-ROUCO et al., 2001 or MENNE and WILLIAMS, 2005). About 10–20 % of all series could not be tested because VERHOM is not able to process stations along the edge of the region covered by the observations. The assessment of the data homogeneity of the ENSEMBLES data set for the period 1960–2004 revealed that 59 % of the precipitation, 20 % of the mean temperature, 17 % of the maximum temperature, and 12 % of the minimum temperature and air pressure series were homogeneous. By far the highest number of shift inhomogeneities was detected in air pressure series, followed by temperature and precipitation. The mean homogeneous sub-periods vary between 15.5 (air pressure) and 29 (precipitation) years.
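The two performance figures quoted above (share of false alarms among indicated shifts, and share of reference inhomogeneities that were detected) can be computed as in the following sketch. The function name `score_breaks`, the matching tolerance of one year, the greedy matching rule and the example break years are all assumptions for illustration; the matching criterion actually used in the comparison with THOMAS is not restated in this section.

```python
def score_breaks(detected, reference, tolerance=1):
    """Return (false_alarm_ratio, detection_rate) for two lists of break years.

    A detected break counts as a hit if an unmatched reference break lies
    within `tolerance` years of it (greedy matching, an assumption).
    """
    unmatched = list(reference)
    hits = 0
    for year in sorted(detected):
        match = next((r for r in unmatched if abs(r - year) <= tolerance), None)
        if match is not None:
            unmatched.remove(match)   # each reference break is matched once
            hits += 1
    false_alarm_ratio = (len(detected) - hits) / len(detected) if detected else 0.0
    detection_rate = hits / len(reference) if reference else 0.0
    return false_alarm_ratio, detection_rate

# Hypothetical break years for one station (not taken from the paper).
automated = [1965, 1972, 1980, 1991]
manual = [1964, 1980, 1985, 1999, 2001]
print(score_breaks(automated, manual))  # (0.5, 0.4)
```

In this made-up case half of the indicated shifts are false alarms and 40 % of the reference breaks are found, which is the kind of outcome summarised by the percentages in the text.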

List of acronyms

ENSEMBLES: Research project supported by the European Commission under the 6th Framework Programme 2002–2006, Priority Global Change and Ecosystems. The focus is on ensemble-based predictions of climate changes and their impacts (www.ensembles-eu.org).

SNHT: Standard Normal Homogeneity Test

THOMAS: Tool for Homogenization of Monthly Data Series

VERAQC: Vienna Enhanced Resolution Analysis Quality Control

VERHOM: Homogenization procedure which combines VERAQC with a relative homogeneity test

Acknowledgments

This study was supported by the ENSEMBLES project (EU FP 6 contract GOCE-CT-2003-505539).

References

AGUILAR, E., I. AUER, T.C. PETERSON, J. WIERINGA, 2003: Guidelines on climate metadata and homogenization. – WCDMP-No. 53, WMO-TD No. 1186, World Meteorological Organization, Geneva.

ALEXANDERSSON, H., 1986: A homogeneity test applied to precipitation data. – J. Climatol. 6, 661–675.

AUER, I., R. BOEHM, A. JURKOVIC, A. ORLIK, R. POTZMANN, W. SCHÖNER, M. UNGERSBÖCK, M. BRUNETTI, T. NANNI, M. MAUGERI, K. BRIFFA, P. JONES, D. EFTHYMIADIS, O. MESTRE, J.M. MOISSELIN, M. BEGERT, R. BRAZDIL, O. BOCHNICEK, T. CEGNAR, M. GAJIĆ-ČAPKA, K. ZANINOVIĆ, S. MAJSTOROVIĆ, S. SZALAI, T. SZENTIMREY, L. PERCALLI, 2005: A new instrumental precipitation dataset for the greater alpine region for the period 1800–2002. – Int. J. Climatol. 25, 139–166.

BEGERT, M., G. SEIZ, T. SCHLEGEL, M. MUSA, G. BAUDRAZ, M. MOESCH, 2003: Homogenisierung von Klimareihen der Schweiz und Bestimmung der Normwerte 1961–1990. – Schlussbericht des Projekts NORM90. Veröffentlichung der MeteoSchweiz 67, MeteoSchweiz, Zürich.

BEGERT, M., T. SCHLEGEL, W. KIRCHHOFER, 2005: Homogeneous temperature and precipitation series of Switzerland from 1864 to 2000. – Int. J. Climatol. 25, 65–80.

CONRAD, V., W. POLLAK, 1950: Methods in climatology. Second edition. – Harvard University Press, Cambridge, MA.

GONZALEZ-ROUCO, J.F., J.L. JIMENEZ, V. QUESADA, F. VALERO, 2001: Quality control and homogeneity of precipitation data in the southwest of Europe. – J. Climate 14, 964–978.

HÄBERLI, C., 2005: The Comprehensive Alpine Radiosonde Data set (CALRAS): Contribution to the diagnostic based on upper-air soundings 1957–1999. – Wien: Facultas, 2006. – VIII, 297 S. Wiener Meteorologische Schriften 4. Zugl.: Wien, Univ., Dissertation.

HÄBERLI, C., I. GROEHN, R. STEINACKER, W. PÖTTSCHACHER, M. DORNINGER, 2004: Performance of the surface observation network during MAP. – Meteorol. Z. 13, 109–121.

HANSSEN-BAUER, I., E.J. FØRLAND, 1994: Homogenizing long Norwegian precipitation series. – J. Climate 7, 1001–1013.

HEWITT, C.D., D.J. GRIGGS, 2004: Ensembles-Based Predictions of Climate Changes and their Impacts. – Eos Trans. AGU 85(52), 566, DOI: 10.1029/2004EO520005.

KLEIN TANK, A.M.G., J.B. WIJNGAARD, G.P. KÖNNEN, R. BÖHM, G. DEMARÉE, A. GOCHEVA, M. MILETA, S. PASHIARDIS, L. HEJKRLIK, C. KERN-HANSEN, R. HEINO, P. BESSEMOULIN, G. MÜLLER-WESTERMEIER, M. TZANAKOU, S. SZALAI, T. PÁLSDÓTTIR, D. FITZGERALD, S. RUBIN, M. CAPALDO, M. MAUGERI, A. LEITASS, A. BUKANTIS, R. ABERFELD, A.F.V. VAN ENGELEN, E. FØRLAND, M. MIETUS, F. COELHO, C. MARES, V. RAZUVAEV, E. NIEPLOVA, T. CEGNAR, J. ANTONIO LÓPEZ, B. DAHLSTRÖM, A. MOBERG, W. KIRCHHOFER, A. CEYLAN, O. PACHALIUK, L.V. ALEXANDER, P. PETROVIC, 2002: Daily dataset of 20th-century surface air temperature and precipitation series for the European Climate Assessment. – Int. J. Climatol. 22, 1441–1453.

KLOK, E.J., A.M.G. KLEIN TANK, submitted: Updated and extended European dataset of daily climate observations. – Int. J. Climatol.

KLOK, E.J., A.M.G. KLEIN TANK, M. BEGERT, E. ZENKLUSEN, 2006: Assessment of the available station density for the gridding and daily data quality/homogeneity. – ENSEMBLES project report D 5.8.

MENNE, M.J., C.N. WILLIAMS, 2005: Detection of undocumented changepoints using multiple test statistics and composite reference series. – J. Climate 18, 4271–4286.

PETERSON, T.C., D.R. EASTERLING, T.R. KARL, P. GROISMAN, I. AUER, R. BÖHM, N. PLUMMER, N. NICHOLIS, S. TOROK, L. VINCENT, H. TUOMENVIRTA, J. SALINGER, E.J. FØRLAND, I. HANSSEN-BAUER, H. ALEXANDERSSON, P. JONES, D. PARKER, 1998: Homogeneity adjustments of in situ climate data: a review. – Int. J. Climatol. 18, 1493–1517.

PIELKE, R.A., C. DAVEY, J. ANGEL, O. BLISS, N. DOESKEN, M. CAI, S. FALL, D. NIYOGI, K. GALLO, R. HALE, K. HUBBARD, X. LIN, H. LI, J. NIELSEN-GAMMON, S. RAMAN, 2007: Documentation of bias associated with surface temperature measurement sites for climate change assessment. – Bull. Amer. Meteor. Soc. 88, 913–928.


REEVES, J., J. CHEN, X.L. WANG, R. LUND, Q.Q. LU, 2007: A review and comparison of changepoint detection techniques for climate data. – J. Appl. Meteor. Climatol. 46, 900–914.

SCHEIFINGER, H., R. BÖHM, I. AUER, 2003: Räumliche Dekorrelation von Klimazeitreihen unterschiedlicher zeitlicher Auflösung und ihre Bedeutung für ihre Homogenisierbarkeit und die Repräsentativität von Ergebnissen. – 6. Deutsche Klimatagung, Klimavariabilität 2003, 22–25 September 2003, Potsdam. Terra Nostra, Schriftenreihe der Alfred-Wegener-Stiftung 2003/6, 375–379.

STEINACKER, R., C. HÄBERLI, W. PÖTTSCHACHER, 2000: A transparent method for the analysis and quality evaluation of irregularly distributed and noisy observational data. – Mon. Wea. Rev. 128, 2303–2316.
