Quantifying efficiency of homogenisation methods
Dr. Peter Domonkos
dopeter@t-online.hu
COST HOME ES0601
Measuring efficiency: our expectations
• Recovering the real climatic trends,
• Recovering the real trends and fluctuations,
• Identifying large inhomogeneity-shifts one by one,
• Identifying as many shifts as we can
Measuring efficiency: general practice
• Usually the rate of correct detection is examined (Ducré-Robitaille, Mestre, Menne and Williams, etc.).
• Menne and Williams (2005) apply the hit rate (or power, H), the false detection rate (F), the false alarm rate (FAR), the bias of detection frequency (B), and the improvement in skill compared to random forecasts (HSS):
$$H = \frac{\text{correctly detected shifts}}{\text{all factual shifts}}$$
$$F = \frac{\text{false detected shifts}}{\text{number of "no shift" events}}$$
$$FAR = \frac{\text{false detected shifts}}{\text{all detected shifts}}$$
$$B = \frac{\text{all detected shifts}}{\text{all factual shifts}}$$
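The four ratios above can be computed from a 2x2 contingency table of detection outcomes. The following is a minimal sketch, not code from the presentation: the function name and argument names are hypothetical, and HSS is assumed to be the Heidke skill score in its standard 2x2 form.

```python
def detection_scores(hits, misses, false_alarms, correct_negatives):
    """Return H, F, FAR, B and HSS for a 2x2 detection contingency table.

    hits              = correctly detected shifts
    misses            = factual shifts that were not detected
    false_alarms      = false detected shifts
    correct_negatives = "no shift" events with no detection
    """
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    h = a / (a + c)           # correctly detected / all factual shifts
    f = b / (b + d)           # false detected / number of "no shift" events
    far = b / (a + b)         # false detected / all detected shifts
    bias = (a + b) / (a + c)  # all detected / all factual shifts
    # Assumption: HSS is the Heidke skill score (skill relative to chance).
    hss = 2 * (a * d - b * c) / ((a + c) * (c + d) + (a + b) * (b + d))
    return {"H": h, "F": f, "FAR": far, "B": bias, "HSS": hss}


if __name__ == "__main__":
    # Illustrative counts only, not results from the presentation.
    print(detection_scores(hits=40, misses=10, false_alarms=20,
                           correct_negatives=930))
```

Note that F and FAR share a numerator but differ in the denominator, so a method can have a small F and still a large FAR when factual shifts are rare.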
Measuring efficiency: this presentation
• Arbitrary, but reasonable choices
• Unit of shift magnitude (M): 1 = standard deviation of the estimated noise.
• Factual shift: a shift with magnitude M ≥ M0 between two adjacent 3-year-long periods. M0 = 2 or M0 = 3 here.
• Right detection: a shift with M ≥ 1.5 for M0 = 2 (M ≥ 2 for M0 = 3) is detected with a timing error of at most 1 year.
• False detection: a shift with M ≥ 1.5 for M0 = 2 (M ≥ 2 for M0 = 3) is detected at year j, but there is no shift with M > 0 in the same direction as the detected one within the (j-2, j+2) period.
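The detection rules above can be sketched as a small counting routine. This is an illustration under stated assumptions, not the author's code: shifts are represented as hypothetical (year, signed size) pairs in noise-standard-deviation units, the lost comparison signs are read as "at least", the (j-2, j+2) window is taken as inclusive, and a right detection is additionally required to have the same direction as the factual shift.

```python
# Minimum detected magnitude that counts, keyed by the factual threshold M0.
MIN_DETECTED = {2.0: 1.5, 3.0: 2.0}


def same_direction(a, b):
    """True when both signed shift sizes point the same way."""
    return a * b > 0


def count_detections(true_shifts, detected, m0=2.0):
    """Return (k, D_R, D_F) for one time series.

    true_shifts, detected: lists of (year, signed_size) pairs.
    k   = number of factual shifts (|size| >= m0)
    D_R = factual shifts matched by a large enough detection within 1 year
    D_F = large enough detections with no same-direction true shift
          (of any size > 0) within 2 years
    """
    m_min = MIN_DETECTED[m0]
    big_detected = [(y, s) for y, s in detected if abs(s) >= m_min]
    factual = [(y, s) for y, s in true_shifts if abs(s) >= m0]

    d_r = sum(
        any(abs(y - yd) <= 1 and same_direction(s, sd)
            for yd, sd in big_detected)
        for y, s in factual
    )
    d_f = sum(
        not any(abs(y - yd) <= 2 and same_direction(s, sd)
                for y, s in true_shifts)
        for yd, sd in big_detected
    )
    return len(factual), d_r, d_f


if __name__ == "__main__":
    truth = [(30, 2.5), (60, -0.8), (80, -3.2)]
    found = [(31, 2.1), (45, 1.7), (60, -1.6), (80, 1.0)]
    print(count_detections(truth, found, m0=2.0))
```

In this example the detection at year 60 is not counted as false, because a small true shift of the same direction exists there, and the detection at year 80 is ignored because its size is below 1.5.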
Measuring efficiency: this presentation
• Let the number of time series be m, the total number of factual shifts k, the number of right detections D_R, and the number of false detections D_F. Then
$$H = \frac{D_R}{k}, \qquad F' = \frac{D_F}{k}$$
$$I_A = H - F' = \frac{D_R - D_F}{k}$$
$$I_B = \frac{D_R - D_F + m}{k + m}$$
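As a worked illustration of these scores, the sketch below assumes they take the form H = D_R/k, F' = D_F/k, I_A = H - F' and I_B = (D_R - D_F + m)/(k + m). The function name is hypothetical and the numbers are invented for the example. Under this reading, I_B remains defined when a dataset contains no factual shifts (k = 0), whereas H, F' and I_A do not.

```python
def identification_scores(k, d_r, d_f, m):
    """Return H, F', I_A and I_B from pooled detection counts.

    k   = total number of factual shifts
    d_r = number of right detections (D_R)
    d_f = number of false detections (D_F)
    m   = number of time series
    Scores that divide by k are returned as None when k == 0.
    """
    if k > 0:
        h = d_r / k
        f_prime = d_f / k
        i_a = h - f_prime
    else:
        h = f_prime = i_a = None
    # Assumed form of identification B; it stays finite for k == 0.
    i_b = (d_r - d_f + m) / (k + m)
    return {"H": h, "F'": f_prime, "I_A": i_a, "I_B": i_b}


if __name__ == "__main__":
    # Invented counts, for illustration only.
    print(identification_scores(k=200, d_r=150, d_f=30, m=100))
    # No factual shifts at all: only I_B can be evaluated.
    print(identification_scores(k=0, d_r=0, d_f=12, m=100))
```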
Measuring efficiency: this presentation
• What about the reliability of trends?
• Let the mean bias of trend slopes caused by inhomogeneities be t_0 before the homogenisation and t after the homogenisation. Then the improvement in trend reliability is indicated by
$$I_T = \frac{t_0 - t}{t_0}$$
• General (combined) efficiency (Domonkos, 5th Seminar, 2006)
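The trend-reliability improvement can be illustrated with a short calculation. This sketch assumes the score has the form I_T = (t0 - t)/t0, that "bias" means the mean absolute difference between the least-squares slope of a series and the slope of its error-free counterpart, and that slopes are taken per time step. All function names are hypothetical.

```python
def slope(values):
    """Ordinary least-squares slope of values against their index 0..n-1."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def mean_trend_bias(series_list, truth_list):
    """Mean absolute slope difference between series and their true versions."""
    diffs = [abs(slope(s) - slope(t)) for s, t in zip(series_list, truth_list)]
    return sum(diffs) / len(diffs)


def trend_improvement(t0, t):
    """I_T = (t0 - t) / t0: 1 means the bias is removed, 0 means no gain."""
    return (t0 - t) / t0


if __name__ == "__main__":
    truth = [[0.0] * 10]                       # flat, error-free series
    raw = [[0.0] * 5 + [1.0] * 5]              # one shift of size 1
    homogenised = [[0.0] * 5 + [0.2] * 5]      # 80% of the shift removed
    t0 = mean_trend_bias(raw, truth)
    t = mean_trend_bias(homogenised, truth)
    print(trend_improvement(t0, t))
```

A negative I_T would mean that homogenisation made the trend bias larger than it was before.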
Properties of time series
• Five versions of simulated datasets are examined here. Each dataset has 10,000 time series, each one hundred years long. The properties range from a single inhomogeneity per time series to the very complex inhomogeneity structures of the "Hungarian standard" (Domonkos, 5th Seminar, 2006).
• (1) 1 shift with M = 3;
(2) 1 shift with M = 3 and 4 shifts with M = 1.5;
(3) and (4) shifts with a frequency of 1 per decade, an exponential distribution of M above 1, and a uniform distribution of M below 1: (3) Mmax<2; (4) Mmax<3;
(5) Hungarian standard
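For orientation, a series of the kind used in datasets (1) and (2) can be generated as follows. This is a minimal sketch and not the simulation code behind the presentation: it assumes unit-variance Gaussian white noise, uniformly random shift years, and random shift directions, none of which are specified on the slide. The function name and its parameters are hypothetical.

```python
import random


def simulate_series(shift_sizes, length=100, rng=None):
    """Return (series, shifts) for one simulated time series.

    shift_sizes: shift magnitudes in noise-standard-deviation units.
    shifts: sorted list of (year, signed_size); a shift at year y changes
    the level from year y onwards.
    """
    rng = rng or random.Random()
    # Assumption: distinct shift years drawn uniformly, never at year 0.
    years = rng.sample(range(1, length), len(shift_sizes))
    # Assumption: each shift is upward or downward with equal probability.
    shifts = sorted((y, rng.choice((-1, 1)) * size)
                    for y, size in zip(years, shift_sizes))
    step_at = dict(shifts)

    series = []
    level = 0.0
    for year in range(length):
        level += step_at.get(year, 0.0)
        series.append(level + rng.gauss(0.0, 1.0))  # noise with std 1
    return series, shifts


if __name__ == "__main__":
    rng = random.Random(0)
    # Dataset (2): one shift with M = 3 and four shifts with M = 1.5.
    series, shifts = simulate_series([3.0, 1.5, 1.5, 1.5, 1.5], rng=rng)
    print(shifts)
```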
[Figure: Distribution of the difference (%) between the detected inhomogeneity properties of simulated and real climatic time series for the Hungarian standard. k: simple; w∙k: weighted with sample size.]
Homogenisation methods
• 15 objective homogenisation methods: two versions each of the Bayes test [Bay, Ba1], Buishand test [Bu1, Bu2], SNHT [SNH, SNT] and t-test [tt1, tt2]; Caussinus-Mestre test [C-M], Easterling-Peterson test [E-P], Mann-Kendall test [M-K], MASH [MAS], Multiple Linear Regression [MLR], Pettitt test [Pet] and Wilcoxon Rank Sum test [WRS].
Method parameterisation
• With the original parameterisations, the chance of detecting at least one inhomogeneity in pure white noise is ~5%.
• Minimum length of the subperiods used for calculating their own statistical properties: usually 5 years, but 1 year in C-M and MAS, and 3 years in E-P.
• Outliers are prefiltered.
• For multiple inhomogeneities, the semihierarchic algorithm of Moberg and Alexandersson (1997) is included in Bay, Ba1, Bu1, Bu2, M-K, MLR, Pet, SNH, SNT and WRS.
• In a few experiments an optimised parameterisation is applied (its use is indicated).
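The ~5% white-noise criterion can be illustrated with a Monte Carlo calibration. The sketch below is only an example of the idea, not the parameterisation used in the presentation: it uses the single-shift SNHT statistic T(k) = k·z̄1² + (n-k)·z̄2² as a stand-in test, applies a 5-year minimum subperiod length, and picks the threshold that is exceeded in about 5% of simulated white-noise series. Function names, the number of simulations, and the seed are hypothetical choices.

```python
import random
import statistics


def snht_max(series, min_len=5):
    """Maximum single-shift SNHT statistic over all allowed break positions."""
    n = len(series)
    mean = statistics.fmean(series)
    sd = statistics.stdev(series)
    z = [(v - mean) / sd for v in series]      # standardised series
    total = sum(z)
    running = 0.0
    best = 0.0
    for k in range(1, n):
        running += z[k - 1]                    # sum of the first k values
        if min_len <= k <= n - min_len:
            rest = total - running
            # k * mean(z1)^2 + (n - k) * mean(z2)^2
            t = running ** 2 / k + rest ** 2 / (n - k)
            best = max(best, t)
    return best


def calibrate_threshold(n_series=500, length=100, alpha=0.05, seed=1):
    """Threshold exceeded by roughly a fraction alpha of white-noise series."""
    rng = random.Random(seed)
    maxima = sorted(
        snht_max([rng.gauss(0.0, 1.0) for _ in range(length)])
        for _ in range(n_series)
    )
    index = min(n_series - 1, round((1 - alpha) * n_series))
    return maxima[index]


if __name__ == "__main__":
    threshold = calibrate_threshold()
    print(f"approximate 5% threshold: {threshold:.2f}")
```

A series would then be flagged as inhomogeneous when `snht_max(series)` exceeds the calibrated threshold. Lowering `alpha` reduces false detections at the cost of power.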
[Figure: Power (%) plotted against false rate (%). Red = C-M, blue = MASH, green = E-P, black = t-test (tt1), brown = SNHT for shifts, purple = MLR.]
[Figures: identification efficiency (%) by method. For each panel, the methods are listed in the order shown on the horizontal axis.]
Identification A, 1 shift (M=3): MLR Bay SNH Ba1 WRS Bu2 Bu1 tt2 E-P C-M MAS SNT Pet tt1 M-K
Identification A, 1 shift (M=3) + 4 small shifts: E-P tt2 Bay SNH Ba1 C-M MAS Bu1 WRS Bu2 Pet MLR SNT tt1 M-K
Identification A of M3, Exp. M<6: MAS E-P C-M Ba1 Bay SNH tt2 tt1 Bu2 Bu1 WRS MLR Pet SNT M-K
Identification A of M2, Hungarian standard: C-M MAS E-P Ba1 MLR Bay SNH SNT Bu2 Bu1 tt2 tt1 WRS Pet M-K
Identification A of M3, Hungarian standard: MAS C-M E-P Ba1 MLR Bay SNH SNT Bu1 tt1 Bu2 tt2 WRS Pet M-K
Identification B of M2, Exp. M<2: tt1 SNT E-P tt2 Bu1 Pet MLR SNH Bay WRS Bu2 Ba1 M-K MAS C-M
Identification B of M3, Exp. M<2: tt1 SNT Bu1 Pet E-P tt2 SNH Bu2 Bay WRS Ba1 MLR MAS M-K C-M
[Figure: Absence of large shifts. Number of kinds: 7; best: tt1, C-M, Bay. Categories: All breaks, Large breaks, General eff.]
[Figures: efficiency in reproducing trends (%). Panels: Trends, 1 shift (M=3) (filled columns = optimised parameters); Trends, 1 shift + 4 small shifts; Trends, Exp. M<2; Trends, Exp. M<6; Trends, Hungarian standard.]
[Figures: identification efficiency (%). Panels: Identification A, 1 shift; Identification A, 1 shift + 4 small shifts; Identification B of M2, Exp. M<2; Identification B of M3, Exp. M<2; Identification A of M2, Hungarian standard; Identification A of M3, Hungarian standard; Identification A of M3, Exp. M<6.]
Discussion
• MASH is best at identifying shifts of M>3, but it is not among the best methods at reproducing climatic trends. Parameter optimisation can reduce this drawback of MASH.
• C-M ranks at or near the top in many of the results, except when the rate of large inhomogeneities is very low. If sections shorter than 3 years are excluded from the evaluation and detection results with M<2 are not considered, all the possible disadvantages of C-M can be avoided, and its skill in detecting shifts of M>3 even exceeds that of MASH.
Conclusions
• The efficiency order of homogenisation methods strongly depends on the properties of the time series, on the purposes and priorities of the homogenisation, and on the way efficiency is evaluated.
• Direct methods for identifying multiple inhomogeneities (C-M and MASH) usually perform better than the other methods. When avoiding false detections is especially important, the t-test and E-P methods are also competitive.
• Parameter optimisation may yield improved results.
Thank you for your attention!