Detection of discontinuities using an approach based on regression models and application to benchmark temperature
by Lucie Vincent
Climate Research Branch, Science and Technologies Branch, Environment Canada
Presentation to the COST meeting in Tarragona, Spain
March 9-11, 2009

Outline
Methodology
- identification of changepoints in annual mean temperature
- adjustment of monthly and daily values
Testing methodology using simulated values
- homogeneous series, single step, random number of steps
Identification of biases in Canadian climate data
- bias in relative humidity due to a change in instruments
- bias in radiosonde temperature due to the introduction of a correction factor
- bias in daily minimum temperature due to a change in observing time
Application to benchmark temperature datasets
- monthly mean minimum temperature at Groix
Discontinuities in precipitation due to joining station observations

Methodology

Identification of changepoint in annual mean temperature
Let y be the candidate series and x the reference series (written x1 in the models)
Model 1: to identify a homogeneous series
$y = a_1 + c_1 x_1 + e$
Model 2: to identify a trend
$y = a_2 + b_2 i + c_2 x_1 + e$, i = 1, …, n
Model 3: to identify a step
$y = a_3 + b_3 I + c_3 x_1 + e$, i = 1, …, n
I = 0 for i = 5, …, p-1; I = 1 for i = p, …, n-5
Model 4: to identify a step with trends before and after
$y = a_4 + b_4 i I_1 + a_5 I_2 + b_5 i I_2 + c_4 x_1 + e$, i = 1, …, n
I1 = 1 and I2 = 0 for i = 5, …, p-1; I1 = 0 and I2 = 1 for i = p, …, n-5
[Figure: four panels accompanying Models 1 to 4, showing the difference between candidate and reference (-2.0 to 2.0) over positions 0 to 100.]
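A minimal sketch of how Models 1 and 3 could be fitted by ordinary least squares, scanning candidate positions p for the smallest SSE. The function names, the 0-based indexing, the `margin` parameter and the synthetic data are assumptions made for illustration; they are not taken from the presentation.

```python
import numpy as np

def fit_sse(design, y):
    """Least-squares fit; returns coefficients and the sum of squared errors."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return coef, float(resid @ resid)

def model1(y, x):
    """Model 1: y = a1 + c1*x1 + e."""
    design = np.column_stack([np.ones_like(x), x])
    return fit_sse(design, y)

def model3(y, x, p):
    """Model 3: y = a3 + b3*I + c3*x1 + e, with I = 1 from index p onward."""
    step = (np.arange(len(y)) >= p).astype(float)
    design = np.column_stack([np.ones_like(x), step, x])
    return fit_sse(design, y)

def best_step(y, x, margin=5):
    """Scan candidate positions and keep the one with the smallest SSE."""
    candidates = range(margin, len(y) - margin)  # assumed search range
    sse, p = min((model3(y, x, p)[1], p) for p in candidates)
    coef, _ = model3(y, x, p)
    return p, coef[1], sse  # position, step magnitude b3, SSE of Model 3

# Synthetic check: a 1.0 step at index 40 in a noisy candidate series.
rng = np.random.default_rng(0)
n = 100
x = rng.normal(size=n)
y = 0.8 * x + 0.3 * rng.normal(size=n)
y[40:] += 1.0
p_hat, step_hat, sse3 = best_step(y, x)
sse1 = model1(y, x)[1]
```

Because Model 1 is nested in Model 3, `sse3` can never exceed `sse1`; whether the reduction is significant is what the F test decides.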

Statistical tests
Durbin-Watson test: to determine if the candidate series is homogeneous (autocorrelation in the residuals)
$e_i = \rho e_{i-1} + \mu_i$
H0: ρ = 0 versus Ha: ρ > 0
$D = \sum_{i=2}^{n} (e_i - e_{i-1})^2 / \sum_{i=1}^{n} e_i^2$
if D > du => H0; if D < dl => Ha; if dl ≤ D ≤ du the test is inconclusive
F test: to determine if the introduction of additional variables improves the fit
Model 1 and Model 3 are compared
H0: b3 = 0 versus Ha: b3 ≠ 0
$F^* = \frac{(SSE_1 - SSE_3)/(df_1 - df_3)}{SSE_3/df_3}$
if F* > F(1-α; 1, n-3) reject H0
If there is a significant changepoint, divide the series into two segments and re-test each segment separately
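The two statistics above can be computed directly from residuals and sums of squared errors. The sketch below is illustrative only: the function names and the numbers in the worked example are made up, and the critical values (dl, du and the F quantile) still have to come from tables.

```python
import numpy as np

def durbin_watson(resid):
    """D = sum((e_i - e_{i-1})^2) / sum(e_i^2)."""
    resid = np.asarray(resid, dtype=float)
    return float(np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2))

def f_statistic(sse1, df1, sse3, df3):
    """F* = [(SSE1 - SSE3) / (df1 - df3)] / (SSE3 / df3)."""
    return ((sse1 - sse3) / (df1 - df3)) / (sse3 / df3)

# Worked numbers (made up for illustration): n = 84 years,
# Model 1 has 2 parameters (df1 = n - 2), Model 3 has 3 (df3 = n - 3).
n = 84
f_star = f_statistic(sse1=60.0, df1=n - 2, sse3=40.0, df3=n - 3)

# Residuals that flip sign every year give a large D;
# residuals that never change give D = 0 (strong positive autocorrelation).
d_alternating = durbin_watson([1, -1, 1, -1, 1, -1])
d_constant = durbin_watson([1, 1, 1, 1, 1, 1])
```

With these numbers F* = (20 / 1) / (40 / 81) = 40.5, which would then be compared with F(1-α; 1, n-3).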

Example
Annual mean of daily maximum temperature of Pointe-au-Père / Mont-Joli, 1915-1998
Trend of 1.8°C / 84 years
[Figure: three panels. Annual mean temperature (°C, 4 to 10), 1910-2000; autocorrelation at lag k (-1.0 to 1.0) for lag k = 0 to 20; SSE (20 to 70) by year, 1910-2000, with 1943 marked. Series labels: Model 1, Model 3.]

Example
Difference between candidate and reference: step of 1.1°C in 1943
Annual mean of daily maximum temperature: trend of 1.8°C / 84 years
Adjusted series: trend of 0.1°C / 84 years
[Figure: three time-series panels, 1910-2000: difference between candidate and reference (-3 to 3), annual mean of daily maximum temperature (4 to 10 °C), and adjusted series (4 to 10 °C).]

Remarks
The first changepoint is not always associated with a "real" change
- use a hierarchical procedure to find all changepoints until
  . convergence of the position of each changepoint
  . each segment is homogeneous
  . each segment is too short
Reference series can contain inhomogeneities
- a step in the neighbour series can affect the candidate series
- a network bias is difficult to detect
- preferable to confirm the changepoint with metadata
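One way to picture the hierarchical procedure is a recursive split: find the most likely step, split there if it is significant, and re-test both segments until each is homogeneous or too short. The sketch below covers only those two stopping rules and does not implement the convergence check on changepoint positions. The names, the `margin` value and especially the threshold `f_crit` are assumptions (a placeholder, not a tabulated critical value).

```python
import numpy as np

def sse(design, y):
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    r = y - design @ coef
    return float(r @ r)

def find_step(y, x, margin=5, f_crit=25.0):
    """Most likely step index in a segment, or None if not significant."""
    n = len(y)
    if n < 2 * margin + 1:
        return None  # segment too short to test
    ones = np.ones(n)
    sse1 = sse(np.column_stack([ones, x]), y)  # Model 1
    best_p, best_sse = None, np.inf
    for p in range(margin, n - margin):
        step = (np.arange(n) >= p).astype(float)
        s = sse(np.column_stack([ones, step, x]), y)  # Model 3
        if s < best_sse:
            best_p, best_sse = p, s
    f_star = (sse1 - best_sse) / (best_sse / (n - 3))
    return best_p if f_star > f_crit else None

def find_all_steps(y, x, start=0):
    """Split at each significant changepoint and re-test both segments."""
    p = find_step(y, x)
    if p is None:
        return []
    left = find_all_steps(y[:p], x[:p], start)
    right = find_all_steps(y[p:], x[p:], start + p)
    return left + [start + p] + right

# Synthetic series with two steps of 1.0 at indices 30 and 70.
rng = np.random.default_rng(1)
n = 100
x = rng.normal(size=n)
y = 0.8 * x + 0.2 * rng.normal(size=n)
y[30:] += 1.0
y[70:] += 1.0
steps = find_all_steps(y, x)
```

On this synthetic series the first split lands on one of the two steps and the second step is found when the remaining segment is re-tested, which mirrors the remark that the first changepoint found is not necessarily the final answer.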

Adjustment of monthly temperature
Application to the 12 monthly series for changepoint p identified in annual mean temperature
Monthly adjustments (ai, i = 1, …, 12)
If the ai show seasonality => apply the ai
If the ai are randomly distributed => apply the annual adjustment
[Figure: difference between candidate and reference for January, July and December, 1910-2000, with 1943 marked; monthly adjustments (°C), Jan to Dec.]

Example
Annual mean of daily maximum temperature of Pointe-au-Père / Mont-Joli
Step of 1.1°C in 1943
Monthly adjustments
[Figure: time series, 1910-2000 (-3 to 3); monthly adjustments (°C, -4 to 4), Jan to Dec. Labels: "Before 1943", "After 1943", "Instruments on the roof".]

Adjustment of daily temperature
Linear interpolation between midmonth target values objectively chosen so that the average of the daily adjustments over a given month is equal to the monthly adjustment:
$T = A^{-1} M$
where M are the monthly adjustments and
$A = \begin{pmatrix} 7/8 & 1/8 & & & \\ 1/8 & 6/8 & 1/8 & & \\ & \ddots & \ddots & \ddots & \\ & & 1/8 & 6/8 & 1/8 \\ & & & 1/8 & 7/8 \end{pmatrix}$
Regression model 3 applied to individual daily series for changepoint p:
$y = a_3 + b_3 I + c_3 x_1 + e$, i = 1, …, n
I = 0 for i = 5, …, p-1; I = 1 for i = p, …, n-5
[Figure: two panels of adjustments (°C, -4 to 4), Jan to Dec.]
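The relation T = A⁻¹M can be evaluated by solving the linear system A T = M. The sketch below assumes A is the tridiagonal matrix read from the slide (6/8 on the diagonal, 1/8 beside it, 7/8 in the two corner entries); that reading of the layout, the function name and the monthly values are assumptions. The daily interpolation between the target values is not shown.

```python
import numpy as np

def midmonth_targets(monthly_adj):
    """Solve A T = M for the midmonth target values T."""
    m = np.asarray(monthly_adj, dtype=float)
    k = len(m)
    a = np.zeros((k, k))
    for j in range(k):
        a[j, j] = 6 / 8
        if j > 0:
            a[j, j - 1] = 1 / 8
        if j < k - 1:
            a[j, j + 1] = 1 / 8
    a[0, 0] = a[-1, -1] = 7 / 8  # corner entries as read from the slide
    return a, np.linalg.solve(a, m)

# Hypothetical monthly adjustments (°C), January to December.
monthly = [1.4, 1.3, 1.1, 0.9, 0.8, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4]
a, targets = midmonth_targets(monthly)
```

Every row of A sums to 1, so a constant monthly adjustment yields the same constant target values; solving the system rather than inverting A explicitly is the usual numerical choice.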

Testing the methodology using simulated values

Simulation of annual mean temperatures
Homogeneous series (series with no steps)
• Random numbers ~ N(0,1) with AR(1) = 0.1
• 1000 homogeneous series of 100 values (years)
Series with one step
• Step of magnitude 0.25, 0.50, 0.75, …, 2.00 σ
• Position 5, 10, 15, 20, 35, 50
• 48 000 series with a single step
Series with a random number of steps
• Step of magnitude ∂ = 0.5 to 2.0 σ; ∂ ~ N(0,1)
• Position ∆t = exp(0.05), ∆t ≥ 10
• 25 000 series with a random number of steps (0 to 7 steps)
Reference series
• Reference series cross-correlated with candidate series with correlation factor ~ 0.8 and re-standardized
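A minimal sketch of how one such simulated pair could be generated: an AR(1) candidate with unit variance, a reference correlated at about 0.8, and a single step added to the candidate. The function names, the seed, and the choice to build the reference from the series before the step is added are assumptions; the slide does not give these details.

```python
import numpy as np

def ar1_series(n, phi, rng):
    """Unit-variance AR(1) series with lag-1 autocorrelation phi."""
    e = rng.normal(size=n)
    z = np.empty(n)
    z[0] = e[0]
    for i in range(1, n):
        z[i] = phi * z[i - 1] + np.sqrt(1 - phi ** 2) * e[i]
    return z

def reference_for(candidate, r, rng):
    """Reference series correlated with the candidate, then re-standardized."""
    noise = rng.normal(size=len(candidate))
    ref = r * candidate + np.sqrt(1 - r ** 2) * noise
    return (ref - ref.mean()) / ref.std()

def add_step(series, position, magnitude):
    """Shift every value from `position` onward by `magnitude` (in sigma units)."""
    out = series.copy()
    out[position:] += magnitude
    return out

rng = np.random.default_rng(42)
homogeneous = ar1_series(100, 0.1, rng)
reference = reference_for(homogeneous, 0.8, rng)
candidate = add_step(homogeneous, position=50, magnitude=1.0)
```

Repeating this over the 8 magnitudes and 6 positions listed above, with 1000 series each, gives the 48 000 single-step series.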

Identification of homogeneous series
Position and magnitude of steps falsely detected when the procedure is applied to 1000 homogeneous series
[Figure: four panels (SNHT, TPR, MLR, WSR) of magnitude (°C, -0.8 to 0.8) against position (0 to 100).]

Identification of a single step
Percentage of steps identified when one step is introduced in the candidate series
[Figure: four panels (SNHT, TPR, MLR, WSR) of percentage of steps (0 to 100) against position (5, 10, 15, 20, 35, 50); legend values 0.25, 0.50, 0.75, 1.00, 1.25, 1.50, 1.75, 2.00.]

Identification of a random number of steps
Percentage of steps detected versus number of steps introduced
Columns: number of steps artificially introduced in the series (- marks an empty cell)
Method  Steps detected       0     1     2     3     4     5     6     7
SNHT    0                 93.9   0.1   1.3   0.4   0.4   1.1     -     -
SNHT    1                  5.8  92.5   1.2   4.2   2.2   1.2   3.4     -
SNHT    2                  0.3   7.0  89.2   4.3   9.1   4.4   4.0     -
SNHT    3                    -   0.4   7.8  84.0   9.1  14.7   6.2  16.7
SNHT    4                    -     -   0.4   6.8  74.6  14.8  23.2  33.3
SNHT    5                    -     -   0.1   0.3   4.5  61.6  18.6  16.7
SNHT    6                    -     -     -     -   0.1   2.2  44.0     -
SNHT    7                    -     -     -     -     -     -   0.6  33.3
TPR     0                 96.5  29.0   8.3   2.5   1.1   0.3     -     -
TPR     1                  0.6  56.7  39.2  16.6   6.3   2.8   0.6     -
TPR     2                  0.9   8.1  34.4  38.7  25.4  13.3   5.7     -
TPR     3                  0.9   3.2  11.3  27.3  36.2  33.6  14.7  16.7
TPR     4                  0.5   1.4   4.7  10.5  21.4  30.5  41.8  66.6
TPR     5                  0.2   0.8   1.4   3.6   7.6  15.7  31.6  16.7
TPR     6                  0.2   0.6   0.6   0.7   1.9   3.6   5.6     -
TPR     7                  0.2   0.2   0.1   0.1   0.2   0.2     -     -
MLR     0                 96.5   0.1   0.1   0.1   0.1   0.1   0.6     -
MLR     1                  3.5  69.6   9.5   1.1   0.7   2.5   2.3     -
MLR     2                    -  20.6  64.9  16.5   6.2   9.1   2.8     -
MLR     3                    -   7.5  20.1  63.9  20.0  23.8   6.2     -
MLR     4                    -   2.0   4.7  15.9  57.6  26.8  52.5  50.0
MLR     5                    -   0.2   0.6   2.4  14.9  28.8  29.9  50.0
MLR     6                    -     -   0.1   0.1   0.5   8.7   5.7     -
MLR     7                    -     -     -     -     -   0.2     -     -
WSR     0                 94.2    12  19.3  17.8  16.6  15.6  13.5  16.7
WSR     1                  5.6  21.5   7.1   5.3   3.3   4.2   7.9     -
WSR     2                  0.2  29.9  14.4   7.8   7.1   4.3   1.1     -
WSR     3                    -  22.1  23.1  15.8  12.7  12.9  11.3     -
WSR     4                    -  12.4  19.2  19.7  16.3  13.2  11.9     -
WSR     5                    -   1.8  10.8  16.6  17.5  17.2   7.3  33.2
WSR     6                    -   0.2   4.6  10.7  14.2  14.6  18.1  16.7
WSR     7                    -   0.1   1.4   4.7   8.2  10.3  14.1  16.7

Identification of biases in Canadian climate data

Bias in relative humidity due to a change in instruments
75 Canadian climate stations
Example: Kuujjuaq, Québec, 1955-2004, dewcel introduced in 1978
Many missing very cold values before the introduction of the dewcel
[Figure: four seasonal panels (Winter, Spring, Summer, Fall) of relative humidity (%, 60 to 90), 1950-2010, showing original values and adjusted values. Step labels: -8.0%, -3.3%, -2.8%, -7.1%.]

Bias in radiosonde temperature due to the introduction of a radiation correction
25 Canadian stations
Temperature anomalies: mean for Canada, 5 pressure levels, observations at 12 UTC
During 1985-1995:
- semi-automated system implemented
- switch to Vaisala instruments
- introduction of radiation correction
[Figure: temperature anomalies (-3 to 3), 1960-2000, in three groups (Annual, Winter, Summer), each with panels at 850, 500, 100, 50 and 20 hPa.]

Bias in daily minimum temperature due to a change in observing time
120 Canadian climate stations
• On July 1, 1961, the climatological day was redefined to end at 06 UTC
• Prior to that, it ended at 12 UTC for max temp and 00 UTC for min temp
• The redefinition of the climatological day has created a cold bias in daily min temp
[Figure: decreasing step identified in 1961 (legend in °C: -1.2, -0.8, -0.4); a filled triangle indicates a step significant at the 5% level.]

Bias in daily minimum temperature due to a change in observing time
Hourly temperatures at Kapuskasing from July 18 to 23, 2007
[Figure: four panels of hourly temperature (°C, 0 to 30) against observing time (hour) over six days. Labels: "Jul 19", "12 UTC", "00 UTC", "06 UTC".]

Application to benchmark temperature datasets

Temperature surrogate group 1
Calculate monthly anomalies
- departures from the 1961-1990 reference period
Produce the long series
- sequence of monthly values for 100 years (1200 values)
Produce a reference series for each station
- average of the remaining stations
Search for all potential changepoints
- the first changepoint is not necessarily a real one
When all changepoints are identified, determine the magnitude of each step
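The first three preparation steps can be sketched as follows. The array layout (one row per year, one column per calendar month), the function names and the synthetic three-station network are assumptions made for illustration.

```python
import numpy as np

def monthly_anomalies(values, years, base=(1961, 1990)):
    """values: array (n_years, 12). Subtract each calendar month's base-period mean."""
    values = np.asarray(values, dtype=float)
    years = np.asarray(years)
    in_base = (years >= base[0]) & (years <= base[1])
    climatology = np.nanmean(values[in_base], axis=0)
    return values - climatology

def long_series(anomalies):
    """Flatten (n_years, 12) into one chronological sequence of monthly values."""
    return np.asarray(anomalies).reshape(-1)

def reference_series(all_stations, index):
    """Average of the remaining stations; all_stations is (n_stations, n_months)."""
    others = np.delete(np.asarray(all_stations, dtype=float), index, axis=0)
    return np.nanmean(others, axis=0)

# Tiny synthetic network: 3 stations, 100 years of monthly values.
rng = np.random.default_rng(7)
years = np.arange(1900, 2000)
seasonal = 10 * np.sin(2 * np.pi * np.arange(12) / 12)
stations = [seasonal + rng.normal(size=(100, 12)) for _ in range(3)]
series = np.array([long_series(monthly_anomalies(s, years)) for s in stations])
ref0 = reference_series(series, 0)  # reference for the first station
```

Working with anomalies removes the seasonal cycle, so the 1200-value long series can be tested for steps in the same way as an annual series.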

Temperature surrogate group 1
Position and magnitude (°C) of each step identified by the regression approach
Station Name   Station ID  Period     # chpts  Step 1       Step 2       Step 3       Step 4       Step 5       Step 6
Saint-Georges  49281       1909-1999  4        1949 (-1.5)  1956 (0.5)   1963 (0.5)   1971 (0.6)
Groix          56069       1925-1999  2        1949 (0.8)   1956 (-1.4)
Chartres       28070       1900-1999  6        1916 (-0.7)  1930 (-1.0)  1925 (-0.4)  1949 (2.2)   1959 (-0.7)  1977 (-0.7)
Rennes         35281       1900-1999  5        1905 (-0.4)  1951 (1.1)   1959 (0.7)   1990 (-1.1)  1994 (-0.4)
Gievres        41097       1917-1999  4        1925 (-0.5)  1945 (-1.0)  1960 (0.6)   1985 (0.5)
St-Cornier     61377       1913-1999  3        1925 (2.1)   1959 (-4.0)  1994 (2.8)
Ile-Yeu        85113       1921-1999  4        1928 (0.8)   1933 (-1.0)  1963 (0.9)   1991 (-0.4)
La-Mothe       85152       1900-1999  5        1906 (-0.9)  1925 (-0.3)  1952 (-1.3)  1960 (0.8)   1998 (1.0)
Biard          86027       1905-1999  5        1914 (-2.3)  1916 (1.4)   1924 (-0.6)  1950 (0.6)   1996 (-0.7)

Temperature surrogate group 1
[Figure: two panels of monthly values (-8 to 8) against month index (120 to 1200), with 1949 and 1956 marked. Panel titles: "Monthly anomalies at Groix" and "Adjusted monthly anomalies at Groix". Trend labels, in panel order: -0.3°C for 1925-1999 and -1.3°C for 1925-1999.]

Discontinuities in precipitation due to joining station observations

Does joining precipitation records create any artificial steps?
Annual total rainfall for Edson, Alberta
[Figure: annual total rainfall (0 to 800), 1910-2000, for four station records: Edson 3062240, Edson 3062241, Edson A 3062244, Edson CR10 3062246.]

Methodology
Let Ti and Ni be the monthly total rain (or snow) at the tested site and neighbour respectively for year i:
Ratios:
. if Ti > 0 and Ni > 0 then qi = Ti/Ni
. if Ti = 0 and Ni = 0 then qi = 1
. if Ti or Ni = 0 (or missing) then qi = missing
Outliers (q0.25 and q0.75 are the 25th and 75th percentiles):
. qi < q0.25 - 3(q0.75 - q0.25)
. qi > q0.75 + 3(q0.75 - q0.25)
. outliers: qi = missing
Standardized ratios:
. zi = (qi - Q) / sq, where Q is the average of the qi and sq is their standard deviation
Apply a t-test on {zi} to determine if the difference in the means before and after the joining date is different from zero at the 5% significance level:
. 30 years before and after the joining date
. minimum of 5 years on each side of the joining date
Adjustments:
. Ai = qai / qbi, where qbi and qai are the ratio means before and after the joining date
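The steps above can be sketched for one calendar month as follows. The function names, the use of NaN for missing values, the pooled-variance form of the t statistic and the synthetic data are assumptions; the slide does not say which t-test variant is used, and the comparison with the 5% critical value is left out.

```python
import numpy as np

def ratios(t, n):
    """q_i = T_i / N_i, 1 when both are zero, missing (NaN) otherwise."""
    t = np.asarray(t, dtype=float)
    n = np.asarray(n, dtype=float)
    q = np.full(t.shape, np.nan)
    both_pos = (t > 0) & (n > 0)
    q[both_pos] = t[both_pos] / n[both_pos]
    q[(t == 0) & (n == 0)] = 1.0
    return q

def drop_outliers(q):
    """Set to missing any ratio beyond 3 interquartile ranges from the quartiles."""
    q25, q75 = np.nanpercentile(q, [25, 75])
    iqr = q75 - q25
    out = q.copy()
    out[(q < q25 - 3 * iqr) | (q > q75 + 3 * iqr)] = np.nan
    return out

def join_test(q, join_index, window=30, min_years=5):
    """Return (t statistic, adjustment A = mean after / mean before), or None."""
    before = q[max(0, join_index - window):join_index]
    after = q[join_index:join_index + window]
    before, after = before[~np.isnan(before)], after[~np.isnan(after)]
    if len(before) < min_years or len(after) < min_years:
        return None
    both = np.concatenate([before, after])
    zb = (before - both.mean()) / both.std(ddof=1)  # standardized ratios
    za = (after - both.mean()) / both.std(ddof=1)
    # Pooled-variance two-sample t statistic (assumed form of the t-test).
    nb, na = len(zb), len(za)
    sp2 = ((nb - 1) * zb.var(ddof=1) + (na - 1) * za.var(ddof=1)) / (nb + na - 2)
    t_stat = (za.mean() - zb.mean()) / np.sqrt(sp2 * (1 / nb + 1 / na))
    return float(t_stat), float(after.mean() / before.mean())

# Synthetic example: the tested site reads about 30% higher after the join.
rng = np.random.default_rng(3)
neighbour = rng.uniform(20, 80, size=60)
factor = np.where(np.arange(60) < 30, 1.0, 1.3)
tested = neighbour * factor * rng.uniform(0.9, 1.1, size=60)
q = drop_outliers(ratios(tested, neighbour))
t_stat, adjustment = join_test(q, join_index=30)
```

In this synthetic case the adjustment comes out close to the 1.3 factor built into the data, and the t statistic is large because the shift in the ratios is much bigger than their year-to-year scatter.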

Example
Digby Airport and Bear River joined in 1957
Monthly, annual and long series (LS) adjustments by each neighbour. A number in bold indicates that the adjustment corresponds to a step significant at the 5% level.
Neighbour Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec Ann LS
Rain
1 1.26 0.97 0.88 1.11 1.06 0.85 0.77 1.05 0.90 0.93 1.02 1.10 0.98 1.00
2 1.34 1.25 1.42 1.10 1.05 0.84 0.83 1.09 0.85 0.94 1.09 1.11 1.02 1.09
3 1.01 1.12 0.86 1.16 1.08 0.89 0.81 1.04 0.96 1.06 1.04 0.96 0.97 0.97
4 1.46 1.18 1.12 1.09 1.10 0.88 0.74 0.90 0.99 1.03 1.11 1.18 1.05 1.06
5 1.17 0.99 0.97 1.16 1.14 0.96 0.79 0.97 1.06 0.99 0.89 0.99 0.99 1.00
6 1.46 0.95 1.05 1.00 1.00 0.90 0.99 1.02 0.98 1.18 1.08 1.10 1.06 1.06
7 1.45 1.19 1.13 1.18 1.05 1.04 0.90 1.07 0.97 0.94 0.99 1.12 1.02 1.06
8 1.67 1.50 1.54 1.13 1.08 0.85 0.88 0.92 0.97 1.01 0.95 1.16 1.08 1.09
Snow
1 1.93 1.53 1.10 1.30 0.96 1.44 1.51 1.39
2 2.04 1.57 1.27 1.03 1.64 1.78 1.65 1.56
3 1.52 1.12 0.98 1.30 1.31 1.39 1.21 1.27
4 1.75 1.27 1.33 2.07 1.25 1.46 1.42 1.48
5 2.29 1.59 1.63 1.69 1.08 1.85 1.83 1.70
6 2.03 1.89 1.08 1.30 1.16 1.64 1.69 1.52
7 2.21 1.65 1.39 2.15 1.15 1.63 1.70 1.70
8 1.71 1.41 1.66 1.37 1.26 1.55 1.74 1.45

Example
Digby Airport and Bear River joined in 1957
Monthly, annual and long series (LS) adjustments obtained using the neighbours (purple) and overlapping data (green)
[Figure: two panels (Rain, Snow) of adjustments (0.8 to 1.6) for Jan to Dec, Ann and LS.]

Comparing adjustments from neighbours and overlap
Box plots of the differences between the adjustments from neighbours and the adjustments from overlapping data obtained from the 60 stations. The circle, box and whiskers indicate the median, 10th and 90th percentiles, and minimum and maximum values.
[Figure: two panels (Rain, Snow) of differences (-1.0 to 1.0) for Jan to Dec, Ann and LS.]
Thank you!
Merci!
Gracias!