Detection of discontinuities using an approach based on regression models and application to benchmark temperature
by Lucie Vincent, Climate Research Branch, Science and Technologies Branch, Environment Canada
Presentation to the COST meeting in Tarragona, Spain, March 9-11, 2009


Page 2: Outline

Outline

Methodology
- identification of changepoints in annual mean temperature
- adjustment of monthly and daily values

Testing the methodology using simulated values
- homogeneous series, single step, random number of steps

Identification of biases in Canadian climate data
- bias in relative humidity due to a change in instruments
- bias in radiosonde temperature due to the introduction of a correction factor
- bias in daily minimum temperature due to a change in observing time

Application to benchmark temperature datasets
- monthly mean minimum temperature at Groix

Discontinuities in precipitation due to joining station observations

Page 3: Outline

Methodology

Page 4: Outline

Identification of changepoint in annual mean temperature

Let $y$ be the candidate series and $x$ the reference series.

Model 1: to identify a homogeneous series
$y = a_1 + c_1 x_1 + e$

Model 2: to identify a trend
$y = a_2 + b_2 i + c_2 x_1 + e$, $i = 1, \dots, n$

Model 3: to identify a step
$y = a_3 + b_3 I + c_3 x_1 + e$, $i = 1, \dots, n$
$I = 0$ for $i = 5, \dots, p-1$; $I = 1$ for $i = p, \dots, n-5$

Model 4: to identify a step with trends before and after
$y = a_4 + b_4 i I_1 + a_5 I_2 + b_5 i I_2 + c_4 x_1 + e$, $i = 1, \dots, n$
$I_1 = 1$ and $I_2 = 0$ for $i = 5, \dots, p-1$; $I_1 = 0$ and $I_2 = 1$ for $i = p, \dots, n-5$

[Figure: one panel per model, labelled "Difference between candidate and reference"; x-axis 0 to 100, y-axis -2.0 to 2.0]
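The step model (Model 3) can be illustrated with a small sketch. It is not the author's code: the helper names (`solve_linear`, `ols`, `fit_step_model`), the 0-based indexing, the choice of keeping the candidate position with the smallest SSE, and the reading of the "5 ... n-5" bounds as a margin of five values at each end are all assumptions made for illustration. The synthetic series is noise-free and hypothetical.

```python
import math


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[k]] for k, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ols(columns, y):
    """Least-squares fit of y on the predictor columns; returns (coefficients, SSE)."""
    k = len(columns)
    xtx = [[sum(u * v for u, v in zip(columns[i], columns[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(u * v for u, v in zip(columns[i], y)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    fitted = [sum(beta[j] * columns[j][t] for j in range(k)) for t in range(len(y))]
    sse = sum((yt - ft) ** 2 for yt, ft in zip(y, fitted))
    return beta, sse


def fit_step_model(y, x, margin=5):
    """Fit y = a3 + b3*I + c3*x + e for each candidate p; keep the smallest SSE.

    p is the 0-based index of the first value after the step (assumed convention).
    """
    n = len(y)
    ones = [1.0] * n
    best = None
    for p in range(margin, n - margin + 1):
        indicator = [0.0 if t < p else 1.0 for t in range(n)]
        beta, sse = ols([ones, indicator, x], y)
        if best is None or sse < best["sse"]:
            best = {"p": p, "a3": beta[0], "b3": beta[1], "c3": beta[2], "sse": sse}
    return best


# Hypothetical noise-free example: a step of 1.0 starting at index 30.
n = 60
x = [math.sin(0.3 * t) for t in range(n)]
y = [0.5 + 0.8 * x[t] + (1.0 if t >= 30 else 0.0) for t in range(n)]
result = fit_step_model(y, x)
print(result["p"], round(result["b3"], 3))
```

With real data the fitted step would then be assessed with the statistical tests of the method rather than accepted on SSE alone.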

Page 5: Outline

Statistical tests

Durbin-Watson test: to determine if the candidate series is homogeneous (autocorrelation)
$e_i = \rho e_{i-1} + \mu_i$
$H_0: \rho = 0$ versus $H_a: \rho > 0$
$D = \sum (e_i - e_{i-1})^2 / \sum e_i^2$
if $D > d_u$ => $H_0$; if $D < d_l$ => $H_a$; if $d_l \le D \le d_u$ the test is inconclusive

F test: to determine if the introduction of additional variables improves the fit
Model 1 and Model 3 are compared
$H_0: b_3 = 0$ versus $H_a: b_3 \ne 0$
$F^* = [(SSE_1 - SSE_3)/(df_1 - df_3)] / (SSE_3/df_3)$
if $F^* > F(1-\alpha; 1, n-3)$ reject $H_0$

If there is a significant changepoint, divide the series into two segments and re-test each segment separately
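Both test statistics are simple to compute once the residuals and sums of squared errors are available. The sketch below uses hypothetical function names and made-up inputs. The critical values ($d_l$, $d_u$ and the $F$ quantile) are assumed to come from statistical tables or a statistics library and are not computed here.

```python
def durbin_watson(residuals):
    """D = sum of squared successive differences / sum of squared residuals."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


def f_statistic(sse1, df1, sse3, df3):
    """F* comparing Model 1 (sse1, df1) with Model 3 (sse3, df3)."""
    return ((sse1 - sse3) / (df1 - df3)) / (sse3 / df3)


# Hypothetical inputs: alternating residuals, and SSEs for a series of n = 100 values
# (Model 1 has 2 parameters so df1 = n - 2; Model 3 has 3 so df3 = n - 3).
d = durbin_watson([1.0, -1.0, 1.0, -1.0])
f_star = f_statistic(sse1=10.0, df1=98, sse3=5.0, df3=97)
print(d, f_star)
```

A value of D near 2 is consistent with uncorrelated residuals; values well below 2 point towards positive autocorrelation.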

Page 6: Outline

Example

Annual mean of daily maximum temperature of Pointe-au-Père / Mont-Joli, 1915-1998

[Figure: annual mean of daily maximum temperature (°C), 1910-2000, with a trend of 1.8°C / 84 years; autocorrelation at lag k (lags 0 to 20, y-axis -1.0 to 1.0), with labels "Model 1" and "Model 3"; SSE by year (1910-2000), with 1943 marked]

Page 7: Outline

Example

[Figure: difference between candidate and reference, 1910-2000, with a step of 1.1°C in 1943; annual mean of daily maximum temperature, with a trend of 1.8°C / 84 years; adjusted series, with a trend of 0.1°C / 84 years]

Page 8: Outline

Remarks

The first changepoint is not always associated with a "real" change
- use a hierarchical procedure to find all changepoints until
  . convergence of the position of each changepoint
  . each segment is homogeneous
  . each segment is too short

The reference series can contain inhomogeneities
- a step in the neighbour series can affect the candidate series
- a network bias is difficult to detect
- preferable to confirm the changepoint with metadata

Page 9: Outline

Adjustment of monthly temperature

Application to the 12 monthly series for changepoint p identified in annual mean temperature

[Figure: difference between candidate and reference, 1910-2000, for January, July and December, with 1943 marked; monthly adjustments ($a_i$, $i = 1, \dots, 12$) in °C, Jan to Dec]

If the $a_i$ show seasonality => apply the $a_i$

If the $a_i$ are randomly distributed => apply the annual adjustment

Page 10: Outline

Example

Annual mean of daily maximum temperature of Pointe-au-Père / Mont-Joli

[Figure: difference series, 1910-2000, with a step of 1.1°C in 1943 and the labels "Before 1943", "After 1943" and "Instruments on the roof"; monthly adjustments in °C, Jan to Dec]

Page 11: Outline

Adjustment of daily temperature

[Figure: two panels of monthly adjustments in °C, Jan to Dec, y-axis -4 to 4]

Linear interpolation between midmonth target values objectively chosen so that the average of the daily adjustments over a given month is equal to the monthly adjustment:

$T = A^{-1} M$

where $M$ are the monthly adjustments and

$$
A = \begin{pmatrix}
7/8 & 1/8 & & & \\
1/8 & 6/8 & 1/8 & & \\
 & \ddots & \ddots & \ddots & \\
 & & 1/8 & 6/8 & 1/8 \\
 & & & 1/8 & 7/8
\end{pmatrix}
$$

Regression model 3 applied to individual daily series for changepoint p:

$y = a_3 + b_3 I + c_3 x_1 + e$, $i = 1, \dots, n$

$I = 0$ for $i = 5, \dots, p-1$; $I = 1$ for $i = p, \dots, n-5$
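The relation $T = A^{-1} M$ can be checked numerically. The sketch below solves the tridiagonal system for the midmonth targets and interpolates linearly between them. It is illustrative only: the function names are hypothetical, months are idealized to 30 days each, the adjustment is held constant before the first and after the last midmonth point (which is what the 7/8 corner entries of $A$ correspond to), and the monthly adjustments in `monthly` are invented values.

```python
def midmonth_targets(monthly):
    """Solve A T = M for the tridiagonal A (7/8, 1/8 | 1/8, 6/8, 1/8 | ... | 1/8, 7/8)."""
    n = len(monthly)
    lower = [0.0] + [1 / 8] * (n - 1)
    diag = [7 / 8] + [6 / 8] * (n - 2) + [7 / 8]
    upper = [1 / 8] * (n - 1) + [0.0]
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = monthly[0] / diag[0]
    for k in range(1, n):  # forward sweep of the Thomas algorithm
        denom = diag[k] - lower[k] * c[k - 1]
        c[k] = upper[k] / denom
        d[k] = (monthly[k] - lower[k] * d[k - 1]) / denom
    targets = [0.0] * n
    targets[-1] = d[-1]
    for k in range(n - 2, -1, -1):  # back substitution
        targets[k] = d[k] - c[k] * targets[k + 1]
    return targets


def daily_adjustments(targets, days_per_month=30):
    """Linear interpolation between midmonth targets, constant beyond the end points."""
    n = len(targets)
    out = []
    for day in range(n * days_per_month):
        pos = (day + 0.5) / days_per_month - 0.5  # months elapsed since the first midmonth
        if pos <= 0:
            out.append(targets[0])
        elif pos >= n - 1:
            out.append(targets[-1])
        else:
            k = int(pos)
            frac = pos - k
            out.append((1 - frac) * targets[k] + frac * targets[k + 1])
    return out


# Hypothetical monthly adjustments (°C), Jan to Dec.
monthly = [1.5, 1.4, 1.2, 1.0, 0.8, 0.7, 0.6, 0.7, 0.9, 1.1, 1.3, 1.5]
targets = midmonth_targets(monthly)
daily = daily_adjustments(targets)
monthly_means = [sum(daily[30 * m:30 * (m + 1)]) / 30 for m in range(12)]
```

Under these assumptions the monthly means of the daily adjustments reproduce the monthly adjustments. With real calendar months of unequal length the match would be approximate.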

Page 12: Outline


Testing the methodology using simulated values

Page 13: Outline

Simulation of annual mean temperatures

Homogeneous series (series with no steps)
• Random numbers ~ N(0,1) with AR(1) = 0.1
• 1000 homogeneous series of 100 values (years)

Series with one step
• Step of magnitude 0.25, 0.50, 0.75, …, 2.00 σ
• Position 5, 10, 15, 20, 35, 50
• 48 000 series with a single step

Series with a random number of steps
• Step of magnitude ∂ = 0.5 to 2.0 σ; ∂ ~ N(0,1)
• Position ∆t = exp(0.05), ∆t ≥ 10
• 25 000 series with a random number of steps (0 to 7 steps)

Reference series
• Reference series cross-correlated with the candidate series, with correlation factor ~ 0.8, and re-standardized
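A minimal sketch of how such test series could be generated is given below. It rests on several assumptions that the slide does not spell out: "AR(1) = 0.1" is read as a lag-1 coefficient of 0.1, the reference is built as a weighted sum of the homogeneous base series and independent noise, and all function names and the seed are hypothetical.

```python
import random


def ar1_series(n, phi, rng):
    """AR(1) series with coefficient phi and unit marginal variance."""
    innovation_sd = (1 - phi ** 2) ** 0.5
    values = [rng.gauss(0, 1)]
    for _ in range(n - 1):
        values.append(phi * values[-1] + rng.gauss(0, innovation_sd))
    return values


def add_step(series, position, magnitude):
    """Add a step of the given magnitude from 0-based index `position` onwards."""
    return [v + (magnitude if t >= position else 0.0) for t, v in enumerate(series)]


def standardize(series):
    """Rescale to zero mean and unit sample standard deviation."""
    n = len(series)
    mean = sum(series) / n
    sd = (sum((v - mean) ** 2 for v in series) / (n - 1)) ** 0.5
    return [(v - mean) / sd for v in series]


def reference_for(base, rho, rng):
    """Reference with expected correlation rho to `base`, re-standardized."""
    weight = (1 - rho ** 2) ** 0.5
    mixed = [rho * b + weight * rng.gauss(0, 1) for b in base]
    return standardize(mixed)


rng = random.Random(2009)  # fixed seed so the example is repeatable
base = ar1_series(100, 0.1, rng)                         # homogeneous series
candidate = add_step(base, position=50, magnitude=1.0)   # one step of 1.0 sigma
reference = reference_for(base, 0.8, rng)
```

Repeating this over the listed magnitudes and positions would give a set of single-step series comparable in design to the one described above.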

Page 14: Outline

Identification of homogeneous series

[Figure: four panels (SNHT, TPR, MLR, WSR) showing the position (0 to 100) and magnitude (°C, -0.8 to 0.8) of steps falsely detected when the procedure is applied to 1000 homogeneous series]

Page 15: Outline

Identification of a single step

[Figure: four panels (SNHT, TPR, MLR, WSR) showing the percentage of steps identified (0 to 100) when one step is introduced in the candidate series, by position (5, 10, 15, 20, 35, 50); legend values 0.25, 0.50, 0.75, 1.00, 1.25, 1.50, 1.75, 2.00]

Page 16: Outline

Identification of a random number of steps

Percentage of steps detected versus number of steps introduced

Number of steps artificially introduced in the series: 0 1 2 3 4 5 6 7
[Empty cells are not preserved in the source, so each row lists its values in source order rather than under fixed columns.]

Method  Steps detected  Values
SNHT    0               93.9 0.1 1.3 0.4 0.4 1.1
SNHT    1               5.8 92.5 1.2 4.2 2.2 1.2 3.4
SNHT    2               0.3 7.0 89.2 4.3 9.1 4.4 4.0
SNHT    3               0.4 7.8 84.0 9.1 14.7 6.2 16.7
SNHT    4               0.4 6.8 74.6 14.8 23.2 33.3
SNHT    5               0.1 0.3 4.5 61.6 18.6 16.7
SNHT    6               0.1 2.2 44.0
SNHT    7               0.6 33.3

TPR     0               96.5 29.0 8.3 2.5 1.1 0.3
TPR     1               0.6 56.7 39.2 16.6 6.3 2.8 0.6
TPR     2               0.9 8.1 34.4 38.7 25.4 13.3 5.7
TPR     3               0.9 3.2 11.3 27.3 36.2 33.6 14.7 16.7
TPR     4               0.5 1.4 4.7 10.5 21.4 30.5 41.8 66.6
TPR     5               0.2 0.8 1.4 3.6 7.6 15.7 31.6 16.7
TPR     6               0.2 0.6 0.6 0.7 1.9 3.6 5.6
TPR     7               0.2 0.2 0.1 0.1 0.2 0.2

MLR     0               96.5 0.1 0.1 0.1 0.1 0.1 0.6
MLR     1               3.5 69.6 9.5 1.1 0.7 2.5 2.3
MLR     2               20.6 64.9 16.5 6.2 9.1 2.8
MLR     3               7.5 20.1 63.9 20.0 23.8 6.2
MLR     4               2.0 4.7 15.9 57.6 26.8 52.5 50.0
MLR     5               0.2 0.6 2.4 14.9 28.8 29.9 50.0
MLR     6               0.1 0.1 0.5 8.7 5.7
MLR     7               0.2

WSR     0               94.2 12 19.3 17.8 16.6 15.6 13.5 16.7
WSR     1               5.6 21.5 7.1 5.3 3.3 4.2 7.9
WSR     2               0.2 29.9 14.4 7.8 7.1 4.3 1.1
WSR     3               22.1 23.1 15.8 12.7 12.9 11.3
WSR     4               12.4 19.2 19.7 16.3 13.2 11.9
WSR     5               1.8 10.8 16.6 17.5 17.2 7.3 33.2
WSR     6               0.2 4.6 10.7 14.2 14.6 18.1 16.7
WSR     7               0.1 1.4 4.7 8.2 10.3 14.1 16.7

Page 17: Outline

Identification of biases in Canadian climate data

Page 18: Outline

Bias in relative humidity due to a change in instruments

75 Canadian climate stations

Example: Kuujjuaq, Québec, 1955-2004, dewcel introduced in 1978

[Figure: relative humidity (%, 60 to 90), 1950-2010, original and adjusted values, in four seasonal panels (Winter, Spring, Summer, Fall); step labels: -8.0%, -3.3%, -2.8%, -7.1%]

Many missing very cold values before the introduction of the dewcel

Page 19: Outline

Bias in radiosonde temperature due to the introduction of a radiation correction

25 Canadian stations

[Figure: temperature anomalies (-3 to 3), mean for Canada, 1960-2000, observations at 12 UTC, at 5 pressure levels (850, 500, 100, 50 and 20 hPa); panel groups: Annual, Winter, Summer]

During 1985-1995:
- semi-automated system implemented
- switch to Vaisala instruments
- introduction of radiation correction

Page 20: Outline

Bias in daily minimum temperature due to a change in observing time

120 Canadian climate stations

• On July 1, 1961, the climatological day was redefined to end at 06 UTC

• Prior to that, it ended at 12 UTC for max temp and 00 UTC for min temp

• The redefinition of the climatological day has created a cold bias in daily min temp

[Map: decreasing step identified in 1961 (legend: -1.2, -0.8, -0.4 °C); a filled triangle indicates a step significant at the 5% level]

Page 21: Outline

Bias in daily minimum temperature due to a change in observing time

[Figure: hourly temperatures (°C, 0 to 30) at Kapuskasing from July 18 to 23, 2007, against observing time (hour), with the 12 UTC, 00 UTC and 06 UTC observing times marked and Jul 19 labelled]

Page 22: Outline


Application to benchmark temperature datasets

Page 23: Outline

Temperature surrogate group 1

Calculate monthly anomalies
- departures from the 1961-1990 reference period

Produce the long series
- sequence of monthly values for 100 years (1200 values)

Produce a reference series for each station
- average of the remaining stations

Search for all potential changepoints
- the first changepoint is not necessarily a real one

When all changepoints are identified, determine the magnitude of each step
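The first three preparation steps listed above can be sketched as follows. The function names and the data layout (one list of 12 monthly values per year) are assumptions, the toy numbers are invented, and missing values are not handled.

```python
def monthly_anomalies(values, years, ref_start=1961, ref_end=1990):
    """Departures of each monthly value from that calendar month's reference-period mean.

    values[k] is the list of 12 monthly values for years[k].
    """
    ref_rows = [row for yr, row in zip(years, values) if ref_start <= yr <= ref_end]
    means = [sum(row[m] for row in ref_rows) / len(ref_rows) for m in range(12)]
    return [[row[m] - means[m] for m in range(12)] for row in values]


def long_series(anomalies):
    """Concatenate the yearly rows into one sequence of monthly values."""
    return [v for row in anomalies for v in row]


def reference_series(all_series, candidate):
    """Average of the remaining stations, position by position."""
    others = [s for name, s in all_series.items() if name != candidate]
    return [sum(vals) / len(vals) for vals in zip(*others)]


# Toy data: three years, with a shortened reference period of 1961-1962.
years = [1960, 1961, 1962]
values = [[1.0] * 12, [2.0] * 12, [4.0] * 12]
anoms = monthly_anomalies(values, years, ref_start=1961, ref_end=1962)
series = long_series(anoms)

stations = {"A": [1.0, 2.0, 3.0], "B": [3.0, 4.0, 5.0], "C": [5.0, 6.0, 7.0]}
ref_a = reference_series(stations, "A")
```

For a 100-year record, `long_series` would return the 1200 values mentioned above.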

Page 24: Outline

Temperature surrogate group 1

Position and magnitude (°C) of each step identified by the regression approach

Station Name   Station ID  Period     # chpts  Step 1       Step 2       Step 3       Step 4       Step 5       Step 6
Saint-Georges  49281       1909-1999  4        1949 (-1.5)  1956 (0.5)   1963 (0.5)   1971 (0.6)
Groix          56069       1925-1999  2        1949 (0.8)   1956 (-1.4)
Chartres       28070       1900-1999  6        1916 (-0.7)  1930 (-1.0)  1925 (-0.4)  1949 (2.2)   1959 (-0.7)  1977 (-0.7)
Rennes         35281       1900-1999  5        1905 (-0.4)  1951 (1.1)   1959 (0.7)   1990 (-1.1)  1994 (-0.4)
Gievres        41097       1917-1999  4        1925 (-0.5)  1945 (-1.0)  1960 (0.6)   1985 (0.5)
St-Cornier     61377       1913-1999  3        1925 (2.1)   1959 (-4.0)  1994 (2.8)
Ile-Yeu        85113       1921-1999  4        1928 (0.8)   1933 (-1.0)  1963 (0.9)   1991 (-0.4)
La-Mothe       85152       1900-1999  5        1906 (-0.9)  1925 (-0.3)  1952 (-1.3)  1960 (0.8)   1998 (1.0)
Biard          86027       1905-1999  5        1914 (-2.3)  1916 (1.4)   1924 (-0.6)  1950 (0.6)   1996 (-0.7)

Page 25: Outline

Temperature surrogate group 1

[Figure: two panels, "Monthly anomalies at Groix" and "Adjusted monthly anomalies at Groix" (y-axis -8 to 8; x-axis months 120 to 1200), with 1949 and 1956 marked; trend labels: "Trend of -0.3°C for 1925-1999" and "Trend of -1.3°C for 1925-1999"]

Page 26: Outline


Discontinuities in precipitation due to joining station observations

Page 27: Outline

Does joining precipitation records create any artificial steps?

[Figure: annual total rainfall for Edson, Alberta, 1910-2000 (y-axis 0 to 800); joined records: Edson 3062240, Edson 3062241, Edson A 3062244, Edson CR10 3062246]

Page 28: Outline

Methodology

Let $T_i$ and $N_i$ be the monthly total rain (or snow) at the tested site and neighbour, respectively, for year $i$.

Ratios:
. if $T_i > 0$ and $N_i > 0$ then $q_i = T_i / N_i$
. if $T_i = 0$ and $N_i = 0$ then $q_i = 1$
. if $T_i$ or $N_i = 0$ (or missing) then $q_i$ = missing

Outliers:
. $q_i < q_{0.25} - 3(q_{0.75} - q_{0.25})$
. $q_i > q_{0.75} + 3(q_{0.75} - q_{0.25})$
where $q_{0.25}$ and $q_{0.75}$ are the 25th and 75th percentiles; for outliers, $q_i$ = missing

Standardized ratios:
. $z_i = (q_i - Q) / s_q$, where $Q$ is the average of the $q_i$ and $s_q$ is their standard deviation

Apply a t-test on $\{z_i\}$ to determine if the difference in the means before and after the joining date is different from zero at the 5% significance level:
. 30 years before and after the joining date
. minimum of 5 years on each side of the joining date

Adjustments:
. $A_i = q_{ai} / q_{bi}$, where $q_{bi}$ and $q_{ai}$ are the ratio means before and after the joining date
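The ratio, outlier, standardization and adjustment steps above can be sketched as below. Several details are assumptions made for illustration: missing values are represented by `None`, the percentiles use linear interpolation between order statistics, the t statistic is the pooled-variance two-sample form, and the 30-year window and 5-year minimum are left to the caller. All function names and sample values are hypothetical.

```python
def ratios(tested, neighbour):
    """q_i = T_i / N_i, 1 when both are zero, missing (None) otherwise."""
    out = []
    for t, nb in zip(tested, neighbour):
        if t is None or nb is None:
            out.append(None)
        elif t > 0 and nb > 0:
            out.append(t / nb)
        elif t == 0 and nb == 0:
            out.append(1.0)
        else:
            out.append(None)
    return out


def percentile(sorted_vals, frac):
    """Percentile by linear interpolation between order statistics (assumed convention)."""
    pos = frac * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def drop_outliers(q):
    """Set to missing any ratio beyond 3 interquartile ranges from the quartiles."""
    present = sorted(v for v in q if v is not None)
    q25, q75 = percentile(present, 0.25), percentile(present, 0.75)
    low, high = q25 - 3 * (q75 - q25), q75 + 3 * (q75 - q25)
    return [v if v is not None and low <= v <= high else None for v in q]


def standardize(q):
    """z_i = (q_i - Q) / s_q over the non-missing ratios."""
    present = [v for v in q if v is not None]
    mean = sum(present) / len(present)
    sd = (sum((v - mean) ** 2 for v in present) / (len(present) - 1)) ** 0.5
    return [None if v is None else (v - mean) / sd for v in q]


def t_statistic(z, join_index):
    """Pooled-variance t statistic for the mean after minus the mean before the join."""
    a = [v for v in z[:join_index] if v is not None]
    b = [v for v in z[join_index:] if v is not None]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    ss = sum((v - ma) ** 2 for v in a) + sum((v - mb) ** 2 for v in b)
    pooled_var = ss / (len(a) + len(b) - 2)
    return (mb - ma) / (pooled_var * (1 / len(a) + 1 / len(b))) ** 0.5


def adjustment(q, join_index):
    """Ratio mean after the joining date divided by ratio mean before it."""
    before = [v for v in q[:join_index] if v is not None]
    after = [v for v in q[join_index:] if v is not None]
    return (sum(after) / len(after)) / (sum(before) / len(before))


# Hypothetical values for illustration.
q = ratios([2.0, 0.0, 0.0, None, 4.0], [1.0, 0.0, 5.0, 3.0, 2.0])
raw = [0.9, 1.0, 1.1, 1.0, 0.95, 1.05, 1.0, 1.0, 50.0]
cleaned = drop_outliers(raw)
t_value = t_statistic([-1.0, -0.5, -1.5, 1.0, 0.5, 1.5], join_index=3)
adj = adjustment([1.0, 1.0, None, 2.0, 2.0], join_index=3)
```

The resulting t statistic would be compared with the critical value for the 5% level, which is not computed in this sketch.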

Page 29: Outline

Example: Digby Airport and Bear River joined in 1957

Monthly, annual and long series (LS) adjustments by each neighbour. Numbers in bold indicate that the adjustment corresponds to a step significant at the 5% level.

Rain

Neighbour  Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec   Ann   LS
1          1.26  0.97  0.88  1.11  1.06  0.85  0.77  1.05  0.90  0.93  1.02  1.10  0.98  1.00
2          1.34  1.25  1.42  1.10  1.05  0.84  0.83  1.09  0.85  0.94  1.09  1.11  1.02  1.09
3          1.01  1.12  0.86  1.16  1.08  0.89  0.81  1.04  0.96  1.06  1.04  0.96  0.97  0.97
4          1.46  1.18  1.12  1.09  1.10  0.88  0.74  0.90  0.99  1.03  1.11  1.18  1.05  1.06
5          1.17  0.99  0.97  1.16  1.14  0.96  0.79  0.97  1.06  0.99  0.89  0.99  0.99  1.00
6          1.46  0.95  1.05  1.00  1.00  0.90  0.99  1.02  0.98  1.18  1.08  1.10  1.06  1.06
7          1.45  1.19  1.13  1.18  1.05  1.04  0.90  1.07  0.97  0.94  0.99  1.12  1.02  1.06
8          1.67  1.50  1.54  1.13  1.08  0.85  0.88  0.92  0.97  1.01  0.95  1.16  1.08  1.09

Snow
[Each row has eight values; the empty month columns are not preserved in the source, so values are listed in source order.]

Neighbour  Values
1          1.93  1.53  1.10  1.30  0.96  1.44  1.51  1.39
2          2.04  1.57  1.27  1.03  1.64  1.78  1.65  1.56
3          1.52  1.12  0.98  1.30  1.31  1.39  1.21  1.27
4          1.75  1.27  1.33  2.07  1.25  1.46  1.42  1.48
5          2.29  1.59  1.63  1.69  1.08  1.85  1.83  1.70
6          2.03  1.89  1.08  1.30  1.16  1.64  1.69  1.52
7          2.21  1.65  1.39  2.15  1.15  1.63  1.70  1.70
8          1.71  1.41  1.66  1.37  1.26  1.55  1.74  1.45

Page 30: Outline

Example: Digby Airport and Bear River joined in 1957

[Figure: monthly, annual and long series (LS) adjustments obtained using the neighbours (purple) and overlapping data (green); two panels, Rain and Snow; x-axis Jan to Dec, Ann, LS; y-axis 0.8 to 1.6]

Page 31: Outline

Comparing adjustments from neighbours and overlap

[Figure: box plots of the differences between the adjustments from neighbours and the adjustments from overlapping data, obtained from the 60 stations; two panels, Rain and Snow; x-axis Jan to Dec, Ann, LS; y-axis -1.0 to 1.0. The circle, box and whiskers indicate the median, the 10th and 90th percentiles, and the minimum and maximum values.]

Page 32: Outline


Thank you!

Merci!

Gracias!