WMO course - “Statistics and Climatology” - Lecture III. Dr. Bertrand Timbal, Regional Meteorological Training Centre, Tehran, Iran, December 2003. Statistics and Climatology: Lecture III. Review some classical statistical tools.
III-1
WMO course-
“Statistics and Climatology” -
Lecture III
Dr. Bertrand Timbal
Regional Meteorological Training Centre,
Tehran, Iran
December 2003
III-2
Statistics of the Climate system:
Spatio-temporal linkages within the system
Overview:
1. Links within the system: the example of ENSO
2. Regression and correlation of variables
3. Spatial structures: reduction of the degrees of freedom
Review some classical statistical tools
Statistics and Climatology: Lecture III
III-3
Schematic of summer La Niña conditions across the Equatorial Pacific Ocean
El Niño / La Niña: a large scale feature
III-4
Schematic of summer El Niño conditions across the Equatorial Pacific Ocean
El Niño / La Niña: a large scale feature
III-5
• Temperature, along an equatorial longitude-depth section
• Anomalies are relevant for interannual variability
• Observed with the TAO array of buoys in the Tropical Pacific
• Thermocline movements are important for seasonal forecasting
Thermocline: layer of strong temperature gradient around 20 °C
El Niño: a large scale feature
III-6
El Niño: sub-surface ocean anomalies
97-98 El Niño formation
• Anomalously warm water accumulates at depth in the West Pacific and travels across the basin along the thermocline
• The predictability comes from the slow-moving ocean anomalies
III-7
Transition to the 98-99 La Niña
III-8
El Niño: air-sea interactions
III-9
El Niño: air-sea interactions
III-10
El Niño: Global Tele-connections
Courtesy of NOAA
III-11
La Niña: Global Tele-connections
Courtesy of NOAA
III-12
El Niño: impact on Australian rainfall
Stratification of the mean climate based on ENSO phases
III-13
La Niña: impact on Australian rainfall
Stratification of the mean climate based on ENSO phases
III-14
Probability of exceeding median rainfall for
Cold, Neutral and Warm conditions in the
Equatorial Pacific Ocean
(Data for 1900-1997)
El Niño: global impact on rainfall
Stratification of the mean climate based on ENSO phases.
III-15
El Niño: impact on Australian Wheat Yields
III-16
Links within the climate system exist:
• El Niño is a planetary scale phenomenon
• Several variables exhibit coherent variations (correlation)
• Distant teleconnections are observed (lag correlation)
• Probabilities are shifted by ENSO phases (predictable)
How best to express these relationships?
III-17
Statistics of the Climate system:
Spatio-temporal linkages within the system
Overview:
1. Links within the system: the example of ENSO
2. Regression and correlation of variables
3. Spatial structures: reduction of the degrees of freedom
III-18
Simple model: Least-Squares Regression
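Regression:
$$\hat{y} = a + b x$$
$$e_i = y_i - \hat{y}(x_i)$$
• a is the intercept for x = 0
• b is the slope:
$$b = r_{xy} \frac{S_y}{S_x}$$
Correlation:
• Pearson ordinary correlation (r), where primes denote anomalies (departures from the mean):
$$r_{xy} = \frac{\operatorname{cov}(x, y)}{S_x S_y} = \frac{\sum_{i=1}^{n} x'_i y'_i}{\sqrt{\sum_{i=1}^{n} {x'_i}^{2}}\,\sqrt{\sum_{i=1}^{n} {y'_i}^{2}}}$$
• r² is the fraction of variance explained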
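The regression and correlation formulas on this slide can be sketched in a few lines of Python. This is a minimal illustration, not part of the original lecture: the function name `least_squares` and the five data points are hypothetical, chosen only to show the arithmetic.

```python
from math import sqrt

def least_squares(x, y):
    """Return intercept a, slope b and Pearson r for y regressed on x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Anomalies (the primed quantities in the correlation formula)
    xa = [xi - x_mean for xi in x]
    ya = [yi - y_mean for yi in y]
    sxy = sum(u * v for u, v in zip(xa, ya))
    sxx = sum(u * u for u in xa)
    syy = sum(v * v for v in ya)
    r = sxy / sqrt(sxx * syy)      # Pearson correlation
    b = r * sqrt(syy / sxx)        # slope: b = r * S_y / S_x
    a = y_mean - b * x_mean        # the fitted line passes through the means
    return a, b, r

# Hypothetical illustrative data (not from the lecture)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b, r = least_squares(x, y)
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]  # e_i = y_i - y_hat(x_i)
print(f"a = {a:.3f}, b = {b:.3f}, r^2 = {r * r:.3f}")
```

The residuals of a least-squares fit with an intercept sum to zero, which is a quick sanity check on any implementation.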
III-19
r = 0.457 r = 0.336
Courtesy of J. Stockburger
Role of outliers:
• Outlier detection method to find observations with large influence
• Problem often arises from either erroneous data or small sample
• Graphical visualisation is essential
In this example, out of 100 points, only one data point is different!
III-20
False correlation based on one erroneous data point
Perfect relationship affected by one data point
In all cases, the correlation is r = 0.816 but …
The relationship is not linear.
Graphical visualisation of correlation
Correlation is neither robust nor resistant. Instead we can use the rank correlation: correlation based on ranked data
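As a rough illustration of the rank correlation mentioned on this slide, the sketch below computes the Pearson correlation of the ranks (the Spearman coefficient) and compares it with the ordinary Pearson coefficient. The helper names and the small data sets are hypothetical; they only mimic the two situations in the panels (a non-linear relation and a single erroneous value).

```python
from math import sqrt

def pearson(x, y):
    """Ordinary (Pearson) correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def rank_correlation(x, y):
    """Rank (Spearman) correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: a monotonic but non-linear relation ...
x = list(range(1, 11))
y_curved = [v ** 3 for v in x]
# ... and a perfect linear relation spoiled by one erroneous value
y_bad = [2.0 * v for v in x]
y_bad[-1] = -50.0

for label, y in (("curved", y_curved), ("one bad value", y_bad)):
    print(label, round(pearson(x, y), 3), round(rank_correlation(x, y), 3))
```

The rank correlation is exactly 1 for any strictly increasing relation, linear or not, and it changes less than the Pearson coefficient when one value is corrupted. It is still affected, so plotting the data remains essential.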
III-21
[Figure: Annual SW WA rainfall, 1880 to 2000, with 20-period moving average]
An example of a non-linear relation
[Figure: Perth inflow (GL/yr) versus SW WA rainfall (mm/yr)]
Rainfall and river flow
Courtesy of S. Power
III-22
Correlations between seasonal rain and SOI
Correlation is not causation!
• Correlation does not imply causation
• Simultaneous evolution
• Other techniques are needed:
• Path analysis (Blalock, 1971)
• Temporal precedence
Is ENSO forced by Australian rainfall? Or is Australian rainfall affected by ENSO?
Courtesy of W. Drosdowsky
III-23
Lag Correlation and auto-correlation
Lagged correlation between the SOI and cyclone formation
• Lag correlation of a series with itself is auto-correlation at lag k, where $x_k$ is the series x shifted by k time steps:
$$r_k = \frac{\operatorname{cov}(x, x_k)}{S_x S_{x_k}}$$
• Meteorological variables are auto-correlated (persistence)
• This violates the independent-data assumption, so an effective sample size is needed for:
• Hypothesis testing
• Variance estimates
• (Prior) lag correlations exhibit the dependence between variables
• Predictability arises from lag correlation
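The sketch below shows one way to compute a lag correlation and the lag-k auto-correlation described on this slide. The slide names the effective sample size but gives no formula; the expression n(1 - r1)/(1 + r1) used here is a common approximation for the mean of a first-order autoregressive series and is an assumption of this example. The function names and the synthetic AR(1) series are hypothetical.

```python
import random
from math import sqrt

def pearson(x, y):
    """Ordinary (Pearson) correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def lag_correlation(x, y, k):
    """Correlation between x(t) and y(t + k), i.e. x leads y by k steps."""
    if k < 0 or k > len(x) - 2:
        raise ValueError("lag out of range")
    return pearson(x[:len(x) - k], y[k:])

def effective_sample_size(n, r1):
    """Assumed AR(1) approximation: n_eff = n * (1 - r1) / (1 + r1)."""
    return n * (1 - r1) / (1 + r1)

# Synthetic persistent series (hypothetical): x(t) = 0.7 * x(t-1) + noise
rng = random.Random(0)
series = [0.0]
for _ in range(1999):
    series.append(0.7 * series[-1] + rng.gauss(0.0, 1.0))

r1 = lag_correlation(series, series, 1)   # auto-correlation at lag 1
n_eff = effective_sample_size(len(series), r1)
print(f"r1 = {r1:.2f}, n = {len(series)}, n_eff = {n_eff:.0f}")
```

Passing two different series to `lag_correlation` gives the lagged cross-correlation, for example an index leading a seasonal count by k steps.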
III-24
Correlation in the climate system:
• Correlation coefficients express the part of the
variation of two variables which is linked (no causality implied)
• Correlation assumes normality (!) and linear relation (!)
• A more robust coefficient is the rank correlation
• Lag correlation is useful for causality and predictability
• Auto-correlation of meteorological data has serious
consequences for the use of statistics in climate
III-25
Statistics of the Climate system:
Spatio-temporal linkages within the system
Overview:
1. Links within the system: the example of ENSO
2. Regression and correlation of variables
3. Spatial structures: reduction of the degrees of freedom
III-26
Spatial structure in climate data
Several motivations to identify large scale spatial features:
• Data are not spatially independent: spatial correlation
• Large scale structures are more coherent and predictable
• Extract the large scale climate signal
• Reduce the weather noise associated with small scales
• Fewer degrees of freedom and a reduced data set
• Identify useful relationships to exploit for climate forecasting
III-27
Principal-Component (EOF) Analysis
Objective:
• To reduce the original data set to a new data set of (much) fewer variables
• To condense a large fraction of the variance of the original data set
• To explore large multivariate data sets (spatial and temporal variation)
Calculation:
• PCA is performed on anomalies
• Based on the covariance [S] or the correlation [R] matrix of a vector X: $X^{T}X$
• The principal components are the
projections of X onto the eigenvectors of [S], $e_i$, which:
• are orthogonal to one another: a new coordinate system
• maximise the variance: measured by the eigenvalues ($\lambda_i$)
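The calculation steps listed on this slide can be sketched with NumPy as follows. This is only an illustration under stated assumptions: the function name `eof_analysis`, the (time, space) array layout and the synthetic field (one oscillating spatial pattern plus weak noise) are hypothetical and not taken from the lecture.

```python
import numpy as np

def eof_analysis(data):
    """EOF/PCA of an array shaped (n_time, n_space).

    Returns eigenvalues (descending), eigenvectors as columns (EOFs),
    and the principal-component time series.
    """
    anomalies = data - data.mean(axis=0)          # work on anomalies
    n_time = anomalies.shape[0]
    cov = anomalies.T @ anomalies / (n_time - 1)  # covariance matrix [S]
    eigvals, eigvecs = np.linalg.eigh(cov)        # symmetric eigen-solver
    order = np.argsort(eigvals)[::-1]             # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    pcs = anomalies @ eigvecs                     # projection on eigenvectors
    return eigvals, eigvecs, pcs

# Synthetic field (hypothetical): one dipole pattern oscillating in time
rng = np.random.default_rng(0)
t = np.arange(200)
signal = np.sin(2 * np.pi * t / 50)
pattern = np.array([1.0, 0.8, 0.5, -0.5, -0.8, -1.0])
field = np.outer(signal, pattern) + 0.1 * rng.standard_normal((200, 6))

eigvals, eofs, pcs = eof_analysis(field)
explained = eigvals / eigvals.sum()   # fraction of variance per component
print(np.round(explained, 3))
```

The variance of each principal component equals its eigenvalue, so `explained` is the quantity usually inspected when deciding how many components to retain. Using the correlation matrix [R] instead amounts to standardising each column of the anomalies before the same steps.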
III-28
Principal-Component (EOF) Analysis
• Eigenvectors (PCA) are orthogonal
• Strong constraint for small domain (Jolliffe, 1989)
• Typically the 2nd PC is a dipole (not necessarily meaningful)
• The number of PCs to be considered is based on the eigenvalues
III-29
200 hPa
850 hPa
EOFs of combined fields:
Courtesy of M. Wheeler
III-30
The phase-space representation of the MJO
M(t) = [RMM1(t),RMM2(t)]
Vector M traces:
- large anti-clockwise circles about the origin when the MJO is strong.
- random jiggles around the origin when the MJO is weak.
For compositing, we define the 8 equal-angle phases as labeled, and described by the angle
Φ = tan⁻¹[RMM2(t)/RMM1(t)]
Southern Summer = DJFMA
Courtesy of M. Wheeler
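A short sketch of the phase-space quantities on this slide: the amplitude and angle of M, and an equal-angle phase number. The inverse tangent of the ratio alone is ambiguous by 180°, so the example uses the four-quadrant `atan2`. The phase numbering (phase 1 starting at the negative RMM1 axis and increasing anticlockwise) is an assumption of this example, since the labels appear only in the figure; the function names are hypothetical.

```python
import math

def mjo_amplitude(rmm1, rmm2):
    """Length of the vector M = [RMM1, RMM2]."""
    return math.hypot(rmm1, rmm2)

def mjo_angle(rmm1, rmm2):
    """Four-quadrant angle of M in radians, between -pi and pi."""
    return math.atan2(rmm2, rmm1)

def mjo_phase(rmm1, rmm2):
    """Equal-angle phase 1..8.

    Assumed numbering: 45-degree sectors counted anticlockwise,
    with phase 1 starting at the negative RMM1 axis.
    """
    sector = int((mjo_angle(rmm1, rmm2) + math.pi) // (math.pi / 4))
    return min(sector, 7) + 1   # guard the angle == pi boundary

# An anticlockwise circuit of the origin visits the phases in order
for degrees in range(-170, 180, 45):
    rmm1 = 1.5 * math.cos(math.radians(degrees))
    rmm2 = 1.5 * math.sin(math.radians(degrees))
    print(degrees, mjo_phase(rmm1, rmm2), round(mjo_amplitude(rmm1, rmm2), 2))
```

For compositing, days would typically be grouped by phase after excluding those with small amplitude; the threshold is a choice left to the analyst and is not specified on the slide.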
III-31
MJO propagation based on vector M in the two dimensional phase space
OLR
contour interval = 4 W m⁻²
blue negative
850 hPa wind
Max vector = 4.5 m s⁻¹
Courtesy of M. Wheeler
III-32
First two rotated PCAs of Indian/Pacific SSTAs using data from Jan 1949 to Dec 1991.
Rotated PCs
Courtesy of W. Drosdowsky
• Facilitate physical interpretation
• Review by Richman (1986) and by Jolliffe (1989, 2002)
• New set of variables: RPCs
• Varimax is a classic rotation technique (many others exist)
III-33
Other multivariate analyses
• Extended EOFs and Complex (Hilbert) EOFs are two classical extensions of PCs
• Canonical Correlation Analysis: extension of PCA to two multivariate data sets: forecasting one variable with the other (book by Wilks, 1995).
• Principal Oscillation Patterns (POPs), Principal Interaction Patterns (PIPs) and SVD are other techniques used (books by von Storch and Navarra, 1995, and von Storch and Zwiers, 1999)
• Discriminant analysis (e.g. the operational seasonal forecast of the BoM): the conditioning is on the predictand and in a sense the reverse conditional probabilities are estimated from the data, and Bayes theorem is used to invert these (article by Drosdowsky and Chambers, 2001)
• Analogue (lecture 7), clustering (book by Wilks, 1995) and NHMM (next slide) are other techniques dealing with classification.
• All techniques can be used for forecasting and downscaling
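The Bayes inversion mentioned for discriminant analysis can be written out explicitly. The notation is assumed here rather than taken from the cited article: $C_k$ is the k-th category of the predictand, $x$ the predictor value, $f(x \mid C_k)$ the distribution of the predictor estimated from past cases in category $C_k$, and $P(C_k)$ the climatological probability of that category.

$$P(C_k \mid x) = \frac{f(x \mid C_k)\,P(C_k)}{\sum_{j} f(x \mid C_j)\,P(C_j)}$$

The left-hand side is the forecast probability of category $C_k$ given the observed predictor; the terms on the right are the "reverse" conditional probabilities estimated from the data.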
III-34
[Figure: pressure maps (contours labelled 1000 to 1020, with H and L centres) for weather states Type 5 and Type 3, each with a 0.2 to 1 scale]
Another downscaling approach
Non-homogeneous Hidden Markov Model: makes use of unobserved “hidden” weather states which are related to observed rainfall structures
Courtesy of S. Charles
III-35
Summary:
• Many interactions in the system → correlation
• Many issues with correlation: robustness, causality
• Large-scale structures exist → multivariate analyses
• Useful for filtering, organizing and reducing the noise in data
• Forecasting uses many of these statistical tools
Toolbox to analyse our dynamic climate system … and a basis for climate forecasting