Rotated EOFs:

•  When the domain is larger than optimal for conventional EOF analysis, but still small enough that the real structure in the data is not completely obscured by sampling variability, the EOF patterns can be simplified or made more robust by taking linear combinations of (that is, rotating) the leading EOFs and projecting them back onto the input data matrix [X] to obtain the PCs. The result is referred to as the rotated EOFs (REOFs).

•  REOFs are often more robust to sampling variability because physical variables tend to be correlated rather than orthogonal. REOF works better when the domain is larger than the structure of the variability in the data. When the domain is smaller, REOF tends to produce patterns similar to the unrotated EOFs.

Problems with EOF (unrotated)

1.  Orthogonality constraint, which limits the physical interpretation of EOFs

•  The mathematics of the procedure constrains the higher-order components to be orthogonal to the first component, although natural processes do not necessarily have such orthogonal patterns.

•  Natural processes could be correlated with each other (von Storch and Zwiers, 1999).

•  While it may be possible to identify the physical mode associated with the first mode, those associated with the second and higher-order modes may not prevail in the real world.
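The orthogonality constraint can be seen directly in a small numerical sketch. The example below is illustrative only: the data matrix is random, and the helper name `eof_analysis` is an assumption rather than anything from the course material. It computes EOFs with a singular value decomposition and shows that the patterns are mutually orthogonal and the PCs mutually uncorrelated, whatever the data look like.

```python
import numpy as np

def eof_analysis(X):
    """Return (eofs, pcs, eigenvalues) for a data matrix X shaped (time, space)."""
    Xa = X - X.mean(axis=0)                      # remove the time mean at each point
    U, s, Vt = np.linalg.svd(Xa, full_matrices=False)
    eigenvalues = s**2 / (Xa.shape[0] - 1)       # variance carried by each mode
    eofs = Vt                                    # each row is one spatial pattern
    pcs = Xa @ Vt.T                              # project the data onto the EOFs
    return eofs, pcs, eigenvalues

# Synthetic data with correlated columns (purely illustrative).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 12)) @ rng.standard_normal((12, 12))

eofs, pcs, lam = eof_analysis(X)
gram = eofs @ eofs.T                 # identity matrix: the EOFs are orthonormal
pc_cov = np.cov(pcs, rowvar=False)   # diagonal matrix: the PCs are uncorrelated
```

Both properties are imposed by the method, not discovered in the data, which is why higher-order modes need not correspond to physical processes.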

Problems with EOF (unrotated)

2.  Effect of domain shape

•  Meteorological stations are not spaced uniformly. Unrotated EOF/PCA has no information about the spatial distribution of the original data points (Wilks, 2006).

•  If EOF/PC analysis is applied to irregularly spaced data, the variability of the element under study in localized data-rich regions is overemphasized relative to that of other regions (Karl et al., 1982).

•  A similar problem arises when data from a regular latitude-longitude grid are used, because the number of grid points per unit area increases with increasing latitude, or when the longitude and latitude spacings differ.
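To make the latitude effect concrete, the sketch below computes how the density of grid points grows towards the poles on a regular grid. It also shows a weighting that is commonly used as an alternative to regridding: scaling the anomalies by the square root of the cosine of latitude before the covariance matrix is formed. That weighting is an assumption added here for illustration and is not the remedy named in these slides; the 5-degree grid and the function name `weight_anomalies` are likewise hypothetical.

```python
import numpy as np

lat = np.arange(-87.5, 90.0, 5.0)        # hypothetical 5-degree grid-cell centres
coslat = np.cos(np.deg2rad(lat))

# Grid points per unit area relative to the equator: cells shrink as cos(lat).
relative_density = 1.0 / coslat

# Weights applied to anomalies so that covariances scale with cell area.
weights = np.sqrt(coslat)

def weight_anomalies(Xa, lat_of_point):
    """Scale anomalies Xa (time, space) by sqrt(cos(lat)) of each grid point."""
    return Xa * np.sqrt(np.cos(np.deg2rad(lat_of_point)))

# Example: anomalies at every latitude of the hypothetical grid.
rng = np.random.default_rng(0)
Xa = rng.standard_normal((10, lat.size))
Xw = weight_anomalies(Xa, lat)
```

Because the weight enters each of the two factors in a covariance, the product of two weighted anomalies is scaled by cos(lat), which is proportional to the cell area.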

Problems with EOF (unrotated)

3.  The third problem relates to the effect of the shape of the data domain, termed "Buell patterns" or "Buell effects".

•  In a rectangular domain, points at the centre have the strongest correlation with all other points because their average distance to the other points is a minimum. Thus, the first component tends to represent the dominant central points.

•  Given the orthogonality constraint, the second component represents the next dominant pattern, which happens to be the corner points of the domain, i.e. the points with the lowest correlation.

•  Such an artificial pattern arises from the mathematics behind the eigenvector calculation (Wilks, 2006).
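A Buell pattern can be reproduced without any real data. The sketch below is a toy experiment under stated assumptions: a hypothetical 7 x 11 rectangular grid and a correlation that decays exponentially with distance and contains no preferred structure. The leading eigenvector still peaks at the centre of the domain, and the second one changes sign across it, purely because of the domain geometry.

```python
import numpy as np

ny, nx = 7, 11                                   # hypothetical rectangular grid
yy, xx = np.meshgrid(np.arange(ny), np.arange(nx), indexing="ij")
pts = np.column_stack([yy.ravel(), xx.ravel()]).astype(float)

# Pairwise distances and an assumed isotropic correlation function.
dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
corr = np.exp(-dist / 4.0)

evals, evecs = np.linalg.eigh(corr)              # eigh returns ascending eigenvalues
order = np.argsort(evals)[::-1]                  # reorder to descending
evecs = evecs[:, order]

eof1 = evecs[:, 0].reshape(ny, nx).copy()
eof2 = evecs[:, 1].reshape(ny, nx).copy()
eof1 *= np.sign(eof1.sum())                      # fix the arbitrary overall sign

centre = np.unravel_index(np.argmax(eof1), eof1.shape)
```

Here `centre` is the grid index of the largest loading of the first eigenvector.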

Circumventing the problems

•  Shape-dependent problems can be addressed by interpolating the data onto an equal-area grid.

•  The orthogonality problem can be addressed by using the technique of rotated EOFs.

•  REOF relaxes the orthogonality constraint.

Technique used for rotation: Varimax

•  REOF seeks an orthogonal rotation of the factor-loading matrix $E = (e_{jp})$ into a new factor matrix $B = (b_{jp})$ for which the variance of the squared elements of each rotated vector is a maximum.

•  Varimax is the simplest method. It starts from the EOFs ($e_k$) weighted by the square roots of their respective eigenvalues, which gives the loadings ($b_{jk}$). The loadings are rotated to maximize the factor $s_p^2$, which makes the large loadings larger and the small loadings smaller.

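•  Orthogonal rotations must satisfy the equation:

$$B = A\,T$$

B – a k x m rotated EOF loadings matrix

A – a k x m initial unrotated loading matrix

T – an m x m orthonormal transformation matrix

As T is an orthonormal matrix,

$$T\,T^{\mathsf{T}} = T^{\mathsf{T}}\,T = I$$

where I is an identity matrix.

With $b_{jk} = \sqrt{\lambda_k}\,e_{jk}$, the EOFs are rotated to maximize the factor

$$s_p^2 = \frac{1}{M}\sum_{j=1}^{M}\left\{b_{jp}^2\right\}^2 - \frac{1}{M^2}\left\{\sum_{j=1}^{M} b_{jp}^2\right\}^2, \qquad p = 1, 2, \ldots, m$$

$$s^2 = \sum_{p=1}^{m} s_p^2$$

where M is the number of rows of B (written k in the matrix dimensions above).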

•  Because the rotation pushes the large loadings towards 1 and the remaining loadings towards zero, REOF produces highly localized patterns that look like one-point correlation maps. However, the pattern is determined objectively, rather than subjectively as in a one-point correlation.
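The varimax rotation can be sketched in a few lines of NumPy. The version below is a minimal illustration, not the code used for the class exercise: it uses the widely used SVD-based iteration for the raw varimax criterion, and the function names, the synthetic data and the choice of m = 3 retained modes are all assumptions made for the example.

```python
import numpy as np

def varimax_criterion(B):
    """Sum over modes of the variance of the squared loadings (s^2)."""
    M = B.shape[0]
    sq = B**2
    s_p = np.sum(sq**2, axis=0) / M - (np.sum(sq, axis=0) / M) ** 2
    return float(np.sum(s_p))

def varimax(A, max_iter=500, tol=1e-12):
    """Rotate loadings A (k x m); return B = A @ T and the orthonormal T."""
    k, m = A.shape
    T = np.eye(m)
    d_old = 0.0
    for _ in range(max_iter):
        B = A @ T
        # Gradient-like matrix of the varimax criterion with respect to T.
        G = A.T @ (B**3 - B * (np.sum(B**2, axis=0) / k))
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt                      # nearest orthonormal matrix to G
        d = s.sum()
        if d_old != 0.0 and d / d_old < 1.0 + tol:
            break
        d_old = d
    return A @ T, T

# Synthetic data: three localized patterns driven by correlated time series.
rng = np.random.default_rng(3)
nt, nx = 300, 30
patterns = np.zeros((3, nx))
patterns[0, :10] = 1.0
patterns[1, 10:20] = 1.0
patterns[2, 20:] = 1.0
z = rng.standard_normal((nt, 3))
z[:, 1] += 0.6 * z[:, 0]
z[:, 2] += 0.6 * z[:, 0]
X = z @ patterns + 0.2 * rng.standard_normal((nt, nx))

Xa = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xa, full_matrices=False)
lam = s**2 / (nt - 1)
m = 3
A = Vt[:m].T * np.sqrt(lam[:m])         # loadings b_jk = sqrt(lambda_k) e_jk
B, T = varimax(A)
rpcs = Xa @ B                           # data projected onto the rotated loadings (unnormalised)
```

Comparing `varimax_criterion(A)` with `varimax_criterion(B)` shows how much the rotation has simplified the loadings.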

Selecting modes for rotation

•  "Broken stick" (scree plot) approach: plot the eigenvalues against mode number and find the mode where the slope levels off.
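One way to make this selection less subjective is sketched below. It compares the fraction of variance explained by each mode with the fraction expected if the total variance were split at random, like a stick broken into pieces. This is one common reading of the "broken stick" name and may differ from the visual scree test intended in the slide; the eigenvalues are made-up numbers and the function names are hypothetical.

```python
import numpy as np

def explained_fraction(eigenvalues):
    """Fraction of total variance carried by each mode."""
    lam = np.asarray(eigenvalues, dtype=float)
    return lam / lam.sum()

def broken_stick(n_modes):
    """Expected fractions for a unit stick broken at random into n_modes pieces.

    Piece k (1-based, largest first) has expected length (1/n) * sum_{i=k}^{n} 1/i.
    """
    inv = 1.0 / np.arange(1, n_modes + 1)
    return np.cumsum(inv[::-1])[::-1] / n_modes

lam = np.array([5.0, 2.5, 1.2, 0.4, 0.3, 0.25, 0.2, 0.15])   # made-up eigenvalues
frac = explained_fraction(lam)
stick = broken_stick(lam.size)

# Retain the leading modes that explain more than the random expectation.
above = frac > stick
n_retain = int(np.argmin(above)) if not above.all() else lam.size
```

Plotting `frac` and `stick` against mode number gives the scree plot with the reference curve overlaid.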

Use EOF to identify the main sources of variance:

•  In many cases, a few leading EOFs explain much of the total variance in the data. We can therefore use these few leading EOFs to represent or reconstruct the data. The reconstruction is approximately the same as the original data, but with the noise removed and with a clearer link to the one or several sources of variability represented by the leading EOF modes. One can also use these leading EOF modes to predict the original data.

•  For example, to reconstruct or predict the variation of the global sea surface temperature anomalies (SSTA), we can use the first three EOF modes (or rotated EOFs) from our in-class exercise, which explain ?? of the total variance of the global SSTA.
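The reconstruction step can be sketched as follows. This is a minimal illustration on synthetic data, not the SSTA exercise itself: the helper name `reconstruct`, the two-signal test field and the noise level are assumptions. It truncates the SVD to the leading modes and reports the fraction of variance they explain.

```python
import numpy as np

def reconstruct(X, n_modes):
    """Rebuild X (time, space) from its leading n_modes EOFs.

    Returns the reconstructed field and the fraction of variance explained.
    """
    mean = X.mean(axis=0)
    Xa = X - mean
    U, s, Vt = np.linalg.svd(Xa, full_matrices=False)
    pcs = U[:, :n_modes] * s[:n_modes]           # time coefficients
    eofs = Vt[:n_modes]                          # spatial patterns
    explained = np.sum(s[:n_modes] ** 2) / np.sum(s**2)
    return mean + pcs @ eofs, explained

# Synthetic field: an annual cycle plus a trend, buried in noise.
rng = np.random.default_rng(1)
t = np.arange(120)
signal = (3.0 * np.outer(np.sin(2 * np.pi * t / 12), rng.standard_normal(30))
          + 2.0 * np.outer(t / 120.0, rng.standard_normal(30)))
X = signal + 0.3 * rng.standard_normal(signal.shape)

X3, explained = reconstruct(X, 3)
residual_fraction = np.sum((X - X3) ** 2) / np.sum((X - X.mean(axis=0)) ** 2)
```

The variance left in the residual is exactly one minus the explained fraction, so `explained` is the number that would fill the "??" for a real data set.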

First 3 EOF modes

Courtesy of Nelun Fernando and Bing Pu

Points to note about the EOF modes

•  Land values were set to 0 before the analysis.

•  They appear in the colours corresponding to 0 in the colour map.

•  We will see how this can be changed by editing the figure properties.

•  EOF1 shows a time series with a clear declining trend.

•  Is this physically meaningful? What do we know about global SST trends?

First Rotated EOF mode (REOF1) and PC time coefficients

•  Still shows a declining trend in the time coefficients.

•  Let's multiply the loadings and the time coefficients by -1.

Original EOF1

REOF1 and PC time coefficients after multiplying by -1

•  The time coefficients now show an increasing trend.

•  We now need to adjust the colours in the map to reflect the actual data (reds/oranges – warmer SSTs; blues/greens – cooler SSTs).
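Multiplying both the loadings and the time coefficients by -1 is legitimate because the sign of an EOF mode is arbitrary: only the product of pattern and time series enters the data. The short check below uses stand-in arrays, not the SSTA results, to show that the mode's contribution to the field is unchanged while the trend of the time coefficients reverses.

```python
import numpy as np

rng = np.random.default_rng(2)
eof = rng.standard_normal(50)            # stand-in spatial loading pattern
pc = np.linspace(1.0, -1.0, 40)          # stand-in time coefficients, declining

field = np.outer(pc, eof)                # contribution of this mode to the data
field_flipped = np.outer(-pc, -eof)      # both factors multiplied by -1

steps = np.arange(pc.size)
trend_before = np.polyfit(steps, pc, 1)[0]     # slope of the original coefficients
trend_after = np.polyfit(steps, -pc, 1)[0]     # slope after the sign flip
```

The two fields are identical, so the choice of sign should be the one that makes the map and the time series easiest to interpret physically.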

REOF1 after correcting colour map

•  What does the REOF1 mode show?

Useful references

•  Barnston, A. G., and Livezey, R. E. (1987), Classification, seasonality and persistence of low-frequency atmospheric circulation patterns, Monthly Weather Review, 115, 1083-1126.

•  Karl, T. R., Koscielny, A. J., and Diaz, H. F. (1982), Potential errors in the application of principal component (eigenvector) analysis to geophysical data, Journal of Applied Meteorology, 21, 1183-1186.

•  Kim, K. Y., and Wu, Q. (1999), A comparison study of EOF techniques: analysis of nonstationary data with periodic statistics, Journal of Climate, 12, 185-199.

•  O'Lenic, E. A., and Livezey, R. E. (1988), Practical considerations in the use of Rotated Principal Component Analysis (RPCA) in diagnostic studies of upper-air height fields, Monthly Weather Review, 116, 1682-1689.

•  Richman, M. B. (1987), Rotation of principal components: a reply, Journal of Climatology, 7, 511-520.

•  Trenberth, K. E., Branstator, G. W., Karoly, D., Kumar, A., Lau, N., and Ropelewski, C. (1998), Progress during TOGA in understanding and modeling global teleconnections associated with tropical sea surface temperatures, Journal of Geophysical Research, 103, 14291-14324.

Complex EOF

•  Conventional EOF analysis detects standing oscillations. Propagating oscillations, such as waves, often show up as two or more separate EOFs instead of a single mode of variation.

•  For large data sets with unknown dominant frequencies and spatial scales, cross-spectral analysis is less informative, especially for regional, non-stationary phenomena characterized by short-lived, irregularly occurring episodes or propagating wave signals. CEOF is a more suitable tool.

A scalar field $x_{j,t}$ can be represented by

$$x_{j,t} = \sum_{\omega}\left[a_j(\omega)\cos(\omega t) + b_j(\omega)\sin(\omega t)\right]$$

A propagating feature can be described by

$$X_{j,t} = \sum_{\omega}\Big\{\left[a_j(\omega)\cos(\omega t) + b_j(\omega)\sin(\omega t)\right] + i\left[b_j(\omega)\cos(\omega t) - a_j(\omega)\sin(\omega t)\right]\Big\} = x_{j,t} + i\,\hat{x}_{j,t}$$

where $x_{j,t}$ is the original data and $\hat{x}_{j,t}$ is the quadrature function, or Hilbert transform, of $x_{j,t}$. Its amplitude is the same as that of $x_{j,t}$, but its phase is advanced by $\pi/2$ (Horel 1984).

•  Examples of Hilbert transformation: the Hilbert transform does not act as a low-pass filter on the data. It contains as much energy due to noise as the original data, and it may redistribute the noise to different parts of the time series.

[Figure panels: original data; Hilbert transform]
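The quadrature function can be computed with an FFT, as in the sketch below. Two assumptions should be noted. First, the function names are hypothetical and the implementation is a generic FFT construction of the analytic signal, not the code behind the figure. Second, the standard analytic signal has the opposite sign on its imaginary part from the expansion given earlier (it delays the phase by π/2 instead of advancing it), so the sketch negates the imaginary part to match the convention used in the text.

```python
import numpy as np

def analytic_signal(x):
    """Standard analytic signal x + i*H[x] of a 1-D series, computed with the FFT."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spec = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0                       # keep the mean
    if n % 2 == 0:
        h[n // 2] = 1.0              # keep the Nyquist component
        h[1:n // 2] = 2.0            # double the positive frequencies
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spec * h)     # negative frequencies are zeroed

def quadrature(x):
    """x-hat with the phase advanced by pi/2, the sign convention used in the text."""
    return -np.imag(analytic_signal(x))

n = 256
t = np.arange(n)
x = np.cos(2 * np.pi * 5 * t / n) + 0.5 * np.sin(2 * np.pi * 12 * t / n)
xhat = quadrature(x)

# With a = 1 at the first frequency and b = 0.5 at the second,
# the expansion gives x-hat = -a*sin(.) + b*cos(.).
expected = -np.sin(2 * np.pi * 5 * t / n) + 0.5 * np.cos(2 * np.pi * 12 * t / n)
```

The variance of `xhat` equals the variance of `x` for this series, which illustrates that the transform shifts phase without removing energy.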

Vector form of the complex time series

•  One can apply eigen-analysis to the correlation of $X_j(t)$ between the jth and kth locations in a similar way to conventional EOF analysis. After subtracting the mean and normalizing by the standard deviation, we obtain the correlation matrix

$$r_{j,k} = \left[X_j^{*}(t)\,X_k(t)\right]_t = \sum_{n=1}^{N} e_{jn}\,e_{kn}^{*}$$

where $e_{jn}$ is the eigenvector and $e_{jn}^{*}$ is its complex conjugate. Applying eigen-analysis to $r_{j,k}$ in a similar way as for regular EOFs gives

$$X_j(t) = \sum_{n=1}^{N} e_{jn}^{*}\,Z_n(t)$$

$$e_{jn}^{*} = \left[X_j^{*}(t)\,P_n(t)\right]_t = s_{jn}\,e^{i\theta_{jn}}$$

where $s_{jn}$ is the magnitude of the correlation and $e^{i\theta_{jn}}$ is the phase of the correlation. $P_n(t) = T_n(t)\,e^{i\phi_n(t)}$ is the PC.

Basic steps (Barnett 1983, 1984)

§  Start from a data set $p(x,t)$

§  Hilbert transform to get a complex data set $P(x,t) = p(x,t) + i\,\hat{p}(x,t)$

§  Get the covariance matrix $C(x,x') = \langle P^{*}(x,t)\,P(x',t)\rangle$

§  Get the real eigenvalues $\lambda_n$, complex eigenvectors $B_n(x)$, and associated principal components $A_n(t)$

§  Spatial phase and amplitude functions: $\theta_n(x) = \arctan\left[\mathrm{Im}\,B_n(x) / \mathrm{Re}\,B_n(x)\right]$, $S_n(x) = \left[B_n(x)\,B_n^{*}(x)\right]^{0.5}$

§  Temporal phase and amplitude functions: $\phi_n(t) = \arctan\left[\mathrm{Im}\,A_n(t) / \mathrm{Re}\,A_n(t)\right]$, $R_n(t) = \left[A_n(t)\,A_n^{*}(t)\right]^{0.5}$
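The basic steps can be sketched in NumPy as follows. This is an illustrative outline under stated assumptions, not a port of the Matlab code linked below: the function names are hypothetical, the Hilbert transform is a generic FFT construction with the sign of the imaginary part chosen to match the phase-advanced convention used earlier, and the test field is an idealized propagating wave.

```python
import numpy as np

def hilbert_complex(p):
    """Return P = p + i*p_hat for p shaped (time, space), transform taken in time."""
    n = p.shape[0]
    spec = np.fft.fft(p, axis=0)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    # Conjugate of the standard analytic signal: phase of p_hat advanced by pi/2.
    return np.conj(np.fft.ifft(spec * h[:, None], axis=0))

def complex_eof(p):
    """Complex EOF analysis of p (time, space)."""
    P = hilbert_complex(p - p.mean(axis=0))
    C = P.conj().T @ P / P.shape[0]          # C(x, x') = <P*(x,t) P(x',t)>
    lam, B = np.linalg.eigh(C)               # Hermitian matrix: real eigenvalues
    order = np.argsort(lam)[::-1]
    lam, B = lam[order], B[:, order]
    A = P @ B                                # complex PCs, so that P = A @ B^H
    return {
        "fraction": lam / lam.sum(),
        "S": np.abs(B), "theta": np.angle(B),    # spatial amplitude and phase
        "R": np.abs(A), "phi": np.angle(A),      # temporal amplitude and phase
    }

# Idealized wave propagating in x: p(x, t) = cos(k x - w t).
nt, nx = 128, 20
t = np.arange(nt)[:, None]
x = np.linspace(0.0, 2.0 * np.pi, nx, endpoint=False)[None, :]
p = np.cos(x - 2.0 * np.pi * 8 * t / nt)

ceof = complex_eof(p)

# For comparison: a conventional (real) EOF analysis splits the wave in two.
real_lam = np.linalg.eigvalsh(np.cov(p, rowvar=False))[::-1]
real_fraction = real_lam / real_lam.sum()
```

In this idealized case the first complex mode carries essentially all of the variance and its spatial phase `theta` changes steadily with x, while the conventional analysis needs a pair of modes with equal variance to describe the same wave.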

Matlab code

http://jisao.washington.edu/vimont_matlab/Stat_Tools/complex_eof.html