
2013.06.18 Time Series Analysis Workshop ..Applications in Physiology, Climate Change and Finance, part 3


Nonlinear analysis of time series

Linear analysis / linear models

ARMA(p,q) model:

$$x_t = \phi_1 x_{t-1} + \dots + \phi_p x_{t-p} + z_t + \theta_1 z_{t-1} + \dots + \theta_q z_{t-q}$$

Advantages:

1. Simple

2. Gaussian process, established theory for stochastic processes and statistical inference

3. Useful in applications

Shortcomings:

1. Cannot explain irregular patterns in the time series

- data (distribution) asymmetry

- time irreversibility

- «bursts»

2. The deterministic part is limited to:

- stable fixed point system

- unstable system

- periodic system

Linear analysis: autocorrelation, AR model

Time series, Part 3: Nonlinear analysis of time series

- description of irregular patterns

- explanation / detection of complex deterministic patterns

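A general nonlinear model:

$$X_t = f(X_{t-1}, X_{t-2}, \dots, X_{t-p}, \varepsilon_t)$$

Nonlinear model with additive noise:

$$X_t = f(X_{t-1}, X_{t-2}, \dots, X_{t-p}) + \varepsilon_t$$

where $\mathbf{X}_{t-1} = [X_{t-1}, X_{t-2}, \dots, X_{t-p}]'$ and $f: \mathbb{R}^p \to \mathbb{R}$. Which $f$ ?

Linear AR model:

$$X_t = \phi_1 X_{t-1} + \phi_2 X_{t-2} + \dots + \phi_p X_{t-p} + \varepsilon_t$$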

Generalizations / extensions of the AR model

Coefficients $\phi_1, \phi_2, \dots, \phi_p$:

- constant (linear AR)

- random coefficients: RCA, BL

- piecewise models: SETAR, Markovian

Noise variance:

- constant (linear AR, ARMA)

- function of $X_t$: ARCH, GARCH

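Self-excited threshold autoregressive models (SETAR)

$l$ sets of AR coefficients:

$$\phi_1^{(1)}, \phi_2^{(1)}, \dots, \phi_p^{(1)}; \quad \phi_1^{(2)}, \phi_2^{(2)}, \dots, \phi_p^{(2)}; \quad \dots; \quad \phi_1^{(l)}, \phi_2^{(l)}, \dots, \phi_p^{(l)}$$

Partition of $\mathbb{R}^p$. Simpler: selection of a lag $d$ and partition of $\mathbb{R}$ for $X_{t-d}$:

thresholds $r_0, r_1, \dots, r_{l-1}, r_l$ with $-\infty = r_0 < r_1 < \dots < r_l = \infty$

$$\mathbb{R} = R_1 \cup R_2 \cup \dots \cup R_l, \qquad R_i = (r_{i-1}, r_i], \quad i = 1, \dots, l$$

SETAR:

$$X_t = \phi_1^{(j)} X_{t-1} + \phi_2^{(j)} X_{t-2} + \dots + \phi_p^{(j)} X_{t-p} + \varepsilon_t^{(j)} \quad \text{when } X_{t-d} \in R_j$$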

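Example for SETAR

Two regimes selected by the sign of $X_{t-1}$ (threshold 0, lag $d = 1$), with $\varepsilon_t \sim N(0,1)$: in one regime the constant and the lag-1 coefficient have magnitudes 2.0 and 0.6, in the other regime 1.0 and 0.4.

[Figure: scatter diagram of $(x_{t-1}, x_t)$ for a SETAR model; horizontal axis x(t-1), vertical axis x(t)]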
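A two-regime SETAR series and the points of its $(x_{t-1}, x_t)$ scatter diagram can be generated with a few lines of Python. This is only a sketch: the function name `simulate_setar`, the burn-in length and, in particular, the regime coefficients are hypothetical values chosen for illustration, not the ones of the example above.

```python
import random

def simulate_setar(n, regimes, threshold=0.0, d=1, sigma=1.0, seed=0, burn_in=200):
    """Simulate a two-regime SETAR model of order 1.

    regimes = ((c_low, phi_low), (c_high, phi_high)); the low regime applies
    when X_{t-d} <= threshold, the high regime otherwise.
    """
    rng = random.Random(seed)
    (c_low, phi_low), (c_high, phi_high) = regimes
    x = [0.0] * d                      # initial values
    for _ in range(n + burn_in):
        c, phi = (c_low, phi_low) if x[-d] <= threshold else (c_high, phi_high)
        x.append(c + phi * x[-1] + rng.gauss(0.0, sigma))
    return x[-n:]                      # drop the burn-in part

# Hypothetical coefficients, for illustration only.
series = simulate_setar(500, regimes=((1.0, 0.6), (-1.0, 0.4)))
pairs = list(zip(series[:-1], series[1:]))   # points (x_{t-1}, x_t)
```

Plotting `pairs` shows two noisy line segments, one per regime, joined at the threshold.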

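AR models with probabilistic selection of threshold

Exponential autoregressive models (EAR)

$$X_t = \phi_1^{(j)} X_{t-1} + \phi_2^{(j)} X_{t-2} + \varepsilon_t, \qquad j = \begin{cases} 1 & \text{with probability } \alpha \\ 2 & \text{with probability } 1 - \alpha \end{cases}$$

AR models with periodic coefficients

$$X_t = \phi_1^{(j)} X_{t-1} + \phi_2^{(j)} X_{t-2} + \varepsilon_t, \qquad j = \begin{cases} 1 & \text{when } t = 2k \\ 2 & \text{when } t = 2k + 1 \end{cases}$$

Example: $\phi_2^{(1)} = 0$ and $\phi_1^{(2)} = 0$, so that regime 1 uses only $X_{t-1}$ and regime 2 uses only $X_{t-2}$.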

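Markov chain driven AR models

The selection of the threshold is determined by a Markov chain $J_t \in \{1, 2, \dots, l\}$ with transition probabilities $P(J_t = j \mid J_{t-1} = i)$ (transition matrix).

Example

$$X_t = \phi^{(J_t)} X_{t-1} + \varepsilon_t, \qquad \phi^{(1)} = 0.9, \quad \phi^{(2)} = -0.9$$

$$P(J_t = j \mid J_{t-1} = i) = \begin{pmatrix} 0.1 & 0.9 \\ 0.2 & 0.8 \end{pmatrix}$$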

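Piecewise polynomial models

$$X_t = f(X_{t-1}, X_{t-2}, \dots, X_{t-p}) + \varepsilon_t$$

$$X_t = p_m(X_{t-1}, X_{t-2}, \dots, X_{t-p}) + \varepsilon_t$$

where $p_m$ is a polynomial of order p and degree m.

Example: logistic map, $a > 1$

$$X_t = a X_{t-1} (1 - X_{t-1}) = a X_{t-1} - a X_{t-1}^2$$

Two fixed points: 0 and $(a-1)/a$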
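The two fixed points follow from setting $X_t = X_{t-1} = x^*$ in the logistic map. The short derivation below also adds the standard stability check through the derivative of the map, which is not stated on the slide.

```latex
\begin{aligned}
x^* = a x^* (1 - x^*)
  &\iff x^* \bigl( a - 1 - a x^* \bigr) = 0
  \iff x^* = 0 \ \text{ or } \ x^* = \frac{a-1}{a}, \\
g(x) = a x (1 - x), \quad g'(x) = a (1 - 2x):
  &\quad g'(0) = a, \qquad g'\!\left(\tfrac{a-1}{a}\right) = 2 - a .
\end{aligned}
```

For $a > 1$ the fixed point 0 is therefore unstable ($|g'(0)| > 1$), while $(a-1)/a$ is stable only for $1 < a < 3$.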

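Fractional autoregressive models

Fraction of two polynomials:

$$X_t = \frac{a_0 + \sum_{j=1}^{p} a_j X_{t-1}^{j}}{b_0 + \sum_{j=1}^{q} b_j X_{t-1}^{j}} + \varepsilon_t, \qquad a_p \neq 0, \quad b_q \neq 0$$

Example: [figure]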

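Random coefficients autoregressive models (RCA)

AR(1) with multiplicative errors: $X_t = \varepsilon_t X_{t-1}$

RCA:

$$X_t = \sum_{i=1}^{p} \bigl( b_i + B_i(t) \bigr) X_{t-i} + \varepsilon_t$$

Coefficients $\phi_i = b_i + B_i(t)$, where $b_i$ is constant and $B_1(t), B_2(t), \dots, B_p(t)$ are random with mean 0, independent of $\varepsilon_t$ and of the past values of $X_t$.

Example: $X_t = \bigl( 0.1 + B(t) \bigr) X_{t-1} + \varepsilon_t$, $\quad B(t) \sim N(0, 0.9^2)$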

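Bilinear models (BL)

BL of order 1: $X_t = a X_{t-1} + b X_{t-1} \varepsilon_{t-1} + \varepsilon_t$

$$X_t = \sum_{i=1}^{p} \bigl( a_i + A_i(t) \bigr) X_{t-i} + \varepsilon_t, \qquad A_i(t) = \sum_{k=1}^{s} b_{ik} \varepsilon_{t-k}$$

Coefficients $\phi_i = a_i + A_i(t)$

“Bilinear” because:

- If $X_s$ is constant for $s < t$, the model is linear w.r.t. $\varepsilon_s$, $s \le t$

- If $\varepsilon_s$ is constant for $s < t$, the model is linear w.r.t. $X_s$, $s \le t$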

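AR models with conditional heteroscedasticity

$X_t \sim$ ARCH $\;\Rightarrow\;$ $X_t^2 \sim$ BL

ARCH (model of multiplicative noise):

$$X_t = \varepsilon_t \sqrt{V_t}, \qquad V_t = \gamma + \phi_1 X_{t-1}^2 + \dots + \phi_p X_{t-p}^2, \qquad \gamma > 0, \quad \phi_i \ge 0, \qquad \varepsilon_t \sim N(0, \sigma^2)$$

GARCH:

$$X_t = \varepsilon_t \sqrt{V_t}, \qquad V_t = \gamma + \sum_{i=1}^{p} \phi_i X_{t-i}^2 + \sum_{i=1}^{q} \psi_i V_{t-i}, \qquad \gamma > 0, \quad \phi_i \ge 0, \quad \psi_i \ge 0$$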
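The multiplicative-noise structure of ARCH is easy to see in a simulation. The sketch below generates an ARCH(1) series, assuming unit-variance Gaussian $\varepsilon_t$; the function name `simulate_arch1` and the parameter values `gamma = 0.2`, `phi = 0.5` are hypothetical choices for illustration.

```python
import math
import random

def simulate_arch1(n, gamma=0.2, phi=0.5, seed=0):
    """Simulate X_t = eps_t * sqrt(V_t) with V_t = gamma + phi * X_{t-1}^2, eps_t ~ N(0, 1)."""
    rng = random.Random(seed)
    x_prev = 0.0
    xs, vs = [], []
    for _ in range(n):
        v = gamma + phi * x_prev ** 2          # conditional variance given the past
        x_prev = rng.gauss(0.0, 1.0) * math.sqrt(v)
        xs.append(x_prev)
        vs.append(v)
    return xs, vs

xs, vs = simulate_arch1(2000)
sample_var = sum(x * x for x in xs) / len(xs)
# For 0 <= phi < 1 the unconditional variance of ARCH(1) is gamma / (1 - phi), here 0.4.
```

Large values of $X_{t-1}^2$ raise $V_t$, so large fluctuations tend to cluster in time even though the series itself is uncorrelated.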

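Analysis with nonlinear models

1. Model selection

M candidate models, m = 1,...,M

$$\mathrm{AIC}(m) = -2 \ln g_m\bigl( \mathbf{x} \mid \hat{\boldsymbol{\theta}}(\mathbf{x}) \bigr) + 2r$$

where $g_m$ is the likelihood of model m and $r$ is the number of its parameters.

2. Parameter estimation

- maximum likelihood method

- method of ordinary least squares

3. Diagnostic checking

errors (residuals):

- uncorrelated

- following normal distribution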

Real world time series

[Figures: examples from mechanics, physiology, geophysics, economics]

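Nonlinear time series analysis and dynamical systems

Time series $x_1, x_2, \dots, x_n$

Assumption: the time series is the observation of a nonlinear dynamical system

$$\mathbf{s}_t = f^t(\mathbf{s}_0)$$

$\mathbf{s}_t \in \mathbb{R}^d$ : trajectory of the dynamical system

$\mathbf{s}_0$ : state vector at time 0

$f^t: \mathbb{R}^d \to \mathbb{R}^d$ : system function

$t$ : continuous or discrete time

For time series we assume underlying systems to be dissipative: the trajectory in $\mathbb{R}^d$ settles on an attractor.

$h: \mathbb{R}^d \to \mathbb{R}$ : observation function

observation: $x_t = h(\mathbf{s}_t)$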

Attractor:

● stable fixed (equilibrium) point

● finite set of equilibrium points

● limit cycle

● torus

● strange attractor

- self similarity: fractals

- chaos: sensitivity to initial conditions

A stable fixed point can be derived by a linear system; a strange attractor cannot be derived by a linear system.

Nonlinear dynamical systems, maps (discrete time)

Hénon map (chaotic):

$$s_i = 1 - 1.4\, s_{i-1}^2 + 0.3\, s_{i-2}$$

Ikeda map (chaotic):

$$s_{k+1} = 1 + 0.9\, s_k \exp\!\left( 0.4 i - \frac{6 i}{1 + |s_k|^2} \right)$$

Logistic map:

$$s_i = a\, s_{i-1} (1 - s_{i-1})$$

periodic for a=3.52, chaotic for a=4
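The maps above can be iterated directly to produce test time series. The following sketch generates the Hénon map in its delay form and the logistic map; the function names, initial conditions and transient lengths are assumptions made for this illustration.

```python
def henon(n, a=1.4, b=0.3, transient=100):
    """Hénon map in delay form: s_i = 1 - a * s_{i-1}^2 + b * s_{i-2}."""
    s_prev2, s_prev1 = 0.0, 0.0
    out = []
    for i in range(n + transient):
        s = 1.0 - a * s_prev1 ** 2 + b * s_prev2
        s_prev2, s_prev1 = s_prev1, s
        if i >= transient:             # discard the transient towards the attractor
            out.append(s)
    return out

def logistic(n, a, s0=0.3, transient=1000):
    """Logistic map: s_i = a * s_{i-1} * (1 - s_{i-1})."""
    s = s0
    out = []
    for i in range(n + transient):
        s = a * s * (1.0 - s)
        if i >= transient:
            out.append(s)
    return out

henon_series = henon(1000)
periodic = logistic(8, a=3.52)   # settles on a cycle of period 4
chaotic = logistic(1000, a=4.0)
```

For a=3.52 the values repeat every four steps after the transient, while for a=4 no repetition appears.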

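Nonlinear dynamical systems, flows (continuous time)

Lorenz system, with state variables $s_1, s_2, s_3$:

$$\dot{s}_1 = a (s_2 - s_1)$$

$$\dot{s}_2 = -s_1 s_3 + b s_1 - s_2$$

$$\dot{s}_3 = s_1 s_2 - c s_3$$

$$a = 10, \quad b = 28, \quad c = 8/3$$

sampling time $\tau_s$

[Figure: Lorenz attractor in the axes s1, s2, s3]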

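Noise in the time series

Noise-free case:

- Dynamical system: $\mathbf{s}_t = f^t(\mathbf{s}_0)$

- Observation: $x_t = h(\mathbf{s}_t)$

Observational (measurement) noise $w$:

$$x_t = h(\mathbf{s}_t) + w_t$$

$w_t$ : white noise, uncorrelated to the noise-free $x_t$ and to $\mathbf{s}_t$

Dynamic (system) noise $\varepsilon$:

$$\mathbf{s}_t = f^t(\mathbf{s}_0) + \boldsymbol{\varepsilon}_t$$

$\boldsymbol{\varepsilon}_t$ : white noise, uncorrelated to $\mathbf{s}_u$, $u < t$

Example: logistic map

- observational noise: $s_i = a\, s_{i-1} (1 - s_{i-1})$, $\; x_i = s_i + w_i$, $\; w_i \sim N(0, \sigma^2)$

- dynamic noise: $s_i = a\, s_{i-1} (1 - s_{i-1}) + \varepsilon_i$, $\; \varepsilon_i \sim N(0, \sigma^2)$, $\; x_i = s_i$

[Figures: both noise types for the chaotic and the periodic logistic map]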

Scatter diagrams in 2 and 3 dimensions

[Figures: for each of three time series (annual sunspots 1700-1996, square of AR(9), square of z-Lorenz) three panels are shown: d=1, the time series x(i) versus time index i; d=2, the scatter diagram of (x(i-1), x(i)); d=3, the scatter diagram of (x(i-2), x(i-1), x(i))]

Topics in the analysis of time series and dynamical systems

- State space reconstruction

  in order to observe the complexity / stochasticity / structure of the system

- Estimation of characteristics of the system / attractor

  measuring the complexity / dimension of the system

- Modeling / Prediction

  Use nonlinear models to improve predictions

- Other topics:

  - Hypothesis testing for linearity / nonlinearity

  - Control system evolution

  - Synchronization

  - …

State space reconstruction

Method of delays

$$\mathbf{x}_i = [x_i, x_{i-\tau}, \dots, x_{i-(m-1)\tau}]$$

Parameters:

- embedding dimension m

- delay time τ

- time window length $\tau_w = (m-1)\tau$

We assume that the studied system is deterministic.
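The method of delays amounts to slicing the scalar time series. A minimal sketch follows; the function name `delay_embed` and the 0-based indexing are assumptions of this illustration.

```python
def delay_embed(x, m, tau):
    """Return the delay vectors [x_i, x_{i-tau}, ..., x_{i-(m-1)tau}] for all valid i."""
    start = (m - 1) * tau              # first index that has a full history
    return [[x[i - k * tau] for k in range(m)] for i in range(start, len(x))]

x = list(range(10))                    # toy series 0, 1, ..., 9
vectors = delay_embed(x, m=3, tau=2)
# vectors[0] is [4, 2, 0] and vectors[-1] is [9, 7, 5];
# a series of length n yields n - (m-1)*tau reconstructed points, here 6.
```

Each reconstructed point covers a time window of length (m-1)τ, which is why m and τ cannot be chosen independently of each other in practice.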

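Initial state space $M$: $\mathbf{s}_i \to \mathbf{s}_{i+1} = f(\mathbf{s}_i)$

Observed quantity in $\mathbb{R}$: $x_i = h(\mathbf{s}_i)$

Reconstructed state space $\mathbb{R}^m$: $\mathbf{x}_i \to \mathbf{x}_{i+1} = \mathbf{F}(\mathbf{x}_i)$

Embedding $\Phi$ ?: $\mathbf{x}_i = \Phi(\mathbf{s}_i)$

condition: $m \ge 2D + 1$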

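Method of delays

Example: Hénon map

$$s(i) = 1 - 1.4\, s(i-1)^2 + 0.3\, s(i-2)$$

or

$$s_1(i) = 1 - 1.4\, s_1(i-1)^2 + s_2(i-1), \qquad s_2(i) = 0.3\, s_1(i-1)$$

Observed variable (projection): $x_i = s_1(i)$

[Figures: reconstructions for m=2 τ=1, m=3 τ=1, m=2 τ=2 and m=3 τ=2; self-intersections are marked]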

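Method of delays, m=3

Example: Lorenz system

$$\dot{s}_1 = -a (s_1 - s_2)$$

$$\dot{s}_2 = -s_1 s_3 + b s_1 - s_2$$

$$\dot{s}_3 = s_1 s_2 - c s_3$$

a=10, b=28, c=8/3

Observed variable (projection): $x_i = s_1(i)$

[Figures: reconstructions for τ=1, τ=5, τ=10 and τ=20]

optimal τ ?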

Estimation of τ

• From the autocorrelation r(τ) (measures linear correlation)

τ for which r(τ) = 1/e or r(τ) = 0

• From the mutual information I(τ) (measures linear and nonlinear correlation)

$$I(X,Y) = \sum_{x,y} p_{XY}(x,y) \log \frac{p_{XY}(x,y)}{p_X(x)\, p_Y(y)}$$

With $X = x_i$ and $Y = x_{i-\tau}$: $I(X,Y) = I(\tau)$

τ at the first local minimum of I(τ)
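Both criteria for τ can be evaluated with simple estimators. The sketch below computes the sample autocorrelation and a histogram estimate of the mutual information; the equal-width binning with 16 bins and all function names are assumptions of this illustration, and other estimators of I(τ) exist.

```python
import math

def autocorrelation(x, lag):
    """Sample autocorrelation r(lag)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag))
    return cov / var

def mutual_information(x, lag, bins=16):
    """Histogram estimate of I(lag) between x_i and x_{i-lag}, in nats."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / bins or 1.0            # guard against a constant series
    idx = [min(int((v - lo) / width), bins - 1) for v in x]
    n = len(x) - lag
    joint, px, py = {}, {}, {}
    for i in range(lag, len(x)):
        a, b = idx[i], idx[i - lag]
        joint[(a, b)] = joint.get((a, b), 0) + 1
        px[a] = px.get(a, 0) + 1
        py[b] = py.get(b, 0) + 1
    # sum of p(x,y) * log( p(x,y) / (p(x) p(y)) ) over occupied cells
    return sum(c / n * math.log(c * n / (px[a] * py[b])) for (a, b), c in joint.items())

def first_local_minimum(values):
    """Position of the first local minimum in a list, or None if there is none."""
    for k in range(1, len(values) - 1):
        if values[k] < values[k - 1] and values[k] <= values[k + 1]:
            return k
    return None

wave = [math.sin(2 * math.pi * i / 40) for i in range(400)]   # period of 40 samples
r_quarter = autocorrelation(wave, 10)    # close to 0 at a quarter of the period
r_half = autocorrelation(wave, 20)       # close to -1 at half of the period
```

In use, one would compute `[mutual_information(x, lag) for lag in range(1, max_lag)]` and pass the list to `first_local_minimum`, remembering that position k in that list corresponds to lag k+1.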

Estimation of m

Optimal m ?

• Too small m: self-intersection in the attractor

• Too large m: “curse of dimensionality”

• Takens theorem: $m \ge 2D + 1$ … but D is unknown

Method of false nearest neighbors (FNN)

• Close points on the attractor are:

- either real neighboring points due to system dynamics

- or false neighboring points due to self-intersections and an insufficiently large m

• At a larger m where there are no self-intersections all false neighboring points will be resolved as they will no longer be close

• The optimal m’ is the one for which there are no longer any false nearest neighbors as the dimension increases by one from m’ to m’+1.

[Figure: the same points shown in $\mathbb{R}$ and in $\mathbb{R}^2$]
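The FNN idea can be sketched as follows: for each reconstructed point find its nearest neighbor in dimension m, then check how much the pair separates in the additional coordinate of dimension m+1. The decision rule used here (ratio of the extra-coordinate distance to the m-dimensional distance above a threshold of 10) is one common choice and is an assumption, as are the function names. The delay vectors are built forward in time, which gives the same set of points as the backward convention.

```python
import math

def henon(n, transient=100):
    """Hénon series s_i = 1 - 1.4 s_{i-1}^2 + 0.3 s_{i-2}, used as test data."""
    s_prev2, s_prev1, out = 0.0, 0.0, []
    for i in range(n + transient):
        s = 1.0 - 1.4 * s_prev1 ** 2 + 0.3 * s_prev2
        s_prev2, s_prev1 = s_prev1, s
        if i >= transient:
            out.append(s)
    return out

def fnn_percent(x, m, tau, ratio_tol=10.0):
    """Percentage of false nearest neighbors when going from dimension m to m+1."""
    n_vec = len(x) - m * tau                   # points that also have an (m+1)-th coordinate
    vecs = [[x[i + k * tau] for k in range(m)] for i in range(n_vec)]
    extra = [x[i + m * tau] for i in range(n_vec)]
    false_count, counted = 0, 0
    for i in range(n_vec):
        best_j, best_d = -1, float("inf")
        for j in range(n_vec):                 # brute-force nearest neighbor search
            if j != i:
                d = math.dist(vecs[i], vecs[j])
                if d < best_d:
                    best_j, best_d = j, d
        if best_d == 0.0:
            continue                           # identical points: ratio undefined, skip
        counted += 1
        if abs(extra[i] - extra[best_j]) / best_d > ratio_tol:
            false_count += 1
    return 100.0 * false_count / counted if counted else 0.0

series = henon(400)
fnn_m1 = fnn_percent(series, m=1, tau=1)   # many false neighbors: m=1 is too small
fnn_m2 = fnn_percent(series, m=2, tau=1)   # essentially none: m=2 unfolds the Hénon map
```

The brute-force search costs O(N²) distance computations, which is acceptable for a few hundred points but would need a tree-based search for long series.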

An example of estimating m by the method FNN

The estimation of m with the method FNN depends on:

- the delay τ

- noise

[Figures: % FNN versus m for τ=2, τ=5, τ=10 and τ=20; left panel x-Lorenz without noise, right panel x-Lorenz + 10% noise]

Estimation of nonlinear characteristics

Nonlinear characteristics (invariant measures)

• Dimension

1. Euclidean

2. Topological

3. Fractal (correlation, information, box counting, …)

• Lyapunov exponents (largest, the whole spectrum)

• Entropy

Correlation dimension ν

The correlation dimension ν characterizes the fractal structure of the attractor (self-similarity at different scales) using the density of the points of the attractor in the reconstructed state space.

The basic idea is that the probability of two points being closer than a distance r, $\|\mathbf{x}_i - \mathbf{x}_j\| < r$, changes w.r.t. r as a power of r.

$\xi_i(r)$ : number of points $\mathbf{x}_j$ lying in a sphere with radius r and center $\mathbf{x}_i$, i.e. with $\|\mathbf{x}_i - \mathbf{x}_j\| < r$

Scaling law: $\xi_i(r) \sim r^{\nu}$, which holds for $r \to 0$, $N \to \infty$

ν integer: the attractor is a regular geometric object

ν non-integer: the attractor is a fractal

[Figure: spheres of radius r around points $\mathbf{x}_i$ of the attractor]

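Estimation of the correlation dimension ν

Time series $x_i$, $i = 1, \dots, N$; the reconstruction gives the points $\mathbf{x}_i$, $i = 1, \dots, N - (m-1)\tau$

Estimation of $\xi_i(r)$ by the correlation sum:

$$C(r) = \frac{2}{N(N-1)} \sum_{i=1}^{N} \sum_{j=i+1}^{N} \Theta\bigl( r - \|\mathbf{x}_i - \mathbf{x}_j\| \bigr)$$

Heaviside function:

$$\Theta(x) = \begin{cases} 0 & \text{when } x \le 0 \\ 1 & \text{when } x > 0 \end{cases}$$

Scaling law for small r: $C(r) \sim r^{\nu}$

Estimation: $\nu = \dfrac{d \log C(r)}{d \log r}$ for a range of r

Convergence of ν(m) for m sufficiently large

If ν is small and non-integer and the system is deterministic: small dimension and fractal (chaotic) structure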
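The correlation sum and the local slope of log C(r) versus log r can be sketched directly from the definitions. The function names and the finite-difference slope between two radii are assumptions of this illustration; for real data the slope is inspected over a whole range of r. Points on a line segment and in a square are used as test sets with known dimension 1 and 2.

```python
import math
import random

def correlation_sum(points, r):
    """C(r) = 2 / (N (N-1)) * number of pairs i < j with ||x_i - x_j|| < r."""
    n = len(points)
    close = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(points[i], points[j]) < r
    )
    return 2.0 * close / (n * (n - 1))

def local_slope(points, r1, r2):
    """Slope of log C(r) versus log r between the radii r1 and r2."""
    c1, c2 = correlation_sum(points, r1), correlation_sum(points, r2)
    return (math.log(c2) - math.log(c1)) / (math.log(r2) - math.log(r1))

rng = random.Random(1)
segment = [[rng.random()] for _ in range(400)]                 # one-dimensional set
square = [[rng.random(), rng.random()] for _ in range(400)]    # two-dimensional set
nu_segment = local_slope(segment, 0.01, 0.02)   # close to 1
nu_square = local_slope(square, 0.05, 0.1)      # close to 2
```

Applied to delay vectors of a time series, the same functions give the curves log C(r) versus log r for each m, from which the convergence of ν(m) is judged.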

[Figures: log C(r) vs log r, local slope vs log r, and ν vs m, for x-Lorenz without noise (τ=2), x-Lorenz + 10% observational noise (τ=2) and x-Lorenz + 10% observational noise (τ=10)]

The estimation of ν is affected by the following factors:

- correlation time: only pairs with $|i - j| > w$ should be used

- selection of τ and m

- noise

- time series length

[Figures: log C(r) vs log r and local slope vs log r for m=1,…,10, and ν vs m, for four time series: Hénon (n=924), Hénon + 10% white noise, returns of ASE index 1/1/2005 – 20/9/2005, and white noise]

Lyapunov exponents

The Lyapunov exponents measure the average rate of divergence and convergence of the trajectories on the attractor along the directions of the local state space.

Lyapunov spectrum: $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_m$

- $\lambda_i > 0$ : divergence

- $\lambda_i < 0$ : convergence

- $\lambda_i = 0$ : direction of flow

If $\lambda_1 > 0$ and the system is deterministic: chaos

Dissipative system: $\sum_{i=1}^{m} \lambda_i < 0$

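Largest Lyapunov exponent λ1

The initial distance $d_0 = \|\mathbf{x}_i - \mathbf{x}_{i'}\|$ of two nearby trajectories is expected to increase exponentially with time.

After time t: $d_t = \|\mathbf{x}_{i+t} - \mathbf{x}_{i'+t}\|$

If $d_t \approx d_0\, e^{\lambda_1 t}$, then λ1 is the largest Lyapunov exponent.

Computation:

$$\lambda_1 = \frac{1}{tN} \sum_{j=1}^{N} \ln \frac{d_{t,j}}{d_{0,j}}$$

[Figure: two nearby trajectories starting at $\mathbf{x}_i$ and $\mathbf{x}_{i'}$ at distance $d_0$ and reaching $\mathbf{x}_{i+t}$ and $\mathbf{x}_{i'+t}$ at distance $d_t$]

Example: x-Lorenz [Figures: without noise, with 10%-noise]

The estimation of λ1 depends on: τ, m, noise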
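The averaging formula can be tried on the logistic map with a=4, for which the largest Lyapunov exponent is known to be ln 2 ≈ 0.693. The sketch below is a deliberately simplified version: it uses m=1 (no embedding), one nearest neighbor per point, a single short time t, and excludes temporally close points through the hypothetical parameter `min_sep`. These are choices of this illustration, not part of the method as stated above.

```python
import math

def logistic(n, a=4.0, s0=0.3, transient=100):
    """Logistic map s_i = a * s_{i-1} * (1 - s_{i-1}), used as test data."""
    s, out = s0, []
    for i in range(n + transient):
        s = a * s * (1.0 - s)
        if i >= transient:
            out.append(s)
    return out

def largest_lyapunov(x, t=1, min_sep=10):
    """Average of (1/t) * ln(d_t / d_0) over nearest-neighbor pairs, with m = 1."""
    n = len(x) - t
    total, count = 0.0, 0
    for i in range(n):
        best_j, d0 = -1, float("inf")
        for j in range(n):
            if abs(i - j) < min_sep:           # skip points that are close in time
                continue
            d = abs(x[i] - x[j])
            if d < d0:
                best_j, d0 = j, d
        dt = abs(x[i + t] - x[best_j + t])     # distance of the pair after time t
        if d0 > 0.0 and dt > 0.0:
            total += math.log(dt / d0) / t
            count += 1
    return total / count

series = logistic(1000)
lam = largest_lyapunov(series)   # should be near ln 2 for a = 4
```

With longer times t the distances saturate at the size of the attractor and the estimate drops, which is one reason why the result depends on the chosen parameters.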

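Prediction models

The true system generating the time series: $\mathbf{s}_{i+1} = f(\mathbf{s}_i)$

Example: Hénon map

$$s_{1,i+1} = 1 - 1.4\, s_{1,i}^2 + s_{2,i}, \qquad s_{2,i+1} = 0.3\, s_{1,i}$$

that is,

$$s_{1,i+1} = f_1(s_{1,i}, s_{2,i}), \qquad s_{2,i+1} = f_2(s_{1,i}, s_{2,i}), \qquad \mathbf{s}_{i+1} = f(\mathbf{s}_i)$$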

Prediction models

The true system generating the time series, $\mathbf{s}_{i+1} = f(\mathbf{s}_i)$, is unknown.

The problem of modeling and prediction of time series: given $x_1, x_2, \dots, x_i$, estimate / predict $x_{i+1}$.

State space reconstruction with the method of delays:

$$\mathbf{x}_i = [x_i, x_{i-\tau}, \dots, x_{i-(m-1)\tau}]$$

The reconstructed system from the time series (estimation?): $\mathbf{x}_{i+1} = \mathbf{F}(\mathbf{x}_i)$, $\; \mathbf{F}: \mathbb{R}^m \to \mathbb{R}^m$

The function that is relevant to time series prediction: $x_{i+1} = F(\mathbf{x}_i)$, $\; F: \mathbb{R}^m \to \mathbb{R}$

For m = 2, τ = 1: $x_{i+1} = F(x_i, x_{i-1})$

Nonlinear prediction models

• Global models, e.g. polynomials

function F has the same analytic expression for the whole domain

• Semi-local models, e.g. neural networks

the form of function F is derived as a weighted sum of local basis functions

• Local models, e.g. the local linear model

function F is defined differently at each point of the reconstructed state space

Local prediction models

Prediction using similar segments of the time series

Prediction at time i+T from the mappings T steps ahead of “similar” segments from the past of the time series

Implementation of the idea of “similar” segments: time series segments correspond to reconstructed points

The nearest neighboring points to $\mathbf{x}_i$: $\{\mathbf{x}_{i(1)}, \mathbf{x}_{i(2)}, \dots, \mathbf{x}_{i(K)}\}$

Prediction of $x_{i+T}$ from the mappings of the neighbors: $\{x_{i(1)+T}, x_{i(2)+T}, \dots, x_{i(K)+T}\}$

Zeroth order prediction: $\hat{x}_{i+T} \equiv x_i(T) = x_{i(1)+T}$

Average prediction:

$$x_i(T) = \frac{1}{K} \sum_{j=1}^{K} x_{i(j)+T}$$
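The local average prediction can be sketched in a few lines: reconstruct the last point, find its K nearest neighbors among the past reconstructed points, and average their images T steps ahead. The function names, the brute-force neighbor search and the Hénon test data are assumptions of this illustration; K=1 gives the zeroth order prediction.

```python
import math

def henon(n, transient=100):
    """Hénon series s_i = 1 - 1.4 s_{i-1}^2 + 0.3 s_{i-2}, used as test data."""
    s_prev2, s_prev1, out = 0.0, 0.0, []
    for i in range(n + transient):
        s = 1.0 - 1.4 * s_prev1 ** 2 + 0.3 * s_prev2
        s_prev2, s_prev1 = s_prev1, s
        if i >= transient:
            out.append(s)
    return out

def local_average_predict(x, m, tau, T, K):
    """Predict the value T steps after the end of x from the K nearest neighbors."""
    last = len(x) - 1
    target = [x[last - k * tau] for k in range(m)]         # current reconstructed point
    candidates = []
    for i in range((m - 1) * tau, last - T + 1):           # past points with a known image
        vec = [x[i - k * tau] for k in range(m)]
        candidates.append((math.dist(vec, target), i))
    candidates.sort()
    neighbors = [i for _, i in candidates[:K]]
    return sum(x[i + T] for i in neighbors) / len(neighbors)

x = henon(1000)
history = x[:950]
prediction = local_average_predict(history, m=2, tau=1, T=1, K=3)
error = abs(prediction - x[950])    # one-step prediction error for this point
```

For noise-free data a small K is usually sufficient, while for noisy data averaging over more neighbors reduces the effect of the noise at the cost of a larger neighborhood.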

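Local linear prediction

We assume that in the neighborhood of $\mathbf{x}_i$ the local linear model is valid:

$$x_{i+T} = F(\mathbf{x}_i) = F(x_i, x_{i-\tau}, \dots, x_{i-(m-1)\tau}) = a_0 + a_1 x_i + a_2 x_{i-\tau} + \dots + a_m x_{i-(m-1)\tau} = a_0 + \mathbf{a}' \mathbf{x}_i$$

The model holds for the neighbors $\mathbf{x}_{i(1)}, \mathbf{x}_{i(2)}, \dots, \mathbf{x}_{i(K)}$:

$$x_{i(1)+T} = a_0 + \mathbf{a}' \mathbf{x}_{i(1)}$$

$$x_{i(2)+T} = a_0 + \mathbf{a}' \mathbf{x}_{i(2)}$$

$$\vdots$$

$$x_{i(K)+T} = a_0 + \mathbf{a}' \mathbf{x}_{i(K)}$$

Estimation of the parameters $a_0, a_1, \dots, a_m$ (method of ordinary least squares):

$$\min_{a_0, a_1, \dots, a_m} \sum_{j=1}^{K} \bigl( x_{i(j)+T} - a_0 - a_1 x_{i(j)} - \dots - a_m x_{i(j)-(m-1)\tau} \bigr)^2$$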
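A minimal sketch of the local linear prediction is given below. It takes reconstructed points and their T-step-ahead targets as plain lists, fits the model on the K nearest neighbors of the query point through the normal equations, and evaluates the fit at the query. The function names and the small Gaussian elimination solver are assumptions of this illustration; the fit needs K > m and neighbors that do not all lie on a lower-dimensional subspace, otherwise the system is singular.

```python
import math
import random

def solve_linear_system(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):             # back substitution
        z[r] = (aug[r][n] - sum(aug[r][c] * z[c] for c in range(r + 1, n))) / aug[r][r]
    return z

def local_linear_predict(points, targets, query, K):
    """OLS fit of targets ~ a0 + a'x on the K nearest neighbors of query, evaluated at query."""
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], query))[:K]
    rows = [[1.0] + list(points[i]) for i in order]        # design matrix with intercept
    y = [targets[i] for i in order]
    p = len(rows[0])
    xtx = [[sum(r[u] * r[v] for r in rows) for v in range(p)] for u in range(p)]
    xty = [sum(r[u] * yi for r, yi in zip(rows, y)) for u in range(p)]
    a = solve_linear_system(xtx, xty)                      # normal equations (X'X) a = X'y
    return a[0] + sum(ai * qi for ai, qi in zip(a[1:], query))

rng = random.Random(2)
points = [[rng.random(), rng.random()] for _ in range(200)]
targets = [1.0 + 2.0 * p[0] - 3.0 * p[1] for p in points]  # exactly linear test relation
prediction = local_linear_predict(points, targets, [0.5, 0.5], K=10)
```

Because the test relation is exactly linear, the prediction at (0.5, 0.5) reproduces 1 + 2·0.5 − 3·0.5 = 0.5 up to rounding; for a time series, `points` would be the delay vectors and `targets` the values T steps ahead.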

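Estimation of prediction error

We split the time series in two parts:

- learning set: $x_1, x_2, \dots, x_{N_1}$

- test set: $x_{N_1+1}, \dots, x_N$

predictions $x_i(T)$ on the test set: $\hat{x}_{N_1+1}, \dots, \hat{x}_N$

prediction error: $e_{i+T} = x_{i+T} - \hat{x}_{i+T}$

Statistic for the prediction error:

$$\mathrm{NRMSE}(T) = \sqrt{ \frac{ \dfrac{1}{N - T - N_1 + 1} \sum_{t=N_1}^{N-T} \bigl( x_{t+T} - \hat{x}_{t+T} \bigr)^2 }{ \dfrac{1}{N} \sum_{i=1}^{N} \bigl( x_i - \bar{x} \bigr)^2 } }$$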
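The NRMSE statistic is a direct computation once the test values and their predictions are available. In the sketch below the function name and argument layout are assumptions; the denominator uses the variance of the whole series, as in the formula above.

```python
import math

def nrmse(actual, predicted, series):
    """Root mean square prediction error on the test set, normalized by the
    standard deviation of the whole time series."""
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    mean = sum(series) / len(series)
    var = sum((v - mean) ** 2 for v in series) / len(series)
    return math.sqrt(mse / var)

series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
perfect = nrmse(series[4:], series[4:], series)      # 0: predictions equal the true values
mean_only = nrmse(series, [3.5] * 6, series)         # 1: always predicting the series mean
```

A value close to 0 indicates accurate prediction, while a value close to 1 means the model does no better than predicting the mean of the series.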

Example: x-Lorenz

Prediction with:

• local average prediction model (LAP)

• local linear prediction model (LLP)

$\tau = 1$, $m = 5$, $K = 11$

[Figures: prediction results without noise and with 10%-noise]

Annual-quarter growth rate of the GNP of the USA in the period 1947 – 1991

Prediction with

- linear model, AR

- local average model, LAM

- local linear model, LLM

[Figure: prediction error nrmse(m) versus m for the last 30 quarters, for AR, LAM (K=15) and LLM (K=15)]

[Figure: predictions starting at the first quarter of 1989 with prediction horizon being the last 6 years; curves: real, AR(3), LAM (m=5, K=15), LLM (m=5, K=15)]

ASE index in the period 1/1/2002 – 20/9/2005

Prediction starting at 20/9/2005 with prediction horizon up to 16 days ahead

Predict index with

- linear model, AR

- local average model, LAM

returns: $y_t = \dfrac{x_t - x_{t-1}}{x_{t-1}}$

[Figure: returns of the general index versus day, with the predictions $y_n(T)$ from AR(7) and LAM (m=7, K=20)]

[Figure: close value of the general index versus day, with the predictions $x_n(T)$ from AR(7) and LAM (m=7, K=20)]

One step ahead prediction in the period 21/9/2005 – 12/10/2005

ASE index in the period 1/1/2002 – 20/9/2005, predicted with the linear model (AR) and the local average model (LAM) on the returns $y_t = \dfrac{x_t - x_{t-1}}{x_{t-1}}$

[Figure: returns of the general index versus day, with the one step ahead predictions $y_n(1)$ from AR(7) and LAM (m=7, K=20)]

[Figure: close value of the general index versus day, with the one step ahead predictions $x_n(1)$ from AR(7) and LAM (m=7, K=20)]