732A34 Time series analysis
Fall semester 2012
• 6 ECTS-credits
• Course tutor and examiner: Anders Nordgaard
• Course web: www.ida.liu.se/~732A34
• Course literature:
• Cryer J.D., Chan K-S.: Time Series Analysis – With Applications in R. 2nd ed. ISBN 978-0-387-75958-6.
• Complementary handouts
Organization of this course:
• Weekly “meetings”: a mixture of lectures, (computer) exercises and seminars
• A large portion of self-study
• Assignments (every second week)
• Individual written exam
Access to a computer is necessary.
Ideally, bring your own laptop to the meetings.
Examination
The course is examined by
1. Assignments (3 in total)
2. Final written exam
Assignments will be marked Passed or Failed. If an assignment is marked Failed, corrections must be made to obtain the mark Passed.
Written exam marks are given according to ECTS grades.
The final grade will be the same grade as for the written exam.
Communication
Contact with course tutor is best through e-mail: [email protected].
Office in Building B, Entrance 27, 2nd floor, corridor E (the small one close to Building E), room 3E:485.
Normal working hours: When teaching
E-mail response almost all weekdays and occasionally on weekends
All necessary information will be communicated through the course web. Always use the English version. The first page contains the most recent information (messages).
Solutions to assignments should be e-mailed.
Note! Course tutor is away from Linköping on
3-4 September
12-14 September
25-29 September
Assignments
A number of exercises will be given as assignments to be individually carried out.
There will not be any supervision for these assignments since they are part of the examination, but they can be carried out in the computer rooms or at home. No other statistical software than R will be needed.
The solutions to the assignments should be submitted in the form of written reports. The core text of these reports may contain graphs and tables, but the tables should be constructed from scratch (i.e. no copying and pasting from R or other software). Apart from such components, the text should be completely your own and easy to read. Direct output from the software (except graphs) can only be included in the form of attachments.
In the marking of these reports, emphasis will be put on the English language. It will not be sufficient to simply give short answers to the detailed questions of the exercises.
Time series
[Figure: Sales figures, Jan 98 - Dec 01 (monthly)]
• What kind of patterns can visually be detected?
• Is the development stable or non-stable?
[Figure: Concentrations of Total Phosphorus (ug/l), Råån, Helsingborg, County of Skåne, Sweden. Monthly measurements 1980-2001]
• What kind of patterns can visually be detected?
• Is the development stable or non-stable?
Characteristics:
• Non-independent observations (correlation structure)
• Systematic variation within a year (seasonal effects)
• Long-term increasing or decreasing level (trend)
• Irregular variation of small magnitude (noise)
[Figures: Sales figures, Jan 98 - Dec 01; Concentrations of Total Phosphorus (ug/l), Råån, monthly measurements 1980-2001]
Where can time series be found?
• Economic indicators: sales figures, employment statistics, stock market indices, …
• Meteorological data: precipitation, temperature, …
• Environmental monitoring: concentrations of nutrients and pollutants in air masses, rivers, marine basins, …
• Sports statistics?
• Electromagnetic and thermal fields
Time series analysis
Estimate/investigate the different parts of a time series in order to
– understand the historical pattern
– assess the current status
– make forecasts of the future development
– assess the quality of the data
Methodologies
Method                                            This course?
Classical decomposition                           (Yes)
Time series regression                            Yes
Exponential smoothing                             No
ARIMA modelling (Box-Jenkins)                     Yes
Non-parametric and semi-parametric analysis       No
Transfer function and intervention models         Yes
State space modelling                             No
Heteroscedastic models: ARCH, GARCH               Yes
Advanced econometric methods: Cointegration       No
Spectral domain analysis                          No
Data mining techniques                            No
Decomposition
A time series can be thought of as built up from a number of components.
[Figure: Number of employees, private sector 1993:1-2008:4 (1994:1=100). Horizontal axis: Quarter; vertical axis: Index]
What kind of components can we think of? Long-term? Short-term? Deterministic? Purely random?
Decomposition: analyse the observed time series in terms of its different components:
• Trend part (TR)
• Seasonal part (SN)
• Cyclical part (CL)
• Irregular part (IR)
Cyclical part: state of the market in economic time series. In environmental series, it is usually treated together with TR.
Multiplicative model:
$$y_t = TR_t \cdot SN_t \cdot CL_t \cdot IR_t$$
• Suitable for economic indicators
• Level is present in $TR_t$ or in $TC_t = (TR \cdot CL)_t$
• $SN_t$, $IR_t$ (and $CL_t$) work as indices
• Seasonal variation increases with the level of $y_t$
Additive model:
$$y_t = TR_t + SN_t + CL_t + IR_t$$
• More suitable for environmental data
• Requires constant seasonal variation
• $SN_t$, $IR_t$ (and $CL_t$) vary around 0
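One way to see how the two models relate, assuming every component of the multiplicative model is strictly positive (an assumption not stated on the slides), is to take logarithms. The multiplicative model then becomes additive in the log-components:

$$\log y_t = \log TR_t + \log SN_t + \log CL_t + \log IR_t$$

Under that assumption, a series whose seasonal variation grows with the level can often be handled with additive tools after a log transformation.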
[Figure: Number of employees, private sector 1993:1-2008:4 (1994:1=100). Horizontal axis: Quarter; vertical axis: Index]
Additive or multiplicative model?
Example 1: Sales figures, additive decomposition
[Figure, four panels, horizontal axis: month number 1-48]
• Sales figures Jan 98 - Dec 01
• Observed (blue), deseasonalised (magenta)
• Observed (blue), estimated trend (green)
• Observed, TR, SN, fitted, IR
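The additive decomposition shown in the figures can be sketched roughly as follows. This is a minimal illustration only: the course itself uses R, the data below are synthetic (not the sales figures), the function names are hypothetical, and the estimation method (a centered moving average for TR, averaged detrended values for SN, with CL absorbed into TR) is an assumption about how such figures are typically produced.

```python
import math
import random


def centered_moving_average(y, period):
    """Centered moving average of length `period`; None where the window does not fit."""
    n = len(y)
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = y[t - half:t + half + 1]
        if period % 2 == 0:
            # Even period: half weight on the two end points (a 2 x period average).
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        trend[t] = total / period
    return trend


def additive_decomposition(y, period):
    """Return (TR, SN, IR, deseasonalised) for the model y = TR + SN + IR."""
    trend = centered_moving_average(y, period)
    # Average the detrended values season by season.
    sums = [0.0] * period
    counts = [0] * period
    for t, (obs, tr) in enumerate(zip(y, trend)):
        if tr is not None:
            sums[t % period] += obs - tr
            counts[t % period] += 1
    raw = [s / c for s, c in zip(sums, counts)]
    # Centre the seasonal effects so that they vary around 0.
    mean_raw = sum(raw) / period
    index = [r - mean_raw for r in raw]
    seasonal = [index[t % period] for t in range(len(y))]
    irregular = [None if tr is None else obs - tr - sn
                 for obs, tr, sn in zip(y, trend, seasonal)]
    deseasonalised = [obs - sn for obs, sn in zip(y, seasonal)]
    return trend, seasonal, irregular, deseasonalised


# Synthetic monthly series: linear trend + fixed seasonal pattern + noise.
period = 12
rng = random.Random(1)
y = [10 + 0.5 * t + 3 * math.sin(2 * math.pi * (t % period) / period) + rng.gauss(0, 0.5)
     for t in range(48)]
trend, seasonal, irregular, deseasonalised = additive_decomposition(y, period)
```

The trend estimate is undefined for the first and last six months because the centered window does not fit there; this is a property of the moving-average approach rather than of the additive model itself.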
A more theoretical description
A time series is a special case of a stochastic process.
A stochastic process is a family of random variables coupled with a deterministic index variable t:
$$\{Y_t,\ t \in \mathbb{R}\}$$
t can be continuous or discrete. Y can be continuous- or discrete-valued.
Examples
• $Y_t$ = number of events (e.g. number of telephone calls arrived) up to time t. Point process (usually modelled as a Poisson process)
• $Y_t$ = number of customers in a queue at time point t. Birth-and-death process
• $Y_n$ = the number of offspring in generation n of a population starting with an initial population $Y_0$. Markov chain ($Y_n$ depends only on $Y_{n-1}$)
• Assume you score +1 if you toss a coin and get “heads” and –1 if you get “tails”. Let $Y_n$ = the sum of scores after n tosses. Random walk
• $Y_t$ = the temperature outdoors at time point t (infinitesimal resolution)
When t is discrete, a stochastic process is called a sequence and constitutes a model for an observed time series. (Sometimes the sequence itself is referred to as the time series.)
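Mean (value) function:
$$\mu_t = E(Y_t), \quad t = 0, \pm 1, \pm 2, \ldots$$
(Auto)covariance function:
$$\gamma_{t,s} = Cov(Y_t, Y_s) = E(Y_t Y_s) - \mu_t \mu_s, \quad t, s = 0, \pm 1, \pm 2, \ldots$$
(Auto)correlation function:
$$\rho_{t,s} = Corr(Y_t, Y_s) = \frac{Cov(Y_t, Y_s)}{\sqrt{Var(Y_t)\,Var(Y_s)}} = \frac{\gamma_{t,s}}{\sqrt{\gamma_{t,t}\,\gamma_{s,s}}}$$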
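Example: Random walk
Let $e_1, e_2, \ldots$ be a sequence of independent and identically distributed (i.i.d.) random variables with $E(e_t) = 0$ and $Var(e_t) = \sigma_e^2$, and define
$$Y_1 = e_1, \qquad Y_t = Y_{t-1} + e_t, \quad t = 2, 3, \ldots$$
Mean function:
$$\mu_t = E(Y_t) = E(e_1 + \cdots + e_t) = E(e_1) + \cdots + E(e_t) = 0 + \cdots + 0 = 0$$
For $1 \le t \le s$:
$$\gamma_{t,s} = E(Y_t Y_s) - 0 \cdot 0 = E\big[(e_1 + \cdots + e_t)(e_1 + \cdots + e_s)\big] = \sum_{i=1}^{t} E(e_i^2) + \sum_{i=1}^{t} \sum_{j \ne i} E(e_i e_j)$$
$$= t\sigma_e^2 + \sum_{i=1}^{t} \sum_{j \ne i} E(e_i)E(e_j) \ \text{(i.i.d.)} = t\sigma_e^2 + 0 = t\sigma_e^2$$
Hence, for general $t, s \ge 1$:
$$\gamma_{t,s} = \min(t, s)\,\sigma_e^2, \qquad Var(Y_t) = \gamma_{t,t} = t\sigma_e^2$$
$$\rho_{t,s} = \frac{\gamma_{t,s}}{\sqrt{\gamma_{t,t}\,\gamma_{s,s}}} = \frac{\min(t, s)\,\sigma_e^2}{\sqrt{t\sigma_e^2 \cdot s\sigma_e^2}} = \frac{\min(t, s)}{\sqrt{ts}}$$
For $1 \le t \le s$ this gives $\rho_{t,s} = \sqrt{t/s}$, for example:
$$\rho_{1,2} = \sqrt{\tfrac{1}{2}} \approx 0.71; \quad \rho_{1,25} = \sqrt{\tfrac{1}{25}} = 0.2; \quad \rho_{8,9} = \sqrt{\tfrac{8}{9}} \approx 0.94; \quad \rho_{24,25} = \sqrt{\tfrac{24}{25}} \approx 0.98$$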
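The random-walk autocorrelations above can be checked numerically. The sketch below is illustrative only: it uses Python's standard library rather than the course software R, assumes Gaussian steps (the slides only require i.i.d. steps with mean 0), and the function names are hypothetical. It compares the theoretical value $\min(t,s)/\sqrt{ts}$ with the correlation estimated across many simulated paths.

```python
import math
import random


def theoretical_rho(t, s):
    """Autocorrelation of a random walk between time points t and s (both >= 1)."""
    return min(t, s) / math.sqrt(t * s)


def simulate_random_walks(n_paths, n_steps, sigma=1.0, seed=0):
    """Simulate n_paths independent random walks Y_1, ..., Y_n_steps."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        level, path = 0.0, []
        for _ in range(n_steps):
            level += rng.gauss(0.0, sigma)  # Y_t = Y_{t-1} + e_t
            path.append(level)
        paths.append(path)
    return paths


def sample_corr(x, z):
    """Ordinary sample correlation between two equally long lists."""
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    sxx = sum((a - mx) ** 2 for a in x)
    szz = sum((b - mz) ** 2 for b in z)
    return sxz / math.sqrt(sxx * szz)


paths = simulate_random_walks(n_paths=5000, n_steps=25)
for t, s in [(1, 2), (1, 25), (8, 9), (24, 25)]:
    y_t = [p[t - 1] for p in paths]  # values of Y_t across paths
    y_s = [p[s - 1] for p in paths]  # values of Y_s across paths
    print(t, s, round(theoretical_rho(t, s), 2), round(sample_corr(y_t, y_s), 2))
```

The correlation is estimated across independent paths at fixed time points, which matches the definition of $\rho_{t,s}$ directly and does not rely on stationarity.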
Note that $\gamma_{t,s}$ and $\rho_{t,s}$ are symmetric functions, i.e.
$$\gamma_{t,s} = \gamma_{s,t}, \qquad \rho_{t,s} = \rho_{s,t}$$
Stationarity
A stochastic process is said to be strictly stationary if the joint probability distribution of
$$Y_{t_1}, Y_{t_2}, \ldots, Y_{t_n}$$
is the same as the joint probability distribution of
$$Y_{t_1 - k}, Y_{t_2 - k}, \ldots, Y_{t_n - k}$$
for any set of time points $(t_1, \ldots, t_n)$, whatever the value of k.
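A stochastic process is said to be weakly stationary (or second-order stationary) if
$$\mu_t = \mu \ \text{(constant) for all } t$$
$$\gamma_{t,t-k} = \gamma_{0,k} \ \text{for all } t \text{ and } k \ \text{(independent of } t\text{)}$$
[Figure: an example of a stationary series and an example of a non-stationary series]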
Roughly: Constant mean and constant variance
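As a worked check of this definition, consider the random walk from the earlier example, for which $\mu_t = 0$ and $\gamma_{t,s} = \min(t,s)\,\sigma_e^2$. Its mean is constant, but

$$Var(Y_t) = \gamma_{t,t} = t\sigma_e^2, \qquad \gamma_{t,t-k} = (t-k)\,\sigma_e^2 \quad (0 \le k < t)$$

both depend on t, so the random walk is not weakly stationary.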
White noise
A stochastic process that is a sequence of independent and identically distributed (i.i.d.) random variables e1, e2, … is called a white noise process.
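By definition, a white noise process is strictly stationary.
Independent random variables imply that $\mu_t = E(e_t)$ is constant and that
$$\gamma_{t,s} = \begin{cases} Var(e_t) = \sigma_e^2, & t = s \\ 0, & t \ne s \end{cases}$$
White noise is of interest in the construction of models for general processes.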
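The covariance structure of white noise can be illustrated with a simulated series. The sketch below is an illustration under stated assumptions: it uses Gaussian white noise (any i.i.d. sequence would qualify), the standard sample autocorrelation estimator (not defined on these slides), Python's standard library rather than R, and a hypothetical function name `sample_acf`.

```python
import random


def sample_acf(y, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag of a single observed series."""
    n = len(y)
    mean = sum(y) / n
    c0 = sum((v - mean) ** 2 for v in y)
    return [
        sum((y[t] - mean) * (y[t - k] - mean) for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


rng = random.Random(42)
e = [rng.gauss(0.0, 1.0) for _ in range(2000)]  # simulated white noise
acf = sample_acf(e, max_lag=5)
print([round(r, 3) for r in acf])
```

For white noise the estimate at lag 0 equals 1 by construction, while the estimates at the other lags should be close to 0, differing from it only through sampling variation.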