PhD Program in Business Administration and Quantitative Methods
FINANCIAL ECONOMETRICS
2007-2008
ESTHER RUIZ
CHAPTER 4. STOCHASTIC VOLATILITY MODELS
4.1 Properties of ARSV(1) model
ARCH-type models assume that the volatility can be observed one-step-ahead.
However, a more realistic model for volatility can be based on modelling it as having a
predictable component that depends on past information and an unexpected noise. In
this case, the volatility is a latent unobserved variable. One interpretation of the latent
volatility is that it represents the arrival of new information into the market; see, for
example, Clark (1973). In the simplest case, the log-volatility follows an AR(1) process.
Then, we have the ARSV(1) model given by

y_t = σ*_t ε_t
log(σ*_t²) = μ + φ log(σ*_{t−1}²) + η_t

where ε_t is a strict white noise with variance 1. The noise of the volatility equation, η_t, is assumed to be a Gaussian white noise with variance σ_η², independent of the noise of the level, ε_t. The Gaussianity of η_t, which may seem rather ad hoc, implies that the log-volatility process has a Normal distribution. However, there are several empirical
studies that support this assumption both for exchange rates and stock returns; see
Andersen, T.G., T. Bollerslev, F.X. Diebold and H. Ebens (2001) and Andersen, T.G.,
T. Bollerslev, F.X. Diebold and P. Labys (2001, 2003).
The parameter σ_* is related to the marginal variance of returns. A more convenient re-parameterization of the ARSV(1) model is

y_t = σ_* σ_t ε_t
log(σ_t²) = φ log(σ_{t−1}²) + η_t

where σ_* = exp{μ/[2(1 − φ)]} is a scale parameter that removes the need for a
constant term in the equation of the log-volatility.
The persistence is measured by the parameter φ. Finally, σ_η² measures the uncertainty of the volatility. If σ_η² = 0, then the process is homoscedastic. If we assume that the variance of log(σ_t²) is fixed regardless of the persistence parameter, φ, then the ARSV(1) can be re-parameterized once more as follows:

y_t = σ_* σ_t ε_t
log(σ_t²) = φ log(σ_{t−1}²) + (1 − φ²)^{1/2} η_t

Note that, as φ → 1, the process approaches homoscedasticity.

Stochastic Volatility models have several attractive features. First of all, they are closer than
GARCH models to the models often postulated in financial theories. Furthermore, their
properties are usually easy to derive as they follow directly from the properties of the
log-Normal distribution. However, the presence of two noises makes their estimation
hard.
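As a minimal sketch of the model just described, the ARSV(1) process can be simulated directly from its two equations; the parameter values below are illustrative, not taken from the text.

```python
import numpy as np

# Simulate y_t = sigma_* * sigma_t * eps_t with
# log(sigma_t^2) = phi * log(sigma_{t-1}^2) + eta_t (illustrative parameters).
def simulate_arsv1(T, phi=0.95, sigma2_eta=0.05, sigma_star=1.0, seed=0):
    rng = np.random.default_rng(seed)
    eta = rng.normal(0.0, np.sqrt(sigma2_eta), T)
    eps = rng.normal(0.0, 1.0, T)
    h = np.zeros(T)                       # h_t = log(sigma_t^2)
    h[0] = eta[0] / np.sqrt(1 - phi**2)   # draw h_0 from its marginal distribution
    for t in range(1, T):
        h[t] = phi * h[t - 1] + eta[t]
    return sigma_star * np.exp(h / 2) * eps

y = simulate_arsv1(5000)
print(np.var(y))   # should be near sigma_*^2 * exp(sigma_h^2 / 2)
```

The simulated series is a martingale difference with excess kurtosis, as implied by the properties derived below.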
The main statistical properties of SV models have been reviewed by Ghysels, Harvey
and Renault (1996). In particular, the series y_t is stationary if the log-volatility process
is stationary, i.e. if |φ| < 1. In that case, the ACF of |y_t|^c is given by

ρ_{|y|^c}(k) = [exp(c² σ_h² ρ_h(k)/4) − 1] / [κ_c exp(c² σ_h²/4) − 1], for k ≥ 1,

where σ_h² and ρ_h(k) are the variance and the ACF of the underlying log-volatility, h_t, and κ_c is a constant defined as:

κ_c = E(|ε_t|^{2c}) / [E(|ε_t|^c)]² = Γ(c + 1/2) Γ(1/2) / [Γ((c + 1)/2)]²

where Γ(·) is the gamma function. For the cases of main interest, c = 1 and c = 2, this constant takes the values κ_1 = π/2 and κ_2 = 3, respectively. If, for example, c = 2, ε_t is Gaussian and log(σ_t²) is an AR(1) process, then
ρ_2(k) = [exp(σ_h² φ^k) − 1] / [3 exp(σ_h²) − 1]
This ACF was derived by Taylor (1986), who showed that if σ_η² is small and/or φ is close to one, it can be approximated by

ρ_2(k) ≈ {[exp(σ_h²) − 1] / [3 exp(σ_h²) − 1]} φ^k
However, this approximation is not always appropriate. The approximate
autocorrelations are always larger than the true ones and their rate of decay, , is smaller. Therefore, we may have a distorted picture of the underlying dynamics of
squared returns.
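The distortion just mentioned is easy to see numerically; a small sketch comparing the exact ACF of squares with Taylor's approximation, for illustrative parameter values:

```python
import numpy as np

# Exact ACF of squared returns for an ARSV(1) with Gaussian eps_t versus
# Taylor's (1986) approximation (phi and sigma2_h are illustrative values).
phi, sigma2_h = 0.98, 0.5   # sigma2_h = marginal variance of log-volatility

k = np.arange(1, 51)
exact = (np.exp(sigma2_h * phi**k) - 1) / (3 * np.exp(sigma2_h) - 1)
approx = (np.exp(sigma2_h) - 1) / (3 * np.exp(sigma2_h) - 1) * phi**k

# The approximate autocorrelations lie above the exact ones at every lag
print(np.all(approx >= exact))
```

By convexity of the exponential, exp(σ_h² φ^k) − 1 ≤ [exp(σ_h²) − 1] φ^k for 0 < φ^k < 1, which is why the approximation always overstates the autocorrelations.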
The parameter σ_η² governs the degree of kurtosis independently of the persistence of volatility measured by φ. Introducing the noise η_t makes the ARSV(1) model more flexible in the sense that it is able to generate higher kurtosis than the
GARCH(1,1) model without increasing ρ_2(1) and without forcing the volatility to be close to the non-stationarity region.
The conditional distribution of y_t is not Normal even if we assume that ε_t is Gaussian. The volatility is unobservable. However, it is possible to estimate it by
running the Kalman filter. For this, consider the following linear transformation of
returns:

log(y_t²) = μ + h_t + ξ_t
h_t = φ h_{t−1} + η_t

where μ = log(σ_*²) + E[log(ε_t²)] and ξ_t = log(ε_t²) − E[log(ε_t²)] is a non-Gaussian, zero mean, white noise process with variance σ_ξ², independent of h_t. If, for example, ε_t is Gaussian, then E[log(ε_t²)] = −1.27 and Var[log(ε_t²)] = π²/2. Because log(ε_t²) is not truly Gaussian, the Kalman filter yields minimum mean square linear estimators (MMSLE)
of h_t and future observations rather than minimum mean square estimators (MMSE).

The variance of log(y_t²) is γ(0) = σ_h² + σ_ξ², its autocovariance function coincides with the autocovariance function of h_t and its ACF is given by:

ρ(k) = ρ_h(k) / (1 + σ_ξ² / σ_h²), for k ≥ 1.

The ACF of log-squared returns is proportional to the ACF of h_t, with the factor
of proportionality being smaller than one.
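A quick numeric illustration of how strong this attenuation is in practice, for an illustrative value of σ_h²:

```python
import numpy as np

# Attenuation of the ACF of log(y_t^2) relative to the ACF of h_t:
# rho(k) = rho_h(k) / (1 + sigma2_xi / sigma2_h).  sigma2_h = 0.5 is illustrative.
sigma2_xi = np.pi**2 / 2    # Var(log eps_t^2) for Gaussian eps_t
sigma2_h = 0.5
factor = 1 / (1 + sigma2_xi / sigma2_h)
print(round(factor, 3))     # roughly 0.09: the ACF of log-squares is ~10x smaller
```

Because σ_ξ² = π²/2 ≈ 4.93 is large relative to typical values of σ_h², the autocorrelations of log-squared returns are an order of magnitude smaller than those of the latent log-volatility.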
4.2 Comparison with GARCH(1,1) model
The relationship between kurtosis, persistence of shocks to volatility and first-
order autocorrelation of squares is different in GARCH and ARSV models. This
difference can explain why when both models are fitted to the same series:
i) The persistence estimated is usually larger in GARCH models than in ARSV models;
Taylor (1994), Shephard (1996), Kim, Shephard and Chib (1998), Hafner and Herwartz
(2000) and Anderson (2001).
ii) The Gaussianity assumption for the errors seems more adequate in ARSV models than
in GARCH models; Shephard (1996), Ghysels, Harvey and Renault (1996), Kim,
Shephard and Chib (1998) and Hafner and Herwartz (2000).
The relationship between kurtosis, persistence and ρ_2(1) for an ARSV(1) model is given by

ρ_2(1) = [(κ_y/3)^φ − 1] / (κ_y − 1)

where κ_y = 3 exp(σ_h²) is the kurtosis of y_t.
When the Normal-GARCH model is fitted to represent the evolution of
volatility, the kurtosis and first-order autocorrelation of squared returns implied by the
estimated parameters could be much larger than the corresponding population
coefficients of the simulated data. The persistence estimated by the GARCH model is
also usually larger than the persistence of the underlying true autocorrelations of
squares.
4.3 Estimation
Given that the conditional distribution of yt is not Normal, the ML estimator
cannot be obtained by traditional methods. This is the reason why there is a large list of
alternative methods to estimate SV models. Next, we describe some of the most
promising ones from an empirical point of view.
4.3.1 Method of Moments
These methods have the difficulty that their efficiency depends on the choice of
moments. Furthermore, a particular distribution needs to be assumed for ε_t. Finally, their efficiency is reduced as the process approaches non-stationarity, as is often the
case in empirical applications to financial returns.
Consider, for example, the estimator based on the sample variance. If the log-volatility is a random walk, the stationary form of log(y_t²) is Δlog(y_t²) = log(y_t²) − log(y_{t−1}²). Therefore, if we denote y*_t = Δlog(y_t²), then, assuming that ε_t is Normal, σ²_{y*} = σ_η² + 2(π²/2) = σ_η² + π². Consequently, σ_η² = σ²_{y*} − π². A method of moments (MM) estimator of σ_η² is then given by
σ̃_η² = s²_{y*} − π². If σ_η² > 0, then √T(σ̃_η² − σ_η²) has an asymptotic normal distribution with zero mean and variance 2[(σ_η² + π²)² + π⁴].
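The MM estimator based on the sample variance of the differenced log-squares can be sketched in a few lines; the simulation parameters are illustrative and the estimator is, as the text warns, quite noisy.

```python
import numpy as np

# MM estimator under random-walk log-volatility:
# Var(Delta log y_t^2) = sigma_eta^2 + 2*Var(log eps_t^2) = sigma_eta^2 + pi^2,
# so sigma_eta^2 is estimated by the sample variance of the differences minus pi^2.
def mm_sigma2_eta(y):
    ystar = np.diff(np.log(y**2))
    return np.var(ystar) - np.pi**2

# quick check on simulated data (illustrative parameters)
rng = np.random.default_rng(1)
T, sigma2_eta = 200_000, 0.05
h = np.cumsum(rng.normal(0, np.sqrt(sigma2_eta), T))   # random-walk log-volatility
y = np.exp(h / 2) * rng.normal(0, 1, T)
print(mm_sigma2_eta(y))   # a noisy estimate of 0.05
```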
Melino and Turnbull (1990) proposed to estimate stationary SV models by
GMM using moment conditions of the form

|y_t| − E(|y_t|), y_t² − E(y_t²), |y_t|³ − E(|y_t|³), y_t⁴ − E(y_t⁴),
|y_t y_{t−h}| − E(|y_t y_{t−h}|), y_t² y_{t−h}² − E(y_t² y_{t−h}²), h = 1, 2, ...
GMM is relatively inefficient due to the largely arbitrary choice of unconditional
moments that can be computed in closed form, while the likelihood-based procedures
achieve the Cramer-Rao efficiency bound. The Efficient Method of Moments (EMM)
seeks efficiency improvements, while maintaining the general flexibility of GMM, by
letting the data guide the choice of an auxiliary quasi-likelihood which serves to
generate an efficient set of moments; see Andersen, Chung and Sorensen (1999).
4.3.2 Quasi Maximum Likelihood
The QML estimator was independently proposed by Nelson (1988) and Harvey
et al. (1994) and is based on the Kalman filter. This is applied to log(y_t²), where the
observations are standardized by the sample standard deviation to obtain one-step-ahead
errors and their variances. These are then used to construct the Gaussian likelihood, which
is numerically maximized. Ruiz (1994) shows that the QML estimator is consistent and
asymptotically Normal. However, the QML procedure is inefficient, as the method does
not rely on the exact likelihood of log(y_t²).
The estimator of the constant of the model is uncorrelated with the estimators of
the other parameters of the model; see Ruiz (1994).
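A sketch of the QML idea (not the authors' exact implementation): run the Kalman filter on log(y_t²) as if both noises were Gaussian and evaluate the resulting prediction-error likelihood. Function and parameter names are illustrative.

```python
import numpy as np

# QML objective for log(y_t^2) = mu + h_t + xi_t, h_t = phi*h_{t-1} + eta_t,
# treating xi_t as Gaussian with Var(xi_t) = pi^2/2 (Gaussian eps_t imposed).
def qml_negloglik(params, x):
    phi, sigma2_eta, mu = params
    sigma2_xi = np.pi**2 / 2
    h = 0.0                                # predicted state E(h_t | x_{1:t-1})
    P = sigma2_eta / (1 - phi**2)          # initialized at the marginal variance
    nll = 0.0
    for xt in x:
        F = P + sigma2_xi                  # prediction-error variance
        v = xt - mu - h                    # one-step-ahead prediction error
        nll += 0.5 * (np.log(2 * np.pi * F) + v**2 / F)
        K = P / F                          # Kalman gain
        h = phi * (h + K * v)              # update, then predict one step ahead
        P = phi**2 * P * (1 - K) + sigma2_eta
    return nll

# usage sketch: x = np.log(y**2); minimize qml_negloglik over
# (phi, sigma2_eta, mu), e.g. with scipy.optimize.minimize.
```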
The standard theory for the estimation of unobserved component time series
models with non-normal errors applies to the estimates of φ and σ_η². When h_t follows a random walk, the Kalman filter approach is still valid if the
restriction φ = 1 is imposed. The only difference is that the first observation is used to
initialize the Kalman filter, whereas when |φ| < 1 the filter is initialized with the
marginal mean and variance of h_t.
Finally, note that the model can be estimated by assuming that ε_t has a particular distribution, for example the Normal. In this case, the parameter σ_ξ² does not need to be estimated as it is determined by this distribution. However, Ruiz (1994)
shows that, even if the distribution of ε_t is known, estimating σ_ξ² reduces the finite sample variances of the estimates.

Furthermore, by estimating σ_ξ², it is possible to test whether ε_t is Normal by testing
H_0: σ_ξ² = π²/2. One possible test statistic could be a quasi-Likelihood Ratio (LR) test.
Since the null hypothesis is on the boundary of the admissible parameter space, the
distribution of the LR test is given by (1/2)χ²_0 + (1/2)χ²_1, where χ²_0 is a degenerate
distribution with all its mass at the origin. The size of the LR test can therefore be
set appropriately simply by using the 2α, rather than the α, significance point of a χ²_1 distribution for a test of size α. For example, for α = 5%, the corresponding critical value
for the quasi-LR test statistic is 2.71.
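The quoted critical value follows directly from the boundary argument; a quick check using scipy (assumed available):

```python
from scipy.stats import chi2

# Under the boundary null, the quasi-LR statistic is a 50:50 mixture of a point
# mass at 0 and chi2(1), so a size-alpha test uses the 2*alpha tail of chi2(1).
alpha = 0.05
critical_value = chi2.ppf(1 - 2 * alpha, df=1)
print(round(critical_value, 2))   # 2.71, as quoted in the text
```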
4.3.3 Monte Carlo Likelihood
Sandmann and Koopman (1998) proposed the Monte Carlo Likelihood
(MCL) procedure, which estimates the likelihood function of log(y_t²) as a Gaussian part
constructed via the Kalman filter plus a correction for departures from the Gaussian
assumption, as follows:

log L(Y*_T | ψ) = log L_G(Y*_T | ψ) + log E_G[ f(ξ | ψ) / f_G(ξ | ψ) ]

where Y*_T = {log(y_1²), ..., log(y_T²)}, L_G(Y*_T | ψ) is the Gaussian likelihood function, f(ξ | ψ) is the true log χ²_1 density, f_G(ξ | ψ) is the importance density corresponding
to the approximating Gaussian model and E_G refers to the expectation with respect to this density.
The MCL procedure also generates simultaneously estimates of the latent
volatilities.
4.3.4 MCMC
ML estimation of the parameters of SV models has progressed considerably
thanks to the development of numerical methods based on importance sampling and
Markov Chain Monte Carlo (MCMC) procedures. In order to derive the likelihood, the
vector of the unobserved volatilities has to be integrated out of the joint probability
distribution. If we denote Y_T = {y_1, ..., y_T}, σ = {σ_1, ..., σ_T} and θ = {φ, σ_η², σ_*}, the likelihood is given by

L(Y_T | θ) = ∫ f(Y_T | σ, θ) f(σ | θ) dσ

The dimension of this integral is T and its evaluation requires numerical methods.
MCMC estimators of the parameters of SV models were proposed by Jacquier et al. (1994).
The Bayesian approach for estimating the parameters, θ, is to augment this vector with the
latent log-volatilities. After M Monte Carlo replicates of the parameters, θ^(i), have been
obtained, it is possible to obtain density estimates. On the other hand, a natural choice to obtain smooth estimates of the log-volatilities is the marginal posterior expectation,
which can be estimated by the sample mean. Finally, it is also possible to obtain interval
predictions of future volatilities conditional on the information available at time T that take
into account the inherent model variability and the parameter uncertainty.
Kim et al. (1998) have also proposed a MCMC algorithm that samples the
unobserved volatilities simultaneously by means of an approximating offset mixture of
normal models, together with an importance reweighting procedure to correct the
linearization error. The KSC procedure provides efficient inferences, likelihood evaluation,
filtered volatility estimates, diagnostics for model failure and computation of statistics for
comparing non-nested volatility models.
Example: Standard & Poor's 500

Parameter estimates (standard errors in parentheses):

            GMM        JPR        QML        MCL        KSC
  φ       0.9602     0.9596     0.9401     0.9288     0.9392
         (0.0479)   (0.0203)   (0.0699)   (0.0249)   (0.0237)
  σ_η²    0.0541     0.0172     0.0128     0.0499     0.0405
         (0.0219)   (0.0196)   (0.0222)   (0.0190)   (0.0168)
  σ_*     1.0005     0.9673     1.2051     1.1248     1.1260
         (0.0157)   (0.0673)   (0.0371)   (0.0542)   (0.0619)
4.4 Extensions
4.4.1 Leverage effect
Harvey and Shephard (1996) proposed introducing the leverage effect in the
ARSV model through correlation between the noises ε_t and η_{t+1}. Therefore, the volatility of the asymmetric ARSV(1), A-ARSV(1), model is given by

y_t = σ_* σ_t ε_t
log(σ_t²) = φ log(σ_{t−1}²) + η_t

with Corr(ε_t, η_{t+1}) = δ. Harvey and Shephard (1996) show that, in this case, the kurtosis of y_t is the same as in the symmetric case. The ACF of squared observations,
derived by Taylor (1994), is given by

ρ_2(k) = [(1 + δ² σ_η² φ^{2(k−1)}) exp(σ_h² φ^k) − 1] / [3 exp(σ_h²) − 1]
It is interesting to observe that, as in the symmetric ARSV model, the rate of
decay is under φ for the smaller lags but, instead of converging to φ, it converges to one in the presence of correlation between the level and volatility noises, ε_t and η_t, respectively. However, notice that, in practice, the autocorrelations of large order are
indistinguishable from zero.
For a given value of the kurtosis, the autocorrelation of order one of squares is
larger, the larger is the correlation between the noises.
Therefore, if as expected in empirical applications, the magnitude of the
asymmetry parameter is rather small, the relationship between persistence, kurtosis and
autocorrelations of squares is similar to the one derived for the symmetric ARSV(1)
model.
Jacquier et al. (2004) have proposed an ARSV model with leverage effect where
ε_t and η_t are contemporaneously correlated. However, Yu (2002) shows that this latter specification has problems and provides empirical evidence favouring the specification proposed by
Harvey and Shephard (1996).
With respect to estimation, see Yu (2004).
4.4.2 Long-memory
The long-memory property has been incorporated into SV models by Harvey (1998)
and Breidt, Crato and deLima (1998), who propose LMSV models where the log-
volatility follows an ARFIMA(p,d,q) process. In particular, when p=1 and q=0, the
model for the series of returns, yt, is given by:
y_t = exp(h_t/2) ε_t      (1.a)
(1 − φL)(1 − L)^d h_t = η_t      (1.b)

where h_t = log(σ_t²).

Finally, note that model (1) is stationary if |φ| < 1 and d < 0.5. In that case, the variance of h_t is

σ_h² = σ_η² [Γ(1 − 2d)/Γ(1 − d)²] F(1, 1 + d; 1 − d; φ)/(1 + φ)      (3)

where F(a, b; c; x) is the hypergeometric function. Note that when φ = 0, F(1, 1+d; 1−d; 0) = 1, and (3) becomes the variance of an ARFIMA(0,d,0) process,
σ_h² = σ_η² Γ(1 − 2d)/[Γ(1 − d)]², as given by Harvey (1998). On the other hand, when d = 0,
F(1, 1; 1; φ) = (1 − φ)^{−1}, and (3) becomes the variance of an AR(1) process, σ_h² = σ_η²/(1 − φ²), as in Harvey, Ruiz and Shephard (1994).
As k → ∞, these autocorrelations behave like ρ_h(k) ~ A k^{2d−1}, where A is a factor of
proportionality that depends on d and φ. Therefore, the dependence between observations a long time span apart decays at a very slow hyperbolic rate. Finally, it is
possible to show that when d = 0, expression (4) becomes the ACF of an AR(1) process,
ρ_h(k) = φ^k, and when φ = 0, it becomes the ACF of the ARFIMA(0,d,0) process,

ρ_h(k) = d(1 + d)···(k − 1 + d) / [(1 − d)(2 − d)···(k − d)].
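The ARFIMA(0,d,0) ACF above can be computed recursively, which also makes the hyperbolic decay easy to verify numerically; the sketch below uses an illustrative value d = 0.3.

```python
import numpy as np

# ACF of an ARFIMA(0,d,0) process via the recursion
# rho(k) = rho(k-1) * (k-1+d)/(k-d), with rho(1) = d/(1-d).
def arfima0d0_acf(d, kmax):
    rho = np.empty(kmax)
    r = 1.0
    for k in range(1, kmax + 1):
        r *= (k - 1 + d) / (k - d)
        rho[k - 1] = r
    return rho

rho = arfima0d0_acf(d=0.3, kmax=500)
# hyperbolic decay: rho(k) * k^(1-2d) approaches a constant as k grows
print(rho[99] * 100**0.4, rho[499] * 500**0.4)
```

The two printed values are close to each other, illustrating that ρ_h(k) k^{1−2d} stabilizes, i.e. ρ_h(k) ~ A k^{2d−1}.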
Andersen and Bollerslev (1997) and Robinson (2001) show that the autocovariances
of squared and absolute returns decay at the same rate as the autocovariances of ht for
large lags. This argument is often used to justify the use of these transformations to
identify and model the long-memory of volatility. However, the rate of decay of the
autocorrelations of |y_t| or y_t² and those of h_t could be rather different for low lags. The
rate of decay of the autocorrelations of squares is clearly smaller than the rate of decay
of the ACF of the log-volatility. Therefore, the autocorrelations of squares decay
towards zero quicker than the autocorrelations of the log-volatility. The rates of decay
of the autocorrelations of both series are the same for large lags. The same behaviour
can be observed when comparing the rates of decay of the autocorrelations of absolute
returns and log-volatility autocorrelations although, in this case, they are closer than
when comparing squared returns and log-volatilities.
Another important difference between the ACF of |y_t| or y_t² and the ACF of h_t is the
magnitude of the autocorrelations themselves, which are clearly smaller for |y_t| and y_t²
than for the log-volatility process. This fact shows up in Figure 1(b), which displays the
ACF of h_t together with the ACF of |y_t| and y_t² for the same model as before. In this
case, the ACF of squared and absolute returns is nearly five times smaller than the ACF
of the log-volatilities.
It is also remarkable that the behaviour of the ACF of short-memory and long-memory SV models can be rather similar in some cases. Figure 2(a) plots the ACF of y_t² for three LMSV models with parameters {φ=0.98, d=0, σ_η²=0.027}, {φ=0.93, d=0.2, σ_η²=0.027} and {φ=0.88, d=0.3, σ_η²=0.026}, respectively. These models are selected so
that their coefficient of variation, defined as Var(σ_t²)/[E(σ_t²)]², is approximately one and the first order autocorrelation of y_t² is 0.19 in all of them. Notice that, in practice,
the rates of decay of the short-memory ARSV model and the LMSV model with small d
could be difficult to distinguish in the first lags. Indeed, the main differences only arise
after the autocorrelation of approximately order 80. Furthermore, observe that the
autocorrelations up to order 20 of the two long memory models displayed in Figure 2(a)
are nearly indistinguishable. In these cases, the knowledge of the behaviour in the long-
run will be essential. The same conclusions would be drawn if the ACF of absolute
returns were used.
The dynamic dependence of the series of returns also appears in the logarithms
of squared returns. Both ACFs decay at the same hyperbolic rate, but the ACF of
log(y_t²) takes smaller values. The ACF of log(y_t²) takes even smaller values than the
ACF of the other two transformations considered. There is a difficulty in distinguishing
among different LMSV models using only the information contained in the ACF. As we
will see in the next section, this problem is compounded by the negative bias of the
sample ACF of log(y_t²) in LMSV models.
With respect to estimation, Harvey (1998) and Breidt et al. (1998) proposed a
QML estimator based on maximising the discrete Whittle approximation of the
likelihood function of log(y_t²) in the frequency domain. The estimates are obtained by
minimizing

L_W(ψ) = (1/T) Σ_{j∈M} [ log f(λ_j; ψ) + I_{y*}(λ_j) / f(λ_j; ψ) ]

where M = {j = 1, 2, ..., [T/2]}, f(·) is the spectral density of log(y_t²), λ_j = 2πj/T are
Fourier frequencies, I_{y*}(λ) = |W_{y*}(λ)|² and

W_{y*}(λ) = (2πT)^{−1/2} Σ_{t=1}^{T} log(y_t²) exp(iλt)

are the periodogram and discrete Fourier transform of log(y_t²) at frequency λ. The finite sample properties of the QML estimator have been considered by Pérez
and Ruiz (2001).
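A sketch of this Whittle objective for the special case of ARFIMA(0,d,0) log-volatility (the short-memory AR term is dropped for brevity, and σ_ξ² = π²/2 is imposed); the function name and parameterization are illustrative.

```python
import numpy as np

# Discrete Whittle objective for log(y_t^2) when h_t ~ ARFIMA(0,d,0):
# f(lam) = [s2_eta * |1 - e^{-i*lam}|^(-2d) + pi^2/2] / (2*pi)
def whittle_objective(params, x):
    d, s2_eta = params
    T = len(x)
    j = np.arange(1, T // 2 + 1)
    lam = 2 * np.pi * j / T                            # Fourier frequencies
    # periodogram of the demeaned log-squared series
    I = np.abs(np.fft.fft(x - x.mean()))[1:T // 2 + 1] ** 2 / (2 * np.pi * T)
    f = (s2_eta * np.abs(1 - np.exp(-1j * lam)) ** (-2 * d)
         + np.pi**2 / 2) / (2 * np.pi)                 # s2_xi = pi^2/2 imposed
    return np.mean(np.log(f) + I / f)

# usage sketch: x = np.log(y**2), then minimize over (d, s2_eta)
```

Excluding the zero frequency from M is essential here, since the long-memory spectral density diverges at λ = 0.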
Arteche (2004) has proven the consistency and asymptotic normality of a related
estimator, the local Whittle estimator. He shows that the added noise has a distorting effect
on the estimates of the memory parameter of the signal. A suitable choice of the bandwidth
is important to lessen its impact.
Once the parameters of the model have been estimated, the underlying volatility at
time t may be estimated by an algorithm proposed by Harvey (1998).
Example: Daily returns of the IBEX35 observed from 7/1/1987 to 30/12/1998. We
remove any correlation in the data by fitting an MA(1) model and focus the analysis on the
residuals from this model.
Estimation results:

  Model     ARSV(1)     RWSV       LMSV      ARLMSV
  φ         0.9898      ---        ---       0.6632
           (0.0057)
  d         ---         ---        0.7538    0.7035
  σ_η²      0.0168      0.0099     0.0906    0.0155
           (0.0042)    (0.0024)
  σ_*²      0.9297      0.9484     1.5112    1.5484
           (0.0374)    (0.0436)
Sample moments of standardised observations using smoothed estimates of volatility:

  Sample moments   ARSV(1)    RWSV      LMSV     ARLMSV
  Mean             0.0101     0.0074    0.0113    0.0122
  Variance         1.0000     1.0000    0.9999    0.9999
  Skewness        -0.0284    -0.0441   -0.0183   -0.0127
  Kurtosis         3.5620     3.7070    3.3941    3.4380
Autocorrelations of squares:

  Lag    ARSV(1)     RWSV       LMSV      ARLMSV
  1      0.0709**    0.0894**   0.0419*   0.0455*
  2      0.0676**    0.0923**   0.0378*   0.0320
  3      0.0306      0.0392*    0.0217    0.0173
  4      0.0342      0.0542**   0.0129    0.0053
  5      0.0335      0.0464*    0.0290    0.0231
  10     0.0206      0.0258     0.0225    0.0210
  50     0.0011      0.0011    -0.0011   -0.0017
  100    0.0300      0.0320     0.0256    0.0249
Ljung-Box Test:

  Statistic   ARSV(1)    RWSV       LMSV     ARLMSV
  Q²(10)      43.15**    80.06**    17.89     15.50
  Q²(20)     101.70**   144.61**    71.37*    68.43*
  Q²(100)    150.41**   198.87**   117.67    113.39
** Significant at 1%; * Significant at 5%
There are other alternative estimation methods proposed for LMSV models:
a) Methods based on state space models: Chan and Petris (2000)
b) Bayesian procedures: Hsu and Breidt (1997)
c) GMM: Wright (1999)
d) Semiparametric estimation: Deo and Hurvich (1998)
References

Andersen, T.G., H.-J. Chung and B.E. Sorensen (1999), Efficient method of moments estimation of a stochastic volatility model: A Monte Carlo study, Journal of Econometrics, 91, 61-87.
Andersen, T.G., T. Bollerslev, F.X. Diebold and H. Ebens (2001), The distribution of realized stock return volatility, Journal of Financial Economics, 61, 43-76.
Andersen, T.G., T. Bollerslev, F.X. Diebold and P. Labys (2001), The distribution of realized exchange rate volatility, Journal of the American Statistical Association, 96, 42-55.
Andersen, T.G., T. Bollerslev, F.X. Diebold and P. Labys (2003), Modeling and forecasting realized volatility, Econometrica, 71, 579-625.
Arteche, J. (2004), Gaussian semiparametric estimation in long memory in stochastic volatility and signal plus noise models, Journal of Econometrics, 119, 131-154.
Broto, C. and E. Ruiz (2004), Estimation methods for Stochastic Volatility models: A survey, Journal of Economic Surveys, 18, 613-649.
Carnero, M.A., D. Peña and E. Ruiz (2004), Persistence and kurtosis in GARCH and Stochastic Volatility Models, Journal of Financial Econometrics, 2, 319-342.
Clark, P.K. (1973), A subordinated stochastic process model with fixed variance for speculative prices, Econometrica, 41, 135-156.
Ghysels, E., A.C. Harvey and E. Renault (1996), Stochastic Volatility, in G.S. Maddala and C.R. Rao (eds.), Handbook of Statistics, 14, North-Holland, Amsterdam.
Harvey, A.C. (1998), Long memory in Stochastic volatility, in J. Knight and S. Satchell (eds.), Forecasting Volatility in Financial Markets, 307-320, Butterworth-Heinemann, Oxford.
Harvey, A.C., E. Ruiz and N. Shephard (1994), Multivariate Stochastic Variance Models, Review of Economic Studies, 61, 247-264.