Threshold GARCH Model: Theory and Applicationpublish.uwo.ca/~jwu87/files/JobMarketPaper_JingWu.pdf · Threshold GARCH Model: Theory and Application Jing Wu ... than others, a regime

Threshold GARCH Model: Theory and Application

Jing Wu∗

The University of Western Ontario

October 2011

Abstract

In this paper, we describe the regime shifts in the volatility dynamics by a threshold model,while volatility follows a GARCH process within each regime. This model can be viewed as aspecial case of the random coefficient GARCH model. We establish theoretical conditions, whichensure that the return process in the threshold model is strictly stationary, as well as conditions forthe existence of various moments. A simulation study is further conducted to examine the finitesample properties of the maximum likelihood estimator. The simulation results reveal that themaximum likelihood estimator is approximately unbiased and consistent for modest sample sizeswhen the stationarity conditions hold. Furthermore, using the VIX and volume as the thresholdvariable, we employ 20 stocks from Major Market Index (MMI) and find that the threshold modelfits the data well. The forecasting performance of the model is examined via the use of threethreshold variables: VIX, volume, and number of trades. The results support the use of thresholdvariables in forecasting volatility, especially when there is a volatility spike during the forecastingperiod.

Key Words : Threshold Model, GARCH, Regime Switching, VolatilityJEL Classification: C1, C51, C58

∗I am deeply indebted to my supervisor, Professor John Knight, for his excellent guidance and continuous supportand grateful to Professor Martijn van Hasselt and Youngki Shin for their advice. I also thank Robert de Jong, EricHillebrand, Guofu Zhou, and other participants of the ESWC and the MEG conference 2010 for their suggestionsand valuable comments. I am responsible for all errors.

1

1 Introduction

Volatility modeling and forecasting are very important in financial markets, since volatility is akey component in pricing derivative securities, risk management, and making monetary policy.

Modeling the temporal dependencies in the conditional variance of financial time series hasbeen the interest of many economists and financial analysts. The most popular approaches arethe ARCH model introduced by Engle (1982) and its extension GARCH model by Bollerslev(1986). To capture the striking feature that asset prices move more rapidly during some periodsthan others, a regime switching framework has been brought into ARCH and GARCH models.A widely used class of regime switching models is the hidden Markov model, which assumes thatstates of the world are unknown. While estimation is not difficult, these models often fail togenerate accurate predictions due to the unknown state in the future. In this paper we employa different type of regime switching models – the threshold model – to describe the conditionalvariance process. In this threshold model, the state of the world is determined by an observablethreshold variable and therefore known, while conditional variance follows a GARCH process withineach state. This model can be viewed as a special case of the random coefficient GARCH model.First, we examine the theoretical and empirical properties of the threshold model with an exogenousthreshold variable. We establish theoretical conditions, which ensure that the return process in thethreshold model is strictly stationary, as well as conditions for the existence of various moments.A simulation study is then conducted to examine the finite sample properties of the maximumlikelihood estimator. The simulation results reveal that the maximum likelihood estimator isapproximately unbiased and consistent for modest sample sizes1 when the stationarity conditionshold. We also explore the properties of the threshold GARCH model when the threshold variableis endogenous through simulation studies. Furthermore, using a volatility index and volume asthe threshold variable, we employ 20 stocks from Major Market Index (MMI) and find that thethreshold model fits the data well. Even though the threshold model can well describe the regimeshifts in volatility process, it is a success only if it provides accurate volatility forecasts. Thereforewe also examine the out-of-sample forecasting performance of the threshold model using MMIstocks as well as 8 most active stocks in NASDAQ. The results support the use of the thresholdmodel, especially when the volatility experiences a sudden increase in the forecasting period, the

1We have estimated return data according to sample sizes ranging from 500, 1000, to 2000.

2

volatility forecasts from threshold model can pick up the volatility spike much faster than theforecasts from GARCH model.

1.1 Regime Shifts in Conditional Variance

In most widely used GARCH models the conditional variance is defined as a linear function oflagged conditional variances and squared past returns. Formally, let rt be a sequence of returns, εtbe a series of innovations that are usually assumed to be independent identically distributed (i.i.d.)zero-mean random variable, σ2

t be the variance of rt given information at time t, the GARCH(p, q)

model for returns rt is defined as follows:

rt = σtεt σ2t = α0 +

p∑i=1

βiσ2t−i +

q∑j=1

αjr2t−j

where p, q = 0, 1... are integers, α0 > 0, αj ≥ 0, βi ≥ 0, i = 1, ...p, j = 1, ...q, are modelparameters.

Though these models have been proved to be adequate for explaining the dependence structurein conditional variances, they have several important limitations, one of which is that they fail tocapture the stylized fact that conditional variance tends to be higher after a decrease in returnthan after an equal increase. In order to account for this asymmetry many alternative models havebeen proposed. The exponential GARCH (EGARCH) introduced by Nelson (1991) specifies theconditional variance in logarithmic form2:

lnσ2t = α0 +

p∑i=1

βilnσ2t−i +

q∑k=1

αk[θZt−k + γ(|Zt−k| − (2

π)

12 )] Zt = µt/σt

2The model takes the asymmetry into account while keeping the linear function form of conditional variance.

3

The threshold GARCH (TGARCH) model proposed by Zakoian (1994) and GJR GARCHmodel studied by Glosten, Jagannathan, and Runkle (1993) define the conditional variance as alinear piecewise function. In TGARCH(1,1) ,

σ2t = ω + αr2

t−1 + δDtr2t−1 + βσ2

t−1

Dt−1 =

1 rt−1 < 0

0 rt−1 ≥ 0

More details of such alternative models can be found in the survey of GARCH models byBollerslev, Chou, and Kroner (1992). The above alternative models are able to characterize somestylized facts better than the GARCH model. However there is no evidence that any alternativemodel consistently outperforms the GARCH model, for example Hansen and Lunde (2005) claimthat nothing beats a GARCH (1,1) in the analysis of the exchange rate data.

The TGARCH and GJR-GARCH models also relax the linear restriction on the conditionalvariance dynamics. Questioning the common finding of a high degree of persistence to the con-ditional variance in GARCH model, Lamoureux and Lastrapes (1990) suggest that such highpersistence may be spurious if there are regime shifts in the volatility process. From then bothARCH and GARCH models have been implemented with regime switching (RS) framework. Theearly RS applications, such as Hamilton and Susmel (1994), only allow a Markov-switching ARCHmodel to describe the conditional variances. Gray (1996) and Klaassen (2002), on the other hand,develop a generalized Markov-switching model, in which a GARCH process in conditional varianceis permitted in each regime. In comparison to the popular Markov-switching models, thresholdmodels have clear conceptual advantages while receiving less attention. The Markov-switchingmodels often fail to generate the accurate predictions due to the unknown state in the future.While in the threshold model, the state is governed by an observable threshold variable, thereforeis known. Knight and Satchell (2009) derive theoretical conditions for the existence of station-ary distributions for the threshold models. Based on their work it is now convenient to applythe threshold models as regime-switching models. To account for the possible structural changesin the conditional variances, we use a threshold model to describe regime switches in the condi-tional variance process. We simply assume 2 regimes for the conditional variance, which follows

4

a GARCH process within each regime. Different from Markov-switching models, the regimes areknown because their shifts are triggered by an observable variable. We just need to estimate thethreshold value which in turn determines the change of the state. The model is more complexsince the parameters controlling the conditional variances are changing over time, however it isstill flexible in the sense that single regime is a possible outcome in the estimation procedure3.

1.2 Exogenous and endogenous threshold variable

In addition to incorporating the nonlinearity in the threshold GARCH model, the threshold ortrigger variable takes into account the effect of correlation between conditional variance and otherobservable variables that represent trading activities. The use of the threshold model is particularlymotivated by the volatility volume relationship.

At the time of advancing the volatility modeling, an extensive study on stock return volatility-volume relation has been developed. As mentioned in Poon and Granger (2003) the volatility-volume research may lead to a new and better way for modeling the return distribution. Theearly work on the relationship between stock returns and trading volume that is summarized inKarpoff (1987) shows that volume is positively related to the absolute value of the price change.Later works further identify the positive contemporaneous correlation between return volatilityand volume (Gallant, Rossi, and Tauchen (1992), and Lamoureux and Lastrapes (1990)).

The empirical works establish various relationships between stock returns and trading volume,yet there is no consensus on how to model the underlying generating process theoretically. Thefavored theoretical explanation of positive price-volume correlation is the mixture of distributionhypothesis (MDH), which states that the stock returns and trading volume are driven by thesame underlying latent information variable (Clark (1973), Epps and Epps (1976), Tauchen andPitts (1983), Andersen (1996), and Bollerslev and Jubinski (1999)). One encouraging attempt isAndersen’s (1996) MDH model in which the joint dynamics of returns and volume are generalizedand estimated with a result of significant reduction in the volatility persistence.

More interestingly, recent findings suggest that the size of the trading volume, more specificallythe above average volume has significant effect on conditional variance (Wagner and Marsh 2004).

3If the estimated threshold value is the minimum or maximum of the trigger.

5

Intuitively, the price changes in a stock market can be regarded as a response to arrivals of infor-mation, while the volume of shares traded reflects the arrival rate of information. As mentionedin many studies stock prices experience volatile periods with high intensity of information arrivalsand tranquil periods accompanied by moderate trading activities. If we assume that the volatilityfollows different processes in different regimes, obviously volume provides information about whichregime the volatility is in.

The established volatility-volume relation motivates the use of volume as the trigger variablein our threshold GARCH model. Since volume and volatility are highly correlated, volume mustbe treated as an endogenous threshold variable. Nevertheless, other variables that reflect tradingactivities can also be accommodated. In this paper we first choose the Chicago Board OptionsExchange (CBOE) Volatility Index (VIX) as an exogenous threshold variable since it is a measureof market expectations of near-term volatility and therefore has almost no correlation with currentvolatility but provides information on the state of the current volatility. The VIX is calculatedand disseminated in real-time by CBOE since 1993. It is a weighted blend of prices for a range ofoptions on the S&P 500 index. The formula uses a kernel-smoothed estimator that takes as inputsthe current market prices for all near-term and next-term out-of-the-money calls and puts with atleast 8 days left to expiration. The goal is to estimate the implied volatility of the S&P 500 indexover the next 30 days.

Even though the theoretical conditions for an endogenous threshold variable model cannot bederived, we simulate data according to endogenous threshold variable and examine the performanceof the maximum likelihood estimators based on different endogeneity levels between the thresholdvariable and volatility. The simulation results reveal that the bigger the endogeneity coefficientthe better the performance of MLE estimators. We then apply our model to empirical data usingvolume as the threshold variable.

1.3 Volatility Forecasting

The success of a volatility model is determined crucially by its out-of-sample predicting power.Therefore, extensive research has been devoted to this subject. In the 2003 survey on the volatilityforecasting literature, Poon and Granger (2003) reviewed 93 published and working papers that

6

study the forecasting performance of various volatility models. The comparisons among differentforecasting models show a mixed picture. Poon and Granger conclude that the overall rankingfavors ISD (option implied standard deviation) model, while HISVOL (historical volatility models)and GARCH models are roughly equal. However, they also mention that the success of the impliedvolatility models is benefited from using a larger information set, but they are less practical due tothe availability of options. Since GARCH models perform well in forecasting volatility as describedin Hansen and Lunde (2005), we compare the forecasting performance of the threshold model withGARCH(1,1) model. In volatility forecasting literature there is also a big concern on how shouldthe true volatility be measured. In fact the accuracy of measures of actual volatility has significanteffect on the outcomes in comparing volatility models. Most of the early works use the dailysquared return to proxy actual daily volatility, though as shown in Lopez (2001), while squaredreturn is an unbiased estimator of daily variance, it is a very noise measure of true variance. Takingthis into account, besides using daily squared return, we also compare our volatility forecast withrealized volatility that is constructed from high frequency data of IBM.

Our paper is organized as follows. Section 2 introduces the model and derives the stationarityconditions. Section 3 provides the simulation study. Section 4 discusses the empirical findings,and section 5 concludes.

2 Model

2.1 Introduction

The threshold GARCH model we study in this paper is defined as follows:

rt =σtεt

σ2t =ωst−1 + αst−1r

2t−1 + βst−1σ

2t−1

where rt is the series of demeaned returns and σ2t is the conditional variance of returns given time

t information. We assume that the sequence of innovations εt follow independent and identicaldistribution with mean 0 and variance 1: εt ∼ iidD(0, 1). The parameters {ωSt , αSt , βSt} in theconditional variance equation depend on a threshold variable yt:

7

σ2t = ω0 + α0r

2t−1 + β0σ

2t−1 if St−1 = I(yt−1 ≤ y0) = 0

σ2t = ω1 + α1r

2t−1 + β1σ

2t−1 if St−1 = I(yt−1 > y0) = 1

where the state or regime of the world St is determined by threshold variable yt−1 which can betreated as exogenous or endogenous and threshold value y0 determines the probabilityp(St = 1) = p(yt−1 > y0) = π. To simplify the theoretical derivation, we assume the thresholdvariable is independent of σ2

t .

As in the standard GARCH(1,1) model we impose the non-negative constraints on all param-eters to ensure the conditional variance to be non-negative. However, the conventional stationaryconditions for GARCH model may not apply here. Since the conditional variance can fall into2 different regimes, it is possible that conditional variance is not stationary in one regime butstationary in the other.

For the threshold variable y, we assume that it is a stationary process. This assumption is notcritical, we just want to ensure that given a threshold value, if we leave one regime, it is possiblethat we will return to that regime in the future. If the threshold variable is not stationary, then itis possible that after a point in time, we will only observe one state of the world, this is not a caseof interest in this paper.

The conditional variance dynamics in the threshold GARCH model we define above is similarto a threshold AR (TAR) model. Knight and Satchell (2009) derive the stationarity conditionsfor TAR model following the work of Quinn (1982). We follow Knight and Satchell (2009) inderiving the stationarity conditions for the conditional variance and the return series accordingly.Proposition 1 gives conditions for the existence of stationary solution of return process as well asthe existence of the mean in the threshold GARCH model. We also examine the conditions forthe existence of higher order moments. Proposition 2 provides conditions for the return processto have a stationary variance and Proposition 3 presents conditions for the existence of the fourthmoment. Since the return processes experience low autocorrelation but squared returns are highlycorrelated, we are also interested in examining the theoretical autocorrelation structure of thesquared return. Proposition 4 expresses the formulas for the squared return autocovariance andautocorrelation functions.

8

2.2 Stationary Return Process

2.2.1 Mean and Variance Stationarity Conditions

Given the assumptions that εt is iid distributed variable with D(0, 1) and is independent of σt, it’seasy to see that the return series is mean stationary with E(rt) = 0. To simplify the expression ofhigher order moments we further assume that εt ∼ iid N(0, 1). Thus the unconditional varianceand the fourth moment of return are given by E(r2

t ) = E(σ2t ) and E(r4

t ) = 3E(σ4t ). Obviously to

examine the stationarity of the return series we need to check the first and second moments of theconditional variance σ2

t . The following propositions give the conditions under which the stationarydistribution of return, the stationary variance, the finite fourth moment of return process, andstationary covariance exist. Proofs of the propositions are provided in Appendix A.

PROPOSITION 1. The return series is strict stationary if ω0 <∞, ω1 <∞, and:

(1− π)E[ln(ε2t−mα0 + β0)] + πE[ln(ε2

t−mα1 + β1)] < 0

�

Remark. If we assume that εt ∼ iidN(0, 1), then ε2t v χ2

(1). We obtain the following analyticalexpression for the above strict stationarity condition:

(1− π)E(ln(ε2t−mα0 + β0)) + πE(ln(ε2

t−mα1 + β1)) < 0⇐⇒

(1− π)F (α0, β0) + πF (α1, β1) < 0

where

F (a, b) = ln(b) +b

2a√π

(−2(γ + ln

2b

a)√πa

b− 2√π2F2([1, 1]; [

3

2, 2];

b

2a) +

2π32

√a2

berfi(

√b

2a)

√b

).

γ is the Euler’s constant = 0.577215665.

9

2F2(a, b; c, d;x) =∑∞

n=0

(a)n(b)n(c)n(d)n

xn

n!, (a)n = a(a+ 1)...(a+ n− 1)

erfi(x) = −ierf(ix), erf(x) is the error function.

We can now examine the first order stationarity conditions for the conditional variance processin our threshold GARCH model.

PROPOSITION 2. The return series will be variance stationary if ω0 <∞, ω1 <∞, and:

[(α0 + β0)(1− π) + (α1 + β1)π] < 1

Then the stationary variance is given by:

V ar(rt) = E(σ2t ) = σ2 =

ω0(1− π) + ω1π

1− [(α0 + β0)(1− π) + (α1 + β1)π].

�

2.2.2 Higher Order Moments and Covariance Stationary Conditions

Examining the second moment of σ2t , we can obtain the fourth moment of returns.

PROPOSITION 3. If and only if the following conditions hold:

ω0 <∞, ω1 <∞, [(α0 + β0)(1− π) + (α1 + β1)π] < 1 andA = [(2α2

0 + (α0 + β0)2)(1− π) + (2α21 + (α1 + β1)2)π] < 1

10

The fourth moment of the stationary distribution exists for the return process in our thresholdmodel and is given by

E(r4t ) = 3E(σ4

t ) = 3c20

1 + a0 + b0

(1− A)(1− (a0 + b0))+ 3c1π(1−π)

c1(1− (a0 + b0)) + 2c0(a1 + b1)(1− A)

(1− A)(1− (a0 + b0)�

Using the results from the first and second moments of σ2t , we can now derive the formulas for

the autocovariance and autocorrelation functions of squared returns:

γ(k) = E(r2t − σ2)(r2

t−k − σ2) and ρ(k) =γ(k)

γ(0).

PROPOSITION 4. If the conditions in proposition 3 hold, and let γ(k) = Cov(r2t , r

2t−k) and

ρ(k) =Cov(r2

t , r2t−k)

V ar(r2t )

. Then, for all k ≥ 2,

γ(k) = (a0 + b0)γ(k − 1) and for all k≥ 1 ρ(k) = (a0 + b0)k−1ρ(1)

where

ρ(1) =c2

0[2a0 − 2a0b0A0 + A0(2a21 + (a1 + b1)2)π(1− π)]

c20(2 + A− 3A2

0) + 3c1π(1− π)[c1(1− A0) + 2c0(a1 + b1)(1− A)]

+(3a0 + b0(1 + 2A0))π(1− π)[c2

1 + 2c0c1(a1 + b1)(1− A)]

c20(2 + A− 3A2

0) + 3c1π(1− π)[c1(1− A0) + 2c0(a1 + b1)(1− A)]

with

11

A = (2α20 + (α0 + β0)2)(1− π) + (2α2

1 + (α1 + β1)2)π

A0 = a0 + b0 = (α0 + β0)(1− π) + (α1 + β1)π

�

2.2.3 The Range of Parameters under Stationary Conditions

We recall that π is the probability that the volatility process is in regime 2, α0, β0 are parametersin regime 1 and α1, β1 are parameters in regime 2. From the stationary conditions derived inthe previous section, we note that since the parameter π enters into the conditions, the sum ofthe parameter values in each regime is no longer required to be less than one. For example tohave a strict stationary return process, we allow the sum of the parameters in both regimes to bebigger than one. However, for GARCH type of models we usually require the finite variance of thereturn process. To obtain a variance stationary process we just need a weighted sum of the sumsof parameters in two regimes to be less than one: [(α0 + β0)(1− π) + (α1 + β1)π] < 1, therefore wemay have a sum of parameters in one regime to be bigger than one. To examine the effect of π onthe range of stationary areas, we graph the stationary areas of parameters in one regime based ondifferent π values for the fixed parameter values in another regime.

We discuss the stationary areas of α1 and β1 when π varies from 0.1, 0.5, to 0.9 for four sets ofparameter values of α0 and β0.4 According to the stationary conditions we derived in last section,the return series will have a variance stationary distribution if (α0 +β0)(1−π)+(α1+β1)π < 1, andthe fourth moment of return series exists if (2α2

0 +(α0 +β0)2)(1−π)+(2α21 +(α1 +β1)2)π < 1. It’s

easy to verify that if π increases, the weight of (α1+β1) increases, therefore the range of (α1+β1) willdecrease for all cases where (α0+β0)<1. In the second graph of Figure 1 we observe the boundariesof the strict stationary area, the variance stationary area, and the fourth moment stationary areamove towards the origin when π increases from 0.1 to 0.9. However when (α0 + β0) = 1, there isno clear patterns for the movement of the stationary areas of (α1 + β1) with different π values, forexample the variance stationary conditions are same for all three π values.

4We set {α0, β0} = {0.25, 0.75},{0.25, 0.5}, {0.25, 0.25}, {0.25, 0} respectively.

12

Figure 1: The Stationary Areas of α1 and β1 given {α0, β0} = {0.25, 0.75},{0.25, 0.5}

In Figure 1 the areas below the solid, dotted, and dashed lines satisfy the three stationaryrestrictions for π varying from 0.1, 0.5, to 0.9 respectively. For each π value, there are three linescorresponding to three stationarity conditions, however when (α0 + β0) = 1, there is no valuesof α1 and β1 that satisfy the fourth-order stationarity condition when π = 0.1. For examplein the first graph from Figure 1, there are only 2 solid lines representing strict stationary andvariance stationary areas. Also since the sum of parameters in regime one is one, the restrictionfor variance stationary distribution requires the sum of parameters in regime two to be less than1 regardless of the value of π, so the boundaries for the variance stationary distribution in thegraph are identical for all different π values (represented by the blue line in the first graph). Foreach π value, the areas are shrinking when we impose further restriction on stationarity. Butthere is no pattern for the stationary areas when π changes, the strict stationary area for π = 0.5

is larger than that for π = 0.9, whereas the fourth-order stationary area for π = 0.9 is largerthan that for π = 0.5. Nevertheless for all three cases in which (α0+β0)<1, we observe a clearpattern that when π increases, the stationary areas shrink. The second graph in Figure 1 assumesthat (α0,β0)=(0.25,0.5) , it excludes the strict stationary condition for π = 0.1 since it has a x-axisintercept beyond 90. It is not very surprising, since when π = 0.5, the strict stationary requirement

13

for α1 is less than 3.43 when β1 = 0.1, if probability π is much smaller, the value of parameterscan be very large. When {α0, β0} = {0.25, 0.25}, {0.25, 0}, the graphs exhibit a similar patternas the graph for {α0, β0} = {0.25, 0.5}. 5

3 A Simulation Study

3.1 The Simulated Paths of Return Series

In the previous section we derive the stationarity conditions for the return series described by ourthreshold GARCH model. Now we proceed with a simulation study to examine the estimationperformance of this model under different stationary conditions.

For the simulation study, we choose 3 sets of parameters for π = 0.1, 0.5, 0.9 respectively. Thevalue of ω0 and ω1 are set to be 0.02 and 0.01 for all cases. The values of α0 and β0 are fixed at 0.25and 0.5, then β1 is selected from the different regions in the stationary areas given α1 = 0.25 asshown in the second graph from Figure 1. We choose the parameters in such a way that the regime1 is always stationary, based on different probabilities with which conditional variance shifts toregime 2, we could have a non-stationary regime 2 but the whole process is still stationary.

Case 1, π = 0.1:1.1 Stationary with 4th moment {α1, β1} = {0.25, 1.5}

1.2 V ariance Stationary {α1, β1} = {0.25, 2.5}

1.3 Strict Stationary {α1, β1} = {0.25, 3}Case 2, π = 0.5:

2.1 Stationary with 4th moment {α1, β1} = {0.25, 0.75}


2.3 Strict Stationary {α1, β1} = {0.25, 1}Case 3, π = 0.9:

5In the legend of the graph, SS=strict stationary, 2S=variance stationary, 4S=4th order stationary.

14

3.1 Stationary with 4th moment {α1, β1} = {0.25, 0.7}


3.3 Strict Stationary {α1, β1} = {0.25, 0.9}Using Case 1.1 as an example, the data generating process is described as follows:

rt = σtεt

σ2t =

{0.02 + 0.25r2

t−1 + 0.5σ2t−1 if yt−1 ≤ y0

0.01 + 0.25r2t−1 + 1.5σ2

t−1 if yt−1 > y0

εt, yt is drawn independently from standard normal distribution, y0 is chosen in a way such thatp(St = 1) = p(yt > y0) = 0.1 , and σ0 is set to 0. We generate 5000 observations using the specifiedparameters, and to eliminate the possible initial value effect, we drop first 3000 observations.

The paths of return series depend crucially on the parameters in volatility process. The fol-lowing figure shows the stationary paths of return series given that parameters are specified as inCase 1.1.6

Figure 2: Simulated Paths of the Return Series under Threshold GARCH model

0 500 1000 1500 2000 2500 3000−2.5

−2

−1.5

−1

−0.5

0

0.5

1

1.5

2

2.5stationary return series with fourth moment

6The parameters used in the simulated path are: π = 0.1 {ω0, α0, β0} = {0.02, 0.25, 0.5} {ω1, α1, β1} ={0.01, 0.25, 1.5}

15

3.2 The Performance of MLE Estimator

In this section we examine the performance of the maximum likelihood estimator. Given thatthe return series is conditionally normally distributed, the log likelihood function for a sample ofT observations is:

lnLT (θ) = −1/2T∑t=1

lnσ2t − 1/2

T∑t=1

r2t

σ2t

where θ = {ω0, ω1, α0, α1, β0, β1}. We know that to estimate θ, we need to estimate the thresholdvalue y0 so that the above likelihood function can be formulated. Here we estimate y0 by gridsearch, the threshold variable yt is sorted and for each possible threshold value y0 we calculatethe corresponding likelihood and the estimated threshold value is the one which maximizes thelikelihood:

θ̂(y0) = argmaxθ∈Θ

T−1lnLT (θ)

The asymptotic theory for the maximum likelihood estimates of the parameters of the thresholdGARCH model gives rise to difficulties because of the non-differentiability due to the threshold.Therefore we conduct a simulation study to analyze the finite sample properties of the maximumlikelihood estimator.

Firstly, we present the estimation results for 3 sets of parameters in Case 1 when π = 0.1,considering the sample sizes for 500, 1000, and 2000. The MSE is defined as mean squared errors

of estimates from true parameter values MSE =1

T

∑(θ̂− θ)2 for θ = {ω0, ω1, α0, α1, β0, β1}. The

results are based on 1000 replications. For simplicity we estimate the threshold value by searchingover the 19 grid points ranging from the 5th percentile to the 95th percentile point of thresholdvariable in jumps of 5.

16

Table 1: The MLE estimates for parameters in Case 1Par True

Value(1.1)

Estimate MSE TrueValue(1.2)


Estimate MSE

ω0 0.02 .0210 .0001 0.02 .0209 .0000 0.02 .0205 .0001ω1 0.01 .0072 .0045 0.01 .0096 .0077 0.01 .0114 .0094

T=500 α0 0.25 .2389 .0076 0.25 .2487 .0051 0.25 .2508 .0077α1 0.25 .2711 .1194 0.25 .3077 .2158 0.25 .3356 .2801β0 0.5 .4905 .0232 0.5 .4908 .0077 0.5 .4725 .0136β1 1.5 1.4712 .8277 2.5 2.3337 1.0359 3 2.6001 1.5691ω0 0.02 .0206 .0000 0.02 .0204 .0000 0.02 .0238 .0131ω1 0.01 .0097 .0015 0.01 .0107 .0025 0.01 .0136 .0032

T=1000 α0 0.25 .2474 .0021 0.25 .2527 .0023 0.25 .2555 .0045α1 0.25 .2569 .0464 0.25 .2843 .0880 0.25 .3160 .1246β0 0.5 .4949 .0042 0.5 .4929 .0037 0.5 .4776 .0093β1 1.5 1.4762 .2526 2.5 2.3752 .4346 3 2.6380 .9153ω0 0.02 .0202 .0000 0.02 .0200 .0000 0.02 .0196 .0000ω1 0.01 .0094 .0007 0.01 .0100 .0012 0.01 .0127 .0018

T=2000 α0 0.25 .2479 .0011 0.25 .2513 .0016 0.25 .2525 .0038α1 0.25 .2519 .0185 0.25 .2700 .0376 0.25 .2910 .0568β0 0.5 .4993 .0022 0.5 .4952 .0031 0.5 .4780 .0096β1 1.5 1.4917 .1096 2.5 2.4091 .2396 3 2.6786 .6989

Table 1 presents the estimation results for 3 sets of parameters in Case 1. When π = 0.1, sincethe probability that the conditional variance changes to regime 2 is small, we just need the sum ofparameters to be less than 3.25 to fulfill the requirement for a strict stationary distribution in Case1.3. In all 3 sets of parameters, whether they are strict stationary with finite fourth moment (1.1),variance stationary (1.2), or strict stationary (1.3) , the MLE estimator appears to be consistentwith the mean values of the estimates approaching the true parameter values when the sample sizeincreases from 500 to 2000. The MSE decreases when sample size increases. We notice that theMSE for α1 and β1 are substantially larger than that of α0 and β0, this is caused by the natureof non-stationarity in the corresponding regime and small probability to enter that regime. Whenπ = 0.1, only 10% of the observations belong to the regime with α1 and β1. We also note thatthe non-existence of moments results in more biased estimates and fatter tails in the distributionof estimates as we move across the Table 1 from left to right. The left column has finite first,

17

second, and fourth moments, the middle column has finite first and second moments, while theright column has only finite first moment. Without the existence of finite variance, the MSE can bevery large, but here the estimates are reasonably good since only small portion of data is generatedby the non-stationary regime.

Figure 3-5 provide the estimated density of MLE estimates summarized in the above table.The estimated density is computed using kernel smoothing method. The MLE estimates areapproximately consistent even when the variance stationarity condition is violated. As sample sizeincreases from 500 to 2000, the MLE estimates become more efficient with smaller variances andmore concentrated around true parameters. We present the density estimates for sample size of500, 1000, and 2000 in dotted line, dashed line, and dotted and dashed line respectively, while thetrue parameter values are given by the solid line.

18

Figure 3: Kernel Smoothing Density Estimates of MLE for Stationary Returns with Fourth Mo-ment when π = 0.1

−0.02 −0.01 0 0.01 0.02 0.03 0.04 0.05 0.06 0.070

20

40

60

80

100

120

estimates

Kernal Smoothing Density Estimates of MLE ω0 when π=0.1

Sample Size 500Sample Size 1000Sample Size 2000True Value

−0.25 −0.2 −0.15 −0.1 −0.05 0 0.05 0.1 0.15 0.2 0.250

2

4

6

8

10

12

14

16

estimates


−0.1 0 0.1 0.2 0.3 0.4 0.5 0.60

2

4

6

8

10

12

estimates

Kernal Smoothing Density Estimates of MLE α0 when π=0.1

−1 −0.5 0 0.5 1 1.50

0.5

1

1.5

2

2.5

3

estimates


0 0.2 0.4 0.6 0.8 1 1.20

1

2

3

4

5

6

7

8

9

10

estimates

Kernal Smoothing Density Estimates of MLE β0 when π=0.1

−1 0 1 2 3 4 50

0.2

0.4

0.6

0.8

1

1.2

estimates


19

Figure 4: Kernel Smoothing Density Estimates of MLE for Stationary Returns without FourthMoment when π = 0.1

0 0.005 0.01 0.015 0.02 0.025 0.03 0.035 0.04 0.045 0.050

20

40

60

80

100

120

140

160

estimates



−0.4 −0.3 −0.2 −0.1 0 0.1 0.2 0.3 0.40

2

4

6

8

10

12

estimates


−0.05 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.450

2

4

6

8

10

12

estimates


−1 −0.5 0 0.5 1 1.5 2 2.50

0.5

1

1.5

2

estimates


−0.1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.80

2

4

6

8

10

12

estimates


0 1 2 3 4 5 60

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

estimates


20

Figure 5: Kernel Smoothing Density Estimates of MLE for Strict Stationary Returns when π = 0.1

−0.02 −0.01 0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.080

20

40

60

80

100

120

140

160

estimates



−0.4 −0.3 −0.2 −0.1 0 0.1 0.2 0.3 0.40

1

2

3

4

5

6

7

8

9

10

11

estimates


0 0.1 0.2 0.3 0.4 0.5 0.60

2

4

6

8

10

12

estimates


−1.5 −1 −0.5 0 0.5 1 1.5 2 2.5 30

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

2

estimates


0.2 0.3 0.4 0.5 0.6 0.7 0.80

2

4

6

8

10

12

estimates


0 1 2 3 4 5 6 70

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

estimates


Similar results are obtained for other 2 cases and reported in Appendix B. Table B1 presentsthe estimation results for 3 sets of parameters in Case 2. MLE estimators are still consistent andefficient as sample size increases. We also observe that the MSE of estimates in each regime arenot substantially different as reported in Table 1. It may be caused by the fact that probabilities ofconditional variance in each regime are equal, and we also expect higher MSE for estimates in non-variance stationary regime. Table B2 presents estimation results in Case 3. When the probability

21

that conditional variance process in regime 2 equals 0.9, even in the variance stationary case wewill no longer have consistent estimator of β1, so we just report the results for Case 3.1 and 3.2and skip the strict stationary Case 3.3. Since regime 2 is more volatile regime, here the highprobability that the conditional variance is in such regime may be the reason that we fail to getconsistent estimator. We also notice that the estimates of parameters in regime 1 turn out tohave larger MSE. It confirms our assertion that the small probability in one regime affects theperformance of estimates in that regime. Figure B1-B3 provide the estimated density of MLEestimates summarized in Table B1. The MLE estimates are approximately consistent even whenthe variance stationarity condition is violated. Figure B4-B5 provide the estimated density of MLEestimates summarized in Table B2.

3.3 Simulation Study for Endogenous Threshold Variable

The well established volume-volatility relationship inspires the use of volume as the threshold vari-able in our threshold GARCH model, however the high correlation between volume and volatilitymakes this threshold variable endogenous. The endogeneity of the threshold variable renders thetheoretical derivation of the stationarity conditions impossible. Therefore we design a simulationstudy to examine the effect of endogeneity of the threshold variable on the properties of the returnseries and the maximum likelihood estimator.

We simply assume that the threshold variable is a linear function of squared returns with somerandom error.

Under the threshold GARCH model, the demeaned return series and the conditional varianceare given by:

rt = σtεt{σ2t = ω0 + α0rt−1 + β0σt−1 if yt−1 ≤ y0

σ2t = ω1 + α1rt−1 + β1σt−1 if yt−1 > y0

yt = ar2t + vt

22

where εt follows the independent and identical normal distribution with mean 0 and variance 1:εt ∼ iidN(0, 1), and vt is an i.i.d. normal variable with mean 0 and variance σ2

v . εt and vt areindependent.

In this simple data generating process the correlation between the squared return r2t and the

threshold variable yt is governed by the coefficient a when σv is small.

yt = ar2t + vt

E(yt) = aE(r2t )

V ar(yt) = a2V ar(r2t ) + σ2

v

Cov(r2t , yt) = E(r2

t yt)− E(r2t )E(yt)

= E(ar4t )− E(r2

t )aE(r2t )

= aV ar(r2t )

Corr(r2t , yt) =

Cov(r2t , yt)√

V ar(r2t )V ar(yt)

=aV ar(r2

t )√V ar(r2

t )(a2V ar(r2

t ) + σ2v)

= (a2V ar(r2

t )2

a2V ar(r2t )

2 + σ2vV ar(r

2t )

)1/2

It is obvious that as a→ 1 and σv → 0, Corr(r2t , yt)→ 1, and as a→ 0, Corr(r2

t , yt)→ 0.We choose the value of a = (0.1, 0.2, 0.3, 0.4), the parameter values in the conditional variance

process are set as {ω0, ω1, α0, α1, β0, β1} = {0.01, 0.02, 0.1, 0.2, 0.55, 0.75}. Correspondingto these parameter values, the average correlation coefficients between the squared return andthe threshold variable are {0.09, 0.22, 0.36, 0.51} for the simulated data sets. We generate 1000bivariate series each with 4000 observations, the first 2000 observations are dropped to eliminatethe initial value problem.

We use MSE to evaluate the performance of estimators. The MSE is defined as mean squared

errors of estimates from true parameter valuesMSE =1

T

∑(θ̂−θ)2 for θ = {ω0, ω1, α0, α1, β0, β1}.

23

We also check the sample standard error for the estimates Std =

√1

T − 1

∑(θ̂ − θ̄)2. The results

are based on 1000 replications. For simplicity we estimate the threshold value by searching overthe 19 grid points range from 5% percentile to 95% percentile points of threshold variable.

Table 2 presents the results of estimates when correlation coefficient ranging from 0.1 to 0.4.

Table 2: The Performance of MLE Estimates with Endogenous Thresholda = 0.1 a = 0.2 a = 0.3 a = 0.4

Par θ̂ MSE Std θ̂ MSE Std θ̂ MSE Std θ̂ MSE Stdω0 .0107 .0001 .0092 .0101 .0003 .0167 .0102 .0001 .0080 .0101 .0001 .0072ω1 .0202 .0007 .0270 .0207 .0004 .0197 .0224 .0003 .0165 .0226 .0003 .0162

T=500 α0 .0894 .0072 .0844 .0932 .0087 .0931 .0942 .0092 .0958 .0951 .0146 .1206α1 .1804 .0330 .1806 .1754 .0168 .1274 .1788 .0134 .1140 .1800 .0098 .0971β0 .5493 .0468 .2165 .5566 .1403 .3747 .5510 .0316 .1778 .5545 .0276 .1663β1 .7695 .3637 .6031 .7634 .2016 .4490 .7438 .1315 .3625 .7311 .0898 2992ω0 .0101 .0000 .0043 .0101 .0000 .0040 .0100 .0000 .0037 .0100 .0000 .0038ω1 .0208 .0001 .0111 .0211 .0001 .0093 .0213 .0001 .0085 .0214 .0001 .0073

T=1000 α0 .0983 .0016 .0396 .0984 .0017 .0418 .0993 .0021 .0464 .0976 .0028 .0533α1 .1888 .0070 .0827 .1903 .0054 .0732 .1923 .0046 .0675 .1941 .0037 .0603β0 .5489 .0114 .1069 .5487 .0096 .0980 .5514 .0077 .0878 .5530 .0070 .0837β1 .7406 .0551 .2347 .7357 .0356 .1883 .7336 .0284 .1679 .7309 .0188 .1359ω0 .0100 .0000 .0026 .0101 .0000 .0026 .0101 .0000 .0025 .0101 .0000 .0024ω1 .0205 .0000 .0063 .0207 .0000 .0055 .0205 .0000 .0048 .0206 .0000 .0044

T=2000 α0 .1010 .0007 .0269 .1008 .0008 .0291 .1020 .0010 .0320 .1016 .0012 .0341α1 .1961 .0029 .0541 .1968 .0023 .0480 .1985 .0019 .0436 .1987 .0017 .0407β0 .5506 .0044 .0663 .5498 .0039 .0622 .5490 .0034 .0580 .5502 .0030 .0546β1 .7358 .0184 .1349 .7370 .0134 .1152 .7380 .0094 .0961 .7392 .0072 .0840

In Table 2, we see that the performance of the MLE estimators improved when sample sizeincreases as well as when the correlation between the squared returns and the threshold variableincreases. When sample size is 1000, the MSE and the standard error of MLE estimator decreasewhen a changes from 0.1 to 0.4, except for one parameter α0. Nevertheless, as the sample size is2000, the changes in standard error and MSE of α0 became smaller when endogeneity increases.

The results seem inconsistent at the first glance, usually we fail to obtain an efficient estimatorwhen dealing with endogenous variables in solving economic problems. However here the threshold

24

variable is not an explanatory variable in the return or the volatility dynamics, it is an informa-tion variable and the higher the endogeneity the more the information provided by the thresholdvariable, therefore the better the performance of the estimator.

3.4 Simulation Study for Forecasting Performance

In the previous sections we show that the parameters from the threshold GARCH model canbe estimated efficiently. Now we examine the forecasting performance of the threshold modelsince the predicting power is critical in determining the success of a volatility model. First,assuming that the data is generated by a threshold model we estimate both the threshold modeland simple GARCH(1,1) model, then construct the forecasts based on estimated parameters fromtwo models, the results are compared according to 5 common measures. Then, we conduct a modelmisspecification test. Assuming that data is generated by GARCH model, but we use estimatedparameters from threshold model to forecast volatility.

3.4.1 Forecasting Performance of Threshold Model with Threshold DGP

First, we assume that the return data is generated by the threshold GARCH model. In Section2 we show that different from GARCH model, threshold model allows the sum of parameters to begreater than one in one regime but keeps the whole process stationary. Therefore we use two setsof parameters to generate stationary return data, one with stationary parameters in both regimes,another one with non-stationary parameters in one regime.

We generate 1000 return series given the stationary parameters as follows:rit =σitεit

σ2it = 0.02 + 0.15σ2

it−1ε2it−1 + 0.55σ2

it−1 if yt−1 ≤ y0

σ2it = 0.01 + 0.05σ2

it−1ε2it−1 + 0.90σ2

it−1 if yt−1 > y0

where the innovations εit are independently and identically distributed standard normal randomvariables, and we use VIX data for the threshold variable yt with y0 = mean(yt) = 19.6696 for allreturn series from i = 1...1000.

We generate 4750 observations for each return series and drop first 1750 observations to elimi-nate the initial value effect.

25

In the sample of 3000 observations, the first 2750 observations are used to estimate the thresholdGARCH model, then the estimated parameters and threshold value are used to construct the one-day ahead forecast for the remaining 250 days. In the simulation study as well as the empiricalapplication, we use the following 5 measures to compare the forecasting performance:

MEi =1

T

∑(σ2

it − σ̂2it)

MPEi =1

T

∑(σ2

it − σ̂2it)/σ̂

2it

RMSEi =

√1

T

∑(σ2

it − σ̂2it)

2

HMSEi =1

T

∑(σ2

it/σ̂2it − 1)2

and R2i obtained from regressing the actual conditional variance σ2

i on the forecasts σ̂2i :

σ2i = a+ bσ̂2

i + vt

For each measure we compute the average over 1000 replications and compare with the samemeasures obtained by the standard GARCH(1,1) model estimation. The results are presented inTable 3.

Table 3: Forecasting Performance Based on Threshold GARCH DGPME MPE RMSE HMSE R2

TS GARCH -.0004 .0049 .0881 .3949 .9256GARCH .0075 .0991 .1712 3.9838 .6905

The results show that if the data are generated by threshold GARCH model, then the forecast-ing performance of threshold GARCH model is much better than the standard GARCH model.Figure 6 gives an example of the comparison of the estimated volatility by threshold GARCHmodel and GARCH model.

26

Figure 6: Forecasting Performance by TS GARCH and GARCH

0 50 100 150 200 2500.05

0.1

0.15

0.2

0.25

0.3

0.35

0.4

0.45

volat

ility

Volatility Forecasting Comparison by TS GARCH and GARCH

True VolatilityTS GARCHGARCH

The above estimated results of threshold GARCH model is based on a grid search of thresholdvariable over 19 points range from the 5th percentile to the 95th percentile of the threshold variablein jumps of 5%.

In order to obtain an efficient estimate of the threshold value, we should search over all valuesin the sample of threshold variable. Since each grid search involves a minimization problem, itis very computationally costly to search over the entire sample of threshold variable. Due to thecomputational burden, we only search the very limited points in the range of the threshold variable.Table 4 presents the results of estimated parameters over 3 different jump sizes in the grid points.

27

Table 4: Estimation Performance with Different Grid Points in Threshold VariableTrue 5% 2.5% 1%Value θ̂ Std MSE θ̂ Std MSE θ̂ Std MSE

ω0 0.02 .0184 .0061 .0000 .0193 .0061 .0000 .0196 .0059 .0000ω1 0.01 .0127 .0052 .0000 .0125 .0053 .0000 .0126 .0056 .0000α0 0.15 .1343 .0348 .0015 .1397 .0346 .0013 .1394 .0346 .0013α1 0.05 .0530 .0171 .0003 .0524 .0171 .0003 .0517 .0173 .0003β0 0.55 .5895 .1083 .0133 .5711 .1073 .0119 .5654 .1035 .0110β1 0.9 .8829 .0370 .0017 .8840 .0382 .0017 .8845 .0394 .0018y0 19.67 19.87 19.81 19.81

It is clear from Table 4 that as the finer grid intervals are used, the estimated parameters andthreshold value are more closer to that of the true parameter values. Nevertheless, the estimationresults using a coarser grid interval are not much worse than using a finer grid interval, but theestimation process using the finest grid interval requires much more computing time than thatusing a coarser interval. Therefore, in the empirical application we search over 37 points rangefrom the 5th percentile to the 95th percentile point of threshold variable in jumps of 2.5%.

Since the threshold model allows the parameters to be non-stationary in one regime whilekeeping the whole process stationary, we also generate return data according to the followingmodel:

rit =σitεit

σ2it = 0.02 + 0.15σ2

it−1ε2it−1 + 0.55σ2

it−1 if yt−1 ≤ y0

σ2it = 0.01 + 0.25σ2

it−1ε2it−1 + 0.80σ2

it−1 if yt−1 > y0

where the innovations εit are independently and identically distributed standard normal randomvariables, and we use VIX data for the threshold variable yt with y0 = 31 for all return seriesfrom i = 1...1000.

Similarly, we generate 4750 observations for each return series and drop first 1750 observationsto eliminate the initial value effect. In the sample of 3000 observations, the first 2750 observationsare used to estimate the threshold GARCH model, then the estimated parameters and thresholdvalue are used to construct the one-day ahead forecast for the remaining 250 days. Table 5 givesthe results of forecasting comparison based on 5 measures.

28

Table 5: Forecasting Performance Based on Threshold GARCH DGP with Non-Stationary Pa-rameters

ME MPE RMSE HMSE R2

TS GARCH .0259 .0363 2.0194 4.0616 .9708GARCH .2990 .2504 4.7271 38.8685 .8287

The results show that even if the parameters are non-stationary in one regime, both models cangenerate reasonable forecasts based on high R2. Nonetheless, the threshold model still performssignificantly better than GARCH model. The return data has only 10% generated by the non-stationary process, which possibly explains why R2 increases in this case for GARCH model.

3.4.2 Forecasting Performance of Threshold Model with GARCH DGP

We showed that if the data generating process follows the threshold GARCH model, the fore-casting performance is quite good using the threshold model. The average R2 from regressingthe true volatility on the estimated volatility over 1000 replication is more than 90%. Now weperform a model misspecification test on the forecasting power of the threshold GARCH model.If the data generating process follows a standard GARCH process, theoretically we will estimatethe model well therefore also forecast well. To examine this property, we simulate data accordingto a GARCH process:

rit =σitεit

σ2it = 0.02 + 0.05σ2

it−1ε2it−1 + 0.85σ2

it−1

where the innovations εit are independently and identically distributed standard normal randomvariables, and when estimate the threshold model we use an independent standard normalvariable as the threshold variable yt for all return series from i = 1...1000.

We generate 5000 observations for each return series and drop first 2000 observations to elimi-nate the initial value effect. In the remaining sample of 3000 observations, we use 2750 observationsfor in-sample estimation and the rest 250 for out-sample forecasting. Table 6 shows the average of5 forecasting measures over 1000 replications.

29

Table 6: Forecasting Performance Based on GARCH DGPME MPE RMSE HMSE R2

TS GARCH -.0001 .0010 .0724 .2079 .9554GARCH -.0001 .0009 .0719 .2043 .9607

Clearly if the data are generated by a standard GARCH model, the threshold GARCH modelis able to provide accurate forecasts very close to the performance of GARCH model.

Figure 7 provides an example of the comparison of the estimated volatility by threshold GARCHmodel and GARCH model given the GARCH data generating process.

Figure 7: Forecasting Performance Based on GARCH DGP

0 50 100 150 200 250

0.16

0.18

0.2

0.22

0.24

0.26

0.28

volat

ility

Volatility Forecasting Comparison Based on GARCH DGP

True VolatilityTS GARCHGARCH

4 Empirical Study

In this section we apply the threshold model to empirical data and find good fits of thresholdmodel in terms of in-sample estimation as well as out-of-sample forecasting. We begin with a brief

30

description of the data set and follow with the estimation of the threshold model. Then we discussthe estimation and forecasting results.

4.1 The Data

The first data set consists of 20 stocks in the major market index (MMI)7. We obtain thedata of most stocks for the period from Jan. 2, 1970 to Dec. 31, 2008, except for AXP and T.The data for AXP and T start from May 18, 1977 and Jan 2, 1984 respectively. We choose thestocks from MMI because they are well known and highly capitalized stocks representing a broadrange of industries and they generally exhibit a high level of trading activity. Return data areobtained from daily stock file of the Center for Research in Security Prices (CRSP) and accessedfrom Wharton Research Data Services (WRDS).

The exogenous threshold variable we used in this empirical study is the Volatility Index (VIX).The Chicago Board Options Exchange (CBOE) Volatility Index is a key measure of market ex-pectations of near-term (30-day) volatility conveyed by S&P 500 stock index option prices. It isa weighted blend of prices for a range of options on the S&P 500 index. The volatility index iscalculated and disseminated in real-time by CBOE. We obtain the data from CBOE website fromJan. 2 1990 to Nov. 18 2005. Since the volatility index measures the market expectations forthe future volatility, it is reasonable to assume the independence between VIX and the currentvolatility.

The summary statistics for the returns in MMI are presented in Table B3 in Appendix B. Thecolumns report the sample minimum, maximum, mean, standard deviation, coefficient of skewness,and coefficient of kurtosis. We notice that all the return series have large kurtosis comparing to anormal distribution, and most of the returns are negatively skewed.

Since the simulation study of the endogenous threshold model suggests that the MLE estimatorsperform reasonably well for large endogeneity coefficient, we also apply our model to the volumedata. In addition, the use of the threshold model to describe the conditional variance dynamicsis motivated by the volume-volatility correlation, we want to examine whether the endogenousthreshold variable volume provides more information on the regime shifts in the conditional vari-

7The firms in the MMI are American Express (AXP), AT&T (T), Chevron (CHV), Coca-Cola (KO), Disney(DIS), Dow Chemical (DOW), Du Pont (DD), Eastman Kodak (EK), Exxon (XOM), General Electric (GE),General Motors (GM), International Business Machines (IBM), International Paper (IP), Johnson & Johnson (JNJ),McDonald’s (MCD), Merck (MRK), 3M (MMM), Philip Morris (MO), Procter and Gamble (PG), and Sears (S).

31

ance process. Nonetheless the volume variable reveals the trading activities for the individualstocks, while the VIX variable just gives the information for the market as a whole. The volumedata are also obtained from WRDS. To remove the trend in the volume series, we define theadjusted volume series by taking the log of the trading volume and then removing the 100-daymoving average from the log volume series. The resulting series have an average correlation coeffi-cient around 0.5. The statistics for the log volume and the adjusted log volume series are reportedin Table B4 in Appendix B.

To further explore the usefulness of the endogenous threshold, we also obtain the second dataset for 12 most active stocks in NASDAQ8, since the number of trades data is only available forNASDAQ stocks. The number of trades data is available for most of the stocks from Jan. 03,1995 to Dec. 31, 2010 except YHOO, EBAY, NVDA and GOOG, they have the number of tradesavailable from Apr. 15, 1996, Sep. 25, 1998, Jan. 25, 1999, and Aug. 20, 2004 respectively. Thereis always concern about the noise in trading activity variables, volume is one very noisy tradingvariable. We tried to adjust the volume series by taking log and removing time trend for MMIstocks. Here we use another volume variable for NASDAQ stocks. The volume variable is definedas the ratio of trading volume over the total shares outstanding (Volume/SHOUT) for the stock.This is actually the turnover of the daily stock and it is stationary. For the number of trades, ifthere is a clear time trend, we detrend the series by removing the best straight-line fit from theseries. The data descriptions for return, volume/SHOUT and number of trades are available inTable B5 and B6 in Appendix B.

4.2 Estimation of the Threshold GARCH model

In this section we first apply the threshold GARCH model to the above data set which contains20 stocks from MMI. We assume that the return series follows the threshold GARCH model wherethe trigger variable is either exogenous (VIX) or endogenous (volume). Then we estimate thethreshold model using NASDAQ stocks with volume/SHOUT and number of trades as thresholdvariables.

Since the return series in our threshold GARCH model is assumed to be a zero mean process,812 most active stocks in NASDAQ: Microsoft Corporation (MSFT), Cisco Systems, Inc. (CSCO), Intel Cor-

poration (INTC), Yahoo! Inc. (YHOO), Oracle Corporation (ORCL), Apple Inc. (AAPL), Dell Inc. (DELL),NVIDIA Corporation (NVDA), Applied Materials, Inc. (AMAT), Google Inc. (GOOG), QUALCOMM Incorpo-rated (QCOM), eBay Inc. (EBAY).

32

we first remove the mean from the returns. In addition to a constant we also filter the AR effectto order 5:

rt = Rt − µ−5∑j=1

δjRt−j

where Rt is the observed return, µ is the mean, and δj is the coefficient of AR variables.The threshold variable (VIX or volume) is given in this threshold GARCH model, so we also

need to determine the threshold value for this variable. Now the model that needs to be estimatedhas a set of parameters as a function of threshold values.

rt = σtεtσ2t = ω0 + α0r

2t−1 + β0σ

2t−1 if yt−1 ≤ y0

σ2t = ω1 + α1r

2t−1 + β1σ

2t−1 if yt−1 > y0

To estimate the threshold value we divide the sample of threshold variable into 40 intervalsand the 39 grid points correspond to 2.5 percentile point to 97.5 percentile point.

4.3 Estimation Results

4.3.1 The Estimation Results with Exogenous Threshold Variable

The MLE estimates using VIX as the threshold variable are presented in Table 7.

33

Table 7: The Estimates of Threshold GARCH model Using VIX as Threshold Variable

y0 α0 α1 β0 β1 y0 α0 α1 β0 β1

AXP y725 .04180067

.09970189

.94970081

.85850281

GM y675 .03520067

.09160160

.94540120

.90760155

T y65 .03660102

.12690311

.94480137

.79410379

IBM y925 .02850056

.19180495

.96940062

.74790579

CHV y60 .0330098

.08630151

.92760217

.890205

IP y575 .03050109

.12760247

.87130321

.85390235

KO y925 .02790048

.13060352

.97040051

.86130309

JNJ y85 .04690067

.16290755

.95130070

.77230674

DIS y825 .06340180

.10230278

.74020539

.85150307

MCD y60 .03350096

.04080095

.94000140

.95160097

DOW y675 .05230112

.06200120

.92600132

.93750115

MRK y675 .01060129

.07260234

.91970327

.89470338

DD y775 .0194004

.10440247

.97570054

.84160381

MMM y60 .03210164

.10480281

.75240630

.81911291

EK y625 .12900552

.29690964

.20631375

.46560897

MO y15 .01670418

.05140115

.20220199

.93960164

XOM y95 .028007

.09590152

.958801

.87780182

PG y80 .01670042

.14880515

.98090043

.85380385

GE y775 .02130048

.12070227

.97210059

.83970307

S y50 .10370223

.11190369

.78630432

.84710313

(The small numbers under the estimates are the estimated standard errors)

The estimation results in Table 7 are based on 4787 observations from Jan. 02, 1990 when VIXis available. The estimated threshold value and the probability that volatility is in regime 2 isgiven by y0, for example the estimated threshold is y925 for IBM, it means that the threshold valueis the 92.5 percentile point of VIX price, so the volatility is in a volatile regime with probabilityof 7.5%. The parameters in the threshold GARCH model are significant for most of the stocks.For some stocks the sum of estimated parameters is greater than 1 in one regime, but consider theprobability π, we will still have the stationary process. The probability is given by the location ofthe threshold value in the sample space of the threshold variable, it is very clear in our estimationresults since we use 2.5 percentile as the increment. For example the estimated parameters of PGin regime 2 have a sum of 1.0026, but the threshold value estimated is y80, that means there isonly 20% chance that the conditional variance shifts to the regime 2, therefore with a stationaryregime 1 the stationarity condition for return process still holds. We observe that for some stocksthe estimated parameters in 2 regimes are very similar, it is not surprising because we use VIX

34

as a threshold variable for all 20 stocks in our sample, some stocks may follow closely with themarket, while others may be less affected by the market conditions. Nonetheless when we use themarket condition as a threshold, it indeed separates the returns in low volatility regime from thatin the high volatility regime.

We plot the MMM return series in different regimes in Figure 8 and 9. Figure 8 contains thegraphs of return series and squared return series in 2 regimes, while Figure 9 contains the graphsof VIX price series with the threshold value estimated.

Figure 8: Regime Shifts in MMM Return Series

0 500 1000 1500 2000 2500 3000 3500 4000 4500 50000

0.002

0.004

0.006

0.008

0.01

0.012

0.014Regime Shifts in MMM Squared Returns

regime 2regime 1

0 500 1000 1500 2000 2500 3000 3500 4000 4500 5000−0.1

−0.05

0

0.05

0.1

0.15Regime Shifts in MMM Returns

regime 2regime 1

35

Figure 9: VIX price series with threshold value

0 500 1000 1500 2000 2500 3000 3500 4000 4500 50000

10

20

30

40

50

60

70

80

90VIX price and threshold value

VIX pricethreshold value

From Figure 8 we see that given the threshold value, the high volatility regime identifies the 3periods of clustering of extreme returns. The volatile periods are separated from the less volatileperiods, and the clustering of volatile periods confirm the presence of the disruptive events duringthe periods, such as the clustering of volatile periods at the starting point of the graph correspondsto the 1990-1991 Persian Gulf Crisis, the clustering of volatile periods at the middle of the graphcorresponds to the periods from 1997 Asian crisis to 2000 dot.com bubble, and the clustering atthe end of the graph corresponds to the 2008 subprime mortgage crisis. The use of the VIX as thethreshold variable enable us to find the periods in which the VIX market and the stock are volatilesince the markets tend to move together, meanwhile we will miss the stock specific information sothat some volatile events are ignored simply because of the involatile VIX at that point in time.Nevertheless those stock specific events tend to be non-persistent, we observe that some largenegative returns are not identified as in the high volatility regime simply because it is a rare event.Figure 9 shows the VIX price and regimes in the VIX return. The estimated threshold value forVIX price is 20.17, when the price of VIX is above the threshold value, people view the marketas unstable, therefore we observe large price movement in the VIX prices. The empirical resultsreveal the potential of the threshold variable to identify the regimes in the volatility process henceprovide better forecast.

36

4.3.2 The Estimation with Endogenous Threshold Variable

We also use the trading volume as a threshold variable to estimate the threshold GARCH modelfor MMI stocks, the results are presented in Table 8.

Table 8: The Estimates of Threshold GARCH Model with Volume as Threshold

y0 α0 α1 β0 β1 y0 α0 α1 β0 β1

AXP y30 .15360336

.0457006

.89240197

.95220063

GM y90 .06040108

.02690109

.92260156

1.04820187

T y625 .07290088

.012008

.93050079

1.00490136

IBM y95 .09210159

.00570067

.92190128

1.05150282

CHV y925 .05590088

.0750243

.92680116

.89460443

IP y275 .06350197

.05810094

.89160211

.92950106

KO y475 .14310425

.04570104

.81720411

.93740115

JNJ y30 .07860372

.0610108

.88070202

.9330104

DIS y525 .14180388

.04010093

.82420387

.94560111

MCD y85 .05780107

.02430072

.94350095

.98110138

DOW y825 .08760122

.02990088

.91470107

.97810122

MRK y175 −.1360606

.0369007

1.03380195

.93730129

DD y70 .0670124

.02210056

.93320107

.985008

MMM y95 .04150088

.00030043

.95860097

.98270223

EK y95 .21360552

.02890104

.74860618

.70550704

MO y475 .11280433

.05780175

.70760782

.89890249

XOM y475 .0360124

.06710092

.94370109

.91960107

PG y95 .04350081

.02730092

.95880075

.95380252

GE y10 .10210441

.04720064

.91170229

.95180066

S y925 .15590299

.08820444

.83820249

.8510413


Figure 10 shows the plot of MMM return and squared return in different regimes and theestimated threshold value and volume are presented in Figure 11.

37

Figure 10: Regime Shifts in MMM Return Series

0 500 1000 1500 2000 2500 3000 3500 4000 4500 50000

0.002

0.004

0.006

0.008

0.01

0.012

0.014

square

d retu

rn

Regime Shifts in squared return

regime 2regime 1

0 500 1000 1500 2000 2500 3000 3500 4000 4500 5000−0.1

−0.05

0

0.05

0.1

0.15

return

regime shifts and return series

regime 2regime 1

38

Figure 11: MMM Volume with Threshold Value

0 500 1000 1500 2000 2500 3000 3500 4000 4500 5000−2

−1.5

−1

−0.5

0

0.5

1

1.5

2

2.5

volum

e

volume and the threshold value

volumethreshold value

When using volume as the threshold variable, we capture almost all the large positive or nega-tive returns in the volatile periods, however we fail to identify the clustering of the volatile periods.In comparison to the previous graph using VIX as the threshold, the only volatile period confirmedby volume is the 1990-1991 Gulf Crisis. The reason that volume does not provide better identi-fication of regime switching may due to the transformation we made on the volume variable. Itmay be the case that when we transform the volume series to a stationary process the importantinformation is also removed. There is also a problem with the stationarity requirement for themodel, the estimated parameter values for some stocks violate the stationarity conditions.

The estimation results question the use of volume as an effective threshold variable. We nowconsider the other 2 candidates for the endogenous threshold variables: the number of trades andthe volume turnover. Table 9 presents the estimation results for 12 NASDAQ stocks using the twoendogenous threshold variables.

39

Table 9: The Estimates of Threshold GARCH Model for NASDAQ Stocks

Volume/SHOUT Number of TradesStocks y0 α0 α1 β0 β1 Stocks y0 α0 α1 β0 β1

AAPL y95 .08270250

.00510056

.93190191

.88240631

AAPL y60 .1437058

.04410113

.79810502

.9440132

AMAT y90 .06270098

.00670074

.94320079

.9846016

AMAT y30 .03800151

.04320073

.91690282

.95420077

CSCO y80 .11600201

.00930064

.90520161

.98440136

CSCO y825 .11350217

.02580086

.86880201

.98620124

DELL y55 .13140291

.03430081

.89490210

.94530127

DELL y85 .08590149

.00670085

.9199013

.97880136

EBAY y625 .01730056

.11410315

.98500213

.90780070

EBAY y85 .01000133

.10710223

.87280588

.90620178

GOOG y525 .02840106

.15500717

.93631174

.88320157

GOOG y80 16680444

.086607

.60730702

.05012497

INTC y80 .08190155

.02150078

.92450451

.97430089

INTC y65 .0850197

.0370088

.90870179

.95590109

MSFT y85 .11970212

.00930063

.87430159

1.0220222

MSFT y475 .08140180

.08450191

.87260331

.90940176

NVDA y90 .08710246

.00020054

.93240173

.93320673

NVDA y725 .11420377

.00720072

.92540208

.95710162

ORCL y875 .09120135

−.00260067

.92620099

1.05910482

ORCL y675 .06810156

.02770058

.94170116

.96080102

QCOM y70 .1261024

.01630103

.88580205

.96990167

QCOM y35 .15150508

.05580109

.75080562

.93360132

YHOO y725 .01080093

.27050521

.91230330

.80140319

YHOO y475 .14880340

.12380564

.83000358

.81760539


The estimated parameters using volume turnovers behave much better than using adjustedvolume as the threshold. Figure 12 and 13 show how regimes in DELL returns are separated byusing volume turnover and number of trades as threshold variable.

40

Figure 12: Estimated Regimes for DELL using Vol/SHOUT as Threshold

0 500 1000 1500 2000 2500 3000 3500 4000 4500−0.2

−0.15

−0.1

−0.05

0

0.05

0.1

0.15

0.2

0.25Regime Shifts in DELL Return (Volume/SHOUT)

Regime 1Regime 2

0 500 1000 1500 2000 2500 3000 3500 4000 45000

0.02

0.04

0.06

0.08

0.1

0.12

0.14

0.16

0.18

0.2Threshold in DELL Volume/SHOUT

Volume/SHOUTThreshold Value

41

Figure 13: Estimated Regimes for DELL using Number of Trades as Threshold

0 500 1000 1500 2000 2500 3000 3500 4000 4500−0.2

−0.15

−0.1

−0.05

0

0.05

0.1

0.15

0.2

0.25Regime Shifts in DELL Returns (Number of Trades)

Regime 1Regime 2

0 500 1000 1500 2000 2500 3000 3500 4000 4500−1

−0.5

0

0.5

1

1.5

2

2.5x 10

5 Threshold in DELL Number of Trade

Number of TradesThreshold Value

It is obvious that for DELL the volume turnover is more capable of identifying the volatileperiods in the sample. The ratio of volume over shares outstanding is quite low after 2000 dot.combubble, therefore labels the period between 2002 and 2008 as the tranquil period. It is consistentwith the smaller positive and negative outliers in return series during the period. The number oftrades during the same period is also lower than the estimated threshold value which indicate thatthe period is less volatile, however the time around 2000 dot.com bubble is also left out in regime1 that is inconsistent with data. The volume turnover or number of trades convey informationthat can be used to identify the volatility regimes. But the performance of threshold models

42

depends on how relevant and useful the threshold variable is, we see that VIX as an indicatorof market conditions certainly help us to separate the volatile regime from a less volatile regime.We also observe the similar effect when using variables that reveal trading activities of individualstocks. We may find more variables or the combination of variables that provide even more preciseinformation on regime shifting.

4.4 Forecasting Performance of Threshold Model

We have shown the good in-sample fit of the threshold model. In this section we test the out-of-sample predicting power of the threshold model. To construct the volatility forecasts, we estimatethe threshold parameters using in-sample data for MMI stocks and NASDAQ stocks. Since thedetrended log volume does not perform well in estimation, we only consider using VIX as an exoge-nous trigger, while volume turnover and number of trades as endogenous trigger. The forecastinghorizon is still one day ahead forecast for 250 days.

4.4.1 Forecasting Results for MMI Stocks

As described above, we estimate the threshold GARCH model for all 20 stocks in the in-sampleperiod then use the estimated parameters to compute the forecasts for 250 days. We use dailysquared return to proxy the actual variance and compute 5 measures to compare volatility fore-casting power. Table 10 shows the results of 5 measures to evaluate forecasting performance forall 20 stocks.

43

Table 10: Forecasting Results for 20 MMI StocksME1 MPE RMSE HMSE R2

AXP TS GARCH .0569 .3970* .0090 6.0986* .0722*GARCH .0404* .4136 .0064* 6.7201 .0413

T TS GARCH -.004* -.3086* .0006* 3.3388 .0111*GARCH -.0056 -.3472 .0009 2.5066* .0051

CHV TS GARCH .0361* .3499* .0057* 6.2621* .1499*GARCH .0449 .7507 .0071 21.1730 .0890

KO TS GARCH .0198 .3719* .0031 6.4860* .1346*GARCH .0167* .5707 .0026* 17.6212 .0591

DIS TS GARCH .0250* .2001* .0039* 5.8526* .0708GARCH .0349 .5260 .0055 12.2574 .1220*

DOW TS GARCH .0287* .2673* .0046* 10.2805 .0781GARCH .0322 .3842 .0051 8.7226* .0841*

DD TS GARCH .0196* .1871* .0031* 4.1163* .1390*GARCH .0413 .7065 .0065 12.1604 .1179

EK TS GARCH .0715* 1.0969* .0113* 18.287* .0072*GARCH .0768 1.4259 .0121 32.8882 .0059

XOM TS GARCH .0217* .2818* .0034* 5.8539* .1906*GARCH .0335 .6573 .0053 17.8408 .1026

GE TS GARCH .035* .3401* .0055* 11.405* .1509*GARCH .0443 .6309 .0070 15.5842 .1280

GM TS GARCH .27 1.0918 .0431 17.3807* .0551*GARCH .24* 1.0234* .0376* 19.4617 .0509

IBM TS GARCH .0592* .0578* .0001* 2.0658* .1735*GARCH .0916 .1689 .0014 3.2271 .1392

IP TS GARCH .0235* .2692* .0037* 8.0684* .1122*GARCH .0562 .6689 .0089 23.5101 .0961

JNJ TS GARCH .0023* .0034* .0004* 3.4455* .1165*GARCH .0075 .4593 .0012 20.2049 .0705

MCD TS GARCH .0066* .1593* .0011* 5.5311* .1082*GARCH .0104 .3084 .0016 7.1577 .0655

MRK TS GARCH .0263* .5576* .0042* 27.5117* .0278GARCH .0488 1.0612 .0077 37.7295 .0340*

MMM TS GARCH .0109* .088* .0017* 2.6553* .1735*GARCH .0261 .8644 .0041 16.0783 .0916

MO TS GARCH .005* .1373* .0001* 9.0653* .0361*GARCH .0152 .5348 .0024 24.8221 .0137

PG TS GARCH -.0014* .108* .0002* 3.5269* .01466*GARCH .0123 .4415 .0019 8.1946 .1006

S TS GARCH .0202 .5497 .0032 54.4468 .0068*GARCH .0095* .3065* .0015* 27.8286* .0016

1: The value of ME measure should be multiplied by 0.01.*: The model performed better based on a forecasting measure.

44

For 18 out of 20 stocks, the 2008 financial crisis is included in the out-of-sample forecastingperiod. We found in Table 10 that the threshold GARCH model outperforms GARCH model inall 5 measures for 12 out of 18 stocks and provides a better forecast for the volatility surge in theperiod of financial crisis. For the remaining 8 stocks the evidence is mixed, 3 stocks with only onemeasure favoring GARCH model, and only 1 stock with four measures supporting GARCH model.Figure 14 and 15 provide illustrations of the forecasting comparison between threshold GARCHand GARCH models. It is clear in Figure 14 that the forecasts from threshold GARCH modelpick up more variation in squared returns than GARCH model. Also there is a quicker responseto volatility spike by the threshold model forecasts. Even for the stocks with more forecastingmeasures against threshold model, for example in Figure 15, we have forecasts plot for GM, thereare still much more changes in squared returns explained by threshold volatility forecasts thanGARCH forecasts. It suggests that the measures are not always accurate in providing referenceon model comparison. We conclude that the better performance by threshold model is benefitedfrom a large ARCH effect in the volatile periods. When regime changes from low volatility tohigh volatility in the out-of-sample period, the volatility forecast is computed using a large ARCHcoefficient in the volatile period, therefore the effect of the increase in the previous day squaredreturn is magnified in the current day volatility forecast.

We also need to mention that in our experiment only one threshold variable is used to identifythe regime shifting for all 20 stocks, it is possible that some stocks follow the market more closelythan the others, in that case we will get much better forecasts by using threshold model. On theother hand there exist stocks which are more likely to be affected by factors other than marketrisk, in that situation using VIX as a threshold may not help much in volatility forecasting. Butit is not evidence against the use of threshold model, we can always search for the variables thatprovide more information related to the trading activities of individual stocks.

45

Figure 14: Volatility Forecasts Comparison (MMM)

Jan 2008 May 2008 Aug 2008 Dec 20080

0.001

0.002

0.003

0.004

0.005

0.006

0.007

0.008

0.009

0.01

volat

ility F

orec

asts

Comparing Volatility Forecasts (MMM)

Threshold Model ForecastsGarch Model ForecastsSquared Returns

Figure 15: Volatility Forecasts Comparison (GM)

Jan 2008 May 2008 Aug 2008 Dec 20080

0.02

0.04

0.06

0.08

0.1

0.12

0.14

vola

tility

For

ecas

ts

Comparing Volatility Forecasts (GM)

Threshold Model ForecastsGarch Model ForecastsSquared Returns

46

4.4.2 Using Realized Volatility to Approximate Actual Volatility

The empirical results show a better fit of threshold volatility forecasts with data variation. Howeverthere is still a large proportion of variation in data not explained by theoretical models (See Figure14 and 15 and low R2 in Table 10), with the wider availability of high frequency data, researcherspropose that the poor performance of volatility forecasting models may be caused by the use ofa proxy of actual volatility. In the empirical application we compare the volatility forecast withsquared daily return which is a proxy of actual daily volatility. Though the squared return is aunbiased estimator of true volatility, it is a very noisy measure due to the nature of its distribution.Shown by Lopez (2001), almost 75% of the time the squared de-meaned returns are either 50%larger or smaller than true variance:

Let rt = µ+ εt , εt = σtzt ,given zt ∼ N(0, 1). ThenE(ε2

t |φt−1) = σ2tE(z2

t |φt−1) = σ2t

since z2t ∼ χ2

(1), we actually can show that:Prob(ε2

t ∈ [12σ2t ,

32σ2t ]) = Prob(z2

t ∈ [12, 3

2]) = 0.2588

Based on the above discussion we would like to ask if the poor performance of the volatilityforecasts is actually a result of using bad proxy of true volatility. We use intra-day high frequencydata of IBM to construct the daily realized volatility as a measure of true volatility. The datacontains 938 daily observations from March 03, 2003 to September 24, 2008.

The realized volatility is defined as the sum of squared intraday returns. Theoretically, themeasure provides a consistent estimator of the latent volatility and becomes more efficient if themost frequent interval is used to compute the measure. Empirically, due to the inefficiency of thefinancial market, the noisy trading activities may bias the realized volatility and the problem wors-ens when higher and higher frequencies are used. Here we follow Andersen and Bollerslev (1998)and employ the empirically optimal sampling frequency at 5-minute interval. The construction ofrealized volatility is explained as follows, given the closing price Ph,t at any 5-minute interval atday t, the 5-minute return rh,t is defined as:

rh,t = 100(logPh,t − logPh−1,t) h = 1, 2..., H; t = 1, 2, ..., T

then, the realized volatility is simply the sum of squared 5-minute returns in a day:

RVt =H∑h=1

r2h,t

47

To reduce the effect of noise in the high frequency trading, we further apply the scaling correc-tion to the above realized volatility measure as proposed by Martens (2002):

RVt =H

w∑h=1

r2h,t

where w = 1 + v1v2, and

v1 =10000

T

T∑t=1

(logPH,t − logP0,t)2 v2 =

10000

T

T∑t=1

(logP0,t − logPH,t−1)2

Table B7 in Appendix B provides the sample statistics of the data and the constructed realizedvolatility. It is very obvious that the realized volatility has smaller variance than that of squaredreturns.

Since the sample size is much smaller than that in the earlier empirical study, we only forecast100 daily volatilities using threshold GARCH and GARCH models. The estimating and forecast-ing process is same as in the previous section. Table 11 and Figure 16 present the forecastingperformance of both models.

Table 11: Forecasting Performance of IBM using Realized VolatilityME MPE RMSE HMSE R2

TS GARCH .2534 .1635 2.5337 .3113 .5532GARCH .4954 .2273 4.954 .615 .4095

48

Figure 16: Forecasting Performance of IBM using RV

0 10 20 30 40 50 60 70 80 90 1000

2

4

6

8

10

12

14

16

18

20

volat

ility F

orec

asts

Comparing Volatility Forecasts with Realized Volatilities(IBM)

Threshold Model ForecastsGarch Model ForecastsRealized Volatility

The use of realized volatility to approximate the latent volatility definitely improves the fore-casting performance of both threshold GARCH and simple GARCH models. All 5 measures inTable 5 support the threshold GARCH model and in Figure 5 we are glad to see a much bet-ter fit to the data variation when realized volatility is used. We also confirm that the thresholdGARCH model produces better forecasts when there is a sudden change in market conditions. Ourfindings are similar to that of realized volatility literature in forecasting, the R2 computed fromthreshold GARCH model is 0.5532, which is more than 2 times of that obtained by using dailysquared returns, 0.2026. The improvement of R2 is more significant for GARCH model, with amore than 3 times increase from 0.1268 to 0.4095. The results are consistent with the findings inBlair, Poon, and Taylor (2001), they report a three to four times increase of R2 when intra-day5-minute squared returns are used to approximate the true volatility.

4.4.3 Forecasting Results for NASDAQ Stocks

The exogenous trigger VIX indeed help us improving the performance of volatility forecasts inthreshold model. How about the endogenous triggers, can they also provide better volatility

49

forecasts? Table 12 and 13 show the results of 5 measures to evaluate forecasting performance for12 NASDAQ stocks using volume/SHOUT and number of trades as thresholds.

Table 12: Forecasting Results for 12 NASDAQ Stocks (Volume/SHOUT)ME1 MPE RMSE HMSE R2

AAPL TS GARCH .0194* -.3957* .0031* 1.873 .0327*GARCH .0219 -.4421 .0035 1.4052* .0266

AMAT TS GARCH -.0058* -.1264* .0009* 2.1467 .0201*GARCH -.0071 -.1668 .0011 2.0354* .0123

CSCO TS GARCH .0007* .3494 .0001* 69.7813 .0014*GARCH .0033 .2620* .0005 53.3814* .0013

DELL TS GARCH .0105 -.1631* .0017 2.7195 .0026GARCH .0089* -.1852 .0014* 2.2339* .0126*

EBAY TS GARCH .0064* -.0542* .0010* 3.2268 .0016*GARCH .0119 -.2267 .0019 2.1481* .0002

GOOG TS GARCH .0005* .2711 .0001* 27.9418 .0024GARCH .0032 .1749* .0005 16.6722* .0028*

INTC TS GARCH .0007* -.0364* .0001* 2.6210 .0359*GARCH .0077 -.2104 .0012 2.0591* .0078

MSFT TS GARCH .0053 -.1802 .0008 2.5389* .0263*GARCH .0045* -.1578* .0007* 2.6099 .0142

NVDA TS GARCH .0281* -.2566* .0045* 1.9126* .0052*GARCH .0304 -.2640 .0048 2.4916 .0000

ORCL TS GARCH .0107 -.2503 .0017 2.8931* .0067*GARCH .0062* -.1596* .0010* 3.4079 .0000

QCOM TS GARCH .002* .1684* .0003* 33.6765* .0005GARCH .0056 .2374 .0009 54.6724 .0024*

YHOO TS GARCH .022* -.2622* .0035* 2.6396 .0085*GARCH .0266 -.4202 .0042 1.551* .0029


50

Table 13: Forecasting Results for 12 NASDAQ Stocks (Number of Trades)ME1 MPE RMSE HMSE R2

AAPL TS GARCH .011* -.2938* .0017* 2.1358 .0329*GARCH .0219 -.4421 .0035 1.4052* .0266

AMAT TS GARCH -.0049* -.1373* .0008* 1.9829* .0369*GARCH -.0071 -.1668 .0011 2.0354 .0123

CSCO TS GARCH .0025* .4324 .0004* 84.1723 .0017*GARCH .0033 .2620* .0005 53.3814* .0013

DELL TS GARCH .0054* -.0732* .0008* 3.1744 .0016GARCH .0089 -.1852 .0014 2.2339* .0126*

EBAY TS GARCH .0133 -.2197* .0021 2.2653 .001*GARCH .0119* -.2267 .0019* 2.1481* .0002

GOOG TS GARCH .0009* .0454* .0001* 10.362* .0029*GARCH .0032 .1749 .0005 16.6722 .0028

INTC TS GARCH .0034* -.1390* .0001* 2.0444* .0417*GARCH .0077 -.2104 .0012 2.0591 .0078

MSFT TS GARCH .0012* -.0197* .0002* 4.0226 .0299*GARCH .0045 -.1578 .0007 2.6099* .0142

NVDA TS GARCH .0336 -.3128 .0053 1.4576* .0254*GARCH .0304* -.2640* .0048* 2.4916 .0000

ORCL TS GARCH .0062 -.1968 .0010 3.1696* .0017*GARCH .0062 -.1596* .0010 3.4079 .0000

QCOM TS GARCH .0055* .1559* .0009* 35.8772* .0005GARCH .0056 .2374 .0009 54.6724 .0024*

YHOO TS GARCH .022* -.3702* .0035* 1.7794 .0025GARCH .0266 -.4202 .0042 1.551* .0029*


The forecasting results are always better illustrated by graphs. Figure 17 provides 2 stocksMSFT and DELL using volume turnover as threshold with more measures support GARCH fore-casts, however it is clear that the smaller forecasting errors are mainly due to the nature ofsmoothed GARCH forecasts. We cannot conclude that the GARCH model performs better inforecasting volatility using these measures, because the threshold forecasts have a shape closerto stock’s return variation. Figure 18 gives the volatility forecasts for NVDA and YHOO usingnumber of trades as threshold. Again, it suggests that we cannot simply use forecasting errors

51

to compare volatility models. Here using squared return as true volatility measure also generatesproblem in forecasting comparison since the squared returns are very noisy in the out-of-sampleperiod. Though some measures favor GARCH model instead of threshold model, the thresholdforecasts describe the return variation better than GARCH forecasts.

Figure 17: Forecasting Performance of MSFT and DELL using Vol/SHOUT as Threshold

Jan 2010 May 2010 Aug 2010 Dec 20100

0.5

1

1.5

2

2.5

3x 10

−3 MSFT Volatility Forecasts

Squared ReturnsThreshold ForecastsGARCH Forecasts

0 50 100 150 200 2500

0.5

1

1.5

2

2.5

3

3.5

4

4.5

5x 10

−3 Comparing Volatility Forecasts (DELL)


52

Figure 18: Forecasting Performance of NVDA using Number of Trades as Threshold

Jan 2010 May 2010 Aug 2010 Dec 20100

0.002

0.004

0.006

0.008

0.01

0.012

0.014NVDA Volatility Forecasts


Jan 2010 May 2010 Aug 2010 Dec 20100

1

2

3

4

5

6

7

8x 10

−3 YHOO Volatility Forecasts


53

5 Conclusion

In this paper we define a threshold GARCH model with exogenous or endogenous triggers todescribe the conditional variance dynamics. This model can capture both the regime-switchingfeature and the effect of an information variable such as VIX on the regime shifts of the conditionalvariance process. The model is flexible to accommodate other threshold variables (exogenous orendogenous) that can trigger the regime changes. We derive the theoretical conditions for theexistence of various moments in the return process assuming the threshold variable is exogenous.Although the theoretical asymptotic distribution of the estimator is hard to establish, we showby simulation that the MLE estimator is approximately consistent and efficient. The simulationresults also suggest that the performance of MLE estimator improves as the endogeneity betweenvolatility and threshold variable increases. The model is applied to 20 stocks from MMI and12 stocks from NASDAQ, the results support the use of the threshold variable to identify theregime shifts in the volatility process, especially with VIX as the threshold variable. Using VIX asthe threshold variable, we are able to identify the clustering of volatile periods corresponding tovarious financial crises, we are also able to generate more accurate volatility forecasts that pick upthe volatility spike much faster. The other threshold variables revealing individual stock tradingactivities may provide information other than market turmoil.

54

Appendix A

Proof of Proposition 1

We rewrite the conditional variance equation as:

σ2t =ω0(1− St−1) + ω1St−1 + (α0(1− St−1) + α1St−1)r2

t−1 + (β0(1− St−1) + β1St−1)σ2t−1

= ω0 + (ω1 − ω0)St−1 + [(α0 + (α1 − α0)St−1)ε2t−1 + β0 + (β1 − β0)St−1]σ2

t−1 (1)

Let

c0 = ω0(1− π) + ω1π c1 = ω1 − ω0

a0 = α0(1− π) + α1π a1 = α1 − α0

b0 = β0(1− π) + β1π b1 = β1 − β0

Bt = St − π

Then (1) can be rewritten as:

σ2t = c0 + c1Bt−1 + [ε2

t−1(a0 + a1Bt−1) + b0 + b1Bt−1]σ2t−1 (2)

The transformation makes the random variables in the coefficient have mean zero. Back sub-stitution in (2) results in:

σ2t = c0 + c1Bt−1 +

k−1∑n=1

(c0 + c1Bt−1−n)n∏

m=1

[ε2t−m(a0 + a1Bt−m) + b0 + b1Bt−m]

+k∏

m=1

[ε2t−m(a0 + a1Bt−m) + b0 + b1Bt−m]σ2

t−k

Let we define Sn(t) as:

Sn(t) =n∏

m=1

[ε2t−m(a0 + a1Bt−m) + b0 + b1Bt−m]

55

Then we have1

nln(Sn(t))

a.s.−→ E(ln[ε2t−m(a0 + a1Bt−m) + b0 + b1Bt−m])

It’s very clear that if E(ln[ε2t−m(a0 + a1Bt−m) + b0 + b1Bt−m]) < 0 , by the independence as-

sumption between ε and y that is if (1 − π)E(ln(ε2t−mα0 + β0)) + πE(ln(ε2

t−mα1 + β1)) < 0, thenthe terms Sn(t) are geometrically bounded as n increases and equation (2) has the solution:

σ2t = c0 + c1Bt−1 +

∞∑n=1

(c0 + c1Bt−1−n)Sn(t) (3)


Taking expectation on both side of equation (3) we have:

E(σ2t ) = E(c0 + c1Bt−1) +

∞∑n=1

E(c0 + c1Bt−1−n)E(Sn(t)) (4)

Given that εt and St are independent, we have:

E(Sn(t)) = E[n∏

m=1

(ε2t−m(a0 + a1Bt−m) + b0 + b1Bt−m)]

=n∏

m=1

[E(ε2t−m(a0 + a1Bt−m)) + E(b0 + b1Bt−m)]

= (a0 + b0)n

Provided that a0 + b0 < 1, that is (α0 + β0)(1− π) + (α1 + β1)π < 1, the equation (4) becomes:

56

E(σ2t ) = E(c0 + c1Bt−1) +

∞∑n=1

E(c0 + c1Bt−1−n)E(Sn(t))

= c0 + c0

∞∑n=1

(a0 + b0)n

= c0

∞∑n=0

(a0 + b0)n

=c0

1− (a0 + b0)

=ω0(1− π) + ω1π

1− [(α0 + β0)(1− π) + (α1 + β1)π]


To examine the stationarity conditions for higher order moments of return, we check the secondmoment of σ2

t :

E(σ4t ) = E[(c0 + c1Bt−1) +

∞∑n=1

(c0 + c1Bt−1−n)Sn(t)]2

= E(c0 + c1Bt−1)2 + E[2(c0 + c1Bt−1)∞∑n=1

(c0 + c1Bt−1−n)Sn(t)] + E[∞∑n=1

(c0 + c1Bt−1−n)Sn(t)]2

= c20 + c2

1π(1− π) + 2c20

∞∑n=1

E(Sn(t)) + 2c0c1

∞∑n=1

E(Bt−1−nSn(t)) + 2c0c1

∞∑n=1

E(Bt−1Sn(t))

+2c21E[Bt−1

∞∑n=1

Bt−1−nSn(t)] + E[∞∑n=1

(c0 + c1Bt−1−n)Sn(t)]2

Note that

57

E(Bt−1Sn(t)) = E{Bt−1[ε2t−1(a0 + a1Bt−1) + b0 + b1Bt−1]Sn(t− 1)}

= (a1 + b1)π(1− π)(a0 + b0)n−1

E[∞∑n=1

(c0 + c1Bt−1−n)Sn(t)]2 = E[c0

∞∑n=1

Sn(t) + c1

∞∑n=1

Bt−1−nSn(t)]2

= E[c20(∞∑n=1

Sn(t))2] + E[2c0c1

∞∑n=1

Sn(t)∞∑n=1

Bt−1−nSn(t)]

+E[c1

∞∑n=1

Bt−1−nSn(t)]2

E[c20(∞∑n=1

Sn(t))2] = c20[∞∑n=1

E(S2n(t)) + 2

∞∑n=1

∞∑l=n+1

E(Sn(t)Sl(t)]

E[2c0c1

∞∑n=1

Sn(t)∞∑n=1

Bt−1−nSn(t)] = 0

E[c1

∞∑n=1

Bt−1−nSn(t)]2 = c21[π(1− π)

∞∑n=1

E(S2n(t))

+2∞∑n=1

∞∑l=n+1

E(Bt−1−nBt−1−lSn(t)Sl(t)]

where

58

E(S2n(t)) = E

n∏m=1

(ε2t−m(a0 + a1Bt−m) + b0 + b1Bt−m)2

= E

n∏m=1

[ε4t−m(a0 + a1Bt−m)2 + 2ε2

t−m(a0 + a1Bt−m)(b0 + b1Bt−m)

+(b0 + b1Bt−m)2]

= [3a20 + a2

1π(1− π) + 2a0b0 + 2a1b1π(1− π) + b20 + b2

1π(1− π)]n

= [2a20 + (a0 + b0)2 + (2a2

1 + (a1 + b1)2)π(1− π)]n

Let A = 2a20 + (a0 + b0)2 + (2a2

1 + (a1 + b1)2)π(1− π) , then

E(Sn(t)Sl(t) = E[n∏

m=1

(ε2t−m(a0 + a1Bt−m) + b0 + b1Bt−m)2

∞∏m=n+1

(ε2t−m(a0 + a1Bt−m) + b0 + b1Bt−m)]

= An(a0 + b0)l−n

Therefore:

E[c20(∞∑n=1

Sn(t))2] = c20(∞∑n=1

An + 2∞∑n=1

∞∑l=n+1

An(a0 + b0)l−n)

E[c1

∞∑n=1

Bt−1−nSn(t)]2 = c21π(1− π)

∞∑n=1

An

Provided that

A = 2a20 + (a0 + b0)2 + (2a2

1 + (a1 + b1)2)π(1− π)

= (2α20 + (α0 + β0)2)(1− π) + (2α2

1 + (α1 + β1)2)π < 1

59

The above equations can be simplified as:

E[c20(∞∑n=1

Sn(t))2] = c20(∞∑n=1

An(1 + 2∞∑j=1

(a0 + b0)j))

= c20

A(1 + (a0 + b0))

(1− A)(1− (a0 + b0))

E[c1

∞∑n=1

Bt−1−nSn(t)]2 = c21π(1− π)

∞∑n=1

An

= c21π(1− π)

A

(1− A)

Substitute back to the expression of E(σ4t ), we have:

E(σ4t ) = c2

0 + c21π(1− π) + 2c2

0

∞∑n=1

(a0 + b0)n + 2c0c1π(1− π)(a1 + b1)∞∑n=1

(a0 + b0)n−1

+c20

A(1 + a0 + b0)

(1− A)(1− (a0 + b0))+ c2

1π(1− π)A

(1− A)

= c20 + c2

1π(1− π) +2c2

0(a0 + b0)

1− (a0 + b0)+ 2c0c1π(1− π)

a1 + b1

1− (a0 + b0)

+c20

A(1 + a0 + b0)

(1− A)(1− (a0 + b0))+ c2

1π(1− π)A

(1− A)

= c20

1 + a0 + b0

(1− A)(1− (a0 + b0))+ c1π(1− π)

c1(1− (a0 + b0)) + 2c0(a1 + b1)(1− A)

(1− A)(1− (a0 + b0)


We have σ2 = E(r2t ) =

c0

1− (a0 + b0), subtract mean from equation (2), we get:

60

σ2t − σ2 = c1Bt−1 + [ε2

t−1(a0 + a1Bt−1) + b0 + b1Bt−1]σ2t−1 − σ2(a0 + b0)

= c1Bt−1 + (a0 + a1Bt−1)r2t−1 + (b0 + b1Bt−1)σ2

t−1 − σ2(a0 + b0)

r2t − σ2 = c1Bt−1 + (a0 + a1Bt−1)r2

t−1 + (b0 + b1Bt−1)σ2t−1 − σ2(a0 + b0)− ht + r2

t

= [a0 + b0 + (a1 + b1)Bt−1](r2t−1 − σ2) + (σ2(a1 + b1) + c1)Bt−1

−(b0 + b1Bt−1)(r2t−1 − σ2

t−1) + r2t − σ2

t

= [a0 + b0 + (a1 + b1)Bt−1](r2t−1 − σ2) + (σ2(a1 + b1) + c1)Bt−1

σ2t (ε

2t − 1)− (b0 + b1Bt−1)σ2

t−1(ε2t−1 − 1)

The above expression of r2t − σ2 is very similar to the equation (3.6) in Ding and Granger

(1996). Interestingly, the above expression represents r2t −σ2 as an random coefficient ARMA(1,1)

process if we write the compound error as an MA(1) process; letting σ2t (ε

2t − 1) = vt:

σ2t (ε

2t − 1)− (b0 + b1Bt−1)σ2

t−1(ε2t−1 − 1) = vt − (b0 + b1Bt−1)vt−1

with E(vt) = 0 and E(vtvs) = 0 for all t 6= s . Therefore, without any formal proof, theinformation already gives us some idea of the behaviors of the auto covariances.

Multiply both sides by (r2t−1 − σ2):

(r2t − σ2)(r2

t−1 − σ2) = [a0 + b0 + (a1 + b1)Bt−1](r2t−1 − σ2)2 + (σ2(a1 + b1) + c1)Bt−1(r2

t−1 − σ2)

−(b0 + b1Bt−1)(r2t−1 − σ2

t−1)(r2t−1 − σ2) + (r2

t − σ2t )(r

2t−1 − σ2)

Taking expectation on both sides

E(r2t − σ2)(r2

t−1 − σ2) = (a0 + b0)E(r2t−1 − σ2)2 − b0E(r2

t−1 − σ2t−1)(r2

t−1 − σ2)

+E(r2t − σ2

t )(r2t−1 − σ2)

γ(1) = (a0 + b0)γ(0)− 2b0E(σ4t−1)

61

where γ(1) = E(r2t − σ2)(r2

t−1 − σ2) is the covariance between r2t and r2

t−1, andγ(0) = E(r2

t−1 − σ2)2 is the variance of r2t−1. If we assume that the second moment of conditional

variance or the fourth moment of residual exists, that is when the conditions in proposition 3satisfied, then it can be shown that:

ρ(1) =c2

0[2a0 − 2a0b0A0 + A0(2a21 + (a1 + b1)2)π(1− π)]

c20(2 + A− 3A2

0) + 3c1π(1− π)[c1(1− A0) + 2c0(a1 + b1)(1− A)]

+(3a0 + b0(1 + 2A0))π(1− π)[c2

1 + 2c0c1(a1 + b1)(1− A)]

c20(2 + A− 3A2

0) + 3c1π(1− π)[c1(1− A0) + 2c0(a1 + b1)(1− A)]

where A0 = a0 + b0

It is easy to show that:

γ(k) = (a0 + b0)γ(k − 1) for k ≥ 2, and ρ(k) = (a0 + b0)k−1ρ(1)

62

Appendix B

Table B1: The MLE estimates for parameters in Case 2Par True

Value(2.1)



Estimate MSE

ω0 0.02 .0229 .0005 0.02 .0226 .0003 0.02 .0201 .0002ω1 0.01 .0090 .0010 0.01 .0096 .0005 0.01 .0107 .0004

T=500 α0 0.25 .2342 .0192 0.25 .2383 .0081 0.25 .2243 .0111α1 0.25 .2415 .0313 0.25 .2370 .0123 0.25 .2265 .0183β0 0.5 .4768 .0649 0.5 .4880 .0159 0.5 .4546 .0291β1 0.75 .7973 .1577 0.9 .9314 .0439 1 .9020 .1091ω0 0.02 .0212 .0002 0.02 .0209 .0001 0.02 .0194 .0001ω1 0.01 .0093 .0003 0.01 .0103 .0001 0.01 .0103 .0001

T=1000 α0 0.25 .2482 .0092 0.25 .2463 .0024 0.25 .2307 .0074α1 0.25 .2477 .0096 0.25 .2483 .0044 0.25 .2364 .0096β0 0.5 .4811 .0375 0.5 .4949 .0042 0.5 .4611 .0235β1 0.75 .7740 .0509 0.9 .9063 .0085 1 .8962 .0914ω0 0.02 .0207 .0001 0.02 .0204 .0000 0.02 .0203 .0000ω1 0.01 .0096 .0001 0.01 .0101 .0000 0.01 .0107 .0001

T=2000 α0 0.25 .2487 .0023 0.25 .2482 .0011 0.25 .2567 .0015α1 0.25 .2473 .0030 0.25 .2477 .0018 0.25 .2599 .0027β0 0.5 .4897 .0090 0.5 .4987 .0018 0.5 .5050 .0021β1 0.75 .7651 .0152 0.9 .9012 .0033 1 .9754 .0061

63

Figure B1: Kernel Smoothing Density Estimates of MLE for Stationary Returns with FourthMoment when π = 0.5

−0.06 −0.04 −0.02 0 0.02 0.04 0.06 0.08 0.1 0.120

10

20

30

40

50

60

70

estimates



−0.06 −0.04 −0.02 0 0.02 0.04 0.06 0.08 0.1 0.120

10

20

30

40

50

60

estimates


−0.2 −0.1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.70

1

2

3

4

5

6

7

8

9

estimates


−0.4 −0.2 0 0.2 0.4 0.6 0.8 1 1.20

1

2

3

4

5

6

7

8

9

estimates


−0.4 −0.2 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.60

0.5

1

1.5

2

2.5

3

3.5

4

4.5

5

5.5

estimates


−0.5 0 0.5 1 1.5 2 2.50

0.5

1

1.5

2

2.5

3

3.5

4

4.5

estimates


64

Figure B2: Kernel Smoothing Density Estimates of MLE for Stationary Returns without FourthMoment when π = 0.5

−0.04 −0.02 0 0.02 0.04 0.06 0.08 0.1 0.120

10

20

30

40

50

60

70

80

estimates



−0.1 −0.08 −0.06 −0.04 −0.02 0 0.02 0.04 0.06 0.08 0.10

10

20

30

40

50

60

estimates


−0.1 0 0.1 0.2 0.3 0.4 0.5 0.60

2

4

6

8

10

12

estimates


−0.2 −0.1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.80

1

2

3

4

5

6

7

8

9

10

estimates


0 0.2 0.4 0.6 0.8 10

1

2

3

4

5

6

7

8

9

estimates


0.5 1 1.50

1

2

3

4

5

6

7

estimates


65

Figure B3: Kernel Smoothing Density Estimates of MLE for Strict Stationary Returns whenπ = 0.5

−0.04 −0.02 0 0.02 0.04 0.06 0.08 0.10

10

20

30

40

50

60

70

estimates



−0.1 −0.08 −0.06 −0.04 −0.02 0 0.02 0.04 0.06 0.08 0.10

5

10

15

20

25

30

35

40

45

50

55

estimates


−0.1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.70

2

4

6

8

10

12

estimates


−0.2 −0.1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.80

1

2

3

4

5

6

7

8

9

estimates


0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.80

1

2

3

4

5

6

7

8

9

10

estimates


0.4 0.6 0.8 1 1.2 1.4 1.60

1

2

3

4

5

6

estimates


66

Table B2: The MLE estimates for parameters in Case 3Par True

Value(3.1)


Estimate MSE

ω0 0.02 .0175 .0007 0.02 .0210 .0009ω1 0.01 .0118 .0007 0.01 .0123 .0010

T=500 α0 0.25 .2361 .0257 0.25 .2353 .0215α1 0.25 .2437 .0304 0.25 .2412 .0236β0 0.5 .5780 .0973 0.5 .5562 .0528β1 0.7 .7088 .0988 0.75 .7505 .0658ω0 0.02 .0179 .0003 0.02 .0185 .0004ω1 0.01 .0104 .0002 0.01 .0112 .0002

T=1000 α0 0.25 .2482 .0143 0.25 .2444 .0124α1 0.25 .2501 .0119 0.25 .2491 .0080β0 0.5 .5543 .0472 0.5 .5452 .0349β1 0.7 .7075 .0352 0.75 .7425 .0200ω0 0.02 .0180 .0002 0.02 .019 .0002ω1 0.01 .0099 .0001 0.01 .0103 .0001

T=2000 α0 0.25 .2510 .0069 0.25 .2488 .0051α1 0.25 .2489 .0040 0.25 .2475 .0019β0 0.5 .5434 .0264 0.5 .5251 .0156β1 0.7 .7086 .0151 0.75 .7503 .0060

67

Figure B4: Kernel Smoothing Density Estimates of MLE for Stationary Returns with FourthMoment when π = 0.9

−0.05 0 0.05 0.1 0.150

5

10

15

20

25

30

35

40

45

50

estimates



−0.08 −0.06 −0.04 −0.02 0 0.02 0.04 0.06 0.08 0.10

10

20

30

40

50

60

70

80

90

estimates


−0.4 −0.2 0 0.2 0.4 0.6 0.8 10

1

2

3

4

5

6

estimates


−0.4 −0.2 0 0.2 0.4 0.6 0.8 10

1

2

3

4

5

6

7

8

9

estimates


−1 −0.5 0 0.5 1 1.5 2 2.5 3 3.5 40

0.5

1

1.5

2

2.5

3

estimates


0 0.5 1 1.5 20

1

2

3

4

5

6

7

estimates


68

Figure B5: Kernel Smoothing Density Estimates of MLE for Stationary Returns without FourthMoment when π = 0.9

−0.06 −0.04 −0.02 0 0.02 0.04 0.06 0.08 0.1 0.12 0.140

5

10

15

20

25

30

35

estimates



−0.06 −0.04 −0.02 0 0.02 0.04 0.06 0.08 0.10

20

40

60

80

100

120

estimates


−0.2 −0.1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.80

1

2

3

4

5

6

estimates


−0.2 −0.1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.80

2

4

6

8

10

12

estimates


−0.5 0 0.5 1 1.50

0.5

1

1.5

2

2.5

3

3.5

estimates


−0.4 −0.2 0 0.2 0.4 0.6 0.8 1 1.2 1.40

2

4

6

8

10

12

estimates


69

Table B3: Descriptive Statistics for MMI Return Series

Stocks # of observations minimum maximum mean std skewness kurtosis

AXP 7890 -.2623 .1856 5.91e-04 .0215 -.1186 11.0912

T 6274 -.2194 .2025 6.04e-04 .0173 .1174 14.7662

CHV 9845 -.1667 .2085 6.44e-04 .0166 .2082 10.5984

KO 9845 -.2469 .1957 5.70e-04 .0160 -.0172 15.5269

DIS 9845 -.2909 .1910 6.24e-04 .0209 -.1985 13.1012

DOW 9845 -.1932 .1524 5.03e-04 .0181 -.1165 10.0208

DD 9845 -.1827 .1147 4.45e-04 .0166 -.0177 8.1179

EK 9844 -.3024 .2406 1.84e-04 .0191 -.2118 22.0107

XOM 9845 -.2343 .1790 6.77e-04 .0146 .0071 19.2241

GE 9845 -.1749 .1361 5.63e-04 .0164 .0250 9.8170

GM 9844 -.3111 .3511 2.25e-04 .0251 .4792 31.6393

IBM 9842 -.2296 .1316 4.02e-04 .0169 .0614 12.9150

IP 9845 -.2695 .1912 3.37e-04 .0186 -.1214 13.2700

JNJ 9845 -.1835 .1223 5.87e-04 .0155 -.0489 8.9948

MCD 9844 -.1942 .1622 7.35e-04 .0184 0174 9.7235

MRK 9845 -.2678 .1303 5.49e-04 .0166 -.4530 14.5407

MMM 9845 -.2598 .1154 4.49e-04 .0148 -.3594 16.3442

MO 9843 -.2300 .1638 8.36e-04 .0174 -.1798 12.6358

PG 9845 -.3138 .2220 5.85e-04 .0148 -1.1529 42.4104

S 9845 -.2713 .2866 3.47e-04 .0221 -.1551 27.8540

70

Table B4: Descriptive Statistics for MMI Log Volume and Adjusted Log VolumeStocks minimum maximum mean std skewness kurtosisAXP log vol 9.3237 17.8626

adjusted -1.8842 2.6912 0.0355 0.4669 0.3654 4.2570T log vol 10.1581 18.6880

adjusted -1.7562 4.5836 0.0423 0.4361 0.4458 6.0846CHV log vol 9.2686 17.6912

adjusted -2.0112 3.1288 0.0324 0.4316 0.2436 4.0382KO log vol 7.6497 17.4945

adjusted -1.9758 3.1072 0.0350 0.4586 0.3061 4.2879DIS log vol 7.3872 18.5606

adjusted -2.2252 3.0584 0.0352 0.5551 0.3892 4.4298DOW log vol 5.2983 18.2313

adjusted -1.9404 3.0866 0.0358 0.4629 0.3480 4.1680DD log vol 7.6009 17.0011

adjusted -1.9545 2.5677 0.0329 0.4688 0.2987 4.0191EK log vol 9.5539 17.4167

adjusted -1.6952 2.7083 0.0244 0.4732 0.6337 4.6641XOM log vol 9.6421 18.5421

adjusted -1.5396 1.9074 0.0349 0.3720 0.3318 4.0766GE log vol 9.2873 20.0690

adjusted -1.8799 2.1148 0.0411 0.3993 0.3665 3.9990GM log vol 9.8574 18.5060

adjusted -1.9536 3.5558 0.0339 0.4458 0.5592 4.8711

71

Continue of Table B4Stocks minimum maximum mean std skewness kurtosisIBM log vol 9.2003 18.056

adjusted -4.1109 2.3596 0.0284 0.4033 0.3553 5.4605IP log vol 8.3187 16.9424

adjusted -2.0949 3.2737 0.0307 0.5322 0.1732 3.8458JNJ log vol 7.3132 17.7414

adjusted -1.9513 3.1720 0.0410 0.4704 0.3632 4.5897MCD log vol 7.1707 18.2812

adjusted -2.5790 2.9291 0.0328 0.5090 0.1332 4.0342MRK log vol 7.9010 18.7925

adjusted -2.1593 3.3455 0.0356 0.4614 0.4329 4.3055MMM log vol 7.6962 17.0085

adjusted -2.5537 3.3560 0.0307 0.4649 0.3684 4.6877MO log vol 7.2442 18.0183

adjusted -3.8221 2.4339 0.0390 0.4897 0.3673 4.4426PG log vol 7.3778 18.0899

adjusted -2.1488 3.3711 0.0394 0.4612 0.2727 4.8590S log vol 4.6052 19.5139

adjusted -4.0536 3.5914 0.0441 0.6459 0.4310 4.3383

72

Table B5: Descriptive Statistics for NASDAQ Return Series

Stocks # of observations minimum maximum mean std skewness kurtosis

AAPL 4027 -.5187 .3323 .0014 .0324 -.4584 25.2202

AMAT 4027 -.1402 .2563 9.61e-04 .0337 .4766 6.1062

CSCO 4027 -.1621 .2439 9.86e-04 .0289 .3782 8.7627

DELL 4027 -.1894 .2074 .0012 .0299 .0919 6.5666

EBAY 3086 -.2058 .4638 .0017 .0410 1.5657 17.4538

GOOG 1604 -.1161 .1999 .0014 .0232 .7889 11.0641

INTC 4027 -.2203 .2012 7.99e-04 .0271 -.0909 7.9032

MSFT 4027 -.1560 .1957 7.78e-04 .0220 .2407 9.2842

NVDA 3005 -.3523 .4241 .0018 .0462 .5381 11.9781

ORCL 4027 -.2915 .3109 .0011 .0315 .3557 11.5798

QCOM 4027 -.1685 .3871 .0015 .0351 .8636 10.8992

YHOO 3705 -.2184 .4797 .0015 .0413 .7366 11.4068

73

Table B6: Descriptive Statistics for NASDAQ Volume/SHOUT and Number of Trades

Volume/SHOUT Number of Trades

Stocks minimum maximum mean minimum maximum mean

AAPL .0020 .3974 .0264 477 646258 58606

AMAT .0023 .1677 .0205 342 216223 36912

CSCO .0010 .0993 .0104 509 902611 67235

DELL .0014 .1943 .0164 154 304059 39504

EBAY .0021 .2593 .0190 303 346121 39651

GOOG .0039 .6385 .0409 6600 227399 38013

INTC .0013 .0833 .0107 1723 180168 72848

MSFT .0011 .0581 .0071 844 736335 72550

NVDA .0014 .4879 .0385 76 218418 31955

ORCL .0015 .1754 .0085 691 376523 45796

QCOM .0011 .2210 .0181 79 437540 40373

YHOO 2.39e-04 .3279 .0242 27 876185 40028

74

Table B7: The Sample Statistics of HF IBM Datart r2

t RVt

mean .0190 1.4989 1.2499

variance 1.5001 14.1777 2.0106

skewness −.4659 10.4747 5.0680

kurtosis 7.3347 176.5405 42.2725

ACF (1) −.0059 .1132 .6033

ACF (2) .0243 .1072 .5326

ACF (3) .0424 .1516 .4434

ACF (4) −.0139 .0930 .4132

ACF (5) −.0396 .0325 .3771

75

References[1] Andersen, T. G. (1996), “Return Volatility and Trading Volume: An Information Flow Inter-

pretation of Stochastic Volatility,” Journal of Finance, 51, 169-204.

[2] Andersen T.G., Bollerslev T. (1998), “Answering the skeptics: Yes, standard volatility modelsdo provide accurate forecasts”, International Economic Review, 39(4), 885-905.

[3] Andersen T.G., Bollerslev T., Christoffersen, P.F., Diebold, F.X. (2005), “Volatility forecast-ing”, Working paper, Center for Financial Studies.

[4] Andersen T.G., Bollerslev T., Diebold, F.X., Labys P. (2003), “Modeling and forecastingrealized volatility”, Econometrica, 71(2), 579-625.

[5] Blair, B.J., Poon, S-H, Taylor, S.J. (2001), “Forecasting S&P 100 volatility: the incremen-tal information content of implied volatilities and high-frequency index returns”, Journal ofEconometrics, 105, 5-26.

[6] Bollerslev, T. (1986), “Generalized autoregressive conditional heteroskedasticity,” Journal ofEconometrics, 31, 307-327.

[7] Bollerslev, T., Chou, R. Y., and Kroner, K. F. (1992), “ARCH modeling in finance: A reviewof the theory and empirical evidence,” Journal of Econometrics, 52, 5-59.

[8] Bollerslev, T., and Jubinski, P. D. (1999), “Equity trading volume and volatility: Latentinformation arrivals and common Long-Run dependencies,” Journal of Business and EconomicStatistics, 17, 9-21.

[9] Brooks, C. (1998), “Predicting stock index volatility: Can market volume help?”, Journal ofForecasting, 17, 59-80.

[10] Christoffersen, P.F., Diebold, F.X. (2000), “How relevant is volatility forecasting for financialrisk management?”, The Review of Economics and Statistics, 82(1), 12-22.

[11] Clark, P. K. (1973), “A subordinated stochastic process model with finite variance for specu-lative prices,” Econometrica, 41, 135-155.

[12] Ding, Z., and Granger, C. W. J. (1996), “Modeling volatility persistence of speculative returns:A new approach,” Journal of Econometrics, 73, 185-215.

[13] Engle, R. F. (1982), “Autoregressive conditional heteroskedasticity with estimates of the vari-ance of U.K. inflation,” Econometrica, 50, 987-1008.

76

[14] Engle, R.F., Patton A.J. (2001), “What good is a volatility model?”, Quantitative Finance,1(2), 237-245.

[15] Epps, T., and Epps, M. (1976), “The stochastic dependence of security price changes andtransaction volumes: Implications for the mixture-of-distribution hypothesis,” Econometrica,44, 305-321.

[16] Flaming J., Kirby C., Ostdiek B. (2003), “The economic value of volatility timing using realizedvolatility”, Journal of Financial Economics, 67, 473-509.

[17] Franses, P.H., Ghijsels, H. (1999), “Additive outliers, GARCH and forecasting volatility”,International Journal of Forecasting, 15, 1-9.

[18] Gallant, A. R., and Rossi, P. E., and Tauchen, G. (1992), “Stock prices and volume,” TheReview of Financial Studies, 5, 199-242.

[19] Glosten, L. R., Jagannathan, R., and Runkle, D. E. (1993), “On the relation between theexpected value and the volatility of the nominal excess return on stocks,” Journal of Finance,48, 1779-1801.

[20] Gray, S. F. (1996), “Modeling the conditional distribution of interest rates as a regime-switching process,” Journal of Financial Economics, 42, 27-62.

[21] Hamilton, J. D., and Susmel, R. (1994), “Autoregressive conditional heteroskedasticity andchanges in regimes,” Journal of Econometrics, 64, 307-333.

[22] Hansen, P. R., and Lunde, A. (2005), “A forecast comparison of volatility models: Doesanything beat a GARCH (1,1)?” Journal of Applied Econometrics, 20, 873-889.

[23] Karpoff, J. M. (1987), “The relationship between price changes and trading volume: A survey,”Journal of Financial and Quantitative Analysis, 22, 109-126

[24] Klivecka, A. (2004), “Random coefficient GARCH(1,1) model with IID coefficients,” Lithua-nian Mathematical Journal, 44, 374-385.

[25] Klaassen, F. (2002), “Improving GARCH volatility forecasts”, Empirical Economics, 27(2),363-394.

[26] Knight, J., and Satchell, S. E. (2011), “Some new results for threshold AR(1) models”, Journalof Time Series Econometrics, 3(2), Article1.

[27] Kourtellos, A., Stengos, T., and Tan, C. M. (2009), “THRET: Threshold regression withendogenous threshold variables”

77

[28] Lamoureux, C. G., and Lastrapes, W. D. (1990a), “Heteroskedasticity in stock return data:Volume versus GARCH effects,” The Journal of Finance, 45, 221-229.

[29] Lamoureux, C. G., and Lastrapes, W. D. (1990b), “Persistence in variance, structural changeand the GARCH model,” Journal of Business & Economic Statistics, 8(2), 225-234.

[30] Lopez, J.A. (2001), “Evaluating the predictive accuracy of volatility models”, Journal of Fore-casting, 20(2), 87-109.

[31] Martens M. (2002), “Measuring and forecasting S&P 500 index futures volatility using high-frequency data”, Journal of Futures Markets, 22(6), 497-518.

[32] Nelson, D. B. (1991), “Conditional heteroskedasticity in asset pricing: A new approach,”Econometrica 59, 347-370.

[33] Poon, S., and Granger, C. W. J. (2003), “Forecasting volatility in financial markets: A review,”Journal of Economic Literature, 41, 478-539.

[34] Poon, S-H, Granger, C. (2003), “Practical issues in forecasting volatility”, Financial AnalystsJournal, 61(1), 45-56.

[35] Quinn, B. G. (1982), “A note on the existence of strictly stationary solutions to bilinearequations,” Journal of Time Series Analysis, 3, 249-252.

[36] Tauchen, G. E., and Pitts, M. (1983), “The price variability-volume relationship on speculativemarkets,” Econometrica, 51, 485-505.

[37] Rabemananjara, R., and Zakoian, J. M. (1993), “Threshold ARCH models and asymmetriesin volatility,” Journal of Applied Econometrics, 8, 31-49.

[38] Wagner, N. F., and Marsh, T. A. (2005), “Surprise Volume and Heteroskedasticity in EquityMarket Returns,” Available at SSRN: http://ssrn.com/abstract=591206

[39] Yu, J. (2002), “Forecasting volatility in the New Zealand stock market”, Applied FinancialEconomics, 12, 193-202.

[40] Zakoian, J. M. (1994), “Threshold heteroskedastic models,” Journal of Economic Dynamicsand Control, 18(5), 931-955.

78

Documents

Threshold GARCH Model: Theory and Applicationpublish.uwo.ca/~jwu87/files/JobMarketPaper_JingWu.pdf · Threshold GARCH Model: Theory and Application Jing Wu ... than others, a regime